lec4-virt-mem

    Virtual Memory Architecture of Parallel Computers 1

    Virtual Memory Systems

    Virtual memory is another level in the memory hierarchy.

    All general-purpose processors today employ virtual memory.

    In our quest to make the fastest possible computer system, we sometimes must acquiesce to the programmers and add function that slows the system down.

    We add cache memory to a computer system solely for the purpose of speeding up the processing rate.

    We add virtual memory to a computer system as a convenience for the programmer. Virtual memory slows down the processing rate.

    Figure: the memory hierarchy, from fastest to slowest.

    Processor (with small cache): 2 - 5 ns

    External cache (Kbytes to Mbytes): 10 - 20 ns

    Main memory (Mbytes to Gbytes): 50 - 100 ns

    Virtual memory (Gbytes and up): 10 - 100 ms

    1997, 1999 E.F. Gehringer, G.Q. Kenney CSC 506, Summer 1999 2

    Multiple address spaces

    Virtual memory allows multiple large linear address spaces to concurrently exist in a single real memory by breaking each address space into fixed-size pages, and keeping only a subset of the pages in real memory. The remaining pages of each virtual memory space may or may not be defined and allocated space in the system. If they are defined, they are allocated space in a backing store, usually fixed disk.

    The job of the computer system architect is to minimize the overhead imposed by the virtual memory design.

    The subject of virtual memory is covered in detail in CSC 501, Operating System Principles. We will consider here only those aspects of virtual memory that affect system performance and the physical memory system design.

    What's the problem with virtual memory?

    Cache memory deals with physical memory addresses.

    The processor actually generates virtual memory addresses.

    Before we can begin our search of the cache, we must translate the virtual address into a physical address.

    Example Virtual Memory System

    16 Megabytes of real memory (24-bit address)

    4 Gigabytes of virtual memory per address space (32-bit address).

    4K page size.
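The page and page-frame counts implied by these three sizes can be checked with a few lines of arithmetic. This is only a sketch; the constant and variable names are illustrative and do not come from the lecture.

```python
# Sizes taken from the example; the names are illustrative assumptions.
REAL_MEMORY_BYTES = 16 * 2**20     # 16 Mbytes of real memory (24-bit address)
VIRTUAL_SPACE_BYTES = 4 * 2**30    # 4 Gbytes per address space (32-bit address)
PAGE_SIZE_BYTES = 4 * 2**10        # 4K page size

page_frames = REAL_MEMORY_BYTES // PAGE_SIZE_BYTES
virtual_pages = VIRTUAL_SPACE_BYTES // PAGE_SIZE_BYTES
offset_bits = PAGE_SIZE_BYTES.bit_length() - 1

print(page_frames)     # 4096 page frames, numbered 0 to 4095
print(virtual_pages)   # 1048576 pages per address space, numbered 0 to 1,048,575
print(offset_bits)     # 12 bits of byte displacement within a page
```

Each address space therefore has 256 times as many pages as there are page frames in real memory, which is why only a subset of pages can be resident at any time.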

    Figure: Physical memory (16 Mbytes real) is divided into Page Frame 0 through Page Frame 4095. Each virtual address space (Virtual Memory 0, Virtual Memory 1, ..., Virtual Memory n; 4 Gbytes virtual each) is divided into Page 0 through Page 1,048,575.

    Mapping of the virtual addresses to physical addresses

    Most virtual memory systems use a two-level lookup to translate virtual addresses into real addresses.

    The virtual address is divided into:

    A segment table entry (STE) number

    A page table entry (PTE) number

    A byte displacement into the page

    The system uses a single segment table pointer to locate (in real memory) the segment table that is currently in use.

    Each segment table entry contains:

    Whether or not the corresponding page table is valid (defined).

    Whether or not it is resident in real memory or paged out.

    The address of the page table (real or disk) if it is valid.

    Each page table entry contains:

    Whether or not the page is valid (defined).

    Whether or not it is resident in real memory or paged out.

    The address of the page (real or disk) if it is valid.

    Example virtual address translation

    The following diagram shows the translation of a 32-bit virtual address into a 24-bit real address using a two-level lookup system.

    Figure: The 32-bit virtual memory address is divided into three fields: a 10-bit segment table entry number, a 10-bit page table entry number, and a 12-bit byte displacement within a page.

    Segment Table: STE 0 (^PT0), STE 1 (^PT1), STE 2 (^PT2), ..., STE 1023. 1K possible page tables.

    Page Table 0, Page Table 1, Page Table 2, ...: each holds PTE 0 (^Page0), PTE 1 (^Page1), PTE 2 (^Page2), ..., PTE 1023. 1K pages per page table.

    Physical Memory: Page Frame 0, Page Frame 1, Page Frame 2, ..., Page Frame 4095. 4K page size.
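As a minimal sketch of the field split in the diagram (10-bit segment table entry number, 10-bit page table entry number, 12-bit byte displacement), the following Python extracts the three fields with shifts and masks. The function and constant names are illustrative assumptions, not part of the lecture.

```python
STE_BITS = 10      # segment table entry number
PTE_BITS = 10      # page table entry number
OFFSET_BITS = 12   # byte displacement within a 4K page

def split_virtual_address(vaddr):
    """Return (ste_number, pte_number, displacement) for a 32-bit virtual address."""
    displacement = vaddr & ((1 << OFFSET_BITS) - 1)
    pte_number = (vaddr >> OFFSET_BITS) & ((1 << PTE_BITS) - 1)
    ste_number = (vaddr >> (OFFSET_BITS + PTE_BITS)) & ((1 << STE_BITS) - 1)
    return ste_number, pte_number, displacement

# 0x00403004 selects STE 1, PTE 3, and byte 4 within the page.
print(split_virtual_address(0x00403004))   # (1, 3, 4)
```

Because each index field is 10 bits wide, both the segment table and every page table hold 1024 entries.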

    Steps to translate virtual to real

    Index into the segment table using the segment table address and the STE number in the virtual address.

    If the page table is not valid, generate an address exception interrupt.

    If the page table is valid but paged out, generate a page fault interrupt.

    Index into the corresponding page table using the page table address in the STE and the PTE number in the virtual address.

    If the page is not valid, generate an address exception interrupt.

    If the page is valid but paged out, generate a page fault interrupt.

    Concatenate the page frame address in the PTE and the byte displacement in the virtual address to form the real address.
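The translation steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the lecture's implementation: the `Entry` layout, the exception names, and the use of Python object references in place of real or disk addresses are all hypothetical. The 10/10/12 bit split is the one used in the lecture's example.

```python
from dataclasses import dataclass

OFFSET_BITS = 12   # 4K pages
INDEX_BITS = 10    # 1K entries per segment table and per page table

class AddressException(Exception):
    """Raised when the page table or page is not valid (not defined)."""

class PageFault(Exception):
    """Raised when the page table or page is valid but paged out."""

@dataclass
class Entry:
    """Hypothetical layout shared by STEs and PTEs in this sketch."""
    valid: bool = False
    resident: bool = False
    target: object = None   # STE: a page table (list of Entry); PTE: a page frame number

def lookup(table, index):
    """Read one table entry, raising the interrupt the steps call for."""
    entry = table[index]
    if not entry.valid:
        raise AddressException(index)
    if not entry.resident:
        raise PageFault(index)
    return entry.target

def translate(segment_table, vaddr):
    """Two-level lookup: virtual address -> real address."""
    mask = (1 << INDEX_BITS) - 1
    ste_number = (vaddr >> (OFFSET_BITS + INDEX_BITS)) & mask
    pte_number = (vaddr >> OFFSET_BITS) & mask
    displacement = vaddr & ((1 << OFFSET_BITS) - 1)

    page_table = lookup(segment_table, ste_number)   # first memory access
    frame = lookup(page_table, pte_number)           # second memory access
    # Concatenate the page frame number with the byte displacement.
    return (frame << OFFSET_BITS) | displacement

# Tiny address space: STE 1 / PTE 3 maps to page frame 0x2E3.
page_table = [Entry() for _ in range(1 << INDEX_BITS)]
page_table[3] = Entry(valid=True, resident=True, target=0x2E3)
page_table[4] = Entry(valid=True, resident=False)    # defined, but paged out
segment_table = [Entry() for _ in range(1 << INDEX_BITS)]
segment_table[1] = Entry(valid=True, resident=True, target=page_table)

print(hex(translate(segment_table, 0x00403004)))     # 0x2e3004
```

In this toy setup, translating 0x00404000 raises `PageFault` (the page is defined but paged out), and translating 0x00000000 raises `AddressException` (STE 0 is not valid).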

    We are now ready to see if the data at the real address is in the cache!

    BUT, we just had to access the system memory TWICE in order to translate the virtual address into the real address so we can get the data the processor really wants.

    Every access to memory is now three accesses to memory:

    Access memory to get the segment table entry.

    Access memory to get the page table entry.

    Access memory to read or write the data requested by the processor.

    This does not bode well for the performance of our computer system.
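A rough estimate shows the size of the penalty. The sketch assumes, pessimistically, that all three accesses go to main memory (none is satisfied by a cache) and uses the 50 - 100 ns main memory access time from the memory hierarchy figure at the start of the lecture. The names are illustrative.

```python
MAIN_MEMORY_NS = (50, 100)   # main memory access time range from the hierarchy figure
ACCESSES_PER_REFERENCE = 3   # segment table entry + page table entry + data

low, high = (ACCESSES_PER_REFERENCE * t for t in MAIN_MEMORY_NS)
print(f"{low} - {high} ns per reference")   # 150 - 300 ns per reference
```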

    The Translation Lookaside Buffer

    We can minimize the performance effects of this two-level lookup with the use of a translation lookaside buffer (TLB).

    The TLB is simply a cache of translated addresses. Whenever we translate a virtual address to a real address, we save the translation in the TLB.

    We can manage the TLB in the same way as a data cache:

    Direct mapped, n-way set associative, or fully associative.

    LRU replacement algorithm.

    Split TLB (separate instruction and data TLBs).

    Virtually ALL TLBs today are implemented on-chip with the processor.

    Performance is critical.

    Very few bits (relative to data cache) are needed to implement it.

    Sizes are on the order of 64 to 256 entries.

    Virtual Page    Real Page

    23EF0           013

    602D2           2E3

    9A517           800

    01B69           014

    C3459           16B
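To make the "cache of translated addresses" idea concrete, here is a minimal software model of a fully associative TLB with LRU replacement. It is a sketch only: real TLBs are hardware structures, and the class and method names are assumptions. The page numbers are taken from the sample TLB contents in this section.

```python
from collections import OrderedDict

class TLB:
    """Fully associative TLB with LRU replacement (a software model, not hardware)."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = OrderedDict()   # virtual page number -> real page number

    def lookup(self, virtual_page):
        """Return the real page number on a hit, or None on a miss."""
        if virtual_page not in self.entries:
            return None
        self.entries.move_to_end(virtual_page)   # mark as most recently used
        return self.entries[virtual_page]

    def insert(self, virtual_page, real_page):
        """Save a translation, evicting the least recently used entry if full."""
        if virtual_page not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)      # drop the LRU entry
        self.entries[virtual_page] = real_page
        self.entries.move_to_end(virtual_page)

tlb = TLB(capacity=2)              # tiny capacity so the eviction is visible
tlb.insert(0x23EF0, 0x013)
tlb.insert(0x602D2, 0x2E3)
tlb.lookup(0x23EF0)                # hit: 0x23EF0 becomes most recently used
tlb.insert(0x9A517, 0x800)         # evicts 0x602D2, the least recently used
print(tlb.lookup(0x602D2))         # None
print(hex(tlb.lookup(0x23EF0)))    # 0x13
```

On a miss, the caller would perform the two-level lookup and then call `insert` with the result, so the next reference to the same page avoids the two extra memory accesses.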

    A flowchart of memory access operation:

    Dr. Ed Gehringer shows this flowchart to give an overall idea of the complexity of what goes on in a modern computer system to retrieve a word from memory. Note that this chart assumes that the desired page is resident in main memory (there is no provision for a page fault) and that we do not have a multi-level cache.

    Virtual address: page number, byte within page

    Search TLB

    TLB hit?

        No: Select TLB victim to be replaced. Translate virt. addr. to physical addr. Enter new (virt., phys.) addr. pair in TLB.

        Yes: continue.

    Update replacement status of TLB entries

    Physical address: block number, byte within block

    Search tags of cache lines

    Cache hit?

        No: Fetch block from main memory. Select cache victim to be replaced. Store new block in cache.

        Yes: Fetch block from cache.

    Update replacement status of cache entries

    Select desired bytes from block

    Send byte(s) to processor
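The flowchart can be approximated in code. The sketch below makes several assumptions that are not in the lecture: both the TLB and the cache are modeled as fully associative with LRU replacement, cache blocks are 16 bytes, and a plain dictionary (`page_map`) stands in for the two-level table lookup. Like the flowchart, it has no provision for a page fault.

```python
from collections import OrderedDict

PAGE_BITS = 12    # 4K pages, as in the lecture's example
BLOCK_BITS = 4    # 16-byte cache blocks (an assumption for this sketch)

def lru_get(store, key, capacity, load):
    """Look up key in an LRU-managed store; on a miss, pick a victim and load the value.

    Returns (value, hit). Used for both the TLB and the cache.
    """
    hit = key in store
    if not hit:
        if len(store) >= capacity:
            store.popitem(last=False)   # select victim: least recently used
        store[key] = load(key)          # translate, or fetch block from main memory
    store.move_to_end(key)              # update replacement status
    return store[key], hit

def read_byte(vaddr, tlb, cache, page_map, memory, tlb_size=64, cache_blocks=256):
    """Follow the flowchart for one byte read. Assumes the page is resident."""
    page, offset = vaddr >> PAGE_BITS, vaddr & ((1 << PAGE_BITS) - 1)
    frame, tlb_hit = lru_get(tlb, page, tlb_size, lambda p: page_map[p])
    paddr = (frame << PAGE_BITS) | offset

    block, byte_in_block = paddr >> BLOCK_BITS, paddr & ((1 << BLOCK_BITS) - 1)

    def fetch_block(b):
        start = b << BLOCK_BITS
        return bytes(memory[start:start + (1 << BLOCK_BITS)])

    data, cache_hit = lru_get(cache, block, cache_blocks, fetch_block)
    return data[byte_in_block], tlb_hit, cache_hit

# Toy setup: virtual page 5 maps to page frame 1; memory holds two page frames.
memory = bytearray(2 << PAGE_BITS)
memory[(1 << PAGE_BITS) + 0x24] = 0xAB
page_map = {5: 1}
tlb, cache = OrderedDict(), OrderedDict()

vaddr = (5 << PAGE_BITS) | 0x24
print(read_byte(vaddr, tlb, cache, page_map, memory))   # (171, False, False)
print(read_byte(vaddr, tlb, cache, page_map, memory))   # (171, True, True)
```

The first read misses in both the TLB and the cache; the second read of the same address hits in both, which is the common case the TLB and cache are designed to make fast.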