Virtual Memory Architecture of Parallel Computers 1
Virtual Memory Systems
Virtual memory is another level in the memory hierarchy.
All general-purpose processors today employ virtual memory.
In our quest to make the fastest possible computer system, we sometimes must acquiesce to the programmers and add function that slows the system down.
We add cache memory to a computer system solely for the purpose of speeding up the processing rate.
We add virtual memory to a computer system as a convenience for the programmer. Virtual memory slows down the processing rate.
Figure: the memory hierarchy, with capacities and access times.
Processor (with small cache): 2 - 5 ns
External cache (Kbytes to Mbytes): 10 - 20 ns
Main memory (Mbytes to Gbytes): 50 - 100 ns
Virtual memory (Gbytes and up): 10 - 100 ms
1997, 1999 E.F. Gehringer, G.Q. Kenney CSC 506, Summer 1999 2
Multiple address spaces
Virtual memory allows multiple large linear address spaces to concurrently exist in a single real memory by breaking each address space into fixed-size pages, and keeping only a subset of the pages in real memory. The remaining pages of each virtual memory space may or may not be defined and allocated space in the system. If they are defined, they are allocated space in a backing store, usually fixed disk.
The job of the computer system architect is to minimize the overhead imposed by the virtual memory design.
The subject of virtual memory is covered in detail in CSC 501, Operating System Principles. We will consider here only those aspects of virtual memory that affect system performance and the physical memory system design.
What's the problem with Virtual Memory?
Cache memory deals with physical memory addresses.
The processor actually generates virtual memory addresses.
Before we can begin our search of the cache, we must translate the virtual address into a physical address.
Example Virtual Memory System
16 Megabytes of real memory (24-bit address)
4 Gigabytes of virtual memory per address space (32-bit address).
4K page size.
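The page and page frame counts implied by these three parameters can be checked with a little arithmetic. The following is a minimal sketch; the constant and function names are made up for illustration, and it assumes "4K" means 4 Kbytes.

```python
# Sizes taken from the example system: 16 Mbytes real, 4 Gbytes virtual,
# 4 Kbyte pages. All names here are illustrative, not from the lecture.
REAL_BYTES = 16 * 2**20
VIRTUAL_BYTES = 4 * 2**30
PAGE_BYTES = 4 * 2**10


def count_pages(space_bytes, page_bytes=PAGE_BYTES):
    """Number of fixed-size pages (or page frames) in an address space."""
    return space_bytes // page_bytes


def address_bits(space_bytes):
    """Bits needed to address every byte of a power-of-two sized space."""
    return (space_bytes - 1).bit_length()


page_frames = count_pages(REAL_BYTES)       # 4096 page frames in real memory
virtual_pages = count_pages(VIRTUAL_BYTES)  # 1,048,576 pages per address space
```

Here `address_bits(REAL_BYTES)` gives 24 and `address_bits(VIRTUAL_BYTES)` gives 32, matching the address widths quoted above, and `address_bits(PAGE_BYTES)` gives the 12 bits needed for the byte displacement within a page.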
Figure: virtual memories 0 through n, each 4 Gbytes (pages 0 through 1,048,575), share a single 16 Mbyte physical memory (page frames 0 through 4095).
Mapping of the virtual addresses to physical addresses
Most virtual memory systems use a two-level lookup to translate virtual addresses into real addresses.
The virtual address is divided into:
A segment table entry (STE) number
A page table entry (PTE) number
A byte displacement into the page
The system uses a single segment table pointer to locate (in real memory) the segment table that is currently in use.
Each segment table entry contains:
Whether or not the corresponding page table is valid (defined).
Whether or not it is resident in real memory or paged out.
The address of the page table (real or disk) if it is valid.
Each page table entry contains:
Whether or not the page is valid (defined).
Whether or not it is resident in real memory or paged out.
The address of the page (real or disk) if it is valid.
Example virtual address translation
The following diagram shows the translation of a 32-bit virtual address into a 24-bit real address using a two-level lookup system.
1K possible page tables
1K pages per page table
4K page size
Figure: two-level translation of a virtual memory address.
Virtual memory address: segment table entry number (10 bits), page table entry number (10 bits), byte displacement within a page (12 bits).
Segment table: STE 0 points to page table 0, STE 1 to page table 1, STE 2 to page table 2, and so on.
Page tables: in each page table, PTE 0 points to page 0, PTE 1 to page 1, PTE 2 to page 2, and so on.
Physical memory: page frames 0 through 4095.
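The 10-bit / 10-bit / 12-bit division of the virtual address can be sketched with shifts and masks. This is only an illustration of the field layout in this example; the function and constant names are hypothetical.

```python
# Field widths from the example: 1K page tables, 1K pages per page table,
# 4K bytes per page. Names are illustrative only.
STE_BITS, PTE_BITS, OFFSET_BITS = 10, 10, 12


def split_virtual_address(vaddr):
    """Split a 32-bit virtual address into (STE number, PTE number, displacement)."""
    if not 0 <= vaddr < 2**32:
        raise ValueError("virtual address must fit in 32 bits")
    displacement = vaddr & ((1 << OFFSET_BITS) - 1)         # low 12 bits
    pte_number = (vaddr >> OFFSET_BITS) & ((1 << PTE_BITS) - 1)  # middle 10 bits
    ste_number = vaddr >> (OFFSET_BITS + PTE_BITS)          # high 10 bits
    return ste_number, pte_number, displacement


ste, pte, disp = split_virtual_address(0x23EF0ABC)
# ste == 0x08F, pte == 0x2F0, disp == 0xABC
```

The arbitrary address `0x23EF0ABC` lies in virtual page `0x23EF0`; the upper 10 bits of that page number select the segment table entry and the lower 10 bits select the page table entry.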
Steps to translate virtual to real
Index into the segment table using the segment table address and the STE number in the virtual address.
If the page table is not valid, generate an address exception interrupt.
If the page table is valid but paged out, generate a page fault interrupt.
Index into the corresponding page table using the page table address in the STE and the PTE number in the virtual address.
If the page is not valid, generate an address exception interrupt.
If the page is valid but paged out, generate a page fault interrupt.
Concatenate the page frame address in the PTE and the byte displacement in the virtual address to form the real address.
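The steps above can be sketched in software as follows. This is a simplified model, not how hardware implements the walk: the tables are assumed to be Python dicts holding only the defined entries, interrupts are modeled as exceptions, and all class and function names are hypothetical.

```python
from dataclasses import dataclass


class AddressException(Exception):
    """Stands in for the address exception interrupt."""


class PageFault(Exception):
    """Stands in for the page fault interrupt."""


@dataclass
class Entry:
    valid: bool = False     # is the page table / page defined?
    resident: bool = False  # is it in real memory (True) or paged out (False)?
    target: object = None   # STE: a page table (dict); PTE: a page frame number


def check(entry, what):
    """Raise the interrupt the steps call for, if any."""
    if not entry.valid:
        raise AddressException(f"{what} is not valid")
    if not entry.resident:
        raise PageFault(f"{what} is paged out")


def translate(segment_table, vaddr):
    """Translate a 32-bit virtual address into a 24-bit real address."""
    ste_number = vaddr >> 22
    pte_number = (vaddr >> 12) & 0x3FF
    displacement = vaddr & 0xFFF

    ste = segment_table.get(ste_number, Entry())   # first memory access
    check(ste, "page table")
    pte = ste.target.get(pte_number, Entry())      # second memory access
    check(pte, "page")
    return (pte.target << 12) | displacement       # frame number + displacement


# One page table with one resident page, mapped to page frame 0x013.
page_table = {0x2F0: Entry(valid=True, resident=True, target=0x013)}
segment_table = {0x08F: Entry(valid=True, resident=True, target=page_table)}
real = translate(segment_table, 0x23EF0ABC)  # real == 0x013ABC
```

Any address whose segment table or page table entry is missing from the dicts raises `AddressException`, and an entry marked valid but not resident raises `PageFault`, mirroring the two kinds of interrupt in the steps.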
We are now ready to see if the data at the real address is in the cache!
BUT, we just had to access the system memory TWICE in order to translate the virtual address into the real address so we can get the data the processor really wants.
Every access to memory is now three accesses to memory:
Access memory to get the segment table entry.
Access memory to get the page table entry.
Access memory to read or write the data requested by the processor.
This does not bode well for the performance of our computer system.
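To put a rough number on this, the sketch below triples the 50 - 100 ns main memory access time quoted in the memory hierarchy at the start of the lecture. It is a back-of-the-envelope estimate that assumes every one of the three accesses goes to main memory and ignores the cache entirely; the names are illustrative.

```python
# Main memory access time range from the memory hierarchy figure, in ns.
MAIN_MEMORY_NS = (50, 100)
# Segment table entry, page table entry, then the data itself.
ACCESSES_PER_REFERENCE = 3


def reference_time_ns(access_ns, accesses=ACCESSES_PER_REFERENCE):
    """Time for one processor reference when each step is a memory access."""
    return access_ns * accesses


best_ns, worst_ns = (reference_time_ns(t) for t in MAIN_MEMORY_NS)
# best_ns == 150, worst_ns == 300
```

Under these assumptions a single reference costs 150 to 300 ns instead of 50 to 100 ns.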
The Translation Lookaside Buffer
We can minimize the performance effects of this two-level lookup with the use of a translation lookaside buffer (TLB).
The TLB is simply a cache of translated addresses. Whenever we translate a virtual address to a real address, we save the translation in the TLB.
We can manage the TLB in the same way as a data cache:
Direct mapped, n-way set associative, or fully associative.
LRU replacement algorithm.
Split TLB (separate instruction and data TLBs).
Virtually ALL TLBs today are implemented on-chip with the processor.
Performance is critical.
Very few bits (relative to data cache) are needed to implement it.
Sizes are on the order of 64 to 256 entries.
Virtual Page   Real Page
23EF0          013
602D2          2E3
9A517          800
01B69          014
C3459          16B
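As one possible illustration of these ideas, here is a minimal software model of a fully associative TLB with LRU replacement, loaded with pairs from the table above. Real TLBs are hardware structures; the `TLB` class, its method names, and the tiny capacity used in the example are all assumptions made for the sketch.

```python
from collections import OrderedDict


class TLB:
    """Fully associative TLB model with LRU replacement (illustrative only)."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        # virtual page -> real page, ordered from least to most recently used
        self.entries = OrderedDict()

    def lookup(self, virtual_page):
        """Return the real page on a hit, or None on a miss."""
        if virtual_page not in self.entries:
            return None
        self.entries.move_to_end(virtual_page)  # mark as most recently used
        return self.entries[virtual_page]

    def insert(self, virtual_page, real_page):
        """Save a translation, evicting the least recently used entry if full."""
        if virtual_page in self.entries:
            self.entries.move_to_end(virtual_page)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # drop the LRU victim
        self.entries[virtual_page] = real_page


tlb = TLB(capacity=2)            # deliberately tiny to show replacement
tlb.insert(0x23EF0, 0x013)
tlb.insert(0x602D2, 0x2E3)
hit = tlb.lookup(0x23EF0)        # hit: 0x013, and 0x23EF0 becomes most recent
tlb.insert(0x9A517, 0x800)       # evicts 0x602D2, the least recently used
miss = tlb.lookup(0x602D2)       # miss: None
```

A direct-mapped or set-associative TLB would instead use some bits of the virtual page number to choose the entry or set, trading hit rate for simpler lookup hardware.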
A flowchart of memory access operation:
Dr. Ed Gehringer shows this flowchart to give an overall idea of the complexity of what goes on in a modern computer system to retrieve a word from memory. Note that this chart assumes that the desired page is resident in main memory (there is no provision for a page fault) and that we do not have a multi-level cache.
Flowchart, written out as steps:
Virtual address: page number, byte within page.
Search TLB.
TLB hit?
No: Translate virt. addr. to physical addr. Select TLB victim to be replaced. Enter new (virt., phys.) addr. pair in TLB.
Yes: continue.
Update replacement status of TLB entries.
Physical address: block number, byte within block.
Search tags of cache lines.
Cache hit?
No: Fetch block from main memory. Select cache victim to be replaced. Store new block in cache.
Yes: Fetch block from cache.
Update replacement status of cache entries.
Select desired bytes from block.
Send byte(s) to processor.
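The flow in this chart can be approximated by a short simulation. This is a sketch under several assumptions that are not in the lecture: both the TLB and the cache are modeled as fully associative with LRU replacement, cache blocks are 16 bytes, the two-level table walk is replaced by a simple `page_map` dict, and every page is resident (as the chart itself assumes). All names are hypothetical.

```python
from collections import OrderedDict

PAGE_BITS = 12   # 4K pages, as in the example system
BLOCK_BITS = 4   # assumed 16-byte cache blocks


def lru_get(table, key, capacity, load):
    """Return (value, hit). On a miss, replace the LRU victim with load(key)."""
    hit = key in table
    if not hit:
        if len(table) >= capacity:
            table.popitem(last=False)  # select victim to be replaced
        table[key] = load(key)         # enter new pair / store new block
    table.move_to_end(key)             # update replacement status
    return table[key], hit


def read_byte(vaddr, tlb, cache, page_map, memory, tlb_size=64, cache_size=256):
    """Follow the chart for one byte; returns (byte, tlb_hit, cache_hit)."""
    page, in_page = vaddr >> PAGE_BITS, vaddr & ((1 << PAGE_BITS) - 1)
    # page_map stands in for the segment table / page table walk.
    frame, tlb_hit = lru_get(tlb, page, tlb_size, lambda p: page_map[p])
    paddr = (frame << PAGE_BITS) | in_page
    block_no, in_block = paddr >> BLOCK_BITS, paddr & ((1 << BLOCK_BITS) - 1)
    block, cache_hit = lru_get(
        cache, block_no, cache_size,
        lambda b: bytes(memory[b << BLOCK_BITS:(b + 1) << BLOCK_BITS]),
    )
    return block[in_block], tlb_hit, cache_hit  # select desired byte


memory = bytearray(2**24)          # 16 Mbytes of real memory
memory[0x013ABC] = 0x5A
page_map = {0x23EF0: 0x013}        # virtual page -> page frame
tlb, cache = OrderedDict(), OrderedDict()
first = read_byte(0x23EF0ABC, tlb, cache, page_map, memory)
second = read_byte(0x23EF0ABC, tlb, cache, page_map, memory)
# first == (0x5A, False, False); second == (0x5A, True, True)
```

The first reference misses in both the TLB and the cache; the second finds the translation and the block already loaded.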