



Solution:
We need to analyze memory and time requirements of paging schemes in order to make a decision. Average process size is considered in the calculations below.
1 Level Paging
Since we have 2^23 pages in each virtual address space, and we use
4 bytes per page table entry, the size of the page table will be 2^23 * 2^2 =
2^25 bytes (32 MB). This is 1/256 of the process' own memory space, so it is
quite costly.
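The arithmetic above can be checked with a few lines of C. This is only a minimal sketch: the helper name `one_level_table_bytes` and its parameters are illustrative assumptions, not part of the original problem.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a flat (one-level) page table: one entry per virtual page.
 * vpn_bits  - number of virtual page number bits (23 in the text)
 * pte_bytes - bytes per page table entry (4 in the text)
 * Hypothetical helper, for illustration only. */
uint64_t one_level_table_bytes(unsigned vpn_bits, uint64_t pte_bytes)
{
    assert(vpn_bits < 64);
    return ((uint64_t)1 << vpn_bits) * pte_bytes;
}
```

With the numbers used here, `one_level_table_bytes(23, 4)` evaluates to 2^25 = 33554432 bytes, and dividing the 2^33-byte process size by that value gives 256.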
2 Level Paging
The address would be divided up as 12  11  13 (outer index, inner index,
page offset) since we want page table pages to fit into one page
(2^11 entries * 4 bytes = 2^13 bytes) and we also want to divide the index
bits roughly equally.
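To make the 12 / 11 / 13 split concrete, the sketch below pulls the three fields out of a 36-bit virtual address with shifts and masks. The type `SplitAddress` and the function `split_address` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths taken from the split discussed in the text. */
enum { OUTER_BITS = 12, INNER_BITS = 11, OFFSET_BITS = 13 };

typedef struct {
    uint32_t outer;   /* index into the outer page table */
    uint32_t inner;   /* index into one inner page table */
    uint32_t offset;  /* byte offset within the page */
} SplitAddress;

/* Decompose a 36-bit virtual address into its three fields. */
SplitAddress split_address(uint64_t vaddr)
{
    SplitAddress s;
    assert(vaddr < ((uint64_t)1 << (OUTER_BITS + INNER_BITS + OFFSET_BITS)));
    s.offset = (uint32_t)(vaddr & (((uint64_t)1 << OFFSET_BITS) - 1));
    s.inner  = (uint32_t)((vaddr >> OFFSET_BITS) &
                          (((uint64_t)1 << INNER_BITS) - 1));
    s.outer  = (uint32_t)(vaddr >> (OFFSET_BITS + INNER_BITS));
    return s;
}
```

The inner field has 2^11 possible values, which at 4 bytes per entry is exactly one 2^13-byte page per inner table.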
Since the process' size is 8GB = 2^33 B, I assume what this means is that the total size of all the distinct pages that the process accesses is 2^33 B. Hence, this process accesses 2^33 / 2^13 = 2^20 pages. The bottom level of the page table then holds 2^20 references. We know the size of each bottom level chunk of the page table is 2^11 entries. So we need 2^20 / 2^11 = 2^9 of those bottom level chunks.
The total size of the page table is then:
//size of the outer page table  //total size of the inner pages
1 * 2^12 * 4  +  2^9 * 2^11 * 4  =  2^14 + 2^22  =  2^20 * (2^-6 + 4)  ~ 4 MB
3 Level Paging
For 3 level paging we can divide up the address as follows:
8  8  7  13
Again using the same reasoning as above, we need 2^20 / 2^7 = 2^13 level 3 page table chunks. Each level 2 page table chunk references 2^8 level 3 page table chunks, so we need 2^13 / 2^8 = 2^5 level 2 tables. And, of course, one level 1 table.
The total size of the page table is then:
//size of the outer page table  //total size of the level 2 tables  //total size of innermost tables
1 * 2^8 * 4  +  2^5 * 2^8 * 4  +  2^13 * 2^7 * 4  =  2^10 + 2^15 + 2^22  ~ 4 MB
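The two-level and three-level totals follow the same pattern, so they can be reproduced with one small routine. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the 2^20 pages the process touches are assumed to be contiguous (so lower-level tables are densely filled), and there is exactly one top-level table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of page-table memory needed to map `pages` virtual pages.
 * index_bits[0] is the top level, index_bits[levels-1] the innermost.
 * Assumption: mapped pages are contiguous, so tables are densely packed. */
uint64_t page_table_bytes(const unsigned *index_bits, size_t levels,
                          uint64_t pages, uint64_t pte_bytes)
{
    uint64_t total = 0;
    uint64_t below = pages;  /* items the current level must reference */

    assert(levels > 0);
    for (size_t l = levels; l-- > 0; ) {
        assert(index_bits[l] < 64);
        uint64_t entries = (uint64_t)1 << index_bits[l];
        uint64_t tables = (below + entries - 1) / entries;  /* ceiling */
        if (l == 0) {
            assert(tables <= 1);  /* everything must fit under one root */
            tables = 1;           /* the root table always exists in full */
        }
        total += tables * entries * pte_bytes;
        below = tables;
    }
    return total;
}

/* The 12 / 11 split from the text: 2^20 mapped pages, 4-byte entries. */
uint64_t two_level_example(void)
{
    static const unsigned bits[] = {12, 11};
    return page_table_bytes(bits, 2, (uint64_t)1 << 20, 4);
}

/* The 8 / 8 / 7 split from the text, same process and entry size. */
uint64_t three_level_example(void)
{
    static const unsigned bits[] = {8, 8, 7};
    return page_table_bytes(bits, 3, (uint64_t)1 << 20, 4);
}
```

Both examples come out slightly above 2^22 bytes, because the innermost tables dominate; the upper levels add only 2^14 bytes in the two-level case and 2^10 + 2^15 bytes in the three-level case.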
8-bit 4-bit 8-bit 12-bit
We use a 3-level page table, such that the first 8 bits are for the first level and so on. Physical addresses are 44 bits and there are 4 protection bits per page. Answer the following questions, showing all the steps you take to reach the answer. A simple number will not receive any credit.
Solution:
Since physical addresses are 44 bits and page size is 4K, the page frame number occupies 32 bits. Taking the 4 protection bits into account, each entry of the level-3 page table takes (32+4) = 36 bits. Rounding up to make entries byte (word) aligned would make each entry consume 40 (64) bits or 5 (8) bytes. For a 256-entry table, we need 1280 (2048) bytes.
The top-level page table should not assume that 2nd-level page tables are page-aligned. So, we store full physical addresses there. Fortunately, we do not need control bits. So, each entry is at least 44 bits (6 bytes for byte-aligned, 8 bytes for word-aligned). Each top-level page table is therefore 256 * 6 = 1536 bytes (256 * 8 = 2048 bytes).
Trying to take advantage of the 256-entry alignment to reduce entry size is probably not worth the trouble. Doing so would be complex; you would need to write a new memory allocator that guarantees such alignment. Further, we cannot quite fit a table into a 1024-byte aligned region (44 - 10 = 34 bits per address, which would require more than 4 bytes per entry), and rounding the size up to the next power of 2 would not save us any size over just storing pointers and using the regular allocator.
Similarly, each entry in the 2nd-level page table is a 44-bit physical pointer, 6 bytes (8 bytes) when aligned to byte (word) alignment. A 16-entry table is therefore 96 (128) bytes. So the space required is 1536 (2048) bytes for the top-level page table + 96 (128) bytes for one second-level page table + 1280 (2048) bytes for one third-level page table = 2912 (4224) bytes. Since the process can fit exactly into 16 pages, there is no memory wasted by internal fragmentation.
So the space required is 1536 (2048) bytes for the top-level page table + 3 * 96 (3 * 128) bytes for 3 second-level page tables + 3 * 1280 (3 * 2048) bytes for 3 third-level page tables = 5664 (8576) bytes.
As the code, data, and stack segments of the process fit exactly into 12, 150, and 16 pages respectively, there is no memory wasted by internal fragmentation.
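The entry sizes and totals above can be reproduced with the following sketch. The function names are hypothetical, and "word aligned" is taken to mean an 8-byte word, which is an assumption consistent with the 64-bit figure quoted in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Round a bit count up to whole bytes, then up to a multiple of align_bytes. */
uint64_t entry_bytes(unsigned bits, uint64_t align_bytes)
{
    uint64_t bytes = (bits + 7u) / 8u;
    assert(align_bytes > 0);
    return (bytes + align_bytes - 1) / align_bytes * align_bytes;
}

/* Total page-table bytes for the 8 / 4 / 8 / 12 layout in the text:
 * one 256-entry top-level table, `second_tables` 16-entry tables, and
 * `third_tables` 256-entry tables.
 * align_bytes = 1 gives byte alignment; 8 is the assumed word size. */
uint64_t three_level_bytes(uint64_t second_tables, uint64_t third_tables,
                           uint64_t align_bytes)
{
    const unsigned phys_bits = 44, offset_bits = 12, prot_bits = 4;
    /* Upper levels store full physical pointers, no control bits. */
    uint64_t ptr_entry = entry_bytes(phys_bits, align_bytes);
    /* Level-3 entries store the frame number plus protection bits. */
    uint64_t pte_entry = entry_bytes(phys_bits - offset_bits + prot_bits,
                                     align_bytes);

    return 256 * ptr_entry
         + second_tables * 16 * ptr_entry
         + third_tables * 256 * pte_entry;
}
```

Calling `three_level_bytes(1, 1, 1)` and `three_level_bytes(3, 3, 1)` corresponds to the byte-aligned totals for the two cases above; passing 8 as the last argument gives the word-aligned figures in parentheses.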
Solution:
Need solution.
Solution:
Need solution.
int a[1024][1024], b[1024][1024], c[1024][1024];

void multiply(void)
{
    unsigned i, j, k;

    for (i = 0; i < 1024; i++)
        for (j = 0; j < 1024; j++)
            for (k = 0; k < 1024; k++)
                c[i][j] += a[i][k] * b[k][j];
}

Assume that the binary for executing this function fits in one page, and the stack also fits in one page. Assume further that an integer requires 4 bytes for storage. Compute the number of TLB misses if the page size is 4096 bytes and the TLB has 8 entries with LRU replacement.
Solution:
1024*(2+1024*1024) = 1073743872
The binary and the stack
each fit in one page, thus each takes one entry in the TLB. While the function
is running, it is accessing the binary page and the stack page all the time. So
the two TLB entries for these two pages would reside in the TLB all the time
and the data can only take the remaining 6 TLB entries.
We assume the two entries are already in TLB when the function begins to run. Then we need only consider those data pages.
Since an integer requires 4 bytes for storage and the page size is 4096 bytes, each array requires 1024 pages. Suppose each row of an array is stored in one page. Then these pages can be represented as a[0..1023], b[0..1023], c[0..1023]: Page a[0] contains the elements a[0][0..1023], page a[1] contains the elements a[1][0..1023], etc.
For a fixed value of i, say 0, the function loops over j and k, and we have the following reference string (one row per value of j, 1024 rows in total):
a[0], b[0], c[0], a[0], b[1], c[0], ... a[0], b[1023], c[0]
...
a[0], b[0], c[0], a[0], b[1], c[0], ... a[0], b[1023], c[0]
For this reference string, a[0] and c[0] will contribute two TLB misses. Since a[0] and c[0] each will be accessed every four memory references, the two pages will not be replaced by the LRU algorithm. Each page in b[0..1023], on the other hand, will incur one TLB miss every time it is accessed, because the 1024 pages of b cycle through only the four remaining data entries. So the number of TLB misses for the second inner loop (fixed i) is 2 + 1024*1024 = 1048578, and over all 1024 values of i the total is 1024*(2+1024*1024) = 1073743872.
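The counting argument can be sanity-checked with a small LRU simulation. The sketch below is a scaled model under stated assumptions, not the original solution's method. Each row of an n-by-n matrix is treated as one page, the binary and stack pages are assumed to stay resident and are not simulated, and only the remaining data entries (6 in this problem) are modeled. All names (`Tlb`, `tlb_access`, `count_tlb_misses`, `closed_form_misses`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TLB 64

typedef struct {
    long page[MAX_TLB];                 /* cached page ids, -1 = empty */
    unsigned long long stamp[MAX_TLB];  /* time of last use, 0 = never */
    size_t entries;
    unsigned long long clock;
    unsigned long long misses;
} Tlb;

static void tlb_init(Tlb *t, size_t entries)
{
    assert(entries > 0 && entries <= MAX_TLB);
    t->entries = entries;
    t->clock = 0;
    t->misses = 0;
    for (size_t s = 0; s < entries; s++) {
        t->page[s] = -1;
        t->stamp[s] = 0;
    }
}

/* Touch one page; on a miss, replace the least recently used entry. */
static void tlb_access(Tlb *t, long page)
{
    size_t victim = 0;

    t->clock++;
    for (size_t s = 0; s < t->entries; s++) {
        if (t->page[s] == page) {       /* hit: refresh the timestamp */
            t->stamp[s] = t->clock;
            return;
        }
        if (t->stamp[s] < t->stamp[victim])
            victim = s;                 /* oldest (or empty) slot so far */
    }
    t->misses++;
    t->page[victim] = page;
    t->stamp[victim] = t->clock;
}

/* Data-page TLB misses for the n x n multiply, one matrix row per page.
 * Page ids: a's row r -> r, b's row r -> n + r, c's row r -> 2n + r. */
unsigned long long count_tlb_misses(long n, size_t data_entries)
{
    Tlb t;

    tlb_init(&t, data_entries);
    for (long i = 0; i < n; i++)
        for (long j = 0; j < n; j++)
            for (long k = 0; k < n; k++) {
                tlb_access(&t, i);          /* a[i][k] */
                tlb_access(&t, n + k);      /* b[k][j] */
                tlb_access(&t, 2 * n + i);  /* c[i][j] */
            }
    return t.misses;
}

/* Closed form from the text: n * (2 + n*n). */
unsigned long long closed_form_misses(unsigned long long n)
{
    return n * (2 + n * n);
}
```

For small sizes such as n = 8 or n = 16 with 6 data entries, the simulated count matches the closed form. Running it with n = 1024 is possible but takes noticeably longer, since it performs about 3 * 2^30 simulated accesses.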
Solution:
Need solution.
Solution: