CS372: Solutions for Homework 4

Problem 1:

Which of the following components is responsible for loading the initial value in the program counter for an application program before it starts running: The loader. The linker groups several modules and resolves external references, but the linker really does not perform the
actual loading of the program itself (except for dynamic linking, which we will study later in the course).

Problem 2:

Given the following program that uses three memory segments in an address space as described in class (code segment, data segment and stack segment):

char a[100];
main(int argc, char ** argv)
{
    int d;
    static double b;
    char * s = "boo", * p;

    p = malloc(300);
    return 0;
}

Identify the segment in which each variable resides and indicate if the variable is private to a thread or shared across threads. Be careful.

The array a, the static variable b, and the string constant "boo" are all in the data segment and are shared across threads.
The arguments argc and argv are in the stack segment and are private to the thread.
The automatic variables d, s, and p are also in the stack segment and are private to the thread.
Note that the variable p itself is in the stack segment (private), but the object it points to is in the data segment which is a shared region  of an address space(that's why the be careful warning). The contents of s consist of the address of the string "boo" which happens to be in the data segment (shared).

Problem 3

A hardware designer argues that there are enough transistors on the chip to provide 1024 integer registers and 512 floating point registers. You have been invited as the operating system guru to give opinion about the new design.
  1. What is the effect of having such a large number of registers on the operating system?
  2. What additional hardware features you would recommend added to the design above.
  3. What happens if the hardware designer also wants to add a 16-station pipeline into the CPU. How would that affect the context switching overhead?
  1. The cost of a context switch or interrupt handling becomes very expensive. Also, the data structures that store the registers as part of a process context become large. This leads high overhead in memory and also may make the implementation of lightweight threads expensive.
  2. A multithreaded architecture would be a good use of the abundance of registers. The idea is to partition the registers into a number of banks, for example, a set of 1024 registers could be divided into 32 banks of 32 registers each. Then, wewould be able to have the register sets of 32 processes without having to save it or restore it each time a context switchoccurs.
  3. Since the pipeline has to be flushed on an interrupt, a deeper pipeline will take time to be flushed increasing latency ofserving the interrupt and overhead. The penalty of a context switch increases.

Problem 4

Given the following piece of code:
    main(int argc, char ** argv)
    { }
How many different copies of the variable c are there? What are their values?

The piece of code shown creates two processes. Therefore, we have a total of three processes, the parent, the first and second
child. Each of these has its own private copy of the variable c. For the parent, the variable c be 20 before the end of the program.
For the first child (the one created in the first program statement), the variable c will contain the value 10 before the end of the
program. For the second child (the one created in the else clause), the variable c will contain the value 15 before the end of the
program.

Problem 5

Given the following piece of code How many processes are created if the above piece of code is run? Hint: It may be easier to solve this problem by induction.

To solve this problem, we compute the number of processes that get created by calling the function forkthem(). This can be given
by the following equation:

n > 0:  T(n) = 2 T(n-1) + 1
n = 0:  T(0) = 0

where T(n) is the number of processes created by the function. To see why this is the case, consider what happens when the
function is called. The first statement calls the system call fork() which creates a child in addition to the caller. Both the caller and the
child then execute a recursive call to forkthem() with an argument set to n-1. Therefore, a call to forkthem() creates one process of
its own, and then is responsible for all the children that will get created by the function with n-1.
The solution to the recurrence equation is 2^n-1((2ra\ ised_to_n) - 1).

Problem 6

What is the output of the following programs (inspect the manual for the system calls if you need more information, but please solve the problem without compiling and running the program).

Program 1:
main()
{
    val = 5;
    if(fork())
        wait(&val);
    val++;
    printf("%d\n", val);
    return val;
}

Program 2:
main()
{
    val = 5;
    if(fork())
        wait(&val);
    else
        exit(val);
    val++;
    printf("%d\n", val);
    return val;
}

In the first program, the parent process creates a child and then waits for the child to exit (through the system call "wait"). The child executes and prints out the value of val, which is "6" after the v++ statement. The child then returns the value of val to the parent, which receives it in the argument to "wait" (& val). The parent then prints out the value of val, which is now 7. Note that the parent and child have seperate copies of the variable "val".

Using similar reasoning, you can see that the parent in program 2 waits for the child to return, and the child exits immediately. In this case only one value gets printed out which is the number 6 (from the parent process.)
 
 

Problem 7

Suppose a program references pages in the following sequence:
ACBDBAEFBFAGEFA
Suppose the computer on which this program is running has 4 pages of physical memory.
  1. Show how LRU-based demand paging would fault pages into the four frames of physical memory.
    Solution:
      ACBDBA EFBFA GEFA
    1 A     +     +   +
    2 C      E     G   
    3  B +    +     E  
    4   D    F  +   +  

  2. Show how OPT (or MIN) based demand paging would fault pages into the four frames of physical memory.
    Solution:
      ACBDBA EFBFA GEFA
    1 A     +     +   +
    2  C      E      +   
    3   B +    +   G    
    4   D    F  +   +  

  3. Show how clock-based demand paging would fault pages into the four frames of physical memory.
    Solution:
      ACBDBA EFBFA GEFA
    1 A11 11 1+1 E11111 0+111
    2  C11111 0 F11+1100 +11
    3    B11+110 0 +1 1 0G1111
    4     D1110000 A111 1+1

    "+" (plus sign) means cache hit
    1/0 means the state of the "used" bit at the end of the specified step
    bold italics is the position of the clock hand at the end of the specified step (this is what will be looked at first the next time the clock hand moves)

Problem 8

Assume that you have a page reference string for a process. Let the page reference string have length p with n distinct page numbers occurring in it. Let m be the number of page frames that are allocated to the process (all the page frames are initially empty). Let n > m.
  1. What is the lower bound on the number of page faults?
  2. What is the upper bound on the number of page faults?
Hint: Your answer is independent of the page replacement scheme that you use.

Solution:
Need solution.

Problem 9

Belady's anomaly: Intuitively, it seems that the more frames the memory has, the fewer page faults a program will get. Surprisingly enough, this is not always true. Belady (1969) discovered an example in which FIFO page replacement causes more faults with four page frames than with three. This strange situation has become known as Belady's anomaly. To illustrate, a program with five virtual pages numbered from 0 to 4 references its pages in the order:
0 1 2 3 0 1 4 0 1 2 3 4
  1. Using FIFO replacement, compute the number of page faults with 3 frames. Repeat for 4 frames.
  2. Compute the number of page faults under LRU, NRU, the clock algorithm, and the optimal algorithm.
Solution:
  1. The number of page faults is 9 with 3 frames, 10 with 4.
  2. The point of the exercise is to see that the performance of LRU, NRU, clock algorithm and optimal all offer better performance with larger number of frames.