CS 307 Assignment 6, Algorithm Analysis
Programming Assignment 6: This is a pair assignment. You may work with one other person on this assignment using the pair programming technique. Review this paper on pair programming. You are not required to work in a pair on the assignment. If you begin working with one partner and do not wish to finish the assignment with that partner you must complete the assignment individually. If you work with a partner the intent is that work together, at the same computer, on the assignment. One person "drives" (does the typing and explains what they are doing) and the other person "navigates" (watches and asks questions when something is unclear). You should not partition the work, work on your own, and then put things together. You may not acquire, from any source (e.g., another student or student pair or an internet site), a partial or complete solution to a problem or project that has been assigned. You may not show another student or student pair your solution to an assignment. You may not have another person (current student, former student, tutor, friend, anyone) “walk you through” how to solve an assignment. Review the class policy on collaboration from the syllabus.
If you work with a partner you will turn in one version of your code. Pick which account to submit the code to.
If you are working with a partner and want to use slip days you must both have the required number of slip days and both students use slip days. If the assignment is turned in 1 day late each student in a pair must have at least 1 slip day and it costs each student in the pair 1 slip day.
The purposes of this assignment are to
Summary: This assignment has very little programming and is more like a math assignment. You will analyze small sections of code to determine Big O and run experiments on code fragments to compare actual run times to predicted run times. When stating Big O use the most restrictive correct function. (Recall most algorithms are O(NN) according to the formal definition of Big O.)
Provided Files:
| File | Responsibility | |
| Questions | AlgorithmAnalysis.txt. Answer the questions in this file. | Provided by me. |
| Source code | SampleAlgorithms.java. Contains a main method and various methods. You must call each method with various values as described in the assignment handout | Provided by me and you. |
| Utility class | FileToStringConverter.java. A class that allows a user to pick a file from the system to analyze. | Provided by me. |
| Utility class | Stopwatch.java. A class for calculating elapsed time when running other code. | Provided by me. |
| Submission | A6.txt. A text file (no word processor or rtf formats. Most word processors allow you to save a file as a text file.) with your answers to all the questions in the assignment. If working with a partner turn in one file to either partners account. | Provided by me and you. (Okay, mostly you.) |
Instructions for completing the assignment:
1. Answer all of the questions AlgorithmAnalysis.txt. There are 16 questions about various code segments and expected run times. Answer all the questions and place your answers in A6.txt.
2. SampleAlgorithms.java has 11 methods named method1 through method11. Here is what to with these methods:
a. Determine the T(N) for methods method1, method2, method3, and method4. Place your answers in your A6.txt file. See format below.
b. Determine the Big O for all 11 methods. Place your answers in your A6.txt file. See format below.
c. You will run each method with an initial value of N. (amount of data, in most cases on the assignment the magnitude of an integer.) Record the time it takes for the method to complete for that initial value of N and repeat this 10 times. Use the Stopwatch class to do this. You should try to find a value for N that gives an answer greater than 0.005 seconds. (5 milliseconds) (This may not be possible due to the speed of some computers. If you cannot obtain a time greater than 5 milliseconds it is permissible to use a smaller time. Since you have to double N three times the largest valid initial value of N is
268435455.)
Record the average time it took for the method to complete with that initial value of N based on 10 runs. (Use a loop to automate the 10 runs.)
Now double N and repeat. Run the method 10 times with the new value. Again, record the average time.Do this (doubling the amount of data or magnitude of your integer) two more times and run the method 10 times recording the average time.
For example assume we run method5 with n = 1000 . We run method5 10 times with n = 1000. We use the Stopwatch to get the total time it takes for the 10 calls to method5 with n = 1000 to complete. Compute the average time for method5 to complete with n = 1000 (total time / 10). Now repeat the experiement with n = 2000, n = 4000, and n = 8,000. For each method list the various values for N and the average time.
d. For each method calculate the factor the running time changed from one value of N to the next. (Examples of factor changes are shown in the Big O slides, slide numbers 42 and 43.) How did the actual run times obtained when doubling N compare with the predicted run time for doubling N based on the Big O of the code? For example if a method is O(N) then you would expect the run time to double when N is doubled. Did this happen or not?
Here is an example of the format you should use to show the data for each method. (First 3 digits of times are okay. You can truncate what Stopwatch gives you.)
methodX:
T(N) = 3N^3 + 4N + 3 (T(N) on methods 1 - 4 only)
Order = O(N^3)
Timing Data:
N Average Time Factor
200 0.0101 seconds ----
400 0.0804 seconds 7.96
800 0.633 seconds 7.87
1600 5.03 seconds 7.95
With an O(N^3) algorithm when the amount of data doubles the expected running times goes up by a factor of 8. The data from the experiment meets this expectation as the factors are very close to 8 each time the amount of data is doubled. (If the experimental results do not meet expectations state that. What could cause the discrepancy?)There is a sample nested loop in the main method for running the test. You will need to adjust the initial value of n and change (or copy and paste and change) the method name.
3. There is a method in
SampleAlgorithms.java named processFiles. The method is currently
commented out. The method processFiles performs the following actions:
String.String to an array of words and prints out the time it
takes to perform this action.ArrayList and prints out the time it takes to perform this action.ArrayList and prints out the time it takes to perform this action.TreeSet and prints
out the time it takes to perform this action.HashSet and prints
out the time it takes to perform this action.Before running processFiles download 4 texts from
Project Gutenberg. The
texts should be various sizes. Try to locate texts that are
significantly (at least 50%) larger or smaller than each other text you
pick.
The size of the texts are listed in bytes on the download page for each
text. If interested read
the Wikipedia
entry on Project Gutenberg.
Run processFiles on the four different texts you chose. For each text
record the following information:
TreeSet or HashSet number of words.Based on your results record what you think the Big O of the following actions are.
ArrayListArrayListTreeSetHashSet 4. The final part of SampleAlgorithms is a method that finds the
minimum in array as shown in class. One interesting question is how many
times does the method find a new minimum while searching as array of length N. The
best case is 1, the worst case is N, and we argued in class that the
average case would be approximately log N. (base 2) Does the numMin method and the
calls to it with various size lists support this argument? Why or why
not?
Submission: Place the answers to all these questions in A6.txt. If working with a partner turn in one file to either partners' account. You are not turning in any code, just your report. Turn in a text file named A6.txt that contains all the answers to the questions in this assignment. Use the turnin program to turn the file in.
Checklist: Did you remember to: