The goal of this lab is to study the efficiency of sorting algorithms experimentally and to compare the results with the theoretical findings of complexity analysis.
In engineering, instrumentation refers to the methods and techniques of measuring and controlling physical systems. Software engineering has its own methods for instrumenting applications and other pieces of software. As was mentioned last week, the two main attributes that software developers measure are time and space (i.e., computer memory usage).
The main use of instrumentation in software development is in optimizing software for which we already have a working version (so this comes after design, implementation, and testing in the software development process). Usually software developers are interested in finding the parts of the application that would benefit most from optimization. To do this, parts of the program are monitored to keep track of how much time is spent (in a given method, for example) or how much memory is consumed (by a given class or group of classes).
Suppose an application contains a method a whose algorithm runs in O(n^3) time and a method b whose algorithm runs in O(n) time. From a theoretical analysis, it may seem like method a should be subject to closer inspection. However, after running experiments, a developer might discover that 80% of the application's running time is spent in b (because it is called frequently with large amounts of data) but only 2% in a (because it is called rarely, and always with small amounts of data). These experimental results suggest that the developer's efforts would be much better spent improving b's complexity even a little bit than improving a's complexity a lot.
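The scenario above can be imitated with a small amount of hand-written instrumentation. The following is only a sketch under stated assumptions: the class name ProfileSketch, the bodies of a and b, and the call counts are all hypothetical, and System.nanoTime() is used instead of a millisecond clock so that very short calls still register.

```java
class ProfileSketch {
    // Running totals of time spent inside each method, in nanoseconds.
    static long nanosInA = 0;
    static long nanosInB = 0;

    // Hypothetical cubic-time method: performs n * n * n steps.
    static long a(int n) {
        long start = System.nanoTime();
        long steps = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < n; k++)
                    steps++;
        nanosInA += System.nanoTime() - start;
        return steps;
    }

    // Hypothetical linear-time method: sums 0 + 1 + ... + (n - 1).
    static long b(int n) {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < n; i++)
            sum += i;
        nanosInB += System.nanoTime() - start;
        return sum;
    }

    public static void main(String[] args) {
        a(20);                          // called rarely, with a small input
        for (int call = 0; call < 200; call++)
            b(1_000_000);               // called often, with a large input

        // Share of the measured time spent in each method.
        double total = nanosInA + nanosInB;
        System.out.printf("a: %.1f%% of measured time%n", 100.0 * nanosInA / total);
        System.out.printf("b: %.1f%% of measured time%n", 100.0 * nanosInB / total);
    }
}
```

The exact percentages depend on the machine and the JVM, so none are quoted here. The point is only that the totals come from measurement rather than from the big-O bounds.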
Similarly, there are tools that monitor which classes are instantiated most often and which take up the most memory in total, in order to identify the classes that would benefit most from being streamlined.
Last week we measured the number of comparisons that our methods made. This time we will use a more concrete measure: actual running time in milliseconds. While this might seem like the definitive measurement, some care needs to be taken to ensure that the data collected is useful, something you will need to think about as you conduct these experiments.
Make a directory for this lab. Copy the lot of sorting algorithms and helper programs from the course directory:
cp /cslab/class/cs245/lab2/* .
You have the following tools/pieces at your disposal for use in the experiments assigned below.
SortUtil, a class that contains public methods createRandomArray(), createRandomList(), displayArray(), and displayList().
System.currentTimeMillis(), which reads from the system clock to determine the current time (represented by the number of milliseconds that have elapsed since midnight, January 1, 1970 UTC). By calling this method before and after a piece of code, you can determine the number of milliseconds that elapsed while that code ran. The return type of the method is long, not int, so keep that in mind when storing results in variables. For more information on this method, see the Java API reference.
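The before-and-after pattern described above might look roughly like the sketch below. To keep it self-contained, it does not use the lab's own classes: randomArray() is a hypothetical stand-in for SortUtil.createRandomArray(), Arrays.sort() stands in for whichever sort is being measured, and the class name TimingSketch is made up for illustration.

```java
import java.util.Arrays;
import java.util.Random;

class TimingSketch {
    // Hypothetical stand-in for SortUtil.createRandomArray().
    static int[] randomArray(int size, long seed) {
        Random rng = new Random(seed);
        int[] array = new int[size];
        for (int i = 0; i < size; i++)
            array[i] = rng.nextInt();
        return array;
    }

    // Sorts the array in place and returns the elapsed wall-clock milliseconds.
    static long timeSort(int[] array) {
        long start = System.currentTimeMillis();  // long, not int
        Arrays.sort(array);                       // stand-in for the sort under test
        long end = System.currentTimeMillis();
        return end - start;
    }

    public static void main(String[] args) {
        int[] array = randomArray(1_000_000, 42L);
        long elapsed = timeSort(array);
        System.out.println("Sorted " + array.length + " ints in " + elapsed + " ms");
    }
}
```

Note that the array is created before the first clock reading, so that only the sort itself falls inside the timed region.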
Write programs to conduct experiments to answer the questions below. Don't just do experiments "by hand" like we did last week. Write programs that actually automate the experiment. For example, if you decided to run selection sort on 10 arrays for each size 10, 50, 100, 500, 1000, and 5000, you might write something like
int[] sizes = {10, 50, 100, 500, 1000, 5000};
for (int i = 0; i < sizes.length; i++)
    for (int j = 0; j < 10; j++) {
        int[] array = SortUtil.createRandomArray(sizes[i]);
        SelectionArray.sort(array);
    }
Use your program also to do things like calculate averages and high/low and generate tables, where appropriate.
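One possible way to have the program itself compute the average, low, and high for each size and print them as a table is sketched below. This is an illustration, not the required design: the class name TrialStatsSketch and its helper methods are hypothetical, random data comes from java.util.Random rather than SortUtil, and Arrays.sort() stands in for the sort being studied.

```java
import java.util.Arrays;
import java.util.Random;

class TrialStatsSketch {
    static long min(long[] xs) {
        long m = xs[0];
        for (long x : xs) m = Math.min(m, x);
        return m;
    }

    static long max(long[] xs) {
        long m = xs[0];
        for (long x : xs) m = Math.max(m, x);
        return m;
    }

    static double average(long[] xs) {
        long sum = 0;
        for (long x : xs) sum += x;
        return (double) sum / xs.length;
    }

    // Times one sort of a fresh random array of the given size, in milliseconds.
    static long timeOneSort(int size, Random rng) {
        int[] array = rng.ints(size).toArray();   // stand-in for SortUtil.createRandomArray()
        long start = System.currentTimeMillis();
        Arrays.sort(array);                       // stand-in for the sort under test
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 50, 100, 500, 1000, 5000};
        int trials = 10;
        Random rng = new Random(245L);

        // One table row per array size.
        System.out.printf("%8s %10s %8s %8s%n", "size", "avg (ms)", "low", "high");
        for (int size : sizes) {
            long[] times = new long[trials];
            for (int t = 0; t < trials; t++)
                times[t] = timeOneSort(size, rng);
            System.out.printf("%8d %10.2f %8d %8d%n",
                    size, average(times), min(times), max(times));
        }
    }
}
```

Keeping the statistics in small separate methods means the same helpers can be reused for every experiment, whichever sort is plugged into the timed region.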
For each experiment, turn in your code and a brief (paragraph-sized) write-up of your methodology, results, and conclusions. Include a table and/or (if you think it illustrates the case for your conclusions) a graph.