Lab 3: Instrumentation

The goal of this lab is to study the efficiency of sorting algorithms experimentally.

1. Set up

Before starting on the formal part of this lab, we'll set up for it by practicing what we learned yesterday about Subversion (svn). First of all, make a new directory for this lab.

cd cs245
mkdir lab3
cd lab3

Since you as a class will be sharing code during this lab, I have set up a Subversion repository for you. We will be using the instrumentation module (or project) in that repository. At the beginning of lab, the module contains only the files that you were given at the beginning of lab last week (i.e., the driver, the utility class, and the array sorters). Check out that module.

svn checkout file:///cslab/class/cs245/repos/instrumentation

That long string at the end says that the repository you're reading from is in a local directory (as opposed to a remote location that you're getting across the internet), and it names the directory for svn to read. If you do an ls now, you should see the directory instrumentation. cd into it, and ls.

cd instrumentation
ls

You should see

Bubble.java     IntListPair.java  Node.java       SelectionList.java  SortList.java
Insertion.java  Merge.java        Selection.java  SortArray.java      SortUtil.java

If you see more than this, then another group is reading this lab description faster than you are. :)

Now to share code, you want to add your sort from last week. Copy it from your earlier lab directory.

 
cp ../../lab2/SomeListSort.java .

(If the partner logged in is different from who was logged in last time, you can get your sort from the public directory to which you copied it at the end of lab last week, provided you followed those directions correctly: type "cp /cslab/pubshare/SomeListSort.java ." instead.)

Then "add" that file, so svn can start tracking it.

svn add SomeListSort.java

Then commit your new version. Don't forget to give a message describing your change using the "-m" flag.

svn commit -m "Added SomeListSort.java to the project."

If svn complains about you not having the correct version, it's because another group has committed their code since you checked out your copy. Just update

svn update

And try to commit again.

2. Introduction

In engineering, instrumentation refers to the methods and techniques of measuring and controlling physical systems. Software engineering has its own methods for instrumenting applications and other pieces of software. As was mentioned yesterday, the two main attributes that software developers measure are time and space (i.e., computer memory usage).

The main use of instrumentation in software development is in optimizing software for which we already have a working version (so this would come after design, implementation, and testing in the software development process). Usually software developers are interested in finding the parts of the application that would most benefit from optimization. To do this, parts of the program are monitored to keep track of how much time is spent (in a given method, for example), or how much memory is consumed (by a given class or group of classes).

Yesterday we measured the number of comparisons that our methods made. This time we will use a more concrete measure: running time in milliseconds. While this might seem like the definitive measurement, some care needs to be taken to ensure that the data collected is useful, something you will need to be thinking about as you conduct these experiments.
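
As a minimal sketch of what measuring running time in milliseconds can look like, consider the following. The class name TimingSketch and the method timeSortMillis are hypothetical, and java.util.Arrays.sort is only a stand-in for whichever lab sorter you want to measure. It assumes a Java version that provides System.nanoTime(); System.currentTimeMillis() would also work, though its resolution may be too coarse for small arrays.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class, not part of the lab's repository.
class TimingSketch {

    // Returns the elapsed time, in milliseconds, of sorting the given array.
    // Arrays.sort is a stand-in; substitute the sorter you are measuring.
    static double timeSortMillis(int[] array) {
        long start = System.nanoTime();
        Arrays.sort(array);
        long end = System.nanoTime();
        return (end - start) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Fixed seed so the same input is generated on every run.
        int[] data = new Random(42).ints(5000, 0, 100_000).toArray();
        System.out.println("Sorted 5000 ints in " + timeSortMillis(data) + " ms");
    }
}
```

Only the call to the sorter sits between the two clock readings; creating the array happens outside the timed region so that it does not inflate the measurement.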

3. The tools

You have the following tools/pieces at your disposal for use in the experiments assigned below.

4. Experiments

Write programs to conduct experiments to answer the questions below. Don't just do experiments "by hand" like we did last week. Write programs that actually automate the experiment. For example, if you decided to run selection sort on 10 arrays for each size 10, 50, 100, 500, 1000, and 5000, you might write something like

     int[] sizes = {10, 50, 100, 500, 1000, 5000};
     for (int i = 0; i < sizes.length; i++) {
          for (int j = 0; j < 10; j++) {
               int[] array = SortUtil.createRandomArray(sizes[i]);
               SelectionArray.sort(array);
          }
     }

Use your program also to do things like calculate averages and high/low and generate tables, where appropriate.
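
One possible way to compute the average and high/low and print a table row is sketched below. The class TimingStats and its methods summarize and formatRow are hypothetical names, and the timings in main are made up purely for illustration.

```java
// Hypothetical helper class, not part of the lab's repository.
class TimingStats {

    // Returns {average, low, high} for a non-empty array of timings.
    static double[] summarize(double[] times) {
        double sum = 0.0;
        double low = times[0];
        double high = times[0];
        for (double t : times) {
            sum += t;
            low = Math.min(low, t);
            high = Math.max(high, t);
        }
        return new double[] { sum / times.length, low, high };
    }

    // Formats one table row: size followed by average, low, and high.
    static String formatRow(int size, double[] stats) {
        return String.format("%8d %10.3f %10.3f %10.3f",
                             size, stats[0], stats[1], stats[2]);
    }

    public static void main(String[] args) {
        // Made-up timings in milliseconds, for illustration only.
        double[] times = { 2.0, 4.0, 3.0 };
        System.out.println(String.format("%8s %10s %10s %10s",
                                         "size", "avg(ms)", "low(ms)", "high(ms)"));
        System.out.println(formatRow(1000, summarize(times)));
    }
}
```

In an actual experiment you would fill the times array inside the inner loop and call summarize once per size.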

  1. Running time vs. comparisons. How good a predictor of running time is the number of comparisons for insertion sort? Do the number of comparisons and the running time increase with size at equivalent rates? At a given size, are the number of comparisons and the running time correlated? (Pick either arrays or lists to work with on this one.)

  2. Arrays vs. lists. Pick a sorting algorithm. Which takes longer to sort, arrays or lists? Does size make a difference, or is one data structure faster for both big and small amounts of data?

  3. Best case vs. worst case. The version of Bubble sort that monitors whether or not a change has been made differs in complexity between best case and worst case. Is there much variance experimentally? (Experiment on a single size; choose either arrays or lists to work with.)

  4. Information vs. noise. If you sort the same array several times, is there any difference in the running time? (Pick a size, a sort, and either lists or arrays, the opposite of what you chose in question 3.) Be careful how you repeat this on the same array/list; you should first generate a random one, and then make a copy before each sort, so you'll always be sorting the same original array. What would account for the differences in running time?
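
For the last question, the copy-before-each-sort idea might be sketched as follows for arrays. RepeatSketch and timeRepeatedSorts are hypothetical names, and java.util.Arrays.sort again stands in for a lab sorter; a list version would need its own way of copying the list.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class, not part of the lab's repository.
class RepeatSketch {

    // Sorts a fresh copy of 'original' on each trial and returns the elapsed
    // time of each trial in milliseconds. 'original' itself is never modified.
    static double[] timeRepeatedSorts(int[] original, int trials) {
        double[] times = new double[trials];
        for (int t = 0; t < trials; t++) {
            int[] copy = original.clone();   // copy is made outside the timed region
            long start = System.nanoTime();
            Arrays.sort(copy);               // stand-in for a lab sorter
            times[t] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return times;
    }

    public static void main(String[] args) {
        int[] original = new Random(7).ints(5000, 0, 100_000).toArray();
        for (double ms : timeRepeatedSorts(original, 5)) {
            System.out.println(ms + " ms");
        }
    }
}
```

Because every trial sorts identical data, any spread in the printed times comes from something other than the input.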

5. To turn in

For each experiment, turn in your code and a brief (paragraph-sized) write-up of your methodology, results, and conclusions. Include a table and/or (if you think it illustrates the case for your conclusions) a graph.


Thomas VanDrunen
Last modified: Thu Sep 13 11:22:02 CDT 2007