Lab 2: Instrumentation

The goal of this lab is to give you more practice programming in C and working with the C compiler, to demonstrate another sorting algorithm, and to study the efficiency of sorting algorithms experimentally, comparing the results with the theoretical findings of complexity analysis.

1. Set up

Make a directory for this lab and move into it. Copy the given code from the course directory.

mkdir lab2
cd lab2
cp /homes/tvandrun/Public/cs245/lab2/* .

The given files are

2. Counting comparisons

The sorting algorithms we saw in class are already implemented in sorts.c. The functions also return an int intended to be interpreted as the number of comparisons. However, these functions do not actually return the number of comparisons; they just return 0. Your first task is to modify them so that they count and return the number of comparisons.

Specifically, we're interested in the number of times we compare two data items from the array, that is, the number of times an expression like min > array[j] is evaluated. Expressions like i < n don't count because they don't compare items in the array.
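As an illustration of the counting rule, here is a minimal sketch of a selection sort that counts only element comparisons. The function name countedSelectionSort and its (array, length) parameters are hypothetical and are not taken from sorts.c; adapt the idea to whatever signatures the given code uses.

```c
/* Hypothetical example, not the code in sorts.c: a selection sort that
   returns the number of comparisons between array elements. */
int countedSelectionSort(int *array, int n)
{
    int comparisons = 0;

    for (int i = 0; i < n - 1; i++) {        /* i < n - 1 is not counted */
        int minIndex = i;
        for (int j = i + 1; j < n; j++) {    /* j < n is not counted */
            comparisons++;                   /* counts array[minIndex] > array[j] */
            if (array[minIndex] > array[j])
                minIndex = j;
        }
        /* Swap the smallest remaining item into position i. */
        int tmp = array[i];
        array[i] = array[minIndex];
        array[minIndex] = tmp;
    }
    return comparisons;
}
```

On an array of 5 elements this sketch performs 4 + 3 + 2 + 1 = 10 comparisons regardless of the input order. For sorts whose loop condition combines an index test with an element test (such as j > 0 && array[j - 1] > item), take care to count the element comparison only when it is actually evaluated.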

Use the makefile to compile and the driver to test. To use the driver, give the name of the sorting algorithm and the number of elements in the array, for example

./sDriver selection 15

If you do not give the number of array elements, the default is 10 elements. This will print out the array before and after sorting (if the size is 20 or fewer), report on whether the sorting is correct, and report on the number of comparisons.

3. Shell sort

Your next task is to implement yet another sorting algorithm, called Shell sort. Suppose we have the array

49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84

First consider the items separated by a gap of 7, starting at position 0 (positions 0, 7, and 14, shown in brackets).

[49] 7 83 22 8 45 72 [91] 22 80 53 88 43 29 [14] 35 55 24 37 84

Sort them.

[14] 7 83 22 8 45 72 [49] 22 80 53 88 43 29 [91] 35 55 24 37 84

The items separated by 7 starting at position 1 are already sorted.

14 [7] 83 22 8 45 72 49 [22] 80 53 88 43 29 91 [35] 55 24 37 84

Sort the next bunch.

14 7 [83] 22 8 45 72 49 22 [80] 53 88 43 29 91 35 [55] 24 37 84
14 7 [55] 22 8 45 72 49 22 [80] 53 88 43 29 91 35 [83] 24 37 84

Keep doing that until all the array slices with gap 7 are sorted. The actual sorting can be done using a modified insertion sort. Then, we decrease the gap and repeat the process, say sorting all the slices with gap 3. Finally, sort with gap 1, which is just insertion sort, except that this should be close to the best case for insertion sort because the items by now are nearly sorted.

(It might be tempting to call these sections "shells" and pretend that's where the name of the sort comes from. Actually the algorithm was invented by someone named Donald Shell.)

Implement Shell sort using the stub in sorts.c. You can use your code from insertion sort as a starting point. You'll need to wrap it in two more loops: one to iterate through the various starting points for the current gap, and an outermost one to iterate through the gaps.

In theory, any decreasing sequence can be used for the gaps, but the last gap must be 1. Also, make sure that you use gap 1 only once (or you'll probably end up with an infinite loop) and don't use gap 0.
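The loop structure described above can be sketched as follows. This is one possible shape under stated assumptions, not the stub from sorts.c: the name countedShellSort, the (array, length) parameters, and the fixed gap sequence 7, 3, 1 (borrowed from the worked example) are all illustrative choices.

```c
/* Hypothetical sketch of Shell sort that counts element comparisons.
   Assumption: the fixed gap sequence 7, 3, 1 from the worked example. */
int countedShellSort(int *array, int n)
{
    static const int gaps[] = {7, 3, 1};     /* decreasing, ends in 1, no 0 */
    const int numGaps = 3;
    int comparisons = 0;

    for (int g = 0; g < numGaps; g++) {                  /* outermost: the gaps */
        int gap = gaps[g];
        for (int start = 0; start < gap; start++) {      /* starting points */
            /* Insertion sort on start, start + gap, start + 2 * gap, ... */
            for (int i = start + gap; i < n; i += gap) {
                int item = array[i];
                int j = i;
                while (j >= gap) {           /* index test: not counted */
                    comparisons++;           /* counts array[j - gap] > item */
                    if (array[j - gap] <= item)
                        break;
                    array[j] = array[j - gap];
                    j -= gap;
                }
                array[j] = item;
            }
        }
    }
    return comparisons;
}
```

Because the gaps come from a fixed table, gap 1 is used exactly once and gap 0 never appears. If you compute the gaps instead (for example by repeated division), check both conditions explicitly.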

Your implementation should also count comparisons.

Compile and test. Make sure you test not only on arrays of size 10 but also on large arrays.

4. Experiments

Read all of the experimental questions below. Then pick two of them to experiment on (if you have time left over, then pick a third, just for fun). Write programs to conduct experiments to answer the questions. Write programs that actually automate the experiment. For example, if you decided to run selection sort on 10 arrays for each size 10, 50, 100, 500, 1000, and 5000, you might write something like

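     int i, j;
     int* array;
     int sizes[] = {10, 50, 100, 500, 1000, 5000};   /* an array, so it needs [] */
     for (i = 0; i < 6; i++)
          for (j = 0; j < 10; j++)
          {
               array = randomArray(sizes[i]);
               /* assumes the sort takes the array and its length; match sorts.c */
               selectionSort(array, sizes[i]);
          }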

Use your program also to do things like calculate averages and high/low and generate tables, where appropriate.
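For the averages and high/low values, a small helper along these lines may be useful. It is only a sketch: the Summary type and the functions summarize and printRow are hypothetical names, and the helper assumes you have already collected one measurement per trial (for example, comparison counts) into an int array with at least one entry.

```c
#include <stdio.h>

/* Hypothetical helper type: low, high, and average of a set of trials. */
typedef struct {
    int low;
    int high;
    double average;
} Summary;

/* Summarizes count measurements; assumes count >= 1. */
Summary summarize(const int *values, int count)
{
    Summary s = { values[0], values[0], 0.0 };
    long total = 0;

    for (int i = 0; i < count; i++) {
        if (values[i] < s.low)
            s.low = values[i];
        if (values[i] > s.high)
            s.high = values[i];
        total += values[i];
    }
    s.average = (double) total / count;
    return s;
}

/* Prints one table row: array size, low, high, average. */
void printRow(int size, Summary s)
{
    printf("%8d %12d %12d %14.2f\n", size, s.low, s.high, s.average);
}
```

Calling summarize once per array size and printRow on each result produces a plain-text table that can be pasted into your write-up or loaded into a plotting tool.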

  1. Running time vs. comparisons. How good a predictor of running time is the number of comparisons for insertion sort? Do the number of comparisons and the running time increase with size at equivalent rates? At a given size, are the number of comparisons and the running time correlated?

  2. Theory vs. experiments. In theory, selection sort (for example) should run in O(n^2) time. Is this the case experimentally? Pick a sorting algorithm and compare how the running time increases with the input size with the rate predicted by the theory.

  3. Best case vs. worst case. The version of bubble sort that monitors whether or not a change has been made differs in complexity between best case and worst case. Is there much variance experimentally? (Pick one algorithm and one size; time the algorithm on many random arrays of that size.)

  4. Information vs. noise. If you sort the same array several times, is there any difference in the running time? Pick one sort and experiment. Be careful how you repeat this on the same array; you should first generate a random one, and then make a copy before each sort, so you'll always be sorting the same original array.
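Several of these questions need a running time, and the last one needs a fresh copy of the same array for every run. One way to get both is sketched below using clock() and memcpy() from the standard library. The names timeSortOnCopy and standInInsertionSort are hypothetical, and the sketch assumes the sorts have the shape int sort(int *array, int n); adjust it to the real signatures in sorts.c.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Stand-in sort so the sketch compiles on its own. Assumption: the sorts
   in sorts.c take (array, length) and return a comparison count. */
int standInInsertionSort(int *array, int n)
{
    int comparisons = 0;
    for (int i = 1; i < n; i++) {
        int item = array[i];
        int j = i;
        while (j > 0) {
            comparisons++;                   /* counts array[j - 1] > item */
            if (array[j - 1] <= item)
                break;
            array[j] = array[j - 1];
            j--;
        }
        array[j] = item;
    }
    return comparisons;
}

/* Sorts a private copy of original (n > 0) and returns the CPU time in
   seconds, or -1.0 if the copy cannot be allocated. The original array
   is left untouched, so every call sorts the same input. */
double timeSortOnCopy(const int *original, int n, int (*sort)(int *, int))
{
    int *copy = malloc((size_t) n * sizeof *copy);
    if (copy == NULL)
        return -1.0;
    memcpy(copy, original, (size_t) n * sizeof *copy);

    clock_t start = clock();                 /* copying is outside the timed region */
    sort(copy, n);
    clock_t end = clock();

    free(copy);
    return (double) (end - start) / CLOCKS_PER_SEC;
}
```

Note that clock() has limited resolution, so sorting a small array may report a time of 0. For small sizes, consider timing many repetitions together and dividing by the number of repetitions.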

5. To turn in

Instead of a hard-copy turn in, the TA will log into the lab account and grade your code and results there.

For each experiment, write a brief (paragraph-sized) summary of your methodology, results, and conclusions. Include a table and/or (if you think it illustrates the case for your conclusions) a graph. Leave these as appropriately-named files in the lab 2 directory.


Thomas VanDrunen
Last modified: Thu Sep 1 11:09:27 CDT 2011