Lab 2: Instrumentation

The goal of this lab is to give you more practice programming in C and working with the C compiler, to demonstrate another sorting algorithm, and to study the efficiency of sorting algorithms experimentally, comparing the results with the theoretical findings of complexity analysis.

1. Set up

Make a directory for this lab and move into it. Copy the given code from the course directory.

mkdir lab2
cd lab2
cp /homes/tvandrun/Public/cs245/lab2/* .

As with last time, the given files are

2. Shell sort

Your next task is to implement yet another sorting algorithm, called Shell sort. Suppose we have the array

49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84

First consider the items 7 positions apart, starting at index 0 (that is, indices 0, 7, and 14, holding 49, 91, and 14).

49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84

Sort them.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84

The items 7 positions apart starting at index 1 (indices 1, 8, and 15, holding 7, 22, and 35) are already sorted.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84

Sort the next bunch (indices 2, 9, and 16, holding 83, 80, and 55). The array is shown before and after.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84
14 7 55 22 8 45 72 49 22 80 53 88 43 29 91 35 83 24 37 84

Keep doing that until all the array slices with gap 7 are sorted. The actual sorting can be done using a modified insertion sort. Then, we decrease the gap and repeat the process, say sorting all the slices with gap 3. Finally, sort with gap 1, which is just insertion sort, except that this should be close to the best case for insertion sort because the items by now are nearly sorted.
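The "modified insertion sort" for a single slice might look roughly like the sketch below. The function name sortSlice and its parameters are hypothetical (they are not part of the given files), and the counting convention, one count per element comparison, is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Insertion sort restricted to the slice start, start+gap, start+2*gap, ...
 * Hypothetical helper; returns the number of element comparisons made. */
long sortSlice(int array[], size_t n, size_t start, size_t gap)
{
    long comparisons = 0;
    assert(gap > 0);
    for (size_t i = start + gap; i < n; i += gap) {
        int item = array[i];
        size_t j = i;
        /* Shift larger slice members one gap to the right. */
        while (j >= start + gap) {
            comparisons++;
            if (array[j - gap] <= item)
                break;
            array[j] = array[j - gap];
            j -= gap;
        }
        array[j] = item;
    }
    return comparisons;
}
```

Calling sortSlice(array, 20, 0, 7) on the example array should rearrange indices 0, 7, and 14 into 14, 49, 91, matching the walk-through above.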

(It might be tempting to call these sections "shells" and pretend that's where the name of the sort comes from. Actually the algorithm was invented by someone named Donald Shell.)

Implement Shell sort using the stub in sorts.c. You can use your code from insertion sort as a starting point. You'll need to wrap it in two more loops: one to iterate through the various starting points for the current gap, and an outermost one to iterate through the gaps.

In theory, any decreasing sequence can be used for the gaps, but the last gap must be 1. Also, make sure that you use gap 1 only once (or you'll probably end up with an infinite loop) and don't use gap 0.
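One possible gap sequence, shown here only as an illustration, is to start at half the array size and keep halving. The helper name halvingGaps is hypothetical, and halving is an assumption rather than a required choice; the point of the sketch is that the loop reaches gap 1 exactly once and stops before gap 0.

```c
#include <assert.h>
#include <stddef.h>

/* Fills gaps[] with n/2, n/4, ..., 1 and returns how many gaps were written.
 * Halving is just one possible decreasing sequence. */
size_t halvingGaps(size_t n, size_t gaps[], size_t capacity)
{
    size_t count = 0;
    /* Integer halving passes through 1 exactly once and then reaches 0,
     * which ends the loop, so gap 0 is never used. */
    for (size_t gap = n / 2; gap >= 1; gap /= 2) {
        assert(count < capacity);
        gaps[count++] = gap;
    }
    return count;
}
```

For an array of size 20 this produces the gaps 10, 5, 2, 1. For an array of size 1 it produces no gaps at all, which is harmless because such an array is already sorted.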

Your implementation should also count and return the number of comparisons.

Compile and test. Make sure you test not only on arrays of size 10 but also on large arrays.

3. Merge sort

You may remember the merge sort algorithm from Programming I. In brief the algorithm sorts by

In sorts.c there are two functions. mergeSortR() takes not only an array but also a starting index and a stopping index, and it is to sort the subarray from start (inclusive) to stop (exclusive). Thus you can use it to sort increasingly smaller subarrays. The other function, mergeSort(), is written for you; it starts the process of sorting using the entire array.

Your task is to write mergeSortR(). The difficult part is the merging. Make a second, auxiliary array using blankArray() in the arrayUtil library, big enough for the subarray being worked on by this call of mergeSortR(). Merge the two halves into that array and then copy the values back into the original array.
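The merging step alone could be sketched as follows. This is not the given stub: the name mergeHalves and the mid parameter are hypothetical, and because the signature of blankArray() is not shown here, the sketch allocates the auxiliary array with malloc() instead. It assumes the two halves, from start to mid and from mid to stop, are each already sorted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Merges the sorted ranges [start, mid) and [mid, stop) of array.
 * Returns the number of comparisons, or -1 if allocation fails. */
long mergeHalves(int array[], int start, int mid, int stop)
{
    assert(start <= mid && mid <= stop);
    int length = stop - start;
    long comparisons = 0;
    if (length == 0)
        return 0;

    /* Stand-in for blankArray(): room for just this subarray. */
    int *aux = malloc((size_t)length * sizeof *aux);
    if (aux == NULL)
        return -1;

    int left = start, right = mid, out = 0;
    /* Repeatedly take the smaller front item of the two halves. */
    while (left < mid && right < stop) {
        comparisons++;
        if (array[left] <= array[right])
            aux[out++] = array[left++];
        else
            aux[out++] = array[right++];
    }
    /* One half is exhausted; copy whatever remains of the other. */
    while (left < mid)
        aux[out++] = array[left++];
    while (right < stop)
        aux[out++] = array[right++];

    /* Copy the merged values back into the original array. */
    memcpy(array + start, aux, (size_t)length * sizeof *aux);
    free(aux);
    return comparisons;
}
```

Taking from the left half when the two front items are equal keeps equal items in their original order.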

As with the other sorts, this should count the total number of comparisons.

Compile and test.

4. Experiments

Read all of the experimental questions below. Then pick two of them to experiment on (if you have time left over, then pick a third, just for fun). Write programs to conduct experiments to answer the questions. Write programs that actually automate the experiment. For example, if you decided to run selection sort on 10 arrays for each size 10, 50, 100, 500, 1000, and 5000, you might write something like

     int i, j;
     int* array;
     int sizes[] = {10, 50, 100, 500, 1000, 5000};
     for (i = 0; i < 6; i++)
          for (j = 0; j < 10; j++)
          {
               array = randomArray(sizes[i]);
               selectionSort(array);
          }

Use your program also to do things like calculate averages and high/low and generate tables, where appropriate.
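As a rough illustration of the bookkeeping, the sketch below summarizes a batch of measurements (for example, the comparison counts from the 10 runs at one size). The Summary type and the summarize() function are hypothetical names, not part of the given files.

```c
#include <assert.h>
#include <stddef.h>

/* Average, lowest, and highest of a batch of measurements. */
typedef struct {
    double mean;
    long low;
    long high;
} Summary;

Summary summarize(const long values[], size_t count)
{
    assert(count > 0);
    Summary s = { 0.0, values[0], values[0] };
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += (double)values[i];
        if (values[i] < s.low)
            s.low = values[i];
        if (values[i] > s.high)
            s.high = values[i];
    }
    s.mean = total / (double)count;
    return s;
}
```

You might store each run's result in an array inside the inner loop, call summarize() once per size, and print one table row from the returned fields.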

  1. Running time vs. comparisons. How good a predictor of running time is the number of comparisons for insertion sort? Do the number of comparisons and the running time increase with size at equivalent rates? At a given size, are the number of comparisons and the running time correlated?

  2. Theory vs. experiments. In theory, selection sort (for example) should run in O(n^2) time. Is this the case experimentally? Pick a sorting algorithm and compare how the running time increases with the input size with the rate predicted by the theory.

  3. Best case vs. worst case. The version of bubble sort that monitors whether or not a change has been made differs in complexity between best case and worst case. Is there much variance experimentally? (Pick one algorithm and one size; time the algorithm on many random arrays of that size.)

  4. Information vs. noise. If you sort the same array several times, is there any difference in the running time? Pick one sort and experiment. Be careful how you repeat this on the same array; you should first generate a random one, and then make a copy before each sort, so you'll always be sorting the same original array.
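Several of these questions need a running time, and the last one needs a fresh copy of the same array for each run. One way to do both, sketched under assumptions, is to copy the array, time the sort with the standard clock() function, and discard the copy. The names timeSortOnCopy and standInSort are hypothetical, and the sort signature used here (array plus size) is an assumption that may not match the functions in sorts.c; standInSort simply wraps the library's qsort() so the sketch is complete on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Comparison callback for qsort(). */
static int compareInts(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Stand-in for one of the lab's sorts; the real signatures may differ. */
void standInSort(int *array, size_t n)
{
    qsort(array, n, sizeof *array, compareInts);
}

/* Sorts a fresh copy of original[] and returns the CPU seconds used,
 * or -1.0 if the copy cannot be allocated. original[] is never modified. */
double timeSortOnCopy(const int original[], size_t n,
                      void (*sort)(int *, size_t))
{
    assert(n > 0);
    int *copy = malloc(n * sizeof *copy);
    if (copy == NULL)
        return -1.0;
    memcpy(copy, original, n * sizeof *copy);

    clock_t before = clock();
    sort(copy, n);
    clock_t after = clock();

    free(copy);
    return (double)(after - before) / CLOCKS_PER_SEC;
}
```

Note that clock() has limited resolution, so small arrays may report 0 seconds; timing many repetitions and dividing is one way around that.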

5. To turn in

Instead of a hard-copy turn in, the TA will log into the lab account and grade your code and results there.

For each experiment, write a brief (paragraph-sized) summary of your methodology, results, and conclusions. Include a table and/or (if you think it illustrates the case for your conclusions) a graph. Leave these as appropriately-named files in the lab 2 directory.


Thomas VanDrunen
Last modified: Wed Jan 18 16:01:43 CST 2012