Lab 2: Comparing Sorting Algorithms

This lab has a variety of goals: to continue working with C, to get your hands on with sorting, to introduce an additional sorting algorithm, and to start you thinking about how to measure and compare the performance of programs.

For this lab, you will be adding instrumentation to measure one facet of performance, and you will be implementing a new algorithm.

1. Set up

Make a directory for this lab and move into it.

mkdir lab2
cd lab2

As in most labs and projects, I am giving you some code base to work on. Copy the following files from the course directory for this lab.

cp /cslab/class/csci245/lab2/* .

This gives you seven files:

The last two of these are the only files that you will need to modify.

Inspect the makefile with your partner. I'm not having you write your own makefile today, but you will have to write one in the future; so review its format and the logic of the dependencies. The -Wall to the compiler asks it to generate more warnings than it normally would; that is actually quite helpful. Notice that this makefile includes phony targets for cleaning the directory (and not removing arrayUtil.o!) as well as for handing in the lab.

Note: I've supplied you with the object file arrayUtil.o; be careful that you do not delete it. If you do, however, you can copy it again from the class directory.

Open sorts.c in emacs or gedit.

2. Counting comparisons

The sorting algorithms we saw in class are already implemented in sorts.c. The functions also return an int intended to be interpreted as the number of comparisons. However, these functions do not actually return the number of comparisons--- they just return 0. Your first task is to modify them so that they count and return the number of comparisons.

Specifically, we're interested in the number of times we compare two data from the array-- that is, the number of times an expression like min > array[j] is evaluated. Expressions like i < n don't count because they don't compare items in the array.

Use the makefile to comiple and the driver to test. To use the driver, give the name of the sorting algorithm and the number of elements in the array, for example

./sDriver selection 15

If you do not give the number of array elements, the default is 10 elements. This will print out the array before and after sorting (if the size is 20 or fewer), report on whether the sorting is correct, and report on the number of comparisons.

3. Shell sort

Your next task is to implement yet another sorting algorithm, called Shell sort. Suppose we have the array

49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84

First consider the items separated by 7 spaces, starting at 0.

49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84

Sort them.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84

The items separated by 7 starting at 1 are already sorted.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84

Sort the next bunch.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84
14 7 55 22 8 45 72 49 22 80 53 88 43 29 91 35 83 24 37 84

Keep doing that until all the array slices with gap 7 are sorted. The actual sorting can be done using a modified insertion sort. Then, we decrease the gap and repeat the process, say sorting all the slices with gap 3. Finally, sort with gap 1, which is just insertion sort, except that this should be close to the best case for insertion sort because the items by now are nearly sorted.

(It might be tempting to call these sections "shells" and pretend that's where the name of the sort comes from. Actually the algorithm was invented by someone named Donald Shell.)

Implement Shell sort using the stub in sorts.c You can use your code from insertion sort as a starting point. You'll need to wrap it in two more loops: one to iterate through the various starting points for the current gap, and an outermost one to iterate through the gaps.

In theory, any decreasing sequence can be used for the gaps, but the last gap must be 1. Also, make sure that you use gap 1 only once (or you'll probably end up with an infinite loop) and don't use gap 0.

Your implementation should also count and return the number of comparisons.

Compile and test. Make sure you test not only one arrays of size 10 but also on large arrays.

4. Experiments

Now you want to compare the sorting algorithms against each other. Write a program to run an experiment: How do the number of comparisons increase as a function of the size of the array? For each of several array sizes (for exampe, 10, 50, 100, 500, 1000, 5000...), make a random array. Run each sorting algorithm on that array and determine the number of comparisons. Display the results. The file experiment.c can be used as a starting point, and there already rules in the makefile to compile and link it.

As an example of how you would write a program that runs an experiment, suppose you decided to run selection sort on 10 arrays for each size 10, 50, 100, 500, 1000, and 5000, you might write something like

     int i, j;
     int* array;
     int sizes[] = {10, 50, 100, 500, 1000, 5000, 0};
     for ( i = 0; sizes[i] > 0; i++)
          for (j = 0; j < 10; j++) 
           {
             array = randomArray(sizes[i]);
             selectionSort(array);
          }

But in your experiment, you want to run several sorting algorithms on the same array. (Remember that you can make a copy of an array using copyArray() from the arrayUtil I've given you. You don't want to simply sort the array with the first algorithm and then give a sorted array to the subsequent algorithms.)

Make sure that you don't do something like this:

       int* array = randomArray(sizes[i]);
       int nCompares = insertionSort(array);
       printf("Insertion sort tooked %d comparison's\n", nCompares);
       nCompares = selectionSort(array);
       printf("Selectino sort tooked %d comparsions\n", nCompares);

(Spelling, grammar, and punctuation mistakes are deliberate to remind you that this is bad code. :) )

What's the problem with that? When we used selection sort, the array was already sorted! Instead, you should make a copy of the original, random array using copyArray() for each sorting algorithm.

Your program should generate readable output, something like

Size   	Insertion 	Selection 	Shell
5000	6233199		12497500	147508
5000	6242791		12497500	148041
...

What do you observe? (If you have time, you can add larger sizes.)

5. To turn in

Redirect your program's output to a file named output with a command

./experiment > output
Turn in that file and the two source files you modified with the command
make handin

Copying your work to your account

See this page for instructions on how you can copy your files to your personal account.


Thomas VanDrunen, Cary Gray
Last modified: Thu Jan 24 11:16:42 CST 2013