Lab 2: Experiments on sorting

The goal of this lab is to give you more practice programming in C and working with the C compiler, to demonstrate another sorting algorithm, and to introduce using experiments to compare the efficiency of algorithms.

1. Set up

Even though we are continuing your work from last week, set up a new directory for this lab and move into it.

mkdir lab2
cd lab2

Copy some code I've provided for you from the course public directory.

cp /homes/tvandrun/Public/cs245/lab2/* .

You'll notice that this looks like the same set of files as last week. However, I've made some changes to them.

2. Getting last week's work

Open sorts.c in gedit or another text editor. Also open your sorts.c from last week (it should be in ~/cs245/lab1).

Replace the stubs of insertionSort() and selectionSort() in the new file with your code from last week. If you did not finish insertion sort last week, then do so now.

Verify that both algorithms work (even if they were working last week!).

gcc -c sorts.c
gcc sDriver.o sorts.o arrayUtil.o -o sDriver
./sDriver selection
./sDriver insertion

I've made a change to sDriver. Now you can specify the size of the array it will test. Typing

./sDriver selection 10000

will test selection sort on an array of size 10000. If you leave off the size, then it will test arrays of size 10 by default. It will display the arrays only if they are of size 20 or smaller.
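
The size handling described above could be sketched as follows. This is not the code in sDriver; the names parseSize(), shouldDisplay(), DEFAULT_SIZE, and DISPLAY_LIMIT are made up for illustration, and the sketch assumes the size is the second command-line argument.

```c
#include <limits.h>
#include <stdlib.h>

#define DEFAULT_SIZE 10   /* default size stated in the lab text */
#define DISPLAY_LIMIT 20  /* largest size for which arrays are printed */

/* Hypothetical helper: read the optional size from argv[2].
   Falls back to DEFAULT_SIZE if the argument is missing, malformed,
   non-positive, or too large for an int. */
int parseSize(int argc, char *argv[]) {
    char *end;
    long n;
    if (argc < 3) return DEFAULT_SIZE;
    n = strtol(argv[2], &end, 10);
    if (end == argv[2] || *end != '\0' || n <= 0 || n > INT_MAX) {
        return DEFAULT_SIZE;
    }
    return (int) n;
}

/* Hypothetical helper: nonzero when an array of this size is small enough to print. */
int shouldDisplay(int size) {
    return size <= DISPLAY_LIMIT;
}
```

With these helpers, a driver's main() would call parseSize(argc, argv) once and print the arrays only when shouldDisplay() is nonzero.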

3. Counting comparisons

The functions bubbleSort(), bubbleSort2(), merge(), and mergeSort() appear in sorts.c as we saw them in class. However, they don't actually count comparisons---they just return zero.

Modify these so that they count the number of times we compare two values in the arrays. (Remember we aren't counting comparisons on loop indices.) Since the comparisons of merge sort actually happen in the function merge(), you'll notice that merge() returns an int, the number of comparisons. The function mergeSort() needs to use this result. Compile and test. (What files do you need to recompile?)
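
As a rough illustration of how a count can flow from a merge step up through the recursion, here is a minimal sketch. It is not the code in sorts.c: the names mergeCount() and mergeSortCount() and the half-open index convention (lo up to but not including hi) are assumptions made for this example, so the signatures you saw in class may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs a[lo..mid) and a[mid..hi).
   Returns the number of comparisons between array values. */
int mergeCount(int a[], int lo, int mid, int hi) {
    int n = hi - lo;
    int i = lo, j = mid, k = 0, comparisons = 0;
    int *tmp = malloc((size_t) n * sizeof(int));
    if (tmp == NULL) abort();
    while (i < mid && j < hi) {
        comparisons++;              /* one comparison of two array values */
        if (a[i] <= a[j]) tmp[k++] = a[i++];
        else tmp[k++] = a[j++];
    }
    /* Copying leftovers compares only indices, so nothing is counted here. */
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi) tmp[k++] = a[j++];
    memcpy(a + lo, tmp, (size_t) n * sizeof(int));
    free(tmp);
    return comparisons;
}

/* Sort a[lo..hi) and return the total comparisons:
   those from each half plus those from merging the halves. */
int mergeSortCount(int a[], int lo, int hi) {
    int mid, comparisons;
    if (hi - lo < 2) return 0;      /* zero or one item: nothing to compare */
    mid = lo + (hi - lo) / 2;
    comparisons = mergeSortCount(a, lo, mid);
    comparisons += mergeSortCount(a, mid, hi);
    comparisons += mergeCount(a, lo, mid, hi);
    return comparisons;
}
```

The point to notice is that the recursive function adds three numbers: the counts returned by its two recursive calls and the count returned by the merge.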

Test each of the algorithms several times to see what range of comparisons you're getting on arrays of size 10, and ask me whether they look right or whether it looks like you're missing some comparisons.

4. Shell sort

Next, implement yet another sorting algorithm, called Shell sort. Like merge sort, Shell sort works by sorting sections of the array in isolation. However, the "sections" used by Shell sort are not contiguous. It works like this. Suppose we have the array

49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84

First consider the items separated by a gap of 7, starting at index 0 (shown in brackets).

[49] 7 83 22 8 45 72 [91] 22 80 53 88 43 29 [14] 35 55 24 37 84

Sort them.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84

The items separated by 7 starting at index 1 (shown in brackets) are already sorted.

14 [7] 83 22 8 45 72 49 [22] 80 53 88 43 29 91 [35] 55 24 37 84

Sort the next bunch (the items in brackets, starting at index 2).

14 7 [83] 22 8 45 72 49 22 [80] 53 88 43 29 91 35 [55] 24 37 84

14 7 55 22 8 45 72 49 22 80 53 88 43 29 91 35 83 24 37 84

Keep doing that until all the array slices with gap 7 are sorted. The actual sorting can be done using a modified insertion sort. Then, we decrease the gap and repeat the process, say sorting all the slices with gap 3. Finally, sort with gap 1, which is just insertion sort, except that this should be close to the best case for insertion sort because the items by now are nearly sorted.
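
The "modified insertion sort" for a single slice might look like the sketch below. The function name sortSlice() and its parameters are made up for this illustration and do not come from sorts.c; the sketch also leaves out comparison counting to keep the stepping by gap easy to see.

```c
/* Insertion sort restricted to the slice
   a[start], a[start + gap], a[start + 2*gap], ... within an array of n items.
   Hypothetical helper for illustration only. */
void sortSlice(int a[], int n, int start, int gap) {
    int i, j, item;
    for (i = start + gap; i < n; i += gap) {
        item = a[i];
        /* Shift larger slice members one gap to the right. */
        for (j = i - gap; j >= start && a[j] > item; j -= gap) {
            a[j + gap] = a[j];
        }
        a[j + gap] = item;
    }
}
```

On the 20-item example array, sortSlice(a, 20, 0, 7) rearranges positions 0, 7, and 14 from 49, 91, 14 into 14, 49, 91, and sortSlice(a, 20, 2, 7) rearranges positions 2, 9, and 16 from 83, 80, 55 into 55, 80, 83. With gap 1 and start 0 it behaves like ordinary insertion sort.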

(It might be tempting to call these sections "shells" and pretend that's where the name of the sort comes from. Actually the algorithm was invented by someone named Donald Shell.)

Implement Shell sort using the stub in sorts.c. You can use your code from insertion sort as a starting point. You'll need to wrap it in two more loops: one to iterate through the various starting points for the current gap, and an outermost one to iterate through the gaps.

In theory, any decreasing sequence can be used for the gaps, but the last gap must be 1. Also, make sure that you use gap 1 only once (or you'll probably end up with an infinite loop) and don't use gap 0.
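
One common choice of gap sequence is repeated halving of the array size. The sketch below is only one possibility, not a required sequence, and the name halvingGaps() is made up for this example. It shows a loop that reaches gap 1 exactly once and stops before gap 0, because integer division turns 1 / 2 into 0 and the loop condition rejects it.

```c
/* Write the gaps n/2, n/4, ..., 1 into gaps[] and return how many were written.
   capacity is the number of slots in gaps[]; 32 slots are enough for any int n.
   Hypothetical helper for illustration only. */
int halvingGaps(int n, int gaps[], int capacity) {
    int count = 0;
    int gap;
    for (gap = n / 2; gap >= 1 && count < capacity; gap /= 2) {
        gaps[count++] = gap;
    }
    return count;
}
```

For an array of 20 items this produces the gaps 10, 5, 2, 1. In a sort you would not need to store the gaps at all; the same for loop header can serve directly as the outermost loop.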

Your implementation should also count comparisons.

Compile and test. Make sure you test not only on arrays of size 10 but also on large arrays.

5. Experiment

Now we want to run an experiment to determine how these algorithms vary in terms of the comparisons they require. For this you will write a program that runs the experiment.

Open a new file, called something like experiment.c. Write opening documentation like what we have seen, giving your name and the occasion (lab 2).

In this experiment, we ask three questions:

If we run the same algorithm more than once on the same (original) array, will we get the same number of comparisons? (We'd better!)
If we run different algorithms on the same (original) array, will we get the same number of comparisons? (Perhaps it will depend on which algorithms we try.)
If we run the same algorithm on different arrays of the same size, will we get the same number of comparisons? (Perhaps we will for some algorithms, and not for others.)

We can easily address all three of these questions in the same experiment. Write a program that generates several (say, 5) arrays of the same size (say, 50 items). Then, for each sorting algorithm, it sorts each array twice and displays the number of comparisons.

Use the functions from arrayUtil.h to help. randomArray() will generate an array of a given size with random integers between 0 and 100. When you sort an array a second time, make sure you sort the original, unsorted sequence, not the already sorted version. To do this, I recommend you first generate a "master" array, then make a copy of it with copyArray() for each run and sort the copy.
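
The master-and-copy idea can be sketched as below. Because the exact signatures of randomArray() and copyArray() are not shown here, the sketch uses stand-ins named makeRandomArray() and duplicateArray(); these names, the sample sort countingInsertionSort(), and the helper runTrial() are all assumptions made for illustration, so adapt the calls to what arrayUtil.h actually declares.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for randomArray(): size must be positive; values are 0 through 100
   (whether the real function includes both endpoints is an assumption). */
int *makeRandomArray(int size) {
    int i;
    int *a = malloc((size_t) size * sizeof(int));
    if (a == NULL) abort();
    for (i = 0; i < size; i++) a[i] = rand() % 101;
    return a;
}

/* Stand-in for copyArray(): returns a newly allocated copy of source. */
int *duplicateArray(const int source[], int size) {
    int *copy = malloc((size_t) size * sizeof(int));
    if (copy == NULL) abort();
    memcpy(copy, source, (size_t) size * sizeof(int));
    return copy;
}

/* Sample sort that returns its comparison count; any sort of this shape works. */
int countingInsertionSort(int a[], int size) {
    int i, j, item, comparisons = 0;
    for (i = 1; i < size; i++) {
        item = a[i];
        j = i - 1;
        while (j >= 0) {
            comparisons++;          /* compare a[j] with the item being inserted */
            if (a[j] <= item) break;
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = item;
    }
    return comparisons;
}

/* Sort a fresh copy of master and return the comparison count.
   The master array itself is never modified. */
int runTrial(int (*sort)(int[], int), const int master[], int size) {
    int *copy = duplicateArray(master, size);
    int comparisons = sort(copy, size);
    free(copy);
    return comparisons;
}
```

An experiment's main() could then loop over the master arrays and, for each algorithm, call runTrial() twice on the same master and print the two counts on one line with printf().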

Your program should generate readable output, something like

Array 1:
Insertion: 500 500
Selection: 550 550
Bubble 1: 625 625
Bubble 2: 550 550

Array 2:
...

What do you observe?

6. To turn in

In a typescript, display your sort implementations and experiment.

script
cat sorts.c
cat experiment.c

Then run your experiment to show the results.

./experiment

In a paragraph, describe the results you observe.

Thomas VanDrunen

Last modified: Wed Sep 1 16:54:02 CDT 2010