Lab 2: Shell sort, merge sort, and experimentation

John Murdoch: Here, let me ask you a question. You heard of a place called Shell Beach?
Inspector Frank Bumstead: Sure.
John Murdoch: Do you know how to get there?
Inspector Frank Bumstead: Yeah.
John Murdoch: Tell me.
Inspector Frank Bumstead: Right. You just... you go to the...
John Murdoch: Where? Where do you go?
Inspector Frank Bumstead: Just give me a second, will you...
John Murdoch: You can't remember, can you?
---
John Murdoch: Hey, do you know the way to Shell Beach?
Taxi Driver: You're kidding! Me and the Mrs. spent our honeymoon there. All you gotta do is take Main Street West to... or is it the Cross... You know, that's funny, I can't remember if it's Main Street West or the Crosstown.
---Dark City

The goal of this lab is to give you more practice programming in C and working with the C compiler, to demonstrate two other sorting algorithms, and to study the efficiency of sorting algorithms experimentally, comparing the results with the theoretical findings of complexity analysis.

1. Set up

Make a directory for this lab and move into it. Copy the given code from the course directory.

mkdir lab2
cd lab2
cp /homes/tvandrun/Public/cs245/lab2/* .

The given files are

Note that I am giving you the makefile for this lab. In some future labs and projects, you will need to modify a makefile or write your own from scratch. Open the makefile in a text editor and talk through it with your partner; review what we talked about in class and make sure you understand it, anticipating writing your own sometime in the future.

2. Shell sort

Your first task is to implement yet another sorting algorithm, called Shell sort. Suppose we have the array

49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84

First consider the items that are 7 positions apart, starting at position 0 (positions 0, 7, and 14, holding 49, 91, and 14).

49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84

Sort them.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84

The items separated by 7 starting at position 1 (positions 1, 8, and 15, holding 7, 22, and 35) are already sorted.

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84

Sort the next slice (positions 2, 9, and 16, where 83, 80, and 55 become 55, 80, and 83).

14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84
14 7 55 22 8 45 72 49 22 80 53 88 43 29 91 35 83 24 37 84

Keep doing that until all the array slices with gap 7 are sorted. The actual sorting can be done using a modified insertion sort. Then, we decrease the gap and repeat the process, say sorting all the slices with gap 3. Finally, sort with gap 1, which is just insertion sort, except that this should be close to the best case for insertion sort because the items by now are nearly sorted.
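To make the slice-by-slice idea concrete, here is a minimal sketch of the process described above. The function names sort_slice() and shell_sort_sketch(), their parameters, and the fixed gap sequence 7, 3, 1 are assumptions made for illustration; they are not the signatures used in sorts.c, and this is only one way to organize the loops.

```c
#include <assert.h>
#include <stddef.h>

/* Insertion sort on one slice: the items at positions
   start, start + gap, start + 2*gap, ... (hypothetical helper). */
void sort_slice(int array[], size_t n, size_t start, size_t gap)
{
    assert(gap > 0);
    for (size_t i = start + gap; i < n; i += gap) {
        int item = array[i];
        size_t j = i;
        /* Shift larger items of the same slice up by one gap. */
        while (j >= start + gap && array[j - gap] > item) {
            array[j] = array[j - gap];
            j -= gap;
        }
        array[j] = item;
    }
}

/* Shell sort with the gaps from the walkthrough: 7, then 3, then 1.
   The gap sequence is an assumption; others are possible. */
void shell_sort_sketch(int array[], size_t n)
{
    static const size_t gaps[] = {7, 3, 1};
    for (size_t g = 0; g < sizeof gaps / sizeof gaps[0]; g++)
        for (size_t start = 0; start < gaps[g]; start++)
            sort_slice(array, n, start, gaps[g]);
}
```

Calling sort_slice(array, 20, 0, 7) on the example array performs the first step of the walkthrough, and shell_sort_sketch(array, 20) runs all three gaps. With gap 1 there is only one slice, so the last pass is ordinary insertion sort.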

(It might be tempting to call these sections "shells" and pretend that's where the name of the sort comes from. Actually the algorithm was invented by someone named Donald Shell.)

To implement this, we will use an incremental approach in which you will test your program using the sort driver. To compile and test at each step, use the make command. When your code compiles, use ./sort_driver to test various sorting algorithms for correctness. Try this out now.

make
./sort_driver selection
./sort_driver insertion
./sort_driver bubble

Each test should report that "All tests pass." If you're not getting that message, ask for help.

Now test Shell sort (yes, I mean that; even though you haven't written it yet).

./sort_driver shell

You should see many failure messages about tests that did not pass. That's good, because you haven't written Shell sort yet. If it passed all the tests before you wrote it, then there would be something seriously wrong with your testing apparatus. (If you aren't getting a screenful of failure messages, ask for help.)

Here are the steps:

3. Merge sort

You may remember the merge sort algorithm from Programming I. In brief the algorithm sorts by

In sorts.c there are three relevant functions. mergeSort(), the one sort_driver will call, is written for you and it uses some features of C that we haven't learned yet. It starts the recursion going.

mergeSortR() takes not only an array but also a starting index and a stopping index, and it is to sort the subarray from start (inclusive) to stop (exclusive). Thus you can use it to sort increasingly smaller subarrays.

Finally, merge() is a helper function for the algorithm that takes two arrays (array and aux) plus start and stop indices. It merges the two subarrays found in array from start (inclusive) to the midpoint (exclusive) and from the midpoint (inclusive) to stop (exclusive) into a single sequence filling up the first (stop - start) spaces in aux. Thus mergeSortR() calls merge() to do the merging. (Assume that merge() does not copy the values back into array. That's the responsibility of mergeSortR().)
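The division of labor between the two functions might look roughly like the sketch below. The names merge_sketch() and merge_sort_r_sketch(), the int index types, the aux parameter on the recursive function, and the midpoint formula start + (stop - start) / 2 are all assumptions for illustration; check the declarations in sorts.c for the real signatures.

```c
#include <assert.h>

/* Merge the sorted runs array[start..mid) and array[mid..stop)
   into aux[0..stop-start). Does not copy back into array. */
void merge_sketch(const int array[], int aux[], int start, int stop)
{
    assert(start <= stop);
    int mid = start + (stop - start) / 2;   /* assumed midpoint rule */
    int i = start, j = mid, k = 0;
    while (i < mid && j < stop) {
        /* On ties take from the left run, which keeps the sort stable. */
        if (array[j] < array[i])
            aux[k++] = array[j++];
        else
            aux[k++] = array[i++];
    }
    while (i < mid)  aux[k++] = array[i++];  /* leftovers from the left run */
    while (j < stop) aux[k++] = array[j++];  /* leftovers from the right run */
}

/* Sort array[start..stop) recursively, using aux as scratch space. */
void merge_sort_r_sketch(int array[], int aux[], int start, int stop)
{
    if (stop - start < 2)
        return;                              /* 0 or 1 items: already sorted */
    int mid = start + (stop - start) / 2;    /* must match merge_sketch() */
    merge_sort_r_sketch(array, aux, start, mid);
    merge_sort_r_sketch(array, aux, mid, stop);
    merge_sketch(array, aux, start, stop);
    for (int k = 0; k < stop - start; k++)   /* copy the merged run back */
        array[start + k] = aux[k];
}
```

Note that both functions must compute the midpoint the same way; otherwise merge would be handed runs that are not actually sorted.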

We'll do this one in steps as well, although we won't break it down into quite as many. For the first step, implement merge(). You can test it in isolation with a separate driver:

make merge_driver
./merge_driver

Once it is working, write the rest of mergeSortR(). You can test it with the main driver.

make sort_driver
./sort_driver merge

Finally, add code to count the number of comparisons.
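One possible way to count comparisons is to increment a counter each time two array elements are compared, as in the sketch below. The function merge_counting() and its habit of returning the count are hypothetical; the given code may expect the count to be reported some other way, so treat this only as an illustration of where the increments go.

```c
#include <assert.h>

/* Hypothetical variant of merge that returns the number of
   element comparisons it performed. */
long merge_counting(const int array[], int aux[], int start, int stop)
{
    assert(start <= stop);
    long comparisons = 0;
    int mid = start + (stop - start) / 2;   /* assumed midpoint rule */
    int i = start, j = mid, k = 0;
    while (i < mid && j < stop) {
        comparisons++;                      /* one element comparison per pass */
        if (array[j] < array[i])
            aux[k++] = array[j++];
        else
            aux[k++] = array[i++];
    }
    /* Copying leftovers compares indices only, not elements,
       so it adds nothing to the count. */
    while (i < mid)  aux[k++] = array[i++];
    while (j < stop) aux[k++] = array[j++];
    return comparisons;
}
```

Only comparisons between array elements are counted; loop-bound checks such as i < mid are not. Merging two runs of three items takes between 3 and 5 comparisons with this scheme, depending on how the values interleave.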

4. Experiments

Now open the file experiment.c. Inspect it with your partner and figure out what is going on and how it is using the array_util library. Ask if there is anything you don't understand. Remember you will need to write your own experiment in Project 1.

Compile and run the experiment. You may want to run it several times to see how much variation there is on different runs. What do you observe about the running times of the various sorting algorithms?
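If you want to see the general shape of a timing experiment apart from the given code, here is a minimal sketch using only the standard library. It does not use array_util, and the names time_sort_ms() and insertion_sort_ref() are made up for this illustration; experiment.c may well measure things differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* A plain insertion sort, included only so there is something to time. */
void insertion_sort_ref(int array[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int item = array[i];
        size_t j = i;
        while (j > 0 && array[j - 1] > item) {
            array[j] = array[j - 1];
            j--;
        }
        array[j] = item;
    }
}

/* Fill an array of n pseudo-random ints (seeded for repeatability),
   sort it with the given function, and return the CPU time in
   milliseconds as measured by clock(). */
double time_sort_ms(void (*sort)(int[], size_t), size_t n, unsigned seed)
{
    assert(n > 0);
    int *data = malloc(n * sizeof *data);
    assert(data != NULL);
    srand(seed);
    for (size_t i = 0; i < n; i++)
        data[i] = rand();
    clock_t begin = clock();
    sort(data, n);
    clock_t end = clock();
    free(data);
    return 1000.0 * (double)(end - begin) / CLOCKS_PER_SEC;
}
```

For example, time_sort_ms(insertion_sort_ref, 10000, 1u) times one run on 10000 items. Because clock() has limited resolution, small arrays may report 0 ms, which is one reason to try several sizes and several runs.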

5. To turn in

Write a brief (paragraph-sized) summary of what you observed in the experiments in a file called EXPERIMENTS. Make sure your names appear in the files that the grader will inspect.


Thomas VanDrunen
Last modified: Tue Sep 9 11:08:21 CDT 2014