The goal of this lab is to give you more practice programming in C and working with the C compiler; to demonstrate two other sorting algorithms; and to study the efficiency of sorting algorithms experimentally, comparing the results with the theoretical findings of algorithmic analysis.
We will be working with five sorting algorithms. You will be given the code for three of them (selection, insertion, and bubble), and you will need to implement two others (described below). Then we will run experiments on the sorting algorithms and observe their relative performance.
Your first task is to implement yet another sorting algorithm, called Shell sort. Suppose we have the array
49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84
First consider the items separated by a gap of 7 positions, starting at index 0.
49 7 83 22 8 45 72 91 22 80 53 88 43 29 14 35 55 24 37 84
Sort them.
14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84
The items separated by 7 starting at 1 are already sorted.
14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84
Sort the next bunch.
14 7 83 22 8 45 72 49 22 80 53 88 43 29 91 35 55 24 37 84
14 7 55 22 8 45 72 49 22 80 53 88 43 29 91 35 83 24 37 84
Keep doing that until all the array slices with gap 7 are sorted. The actual sorting can be done using a modified insertion sort. Then, we decrease the gap and repeat the process, say sorting all the slices with gap 3. Finally, sort with gap 1, which is just insertion sort, except that this should be close to the best case for insertion sort because the items by now are nearly sorted.
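To make the structure concrete, here is one possible sketch of those gapped passes in C. This is only an illustration, not the required solution: the gap sequence n/2, n/4, ..., 1 used here is a common choice, but the gaps from the example (7, 3, 1) work the same way.

```c
/* Shell sort sketch: for each gap, run an insertion sort that
   steps by `gap` instead of by 1.  When gap == 1 this is plain
   insertion sort on a nearly sorted array. */
void shellSort(int a[], int n) {
    for (int gap = n / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < n; i++) {
            int tmp = a[i];          /* value being inserted into its slice */
            int j = i;
            /* shift larger items in this slice one gap to the right */
            while (j >= gap && a[j - gap] > tmp) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = tmp;
        }
    }
}
```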
(It might be tempting to call these sections "shells" and pretend that's where the name of the sort comes from. Actually the algorithm was invented by someone named Donald Shell.)
It is to be hoped that you have seen merge sort and insertion sort before, in Programming I or whatever prior experience you have. In brief, the merge sort algorithm sorts by splitting the array in half, recursively sorting each half, and then merging the two sorted halves.
The recursive structure is pretty simple, once you're able to wrap your mind around recursion in general. To use the "pile of cards" analogy, suppose you take an unsorted pile of cards. Split the pile into two halves. Take a nap; wake up to find the two halves each sorted. Now you need to collate those two sorted halves. Start a new sorted pile (initially empty). Compare the top card of each half-pile and move the smaller one to the new sorted pile. Repeat until all the cards have been moved to the unified sorted pile.
The tricky part of all this is getting the merging part right, but we did that in class on Friday.
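As a refresher, the merge step might look like the sketch below. The function name, the scratch-array convention, and the half-open index ranges are our assumptions here; the starter code you are given may organize things differently.

```c
/* Merge a[lo..mid-1] and a[mid..hi-1], both already sorted,
   back into a[lo..hi-1], using tmp as scratch space. */
void merge(int a[], int tmp[], int lo, int mid, int hi) {
    int i = lo, j = mid, k = lo;
    /* repeatedly take the smaller of the two "top cards" */
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    /* one half ran out; copy whatever remains of the other */
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    /* copy the merged result back into the original array */
    for (k = lo; k < hi; k++) a[k] = tmp[k];
}
```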
Insertion sort, as we mentioned briefly on Wednesday, is analogous to selection sort in that it also maintains a sorted section and an unsorted section, but instead of finding the smallest value in the unsorted section and placing it at the end of the sorted section, insertion sort takes the first value of the unsorted section and places it in the right place in the sorted section.
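For reference, a textbook version of that idea is sketched below; the starter code you will be completing may differ in names and details.

```c
/* Insertion sort: a[0..i-1] is the sorted section.  Each pass
   takes a[i], the first value of the unsorted section, and
   shifts larger sorted values right to make room for it. */
void insertionSort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int v = a[i];                 /* value to insert */
        int j = i - 1;
        while (j >= 0 && a[j] > v) {  /* find its place */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = v;
    }
}
```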
In this second part of the lab, you will be given partially written implementations of merge sort and insertion sort. Your task is to finish them.
Finally, we are going to compare Shell sort and merge sort with the other three algorithms we have looked at already (selection, insertion, and bubble) using experiments: we run the algorithms on some sample arrays and see how well they do.
I am providing a library of utility functions to help with working with arrays. These functions do things like populate an array with random values, copy the contents of one array to another, etc. The library is found in the files array_util.h and array_util.c.
There are several ways to measure their runtime performance. We will use two: counting the number of comparisons and timing how long they take.
To count the number of comparisons, we need to add code to the sorting algorithms to implement a counter that is incremented every time we compare two elements of the array we're sorting. (We won't count comparisons of indices, such as i < n.)
To measure their running time, we will read the computer's clock before and after each run and take the difference. A standard way to keep track of time on a computer is to count the time elapsed since midnight (UTC) on January 1, 1970; here we work in milliseconds. array_util.h provides a function get_time_millis() which reads the current time from the clock.
Thus we can time a call of selection sort by doing the following:

    fore = get_time_millis();
    selectionSort(copy, sizes[i]);
    aft = get_time_millis();

Then aft - fore is the number of milliseconds it took.
In our experiment, we will count comparisons for small arrays (because the number of milliseconds would be too small) and measure real time for large arrays (because the number of comparisons would be too large). I am providing the code for running the experiment, but you will need to code up your own experiments in an expanded version of this exercise in Project 1.