Lab 1: Sorting

The goal of this lab is to practice writing sorting algorithms on arrays and linked lists, and to experiment with one metric for comparing the efficiency of different algorithms.

1. Introduction

There are many ways to compare algorithms (and implementations of algorithms) to consider how efficiently they make use of resources. The main resources of interest are time and space (that is, computer memory), but occasionally other resources (such as network bandwidth) are of concern.

Furthermore, there are several ways to study an algorithm's efficiency with respect to a specific resource: it can be studied experimentally, theoretically, or probabilistically; in the best case, worst case, average case, or expected case; and with various degrees of precision. One crude (but not altogether useless) way to study the efficiency of sorting algorithms is to count the number of comparisons of data, since most (but not all...) sorting algorithms rearrange the data based on comparing pairs of items. In this lab you will program several sorting algorithms and compare their efficiency based on the number of comparisons they need to make to sort arrays and lists.

2. Set up

Make a directory for this course (245) and one for this lab, and move into the lab1 directory. Copy the following files from the course public directory.

cp /homeemp/tvandrun/pub/245/IntListPair.java .
cp /homeemp/tvandrun/pub/245/Node.java .
cp /homeemp/tvandrun/pub/245/SortArray.java .
cp /homeemp/tvandrun/pub/245/SortList.java .

SortArray.java is a driver program. It generates a random array of integers and runs a sort method on it. Invoke it from the command line like this:

java SortArray Classname 20

It will look for a method with signature int sort(int[]) in class Classname to invoke, passing the random array. The number following the name of the class specifies the size of the array. If you leave this off, it will default to size 10. The sort method should return the number of comparisons it took to sort the array.
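A lookup like the one just described can be done with Java reflection. The following is only a minimal sketch of that idea, not the actual source of SortArray; the class names ReflectiveDriverSketch and DummySort are hypothetical, DummySort does not really sort anything, and the sketch assumes the sort method is declared public and static.

```java
import java.lang.reflect.Method;
import java.util.Random;

// Hypothetical stand-in for a student's class: it has the expected
// signature, but it only performs one scan and counts its comparisons.
class DummySort {
    public static int sort(int[] array) {
        int comparisons = 0;
        for (int i = 0; i + 1 < array.length; i++) {
            comparisons++;                    // one comparison of two array values
            if (array[i] > array[i + 1]) {
                // a real sort would swap or move data here
            }
        }
        return comparisons;
    }
}

class ReflectiveDriverSketch {
    // Finds "int sort(int[])" in the named class and calls it on the array.
    static int runSort(String className, int[] array) {
        try {
            Class<?> cls = Class.forName(className);
            Method sort = cls.getMethod("sort", int[].class);
            // The receiver is null because the method is static; the cast
            // makes explicit that the whole array is a single argument.
            return (Integer) sort.invoke(null, (Object) array);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot run sort in " + className, e);
        }
    }

    public static void main(String[] args) {
        String className = args.length > 0 ? args[0] : "DummySort";
        int size = args.length > 1 ? Integer.parseInt(args[1]) : 10;  // default size 10
        int[] array = new Random().ints(size, 0, 100).toArray();
        System.out.println("comparisons: " + runSort(className, array));
    }
}
```

Because the sort method receives a reference to the same array the driver holds, the driver can inspect the array after the call without the method returning it.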

It is difficult to predict how long the various sorting implementations described in sections 3-6 will take you. Try to finish all of them, but whatever is left over will be incorporated into next week's lab. For this week, leave yourself at least 20 minutes at the end to do section 7. That last section guides you in running experiments on the code you will write in sections 3-6. You do not need to have all the code completed in order to do the experiments; you can run them on however much you have.

3. Selection and insertion on arrays

Open a file (something like SelectionArray.java) in xemacs, and write a static method to sort a given array of integers using selection sort (try to do this without looking at your notes). Inside your method, keep a counter variable that tracks the number of times you compare two values from the array. Return that amount from the method.
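The counting convention can be illustrated on a simpler task than sorting. The sketch below (the names CountingSketch and countWhileFindingMax are hypothetical) finds the largest value in an array and returns the number of comparisons it made; it assumes that only comparisons between two array values are counted, not tests on loop indices.

```java
class CountingSketch {
    // Searches for the largest value and returns the number of
    // comparisons between array values that the search needed.
    static int countWhileFindingMax(int[] array) {
        int comparisons = 0;
        int maxIndex = 0;
        for (int i = 1; i < array.length; i++) {   // index test: not counted
            comparisons++;                          // data comparison: counted
            if (array[i] > array[maxIndex]) {
                maxIndex = i;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] data = {4, 9, 2, 7};
        System.out.println(countWhileFindingMax(data));  // prints 3
    }
}
```

Note that the counter is incremented before the if statement, so the comparison is counted whether or not it succeeds.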

Use SortArray to debug your method. Once it is working, try running it on several arrays and vary the size. Take note of the number of comparisons.

Now make a new file for writing an insertion sort method, also counting comparisons. Debug, and run some experiments.

4. Selection and insertion on lists

SortList.java is similar to SortArray.java except that it generates a random linked list and expects sort methods that operate on lists. These lists are made up of nodes using the provided Node class. We do not have a separate List class; lists are merely represented by head Nodes.

In the array versions in section 3, both the main method of SortArray and the sort method maintained references to the array being worked on, so the sort method did not need to pass the sorted array back to the caller. In the case of lists, however, the head of the list may change during sorting, so the method needs to communicate the sorted list back to the method that is calling it; it also needs to return the number of comparisons in order for us to conduct our experiments. To deal with this, you are provided with a class IntListPair, which serves as a wrapper for an integer and a linked list. Your sort methods for lists should return IntListPairs, containing the number of comparisons and the sorted list.
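To see why a wrapper is needed, consider inserting one value into an already sorted list: the head changes whenever the new value is the smallest. The sketch below does not use the provided classes; SketchNode and SketchPair are hypothetical stand-ins, and their fields and constructors are assumptions, so check Node.java and IntListPair.java for the real ones.

```java
// Hypothetical stand-ins for the provided Node and IntListPair classes;
// the real classes may use different constructors and accessors.
class SketchNode {
    final int datum;
    SketchNode next;

    SketchNode(int datum, SketchNode next) {
        this.datum = datum;
        this.next = next;
    }
}

class SketchPair {
    final int comparisons;
    final SketchNode head;

    SketchPair(int comparisons, SketchNode head) {
        this.comparisons = comparisons;
        this.head = head;
    }
}

class PairReturnSketch {
    // Inserts value into an already sorted list. The head may change, so
    // the (possibly new) head is returned together with the number of
    // comparisons between value and the data in the list.
    static SketchPair insertSorted(SketchNode head, int value) {
        int comparisons = 0;
        SketchNode previous = null;
        SketchNode current = head;
        while (current != null) {
            comparisons++;                      // value vs. current.datum
            if (current.datum >= value) {
                break;
            }
            previous = current;
            current = current.next;
        }
        SketchNode inserted = new SketchNode(value, current);
        if (previous == null) {
            return new SketchPair(comparisons, inserted);  // inserted is the new head
        }
        previous.next = inserted;
        return new SketchPair(comparisons, head);          // head is unchanged
    }

    public static void main(String[] args) {
        SketchNode list = new SketchNode(2, new SketchNode(5, new SketchNode(9, null)));
        SketchPair result = insertSorted(list, 1);
        System.out.println(result.comparisons + " comparison(s), new head " + result.head.datum);
    }
}
```

The caller must continue with result.head rather than its old reference, since the old reference may no longer point to the first node.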

Write selection and insertion sort classes for linked lists, similar to the ones you did in section 3. In this case, you want to keep track of the number of times you compare any node's datum with any other. We have already seen how to use insertion sort on linked lists; selection will require some reasoning through on your part. Here and in section 6, do not forget about tricks like looking one node ahead (looking at current.next().datum() or even current.next().next().datum()) and keeping a previous reference. Debug, and get a feeling for the number of required comparisons.
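As an illustration of the look-ahead trick on a task other than sorting, the sketch below removes adjacent duplicate values from a list. LookNode is a hypothetical class: next() and datum() follow the accessor names used above, but setNext() is an assumed name for the mutator, which the provided Node class may spell differently.

```java
// Hypothetical node class; setNext() is an assumed mutator name.
class LookNode {
    private final int datum;
    private LookNode next;

    LookNode(int datum, LookNode next) {
        this.datum = datum;
        this.next = next;
    }

    int datum() { return datum; }
    LookNode next() { return next; }
    void setNext(LookNode next) { this.next = next; }
}

class LookAheadSketch {
    // Unlinks every node whose datum equals that of the node before it.
    // Returns the number of datum comparisons. The head never changes,
    // because only nodes after some other node are removed.
    static int dropAdjacentDuplicates(LookNode head) {
        int comparisons = 0;
        LookNode current = head;
        while (current != null && current.next() != null) {
            comparisons++;
            if (current.datum() == current.next().datum()) {
                // look two ahead to skip over the duplicate
                current.setNext(current.next().next());
            } else {
                current = current.next();
            }
        }
        return comparisons;
    }

    // Builds a list holding the given values in order.
    static LookNode listOf(int... values) {
        LookNode head = null;
        for (int i = values.length - 1; i >= 0; i--) {
            head = new LookNode(values[i], head);
        }
        return head;
    }

    // Renders a list as space-separated values.
    static String show(LookNode head) {
        StringBuilder sb = new StringBuilder();
        for (LookNode n = head; n != null; n = n.next()) {
            sb.append(n.datum()).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        LookNode head = listOf(1, 1, 2, 3, 3, 3);
        int comparisons = dropAdjacentDuplicates(head);
        System.out.println(show(head) + " after " + comparisons + " comparisons");
    }
}
```

Looking ahead from current plays the same role as keeping a previous reference: in both cases you hold the node before the one you may need to unlink.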

5. Bubble on arrays

Bubble sort is a sorting algorithm that scans through the array or list, looking for adjacent pairs that are out of order and swapping them. It keeps scanning until a scan finds no pairs out of order; if no adjacent pairs are out of order, then the entire collection must be sorted. There are two approaches to this. The more naive approach simply monitors whether there has been a change to the array during the current scan, and quits if it detects a scan with no changes. The basic structure would look like

    boolean changed = true;
    while (changed) {
        changed = false;

        // scan, setting changed to true if we do a swap

    }

A more optimized approach would notice that after one pass, the largest item in the array is correctly placed in the last position. (If you do not immediately see why, then try an example by hand.) Accordingly, the second scan does not need to cover the entire array, but only the first n - 1 items; the third scan needs only the first n - 2 items, and so forth. After n - 1 scans, the entire collection will be sorted, so instead of monitoring whether a change has been made, we can count out n - 1 scans, and use that counter to keep track of how far into the array we need to scan.
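The claim about a single pass can be checked on a small example. The sketch below performs one scan over the first few items of an array; the names OnePassSketch, scan, and limit are hypothetical, and the sketch is one scan only, not a complete bubble sort.

```java
import java.util.Arrays;

class OnePassSketch {
    // One left-to-right scan over the first `limit` items, swapping each
    // adjacent pair that is out of order. Returns the number of comparisons.
    static int scan(int[] array, int limit) {
        int comparisons = 0;
        for (int i = 0; i + 1 < limit; i++) {
            comparisons++;
            if (array[i] > array[i + 1]) {
                int temp = array[i];
                array[i] = array[i + 1];
                array[i + 1] = temp;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 2, 3};
        int comparisons = scan(data, data.length);
        // prints [1, 4, 2, 3, 5] after 4 comparisons
        System.out.println(Arrays.toString(data) + " after " + comparisons + " comparisons");
    }
}
```

The 5 is carried along by every swap until it reaches the end, which is why a second scan could stop one position earlier.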

Write two bubble sort classes for arrays, one monitoring changes and the other using the more optimized approach. Run a few experiments and observe differences (if any) in the number of comparisons.

6. Bubble on lists

Finally, write two bubble sort classes for lists. The swaps will be a little tricky. For the optimized version, you will need a way to compute the size of the list and some sort of counter to determine how deep into the list you are.
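One possible way to handle the swaps is to relink nodes relative to the node before the pair. The sketch below assumes this relinking approach and uses a temporary dummy node in front of the head so that the first pair is not a special case; SwapNode, swapAfter, and size are hypothetical names, and the provided Node class may require accessor methods instead of direct field access.

```java
// Hypothetical node class with directly accessible fields.
class SwapNode {
    int datum;
    SwapNode next;

    SwapNode(int datum, SwapNode next) {
        this.datum = datum;
        this.next = next;
    }
}

class ListSwapSketch {
    // Counts the nodes in the list.
    static int size(SwapNode head) {
        int count = 0;
        for (SwapNode n = head; n != null; n = n.next) {
            count++;
        }
        return count;
    }

    // Swaps the two nodes that follow `previous` by relinking them.
    // Assumes previous.next and previous.next.next are both non-null.
    static void swapAfter(SwapNode previous) {
        SwapNode first = previous.next;
        SwapNode second = first.next;
        first.next = second.next;   // first now skips over second
        second.next = first;        // second now comes before first
        previous.next = second;     // previous now leads to second
    }

    public static void main(String[] args) {
        SwapNode head = new SwapNode(3, new SwapNode(1, new SwapNode(2, null)));
        SwapNode dummy = new SwapNode(0, head);  // temporary node in front of the head
        swapAfter(dummy);                        // swaps the nodes holding 3 and 1
        head = dummy.next;                       // the head has changed
        for (SwapNode n = head; n != null; n = n.next) {
            System.out.print(n.datum + " ");
        }
        System.out.println("(size " + size(head) + ")");
    }
}
```

The order of the three assignments matters: each link is read before it is overwritten.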

7. Experiments

Now you will use your code to run systematic experiments to compare sorting algorithms. To make these experiments scientific and rigorous, you need to consider the variables in the system. First, there is a variable you cannot control: how permuted the array or list is. Some of these algorithms vary quite a bit from run to run, even when the other variables are held constant, because different collections simply require a different number of comparisons; other algorithms are less affected.

The variables you can control are the sorting algorithm, the data structure (array or linked list), and the size of the collection.

This last part of the lab is open-ended: Choose for yourself two or three questions to ask (for example, "How does the number of comparisons in selection sort increase as the size of the array increases?" or "For linked lists of size 100, in which algorithm is there the biggest difference between best and worst case?") and design a set of experiments to address those questions. Then perform your experiments and record your results.

Suppose you wanted to address the first example ("How does the number of comparisons in selection sort increase as the size of the array increases?"). You might choose several specific sizes (say 10, 100, 1000, 10000), and then run selection sort 10 times at each size and average the number of comparisons at each size; that way you can deal with the variation over several runs.
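If you would rather automate the repeated runs than type each command by hand, a small harness along the following lines might help. This is a sketch under stated assumptions: ExperimentSketch and its methods are hypothetical names, and countingLibrarySort is only a stand-in that counts the comparisons made by Java's library sort, so you would substitute your own sort methods for it.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToIntFunction;

class ExperimentSketch {
    // Stand-in for a student's sort: sorts with the library sort and
    // counts how many times the comparator is called.
    static int countingLibrarySort(int[] array) {
        Integer[] boxed = Arrays.stream(array).boxed().toArray(Integer[]::new);
        int[] counter = {0};
        Arrays.sort(boxed, (x, y) -> {
            counter[0]++;
            return Integer.compare(x, y);
        });
        for (int i = 0; i < array.length; i++) {
            array[i] = boxed[i];
        }
        return counter[0];
    }

    // Runs the sort on `trials` fresh random arrays of the given size and
    // returns the average number of comparisons it reported.
    static double averageComparisons(ToIntFunction<int[]> sort, int size,
                                     int trials, Random random) {
        long total = 0;
        for (int t = 0; t < trials; t++) {
            int[] data = random.ints(size, 0, 10 * size).toArray();
            total += sort.applyAsInt(data);
        }
        return (double) total / trials;
    }

    public static void main(String[] args) {
        Random random = new Random();
        int[] sizes = {10, 100, 1000, 10000};
        for (int size : sizes) {
            double average = averageComparisons(
                    ExperimentSketch::countingLibrarySort, size, 10, random);
            System.out.println(size + "\t" + average);
        }
    }
}
```

Each line of output gives a size and an average, separated by a tab, which is a convenient starting point for the table in your report. Passing a seed to the Random constructor would make the runs repeatable.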

Finally, write a short report on your experiments, describing your methodology (precisely enough that someone else could replicate your experiments) and reporting the results in a table and graphically. I recommend typing the report as a text file in xemacs. You may draw the graphs by hand; make them look as nice as possible, but I'm not looking for Rembrandts.


Thomas VanDrunen
Last modified: Tue Jan 9 16:51:47 CST 2007