Lab 1: Evaluating sorting algorithms

The goal of this lab is to practice writing sorting algorithms on arrays and to experiment with one metric for comparing the efficiency of different algorithms.

1. Introduction

There are many ways to compare algorithms and their implementations by how efficiently they use resources. The main resources of interest are time and space (that is, computer memory), but occasionally other resources (such as network bandwidth) are of concern.

Furthermore, there are several ways to study an algorithm's efficiency with respect to a specific resource: it can be studied experimentally, theoretically, or probabilistically; in the best case, worst case, average case, or expected case; and with various degrees of precision. One crude (but not altogether useless) way to study the efficiency of sorting algorithms is to count the number of comparisons of data, since most (but not all) sorting algorithms rearrange the data based on comparing pairs. In this lab you will experiment with several sorting algorithms and compare their efficiency based on the number of comparisons they need to make to sort arrays and lists.

2. Set up

Make a directory for this lab and move into it.

mkdir lab1
cd lab1

As in most labs and projects, I am giving you a code base to start from. Copy the following files from the course directory for this lab.

cp /cslab/class/cs245/lab1/* .

SortArray.java is a driver program. It will generate a random array of integers and run a sort method on it. If you run it from the command line as follows,

java SortArray Classname 20

it will look for a method with signature int sort(int[]) in class Classname and invoke it, passing the random array. The number following the name of the class specifies the size of the array. If you leave it off, the size defaults to 10. The sort method should return the number of comparisons it took to sort the array.

You do not need to look at the file SortArray.java; it uses some Java features that we won't even get to in this course.

(The driver can also read an array from a file; this feature will be described later.)

The driver will report the number of comparisons and, if the array has 20 or fewer entries, it will display the array itself before and after sorting, for debugging purposes.

3. Selection sort and Insertion sort on arrays

Open the file SelectionA.java in xemacs. The algorithm we derived in class is encapsulated in its sort method. However, the method does not yet count comparisons. Your first task is to complete this method so that it does. Basically, add one variable to tabulate the comparisons, increment that variable for every comparison, and return it in place of the "0" that is currently returned.

Specifically, we're interested in the number of times we compare two items from the array, that is, the number of times the expression min > array[j] is evaluated. Expressions like i < array.length don't count because they don't compare items in the array.
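
A minimal sketch of this kind of counting, assuming a selection sort that tracks the running minimum in a variable named min (as the expression min > array[j] suggests). The class name SelectionCountSketch and the choice to make sort static are assumptions for illustration; the provided SelectionA.java may be organized differently.

```java
// Hypothetical stand-in for SelectionA.java; the provided file may differ.
class SelectionCountSketch {
    // Sorts the array in place and returns the number of data comparisons.
    static int sort(int[] array) {
        int comparisons = 0;
        for (int i = 0; i < array.length - 1; i++) {   // loop test: not counted
            int min = array[i];
            int minPos = i;
            for (int j = i + 1; j < array.length; j++) {
                comparisons++;                         // one evaluation of min > array[j]
                if (min > array[j]) {
                    min = array[j];
                    minPos = j;
                }
            }
            // Move the smallest remaining item into position i.
            array[minPos] = array[i];
            array[i] = min;
        }
        return comparisons;
    }
}
```

On any array of five entries this sketch reports 4 + 3 + 2 + 1 = 10 comparisons, which is the sort of sanity check to make on small inputs.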

Then compile and test the revised selection sort. It should already sort correctly; check that its number of comparisons looks reasonable.

Next make the same change to InsertionA.java.

4. Bubble sort on arrays

You have probably seen another sorting algorithm called bubble sort. While not a very good sort in terms of efficiency, it is easy to program and understand. This algorithm's strategy is to iterate through the array, swapping adjacent values that are out of order. It repeats this until the array is sorted. Make a new class (BubbleA) for this sort and implement it from scratch, with comparison counting, so that it can be used with our driver. Test that it sorts correctly and gives a reasonable number of comparisons.
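
One possible shape for such a class is sketched below. The name BubbleCountSketch, the static method, and the early exit when a pass makes no swaps are choices made for this illustration, not requirements of the lab; other correct versions will report different counts.

```java
// Illustrative bubble sort with comparison counting; not the required solution.
class BubbleCountSketch {
    // Sorts the array in place and returns the number of data comparisons.
    static int sort(int[] array) {
        int comparisons = 0;
        boolean swapped = true;
        // Each pass carries the largest remaining value to position end,
        // so the unsorted region shrinks by one after every pass.
        for (int end = array.length - 1; swapped && end > 0; end--) {
            swapped = false;
            for (int j = 0; j < end; j++) {
                comparisons++;                     // compares array[j] with array[j + 1]
                if (array[j] > array[j + 1]) {
                    int temp = array[j];
                    array[j] = array[j + 1];
                    array[j + 1] = temp;
                    swapped = true;
                }
            }
        }
        return comparisons;
    }
}
```

With this version, an already sorted array of four entries stops after one pass with 3 comparisons, while a reversed one needs 3 + 2 + 1 = 6.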

5. Selection sort on lists

The program SortList is just like SortArray except that it works on lists. The sort methods it invokes must return both the sorted list and the number of comparisons. To do this, I've provided a dummy class IntListPair which serves as a carrier for the number of comparisons and sorted list that these sort methods must return.

Open the file SelectionL.java. This contains the selection sort algorithm we derived in class yesterday. Unfortunately, it's not quite right. Compile it, run it, and debug it.

Once it is sorting correctly, add code to count the number of comparisons.

6. Insertion sort on lists

Now for the main algorithmic challenge today: Open a new file InsertionL.java and write a method to perform insertion sort on lists, counting comparisons. Compile and test.
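
As a rough guide to the idea, here is a sketch of insertion sort on a singly linked list. The Node and Result classes are hypothetical stand-ins invented for this example; the course's own list class and IntListPair will have different names and members, so the code must be adapted before it works with SortList.

```java
// Illustrative only: Node and Result are hypothetical, not the course's classes.
class InsertionListSketch {
    static class Node {
        int datum;
        Node next;
        Node(int datum, Node next) { this.datum = datum; this.next = next; }
    }

    // Hypothetical carrier for the sorted list and the comparison count.
    static class Result {
        final Node list;
        final int comparisons;
        Result(Node list, int comparisons) { this.list = list; this.comparisons = comparisons; }
    }

    static Result sort(Node head) {
        Node dummy = new Node(0, null);   // placeholder in front of the sorted part
        int comparisons = 0;
        while (head != null) {
            Node current = head;          // detach the next unsorted node
            head = head.next;
            Node prev = dummy;            // walk the sorted part to find its place
            while (prev.next != null) {
                comparisons++;            // compares two data items
                if (current.datum < prev.next.datum) break;
                prev = prev.next;
            }
            current.next = prev.next;     // splice current in after prev
            prev.next = current;
        }
        return new Result(dummy.next, comparisons);
    }
}
```

The dummy node avoids a special case for inserting at the front of the sorted part. Only the test on the two datum fields is counted; the null checks are not data comparisons.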

Time permitting: Write another class for performing bubble sort on lists and counting comparisons. However, make sure you leave at least 20 minutes to do the experiments in the next section.

7. Experiments

Now you will use this code to run systematic experiments to compare sorting algorithms against each other. To make these experiments scientific and rigorous, you need to consider the variables in the system.

The variables you can control are

The permutedness of the array is also a variable. You can either let the permutedness vary (i.e., not control it) by allowing the driver to generate a random array each time, or make your own array, store it in a file, and sort the same array each time.

You can specify a file to read by using the -f flag with the driver. Executing

java SortArray SelectionA -f somearray

will read data for the array from the file somearray. These data files should contain only integers, with no punctuation between them; separate the numbers with spaces, newlines, or a combination of both.
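
If you would rather generate a data file than type one, something like the following sketch could do it. The class name, the range of values (0 to 999), the seed, and the size of 100 are all arbitrary choices for this example; the only things taken from the description above are the file name somearray and the format of whitespace-separated integers.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Random;

// Hypothetical helper for producing a repeatable data file.
class MakeArrayFileSketch {
    // Builds the file contents: one integer per line, no punctuation.
    static String contents(int size, long seed) {
        Random random = new Random(seed);   // a fixed seed gives the same array every time
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < size; i++) {
            text.append(random.nextInt(1000)).append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = contents(100, 245L).getBytes(StandardCharsets.US_ASCII);
        Files.write(Paths.get("somearray"), bytes);
    }
}
```

Because the seed is fixed, rerunning the program recreates exactly the same file, so every algorithm can be tested on the same array.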

Choose two or three out of the following questions to explore:

Suppose you wanted to address the first example ("How does the number of comparisons in selection sort increase as the size of the array increases?"). You might choose several specific sizes (say 10, 100, 1000, 10000), then run selection sort 10 times at each size and average the number of comparisons at each size. That way you can account for the variation over several runs.
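
The averaging can be done by hand from the driver's output, or automated. The sketch below shows one way to automate it, assuming your sort method can be passed in as a function from int[] to int; the class ExperimentSketch and the method averageComparisons are names made up for this example and are not part of the provided code.

```java
import java.util.Random;
import java.util.function.ToIntFunction;

// Hypothetical experiment helper; not part of the provided lab files.
class ExperimentSketch {
    // Runs the sorter on `runs` fresh random arrays of the given size
    // and returns the mean of the comparison counts it reports.
    static double averageComparisons(ToIntFunction<int[]> sorter,
                                     int size, int runs, Random random) {
        long total = 0;
        for (int run = 0; run < runs; run++) {
            int[] array = new int[size];
            for (int i = 0; i < size; i++) {
                array[i] = random.nextInt();
            }
            total += sorter.applyAsInt(array);   // the sorter returns its comparison count
        }
        return (double) total / runs;
    }
}
```

If your class exposes a static sort method, a call such as averageComparisons(SelectionA::sort, 1000, 10, new Random()) would give the average at size 1000; repeat it for each size and record the results in your table.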

Finally, write a short report on your experiments, describing your methodology (precisely enough that someone else could replicate your experiments) and reporting the results in a table. I recommend typing the report as a text file in xemacs.

8. To turn in

Turn in hard copies of the files you wrote or modified, in addition to your report. The command to print files neatly, two to a page, is a2ps. You will probably want to execute the line

a2ps -P sp SelectionA.java InsertionA.java BubbleA.java

or

a2ps SelectionA.java InsertionA.java BubbleA.java --file-align=fill

The flag --file-align=fill tells it not to start each file on a new page, so less paper is wasted.


Thomas VanDrunen
Last modified: Wed Jan 16 12:50:36 CST 2008