The goal of this lab is to practice writing sorting algorithms on arrays and to experiment with one metric for comparing the efficiency of different algorithms.
There are many ways to compare algorithms and their implementations, to consider how efficiently they make use of resources. The main resources of interest are time and space (that is, computer memory), but occasionally other resources (such as network bandwidth) are of concern.
Furthermore, there are several ways to study an algorithm's efficiency with respect to a specific resource: it can be studied experimentally, theoretically, and probabilistically; looking at best case, worst case, average case, and expected case; and with various degrees of precision. One crude (but not altogether useless) way to study the efficiency of sorting algorithms is to count the number of comparisons of data, since most (but not all) sorting algorithms rearrange the data based on comparing pairs. In this lab you will experiment with several sorting algorithms and compare their efficiency based on the number of comparisons they need to make to sort arrays and lists.
Make a directory for this lab and move into it.
mkdir lab1
cd lab1
As in most labs and projects, I am giving you some code base to work on. Copy the following files from the course directory for this lab.
cp /cslab/class/cs245/lab1/* .
SortArray.java is a driver program. It generates a random array of integers and runs a sort method on it.
If you use it in the following way on the command line
java SortArray Classname 20
it will look for a method with signature int sort(int[]) in class Classname to invoke, passing the random array.
The number following the name of the class specifies the size of the array; if you leave it off, the size defaults to 10.
The sort method should return the number of comparisons it took to sort the array.
You do not need to look at the file SortArray.java; it uses some Java features that we won't even get to in this course.
(The driver can also read an array from a file; this feature will be described later.)
The driver will report the number of comparisons and, if the array has 20 or fewer entries, it will display the array itself before and after sorting, for debugging purposes.
Open the file SelectionA.java in xemacs.
The algorithm we derived in class is encapsulated in the sort method.
However, the method does not do any counting of the comparisons.
Your first task is to complete this method so that it does.
Basically, add one variable to tabulate the comparisons, increment
that variable for every comparison, and return it in place of "0"
as is currently returned.
Specifically, we're interested in the number of times we compare two data items from the array, that is, the number of times the expression min > array[j] is evaluated.
Expressions like i < array.length don't count because they don't compare items in the array.
Then compile and test the revised selection sort. It should already sort correctly; check that its number of comparisons looks reasonable.
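As a rough illustration of the counting described above, here is a minimal sketch of a selection sort that tallies data comparisons. The class name SelectionCountSketch, the variable names, and the choice to make sort static are assumptions for illustration; the provided SelectionA.java may be organized differently.

```java
// Illustrative sketch only; not the course's SelectionA.java.
class SelectionCountSketch {
    // Sorts the array in place and returns the number of data comparisons.
    static int sort(int[] array) {
        int comparisons = 0;                      // running tally of data comparisons
        for (int i = 0; i < array.length - 1; i++) {
            int min = array[i];                   // smallest value seen so far in array[i..]
            int minIndex = i;
            for (int j = i + 1; j < array.length; j++) {
                comparisons++;                    // counts the evaluation of min > array[j]
                if (min > array[j]) {
                    min = array[j];
                    minIndex = j;
                }
            }
            array[minIndex] = array[i];           // swap the minimum into position i
            array[i] = min;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 7};
        int count = sort(data);
        // prints "[1, 2, 5, 7, 9] comparisons: 10"
        System.out.println(java.util.Arrays.toString(data) + " comparisons: " + count);
    }
}
```

In this form the inner loop always runs to the end of the array, so an array of n items takes n(n-1)/2 comparisons however it is ordered. That gives a quick check on whether a reported count "looks reasonable".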
Next make the same change to InsertionA.java.
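Counting in insertion sort takes a little more care, because the data comparison usually sits in a loop condition next to an index test that should not be counted. The following is a minimal sketch under that assumption; the class name InsertionCountSketch and the loop structure are hypothetical and need not match the provided InsertionA.java.

```java
// Illustrative sketch only; not the course's InsertionA.java.
class InsertionCountSketch {
    // Sorts the array in place and returns the number of data comparisons.
    static int sort(int[] array) {
        int comparisons = 0;
        for (int i = 1; i < array.length; i++) {
            int item = array[i];                  // value to insert into the sorted prefix
            int j = i - 1;
            while (j >= 0) {                      // index test: not a data comparison
                comparisons++;                    // counts the test array[j] <= item
                if (array[j] <= item) {
                    break;                        // found the insertion point
                }
                array[j + 1] = array[j];          // shift the larger value right
                j--;
            }
            array[j + 1] = item;
        }
        return comparisons;
    }
}
```

Unlike the selection sort count, this one depends on the input order: in this sketch an already sorted array of four items takes 3 comparisons, and a reversed one takes 6.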
You have probably seen another sorting
algorithm called bubble sort.
While not a very good sort in terms of efficiency, it is easy to program and understand.
This algorithm's strategy is to iterate through the array, swapping adjacent
values that are out of order.
It repeats this until the array is sorted.
Make a new class (BubbleA) for this sort and implement it from scratch, with comparison counting, so that it can be used with our driver.
Test that it sorts correctly and gives a reasonable number of comparisons.
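One possible shape for the bubble sort strategy described above is sketched here: sweep the array swapping adjacent out-of-order values, and repeat until a sweep makes no swaps. The class name BubbleCountSketch and the swapped flag are assumptions for illustration; a version for the driver would live in BubbleA, and whether sort should be static depends on the driver.

```java
// Illustrative sketch only; one of several reasonable ways to write bubble sort.
class BubbleCountSketch {
    // Sorts the array in place and returns the number of data comparisons.
    static int sort(int[] array) {
        int comparisons = 0;
        boolean swapped = true;                   // true while the last sweep changed something
        while (swapped) {
            swapped = false;
            for (int i = 0; i + 1 < array.length; i++) {
                comparisons++;                    // counts array[i] > array[i + 1]
                if (array[i] > array[i + 1]) {
                    int tmp = array[i];           // swap adjacent out-of-order values
                    array[i] = array[i + 1];
                    array[i + 1] = tmp;
                    swapped = true;
                }
            }
        }
        return comparisons;
    }
}
```

In this variant a sorted array of n items costs n-1 comparisons (one sweep with no swaps), while the reversed array {3, 2, 1} costs 6 (three sweeps of two comparisons each).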
The program SortList is just like SortArray except that it works on lists.
The sort methods it invokes must return both the sorted list and the number of comparisons.
To do this, I've provided a dummy class IntListPair which serves as a carrier for the number of comparisons and sorted list that these sort methods must return.
Open the file SelectionL.java.
This contains the selection sort algorithm we derived in class yesterday.
Unfortunately, it's not quite right.
Compile it, run it, and debug it.
Once it is sorting correctly, add code to count the number of comparisons.
Now for the main algorithmic challenge today:
Open a new file InsertionL.java and write a method to perform insertion sort on lists, counting comparisons.
Compile and test.
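To show the general idea without relying on the course's list class, here is a minimal sketch of insertion sort on a singly linked list. Everything in it is hypothetical: Node stands in for whatever list type the course code uses, and Result stands in for IntListPair, whose actual fields and constructor are not shown in this handout.

```java
// Illustrative sketch only; Node and Result are hypothetical stand-ins.
class ListInsertionSketch {
    // Hypothetical singly linked list node.
    static class Node {
        int datum;
        Node next;
        Node(int datum, Node next) { this.datum = datum; this.next = next; }
    }

    // Hypothetical carrier for the sorted list and the comparison count.
    static class Result {
        final Node sorted;
        final int comparisons;
        Result(Node sorted, int comparisons) {
            this.sorted = sorted;
            this.comparisons = comparisons;
        }
    }

    // Moves nodes one at a time from the input list into a sorted list.
    static Result sort(Node head) {
        Node sorted = null;                       // head of the sorted list built so far
        int comparisons = 0;
        Node rest = head;
        while (rest != null) {
            Node current = rest;                  // detach the next node to insert
            rest = rest.next;
            Node prev = null;
            Node scan = sorted;
            while (scan != null) {                // null test: not a data comparison
                comparisons++;                    // counts scan.datum > current.datum
                if (scan.datum > current.datum) {
                    break;                        // current belongs just before scan
                }
                prev = scan;
                scan = scan.next;
            }
            current.next = scan;                  // splice current in between prev and scan
            if (prev == null) {
                sorted = current;
            } else {
                prev.next = current;
            }
        }
        return new Result(sorted, comparisons);
    }
}
```

This sketch relinks the existing nodes rather than building new ones; a version that leaves the input list untouched would allocate a fresh node for each insertion instead.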
Time permitting: Write another class for performing bubble sort on lists and counting comparisons. However, make sure you leave at least 20 minutes to do the experiments in the next section.
Now you will use this code to run systematic experiments to compare sorting algorithms against each other. To make these experiments scientific and rigorous, you need to consider the variables in the system.
The variables you can control are
The permutedness of the array is also a variable. You can either let the permutedness vary (i.e., leave it uncontrolled) by allowing the driver to generate a random array each time, or make your own array, store it in a file, and sort the same array each time.
You can specify a file to read by using the -f flag with the driver.
Executing
java SortArray Selection -f somearray
will read data for the array from the file somearray.
These data files should contain only integers (don't use punctuation to separate the numbers), but you may separate the numbers by spaces, new lines, or a combination of both.
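To make the file format concrete, the sketch below parses text in that format with java.util.Scanner, whose default delimiter is whitespace, so spaces and new lines are treated alike. This is not the driver's actual code (which is not shown here); the class name ArrayFileSketch and the parse method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Illustrative sketch only; the real driver may read its files differently.
class ArrayFileSketch {
    // Reads integers separated by spaces and/or new lines from the given text.
    static int[] parse(String text) {
        List<Integer> values = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            while (in.hasNextInt()) {             // stops at the first non-integer token
                values.add(in.nextInt());
            }
        }
        int[] result = new int[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // The same five numbers, separated by a mix of spaces and new lines.
        int[] data = parse("5 3\n8\n1 9");
        System.out.println(java.util.Arrays.toString(data));
    }
}
```

A file holding the text passed to parse above would describe the five-element array 5, 3, 8, 1, 9.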
Choose two or three out of the following questions to explore:
Suppose you wanted to address the first example ("How does the number of comparisons in selection sort increase as the size of the array increases?"). You might choose several specific sizes (say 10, 100, 1000, 10000), and then run selection sort 10 times at each size and average the number of comparisons at each size; that way you can account for the variation across runs.
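The averaging procedure just described could also be automated rather than run by hand. Here is a minimal sketch under assumptions: ExperimentSketch, averageComparisons, and the embedded selectionSort are hypothetical names, and the code generates its own random arrays instead of going through the SortArray driver.

```java
import java.util.Random;
import java.util.function.ToIntFunction;

// Illustrative sketch only; it does not use the course's SortArray driver.
class ExperimentSketch {
    // Selection sort that returns its number of data comparisons.
    static int selectionSort(int[] array) {
        int comparisons = 0;
        for (int i = 0; i < array.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < array.length; j++) {
                comparisons++;                    // counts array[minIndex] > array[j]
                if (array[minIndex] > array[j]) {
                    minIndex = j;
                }
            }
            int tmp = array[i];
            array[i] = array[minIndex];
            array[minIndex] = tmp;
        }
        return comparisons;
    }

    // Runs the given sort on `trials` random arrays of length `size`
    // and returns the mean number of comparisons.
    static double averageComparisons(ToIntFunction<int[]> sort, int size,
                                     int trials, Random rng) {
        long total = 0;
        for (int t = 0; t < trials; t++) {
            int[] data = new int[size];
            for (int i = 0; i < size; i++) {
                data[i] = rng.nextInt();          // fresh random contents for each trial
            }
            total += sort.applyAsInt(data);
        }
        return (double) total / trials;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int[] sizes = {10, 100, 1000, 10000};
        for (int size : sizes) {
            double avg = averageComparisons(ExperimentSketch::selectionSort, size, 10, rng);
            System.out.println(size + "\t" + avg);   // one table row: size, mean comparisons
        }
    }
}
```

Passing the sort as a ToIntFunction<int[]> lets the same loop measure any method with the int sort(int[]) shape. Note that this particular selection sort makes the same number of comparisons for every array of a given size, so its runs will not vary; the averaging matters more for sorts such as insertion or bubble sort, whose counts depend on the input order.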
Finally, write a short report on your experiments, describing your methodology (precisely enough that someone else could replicate your experiments) and reporting the results in a table. I recommend typing the report as a text file in xemacs.
Turn in hard copies of the files you wrote or modified, in addition to your report.
The command to print files neatly, two to a page, is a2ps.
You will probably want to execute the line
a2ps -P sp SelectionA.java InsertionA.java BubbleA.java
or
a2ps SelectionA.java InsertionA.java BubbleA.java --file-align=fill
The flag --file-align=fill tells it not to start each file on a new page, so less paper is wasted.