The goal of this lab is to practice writing sorting algorithms on arrays and to experiment with one metric for comparing the efficiency of different algorithms.
There are many ways to compare algorithms and their implementations, to consider how efficiently they make use of resources. The main resources of interest are time and space (that is, computer memory), but occasionally other resources (such as network bandwidth) are of concern.
Furthermore, there are several ways to study an algorithm's efficiency with respect to a specific resource: it can be studied experimentally, theoretically, or probabilistically; by looking at the best case, worst case, average case, or expected case; and with various degrees of precision. One crude (but not altogether useless) way to study the efficiency of sorting algorithms is to count the number of comparisons of data, since most (but not all) sorting algorithms rearrange the data based on comparing pairs. In this lab you will experiment with several sorting algorithms and compare their efficiency based on the number of comparisons they need to make to sort arrays.
Make a directory for this lab and move into it.
mkdir lab1
cd lab1
As in most labs and projects, I am giving you a code base to work from. Copy the following files from the course directory for this lab.
cp /cslab.all/ubuntu/cs245/lab1/* .
SortArray.java is a driver program that will allow you to test parts of your lab as you go along. Eventually you will write a couple of other programs with main() methods. If you use it in the following way on the command line
java SortArray Classname 20
it will look for a method with signature int sort(int[]) in class Classname to invoke, passing it a random array. The number following the name of the class specifies the size of the array. If you leave this off, it will default to size 10. The sort method should return the number of comparisons it took to sort the array. You do not need to look at the file SortArray.java; it uses some Java features that we won't even get to in this course.
(The driver can also read an array from a file; this feature will be described later.)
The driver will report the number of comparisons and, if the array has 20 or fewer entries, it will display the array itself before and after sorting, for debugging purposes.
Open the file SelectionA.java in xemacs. The algorithm we derived in class is encapsulated in its sort method. However, the method does not count the comparisons. Your first task is to complete this method so that it does. Basically, add one variable to tabulate the comparisons, increment that variable for every comparison, and return it in place of the "0" that is currently returned. Specifically, we're interested in the number of times we compare two data items from the array, that is, the number of times the expression min > array[j] is evaluated. Expressions like i < array.length don't count because they don't compare items in the array.
Then compile and test the revised selection sort. It should already sort correctly; check that its number of comparisons looks reasonable.
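The counting rule above can be sketched as follows. This is only an illustration under assumptions, not the contents of SelectionA.java: the class name SelectionCountSketch, the variable names, and the choice of a static method are all hypothetical, and the code you were given may be organized differently.

```java
// Illustrative sketch only; the course's SelectionA.java may be structured differently.
final class SelectionCountSketch {
    // Assumption: a static method. The driver only requires the signature int sort(int[]).
    static int sort(int[] array) {
        int comparisons = 0;                      // tally of data comparisons
        for (int i = 0; i < array.length - 1; i++) {   // loop test: not counted
            int minIndex = i;
            int min = array[i];
            for (int j = i + 1; j < array.length; j++) { // loop test: not counted
                comparisons++;                    // min > array[j] is evaluated next
                if (min > array[j]) {
                    min = array[j];
                    minIndex = j;
                }
            }
            // Move the smallest remaining value into position i.
            array[minIndex] = array[i];
            array[i] = min;
        }
        return comparisons;
    }
}
```

Note that the counter is incremented every time the data comparison is evaluated, whether or not it turns out to be true.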
Next we want to do the same thing to InsertionA.java, except that the sort() method isn't complete. Finish the method by writing the body of the inner loop (if you need a review of how insertion sort works, see here, section 5). Also count and return the number of comparisons.
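One possible shape for an insertion sort that counts comparisons is sketched below. It is an assumption about how the pieces fit together, not the skeleton in InsertionA.java: the class name InsertionCountSketch and the loop structure are hypothetical. The point it illustrates is that a bounds check such as j > 0 is not a data comparison, so the counter is incremented only when two array values are actually compared.

```java
// Illustrative sketch only; InsertionA.java may use a different loop structure.
final class InsertionCountSketch {
    // Assumption: a static method with the signature the driver looks for.
    static int sort(int[] array) {
        int comparisons = 0;
        for (int i = 1; i < array.length; i++) {
            int key = array[i];          // value to insert into the sorted prefix
            int j = i;
            while (j > 0) {              // index test: not counted
                comparisons++;           // array[j - 1] is compared with key next
                if (array[j - 1] <= key) {
                    break;               // key belongs at position j
                }
                array[j] = array[j - 1]; // shift the larger value right
                j--;
            }
            array[j] = key;
        }
        return comparisons;
    }
}
```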
You have probably seen another sorting algorithm called bubble sort. While not a very good sort in terms of efficiency, it is easy to program and understand. This algorithm's strategy is to iterate through the array, swapping adjacent values that are out of order.
Clearly, one pass of this kind of swapping won't sort the array. Many passes are necessary to put all the elements in the right order. There are two ways to monitor repeated passes. First, one could keep track of whether any changes were made to the array (whether any actual swaps happened) on the current pass; if a pass completes without any swaps, then the array is sorted and we can quit. Second, we can observe that after the first pass through the array, the largest element has made it all the way to the end, and so the next pass can stop one element short; the second pass will put the second largest element in the right place, and so the third pass doesn't need to examine the last two positions. An outer loop, therefore, can count down the ending point of the potentially unsorted portion of the array until that portion is empty.
The best version of bubble sort would incorporate both of these ideas, but for our purposes, we would like to compare them against each other. Accordingly, complete the two classes BubbleA1 and BubbleA2 so that they implement these two versions of bubble sort, plus counting comparisons. Make sure your code both sorts correctly and gives a reasonable-looking report on the number of comparisons. (The algorithm in BubbleA2, the second version, is completed for you; you only need to add the counting of comparisons. In BubbleA1, you also need to complete the algorithm.)
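The two ways of controlling the passes described above might look roughly like this. It is a sketch under assumptions: the class and method names (BubbleSketch, sortUntilNoSwaps, sortShrinkingBound, swap) are hypothetical, and the code in BubbleA1 and BubbleA2 may be arranged differently.

```java
// Illustrative sketch only; BubbleA1 and BubbleA2 may be organized differently.
final class BubbleSketch {
    // First idea: repeat full passes until a pass completes without any swap.
    static int sortUntilNoSwaps(int[] array) {
        int comparisons = 0;
        boolean swapped = true;
        while (swapped) {
            swapped = false;
            for (int j = 0; j + 1 < array.length; j++) {
                comparisons++;                     // adjacent values are compared next
                if (array[j] > array[j + 1]) {
                    swap(array, j, j + 1);
                    swapped = true;
                }
            }
        }
        return comparisons;
    }

    // Second idea: each pass stops one position earlier than the previous one.
    static int sortShrinkingBound(int[] array) {
        int comparisons = 0;
        for (int end = array.length - 1; end > 0; end--) {
            for (int j = 0; j < end; j++) {
                comparisons++;                     // adjacent values are compared next
                if (array[j] > array[j + 1]) {
                    swap(array, j, j + 1);
                }
            }
        }
        return comparisons;
    }

    // Hypothetical helper: exchange two positions of the array.
    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

The first version always makes one final pass in which nothing is swapped, which is how it learns that the array is sorted; the second never looks at the already-placed tail but cannot stop early.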
Now we want to run some experiments to determine which algorithms require more or fewer comparisons to sort, and also how they may vary. We will conduct two sets of experiments; you will write two short programs (classes just with a main() method) to run these experiments.
In this experiment, we ask three questions:
We can easily address all three of these questions in the same experiment. Write a program that generates several (say, 5) arrays of the same size (say, 50 items). Then, for each sorting algorithm, it sorts each array twice and displays the number of comparisons.
Use the methods from SortUtil to help. createRandomArray() will generate an array of a given size with random integers between 0 and 100. Make sure that when you repeat the sort of an array, you sort the original, unsorted sequence, not the sorted version. To do this, I recommend you first generate a "master" array and then make copies of it using SortUtil.copyArray() and sort the copy.
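The master-and-copy pattern can be sketched as below. Because the signatures of the SortUtil methods are not shown here, this sketch does not call them: randomArray is a hypothetical stand-in for SortUtil.createRandomArray(), clone() stands in for SortUtil.copyArray(), and countingSort is a placeholder for whichever of your sort methods you are testing. The range 0 to 100 inclusive is also an assumption.

```java
import java.util.Random;

// Illustrative sketch only; in the lab you would call SortUtil and your own sort classes.
final class RepeatExperimentSketch {
    // Hypothetical stand-in for SortUtil.createRandomArray(); the real signature may differ.
    static int[] randomArray(int size, Random rng) {
        int[] a = new int[size];
        for (int i = 0; i < size; i++) {
            a[i] = rng.nextInt(101);   // 0..100 inclusive (assumed range)
        }
        return a;
    }

    // Placeholder for one of the lab's sorts: a selection sort returning its comparison count.
    static int countingSort(int[] array) {
        int comparisons = 0;
        for (int i = 0; i < array.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < array.length; j++) {
                comparisons++;
                if (array[minIndex] > array[j]) {
                    minIndex = j;
                }
            }
            int tmp = array[i];
            array[i] = array[minIndex];
            array[minIndex] = tmp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int k = 1; k <= 5; k++) {
            int[] master = randomArray(50, rng);          // never sorted directly
            int first = countingSort(master.clone());     // sort a fresh copy
            int second = countingSort(master.clone());    // sort another fresh copy
            System.out.println("Array " + k + ":");
            System.out.println("Placeholder sort: " + first + " " + second);
        }
    }
}
```

Sorting copies keeps the master array in its original order, so both runs see the same unsorted input.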
Your program should generate readable output, something like
Array 1:
Insertion: 500 500
Selection: 550 550
Bubble 1: 625 625
Bubble 2: 550 550
Array 2:
...
What do you observe?
Another interesting question is how the number of comparisons that an algorithm makes grows as the size of the array grows. If you give it an array twice as big, does it require twice as many comparisons, or perhaps four times as many?
Write a program that conducts this experiment: Loop through several sizes (for example, 10, 50, 100, 250, 500, 1000). For each size, for each algorithm, generate five random arrays, and find the average number of comparisons the algorithm makes when sorting those arrays.
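The averaging loop described above could be organized along these lines. This is a sketch under assumptions: the names GrowthExperimentSketch, averageComparisons, and insertionCount are hypothetical, the random data is generated inline rather than through SortUtil, and insertionCount is only a placeholder for the lab's sort methods.

```java
import java.util.Random;
import java.util.function.ToIntFunction;

// Illustrative sketch only; in the lab you would plug in your own sort classes and SortUtil.
final class GrowthExperimentSketch {
    // Average the comparison count of one sort over several fresh random arrays of one size.
    static double averageComparisons(ToIntFunction<int[]> sorter, int size, int trials, Random rng) {
        long total = 0;
        for (int t = 0; t < trials; t++) {
            int[] a = new int[size];
            for (int i = 0; i < size; i++) {
                a[i] = rng.nextInt(101);   // 0..100 inclusive (assumed range)
            }
            total += sorter.applyAsInt(a);
        }
        return (double) total / trials;
    }

    // Placeholder for one of the lab's sorts: an insertion sort returning its comparison count.
    static int insertionCount(int[] array) {
        int comparisons = 0;
        for (int i = 1; i < array.length; i++) {
            int key = array[i];
            int j = i;
            while (j > 0) {
                comparisons++;
                if (array[j - 1] <= key) {
                    break;
                }
                array[j] = array[j - 1];
                j--;
            }
            array[j] = key;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 50, 100, 250, 500, 1000};
        Random rng = new Random();
        for (int size : sizes) {
            double avg = averageComparisons(GrowthExperimentSketch::insertionCount, size, 5, rng);
            System.out.printf("%5d  %.1f%n", size, avg);   // size, then average comparisons
        }
    }
}
```

Passing the sort as a ToIntFunction<int[]> is one way to reuse the same loop for every algorithm; a separate loop per algorithm would work just as well.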
When you have that data, see if you can find a pattern. If you have time, launch the OpenOffice.org Spreadsheet program and generate a graph that plots each algorithm's average number of comparisons versus array size; otherwise, do your best to eyeball it.
Can you guess what sort of functions these are? Is there an algorithm that grows most slowly (and therefore is the fastest)?
Turn in hard copies of the files you wrote or modified. Also, run your experiment programs to show the results. Finally, write a short report (one paragraph for each experiment) describing your conclusions.
The command to print files neatly, two to a page, is a2ps.