Lab1: Sorting

This lab has several goals: to refresh your programming skills generally, to introduce you to C (including the C compiler), to introduce the topic of sorting, and to start you thinking about how to measure and compare the performance of programs.

In this lab you will write two sorting algorithms and run an experiment to measure one aspect of their performance.

This lab description assumes that you know and remember the essentials of three sorting algorithms: insertion sort, selection sort, and bubble sort. If you need a refresher on any of these, you can find them explained in this old CSCI 235 project description.

1. Set up

Make a directory for this lab and move into it.

mkdir lab1
cd lab1

As in most labs and projects, I am giving you some code base to work on. Copy the following files from the course directory for this lab.

cp /homes/tvandrun/Public/cs245/lab1/* .

This gives you five files:

arrayUtil.h: The header file for the collection of useful array functions.
arrayUtil.o: The "object" (compiled) file of the implementations of the array functions.
sDriver.c: A program to test the code you will write.
sorts.h: The header file for the sorting algorithms.
sorts.c: The implementation file for the sorting algorithms. This is the only one of the given files that you will need to modify.

Open sorts.c in xemacs or gedit.

2. Selection sort

Implement selection sort in the function selectionSort(). As the algorithm progresses, keep track of comparisons in the variable compars.

Specifically, we're interested in the number of times we compare two items from the array, that is, the number of times an expression like min > array[j] is evaluated. Expressions like i < n don't count because they don't compare items in the array.
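
To make the counting rule concrete, here is a minimal sketch of a selection sort that counts only item-to-item comparisons. It is an illustration under assumptions, not the required solution: the name selectionSortCounted, its signature, and the local counter are hypothetical, and the real selectionSort() in sorts.c may be declared differently and records its count in compars.

```c
/* Hypothetical standalone sketch; the real selectionSort() in sorts.c
   may have a different signature and keeps its count in compars. */
long selectionSortCounted(int array[], int n)
{
    long comparisons = 0;   /* counts item-to-item comparisons only */

    for (int i = 0; i < n - 1; i++) {        /* i < n - 1 is not counted */
        int minPos = i;
        for (int j = i + 1; j < n; j++) {    /* j < n is not counted */
            comparisons++;                   /* two array items are compared */
            if (array[minPos] > array[j])
                minPos = j;
        }
        /* Move the smallest remaining item into position i. */
        int temp = array[i];
        array[i] = array[minPos];
        array[minPos] = temp;
    }
    return comparisons;
}
```

A version written this way makes n(n-1)/2 comparisons on n items no matter how the items are arranged, for example 45 when n is 10.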

Then compile the revised selection sort.

gcc -c sorts.c

Do you remember what that compilation command means? Working in C means that you'll get a whole new world of compilation errors in addition to the ones you are used to with Java. Ask me for help if any don't make sense.

When your file compiles without error, then compile and link the driver program.

gcc -c sDriver.c
gcc sDriver.o sorts.o arrayUtil.o -o sDriver

And test.

./sDriver selection

If it doesn't sort correctly, then debug. Also, make sure that you are testing for comparisons correctly. If you are not getting between 40 and 50 comparisons, then you're missing some.

3. Insertion sort

Now do the same with insertion sort. Comparisons will range from around 15 to about 40. If you're not getting in the high 30s some of the time, then you're missing some.
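
A common way to miss comparisons in insertion sort is to count only the ones that lead to a shift and forget the final one that stops the inner loop. The sketch below shows one way to count both. As before, the name insertionSortCounted and its signature are hypothetical, and the function in sorts.c may look different.

```c
/* Hypothetical standalone sketch; not necessarily the signature in sorts.c. */
long insertionSortCounted(int array[], int n)
{
    long comparisons = 0;

    for (int i = 1; i < n; i++) {
        int key = array[i];     /* the item being inserted */
        int j = i;
        while (j > 0) {         /* j > 0 compares indices, so it is not counted */
            comparisons++;      /* counted whether or not it leads to a shift */
            if (array[j - 1] <= key)
                break;          /* key belongs at position j */
            array[j] = array[j - 1];
            j--;
        }
        array[j] = key;
    }
    return comparisons;
}
```

Counted this way, n items take between n-1 comparisons (already sorted) and n(n-1)/2 comparisons (reverse sorted), which is 9 and 45 when n is 10.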

4. Bubble sort

You have probably seen another sorting algorithm called bubble sort. While not a very good sort in terms of efficiency, it is easy to program and understand. This algorithm's strategy is to iterate through the array, swapping adjacent values that are out of order.

Clearly one pass of this sort of swapping won't sort the array. Many passes are necessary to put all the elements in the right order. There are two ways to decide when to stop making passes.

First, one could keep track of whether any changes were made to the array (whether any actual swaps happened) on the current pass. If a pass completes without any swaps, then the array is sorted and we can quit.

Second, we can observe that after the first pass through the array, the largest element has made it all the way to the end, and so the next pass can stop one element short. The second pass will put the second largest element in the right place, and so the third pass doesn't need to examine the last two positions. An outer loop, therefore, can count down the ending point of the potentially unsorted portion of the array until that portion is empty.
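
The two strategies just described might be sketched as follows. These are illustrations under assumptions: the function names, the signatures, and the helper swapInts are hypothetical and are not taken from sorts.h.

```c
/* Hypothetical helper: exchange two ints. */
static void swapInts(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

/* Strategy 1: make full passes until a pass completes with no swaps. */
long bubbleSortFlagCounted(int array[], int n)
{
    long comparisons = 0;
    int swapped = 1;

    while (swapped) {
        swapped = 0;
        for (int i = 0; i + 1 < n; i++) {
            comparisons++;                   /* adjacent items compared */
            if (array[i] > array[i + 1]) {
                swapInts(&array[i], &array[i + 1]);
                swapped = 1;
            }
        }
    }
    return comparisons;
}

/* Strategy 2: each pass stops one position earlier than the last. */
long bubbleSortShrinkCounted(int array[], int n)
{
    long comparisons = 0;

    for (int end = n - 1; end > 0; end--) {  /* array[end+1..n-1] is in place */
        for (int i = 0; i < end; i++) {
            comparisons++;
            if (array[i] > array[i + 1])
                swapInts(&array[i], &array[i + 1]);
        }
    }
    return comparisons;
}
```

As written, the first version's count depends on the input (a sorted array needs only one pass), while the second always makes n(n-1)/2 comparisons. Nothing prevents combining the two ideas.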

Implement a version of bubble sort and make sure it sorts correctly.

5. Experiment

Now we want to run an experiment to determine how these algorithms vary in terms of the comparisons they require. For this you will write a program that runs the experiment.

Open a new file, called something like experiment.c. Write opening documentation like what we have seen, giving your name and the occasion (lab 1).

In this experiment, we ask three questions:

If we run the same algorithm more than once on the same (original) array, will we get the same number of comparisons? (We had better!)
If we run different algorithms on the same (original) array, will we get the same number of comparisons? (Perhaps it will depend on which algorithms we try.)
If we run the same algorithm on different arrays of the same size, will we get the same number of comparisons? (Perhaps we will for some algorithms, and not for others.)

We can easily address all three of these questions in the same experiment. Write a program that generates several (say, 5) arrays of the same size (say, 50 items). Then, for each sorting algorithm, the program sorts each array twice and displays the number of comparisons.

Use the functions declared in arrayUtil.h to help. randomArray() will generate an array of a given size with random integers between 0 and 100. When you repeat the sort of an array, make sure that you sort the original, unsorted sequence, not the sorted version. To do this, I recommend you first generate a "master" array, then make a copy of it using copyArray() each time and sort the copy.
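
The "sort a copy, never the master" idea might look like the sketch below. Because the signatures of randomArray() and copyArray() are not shown here, the sketch does not call them: it copies with the standard malloc() and memcpy() instead. The names CountingSort, countOnCopy, and exampleSort are hypothetical, and exampleSort is only a stand-in so that the sketch is self-contained.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sorter type: sorts array[0..n-1] in place and returns
   the number of comparisons it made. */
typedef long (*CountingSort)(int array[], int n);

/* Sort a fresh copy of master and return the comparison count.
   master itself is never modified. Returns -1 if allocation fails. */
long countOnCopy(const int master[], int n, CountingSort sort)
{
    int *copy = malloc((size_t)n * sizeof *copy);
    if (copy == NULL)
        return -1;
    memcpy(copy, master, (size_t)n * sizeof *copy);

    long comparisons = sort(copy, n);

    free(copy);
    return comparisons;
}

/* Stand-in sorter (a counting insertion sort) used only for illustration. */
long exampleSort(int array[], int n)
{
    long comparisons = 0;
    for (int i = 1; i < n; i++) {
        int key = array[i];
        int j = i;
        while (j > 0) {
            comparisons++;
            if (array[j - 1] <= key)
                break;
            array[j] = array[j - 1];
            j--;
        }
        array[j] = key;
    }
    return comparisons;
}
```

With a helper like this, the experiment reduces to a loop over the master arrays that calls countOnCopy(master, n, someSort) twice for each algorithm and prints the two results side by side. In your own program you would use randomArray() and copyArray() in place of the hand-written copying.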

Your program should generate readable output, something like

Array 1:
Insertion: 500 500
Selection: 550 550
Bubble 1: 625 625
Bubble 2: 550 550

Array 2:
...

What do you observe?

6. To turn in

In a typescript, display your sort implementations and experiment.

script
cat sorts.c
cat experiment.c

Then run your experiment to show the results.

./experiment

In a paragraph, describe the results you observe.

Thomas VanDrunen

Last modified: Thu Jan 13 10:31:32 CST 2011