Project 3 preview: Sorting arrays

1. Introduction

Sorting is a fundamental problem in computer science. Not only do many applications require putting data in order, but also sorting is a good case study for problems and algorithms. There are many ways to sort a set of data—some ways are more intuitive (that is, easier for a human to understand) than others, some are faster than others, and some require more computer memory than others. This project description will describe three algorithms for sorting arrays of integers. The first of these three we will implement during class on Wednesday. The other two are yours to implement.

To get used to the idea, you may find it helpful to try sorting numbers by hand. For example, take a pile of index cards and write a number on each of them. Then sort the pile. Pay careful attention to the steps you take. How would you instruct another person to sort a pile of cards using the same process that you used? In this lab you will think about how a computer should sort an array of numbers.

2. Bubble sort

The algorithm called Bubble sort is not a particularly efficient algorithm (in fact, it's one of the worst), but it is very easy to understand and to program, so we will start with it. The idea is to iterate through the array and to compare adjacent elements, swapping them if they are out of order. Consider this example on an array containing the values 33, 11, 44, 22.

In the left column, we see what happens during one pass through the array. 33 is compared with 11; since they are out of order, they are swapped. Next, 33 is compared with 44; since they are in order, they are not swapped. Finally, 44 is compared with 22; since they are out of order, they are swapped. 44, the largest number, has made it all the way to the end of the array, so it is in the right place. The shaded area of the array is still unsorted, but the white area is sorted.

Notice that one pass through the array is not enough. In fact, we've placed only a single element in the right place. The right column shows the state of the array after successive rounds of the process we just described. If we pass through the entire array, we will place 33 in the second-to-last position. If we pass through again, we will place 22 in the third-to-last position.

Notice that this requires a loop within a loop—an inner loop which conducts the pass over the array and an outer loop which repeats those passes.

3. Selection Sort

Our second alorithm, Selection sort works this way. Suppose again than you have a pile of cards in your hand, each with a number on it, which you would like to set in order. You might try flipping through the pile searching for the smallest. Once you find the smallest number, you could remove its card and set it in a new pile by itself—the new, sorted pile. Then you would flip through the original pile to find the next smallest number, and put its card next on the pile. By continuing this process, eventually you will transfer all the cards from the old, unsorted pile to the new, sorted pile.

Now think about how to do that in a computer program on arrays. The array is like a pile. The obvious, analogical way to do selection sort would be to create a second array (standing for the new pile) and repeatedly copy values from the old array to the new array, starting from smallest and continuing to the largest. However, to be efficient, we want to avoid making that extra array. Here's how:

We will consider the array at all times to have a "sorted" section and an "unsorted" section, just like we saw how Bubble sort worked. Before we begin, the sorted part will be empty, and the whole array will be the unsorted section. While the sorting happens, the sorted section will "grow", that is, it will comprise a larger and larger portion of the array; the unsorted section likewise will shrink. When we finish, the sorted section will take up the entire array and the unsorted section will be empty.

Assume that the sorted part comes first and the unsorted part is the rest. Then we search the array looking for the smallest number. When we find it, we swap that value with the value in the first position in the unsorted part. That way the sorted part of the array grows by one element and the unsorted shrinks by one element. Follow how it works in this illustration; the unsorted portion of the array is in gray.

Initially everything is unsorted. We identify 11 as the smallest and swap it with the first unsorted value, 33. Now 11 is in the correct position, so the sorted portion of the array is the first position. Next we find 22 as the smallest element in the unsorted part. We swap it with the first unsorted position, and so the sorted portion grows by one. Eventually, all elements are sorted.

4. Insertion sort

Insertion sort works this way: If you were using insertion sort over a pile of cards, you would place cards in a new, sorted pile. You would take the next card on the unsorted pile and seach for the correct position in the sorted pile. For example, if you had cards with numbers 33, 11, 44, 22, initially they would all be in the unsorted pile. First, you pick up the card on top of that pile, 33, and place it in a new, sorted pile. Since 33 is all by itself, it is "sorted." Next you pick the next card in the pile, 11. You put 11 into the correct position in the sorted pile by putting it on top of the 33. Next, pick up 44. Since it is larger than 11 and 33, you put it under the 33. Finally, you pick up 22, and since it is between 11 and 33, you insert it between those cards in the pile. In the end, the unsorted pile is empty, and the sorted pile contains 11, 22, 33, 44.

For efficiency, we do not use two arrays, as you would use two piles. Instead, again consider the array to have two parts at any moment in the computation, a sorted part and an unsorted part. Initially, the sorted part is empty and the unsorted part is the entire array.

So, insertion sort maintains sorted and unsorted sections of the same array, and the sorted section grows at each step and eventually the whole array is sorted. It takes the first element in the unsorted section, and then shifts over part of the sorted section to make room for it. Consider this illustration. The left side shows a "big-step" view; the right side shows the same process in slow-motion.

Notice that in the "slow motion" version of the illustration, the way we move numbers over to make room is that we actually shift the new number over one space at a time (by swapping, like we did for Bubble sort) until it is in the right position.

Thomas VanDrunen, Cary Gray

Last modified: Sun Sep 21 22:17:12 CDT 2014