Practice: Sorting strings

The goal of this exercise is to think through the details of sorting algorithms that exploit properties of strings. Secondarily, it reviews quick sort and radix sort more generally.

0. Before you start

Back when we did the BST-as-ordered-map in-class activity (Feb 26), I hadn't yet set up a turn-in folder for it. Please go back to that activity and turn in the files you modified (should be just impl/RecursiveBSTMap.java) to /cslab/class/cs345/(your user id)/bstomap. It counts as part of class participation (under the category of "other" on the syllabus), not as a project. The same goes for this activity.

1. Set up

Please grab the given code from

~tvandrun/Public/cs345/stringsort

and make a new Eclipse project for it. The interesting stuff is in StringThreeWayQuickSort and StringRadixSort, which you will need to complete. You are also given implementations of merge sort and selection sort for comparison; there is nothing special about these, since they simply use String.compareTo().

2. String three-way quick sort

In class yesterday we talked through the principles of quick sort and how they can be applied to strings, in particular by partitioning the array into three sections based on comparing just one character.

Finish the method alg.StringThreeWayQuickSort.sortR(). Make sure you first understand the documentation, which gives a full specification of what it should do. Your work is in writing (a) the loop to partition the current range in the array, and (b) the three recursive calls.

For doing the partition, I recommend having three index variables:

I also recommend that your loop have the following invariant (where pivot is some string used as the pivot):

Although pivot was used in the above invariants, my implementation didn't have a variable called pivot; it's just used conceptually. I did have a char variable for storing pivot.charAt(pre).
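
As a rough illustration of the partition idea, here is a minimal, self-contained sketch of three-way string quick sort. It is not the given starter code: the class name, the helper charAt(), the index names lt, i, and gt, and the loop invariant in the comments are all assumptions made for this example, and the signature of sortR() here may differ from the one documented in alg.StringThreeWayQuickSort. Only the parameter name pre is borrowed from the text above.

```java
import java.util.Arrays;

public class ThreeWayStringQuickSketch {

    // Character of s at position d, or -1 if s is too short.
    // Returning -1 makes a string sort before any longer string it prefixes.
    private static int charAt(String s, int d) {
        return d < s.length() ? s.charAt(d) : -1;
    }

    public static void sort(String[] a) {
        sortR(a, 0, a.length, 0);
    }

    // Sorts a[start, stop), assuming every string in that range
    // shares the same first pre characters. (Hypothetical signature.)
    private static void sortR(String[] a, int start, int stop, int pre) {
        if (stop - start < 2) return;

        // The pivot is conceptually a[start]; only its character at pre is stored.
        int pivotChar = charAt(a[start], pre);

        // Assumed invariant for the loop below:
        //   a[start, lt) : character at pre is less than pivotChar
        //   a[lt, i)     : character at pre equals pivotChar
        //   a[i, gt)     : not yet examined
        //   a[gt, stop)  : character at pre is greater than pivotChar
        int lt = start;
        int i = start;
        int gt = stop;
        while (i < gt) {
            int c = charAt(a[i], pre);
            if (c < pivotChar) {
                swap(a, lt++, i++);
            } else if (c > pivotChar) {
                swap(a, i, --gt);
            } else {
                i++;
            }
        }

        // Three recursive calls: the outer sections still compare at pre,
        // the middle section moves on to the next character.
        sortR(a, start, lt, pre);
        if (pivotChar >= 0) {
            sortR(a, lt, gt, pre + 1);
        }
        sortR(a, gt, stop, pre);
    }

    private static void swap(String[] a, int i, int j) {
        String tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        String[] words = {"she", "sells", "sea", "shells", "by", "the", "sea", "shore"};
        sort(words);
        System.out.println(Arrays.toString(words));
    }
}
```

The middle section is skipped when pivotChar is -1 because every string in it has length exactly pre and the same prefix, so those strings are all equal.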

Test using test.ThreeWayQuickTest.

3. String radix sort

We also talked about radix sort for strings. In this case, I'm giving you more starter code, and there are several bits and pieces that you need to fill in. Look at alg.StringRadixSort and familiarize yourself with the general outline. Instead of calling some implementation of counting sort, the method alg.StringRadixSort.sort() itself performs counting sort for each pass (i.e., for each letter position). Then consider the missing parts.
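
To make the "counting sort inside each pass" structure concrete, here is a minimal sketch of least-significant-digit radix sort for strings. It is not the outline in alg.StringRadixSort: the class name, the parameter d, and the constant R are assumptions for this example. It also assumes every string has exactly d characters and that every character code is below 256.

```java
import java.util.Arrays;

public class StringRadixSketch {

    // Assumed alphabet size: character codes 0..255.
    private static final int R = 256;

    // Sorts a, assuming every string has exactly d characters.
    public static void sort(String[] a, int d) {
        int n = a.length;
        String[] aux = new String[n];

        // One stable counting-sort pass per letter position, rightmost first.
        for (int pos = d - 1; pos >= 0; pos--) {
            int[] count = new int[R + 1];

            // Tally how often each character occurs at this position.
            for (String s : a) {
                count[s.charAt(pos) + 1]++;
            }

            // Turn tallies into starting indices for each character.
            for (int r = 0; r < R; r++) {
                count[r + 1] += count[r];
            }

            // Distribute in original order, which keeps the pass stable.
            for (String s : a) {
                aux[count[s.charAt(pos)]++] = s;
            }

            // Copy back so the next pass sees the updated order.
            System.arraycopy(aux, 0, a, 0, n);
        }
    }

    public static void main(String[] args) {
        String[] words = {"dog", "cat", "ant", "cow", "bee", "cab"};
        sort(words, 3);
        System.out.println(Arrays.toString(words));
    }
}
```

Stability of each pass is what makes the whole sort correct: ties at the current position keep the order established by the positions to the right.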

Test using test.RadixTest.

4. Experiments

The class expr.StringExperiments runs merge sort, radix sort, three-way quick sort, and (if the array is small enough) selection sort. It runs these sorting algorithms on arrays of randomly generated strings of a specified length. You can specify the size of the array with the -n flag and the number of characters per string with the -d flag. For example,

java expr.StringExperiments -n 100000 -d 3

runs merge sort, radix sort, and three-way quick sort (TWQ) on 100000 3-character-long strings and displays the running time of each in nanoseconds. (100000 is too big to use with selection sort.)

Play around with this experiment to see how the sorts perform. You may also modify StringExperiments or design your own from scratch. The class alg.StringSortUtil has a few helpful methods to generate, copy, and validate arrays of strings. What do you observe?
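
If you design your own experiment from scratch, one possible shape is sketched below. Everything in it is an assumption for illustration: the class name, randomStrings(), and timeNanos() are hypothetical helpers written for this example, not methods of alg.StringSortUtil or expr.StringExperiments, and java.util.Arrays.sort stands in for whichever sort you want to time. The strings are drawn from lowercase letters only.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

public class StringSortTimingSketch {

    // Generates n strings of d lowercase letters each (hypothetical helper).
    public static String[] randomStrings(int n, int d, long seed) {
        Random rng = new Random(seed);
        String[] result = new String[n];
        for (int i = 0; i < n; i++) {
            char[] chars = new char[d];
            for (int j = 0; j < d; j++) {
                chars[j] = (char) ('a' + rng.nextInt(26));
            }
            result[i] = new String(chars);
        }
        return result;
    }

    // Times one run of sorter on a fresh copy, so every sort sees the same input.
    public static long timeNanos(Consumer<String[]> sorter, String[] original) {
        String[] copy = original.clone();
        long begin = System.nanoTime();
        sorter.accept(copy);
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) {
        String[] data = randomStrings(100000, 3, 42L);
        long elapsed = timeNanos(a -> Arrays.sort(a), data);
        System.out.println("Arrays.sort: " + elapsed + " ns");
    }
}
```

A single timed run is noisy on the JVM, so it is worth repeating each measurement several times and varying both the array size and the string length before drawing conclusions.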

5. Turn-in

To verify that you did this exercise, copy StringThreeWayQuickSort.java and StringRadixSort.java to /cslab/class/cs345/(your user id)/stringsort.


Thomas VanDrunen
Last modified: Wed Apr 18 12:42:26 CDT 2018