Sorting Project

In this project you will conduct an experimental (empirical) study of algorithms. Experiments play a major role in assessing the worth of many things in computing, but they are rarely performed in an undergraduate course of study. In some ways Analysis of Algorithms is the course most distant from an experimental approach, since it is primarily concerned with a theoretical framework for comparing and assessing algorithms. By the same token, however, it is an appropriate place to include some experimental work, because experiments provide a counterpoint to most of the content of the course and allow us to see how well the theoretical framework predicts reality.

In particular, you will study the efficiency of various sorting algorithms under specific conditions.

Just as in natural science and in engineering, good experiments in computer science must be controlled and reproducible. To allow another person to verify your experiments, you must report the specifications of the machine or machines on which you ran them (processor, clock speed, memory including cache, operating system), your methodologies (for example, are you taking the average over several runs? how many runs? where did the test cases come from?), and your code.

To make your experiments controlled, particularly when you are concerned with running time, you must restrict what you are timing to the code you actually want to measure. The Java method for reading the system clock is System.currentTimeMillis(), which gives the time in milliseconds since midnight, Jan 1, 1970 (UTC). To time a section of code, call this method immediately before and immediately after the section and compute the difference between the two readings.
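
As a rough illustration of the before-and-after clock readings described above, the following sketch times a sort and averages the result over several runs. The class name TimingSketch, the method names, the seed value, and the use of Arrays.sort as a stand-in for the sort under test are all assumptions made for illustration; none of them come from the provided course code.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical harness, not part of the provided course code.
public class TimingSketch {

    /** Times one sort of a fresh copy of data; returns elapsed milliseconds. */
    public static long timeOnce(int[] data) {
        int[] copy = Arrays.copyOf(data, data.length); // copy made outside the timed region
        long start = System.currentTimeMillis();
        Arrays.sort(copy); // stand-in: replace with the sort under test
        long stop = System.currentTimeMillis();
        return stop - start;
    }

    /** Averages the elapsed time over the given number of runs on the same input. */
    public static double averageMillis(int[] data, int runs) {
        long total = 0;
        for (int r = 0; r < runs; r++) {
            total += timeOnce(data);
        }
        return (double) total / runs;
    }

    public static void main(String[] args) {
        Random rng = new Random(445L); // fixed seed (arbitrary) so the input can be regenerated
        int[] data = new int[1000000];
        for (int i = 0; i < data.length; i++) {
            data[i] = rng.nextInt();
        }
        System.out.println("average ms over 10 runs: " + averageMillis(data, 10));
    }
}
```

Only the call to the sort sits between the two clock readings; generating and copying the input happen outside the timed region. Because the clock is read in whole milliseconds, a small input may well report an elapsed time of 0, so larger inputs or more repetitions are likely to give more usable numbers.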

However, you must also make sure that the timing does not include the running of other programs. At any time, other programs (some of them even belonging to other users who are logged in) could be sharing the processor with your program. If that is the case, the processor cycles spent on those programs are counted in your measured running times.

For this reason, rigorous experiments must be done in the single-user mode or recovery mode of the operating system, where the operating system itself is running at a minimum and no other user programs can be running. There will be a machine in the lab dedicated to your use in this project. You will be given root access to this machine.

Your first step in this project is to decide on a question to investigate, around which you will build your experiment. Some suggestions follow (but I will consider proposals for questions not on this list):

You will need to code up the sorts you want to test and write a program that automates the tests. For your convenience, I am providing a class with some utility methods for generating a random array or list, or reading one from a file. You can get this class from /homes/tvandrun/Public/cs445/SortUtil.java. The linked-list-based methods in SortUtil assume a Node class, which you can find at /homes/tvandrun/Public/cs445/Node.java. Feel free to modify these to suit your needs.

Your tasks:

  1. Choose a question you are interested in pursuing and make a proposal to me of what you plan to do, including the nature of your experiments and methodologies. I will help you refine this proposal. You may make this proposal either in person or by email.
  2. Code up the sorting algorithms and the testing framework. Make sure your algorithms are correct (they sort correctly).
  3. Run your experiments.
  4. Analyze your data. Make a graphical display of the data which will help you reason about the performance. For example, does the data fit an n lg n curve? If you have a strong background in statistics or data analysis, make use of this skill.
  5. Write a report. Shoot for about two pages of text. Describe your methodologies and data analysis, and come to a conclusion. Include your code and a professional-looking graphical display.
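
Two of the tasks above lend themselves to a small sketch: checking that a sort is correct (task 2) and checking whether timings grow like n lg n (task 4). The class name AnalysisSketch and all of its method names are hypothetical, and the timings in main are synthetic values generated from a formula, not measurements. If running times really do fit an n lg n curve, dividing each time by n lg n should give roughly the same constant for every n.

```java
import java.util.Arrays;

// Hypothetical helpers, not part of the provided course code.
public class AnalysisSketch {

    /** True if every element is no greater than its successor. */
    public static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    /** True if output equals the input as sorted by the library sort. */
    public static boolean matchesReference(int[] input, int[] output) {
        int[] expected = Arrays.copyOf(input, input.length);
        Arrays.sort(expected); // trusted reference result
        return Arrays.equals(expected, output);
    }

    /** Running time divided by n lg n, where lg is the base-2 logarithm. */
    public static double ratioToNLgN(long n, double millis) {
        double lg = Math.log(n) / Math.log(2.0);
        return millis / (n * lg);
    }

    public static void main(String[] args) {
        int[] input = {5, 3, 9, 1};
        int[] output = {1, 3, 5, 9}; // what a correct sort should produce
        System.out.println("sorted: " + isSorted(output));
        System.out.println("matches reference: " + matchesReference(input, output));

        // Synthetic timings generated from 0.00002 * n lg n, NOT measured data;
        // they only show that the ratio stays flat when the fit is exact.
        for (long n = 100000; n <= 800000; n *= 2) {
            double syntheticMillis = 0.00002 * n * (Math.log(n) / Math.log(2.0));
            System.out.println(n + "\t" + ratioToNLgN(n, syntheticMillis));
        }
    }
}
```

Note that isSorted alone would accept an output that is in order but has lost or changed elements; comparing against a reference sort of the same input catches that case as well. With real measurements, a ratio that drifts steadily upward or downward as n grows suggests a different growth rate.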

Due dates: Proposal-- Monday, Sept 29; finished report-- Friday, Oct 17.


Thomas VanDrunen
Last modified: Tue Sep 23 16:15:23 CDT 2008