Project 3: Balanced Trees

The goal of this project is to understand how two balanced-tree strategies (AVL trees and red-black trees) work by implementing them. You will also compare their performance experimentally.

An important thing to know about this project up front is that there are three parts (plus a possible fourth...): the first two each involve implementing part of one of the two strategies and are very narrowly focused, with JUnit tests to help you determine whether you've done it right, like the first two projects; the third part is more open-ended. Make sure you give yourself enough time to do the third part.

1. Set up

Find the project code in ~tvandrun/Public/cs345/proj3. As usual, it will have three packages, adt, impl, and test. Remember that using the cp command as

cp -r ~tvandrun/Public/cs345/proj3/. .

will grab the hidden .classpath file so that Eclipse will put the JUnit libraries in the build path when you start an Eclipse project.

Speaking of which, make a new Eclipse project. Remember to make the project in the folder containing the adt, impl, and test folders.

In the adt package there is an interface Map. This has a slight modification from the previous versions we've seen: the remove() method signature has (ironically) been removed. Also, in class we've seen that binary search trees can easily be used to implement ordered maps. By eliminating the ordering operations and the remove() operation, we greatly simplify our task.

2. Implementing an AVL tree

AVL trees are implemented through two classes, AVLTreeMapAbs and AVLTreeMap. The former is nearly identical to the included BasicBSTMap (which is a reduced form of the BasicBSTOMap we saw in class). AVLTreeMapAbs differs from BasicBSTMap in that it also has code to verify whether the tree is balanced (the verify() method in the Node class). An exception will be thrown if the tree violates the AVL property.

But AVLTreeMapAbs is missing a part. At the end of every put() it calls fixup() to correct any violation that has been introduced by the put operation. fixup() is abstract, and is to be implemented in the child class AVLTreeMap.

(The reason for moving fixup() to a separate class is so that you can then turn in only AVLTreeMap for this part. You can't make a tree appear to be balanced by tampering with the code in AVLTreeMapAbs that checks if a tree is balanced since I will be grading your code using the original AVLTreeMapAbs. So, don't modify AVLTreeMapAbs, or, if you do, know that anything you do will be ignored for grading.)

Run the JUnit test AMTest to verify that some cases fail because of an imbalanced tree.

Now, write the fixup() method (and any helper methods you think will simplify the task). Make sure all your changes in this part are in AVLTreeMap. Note that there is a protected instance variable in the parent class called searchTrace that records the route from the root to the parent of the newly added node (if any). See the documentation of both classes for details.
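To make the idea of a rotation concrete, here is a minimal sketch of single rotations on a stand-alone node class. Everything in it is hypothetical: SketchNode, its height field, and the helper names are invented for illustration and are not the project's Node class, and the sketch does not use searchTrace. How rotations fit into your fixup() is still yours to work out.

```java
// Hypothetical illustration only: SketchNode is NOT the project's Node class.
class AvlRotationSketch {
    static class SketchNode {
        int key;
        int height = 1;            // height of the subtree rooted here (a leaf has height 1)
        SketchNode left, right;
        SketchNode(int key) { this.key = key; }
    }

    static int height(SketchNode n) { return n == null ? 0 : n.height; }

    static void updateHeight(SketchNode n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Balance factor: left height minus right height (an AVL tree keeps this in -1..1).
    static int balance(SketchNode n) { return height(n.left) - height(n.right); }

    // Rotate left around x; returns the node that takes x's place.
    static SketchNode rotateLeft(SketchNode x) {
        SketchNode y = x.right;
        x.right = y.left;          // y's left subtree moves under x
        y.left = x;
        updateHeight(x);           // x is now lower, so update it first
        updateHeight(y);
        return y;
    }

    // Mirror image of rotateLeft.
    static SketchNode rotateRight(SketchNode y) {
        SketchNode x = y.left;
        y.left = x.right;
        x.right = y;
        updateHeight(y);
        updateHeight(x);
        return x;
    }

    public static void main(String[] args) {
        // Build the right-leaning chain 1 -> 2 -> 3, which violates the AVL property at 1.
        SketchNode a = new SketchNode(1);
        a.right = new SketchNode(2);
        a.right.right = new SketchNode(3);
        updateHeight(a.right);
        updateHeight(a);
        SketchNode root = rotateLeft(a);
        System.out.println(root.key + " " + balance(root));   // prints "2 0"
    }
}
```

Note that each rotation returns the new subtree root, so whoever calls it must re-attach that node to the parent (or to the root reference).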

Word to the wise: The given JUnit testcases do not cover everything that could happen in an AVL tree as it grows. You may want to add your own test cases.

3. Implementing a red-black tree

The red-black tree implementation differs quite a bit from AVLTreeMap and BasicBSTMap in that the operations are all written recursively in the nodes. In that way it is more similar to the binary search tree version found in the textbook. Your first step will be to take some time to get to know the code as it currently stands.

Like AVLTreeMap, this class has code to check that the tree satisfies the red-black tree properties. Thus there are exception classes defined for an inconsistent black height and for cases when there are two reds in a row. There actually are two "double-red" exceptions, one to be thrown when a double-red naturally occurs temporarily during an insertion and needs to be fixed, and one for when a double-red is found when verifying the tree (and thus indicates the code is incorrect).
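For a sense of what such a check involves, here is a rough sketch of verifying the two properties on a stand-alone node class. It is not the verification code in the project: SketchNode is invented, Java's null stands in for empty subtrees, and IllegalStateException stands in for the project's own exception classes.

```java
// Hypothetical illustration only: not the project's verification code.
class RedBlackCheckSketch {
    static class SketchNode {
        final boolean red;
        final SketchNode left, right;
        SketchNode(boolean red, SketchNode left, SketchNode right) {
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(SketchNode n) { return n != null && n.red; }

    // Returns the black height of the subtree (counting an empty subtree as 0),
    // or throws if either red-black property is violated somewhere inside it.
    static int blackHeight(SketchNode n) {
        if (n == null) return 0;
        if (n.red && (isRed(n.left) || isRed(n.right)))
            throw new IllegalStateException("two reds in a row");
        int leftHeight = blackHeight(n.left);
        int rightHeight = blackHeight(n.right);
        if (leftHeight != rightHeight)
            throw new IllegalStateException("inconsistent black height");
        return leftHeight + (n.red ? 0 : 1);   // only black nodes add to the count
    }

    public static void main(String[] args) {
        SketchNode ok = new SketchNode(false,
                new SketchNode(true, null, null),
                new SketchNode(true, null, null));
        System.out.println(blackHeight(ok));   // prints 1
    }
}
```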

Unlike for AVLTreeMap, I don't have a fancy way to prevent you from tampering with the verification code. I will manually inspect what you submit to make sure the verification code is doing what it should be doing.

Since this is implemented recursively in the nodes, the Node class---or, actually the Node interface, implemented by the RBNode class---has methods for all the map operations. The methods in the RedBlackTreeMap class itself simply start the process by calling the appropriate method on the root.

What if the root is null? In this case we don't use Java's null value to stand for (conceptual) nulls in the tree. Instead we make our own "null" object (referred to by the instance variable nully; the object is really the instance of a singleton class we implement using an anonymous inner class). We do this so that the null object can respond to the same messages that full-fledged nodes can.

Look at the put() method in the RedBlackTreeMap class and the RBNode class. Notice that RBNode.put() itself returns a node. That is the node to replace it in the modified tree. Currently a normal (non-null) node will simply return itself. The null object will make a new RBNode for the key/value pair and return that. Make sure you understand how this works before moving on.
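If the pattern is unfamiliar, the following stripped-down sketch shows a null object combined with a put() that returns the replacement node. It is only an illustration under assumed names (SketchNode, RealNode, NULLY); it ignores colors entirely, and the direction of its comparison need not match the given RBNode code.

```java
// Hypothetical illustration only: these are NOT the project's Node/RBNode classes.
class NullObjectPutSketch {
    interface SketchNode {
        SketchNode put(String key, String value);   // returns the node that replaces this one
        String get(String key);
        int size();
    }

    // The single shared "null" object, made from an anonymous inner class.
    static final SketchNode NULLY = new SketchNode() {
        public SketchNode put(String key, String value) {
            return new RealNode(key, value);        // an empty spot is replaced by a new node
        }
        public String get(String key) { return null; }
        public int size() { return 0; }
    };

    static class RealNode implements SketchNode {
        final String key;
        String value;
        SketchNode left = NULLY, right = NULLY;

        RealNode(String key, String value) {
            this.key = key;
            this.value = value;
        }

        public SketchNode put(String k, String v) {
            int compare = k.compareTo(key);
            if (compare < 0) left = left.put(k, v);         // child may be replaced
            else if (compare > 0) right = right.put(k, v);
            else value = v;                                 // key already present
            return this;                                    // no rebalancing in this sketch
        }

        public String get(String k) {
            int compare = k.compareTo(key);
            if (compare < 0) return left.get(k);
            if (compare > 0) return right.get(k);
            return value;
        }

        public int size() { return 1 + left.size() + right.size(); }
    }

    public static void main(String[] args) {
        SketchNode root = NULLY;
        root = root.put("m", "1");
        root = root.put("c", "2");
        root = root.put("c", "3");   // overwrites the value for "c"
        System.out.println(root.size() + " " + root.get("c"));   // prints "2 3"
    }
}
```

Because every call has the form child = child.put(...), no caller ever needs to test for an empty subtree; the null object answers the message itself.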

Now, RBNode.put() is the method you need to fix up, because it doesn't actually do any corrections to the tree. Fix this. (The code I give you retains the basic structure of my solution, but in my solution the first and third branches---compare > 0 and compare < 0---are very long.) You will probably want to add helper methods for rotations.

The way I have set things up suggests a strategy where you propagate the need to fix up double-reds up the tree by throwing a DoubleRedException, but there are other ways to do it.

Then you will also need to fix the first part of RedBlackTreeMap.put() in case a double-red violation has propagated all the way back to the top. (The try/catch you'll find there also suggests the exception-throwing approach that I used; but you may change this if you take a different approach.)

Use RMTest to test, but, again, you are advised to add your own test cases.

4. Experiments

Finally (and this is a big part, though I'm going to give less specification and guidance), design and implement an experiment or series of experiments to investigate the difference in performance among BasicBSTMap (no balancing), AVLTreeMap, and RedBlackTreeMap.

You will need to determine what exactly "difference in performance" means, but generally we want to know how much better balanced trees are than the naive approach and which of the two balancing strategies is better and under what circumstances. In theory, since AVL trees keep a stricter balance but accordingly also spend more time keeping that balance, AVL trees should perform better when the operations have a lot of gets and fewer puts, but red-black trees should be better if there is a greater percentage of puts compared to other operations.

So, decide a specific question or questions that you want to test; design an experiment to answer that question; implement and run that experiment; and interpret the data to answer your questions.
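As one possible starting point (and only that), here is a sketch of a timing harness for a mixed workload of puts and gets. It makes several assumptions: it times java.util.TreeMap through java.util.Map as a stand-in, since the project's own Map interface and classes are not reproduced here; the names timeWorkload and putFraction are invented; and wall-clock time over random integer keys is just one of many things you could measure.

```java
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical illustration only: java.util.TreeMap stands in for the project's maps.
class MapTimingSketch {
    // Runs `operations` random puts and gets, where roughly `putFraction` of them
    // are puts, and returns the elapsed time in nanoseconds.
    static long timeWorkload(Map<Integer, Integer> map, int operations,
                             double putFraction, long seed) {
        Random rng = new Random(seed);       // fixed seed makes the workload repeatable
        long start = System.nanoTime();
        for (int i = 0; i < operations; i++) {
            int key = rng.nextInt(operations);
            if (rng.nextDouble() < putFraction) map.put(key, i);
            else map.get(key);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        double[] putFractions = {0.1, 0.5, 0.9};
        for (double p : putFractions) {
            timeWorkload(new TreeMap<>(), 100_000, p, 42L);   // warm-up run, result discarded
            long best = Long.MAX_VALUE;
            for (int trial = 0; trial < 5; trial++) {
                best = Math.min(best, timeWorkload(new TreeMap<>(), 100_000, p, 42L));
            }
            System.out.printf("putFraction=%.1f best=%d ns%n", p, best);
        }
    }
}
```

Random keys tend to keep even an unbalanced tree fairly shallow, so a workload like this one may understate the difference between BasicBSTMap and the balanced trees; sorted or nearly sorted keys are worth considering too. Repeating trials and discarding a warm-up run, as above, is one way to reduce noise from the JVM.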

Practical pieces:

Finally, write up a report in which you explain the question(s) your experiment was meant to explore; what your experimental methods were (in detail); what your results were; and how you interpreted those results to answer your questions. Which strategy performs better, how much better, and under what circumstances?

5. More?

I had originally intended one more leg of this project where you would implement removal/deletion. At the time of putting this project together, it wasn't working out to include an exercise like that. I reserve the right to make an "addendum" to this project (a Project 3.5), but if I do it will be on its own schedule.

6. Turn-in

Copy the files you modified:

...to /cslab.all/ubuntu/cs345/turnin/(your id)/proj3

Due Monday, Feb 23, 5:00 pm.


Thomas VanDrunen
Last modified: Thu Feb 12 11:34:38 CST 2015