[The following talks about "three projects." Only the first two of the three are assigned projects to be turned in. The third (on left-leaning red-black trees) is recommended for practice, but does not need to be turned in.]
The goal of these next three projects (AVL trees, traditional red-black trees, and left-leaning red-black trees) is to understand how these three balanced binary search tree strategies work by implementing them. The three projects will use the same code base, and in some ways can be seen as a single, big project. Splitting them up into three projects, however, will help you spread out the work on them; for example, you are encouraged to start work on the AVL tree project (described here) after we learn AVL trees rather than waiting until we learn all the varieties of trees.
One part that is missing from this series of projects is an experimental section. I haven't developed that part enough yet. You are encouraged to write experiments to compare the running time for your own learning. (Caution: you'll probably find that you need fairly large amounts of data to see the effects.)
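If you want a starting point for such an experiment, here is a minimal sketch of a timing harness. It is only an illustration: the class and method names are made up, and it uses java.util.TreeMap (the standard library's red-black tree map) as a stand-in, since the Map interface in this project's adt package is not java.util.Map. You would adapt it to your own map classes.

```java
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical harness; not part of the given code.
public class TimingSketch {

    /** Inserts n pseudo-random keys into the map and returns the elapsed nanoseconds. */
    static long timeInserts(Map<Integer, Integer> map, int n, long seed) {
        Random rng = new Random(seed);   // fixed seed so runs are repeatable
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            map.put(rng.nextInt(), i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Double the input size each round to see how the time grows.
        for (int n = 100_000; n <= 1_600_000; n *= 2) {
            long nanos = timeInserts(new TreeMap<>(), n, 42L);
            System.out.printf("n=%d: %.1f ms%n", n, nanos / 1e6);
        }
    }
}
```

A single run like this is noisy (JIT warm-up, garbage collection), so repeat each measurement several times and compare trends rather than individual numbers.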
As mentioned before, the code base is the same for all three projects, and this section will give an overview of the whole code.
Copy the given code from ~tvandrun/Public/cs345/bal-tree
and make an Eclipse project for it.
As usual, you will find adt, impl, and test packages.
The most important part of the adt package is the Map interface, which has a slight modification from the previous versions we've seen: the remove() method signature has (ironically) been removed.
Because the various kinds of balanced BSTs have a fair amount of code in common, we have a complicated type hierarchy:
BasicIterativeBSTMap does not share any code with the other types, since it takes a completely different approach. In fact, it's not even included with the given code of this project, though we have seen it in class.
The abstract class AbstractRecursiveBSTMap contains all the code for manipulating a binary search tree except for anything that would verify that the tree meets the properties of various balancing strategies and any code that would fix up a tree that is out of balance. That is deferred to the child classes.
The child class BasicRecursiveBSTMap implements verification and fixup by... doing nothing.
The abstract classes AbstractAVLBSTMap, AbstractRedBlackTreeMap, and AbstractLLRedBlackTreeMap contain code for verifying the properties of AVL trees, general red-black trees, and left-leaning red-black trees, respectively.
The classes AVLBSTMap, TraditionalRedBlackTreeMap, and LLRedBlackTreeMap each provide the fix-up code---or they will, once you finish them, since that is your task in the three projects.
The reason for separating the verification and fixup code into different levels of the class hierarchy is to prevent students from submitting code that is wrong (doesn't rebalance properly) but appears correct (because the code to check if the trees are balanced is wrong). In the setup as given, you will modify and submit the files that do the fixup, not the files that do the verification. Accordingly, you should not modify the verification code found in the abstract classes or, if you do, know that your changes will not be used in grading your project.
However, since these are implemented recursively in the nodes, the type hierarchy for nodes is just as important and even more complicated:
This mainly mirrors the type hierarchy for the tree classes, but with another dimension: The trees are not going to have actual null references, since that would require extra checks every time we use a link. Instead, "null" links will be references to special objects called null nodes. The advantage is that these objects can respond to the same methods as real nodes. Hence for every kind of tree, we have both a null node class and a "real" node class.
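To make the null-node idea concrete, here is a small sketch. It is not the given code: the names NullNodeSketch, Node, NullNode, RealNode, and build are hypothetical, keys are plain ints, and I am assuming a convention where an empty tree has height 0. The point is only that the recursive methods never test a link for null; the null node answers for itself.

```java
// Hypothetical names throughout; the node classes in the given code are richer than this.
public class NullNodeSketch {

    /** Operations that every node, real or null, must answer. */
    interface Node {
        int realHeight();
        int countLeaves();
        Node put(int key);
    }

    /** Stands in for a null link and answers with the values for an empty tree. */
    static final class NullNode implements Node {
        public int realHeight() { return 0; }   // assumed convention: empty tree has height 0
        public int countLeaves() { return 0; }
        // Inserting into an empty subtree replaces it with a new real node.
        public Node put(int key) { return new RealNode(key, this, this); }
    }

    /** An ordinary node holding a key and two links, neither of which is ever null. */
    static final class RealNode implements Node {
        final int key;
        Node left, right;

        RealNode(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }

        // No null checks: the children answer whether they are real or null nodes.
        public int realHeight() {
            return 1 + Math.max(left.realHeight(), right.realHeight());
        }

        public int countLeaves() {
            int below = left.countLeaves() + right.countLeaves();
            return below == 0 ? 1 : below;   // no leaves below means this node is a leaf
        }

        public Node put(int key) {
            if (key < this.key) left = left.put(key);
            else if (key > this.key) right = right.put(key);
            return this;                     // duplicate keys are ignored in this sketch
        }
    }

    /** Builds an (unbalanced) tree by inserting the keys in order, starting from a null node. */
    static Node build(int... keys) {
        Node root = new NullNode();
        for (int k : keys) root = root.put(k);
        return root;
    }
}
```

For example, NullNodeSketch.build(5, 3, 8, 1) produces a tree whose root answers realHeight() and countLeaves() without any code checking for null along the way.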
Take some time to understand how AbstractRecursiveBSTMap and its node classes are set up and how their code works.
In the node classes in particular, notice realHeight(), countLeaves(), and totalDepth(), which compute simple statistics about the trees. verify() checks that the tree meets certain conditions, which will be different for each kind of tree.
Look carefully at how AbstractNullNode implements these things.
AbstractRealNode, on the other hand, has an additional method signature called fixup(). Most of your work in each of these three projects will be writing implementations for this, to rebalance the tree when it is in violation of the balance properties.
Turn your attention to AbstractAVLBSTMap. The interface AVLNode defines some additional operations for the nodes of AVL trees. AVL tree nodes will store information about the size, height, and balance of the subtree rooted at that node. Note that "balance" is defined as an integer which is the left height minus the right height. Thus if the subtrees have the same height, balance is zero. That doesn't mean the subtree is perfectly balanced, since the left and right subtrees might themselves be off balance. But it means that the two subtrees pose no problem with respect to each other.
These attributes could be computed on demand, recursively. However, that would require traversing the whole (sub-)tree, which would kill performance---it would defeat the purpose of using binary search trees. So instead we store that pre-computed information in the nodes. However, that information could become out of date when an insertion is made or when the tree rotates. We'll need to recompute those values. But even then, we don't want to traverse the whole tree; we recompute a node's height, size, and balance by assuming the node's children's values are correct and recomputing based on those values (for example, subtracting the children's heights to get a new balance value). We'll call that a soft recompute.
By contrast a hard recompute is when we traverse the whole tree and recompute all the attributes of all the nodes, brute-force. We would do this only when debugging. When running for performance, we would never need nor want to do this.
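The difference between the two kinds of recompute can be sketched as follows. This is an illustration under assumptions, not the given code: RecomputeSketch and its members are hypothetical names, the sketch uses plain null links rather than null nodes to stay short, and it assumes an empty subtree has height 0.

```java
// Hypothetical names; plain null links are used here only to keep the sketch short.
public class RecomputeSketch {

    static final class Node {
        Node left, right;
        int height, size, balance;   // cached attributes, possibly out of date

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Cached height of a subtree; an empty subtree is assumed to have height 0. */
    static int height(Node n) { return n == null ? 0 : n.height; }

    /** Cached size of a subtree; an empty subtree has size 0. */
    static int size(Node n) { return n == null ? 0 : n.size; }

    /** Soft recompute: constant time, trusting the children's cached values. */
    static void softRecompute(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
        n.size = 1 + size(n.left) + size(n.right);
        n.balance = height(n.left) - height(n.right);   // left height minus right height
    }

    /** Hard recompute: visits every node under n, bottom-up. For debugging only. */
    static void hardRecompute(Node n) {
        if (n == null) return;
        hardRecompute(n.left);
        hardRecompute(n.right);
        softRecompute(n);   // children are now correct, so a soft recompute suffices
    }
}
```

Note that the soft recompute is only correct if it is applied bottom-up along the path that changed, so that each node's children are already up to date when it is recomputed.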
The verify() method soft-recomputes and then checks that each node, recursively, has a balance no more than one away from zero. Otherwise an ImbalanceException is thrown, with a message that will hopefully help with debugging.
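As a rough picture of what such a check looks like, here is a sketch. It is not the verify() in AbstractAVLBSTMap: the class VerifySketch is hypothetical, its ImbalanceException is a stand-in whose constructor may differ from the real one, null links replace null nodes, and I am assuming the children are verified before the node itself.

```java
// Hypothetical sketch; the real verify() lives in the given abstract classes.
public class VerifySketch {

    /** Stand-in for the given code's ImbalanceException; the real constructor may differ. */
    static class ImbalanceException extends RuntimeException {
        ImbalanceException(String message) { super(message); }
    }

    static final class Node {
        final String key;
        Node left, right;
        int height;   // cached height

        Node(String key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    /** Cached height; an empty subtree is assumed to have height 0. */
    static int height(Node n) { return n == null ? 0 : n.height; }

    /** Verifies the children, soft-recomputes this node, then checks its balance. */
    static void verify(Node n) {
        if (n == null) return;
        verify(n.left);
        verify(n.right);
        n.height = 1 + Math.max(height(n.left), height(n.right));
        int balance = height(n.left) - height(n.right);
        if (balance < -1 || balance > 1) {
            // Name the offending node so the message helps with debugging.
            throw new ImbalanceException("Node " + n.key + " has balance " + balance);
        }
    }
}
```

A three-node chain (c with left child b, which has left child a) fails at c with balance 2, while the same three keys arranged with b at the root pass.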
The only thing left undone in AbstractAVLBSTMap and its node classes is fixup(), which is triggered by put() in AbstractRecursiveBSTMap.AbstractRealNode.
That's...
Write the body of AVLBSTMap.AVLRealNode.fixup(). This is a hard task; my solution took around 60 lines of code, and there's no other way to do this than work through the details of the various cases.
Here's a way to organize it:
replacement is initialized to this.
Other hints:
oldRight and oldLeftRight to keep track of nodes in relation to this while doing rotations.
assert. If you think something has to be a certain way (balance is in or outside a certain range, a node has to have a certain type), then assert it.
AVLNode, which means the objects they refer to could be AVLNullNode. This makes a difference because you'll need to get at a node's left and right, but AVLNullNode doesn't have them. If you think you need to get a node's left and right, then you had better be sure it is not a null node. In that case, cast it. Make a new variable of the type AbstractAVLRealNode, and then you can get at its left and right. For example:
    AbstractAVLRealNode oldLeft = (AbstractAVLRealNode) left;
    AbstractAVLRealNode oldLeftRight = (AbstractAVLRealNode) (oldLeft.right);
Make sure you understand how casting works! This doesn't change any node object to be a different type, and this doesn't make any new objects. This merely asserts that left is an AbstractAVLRealNode and uses it as such. This won't work (i.e., it will throw a ClassCastException) if the object left refers to is not an instance of a subtype of AbstractAVLRealNode.
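If casting still feels murky, the following standalone sketch may help. It does not use the project's classes: CastSketch, its AVLNode interface, NullNode, RealNode, and leftRight are all hypothetical stand-ins with the same general shape. It combines the assert hint with the cast.

```java
// Hypothetical stand-ins for the project's node types, to illustrate casting only.
public class CastSketch {

    interface AVLNode { int height(); }

    static final class NullNode implements AVLNode {
        public int height() { return 0; }   // has no left or right fields at all
    }

    static final class RealNode implements AVLNode {
        AVLNode left = new NullNode();
        AVLNode right = new NullNode();
        public int height() { return 1 + Math.max(left.height(), right.height()); }
    }

    /** Returns the right child of n's left child; fails if n.left is a null node. */
    static AVLNode leftRight(RealNode n) {
        // State the belief explicitly; this is checked when the JVM runs with -ea.
        assert n.left instanceof RealNode : "left child must be a real node";
        // The cast makes no new object; oldLeft and n.left refer to the same node.
        RealNode oldLeft = (RealNode) n.left;
        return oldLeft.right;
    }
}
```

If n.left is a NullNode, leftRight fails: with assertions enabled (java -ea) the assert reports it first, and otherwise the cast throws a ClassCastException.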
Test using AVLBSTMTest.
Copy the file you modified (AVLBSTMap) to your turn-in folder /cslab/class/cs345/(your id)/avl.
To keep up with the course, this should be finished by March 17.