The goal of this project is to understand hashing and hash tables by implementing the linear probe hash map strategy and perfect hashing, and by comparing their performance experimentally.
Like project 3, there are three parts: implementing linear probe, implementing perfect hashing, and setting up an experiment.
Find the project code in ~tvandrun/Public/cs345/proj5.
As usual, it will have three packages: adt, impl, and test.
Remember that using the cp command as
cp -r ~tvandrun/Public/cs345/proj5/. .
will grab the hidden .classpath file so that Eclipse will put the JUnit libraries in the build path when you start an Eclipse project.
Speaking of which, make a new Eclipse project.
Remember to make the project in the folder containing the adt, impl, and test folders.
In addition to the classes you need to modify (and their interfaces, and their
JUnit tests), I am giving you the "basic" (separate chaining) hash map
from class (and from CSCI 245), as well as the simple linear
ListMap implementation of a map.
These can be used in the experiments for comparison.
Also included are the hash function factory, the prime source,
and an implementation of a set using a list (since you may find
a set to be of use at some points).
Every implementation of every ADT has a JUnit test case for it
(for example, LMTest is a test for the ListMap class);
that way, if you modify an implementation for the purposes of your
experiments, you can check that your modifications don't break things.
I have already provided the instance variables (I suppose you may add
to these, but I don't think you'll need to).
You need to complete the constructor, the helper function findIndex(), put(), remove(), and iterator().
For my implementation of remove(), I used a helper method compareIdealPlace(), for which I've
provided a stub (and thorough documentation) in case you find it useful.
I recommend that you develop this incrementally, saving iterator() and remove() for the end.
You can test, for example, that your implementations of findIndex() and put() work before starting on the harder ones
by running the JUnit test cases and looking only at the ones that rely only on what
you've done so far.
(I think every test case that relies on the iterator or remove() actually has "iterator" and/or "remove" in the name.)
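For orientation, here is a minimal sketch of linear probing on a stripped-down table with int keys and values. It is not the project's LinProbHashMap: the class name LinearProbeSketch, the "key mod capacity" hash, and the fixed capacity are all assumptions made for illustration. Its remove() simply re-inserts the rest of the cluster after deleting an entry, which is a different (and simpler) technique than one built around a helper like compareIdealPlace().

```java
// Hypothetical, simplified linear-probing table (int keys, int values).
// Illustrative only; names and structure are not taken from the project code.
class LinearProbeSketch {
    private final Integer[] keys;   // null marks an empty slot
    private final int[] values;
    private int size;

    LinearProbeSketch(int capacity) {
        keys = new Integer[capacity];
        values = new int[capacity];
    }

    // Ideal position of a key (assumed hash: key mod table length).
    private int ideal(int key) {
        return Math.floorMod(key, keys.length);
    }

    // Index holding the key, or the first empty slot in its probe sequence.
    // Relies on the table always keeping at least one empty slot.
    private int findIndex(int key) {
        int i = ideal(key);
        while (keys[i] != null && keys[i] != key) {
            i = (i + 1) % keys.length;
        }
        return i;
    }

    void put(int key, int value) {
        int i = findIndex(key);
        if (keys[i] == null) {
            if (size == keys.length - 1) {
                throw new IllegalStateException("table full (no rehashing in this sketch)");
            }
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    // Returns the value for the key, or null if the key is absent.
    Integer get(int key) {
        int i = findIndex(key);
        if (keys[i] == null) {
            return null;
        }
        return values[i];
    }

    // Deletes the key, then re-inserts every entry in the rest of the cluster
    // so that no later key is cut off from its ideal position by the new gap.
    void remove(int key) {
        int i = findIndex(key);
        if (keys[i] == null) {
            return;
        }
        keys[i] = null;
        size--;
        int j = (i + 1) % keys.length;
        while (keys[j] != null) {
            int k = keys[j];
            int v = values[j];
            keys[j] = null;
            size--;
            put(k, v);
            j = (j + 1) % keys.length;
        }
    }

    int size() {
        return size;
    }
}
```

With a capacity of 8, the keys 1, 9, and 17 all have ideal position 1, so they form one cluster; removing 9 must leave 17 findable, which is the case the re-insertion loop handles.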
I again have provided the instance variables and stubs for nested classes that act as the secondary maps (and their instance variables). At first glance it may look like a lot of things need to be done, but keep in mind that all of the put, get, containsKey, and even remove methods, both in the PerfectHashMap class and the SecondaryMap class, are very simple. The interesting parts are the constructors. (The iterator is also difficult, but save that till the end.)
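To show why the constructors carry most of the work, here is a minimal sketch of two-level perfect hashing for a fixed set of keys. It is an illustration under stated assumptions, not the project's PerfectHashMap or SecondaryMap: the class name PerfectHashSketch, the restriction to int keys in the range 0 to 2^31 - 2, and the hash family ((a*k + b) mod p) mod size are all choices made here. It also skips the usual step of re-choosing the primary hash function when the secondary tables would take too much total space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical two-level perfect hash set for a fixed set of int keys.
// Illustrative only; not taken from the project code.
class PerfectHashSketch {
    private static final long P = 2147483647L; // the prime 2^31 - 1

    private final int m;             // number of primary buckets
    private final long a, b;         // primary hash parameters
    private final long[] sa, sb;     // secondary hash parameters, one pair per bucket
    private final Integer[][] slots; // secondary tables; null marks an empty slot

    // Assumed hash family: ((a * key + b) mod P) mod size.
    private static int hash(long a, long b, int key, int size) {
        return (int) (((a * key + b) % P) % size);
    }

    PerfectHashSketch(int[] keys, long seed) {
        Random rng = new Random(seed);
        m = Math.max(1, keys.length);
        a = 1 + rng.nextInt((int) (P - 1));
        b = rng.nextInt((int) P);

        // First level: distribute the keys over m buckets.
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int k : keys) {
            if (k < 0 || k >= P) {
                throw new IllegalArgumentException("key out of range: " + k);
            }
            buckets.get(hash(a, b, k, m)).add(k);
        }

        // Second level: a bucket with n keys gets a table of n*n slots, and
        // we retry random hash parameters until no two keys collide.
        sa = new long[m];
        sb = new long[m];
        slots = new Integer[m][];
        for (int i = 0; i < m; i++) {
            int n = buckets.get(i).size();
            slots[i] = new Integer[n * n];
            boolean placed = (n == 0);
            while (!placed) {
                sa[i] = 1 + rng.nextInt((int) (P - 1));
                sb[i] = rng.nextInt((int) P);
                Arrays.fill(slots[i], null);
                placed = true;
                for (int k : buckets.get(i)) {
                    int j = hash(sa[i], sb[i], k, n * n);
                    if (slots[i][j] != null && slots[i][j] != k) {
                        placed = false; // collision: try new parameters
                        break;
                    }
                    slots[i][j] = k;
                }
            }
        }
    }

    // Lookup examines exactly one primary bucket and one secondary slot.
    boolean contains(int key) {
        if (key < 0 || key >= P) {
            return false;
        }
        int i = hash(a, b, key, m);
        if (slots[i].length == 0) {
            return false;
        }
        int j = hash(sa[i], sb[i], key, slots[i].length);
        return slots[i][j] != null && slots[i][j] == key;
    }
}
```

Once construction succeeds, contains() never probes or walks a chain, which is why the lookup methods end up so short compared with the constructor.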
Formulate a question about the performance of these hashing techniques and, as in Project 3, design and implement an experiment, run the experiment, and interpret the results.
Be sure you start by formulating a specific question. There are several variables you could test, several questions you could investigate:
You may study these with either dynamic (i.e., running time) or static (e.g., how well things get distributed) measures, or a combination of them, whichever makes sense to answer the question you're investigating.
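As one possible shape for a static measure, the sketch below varies a single variable (the load factor) and counts how many slots are examined per insertion under linear probing. It is a stand-alone simulation rather than an experiment on the project's classes: the class name ProbeCountExperiment, the table capacity, and the use of random slot indices in place of a real hash function are all assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical experiment sketch: average probes per insertion vs. load factor.
// Illustrative only; it simulates linear probing instead of using project code.
class ProbeCountExperiment {

    // Fills a table of the given capacity up to the given load factor
    // (expected to be between 0 and 1) and returns the average number of
    // slots examined per insertion.
    static double averageProbes(int capacity, double loadFactor, long seed) {
        Random rng = new Random(seed);
        boolean[] used = new boolean[capacity];
        int n = (int) (capacity * loadFactor);
        long probes = 0;
        for (int k = 0; k < n; k++) {
            int i = rng.nextInt(capacity); // stands in for hash(key) mod capacity
            probes++;
            while (used[i]) {              // linear probe to the next free slot
                i = (i + 1) % capacity;
                probes++;
            }
            used[i] = true;
        }
        return n == 0 ? 0.0 : (double) probes / n;
    }

    public static void main(String[] args) {
        double[] loadFactors = {0.25, 0.5, 0.75, 0.9};
        for (double lf : loadFactors) {
            // Same capacity and seed for every run, so only the load factor varies.
            System.out.printf("load factor %.2f: %.3f probes per insertion%n",
                    lf, averageProbes(10007, lf, 42L));
        }
    }
}
```

Holding the capacity and seed fixed while changing only the load factor is what isolates the variable; a dynamic version of the same experiment would time the insertions (for example with System.nanoTime()) instead of counting probes.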
Write a report stating the question you investigated, explaining your methodology, presenting your results, and drawing conclusions from them.
It seemed on Project 3 that many of you had trouble identifying a specific question and/or isolating the variables you're testing. Feel free to run your idea by me ahead of time or ask for help on experiment design.
Copy the files you modified:
LinProbHashMap.java
PerfectHashMap.java
...to /cslab.all/ubuntu/cs345/turnin/(your id)/proj5
Due Monday, Apr 6, 5:00 pm.