Project: Perfect hashing

1. Introduction

The goal of this project is to understand the "perfect hashing" strategy for optimizing hash tables when all the keys are known ahead of time. This project is analogous to the optimal BST project; in both cases we take advantage of knowing the keys of the map before anything else happens, and I anticipate students finding both projects to be among the hardest of the semester. Perfect hashing, however, does not involve dynamic programming.

This appears as Project 6.12 in the book on page 360.

2. Set up

The code base for this project is the same as in the linear-probing project. In this project your work is in the class PerfectHashMap.

3. Implementing perfect hashing

You are essentially finishing two classes: not only PerfectHashMap itself, but also its member class SecondaryMap. The methods for the map operations (put, get, containsKey, and remove) are implemented already. I considered leaving them for you, but decided that they would only take up time without being much of a challenge. Read those methods before starting on the unfinished code, though; make sure you understand how the primary table and secondary tables interact. What is left for you are the interesting parts: the constructors for the two classes and the iterator.

The constructor for PerfectHashMap involves

The constructor of SecondaryMap involves generating new hash functions until you find one that has no collisions for the keys (in addition to initializing the instance variables).
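
To make the "generate until there are no collisions" idea concrete, here is a minimal sketch that is independent of the project code. Everything in it is an assumption for illustration: the class name CollisionFreeSketch, the hash family ((a*h + b) mod p) mod m, and the helper names are hypothetical and are not the hash-function classes supplied in the code base. It also assumes the keys have distinct hash codes; otherwise the loop would never terminate.

```java
import java.util.Random;

// Hypothetical illustration only; not part of the project code base.
final class CollisionFreeSketch {
    // A prime larger than any masked hash code (2^31 - 1).
    static final int P = 2147483647;

    // One member of an assumed hash family: ((a*h + b) mod P) mod m.
    static int hash(int a, int b, int m, Object key) {
        long h = key.hashCode() & 0x7fffffffL;   // force a nonnegative value
        return (int) (((a * h + b) % P) % m);
    }

    // Keep drawing random (a, b) until every key lands in its own slot.
    // Assumes the keys have pairwise distinct hash codes.
    static int[] findCollisionFree(Object[] keys, int m, Random rng) {
        while (true) {
            int a = 1 + rng.nextInt(P - 1);      // a in [1, P-1]
            int b = rng.nextInt(P);              // b in [0, P-1]
            boolean[] used = new boolean[m];
            boolean collision = false;
            for (Object k : keys) {
                int slot = hash(a, b, m, k);
                if (used[slot]) {                // two keys share a slot
                    collision = true;
                    break;
                }
                used[slot] = true;
            }
            if (!collision) {
                return new int[] { a, b };       // this function is perfect for these keys
            }
        }
    }
}
```

In the standard perfect-hashing scheme the secondary table for n keys has about n*n slots, which keeps the expected number of retries small; check how the project code sizes its secondary tables rather than relying on this sketch.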

The test for this class is PHMapTest. After writing those constructors (i.e., before writing the iterator), all the test cases that don't have the word iterator in their name should pass.

The iterator is an interesting problem since it requires you to iterate through an array of (secondary) hashtables. In my own solution, I wrote an iterator for SecondaryMap and made the iterator for PerfectHashMap an "iterator of iterators"; that is, an iterator over the current secondary table is part of the state of the iterator of the primary table. But, as is noted in a comment, it isn't required that SecondaryMap have an iterator at all, and you may choose to write your iterator a different way. Just don't simply save the big list of all keys given to the constructor and iterate through that. The iterator of PerfectHashMap should return only those keys currently associated with something in the map, not necessarily all possible keys.
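
The general "iterator of iterators" pattern can be sketched apart from the project classes as follows. This is only an illustration under assumptions: the class name NestedIteratorSketch is hypothetical, the outer structure is a List of Iterables rather than the project's array of SecondaryMap objects, and null entries stand in for empty primary slots. It is not my solution and not the required design.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not part of the project code base.
final class NestedIteratorSketch<K> implements Iterator<K> {
    private final List<Iterable<K>> tables;   // outer collection; entries may be null
    private int outer = 0;                     // index of the next outer entry to open
    private Iterator<K> current = null;        // iterator over the entry currently open

    NestedIteratorSketch(List<Iterable<K>> tables) {
        this.tables = tables;
        advance();
    }

    // Move forward until current has an element or the outer collection runs out.
    private void advance() {
        while ((current == null || !current.hasNext()) && outer < tables.size()) {
            Iterable<K> t = tables.get(outer++);
            current = (t == null) ? null : t.iterator();
        }
    }

    @Override
    public boolean hasNext() {
        return current != null && current.hasNext();
    }

    @Override
    public K next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        K key = current.next();
        advance();                             // skip ahead past empty or null entries
        return key;
    }
}
```

The outer iterator never inspects keys itself, so whether it returns only the keys currently in the map depends entirely on what each inner iterator yields.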

4. Turn in

Copy the file you modified (PerfectHashMap) to your turn-in folder /cslab/class/cs345/(your id)/perfecthash .

To keep up with the course, this should be finished by April 29.


Thomas VanDrunen
Last modified: Fri Dec 21 15:29:13 CST 2018