Lab 11: Implementing a hash map in C

The goal of this lab is to practice implementing a hash table and to reinforce your understanding of pointers and dynamic memory in C.

1. Set up

Copy the given code to an appropriate directory:

cp ~tvandrun/Public/cs245/lab11/* .

2. Introduction

Alright already, you're probably sick to death of writing maps by now, especially if they're linked-list based. You did this in Java twice (project 3 and lab 7). This is the last one, I promise. Also, this one is mostly written for you, with just three carefully chosen functions for you to implement.

A part of this lab, then, is reading through some of the given code with your partner so that (a) you are certain you get how to do linked lists by now, (b) you understand how hashing works, especially how we can use it to speed up a map, and (c) you understand how pointers and dynamic allocation work in C. The tricky part, of course, is the interaction between the array containing the buckets and the linked lists that make up the buckets.

The hashmap you finish here will have one extra feature in addition to what we did in class: When your code detects that the buckets are too full, either because there is imbalance or simply because the hashmap itself is too full, you will rehash---make a bigger array of buckets, and redistribute the items.

First you need to become familiar with the given code (there is a lot of it). The program driver.c exercises your hashmap, using it to associate countries with capitals.

3. The struct

Your first task is to understand the structs hashmap_t and node_t in the file hashmap.h. They are a little different from the class StringHashMap from class. For example, since C arrays do not carry their own length, the struct has to record the number of buckets itself. The struct also needs to hold the number of items in each bucket (itself an array) and the total number of items so we can monitor how balanced the hashmap is.
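For orientation, a pair of structs along the following lines would fit that description. This is only a sketch: the type names (sketch_node_t, sketch_map_t), the field names, and the helper counts_consistent() are all assumptions made for illustration, and the real definitions in hashmap.h are the ones that count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for illustration; see hashmap.h for the real one. */
typedef struct sketch_node {
    char *key;                  /* e.g. a country */
    char *value;                /* e.g. its capital */
    struct sketch_node *next;   /* next node in the same bucket, or NULL */
} sketch_node_t;

typedef struct {
    sketch_node_t **buckets;    /* array of list heads, one per bucket */
    int num_buckets;            /* stored explicitly: C arrays do not know their length */
    int *bucket_sizes;          /* bucket_sizes[i] = number of nodes in buckets[i] */
    int total;                  /* total number of items in the map */
} sketch_map_t;

/* Sanity check on the bookkeeping: the per-bucket counts should add up
   to the total. Returns 1 if they do, 0 otherwise. */
int counts_consistent(const sketch_map_t *map) {
    assert(map != NULL);
    int sum = 0;
    for (int i = 0; i < map->num_buckets; i++)
        sum += map->bucket_sizes[i];
    return sum == map->total;
}
```

A check like counts_consistent() can be handy while debugging, since every function that adds or removes a node has to update two counters.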

4. create()

Read and understand the create() function, comparing it with the hashmap_t struct.

5. hash()

Read and understand the hash() function, which is similar to the hash() method from StringHashMap.

6. getNode(), put(), get(), rem(), and containsKey()

In some ways, these are the "easy" ones, because they will be somewhat similar to the versions in the Java example from class. Read them carefully and ask if there is anything you don't understand. Note at the end of put() how rehash() is called if the number of items is more than five times the number of buckets or if the bucket to which we just added now holds more than 10 items (we maintain an invariant that no bucket has more than 10).
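The rehash test at the end of put() boils down to a two-part condition. The sketch below restates it as a standalone function; the function name, parameter names, and macro names are assumptions for illustration, and only the thresholds 5 and 10 come from the description above.

```c
#include <assert.h>

#define MAX_LOAD_FACTOR 5    /* average items per bucket before rehashing */
#define MAX_BUCKET_SIZE 10   /* most items any single bucket may hold */

/* Returns 1 if either rehash condition holds, 0 otherwise.
   bucket_size is the size of the bucket that just received an item. */
int needs_rehash(int total, int num_buckets, int bucket_size) {
    assert(num_buckets > 0);
    return total > MAX_LOAD_FACTOR * num_buckets
        || bucket_size > MAX_BUCKET_SIZE;
}
```

For example, with 10 buckets, 50 items do not trigger a rehash on the first condition, but the 51st does; and a bucket reaching 11 items triggers one no matter how empty the rest of the map is.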

7. keys()

Now for a "warm-up" task. Since there is no equivalent to iterators in C (unless you're really clever), I've specified this project so that there will be this function which returns an array of all the keys. Look at the driver to see how this is used. Notice that this function must allocate a new array, and it is the driver's responsibility to free it.
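The allocate-in-the-function, free-in-the-caller pattern looks roughly like this. The function copy_keys() is a hypothetical stand-in that copies from a plain array rather than walking buckets, and it assumes the result holds pointers to the existing key strings rather than fresh copies; check the given code and driver.c for what keys() is actually expected to do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for keys(): returns a newly allocated array holding
   the n key pointers from source. The caller owns the array itself, but not
   the strings it points to. Returns NULL if allocation fails. */
char **copy_keys(char *const source[], int n) {
    assert(n >= 0);
    char **result = malloc((size_t)n * sizeof *result);
    if (result == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        result[i] = source[i];   /* copy the pointer, not the characters */
    return result;
}
```

A caller would use the returned array and then call free() on it exactly once. Under the assumption above, the caller frees only the array, because the strings still belong to the map.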

8. rehash()

Here is your main task, and a relatively challenging one. Write rehash(). Think carefully about how you can make a new set of buckets and redistribute the items.

(It might make your job easier to make a "temporary" hashmap and make use of your put(), rem(), get(), and keys() functions---but be careful. Ending your function with map = temp will not work. You need to modify the hashmap "object" that the parameter map points to.)

Also, don't forget to free things no longer in use.

9. destroy()

Finally, there's a lot to clean up (and null-out): nodes, array (or arrays), and the entire struct.


Thomas VanDrunen
Last modified: Fri Nov 22 17:00:59 CST 2013