The goal of this lab is to practice implementing a hash table and to reinforce your understanding of pointers and dynamic memory in C.
Copy the given code to an appropriate directory:
cp ~tvandrun/Public/cs245/lab12/* .
Alright already, you're probably sick to death of writing maps by now, especially if they're linked-list based. You did this in C in lab 8 and in Java in lab 9. This is the last one, I promise. Also, this one is mostly written for you, with just three carefully chosen functions for you to implement.
A part of this lab, then, is reading through some of the given code with your partner so that (a) you are certain you get how to do linked lists by now, (b) you understand how hashing works, especially how we can use it to speed up a map, and (c) you understand how pointers and dynamic allocation work in C. The tricky part, of course, is the interaction between the array containing the buckets and the linked lists that make up the buckets.
The hashmap you finish here will have one extra feature in addition to what we did in class: When your code detects that the buckets are too full, either because there is imbalance or simply because the hashmap itself is too full, you will rehash---make a bigger array of buckets, and redistribute the items.
First you need to become familiar with the given code (there is a lot of it).
The program driver.c exercises your hashmap, using it to associate countries with capitals (like in Lab 9).
Your first task is to understand the structs hashmap_t and node_t in the file hashmap.h. They are a little different from the class StringHashMap from class. For example, since C arrays do not carry their own length, the struct must record the number of buckets. The struct also needs to hold the number of items in each bucket (itself an array) and the total number of items so we can monitor how balanced the hashmap is.
create()
Read and understand the create() function, comparing it with the hashmap_t struct.
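To make the comparison concrete, here is a minimal sketch of what a struct with those fields and a matching allocation routine might look like. Every name in it (sketch_node_t, sketch_map_t, sketch_create, and all field names) is an assumption made for illustration; the real definitions are the ones in hashmap.h and hashmap.c, and they may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node: one key/value pair in a bucket's linked list. */
typedef struct sketch_node {
    char *key;
    char *value;
    struct sketch_node *next;
} sketch_node_t;

/* Hypothetical map: field names are assumptions, not those in hashmap.h. */
typedef struct {
    sketch_node_t **buckets;  /* array of list heads, one per bucket */
    int *bucket_sizes;        /* number of items in each bucket */
    int num_buckets;          /* length of both arrays */
    int num_items;            /* total items across all buckets */
} sketch_map_t;

/* Allocate an empty map with n buckets; returns NULL if allocation fails. */
sketch_map_t *sketch_create(int n) {
    assert(n > 0);
    sketch_map_t *m = malloc(sizeof *m);
    if (m == NULL) return NULL;
    m->buckets = malloc((size_t)n * sizeof *m->buckets);
    m->bucket_sizes = malloc((size_t)n * sizeof *m->bucket_sizes);
    if (m->buckets == NULL || m->bucket_sizes == NULL) {
        free(m->buckets);        /* free(NULL) is a harmless no-op */
        free(m->bucket_sizes);
        free(m);
        return NULL;
    }
    for (int i = 0; i < n; i++) {
        m->buckets[i] = NULL;    /* every bucket starts as an empty list */
        m->bucket_sizes[i] = 0;
    }
    m->num_buckets = n;
    m->num_items = 0;
    return m;
}
```

Note that there are three separate allocations here (the struct and two arrays), which is worth keeping in mind when you get to cleanup.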
hash()
Read and understand the hash() function, which is similar to the hash() method from StringHashMap.
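If it helps to see the general idea in isolation, the sketch below shows one common way to hash a string into a bucket index: the multiply-by-31 polynomial scheme that Java's String.hashCode() uses, reduced modulo the number of buckets. This is an assumption about the general shape only; the given hash() may use a different formula, and sketch_hash is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: polynomial string hash reduced to a bucket index. */
unsigned int sketch_hash(const char *key, unsigned int num_buckets) {
    assert(key != NULL && num_buckets > 0);
    unsigned int h = 0;
    for (const char *p = key; *p != '\0'; p++) {
        /* Unsigned arithmetic wraps on overflow, which is well defined in C. */
        h = 31u * h + (unsigned char)*p;
    }
    return h % num_buckets;   /* always in the range 0 .. num_buckets - 1 */
}
```

Because the result depends on num_buckets, the same key can land in a different bucket once the array of buckets changes size.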
getNode(), put(), get(), rem(), and containsKey()
In some ways, these are the "easy" ones, because they will be somewhat similar to the versions in the Java example from class. Read them carefully and ask if there is anything you don't understand. Note at the end of put() how rehash() is called if the number of items is more than five times the number of buckets or if the bucket to which we just added exceeds 10 items (we maintain an invariant that no bucket has more than 10).
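The two trigger conditions can be written as a single predicate. The sketch below is only an illustration of the rule as stated above; sketch_needs_rehash and its parameter names are hypothetical, and the given put() may express the check differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate using the thresholds described in the text:
   more than five items per bucket on average, or a bucket with more than 10. */
bool sketch_needs_rehash(int num_items, int num_buckets, int bucket_size) {
    assert(num_buckets > 0);
    return num_items > 5 * num_buckets || bucket_size > 10;
}
```

For example, with 10 buckets, the 51st item triggers a rehash even if the items are spread perfectly evenly, and an 11th item in any one bucket triggers it no matter how empty the rest of the map is.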
rehash()
Now for your task, and it's a relatively challenging one. Write rehash(). Think carefully about how you can make a new set of buckets and redistribute the items. (It might make your job easier to make a "temporary" hashmap and make use of your put, rem, get, and keys() functions, but be careful. Ending your function with map = temp will not work. You need to modify the hashmap "object" that the parameter map points to.) Also, don't forget to free things no longer in use.
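The warning about map = temp comes down to C passing pointers by value. The sketch below shows the difference on a deliberately tiny stand-in struct rather than on the hashmap itself; box_t, grow_wrong, and grow_right are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct that owns a dynamic array. */
typedef struct {
    int *slots;
    int num_slots;
} box_t;

/* Does NOT work: reassigning the parameter changes only this function's
   local copy of the pointer, so the caller's struct is untouched. */
void grow_wrong(box_t *box) {
    box_t *temp = malloc(sizeof *temp);
    if (temp == NULL) return;
    temp->num_slots = box->num_slots * 2;
    temp->slots = calloc((size_t)temp->num_slots, sizeof *temp->slots);
    box = temp;            /* the caller never sees this assignment */
    free(box->slots);      /* clean up so the demo itself does not leak */
    free(box);
}

/* Works: write through the pointer into the struct the caller owns. */
void grow_right(box_t *box) {
    assert(box != NULL && box->num_slots > 0);
    int new_n = box->num_slots * 2;
    int *new_slots = calloc((size_t)new_n, sizeof *new_slots);
    if (new_slots == NULL) return;       /* leave box unchanged on failure */
    for (int i = 0; i < box->num_slots; i++) {
        new_slots[i] = box->slots[i];    /* carry the old contents over */
    }
    free(box->slots);                    /* old array is no longer in use */
    box->slots = new_slots;
    box->num_slots = new_n;
}
```

After grow_wrong(&b) the caller's b still has its old size; after grow_right(&b) it has the new array and the old one has been freed. The same reasoning applies to the fields of the hashmap that map points to.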
keys()
Since there is no equivalent to iterators in C (unless you're really clever), I've specified this project to include this function, which returns an array of all the keys. Look at the driver to see how it is used. Notice that this function must allocate a new array, and it is the driver's responsibility to free it.
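The "callee allocates, caller frees" pattern can be sketched on a single linked list. Everything here is hypothetical (item_t, list_keys), and the sketch assumes the array holds pointers to the existing key strings rather than copies; check the driver to see which convention the lab actually uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node holding just a key. */
typedef struct item {
    const char *key;
    struct item *next;
} item_t;

/* Returns a newly allocated array of the first n keys in the list.
   Assumption: the array stores pointers to the keys, not copies, so the
   caller must free() the array itself but not the strings.
   Returns NULL if n is 0 or if allocation fails. */
const char **list_keys(const item_t *head, int n) {
    assert(n >= 0);
    if (n == 0) return NULL;
    const char **out = malloc((size_t)n * sizeof *out);
    if (out == NULL) return NULL;
    int i = 0;
    for (const item_t *p = head; p != NULL && i < n; p = p->next) {
        out[i++] = p->key;
    }
    return out;
}
```

A caller would loop over the returned array up to the known item count and then call free() on it exactly once. For a hashmap, the same walk would be repeated for each bucket, filling one shared array.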
destroy()
Finally, there's a lot to clean up (and null-out): nodes, array (or arrays), and the entire struct.
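One detail that is easy to get wrong when freeing nodes is reading a node's next pointer after the node has been freed. The sketch below shows the safe order on a plain linked list; cell_t, cell_push, and cell_free_all are hypothetical names, and this is a generic pattern rather than the lab's destroy().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node. */
typedef struct cell {
    int payload;
    struct cell *next;
} cell_t;

/* Push a new cell at the front; returns the new head,
   or the old head unchanged if allocation fails. */
cell_t *cell_push(cell_t *head, int payload) {
    cell_t *c = malloc(sizeof *c);
    if (c == NULL) return head;
    c->payload = payload;
    c->next = head;
    return c;
}

/* Free every cell, null out the caller's head pointer,
   and return how many cells were freed. */
int cell_free_all(cell_t **head) {
    assert(head != NULL);
    int freed = 0;
    cell_t *p = *head;
    while (p != NULL) {
        cell_t *next = p->next;  /* save next BEFORE freeing p */
        free(p);
        p = next;
        freed++;
    }
    *head = NULL;                /* no dangling pointer left behind */
    return freed;
}
```

In a hashmap the same loop would run once per bucket, followed by freeing the arrays and finally the struct itself, innermost allocations first.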