The goal of this lab is to practice implementing a hash table and to reinforce your understanding of pointers and dynamic memory in C.
Copy the given code to an appropriate directory:
cp ~tvandrun/Public/cs245/lab14/* .
In this lab, you will implement a hash map in C, similar to the StringHashMap written in Java that we saw in class.
However, your hash map will have one extra feature: when your code detects that the buckets are too full, either because they are imbalanced or simply because the hash map as a whole is too full, you will rehash, that is, make a bigger array of buckets and redistribute the items.
First you need to become familiar with the given code (there is a lot of it).
The program driver.c exercises your hashmap, using it to associate countries with capitals. It then tests containsKey() positively and negatively, tests rem() (remove), and iterates over the hashmap.
(If you look at the list of countries, you'll notice some entries like Northern Ireland, Puerto Rico, Transnistria, and Palestine, which for various reasons (status, independence, recognition, etc.) raise the question of how we define "country". I simply grabbed this list from Wikipedia's list of national capitals, which you can check out yourself if you're interested in the political status of any of the nations included. No political message is intended on my part by including or excluding any entity.)
Your first task is to understand the structs hashmap_t and node_t in the file hashmap.h. They are a little different from the class StringHashMap from class. For example, since C arrays do not carry their own length, the struct needs to store the number of buckets. The struct also needs to hold the number of items in each bucket (itself an array) and the total number of items so we can monitor how balanced the hashmap is.
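To make the bookkeeping concrete, here is a rough sketch of what structs like these could look like. This is not the contents of hashmap.h: the type names sketch_node_t and sketch_map_t, all field names, and the helper sketch_counts_agree() are made up for illustration, so read the real header for the actual layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout; the real definitions are in hashmap.h. */
typedef struct sketch_node {
    char *key;
    char *value;
    struct sketch_node *next;   /* next node in the same bucket's chain */
} sketch_node_t;

typedef struct {
    sketch_node_t **buckets;    /* array of numBuckets chain heads */
    int *bucketSizes;           /* bucketSizes[i] = items in buckets[i] */
    int numBuckets;             /* length of both arrays above */
    int numItems;               /* total items across all buckets */
} sketch_map_t;

/* Checks that the per-bucket counts add up to the recorded total. */
bool sketch_counts_agree(const sketch_map_t *map) {
    assert(map != NULL);
    int sum = 0;
    for (int i = 0; i < map->numBuckets; i++)
        sum += map->bucketSizes[i];
    return sum == map->numItems;
}
```

A check like this can be handy while debugging: if the counts ever disagree, some function forgot to update one of them.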
create()
Read and understand the create() function, comparing it with the hashmap_t struct.
hash()
Read and understand the hash() function, which is similar to the hash() method from StringHashMap.
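If you want something to compare against while reading, here is a sketch of one common way to hash a string: the multiply-by-31 polynomial scheme that Java's String.hashCode() uses, reduced modulo the number of buckets. The function name sketch_hash and its signature are assumptions for illustration; the given hash() may well differ in its details.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the lab's hash(). Maps a string to a
 * bucket index in the range [0, numBuckets). */
unsigned int sketch_hash(const char *key, unsigned int numBuckets) {
    assert(key != NULL && numBuckets > 0);
    unsigned int h = 0;
    for (const unsigned char *p = (const unsigned char *)key; *p != '\0'; p++)
        h = 31u * h + *p;   /* unsigned arithmetic wraps around safely */
    return h % numBuckets;
}
```

Using an unsigned accumulator avoids the signed-overflow problem you would otherwise have to think about in C, and it guarantees the final index is never negative.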
getNode(), put(), get(), rem(), and containsKey()
In some ways, these are the "easy" ones, because they will be somewhat similar to the versions in the Java example from class. Read them carefully and ask if there is anything you don't understand. Note at the end of put() how rehash() is called if the number of items is more than five times the number of buckets or if the bucket to which we just added exceeds 10 items (we maintain an invariant that no bucket has more than 10 items).
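The two rehash conditions described above can be written as a single boolean test. The sketch below uses a hypothetical function name and plain int parameters instead of the real struct fields, so treat it as a restatement of the rule rather than the code you will find in put().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when either rehash condition holds.
 * bucketSize is the size of the bucket that just received an item. */
bool sketch_needs_rehash(int numItems, int numBuckets, int bucketSize) {
    assert(numBuckets > 0);
    return numItems > 5 * numBuckets   /* map as a whole is too full */
        || bucketSize > 10;            /* this one bucket is too full */
}
```

For example, with 10 buckets, 50 items do not trigger a rehash but 51 do, and a bucket holding 11 items triggers one no matter how empty the rest of the map is.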
rehash()
Now for your task, and it's a hard one. Write rehash(). Think carefully about how you can make a new set of buckets and redistribute the items. (It might make your job easier to make a "temporary" hashmap and make use of your put(), rem(), get(), and keys() functions, but be careful. Ending your function with map = temp will not work. You need to modify the hashmap "object" that the parameter map points to.) Also, don't forget to free things no longer in use.
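The pointer pitfall is easier to see on a tiny stand-in than on the whole hashmap. The sketch below uses a made-up struct box_t and function grow(), neither of which is part of the lab code. It doubles an array held inside a struct and shows the assignment that actually changes what the caller sees.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a struct that owns a dynamically allocated array. */
typedef struct {
    int *slots;
    int numSlots;
} box_t;

/* Doubles the capacity of the box that the parameter points to. */
void grow(box_t *box) {
    assert(box != NULL);
    box_t temp;
    temp.numSlots = box->numSlots * 2;
    temp.slots = calloc((size_t)temp.numSlots, sizeof *temp.slots);
    if (temp.slots == NULL)
        return;                     /* leave box unchanged on failure */
    for (int i = 0; i < box->numSlots; i++)
        temp.slots[i] = box->slots[i];
    free(box->slots);               /* old array is no longer in use */
    *box = temp;                    /* overwrite the caller's struct */
    /* By contrast, `box = &temp;` would only change this function's
     * local copy of the pointer; the caller would see no change. */
}
```

The same idea applies to rehash(): build the new contents on the side, free what the old struct owned, then copy the new fields into the struct that map points to.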
keys()
Since there is no equivalent to iterators in C (unless you're really clever), I've specified this project so that there is a function that returns an array of all the keys. Look at the driver to see how this is used. Notice that this function must allocate a new array, and it is the driver's responsibility to free it.
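The "callee allocates, caller frees" pattern looks roughly like the sketch below, which collects the keys from a single linked chain. The names key_node_t and sketch_keys() and the explicit count parameter are assumptions for illustration; the real keys() takes the hashmap and has to walk every bucket.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in node type; the lab's node_t is defined in hashmap.h. */
typedef struct key_node {
    const char *key;
    struct key_node *next;
} key_node_t;

/* Returns a newly allocated array holding up to count keys from the
 * chain. The caller must free the array, but not the strings in it,
 * which still belong to the nodes. Returns NULL if allocation fails. */
const char **sketch_keys(const key_node_t *head, int count) {
    assert(count >= 0);
    const char **out = malloc((size_t)count * sizeof *out);
    if (out == NULL)
        return NULL;
    int i = 0;
    for (const key_node_t *n = head; n != NULL && i < count; n = n->next)
        out[i++] = n->key;
    return out;
}
```

Note that the array holds pointers to the existing key strings rather than copies, so freeing the array alone is enough on the caller's side.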
destroy()
Finally, there's a lot to clean up (and null-out): nodes, array (or arrays), and the entire struct.
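As a rough picture of the cleanup order, here is a sketch on simplified stand-in types. The names d_node_t, d_map_t, and sketch_destroy() are hypothetical, and the sketch assumes each node owns its key and value strings, which may not match the given code. It returns the number of nodes freed purely so the result can be checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in types; the real ones are in hashmap.h. */
typedef struct d_node {
    char *key;
    char *value;
    struct d_node *next;
} d_node_t;

typedef struct {
    d_node_t **buckets;
    int numBuckets;
} d_map_t;

/* Frees every node, then the bucket array, then the struct itself. */
int sketch_destroy(d_map_t *map) {
    if (map == NULL)
        return 0;
    assert(map->numBuckets >= 0);
    int freed = 0;
    for (int i = 0; i < map->numBuckets; i++) {
        d_node_t *n = map->buckets[i];
        while (n != NULL) {
            d_node_t *next = n->next;   /* save before freeing n */
            free(n->key);               /* assumes the node owns these */
            free(n->value);
            free(n);
            freed++;
            n = next;
        }
        map->buckets[i] = NULL;         /* null out the stale pointer */
    }
    free(map->buckets);
    map->buckets = NULL;
    free(map);
    return freed;
}
```

The order matters: anything you still need to read (the next pointer, the bucket array) has to be read before the memory that holds it is freed.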