Project 5: Dynamic memory in C

The goal of this project is for you to practice using pointers and dynamic memory in C. Along the way, it will help you to think through how strings work in C, and also to think carefully about hashing and maps.

This is a longer project, so be sure you leave plenty of time for it.

Set up

Find the given code for this project in ~tvandrun/Public/cs245/proj5. There are two subdirectories, one for each part of the project, so you will need the -r flag when you copy:

cp -r ~tvandrun/Public/cs245/proj5 .

Part I: Homemade strings in C

In the hmstring directory, write "homemade" versions of standard C string functions. In all of these, use the end-of-string marker to determine the boundaries of the strings. In all of them, you may assume there is enough memory allocated to do what is required; that is the responsibility of the code calling these functions.
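As an illustration of what "use the end-of-string marker" means in practice, here is a minimal sketch of two such functions. The names hm_strlen and hm_strcpy are hypothetical; the given code in hmstring may use different names and signatures.

```c
#include <stddef.h>

/* Hypothetical name: counts characters up to, but not including, '\0'. */
size_t hm_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')      /* walk until the end-of-string marker */
        p++;
    return (size_t)(p - s);
}

/* Hypothetical name: copies src, including its '\0', into dest.
   The caller is responsible for dest having enough room. */
char *hm_strcpy(char *dest, const char *src)
{
    char *d = dest;
    while ((*d++ = *src++) != '\0')
        ;                   /* the copy happens in the loop condition */
    return dest;
}
```

Neither function is ever told a length; the '\0' byte is the only boundary information available.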

Part II: Implementing a hash map in C

1. Introduction

In this part, you will implement a hash map in C, similar to the StringHashMap written in Java that we saw in class.

However, your hash map will have one extra feature: when your code detects that the buckets are full, either because there is imbalance or simply because the hash map itself is full, you will rehash. That is, you will make a bigger array of buckets and redistribute the items.
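One way to phrase "the buckets are full" as a test is sketched below. The function name needsRehash and both thresholds are assumptions made for illustration; the assignment does not prescribe particular values.

```c
#include <stdbool.h>

#define MAX_LOAD_FACTOR 2   /* assumed: tolerated average items per bucket */
#define MAX_BUCKET_LEN  8   /* assumed: longest tolerated single bucket */

/* Hypothetical helper: true if the map is too full overall
   or if one bucket has grown too long (imbalance). */
bool needsRehash(int numItems, int numBuckets, int longestBucket)
{
    bool tooFull = numItems > MAX_LOAD_FACTOR * numBuckets;
    bool imbalanced = longestBucket > MAX_BUCKET_LEN;
    return tooFull || imbalanced;
}
```

With these assumed thresholds, 17 items in 8 buckets would trigger a rehash, as would a single bucket holding 9 items.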

The given code can be found in the pt2 directory. The program driver.c exercises your hash map, using it to associate countries with capitals. It then tests containsKey() positively and negatively, tests rem() (remove), and iterates over the hash map.

(If you look at the list of countries, you'll notice some entries like Northern Ireland, Puerto Rico, Transnistria, and Palestine, which for various reasons (status, independence, recognition, etc.) raise the question of how we define a country. I simply grabbed this list from Wikipedia's list of national capitals, which you can check out yourself if you're interested in the political status of any of the nations included. No political message is intended on my part by including or excluding any entity.)

2. The struct

Your first task is to finish the struct hashmap_t. You will need to think through this task carefully, as it will be a little different from the StringHashMap class we saw in class; for example, C arrays do not carry their own length. You may find that your first attempt is incorrect and that you will have to revise this struct as you go along. One thing in particular to think about is that you will need to know how many items are in each bucket to monitor how balanced the hash map is.

I have provided a node_t struct, the one that I used in my solution. You may choose to modify it, however.

3. create()

Write the create() function, thinking carefully about all the things in your struct that need to be initialized.
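To make "all the things that need to be initialized" concrete, here is one possible shape for the struct and its create() function. Everything here is an assumption for illustration: the node_t layout, the field names, and the numBuckets parameter may all differ from the given hashmap.h and from what your design needs.

```c
#include <stdlib.h>

/* Assumed node layout; the provided node_t may differ. */
typedef struct node {
    char *key;
    char *value;
    struct node *next;
} node_t;

/* Assumed fields: C arrays carry no length, so it is stored explicitly. */
typedef struct {
    node_t **buckets;    /* array of numBuckets list heads */
    int *bucketSizes;    /* bucketSizes[i] = number of items in bucket i */
    int numBuckets;
    int numItems;
} hashmap_t;

/* Sketch of create(); the real signature may take no parameter. */
hashmap_t *create(int numBuckets)
{
    hashmap_t *map = malloc(sizeof *map);
    if (map == NULL)
        return NULL;

    map->buckets = malloc(numBuckets * sizeof *map->buckets);
    map->bucketSizes = malloc(numBuckets * sizeof *map->bucketSizes);
    if (map->buckets == NULL || map->bucketSizes == NULL) {
        free(map->buckets);      /* free(NULL) is harmless */
        free(map->bucketSizes);
        free(map);
        return NULL;
    }

    /* malloc does not initialize memory, so every slot is set by hand. */
    for (int i = 0; i < numBuckets; i++) {
        map->buckets[i] = NULL;
        map->bucketSizes[i] = 0;
    }
    map->numBuckets = numBuckets;
    map->numItems = 0;
    return map;
}
```

Unlike a Java constructor, nothing here is zeroed for you: any field left unset holds garbage.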

4. hash()

The function hash() does not appear in hashmap.h because the client code does not need to use it. You'll need to write it for use in hashmap.c, though. It requires the number of buckets in order to compute the index properly. You may use the hash algorithm found in the in-class example, or you may research and implement a better one (if you do, document it).
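As an example of a documented alternative, the sketch below uses the well-known djb2 string hash (multiply by 33, add the next character, starting from 5381). This is not the in-class algorithm, and the parameter types are assumptions.

```c
/* djb2 string hash, reduced to a bucket index.
   In hashmap.c this could be declared static, since clients never call it. */
unsigned int hash(const char *key, unsigned int numBuckets)
{
    unsigned long h = 5381;
    for (const unsigned char *p = (const unsigned char *)key; *p != '\0'; p++)
        h = h * 33 + *p;               /* unsigned overflow wraps, which is fine */
    return (unsigned int)(h % numBuckets);  /* index in 0 .. numBuckets-1 */
}
```

Because the index depends on numBuckets, the same key generally lands in a different bucket after the bucket array grows, which is why rehashing must redistribute every item.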

The rehash() function comes next in the file, but I recommend putting that one off until you have more experience from writing some of the other functions.

5. getNode(), put(), get(), rem(), and containsKey()

In some ways, these are the "easy" ones, because they will be somewhat similar to the versions in the Java example from class. Note, however,

6. rehash()

This is one of the hardest parts of the project. Think carefully about how you can make a new set of buckets and redistribute the items.

(It might make your job easier to make a "temporary" hashmap and make use of your put(), rem(), get(), and keys() functions, but be careful. Ending your function with map = temp will not work. You need to modify the hashmap "object" that the parameter map points to.)

Also, don't forget to free things no longer in use.
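The warning about map = temp comes down to C passing pointers by value. The sketch below isolates that point using a stand-in struct with a single field; the function names growWrong and growRight are hypothetical, and a real rehash() would also have to free the old buckets before overwriting the struct.

```c
#include <stdlib.h>

/* Stand-in for the real struct, reduced to one field. */
typedef struct {
    int numBuckets;
} hashmap_t;

/* Has no effect on the caller's map: only the local pointer changes. */
void growWrong(hashmap_t *map)
{
    hashmap_t *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return;
    temp->numBuckets = map->numBuckets * 2;
    map = temp;     /* reassigns the parameter, a local copy of the pointer */
    free(map);      /* this frees temp's struct, so the sketch does not leak */
}

/* Modifies the struct that the caller's pointer refers to. */
void growRight(hashmap_t *map)
{
    hashmap_t *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return;
    temp->numBuckets = map->numBuckets * 2;
    *map = *temp;   /* copies the fields into the caller's struct */
    free(temp);     /* frees only the temporary shell, not what it pointed to */
}
```

After growWrong(&m), m is unchanged; after growRight(&m), m.numBuckets has doubled.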

7. numKeys()

This is an extra little function so the client code can determine the number of keys; it's necessary for iteration over the keys.

8. keys()

Since there is no equivalent to iterators in C (unless you're really clever), I've specified this project so that there will be this function which returns an array of all the keys. Look at the driver to see how this is used. Notice that this function must allocate a new array, and it is the driver's responsibility to free it.
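The ownership pattern described here can be sketched as follows. The struct layouts and the signatures of keys() and numKeys() are assumptions for illustration; check hashmap.h and driver.c for the real ones.

```c
#include <stdlib.h>

/* Assumed layouts; the given node_t and your hashmap_t may differ. */
typedef struct node {
    char *key;
    char *value;
    struct node *next;
} node_t;

typedef struct {
    node_t **buckets;
    int numBuckets;
    int numItems;
} hashmap_t;

int numKeys(hashmap_t *map)
{
    return map->numItems;
}

/* Returns a newly allocated array of numKeys(map) key pointers.
   The caller must free the array, but not the keys themselves,
   since the map's nodes still refer to them. */
char **keys(hashmap_t *map)
{
    int n = map->numItems > 0 ? map->numItems : 1;  /* avoid malloc(0) */
    char **result = malloc(n * sizeof *result);
    if (result == NULL)
        return NULL;

    int k = 0;
    for (int i = 0; i < map->numBuckets; i++)
        for (node_t *cur = map->buckets[i]; cur != NULL; cur = cur->next)
            result[k++] = cur->key;
    return result;
}
```

A caller would loop from 0 to numKeys(map) - 1 over the returned array and then call free() on it exactly once.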

9. destroy()

Finally, there's a lot to clean up (and null-out): nodes, array (or arrays), and the entire struct.

To turn in

Turn in a hard copy of a script showing all your code and the results of running the drivers.

DUE: Wed, Apr 6, at 5 PM.


Thomas VanDrunen
Last modified: Mon Apr 4 16:38:44 CDT 2011