Project: Bit Vectors

1. Introduction

The goal of this project is to reason through and implement a bit vector as a strategy for implementing a set of integers. This project is similar to an assignment from Programming II (Project 5 in Fall 2015 and Lab 11 in Spring 2015), so some elements will be review.

An N-set is an abstract data type similar to a dynamic set. Specifically, its keys are integers drawn from the range [0, N) for some upper bound N. That restriction on its keys affords additional operations beyond those usually associated with the set ADT: complement, difference, intersection, and union. (I will refer to these as whole-set operations because they compute something about the entire set as opposed to a specific key in the set.) An N-set also allows for a very efficient implementation using a bit vector.
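To make the four whole-set operations concrete, here is a small sketch that uses java.util.BitSet from the Java standard library as a stand-in. It is not the project's NSet interface; the class WholeSetSketch and its method names are made up for illustration only.

```java
import java.util.BitSet;

// Hypothetical helper class, not part of the project code.
class WholeSetSketch {
    // Build a BitSet containing exactly the given keys.
    static BitSet of(int... keys) {
        BitSet s = new BitSet();
        for (int k : keys) {
            s.set(k);
        }
        return s;
    }

    // Keys in a or b (or both).
    static BitSet union(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.or(b);
        return r;
    }

    // Keys in both a and b.
    static BitSet intersection(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.and(b);
        return r;
    }

    // Keys in a but not in b.
    static BitSet difference(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.andNot(b);
        return r;
    }

    // Keys in [0, n) that are not in a. The upper bound n is what
    // makes the complement well defined.
    static BitSet complement(BitSet a, int n) {
        BitSet r = (BitSet) a.clone();
        r.flip(0, n);
        return r;
    }
}
```

With N = 9, a = {1, 4, 7}, and b = {4, 5}, the union is {1, 4, 5, 7}, the intersection is {4}, the difference a minus b is {1, 7}, and the complement of a is {0, 2, 3, 5, 6, 8}.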

Conceptually, an N-set can be represented as a boolean array of length N where the values stored in the array indicate whether the position's index is a key in the set. Thus if N is 9 and we have the set {1, 4, 7}, we can represent this with a boolean array with true in positions 1, 4, and 7 and false elsewhere. Adding to or removing from the set is as easy as setting the appropriate array position to true or false, respectively.
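The boolean-array idea can be sketched in a few lines. This is only an illustration under assumed names (BooleanArraySketch and its field present are hypothetical); the project's BArrayNSet may be organized differently, and range checking is omitted here.

```java
// Hypothetical illustration, not the project's BArrayNSet.
class BooleanArraySketch {
    // present[i] is true exactly when i is a key in the set.
    private final boolean[] present;

    // Keys are drawn from [0, n).
    BooleanArraySketch(int n) {
        present = new boolean[n];
    }

    void add(int key) {
        present[key] = true;
    }

    void remove(int key) {
        present[key] = false;
    }

    boolean contains(int key) {
        return present[key];
    }
}
```

For the set {1, 4, 7} with N = 9, calling add(1), add(4), and add(7) on a new BooleanArraySketch(9) leaves true in positions 1, 4, and 7 and false elsewhere.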

The efficiency, especially space efficiency, improves when we use a single bit for each (conceptual) boolean value rather than a value of Java's boolean type. In other words, we use a bit vector to indicate which values are in the N-set. In this project you will implement an N-set both as a plain boolean array and as a bit vector made from an array of bytes.
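As a rough sketch of how a bit vector can be laid out over an array of bytes: key k can live in byte k / 8 at bit position k % 8, so N keys need (N + 7) / 8 bytes. The class name ByteBitVectorSketch, its fields, and the choice of counting bits from the least significant end are all assumptions made for this example; the code provided with the project may use a different layout.

```java
// Hypothetical illustration, not the project's BitVecNSet.
class ByteBitVectorSketch {
    private final byte[] bits;

    // Keys are drawn from [0, n); round up to a whole number of bytes.
    ByteBitVectorSketch(int n) {
        bits = new byte[(n + 7) / 8];
    }

    // Number of bytes used to store the set.
    int byteCount() {
        return bits.length;
    }

    // Turn on the bit for key (assumed layout: least significant bit first).
    void add(int key) {
        bits[key / 8] |= 1 << (key % 8);
    }

    // Turn off the bit for key by masking with the inverted bit.
    void remove(int key) {
        bits[key / 8] &= ~(1 << (key % 8));
    }

    // Test whether the bit for key is on.
    boolean contains(int key) {
        return (bits[key / 8] & (1 << (key % 8))) != 0;
    }
}
```

With N = 9 this uses 2 bytes, whereas a boolean array needs 9 array elements, each typically at least a byte in practice.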

2. Set up

Copy the code from ~tvandrun/Public/cs345/bitvec and make an Eclipse project for it. As usual it has adt, impl, and test packages. Additionally there is an exper package for a quick experiment.

In the adt package, the NSet interface extends Set. There are three implementations of NSet. NaiveNSet is merely an extension of ListSet; it does not use a boolean array or bit vector but provides "brute force" implementations of the whole-set operations, and it is written for you. More important for our purposes are the classes BArrayNSet and BitVecNSet, which use a boolean array and a bit vector, respectively. You will implement these. In some ways, BArrayNSet is a warm-up for BitVecNSet.

The structure of the classes in adt and impl is illustrated in this UML diagram:

There is also an exception class adt.BadNSetParameterException to indicate when an N-set operation is called with a parameter that doesn't make sense---either a number outside the range [0, N) for basic operations or an N-set with a different N or class for whole-set operations.

3. The BArrayNSet class

Your first task is to complete the boolean array implementation, which is tested using test.BANSTest. The instance variables and constructor are given, as are the basic operations add(), contains(), remove(), and isEmpty(). Study these carefully so that you understand how the class works.

The operations for you to write are

4. The BitVecNSet class

Your second task is to complete the bit vector implementation, which is tested using test.BVNSTest. The instance variables and constructor are given to you. Again, study these carefully and understand them thoroughly. The operations add() and complement() are given and are a hint at how to do the other operations.

The operations for you to write are

Of course, isEmpty() could be implemented "easily" by calling size(). However, there is an efficient and fairly simple way to compute isEmpty() from scratch, which you are encouraged to write before tackling the more subtle size(). (Hint: The set is empty if and only if all the bytes are zero.)
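The hint can be sketched on a bare byte array as follows. This is an illustration under assumptions, not the project's code: the class ByteCountSketch and its static methods are hypothetical, and the sketch assumes any unused padding bits in the last byte are always kept at zero. The size computation shown uses Integer.bitCount from the standard library, which is only one possible approach.

```java
// Hypothetical illustration operating on a raw byte array.
class ByteCountSketch {
    // Empty if and only if every byte is zero
    // (assumes padding bits past N are never set).
    static boolean isEmpty(byte[] bits) {
        for (byte b : bits) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    // Count the one-bits in every byte. The mask 0xFF undoes sign
    // extension so that a negative byte is not counted as 32 bits wide.
    static int size(byte[] bits) {
        int count = 0;
        for (byte b : bits) {
            count += Integer.bitCount(b & 0xFF);
        }
        return count;
    }
}
```

Note that isEmpty() can stop at the first nonzero byte, while size() must always visit every byte, which is why calling size() to implement isEmpty() does more work than necessary.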

5. Experiment

Now examine and run the program exper.Experiment, which measures the running time of the three implementations of NSet at different sizes. It measures basic and whole-set operations separately. What do you observe, especially when sizes and different kinds of operations are differentiated?

6. Turn-in

Copy the files you modified (BArrayNSet.java and BitVecNSet.java) to your turn-in folder /cslab.all/linux/class/cs345/(your id)/bitvec .

To keep up with the course, this should be finished by Feb 10.


Thomas VanDrunen
Last modified: Tue Jan 5 14:51:30 CST 2016