CSCI 235 Lab 12: Trees

0. Catching up

Check out a fresh copy of the code from lab 11:

$ hg clone /cslab/class/csci235/labs/lab11
$ cd lab11

Even if you did it on Monday, repeat parts 3 and 4 of Monday’s lab. Take up to about 30 minutes to write removeFromFront() andinsertSorted() in a single version of the file. If one of you finished it Monday and one did not, let the partner who did not previously finish take the lead.

Hand in IList0.java as assignment lab11b.

1. Set up

The goal of this lab is to practice writing operations for binary trees, as an application of and variation on linked lists.

As usual, clone the repository for this lab and then cd into it. If your are in the directory where you were completing lab11, move up to the containing directory

$ cd ..

before cloning the repository

$ hg clone /cslab/class/csci235/labs/lab12
$ cd lab12

Note that because it is a repository, you should feel free to commit your changes at any point that you deem appropriate.

2. Introduction

We have worked some with lists, and in class we introduced another kind of linked structure: the tree. In a tree, a node can have multiple children; the case of at most two children is called a binary tree. If we draw a picture with each node above its children, we get something like this:

        1  
      /   \  
     2     3  
    / \   / \  
   4   5 6   7

We call the node at the top, which has no parent, the root, and the nodes at the bottom, that have no children, are called leaves. (Yes, computer scientists grow trees upside down.)

Change into the tree subdirectory and look at the code there. You’ll see that there are two classes, TreeNode and TreeDriver, that are very similar to what we saw in class. For warm-up, you’ll repeat complete what we started in class.

Look at the main method in TreeDriver to see how it builds the tree drawn above. (Note that, in general, either child of a node might be null.) Note that the node 2 is the left child of the node 1, but it is also the root of a subtree made up of it and its children (and its children’s children...).

Trees are natually traversed in a recursive manner. Fill in the method count() so that it tells you how many nodes there are in the subtree rooted at the node on which it is called. Note that the number of nodes in a tree is 1 (for the node) plus the number of nodes in its left subtree, plus the number of nodes in its right subtree.

When walking a binary tree recursively, you have three choices for when to process a node relative to its children. If you do the node before the children, it is called preorder; if you do it after the children, it is called postorder; and if you do a node between its children, it is called inorder. Fill in the three display methods so that they print out each node’s datum in the appropriate order. (You can print the nodes on one line, but put a space between them.)

Finally, add a few more nodes to the construction of the tree so that there are nodes that have only a left child or only a right child. Make a script of your program running.

Note that in this program, we have a single node class, and we use null to indicate that a particular child (subtree) is empty. In the next program, we will use two different node classes: one will represent interior nodes (which will always have two children), and the other leaf nodes.

3. Parsing

Change into the other directory for this lab; from the tree directory, you can do that with

$ cd ../parse

(Had you noticed that directories form a tree, and that .. names the parent?)

One of the files there is ExprStringSlicer.class, which saves you some work. It would be a good idea if you make it harder to discard, which you can do by taking away your own permission to write it. The command to do that is

$ chmod u-w ExprStringSlicer.class

(A brief explanation: “chmod” means “change mode”; “mode” is another name for permissions. The next part “u-w” says to take away write (w) permission from the user (u) who owns the file—which is you.)

One use for trees is in the grammatical (or syntactical) analysis (or “parsing”) of languages—both human languages and programming languages. We can write a description of a language as a grammar; in the notation below, the means “consists of” and the | separates alternatives. So we can specify the language of fully parenthesized arithmetic expressions with this grammar:

expression  →   integer
              | ( expression op expression )
      op  →   + |-| / | *

This means “An expression is either a single integer, or it is a left parenthesis, a (sub-)expression, an operator, another (sub-)expression, and a right parenthesis. An operator is a plus, a minus, a slash, or a star.” According to this grammar, the following are expressions:

These can be represented as trees. (We can ignore the parentheses now, since the information they give about grouping is contained in the structure of the tree itself.)

15         -           /              +  
          / \         / \           /   \  
        42   17      *   147       *     -  
                    / \           / \   / \  
                  18   10       14  91 22  17

Expressions can be modelled by a node-like interface with two implementing classes: one for leaf nodes containing integers, and one for nodes containing an operator and having two children. Evaluation of the expressions can be performed with a depth-first post-order traversal of the tree.

3a. ExprNode classes

Your first task is to design classes that model the two kinds of nodes, implementing the ExprNode interface. These nodes will have to have evaluate() methods, but we’ll deal with that later.

So, write two classes, one for leaf/integer nodes, the other for non-leaf/operation nodes. What instance variables will each have?

(So that your repository knows about the new files, you will want to either use hg copy to make them as copies of ExprNode.java or hg add if you make them some other way.)

3b. Building expression trees

Next, you will need to write code to build an expression tree given a String containing a fully-parenthesized expression; this process of extracting the structure from a sequence is called parsimg. The driver program Interpreter is intended to work so that the user can write an expression on the command-line, and the program will build an appropriate expression tree, and then evaluate the expression using the tree. The main() method in Interpreter expects the (static) method parse() to turn a string into the appropriately constructed tree of the node classes that you wrote above.

One thing you will find useful is a method Dr. VanDrunen wrote and included in the code you copied. In the class ExprStringSlicer, the static method slice() takes a String, assumed to contain an expression in our grammar. It will return an array of Strings: if the String passed to it is just an integer, then it will return an array with exactly one element, a String version of that integer; if it is passed a String with a parenthesized operation over subexpressions, it will return an array with three elements, the first subexpression, the operator, and the second subexpression. For example, given "5", it will return "5" . Given "((14 * 91) + (22 - 17))", it will return "(14 * 91)", "+", "(22 - 17").

(Extra credit: After you finish the rest of the lab, try writing our own slice() method and use it instead of the provided one.)

Note that ExprStringSlicer.slice() will throw one of several unchecked exceptions if you hand it a malformed string; the most likely is StringIndexOutOfBoundsException. It is reasonable for your constructors to throw similar exceptions (e.g., NumberFormatException) if the arguments passed to them are not well-formed.

When you test out your program, you will need to put quotes around the string you give to Interpreter. For example, you could run the program with

$ java Interpreter "((14 * 91) + (22 - 17))"

The quotes tell the shell to pass ((14 * 91) + (22 - 17)) to Java as a single String (it will be the zeroth item in args). Otherwise every space would be assumed to begin and end a separate String; furthermore, the parentheses would really mess things up.

3c. Evaluating expression trees

Finally, the interpreter calls the evaluate() method on the trees you have built. Write the evaluate methods so that they return the integer value of the given expression.

Create a script file in which you remove the .class files for your sources (do not remove ExprStringSlicer.class!) and then compile and run your program for several different expressions.

4. Turn in

To turn in all of your files, get in the lab12 directory (the one containing directories tree and parse) and hand in the entire directory with

$ /cslab/class/csci235/bin/handin lab12 .

Note the dot, which refers to the current directory.

Be sure you copy your code from today to your own directory, because you’ll want both parts for future projects.