Project 5: Translating from RecRayJay to PoiJay

Once when I was an undergraduate studying in the library on a sunny day in March I looked out the window and saw a quad filled with my classmates lying on blankets and playing Frisbee, soaking in the first warm weather of spring. It was then that I first knew that I wanted to be a professor---to dedicate my life to the prevention of college students having fun.
--- Evil Professor Samoht NenurdNav

The goal of this project is to explore the meaning of programming language constructs such as array creation, array dereference, and C-style structs by compiling them away to a language that has only pointer types (as opposed to struct and array types) and malloc (as opposed to new). Keep in mind that RecRayJay uses Java syntax for arrays and (struct-like) classes, and since it is a subset of Java, its programs can be executed as Java programs. Conceptually, however, it is more like C, except that it has a Java-like new operator and a true array type. We are also assuming that all class/struct/record types are reference/pointer types, so the dot operator in RecRayJay is like the dot operator in Java and the arrow operator in C, not the dot operator in C.

Remember that I'm giving you three and a half weeks to complete this (including all of spring break, should you choose to work on this project over break). You will need that time.

1. Setup

Copy and untar the starter code for this project.

cp ~tvandrun/Public/cs365/proj5.tar .
tar xvf proj5.tar

In it you will find the package for RecRayJay (including recrayjay/translatorVisitor.py, which is the only thing you need to modify), plus the package for PoiJay (since the recrayjay program relies on the PoiJay interpreter) and the package for FunJay (since both RecRayJay and PoiJay inherit some stuff from FunJay).

2. General problem and advice

You will want to reread this section after you have read through the whole project description. Here I am presenting the big picture and giving some practical advice on engineering a very large project incrementally. The next section delves into specifics of the various RecRayJay constructs that need to be handled.

What makes this project especially difficult is that seemingly simple RecRayJay expressions that make indirections will require several steps in the PoiJay code they are translated to. Consider the following:

a[5].g = x.f;

Assume that field f has offset 3 in whatever record type x is and the field g has offset 2 in whatever record type is the base type of array a. Then this statement would be translated to

lhs_1 = a + 5;
lhs_2 = *lhs_1 + 2;
*lhs_2 = *(x + 3);

Thus it is not enough simply to replace a RecRayJay statement with a PoiJay statement. You may in fact need to produce a series of statements. Some expressions may also need statements generated for them, which would then need to be added to the statements generated for the statement they are in. And then what if that expression is itself in an array index, or in an actual parameter, or in a loop guard...
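To convince yourself that the three PoiJay statements above really do the work of a[5].g = x.f, it can help to trace them by hand. The following is a minimal sketch in Python that models memory as a flat list of cells, where a pointer is just an index. The addresses chosen for x, a, and the record that a[5] points to are invented for this illustration; only the offsets (3 for f, 2 for g) come from the example.

```python
# Model memory as a flat list of cells; a "pointer" is an index into it.
heap = [0] * 32

# Assumed layout, invented for this illustration:
#   x points to a record at address 10, so x.f lives in cell 10 + 3
#   a points to an array at address 20, so a[5] lives in cell 20 + 5
#   a[5] holds a pointer to a record at address 4, so a[5].g lives in cell 4 + 2
x, a = 10, 20
heap[x + 3] = 42      # pretend x.f currently holds 42
heap[a + 5] = 4       # a[5] points to the record at address 4

# lhs_1 = a + 5;
lhs_1 = a + 5
# lhs_2 = *lhs_1 + 2;
lhs_2 = heap[lhs_1] + 2
# *lhs_2 = *(x + 3);
heap[lhs_2] = heap[x + 3]

print(heap[6])  # the cell for a[5].g, prints 42
```

Each dereference in the PoiJay code becomes one heap[...] lookup here, which makes it easy to count how many indirections a given RecRayJay expression hides.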

Note that this all also requires generating new temporary variables. Lots and lots of new temporary variables.
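As a rough sketch of one way to hand out temporaries, the following Python class produces names that it has never produced before. The class name TempNamer, the method fresh, and the tmp_ prefix are all hypothetical and are not part of the starter code; the sketch also assumes that no variable in the source program begins with that prefix.

```python
import itertools


class TempNamer:
    """Hands out variable names that have not been handed out before."""

    def __init__(self, prefix="tmp_"):
        # Assumption: source programs never use names starting with prefix.
        self.prefix = prefix
        self.counter = itertools.count(1)

    def fresh(self, hint=""):
        # The optional hint (e.g. "lhs_") makes generated code easier to read.
        return f"{self.prefix}{hint}{next(self.counter)}"


namer = TempNamer()
names = [namer.fresh("lhs_"), namer.fresh("lhs_"), namer.fresh()]
print(names)  # ['tmp_lhs_1', 'tmp_lhs_2', 'tmp_3']
```

A single shared counter keeps every name distinct regardless of the hint, so nested translations cannot accidentally reuse a temporary.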

I recommend you approach this project incrementally by making it work for increasingly complicated cases. For example (the list below isn't exhaustive, or necessarily the best order):

Given helper stuff

Some of the code that appears in recrayjay/translatorVisitor.py as I give it to you contains hints about how I made my solution to this project (note that "how Dr VD did it" is not synonymous with "the best way to do it"). I use the following instance variables in TranslatorVisitor:

Note that this translation involves two AST class packages---that for RecRayJay and that for PoiJay (there was a similar situation in project 3). Since Python uses duck typing, the AST classes in the two packages are interoperable for AST classes that are identical (Block, for example) or even ones where the RecRayJay class merely has more attributes than the PoiJay class (Program, for example; it's ok to feed a recrayjay_ast.Program into the PoiJay visitors because they will simply ignore the "classes" in the recrayjay_ast.Program).
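The duck-typing point can be seen in miniature with two stand-in classes. This is only an illustration: the class names and the decls and body attributes are invented here, and the real AST classes in the two packages may be laid out differently. The only detail taken from the description above is that the RecRayJay Program carries an extra "classes" attribute.

```python
class PoiJayProgram:
    """Stand-in for the PoiJay Program class (attribute names assumed)."""

    def __init__(self, decls, body):
        self.decls = decls
        self.body = body


class RecRayJayProgram:
    """Stand-in for the RecRayJay Program class (attribute names assumed)."""

    def __init__(self, classes, decls, body):
        self.classes = classes  # extra attribute the PoiJay side never reads
        self.decls = decls
        self.body = body


def count_statements(program):
    # A PoiJay-style "visitor" that only touches decls and body,
    # so it works on either kind of Program.
    return len(program.decls) + len(program.body)


p1 = PoiJayProgram(["int x"], ["x = 1"])
p2 = RecRayJayProgram(["class A"], ["int x"], ["x = 1"])
print(count_statements(p1), count_statements(p2))  # 2 2
```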

Things get inconvenient when the equivalent classes aren't compatible (the respective LeftHandSide classes, for example). Thus in the translator visitor, you will be visiting recrayjay_ast classes but generating poijay_ast classes. To simplify the task slightly, I've provided some functions in recrayjay/pgen.py that allow you to write

pgen.pjLHS(foo, bar, quux)

instead of

poijay.poijay_ast.LeftHandSide(foo, bar, quux)

Over the course of this project, the typing that saves will add up.

3. Details

visit_Program. I've given (part of) this to you, but you will probably need to add more. For example, you'll need some way of generating new temporary variables, and you may need to initialize that here. visit_Declaration is also given to you, and that one I don't think you'll need to change.

visit_Block. This is my complete method for translating blocks. It demonstrates how I use addBlock etc. Note that when I make a new Block as in self.addBlock = Block([], []), that's technically a RecRayJay block, but it quacks just like a PoiJay block.

visit_Assignment. In an uncharacteristic show of generosity, I've included my solution to this one, too. Perhaps you can think of a better way, though.

visit_LeftHandSide. Note that if you make new LeftHandSides and Variables (and you'll need to do so), you can leave the VarInfo as None, since the PoiJay typechecker will supply that. My completed version took 13 dense lines (not counting comments or white space).

visit_IndexQual. In many ways similar to LeftHandSide. It took me 9 lines.

visit_FieldQual. It took me 11 lines.

visit_Conditional. Very important: you are not required to support short-circuit evaluation; that is, your translation may break short-circuit evaluation semantics. You may decide to handle bodies that are blocks differently from non-block bodies. It took me 24 lines.

visit_Loop. This is similar to Conditional, but one thing to watch out for: if the guard spins off a block of code, it can't land only before the loop itself---it will need to be executed every iteration. I made that mistake myself the first time. My final solution was 23 lines.
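The pitfall with loop guards can be sketched with plain strings instead of AST nodes. The function translate_loop and its parameters are hypothetical, and this is just one possible arrangement, not necessarily the one used in the reference solution: the statements spun off by the guard are emitted once before the loop and again at the end of the body, so the temporaries are recomputed before every test of the guard.

```python
def translate_loop(guard_prelude, guard_expr, body):
    """Return PoiJay-like source lines for a while loop.

    guard_prelude: statements that compute temporaries used by guard_expr
    guard_expr:    the simplified guard, referring to those temporaries
    body:          already-translated body statements
    """
    lines = list(guard_prelude)                    # run before the first test
    lines.append(f"while ({guard_expr}) {{")
    lines += ["    " + s for s in body]
    lines += ["    " + s for s in guard_prelude]   # rerun before each later test
    lines.append("}")
    return lines


out = translate_loop(["t_1 = *(a + i);"], "t_1 > 0", ["i = i + 1;"])
print("\n".join(out))
```

If the prelude were emitted only before the loop, t_1 in this example would never change and the loop would test a stale value on every iteration.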

visit_QualVariable. Remember---this qualified variable may still occur on the left hand side of an assignment if it is inside an array index. This took me 17 lines.

visit_Null and visit_Instantiation. They took me 1 line apiece.

visit_Creation. This is where things get hard. You will need to work this out carefully by hand---try translating examples of creations of different dimensions (which is itself no easy task), and then look for patterns and see if you can generalize them. Here are some things to think about (although you may find it better to think about this on your own first, before you read the following hints, and then compare what you came up with with what I say here):
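When working examples by hand, it may help to be clear about what the generated PoiJay code has to accomplish at run time. The sketch below simulates in Python what a two-dimensional creation such as new int[2][3] amounts to when only malloc is available. The flat heap list, the malloc and create helpers, and the choice to reserve cell 0 are all assumptions made for this illustration. Note that it uses Python recursion to do the allocation directly, whereas the translator has to emit PoiJay statements that have the same effect.

```python
heap = [0]  # assumption: cell 0 is reserved so that address 0 can stand for null


def malloc(n):
    """Reserve n fresh cells at the end of the heap and return their address."""
    addr = len(heap)
    heap.extend([0] * n)
    return addr


def create(dims):
    """Allocate an array with the given dimensions, e.g. [2, 3] for int[2][3]."""
    base = malloc(dims[0])
    if len(dims) > 1:
        for i in range(dims[0]):
            # Each cell of the outer array points to its own sub-array.
            heap[base + i] = create(dims[1:])
    return base


a = create([2, 3])
heap[heap[a + 1] + 2] = 7   # a[1][2] = 7
print(a, heap[a], heap[a + 1], len(heap))  # 1 3 6 9
```

The outer array holds pointers rather than data, so a creation with k dimensions needs a loop nest of depth k - 1 around the inner malloc calls.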

4. Turn in

Copy your translatorVisitor.py to

/cslab.all/ubuntu/cs365/turnin/xxxxxx/proj5

where "xxxxxx" is [elliot|payton|davidemmanuel|nathan|gill|brandon|sean|simon]. I will grade your project by running it against a collection of test files.

DUE: Wednesday, March 26, 5:00 pm.


Thomas VanDrunen
Last modified: Thu Feb 27 11:25:57 CST 2014