Once when I was an undergraduate studying in the library on a sunny day in March I looked out the window and saw a quad filled with my classmates lying on blankets and playing Frisbee, soaking in the first warm weather of spring. It was then that I first knew that I wanted to be a professor---to dedicate my life to the prevention of college students having fun.
--- Evil Professor Samoht NenurdNav
The goal of this project is to explore the meaning of
programming language constructs like array creation and
dereference and C-style structs by compiling them away
to a language that has only pointer types (as opposed to struct
and array types) and malloc
(as opposed to new
).
Keep in mind that RecRayJay uses Java syntax for arrays
and (struct-like) classes, and that as a subset of Java its programs can
be executed as Java programs. Conceptually, though, it is more like C,
except for having a Java-like new operator
and a true array type.
We are also assuming that all class/struct/record types are
reference/pointer types, so the dot operator in RecRayJay is
like the dot operator in Java and the arrow operator in C,
not the dot operator in C.
Remember that I'm giving you three and a half weeks to complete this (including all of spring break, should you choose to work on this project over break). You will need that time.
Copy and untar the starter code for this project.
cp ~tvandrun/Public/cs365/proj5.tar .
tar xvf proj5.tar
In this you will find the package for RecRayJay (including
recrayjay/translatorVisitor.py
, which is the only thing
you need to modify),
plus the package for PoiJay (since the recrayjay
program relies on the PoiJay interpreter) and
the package for FunJay (since both RecRayJay and PoiJay inherit some
stuff from FunJay).
You will want to reread this section after you have read through the whole project description. Here I am presenting the big picture and giving some practical advice on engineering a very large project incrementally. The next section delves into specifics of the various RecRayJay constructs that need to be handled.
What makes this project especially difficult is that seemingly simple RecRayJay expressions that make indirections will require several steps in the PoiJay code they are translated to. Consider the following:
a[5].g = x.f;
Assume that field f has offset 3 in whatever record type x is
and the field g has offset 2 in whatever record type is the base type of array a.
Then this statement would be translated to
lhs_1 = a + 5;
lhs_2 = *lhs_1 + 2;
*lhs_2 = *(x + 3);
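To see why those three steps compute the right thing, it can help to simulate memory by hand. The sketch below is only an illustration in Python, not PoiJay: the list heap stands in for memory, and the addresses 10, 20 and 30 are invented for the example. Only the offsets 3 and 2 and the index 5 come from the text above.

```python
# Toy flat memory: an "address" is just an index into one Python list.
# The concrete addresses below are made up for illustration.
heap = [0] * 40

x = 10                 # x points to a record that starts at cell 10
heap[x + 3] = 99       # field f of that record (offset 3) holds 99

a = 20                 # a points to an array of record pointers at cells 20..25
heap[a + 5] = 30       # a[5] points to a record that starts at cell 30

# RecRayJay:  a[5].g = x.f;
lhs_1 = a + 5                  # lhs_1 = a + 5;       address of a[5]
lhs_2 = heap[lhs_1] + 2        # lhs_2 = *lhs_1 + 2;  address of a[5].g
heap[lhs_2] = heap[x + 3]      # *lhs_2 = *(x + 3);   store x.f there

print(lhs_1, lhs_2, heap[32])  # 25 32 99
```

Note that the first two steps compute addresses and only the last one writes to memory.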
Thus it is not enough simply to replace a RecRayJay statement with a PoiJay statement. You may in fact need to produce a series of statements. An expression may also need statements generated for it, which must then be added to the statements generated for the statement that contains it. And then what if that expression is itself in an array index, or in an actual parameter, or in a loop guard...
Note that this all also requires generating new temporary variables. Lots and lots of new temporary variables.
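One simple way to get an endless supply of temporaries is a counter that is bumped on every request. The class below is a minimal sketch; the name TempNamer and its fresh method are hypothetical and do not appear in the starter code.

```python
import itertools


class TempNamer:
    """Hands out names such as lhs_1, lhs_2, ... and never repeats a number."""

    def __init__(self):
        self._counter = itertools.count(1)

    def fresh(self, prefix="tmp"):
        # One shared counter keeps names unique even across prefixes.
        return f"{prefix}_{next(self._counter)}"


namer = TempNamer()
names = [namer.fresh("lhs"), namer.fresh("lhs"), namer.fresh("idx")]
print(names)  # ['lhs_1', 'lhs_2', 'idx_3']
```

A real version would also have to make sure the generated names cannot collide with variables already in the source program, and each new name needs a matching declaration.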
I recommend you approach this project incrementally by making it work for increasingly complicated cases. For example (the list below isn't exhaustive, or necessarily the best order):
Some of the code that appears in recrayjay/translatorVisitor.py
as I give it to you contains hints about how I made my solution
to this project
(note that "how Dr VD did it" is not synonymous with "the best way to do it").
I use the following instance variables in TranslatorVisitor
:
newDecls
, a list of declarations to be added
to the program.
I sloppily make them all globals. Why not? Compilers don't care about style.
replaceStmt
, the (PoiJay) statement that is the replacement
for the most recently visited (RecRayJay) statement.
addBlock
, the block of statements that
are incidentally generated by the translation of the most recent statement.
That is, it is the block of things we need to add before the statement that
can be found in replaceStmt
.
replaceExpr
, the (PoiJay) expression that is the
replacement for the most recently visited (RecRayJay) expression.
That translation may also have added statements to addBlock.
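The way these variables cooperate can be sketched with a toy translator. Everything below is a simplified assumption for illustration: expressions are plain tuples, generated code is plain strings rather than PoiJay AST nodes, and the class ToyTranslator and its method names are hypothetical. Only the four instance variable names come from the description above. The toy is also deliberately naive and introduces a temporary for every field access.

```python
class ToyTranslator:
    def __init__(self):
        self.newDecls = []       # names of temporaries that need declarations
        self.addBlock = []       # statements to run before replaceStmt
        self.replaceExpr = None  # translation of the last expression visited
        self.replaceStmt = None  # translation of the last statement visited

    def fresh(self):
        name = f"t_{len(self.newDecls) + 1}"
        self.newDecls.append(name)
        return name

    def visit_expr(self, e):
        if e[0] == "var":                 # ("var", name)
            self.replaceExpr = e[1]
        elif e[0] == "field":             # ("field", base_expr, offset)
            self.visit_expr(e[1])
            base = self.replaceExpr       # read the result before overwriting
            tmp = self.fresh()
            self.addBlock.append(f"{tmp} = *({base} + {e[2]});")
            self.replaceExpr = tmp

    def visit_assign(self, s):            # ("assign", target_name, expr)
        self.addBlock = []                # start a clean block for this statement
        self.visit_expr(s[2])
        self.replaceStmt = f"{s[1]} = {self.replaceExpr};"
        return self.addBlock + [self.replaceStmt]


# y = x.f.h, assuming f has offset 3 and h has offset 1
t = ToyTranslator()
out = t.visit_assign(("assign", "y", ("field", ("field", ("var", "x"), 3), 1)))
print(out)  # ['t_1 = *(x + 3);', 't_2 = *(t_1 + 1);', 'y = t_2;']
```

The point to notice is the ordering: a visit method must copy replaceExpr into a local variable before visiting anything else, because the next visit overwrites it.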
Note that this translation involves two AST class packages---that for
RecRayJay and that for PoiJay (there was a similar situation in project 3).
Since Python uses duck typing, the AST classes in the two packages
are interoperable for AST classes
that are identical (Block
, for example) or even
ones where the RecRayJay class merely has more attributes than
the PoiJay class (Program
, for example;
it's ok to feed a recrayjay_ast.Program
into the
PoiJay visitors because they will simply ignore the "classes" in the
recrayjay_ast.Program
).
Things get inconvenient when the equivalent classes aren't
compatible (the respective LeftHandSide
classes, for example).
Thus in the translator visitor, you will be visiting
recrayjay_ast
classes but
generating poijay_ast
classes.
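The duck-typing point can be seen in miniature below. The classes here are toy stand-ins, not the real AST classes, and the attribute names (decls, stmts, main, classes) are assumptions chosen for the example.

```python
class ToyBlock:                      # stands in for a Block from either package
    def __init__(self, decls, stmts):
        self.decls = decls
        self.stmts = stmts


class ToyPoiJayProgram:
    def __init__(self, main):
        self.main = main


class ToyRecRayJayProgram:
    def __init__(self, classes, main):
        self.classes = classes       # extra attribute the PoiJay side never reads
        self.main = main


def count_main_statements(program):
    """A 'PoiJay-side' function: it only ever touches program.main.stmts."""
    return len(program.main.stmts)


body = ToyBlock([], ["x = 1;", "y = 2;"])
p1 = count_main_statements(ToyPoiJayProgram(body))
p2 = count_main_statements(ToyRecRayJayProgram(["class Node"], body))
print(p1, p2)  # 2 2
```

The function works on both program classes because it never asks what type it was given. It would fail only if it reached for an attribute that one of the classes lacks or lays out differently.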
To simplify the task slightly, I've provided some functions in
recrayjay/pgen.py
that allow you to write
pgen.pjLHS(foo, bar, quux)
instead of
poijay.poijay_ast.LeftHandSide(foo, bar, quux)
Over the course of this project, the typing that saves will add up.
visit_Program
.
I've given (part of) this to you, but you will probably need to
add more.
For example, you'll need some way of generating new
temporary variables, and you may need to initialize that here.
visit_Declaration
is also given to you, and
that one I don't think you'll need to change.
visit_Block
.
This is my complete method for translating blocks.
It demonstrates how I use addBlock
etc.
Note that when I make a new Block
as in self.addBlock = Block([], [])
, that's
technically a RecRayJay block, but it quacks just like a PoiJay block.
visit_Assignment
.
In an uncharacteristic show of generosity, I've included my solution to
this one, too.
Perhaps you can think of a better way, though.
visit_LeftHandSide
.
Note that if you make new LeftHandSides and Variables
(and you'll need to do so), you can leave the VarInfo as None,
since the PoiJay typechecker
will supply that.
My completed version took 13 dense lines (not counting
comments or white space).
visit_IndexQual
.
In many ways similar to LeftHandSide.
It took me 9 lines.
visit_FieldQual
.
It took me 11 lines.
visit_Conditional
.
Very important: you are not required to support
short circuit evaluation; that is, your translation
may break short circuit evaluation semantics.
You may decide to handle bodies with blocks differently
from non-block bodies.
It took me 24 lines.
visit_Loop
.
This is similar to Conditional, but one thing to watch out for:
if the guard spins off a block of code, it can't land only
before the loop itself---it will need to be executed every iteration.
I made that mistake myself the first time.
My final solution was 23 lines.
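One possible shape for the loop translation, sketched with strings rather than AST nodes, is to emit the guard's spun-off statements once before the loop and again at the end of the body. The function translate_loop and the temporary g_1 are hypothetical, and this is not necessarily how the reference solution does it.

```python
def translate_loop(guard_prelude, guard, body):
    """Return PoiJay-like lines for a while loop whose guard needs a prelude."""
    lines = list(guard_prelude)                    # runs before the first test
    lines.append(f"while ({guard}) {{")
    lines += [f"    {s}" for s in body]
    lines += [f"    {s}" for s in guard_prelude]   # runs again before each later test
    lines.append("}")
    return lines


# RecRayJay:  while (a[i] != 0) i = i + 1;
lines = translate_loop(["g_1 = *(a + i);"], "g_1 != 0", ["i = i + 1;"])
print("\n".join(lines))
```

If the prelude were emitted only before the loop, g_1 would keep the value computed for the initial i and the guard would never see the updated element.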
visit_QualVariable
.
Remember---this qualified variable may still occur on the
left hand side of an assignment if it is inside an array index.
This took me 17 lines.
visit_Null
and visit_Instantiation
.
They took me 1 line apiece.
visit_Creation
.
This is where things get hard.
You will need to work this out carefully by hand---try
translating examples of creations of different
dimensions (which is itself no easy task), and then look for
patterns and see if you can generalize them.
Here are some things to think about (although you may find it better
to think about this on your own first, before you
read the following hints, and then compare what you came up
with with what I say here):
How many memory locations does new int[5][3][2][7] need in total?
5 + 5*3 + 5*3*2 + 5*3*2*7 =
5 + 15 + 30 + 210 = 260.
Convince yourself why this is so.
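The arithmetic above generalizes to any list of dimensions: each level contributes (number of arrays at that level) times (length of each array). The helper below is a small sketch for checking hand calculations. The name cells_needed is made up, and, like the sum above, it assumes no extra bookkeeping cells such as a stored length.

```python
def cells_needed(dims):
    """Total cells for new int[d1][d2]...[dk], counting pointer cells and int cells."""
    total = 0
    arrays_at_level = 1          # one outermost array
    for d in dims:
        total += arrays_at_level * d   # every array at this level has d cells
        arrays_at_level *= d           # each of those cells points to a sub-array
    return total


print(cells_needed([5, 3, 2, 7]))  # 260
```

The same loop structure, with an allocation in place of the addition, is a useful starting point when working out creation examples by hand.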
Copy your translatorVisitor.py
to
/cslab.all/linux/class/cs365/(your id)/proj5
I will grade your project by running it against a collection of test files.
DUE: Wednesday, March 23, 5:00 pm.