Compiler Construction
Grading
Grading is based on the score obtained during the labs,
based on the rules presented here (preliminary version).
The project must be your own project, it is allowed to share ideas and
algorithms among students, source code must not be shared.
Sharing of source code is not allowed, all parties involved will be punished by zero score.
Course Material
Project hints
- The most important (and most difficult) element of the project is
generation of a symbol table and binding the variable addresses in memory. The rest
follows easily according to the recommended book. You should introduce two kinds
of variables - regular and reference - the latter for function arguments, returned values and
array elements.
-
Place all variables, constants and labels in the symbol table. It greatly simplifies
the code generation, attributes of all nodes can be single integers, storing the index of
variable, containing the result of subexpression evaluation.
- It is convenient to write a function
void gencode(const string& m, int v1, varmode lv1, int v2, varmode lv2, int v3,varmode lv3);
where m is the operation code, integer parameters are the indices in the symbol table, and varmode parameters
determine, if we mean the value or address of the variable. The type of operation (integer or real) can
be established based on the type of first operand.
-
Factor-out what is common and write separate functions, e.g. type conversions are identical
for all operators.
-
If some function takes more than one screenfull of code, it is probably too long.
-
Do not reinvent the wheel. Use STL.
Vector<> is your friend. String too.
Map<> will rather not be useful. Do not use the fixed-lengh buffers - they are always one byte too short.
-
Compute the array element address as an integer, then change the type in the symbol table
to the reference to real or integer. In this way you can use it at both sides of the assignment operator.
-
The size of local variable area will be known after generation of code for the entire function.
Generate the code to the memory and then generate the function prologue and copy the function body to the file.
-
Lexer has about 100 lines, parser 1000, symbol table 200. If your program has more lines,
you are making something too complex.
-
When in doubt - hesitate ask the lab assistant or lecturer. Do not postpone it to the last moment,
to the day just before the deadline.
Bibliography
Basic
- A.V. Aho, R. Sethi, J. D. Ullman, "Compilers - Principles, Techniques, and Tools", Addison-Wesley 1986 (Polish edition WNT 2002)
Auxillary
- John R. Levine, "Linkers and Loaders", Morgan Kaufmann Publishers 1999 (manuscript available online)
- W. M. Waite, G. Goos, "Konstrukcja kompilatorow", WNT 1989
- S. S. Muchnick, "Advanced Compiler Design and Implementation", Morgan Kaufmann Publishers 1997
Updated: 16.04.2018