The Semantic Analyzer Checkpoint
- Due: 11:00 pm, Monday April 5, 2021
- Max grace days: 2
Overview
For this assignment you will write a partial semantic analyzer. Among other things, this involves traversing the abstract syntax tree and the class hierarchy. You will reject all Cool programs that do not comply with the Cool type system.
Specification
You must create two artifacts:
- A program that takes a single command-line argument (e.g., file.cl-ast). That argument will be an ASCII text Cool abstract syntax tree file. Your program must either indicate that there is an error in the input or emit file.cl-type, a class map. Your program will consist of a number of OCaml files. The starter code contains a file named semantic_analysis.ml; this is the file that you need to edit.
- A plain ASCII text file called README describing your design decisions. See the grading rubric. A few paragraphs should suffice.
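The naming convention for the output file can be sketched as follows. This is only an illustration: the helper name output_path is hypothetical and is not part of the starter code, which may already handle file names for you.

```ocaml
(* Hypothetical helper: map "file.cl-ast" to "file.cl-type". *)
let output_path (ast_path : string) : string =
  (* Filename.chop_suffix is only well-defined when the suffix is present,
     so check for it first. *)
  if Filename.check_suffix ast_path ".cl-ast" then
    Filename.chop_suffix ast_path ".cl-ast" ^ ".cl-type"
  else
    invalid_arg ("expected a .cl-ast file, got " ^ ast_path)
```

In a complete program the argument would come from Sys.argv.(1), and the class map would be written to the resulting path only when no error was found.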
Considerations:
Line numbers: The typing rules do not directly specify the line numbers on which errors are to be reported. As of v1.11, the Cool reference compiler uses these guidelines (possibly surprising ones are italicized):
- Errors related to parameter-less method main in class Main: always line 0
- Inheritance cycle: always line 0
- Other inheritance type problem: inherited type identifier location
- self or SELF_TYPE used in wrong place: self (resp. SELF_TYPE) identifier (resp. type) location
- Redefining a feature: (second) feature location
- Redefining a formal or class: (second) identifier location
- Other attribute problems: attribute location
- Redefining a method and changing types: (second) type location
- Other problems with redefining a method: method location
- Method body type does not conform: method name identifier location
- Attribute initializer does not conform: attribute name identifier location
- Errors with types of arguments to relational/arithmetic operations: location of relational/arithmetic operation expression
- Errors with types of while/if subexpression(s): location of (enclosing) while or if expression (not the location of the conditional expression)
- Errors with case expression (e.g., lub): location of case expression
- Errors with conformance in let: location of let expression (not location of initializer)
- Errors in blocks: location of (beginning of) block expression
- Errors in actual arguments: location of method invocation expression (not the location of any particular actual argument)
- Assignment does not conform: assignment expression location (not right-hand-side location)
- Unknown identifier: location of identifier
- Unknown method: location of method name identifier
- Unknown type: location of type
Error reporting: To report an error, write the string
ERROR: line_number: Type-Check: message
to standard output and terminate the program. You may write whatever you want in the message, but it should be fairly indicative. Example erroneous input:
class Main inherits IO {
  main() : Object {
    out_string("Hello, world.\n" + 16777216) -- adding string + int !?
  } ;
} ;
Example error report output:
ERROR: 3: Type-Check: arithmetic on String Int instead of Ints
Remember that you do not have to match the English prose of the reference compiler's error messages at all. You just have to get the line number right.
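A minimal sketch of an error-reporting helper in the required format. The function names are hypothetical (they are not from the starter code), and the exit status is an assumption, since the text above only says to terminate the program.

```ocaml
(* Hypothetical helpers; the names are illustrative only. *)

(* Build the report string in the required format. *)
let format_error (line : int) (message : string) : string =
  Printf.sprintf "ERROR: %d: Type-Check: %s" line message

(* Print the report to standard output and stop. The exit status 1 is an
   assumption; the assignment text does not specify one. *)
let report_error (line : int) (message : string) : 'a =
  print_endline (format_error line message);
  exit 1
```

Because report_error never returns, it can be used in any branch of a check regardless of the type the other branches produce.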
Semantic checks are unordered — if a program contains two or more errors, you may indicate whichever you like. You can infer from this that all of our test cases will contain at most one error.
The .cl-type File Format
If there are no errors in file.cl-ast your program should create file.cl-type and serialize the class map to it.
The class map is described in the Cool Reference Manual.
A .cl-type file consists of one section for this assignment:
- The class map.
We will now describe exactly what to output for the class map. The general idea and notation (one string per line, recursive descent) are the same as in the previous assignment.
The Class Map
- Output class_map \n.
- Output the number of classes and then \n.
- Output each class in turn (in ascending alphabetical order):
  - Output the name of the class and then \n.
  - Output the number of attributes and then \n.
  - Output each attribute in turn (in order of appearance, with inherited attributes from a superclass coming first):
    - Output no_initializer \n and then the attribute name \n and then the type name \n.
    - or Output initializer \n and then the attribute name \n and then the type name \n and then the initializer expression.
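The rule that inherited attributes come first can be sketched as below. The class_info record and the association-list representation are assumptions made for illustration; the types in the starter code will differ.

```ocaml
(* Assumed, simplified representation (not the starter code's AST types):
   each class records its parent (None for Object) and the names of the
   attributes it declares itself, in order of appearance. *)
type class_info = { parent : string option; own_attrs : string list }

(* All attributes of a class: inherited ones first, then its own.
   Assumes the hierarchy has already been checked to be acyclic and that
   every parent is present in the table. *)
let rec all_attrs (classes : (string * class_info) list) (name : string)
    : string list =
  let info = List.assoc name classes in
  let inherited =
    match info.parent with
    | None -> []
    | Some p -> all_attrs classes p
  in
  inherited @ info.own_attrs
```

Running the hierarchy checks before this step matters: on a cyclic hierarchy the recursion above would not terminate.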
Detailed .cl-type Example
Now that we've formally defined the output specification, we can present a worked example. Here's the example input we will consider:
class Main inherits IO {
  my_attribute : Int <- 5 ;
  main() : Object {
    out_string("Hello, world.\n")
  } ;
} ;
Resulting .cl-type class map output with comments:
class_map
6 -- number of classes
Bool -- note: includes predefined base classes
0
IO
0
Int
0
Main
1 -- our Main has 1 attribute
initializer
my_attribute -- named "my_attribute"
Int -- with type Int
2 -- initializer expression line number
Int -- initializer expression type (NOT PART OF THIS ASSIGNMENT)
integer -- initializer expression kind
5 -- which integer constant is it?
Object
0
String
0
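One possible shape for the serializer, shown only as a sketch. The attribute record and the idea of passing the initializer as already-serialized lines are assumptions for illustration; the real code would serialize the expression from the starter code's AST types. For the checkpoint, the initializer lines omit the expression type line.

```ocaml
(* Assumed, simplified types; the starter code's ast.ml will differ. *)
type attribute = {
  name : string;
  typ : string;
  (* Initializer already serialized as output lines, if present. *)
  init : string list option;
}

(* Serialize a class map given as (class name, attributes) pairs.
   Attributes are expected to already include inherited ones, in order. *)
let serialize_class_map (classes : (string * attribute list) list) : string =
  let buf = Buffer.create 256 in
  let line s = Buffer.add_string buf s; Buffer.add_char buf '\n' in
  line "class_map";
  line (string_of_int (List.length classes));
  (* The default byte-wise string comparison places "IO" before "Int",
     as in the example output above. *)
  let sorted = List.sort (fun (a, _) (b, _) -> String.compare a b) classes in
  List.iter
    (fun (cname, attrs) ->
      line cname;
      line (string_of_int (List.length attrs));
      List.iter
        (fun a ->
          (match a.init with
           | None -> line "no_initializer"
           | Some _ -> line "initializer");
          line a.name;
          line a.typ;
          (match a.init with
           | None -> ()
           | Some expr_lines -> List.iter line expr_lines))
        attrs)
    sorted;
  Buffer.contents buf
```

For the example program, the Main entry would carry one attribute with init = Some ["2"; "integer"; "5"], and the five predefined classes would each carry an empty attribute list.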
Commentary
This is a checkpoint (a partial implementation) of the complete Semantic Analysis assignment. The implementation of the checkpoint should do the following:
- Read in the .cl-ast file given as a command-line argument.
- Do every bit of typechecking and semantic analysis possible without typechecking expressions.
- Thus you should not annotate types in initializer expressions in the class map.
- Print out error messages as normal.
- Output only the class map to .cl-type if there are no errors.
Thus you should build the class hierarchy and check everything related to that. For example:
- Check to see if a class inherits from Int (etc.).
- Check to see if a class inherits from an undeclared class.
- Check for cycles in the class hierarchy.
- Check for duplicate method or attribute definitions in the same class.
- Check for a child class that redefines a parent method but changes the parameters.
- Check for a missing method main in class Main.
- Check for self and SELF_TYPE mistakes in classes and methods.
- This list is not exhaustive -- read the Cool Reference Manual carefully and find everything you might check for without typechecking expressions.
- Basically, you'll look at classes, methods and attributes (but not method bodies).
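The first three checks in the list can be sketched as follows. The input representation (pairs of class name and declared parent), the hierarchy_error type, and the function name are all assumptions for illustration and do not come from the starter code. The sketch also ignores duplicate class definitions, which need their own check.

```ocaml
(* Assumed, simplified input: (class name, declared parent) pairs for the
   user-defined classes. A class with no "inherits" clause has parent
   "Object". *)
type hierarchy_error =
  | Inherits_from_basic of string * string  (* class, forbidden parent *)
  | Undeclared_parent of string * string    (* class, unknown parent *)
  | Cycle of string                         (* class on, or leading to, a cycle *)

let builtin = [ "Object"; "IO"; "Int"; "String"; "Bool" ]
let not_inheritable = [ "Int"; "String"; "Bool" ]

let check_hierarchy (classes : (string * string) list)
    : hierarchy_error option =
  let declared c = List.mem c builtin || List.mem_assoc c classes in
  (* Follow parent links from [c]. An acyclic chain of user classes reaches
     a built-in class within [List.length classes] steps. *)
  let rec reaches_root c steps =
    if List.mem c builtin then true
    else if steps > List.length classes then false
    else
      match List.assoc_opt c classes with
      | None -> true (* undeclared parent; reported by another branch *)
      | Some p -> reaches_root p (steps + 1)
  in
  let check (c, p) =
    if List.mem p not_inheritable then Some (Inherits_from_basic (c, p))
    else if not (declared p) then Some (Undeclared_parent (c, p))
    else if not (reaches_root c 0) then Some (Cycle c)
    else None
  in
  (* Return the first error found, in list order. *)
  let rec first = function
    | [] -> None
    | x :: rest ->
      (match check x with
       | Some e -> Some e
       | None -> first rest)
  in
  first classes
```

Returning the first error is enough here because semantic checks are unordered. When reporting, recall that an inheritance cycle is always reported at line 0, while other inheritance problems use the location of the inherited type identifier, so a real implementation would carry those locations along with the names.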
You can do basic testing with something like the following:
linux> cool.exe --parse file.cl
linux> cool.exe --out reference --class-map file.cl
linux> my-checker file.cl-ast
linux> diff -b -B -E -w file.cl-type reference.cl-type
However, the reference implementation produces expressions with type annotations (an extra line of output per expression), so the diff command will report differences; you need to examine the differences. You can output the diff results side-by-side by passing the -y flag.
Getting the Assignment
The starter code for the assignment is on the Linux server at the path:
/export/home/public/schwesin/csc310/semantic-analyzer-checkpoint-handout
Turning in the Assignment
You must turn in a zip file containing these files:
ast.ml
serialize.ml
deserialize.ml
semantic_analysis.ml
main.ml
README
There is a makefile provided with this assignment. To submit the assignment, execute the command:
make submit
from within the assignment directory.
Grading Criteria
Grading (out of 100 points):
- 70 points: for autograder tests
  - 70 points: 90% or greater passing test cases
  - 55 points: between 75% and 89% passing test cases
  - 35 points: between 50% and 74% passing test cases
  - 20 points: between 25% and 49% passing test cases
  - 0 points: less than 25% passing test cases
- 15 points: for a clear description in your README
  - 15: thorough discussion of design decisions (e.g., handling of the class hierarchy, case and new and dispatch); a few paragraphs of coherent English sentences should be fine
  - 8: vague or hard to understand; omits important details
  - 0: little to no effort, or submitted an RTF/DOC/PDF file instead of plain TXT
- 15 points: for code cleanliness
  - 15: code is mostly clean and well-commented
  - 8: code is sloppy and/or poorly commented in places
  - 0: little to no effort to organize and document code