Type Checking
Outline
- Typing Rules
- Typing Environments
- “Let” Rules
- Subtyping
- Incorrect Rules
Soundness
A type system is sound if, whenever \(\vdash e : T\), then \(e\) evaluates to a value of type \(T\)
We only want sound rules, but some sound rules are better than others: \[\frac{i\text{ is an integer}}{\vdash i : \texttt{Object}}\]
Type Checking Proofs
Type checking proves facts \(e: T\)
Proof is on the structure of the AST
Proof has the shape of the AST
One type rule is used for each kind of AST node
In the type rule used for a node \(e\):
- Hypotheses are the proofs of types of \(e\)’s subexpressions
- Conclusion is the type of \(e\)
Types are computed in a bottom-up pass over the AST
Rules for Constants
\[\frac{i\text{ is an integer}}{\vdash i : \texttt{Int}}\text{[Int]}\] \[\frac{}{\vdash true : \texttt{Bool}}\text{[Bool]}\] \[\frac{}{\vdash false : \texttt{Bool}}\text{[Bool]}\] \[\frac{s \text{ is a string constant}}{\vdash s : \texttt{String}}\text{[String]}\]
Rule for new
new T produces an object of type T
- Ignore SELF_TYPE for now …
\[ \frac{}{\vdash \texttt{new} \; T : T}\text{[New]} \]
Some Other Rules
Not
\[ \frac{\vdash e : \texttt{Bool}} {\vdash \texttt{not} \; e : \texttt{Bool}}\text{[Not]} \]
Loop
\[ \frac{ \begin{array}{l} \vdash e_1 : \texttt{Bool}\\ \vdash e_2 : T \end{array} } {\vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : \texttt{Object}}\text{[Loop]} \]
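The rules so far can be read as one case each of a recursive function over the AST. The following is a minimal sketch, not a Cool implementation: the node classes and the names `type_of` and `TypeCheckError` are hypothetical, and it assumes, following the Cool reference manual, that a loop has static type Object regardless of the type of its body.

```python
from dataclasses import dataclass


# Hypothetical AST node classes for a tiny Cool-like fragment.
@dataclass
class IntConst:
    value: int


@dataclass
class BoolConst:
    value: bool


@dataclass
class StrConst:
    value: str


@dataclass
class New:
    type_name: str


@dataclass
class Not:
    operand: object


@dataclass
class While:
    cond: object
    body: object


class TypeCheckError(Exception):
    pass


def type_of(e):
    """Return the static type of e; subexpressions are checked first (bottom-up)."""
    if isinstance(e, IntConst):       # [Int]
        return "Int"
    if isinstance(e, BoolConst):      # [Bool]
        return "Bool"
    if isinstance(e, StrConst):       # [String]
        return "String"
    if isinstance(e, New):            # [New], ignoring SELF_TYPE
        return e.type_name
    if isinstance(e, Not):            # [Not]
        if type_of(e.operand) != "Bool":
            raise TypeCheckError("operand of not must be Bool")
        return "Bool"
    if isinstance(e, While):          # [Loop]
        if type_of(e.cond) != "Bool":
            raise TypeCheckError("loop predicate must be Bool")
        type_of(e.body)               # body must type check; its type is discarded
        return "Object"               # assumption: loops have type Object
    raise TypeCheckError(f"no rule for {e!r}")
```

Each `if` branch corresponds to one rule: the recursive calls are the hypotheses, and the returned string is the type in the conclusion.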
A Problem
What is the type of a variable reference? \[\frac{x \text{ is an identifier}}{\vdash x : ?}\text{[Var]}\]
The local, structural rule does not carry enough information to give \(x\) a type
A Solution
Put more information in the rules
A type environment gives types for free variables
A type environment is a function from identifiers to types
A variable is free in an expression if it is not defined within the expression
Example: in the expression
let x : Int in x + y
the variable y is free, but x is not
Type Environments
Let \(O\) be a function from object identifiers to types
The sentence \(O \vdash e : T\) is read: under the assumption that variables have the types given by \(O\), it is provable that the expression \(e\) has type \(T\)
Type Environments and Rules
The type environment is added to the earlier rules, for example \[ \frac{i\text{ is an integer}}{O \vdash i : \texttt{Int}}\text{[Int]} \]
\[ \frac{ \begin{array}{l} O \vdash e_1 : \texttt{Int}\\ O \vdash e_2 : \texttt{Int} \end{array}} {O \vdash e_1 + e_2 : \texttt{Int}}\text{[Add]} \]
New Rules
And we can now write a rule for variables:
\[ \frac{O(x) = T} {O \vdash x : T}\text{[Var]} \]
Let
\[ \frac{O[T_0/x] \vdash e_1 : T_1} {O \vdash \texttt{let} \; x : T_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-No-Init]} \]
\(O[T_0/x]\) means “\(O\) modified to map \(x\) to \(T_0\) and behaving as \(O\) on all other arguments”: \[O[T_0/x](x) = T_0\] \[O[T_0/x](y) = O(y)\]
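A type environment can be sketched as a dictionary, with \(O[T_0/x]\) as a copy that overrides one entry. The tuple encoding of expressions and the names `extend` and `type_of` below are assumptions made for illustration only.

```python
# Hypothetical tuple encoding of expressions:
#   ("int", 3)   ("var", "x")   ("add", e1, e2)   ("let", x, T0, body)


def extend(env, name, typ):
    """O[T/x]: a copy of env that maps name to typ and agrees with env elsewhere."""
    new_env = dict(env)
    new_env[name] = typ
    return new_env


def type_of(env, e):
    kind = e[0]
    if kind == "int":                          # [Int]
        return "Int"
    if kind == "var":                          # [Var]: requires O(x) = T
        if e[1] not in env:
            raise TypeError(f"unbound identifier {e[1]}")
        return env[e[1]]
    if kind == "add":                          # [Add]
        if type_of(env, e[1]) != "Int" or type_of(env, e[2]) != "Int":
            raise TypeError("operands of + must be Int")
        return "Int"
    if kind == "let":                          # [Let-No-Init]
        _, name, declared, body = e
        return type_of(extend(env, name, declared), body)
    raise TypeError(f"no rule for {kind}")


# let x : Int in x + y   (y is free, so the environment must supply its type)
expr = ("let", "x", "Int", ("add", ("var", "x"), ("var", "y")))
print(type_of({"y": "Int"}, expr))
```

Because `extend` copies rather than mutates, the binding for x disappears automatically when the checker returns from the let body, which mirrors how the rule only changes the environment in its hypothesis.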
Let Example
Consider the Cool expression
\[ \texttt{let} \; x : T_0 \; \texttt{in} \; (\texttt{let} \; y : T_1 \; \texttt{in} \; E_{x,y}) + (\texttt{let} \; x : T_2 \; \texttt{in} \; F_{x,y}) \] where \(E_{x,y}\) and \(F_{x,y}\) are Cool expressions that contain occurrences of \(x\) and \(y\)
Scope
- of \(y\) is \(E_{x,y}\)
- of outer \(x\) is \(E_{x,y}\)
- of inner \(x\) is \(F_{x,y}\)
This is captured precisely in the typing rule
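As a worked instance (a sketch, writing \(O\) for whatever environment the whole expression is checked in), applying [Let-No-Init] at each let gives the environments under which the two inner bodies are checked:

\[ \begin{array}{ll} E_{x,y}: & O[T_0/x][T_1/y]\\ F_{x,y}: & O[T_0/x][T_2/x] \end{array} \]

In the first environment \(x\) has type \(T_0\) and \(y\) has type \(T_1\). In the second, the later update wins, so \(x\) has type \(T_2\), while \(y\) is still free and has type \(O(y)\).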
Notes
The type environment gives types to the free identifiers in the current scope
The type environment is passed down the AST from the root towards the leaves
Types are computed up the AST from the leaves towards the root
Let with Initialization
Now consider let with initialization
\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
This rule is weak: it requires \(e_0\) to have exactly the declared type \(T_0\).
Let with Initialization
Consider the example:
class C inherits P { ... }
...
let x : P <- new C in ...
...
The previous rule does not allow this code; we say that the rule is too weak or incomplete
Subtyping
Define a relation \(X \leq Y\) on classes to say that:
- An object of type \(X\) could be used when one of type \(Y\) is acceptable, or equivalently
- \(X\) conforms to \(Y\)
- In Cool, this means that \(X\) is a subclass of \(Y\)
Define a relation \(\leq\) on classes
- \(X \leq X\)
- \(X \leq Y\) if \(X\) inherits from \(Y\)
- \(X \leq Z\) if \(X \leq Y\) and \(Y \leq Z\)
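With single inheritance, the three clauses amount to walking up the inheritance chain. A minimal sketch, assuming a hypothetical class table `PARENT` that maps each class to its declared parent:

```python
# Hypothetical class table: each class maps to its declared parent (None for the root).
PARENT = {"Object": None, "Int": "Object", "P": "Object", "C": "P"}


def conforms(x, y, parent=PARENT):
    """Return True if x <= y, i.e. y is reached from x by zero or more inherits steps."""
    while x is not None:
        if x == y:           # zero steps taken so far gives reflexivity
            return True
        x = parent[x]        # one inherits step; repeating it gives transitivity
    return False


print(conforms("C", "Object"), conforms("P", "C"))
```

Here C conforms to Object through P, while P does not conform to C.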
Let with Initialization (Better)
New rule:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
Both rules for let are sound
But, more programs type check with this new rule (it is more complete)
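The three hypotheses of the new rule map onto three steps of a checker. The sketch below is illustrative only: the tuple encoding, the class table `PARENT`, and the function names are assumptions, and it handles just enough expression forms to check the earlier example with class C inheriting from P.

```python
# Hypothetical hierarchy: C inherits P.
PARENT = {"Object": None, "P": "Object", "C": "P"}


def conforms(x, y):
    """x <= y: walk the inheritance chain upward from x looking for y."""
    while x is not None:
        if x == y:
            return True
        x = PARENT[x]
    return False


# Hypothetical tuple encoding: ("new", T)  ("var", x)  ("let", x, T0, e0, e1)
def type_of(env, e):
    kind = e[0]
    if kind == "new":
        return e[1]
    if kind == "var":
        if e[1] not in env:
            raise TypeError(f"unbound identifier {e[1]}")
        return env[e[1]]
    if kind == "let":
        _, name, declared, init, body = e
        init_type = type_of(env, init)                    # O |- e0 : T
        if not conforms(init_type, declared):             # T <= T0
            raise TypeError(f"{init_type} does not conform to {declared}")
        return type_of({**env, name: declared}, body)     # O[T0/x] |- e1 : T1
    raise TypeError(f"no rule for {kind}")


# let x : P <- new C in x
print(type_of({}, ("let", "x", "P", ("new", "C"), ("var", "x"))))
```

Replacing the `conforms` test with an equality test would give the earlier, weaker rule, which rejects this program.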
Type System Tug-of-War
There is a tension between
- Flexible rules that do not constrain programming
- Restrictive rules that ensure safety of execution
Expressiveness of Static Type Systems
A static type system enables a compiler to detect many common programming errors
The cost is that some correct programs are disallowed
But more expressive type systems are also more complex
Dynamic and Static Types
The dynamic type of an object is the class \(C\) that is used in the new C expression that creates the object
- A run-time notion
- Even languages that are not statically typed have the notion of dynamic type
The static type of an expression is a notation that captures all possible dynamic types the expression could take
- A compile-time notion
Dynamic and Static Types
In early type systems the set of static types corresponds directly with the dynamic types
Soundness theorem: for all expressions \(E\), \(dynamic\_type(E) = static\_type(E)\), that is, in all executions, \(E\) evaluates to values of the type inferred by the compiler.
This gets more complicated in advanced type systems
Dynamic and Static Types in Cool
A variable of static type \(A\) can hold values of static type \(B\), if \(B \leq A\)
class A {...}
class B inherits A {...}
class Main {
    x : A <- new A;   -- x has static type A
    ...
    x <- new B;       -- here x's value has dynamic type B
    ...
};
Dynamic and Static Types
Soundness theorem for the Cool type system: \[\forall E. dynamic\_type(E) \leq static\_type(E)\]
Why is this correct?
- For \(E\), compiler uses \(static\_type(E)\)
- All operations that can be used on an object of type \(C\) can also be used on an object of type \(C' \leq C\)
- Subclasses can only add attributes or methods
- Methods can be redefined but with the same types
Subtyping Example
Consider the following Cool class definitions
class A { a() : Int { 0 }; };
class B inherits A { b() : Int { 1 }; };
- An instance of B has methods a and b
- An instance of A has method a
- A type error occurs if we try to invoke method b on an instance of A
Example of an Incorrect Let Rule (1)
Consider a hypothetical incorrect let rule:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
The following good program does not typecheck:
let x : Int <- 0 in x + 1
Example of an Incorrect Let Rule (2)
Consider a hypothetical incorrect let rule:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T_0 \leq T\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
The following bad program is well typed:
let x : B <- new A in x.b()
Example of an Incorrect Let Rule (3)
Consider a hypothetical incorrect let rule:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O[T/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
The following good program is not well typed:
let x : A <- new B in {... x <- new A; x.a(); }
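The three incorrect rules can be compared side by side with a small checker whose let case is switchable. This is a sketch under stated assumptions: the tuple encoding, the tables `PARENT` and `METHODS`, the rule names "correct" and "variant1" to "variant3", and all function names are hypothetical, methods take no arguments, and the elided "..." in the third program is left out.

```python
# Hypothetical class table for the A/B example; METHODS holds the return types
# of zero-argument methods declared directly in each class.
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}
METHODS = {"A": {"a": "Int"}, "B": {"b": "Int"}}


def conforms(x, y):
    """x <= y: walk the inheritance chain upward from x looking for y."""
    while x is not None:
        if x == y:
            return True
        x = PARENT[x]
    return False


def method_type(cls, name):
    """Return type of method `name`, searching cls and then its ancestors."""
    while cls is not None:
        if name in METHODS.get(cls, {}):
            return METHODS[cls][name]
        cls = PARENT[cls]
    raise TypeError(f"no method {name}")


def lookup(env, name):
    if name not in env:
        raise TypeError(f"unbound identifier {name}")
    return env[name]


def type_of(env, e, rule):
    """Type check tuple-encoded e; `rule` selects the let rule under test."""
    kind = e[0]
    if kind == "int":
        return "Int"
    if kind == "new":
        return e[1]
    if kind == "var":
        return lookup(env, e[1])
    if kind == "add":
        if type_of(env, e[1], rule) != "Int" or type_of(env, e[2], rule) != "Int":
            raise TypeError("operands of + must be Int")
        return "Int"
    if kind == "call":                        # e.m() with no arguments
        return method_type(type_of(env, e[1], rule), e[2])
    if kind == "assign":                      # [Assign]
        rhs = type_of(env, e[2], rule)
        if not conforms(rhs, lookup(env, e[1])):
            raise TypeError("assigned type does not conform")
        return rhs
    if kind == "seq":                         # non-empty block: type of the last expression
        for sub in e[1:]:
            last = type_of(env, sub, rule)
        return last
    if kind == "let":
        _, name, declared, init, body = e
        t = type_of(env, init, rule)
        # Variant (2) flips the subtyping premise to T0 <= T.
        ok = conforms(declared, t) if rule == "variant2" else conforms(t, declared)
        if not ok:
            raise TypeError("initializer does not conform")
        if rule == "variant1":                # body checked in O, x never bound
            body_env = env
        elif rule == "variant3":              # body checked in O[T/x]
            body_env = {**env, name: t}
        else:                                 # body checked in O[T0/x]
            body_env = {**env, name: declared}
        return type_of(body_env, body, rule)
    raise TypeError(f"no rule for {kind}")


def accepts(rule, e):
    try:
        type_of({}, e, rule)
        return True
    except TypeError:
        return False


ex1 = ("let", "x", "Int", ("int", 0), ("add", ("var", "x"), ("int", 1)))
ex2 = ("let", "x", "B", ("new", "A"), ("call", ("var", "x"), "b"))
ex3 = ("let", "x", "A", ("new", "B"),
       ("seq", ("assign", "x", ("new", "A")), ("call", ("var", "x"), "a")))

for rule, ex in [("variant1", ex1), ("variant2", ex2), ("variant3", ex3)]:
    print(rule, accepts(rule, ex), "| correct rule:", accepts("correct", ex))
```

Variants 1 and 3 reject good programs that the correct rule accepts (incompleteness), while variant 2 accepts a bad program that the correct rule rejects (unsoundness).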
Typing Rule Notation
- The typing rules use very concise notation
- They are carefully constructed
Virtually any change in a rule either:
- Makes the type system unsound
- Or, makes the type system less usable (incomplete)
But some good programs will be rejected anyway; the notion of a good program is undecidable
Assignment
More uses of subtyping:
\[ \frac{ \begin{array}{l} O(id) = T_0\\ O \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Assign]} \]
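A point that is easy to miss in [Assign] is that the assignment expression has type \(T_1\), the type of the right-hand side, not the declared type \(T_0\). A minimal sketch, assuming a hypothetical class table and a hypothetical helper `check_assign` that receives the already computed type of \(e_1\):

```python
# Hypothetical hierarchy: B inherits A.
PARENT = {"Object": None, "A": "Object", "B": "A"}


def conforms(x, y):
    """x <= y: walk the inheritance chain upward from x looking for y."""
    while x is not None:
        if x == y:
            return True
        x = PARENT[x]
    return False


def check_assign(env, name, rhs_type):
    """[Assign]: env is O, name is id, rhs_type is the type T1 already derived for e1."""
    if name not in env:                      # O(id) = T0 must be defined
        raise TypeError(f"unbound identifier {name}")
    if not conforms(rhs_type, env[name]):    # T1 <= T0
        raise TypeError(f"{rhs_type} does not conform to {env[name]}")
    return rhs_type                          # the whole expression has type T1


print(check_assign({"x": "A"}, "x", "B"))
```

Returning \(T_1\) keeps the more precise type available to any enclosing expression.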
Initialized Attributes
Let \(O_C(x) = T\) for all attributes \(x : T\) in class \(C\)
- \(O_C\) represents the class-wide scope
Attribute initialization is similar to let, except for the scope of names
\[ \frac{ \begin{array}{l} O_C(id) = T_0\\ O_C \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O_C \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Attr-Init]} \]
If-Then-Else
Consider: \[\texttt{if} \; e_0 \; \texttt{then} \; e_1 \; \texttt{else} \; e_2 \; \texttt{fi}\]
The result can be either \(e_1\) or \(e_2\)
The dynamic type is either \(e_1\)’s or \(e_2\)’s type
The best we can do is the smallest supertype of both \(e_1\)’s type and \(e_2\)’s type
If-Then-Else Example
Consider the class hierarchy
class P {...}
class A inherits P {...}
class B inherits P {...}
and the expression
if ... then new A else new B fi
Its type should allow for the dynamic type to be either A or B; the smallest supertype is P
Least Upper Bounds
Define \(lub(X,Y)\) to be the least upper bound of \(X\) and \(Y\). The \(lub(X,Y)\) is \(Z\) if
- \(X \leq Z\) and \(Y \leq Z\) (\(Z\) is an upper bound)
- \(X \leq Z'\) and \(Y \leq Z'\) implies \(Z \leq Z'\) (\(Z\) is least among upper bounds)
In Cool, the least upper bound of two types is their least common ancestor in the inheritance tree
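In an inheritance tree the least common ancestor can be found by listing the ancestors of one type from the bottom up and taking the first that is also an ancestor of the other. A minimal sketch, assuming a hypothetical class table `PARENT`:

```python
# Hypothetical hierarchy from the if-then-else example: A and B inherit P.
PARENT = {"Object": None, "P": "Object", "A": "P", "B": "P"}


def ancestors(t):
    """Return [t, parent(t), ..., Object], from most specific to least specific."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def lub(x, y):
    """Least upper bound of x and y: their least common ancestor in the tree."""
    y_ancestors = set(ancestors(y))
    for z in ancestors(x):        # walks upward, so the first common ancestor is the least
        if z in y_ancestors:
            return z
    raise ValueError("types share no common ancestor")


print(lub("A", "B"))
```

Since every class ultimately inherits from Object, the search always succeeds for classes in the table.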
If-Then-Else Revisited
\[ \frac{ \begin{array}{l} O \vdash e_0 : \texttt{Bool}\\ O \vdash e_1 : T_1\\ O \vdash e_2 : T_2\\ \end{array}} {O \vdash \texttt{if} \; e_0 \; \texttt{then} \; e_1 \; \texttt{else} \; e_2 \; \texttt{fi} : lub(T_1, T_2)}\text{[If-Then-Else]} \]
Case
- The rule for case expressions takes a lub over all branches
\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O[T_1/x_1] \vdash e_1 : T_1'\\ ...\\ O[T_n/x_n] \vdash e_n : T_n'\\ \end{array}} { \begin{array}{l} O \vdash \texttt{case} \; e_0 \; \texttt{of} \; x_1 : T_1 \; \texttt{=>} \; e_1;\\ ...;\\ x_n : T_n \; \texttt{=>} \; e_n; \; \texttt{esac} : lub(T_1', \ldots, T_n') \end{array} }\text{[Case]} \]
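The lub over \(n\) branch types can be computed by folding the binary lub across the list. The sketch below is illustrative only: the class table and the names `if_type` and `case_type` are assumptions, both functions take the already computed types of the subexpressions, and the per-branch environments \(O[T_i/x_i]\) are not modeled.

```python
from functools import reduce

# Hypothetical hierarchy: A and B inherit P, C inherits A.
PARENT = {"Object": None, "P": "Object", "A": "P", "B": "P", "C": "A"}


def ancestors(t):
    """Return [t, parent(t), ..., Object]."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def lub(x, y):
    """Least common ancestor of x and y in the inheritance tree."""
    y_ancestors = set(ancestors(y))
    return next(z for z in ancestors(x) if z in y_ancestors)


def if_type(pred_type, then_type, else_type):
    """[If-Then-Else], given the types already derived for e0, e1 and e2."""
    if pred_type != "Bool":
        raise TypeError("predicate must be Bool")
    return lub(then_type, else_type)


def case_type(branch_types):
    """[Case]: fold the binary lub over the branch types T1', ..., Tn'."""
    return reduce(lub, branch_types)


print(if_type("Bool", "A", "B"), case_type(["C", "A", "B"]))
```

Folding is safe here because the lub in a tree does not depend on the order in which the types are combined.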
Summary
A type environment gives types for free variables. You typecheck a let-body with an environment that has been updated to contain the new let-variable
If an object of type \(X\) could be used when one of type \(Y\) is acceptable then we say \(X\) is a subtype of \(Y\), also written \(X \leq Y\)
A type system is sound if \(\forall E. dynamic\_type(E) \leq static\_type(E)\)