A type system is sound if, whenever \(\vdash e : T\), then \(e\) evaluates to a value of type \(T\)
We only want sound rules, but some sound rules are better than others: \[\frac{i\text{ is an integer}}{\vdash i : \texttt{Object}}\]
Type checking proves facts \(e: T\)
Proof is on the structure of the AST
Proof has the shape of the AST
One type rule is used for each kind of AST node
In the type rule used for a node \(e\)
Hypotheses are the proofs of types of \(e\)’s subexpressions
Conclusion is the type of \(e\)
Types are computed in a bottom-up pass over the AST
\[\frac{i\text{ is an integer}}{\vdash i : \texttt{Int}}\text{[Int]}\] \[\frac{}{\vdash \texttt{true} : \texttt{Bool}}\text{[Bool]}\] \[\frac{}{\vdash \texttt{false} : \texttt{Bool}}\text{[Bool]}\] \[\frac{s \text{ is a string constant}}{\vdash s : \texttt{String}}\text{[String]}\]
new
\(\texttt{new} \; T\) produces an object of type \(T\)
Ignore \(\texttt{SELF\_TYPE}\) for now …
\[ \frac{}{\vdash \texttt{new} \; T : T}\text{[New]} \]
Not
\[ \frac{\vdash e : \texttt{Bool}} {\vdash \texttt{not} \; e : \texttt{Bool}}\text{[Not]} \]
Loop
\[ \frac{ \begin{array}{l} \vdash e_1 : \texttt{Bool}\\ \vdash e_2 : T \end{array} } {\vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : \texttt{Object}}\text{[Loop]} \]
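The rules so far can be read as one case per kind of AST node in a recursive function. The following is a minimal sketch, not a reference implementation: the tagged-tuple encoding of expressions and the names `type_of` and `TypeCheckError` are assumptions made for illustration. A loop is given static type Object, as in Cool, whatever the type of its body.

```python
class TypeCheckError(Exception):
    """Raised when no rule applies to an expression."""

def type_of(e):
    """Return the static type of e, typing subexpressions first (bottom-up)."""
    tag = e[0]
    if tag == "int":                      # [Int]
        return "Int"
    if tag == "bool":                     # [Bool], for both true and false
        return "Bool"
    if tag == "string":                   # [String]
        return "String"
    if tag == "new":                      # [New], ignoring SELF_TYPE
        return e[1]
    if tag == "not":                      # [Not]
        if type_of(e[1]) != "Bool":
            raise TypeCheckError("operand of not must be Bool")
        return "Bool"
    if tag == "while":                    # [Loop]
        if type_of(e[1]) != "Bool":
            raise TypeCheckError("loop predicate must be Bool")
        type_of(e[2])                     # body must be well typed; its type T is discarded
        return "Object"
    raise TypeCheckError(f"no rule for {tag!r}")
```

For example, `type_of(("not", ("bool", True)))` yields `"Bool"`, while `type_of(("not", ("int", 1)))` raises `TypeCheckError` because the hypothesis of [Not] cannot be proved.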
What is the type of a variable reference? \[\frac{x \text{ is an identifier}}{\vdash x : ?}\text{[Var]}\]
The local, structural rule does not carry enough information to give \(x\) a type
Put more information in the rules
A type environment gives types for free variables
A type environment is a function from identifiers to types
A variable is free in an expression if it is not defined within the expression
Example: in the expression \(\texttt{let x : Int in x + y}\), \(\texttt{y}\) is free, but \(\texttt{x}\) is not
Let \(O\) be a function from object identifiers to types
The sentence \(O \vdash e : T\) is read: under the assumption that variables have the types given by \(O\), it is provable that the expression \(e\) has type \(T\)
The type environment is added to the earlier rules, for example \[ \frac{i\text{ is an integer}}{O \vdash i : \texttt{Int}}\text{[Int]} \]
\[ \frac{ \begin{array}{l} O \vdash e_1 : \texttt{Int}\\ O \vdash e_2 : \texttt{Int} \end{array}} {O \vdash e_1 + e_2 : \texttt{Int}}\text{[Add]} \]
And we can now write a rule for variables:
\[ \frac{O(x) = T} {O \vdash x : T}\text{[Var]} \]
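As a small worked example (the environment and the expression are chosen here only for illustration), assume \(O(x) = \texttt{Int}\). The rules [Var], [Int], and [Add] then combine into a derivation with the same shape as the AST of \(x + 1\):

\[
\frac{
  \dfrac{O(x) = \texttt{Int}}{O \vdash x : \texttt{Int}}\text{[Var]}
  \qquad
  \dfrac{1\text{ is an integer}}{O \vdash 1 : \texttt{Int}}\text{[Int]}
}{O \vdash x + 1 : \texttt{Int}}\text{[Add]}
\]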
\[ \frac{O[T_0/x] \vdash e_1 : T_1} {O \vdash \texttt{let} \; x : T_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-No-Init]} \]
\(O[T_0/x]\) means “\(O\) modified to map \(x\) to \(T_0\) and behaving as \(O\) on all other arguments”: \[O[T_0/x](x) = T_0\] \[O[T_0/x](y) = O(y)\]
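One way to picture \(O[T_0/x]\) is as a copied dictionary with one entry overridden. The sketch below assumes a tagged-tuple encoding of expressions; `extend`, `type_of`, and `TypeCheckError` are hypothetical names, not part of any Cool compiler.

```python
class TypeCheckError(Exception):
    pass

def extend(O, x, T):
    """Return O[T/x]: maps x to T and agrees with O elsewhere. O is not mutated."""
    updated = dict(O)
    updated[x] = T
    return updated

def type_of(O, e):
    """Type e under environment O (a dict from identifiers to type names)."""
    tag = e[0]
    if tag == "int":                               # [Int]
        return "Int"
    if tag == "var":                               # [Var]: O(x) = T
        if e[1] not in O:
            raise TypeCheckError(f"unbound identifier {e[1]}")
        return O[e[1]]
    if tag == "add":                               # [Add]
        if type_of(O, e[1]) != "Int" or type_of(O, e[2]) != "Int":
            raise TypeCheckError("+ needs Int operands")
        return "Int"
    if tag == "let":                               # [Let-No-Init]
        _, x, T0, body = e
        return type_of(extend(O, x, T0), body)     # body is typed in O[T0/x]
    raise TypeCheckError(f"no rule for {tag!r}")

# let x : Int in x + y
example = ("let", "x", "Int", ("add", ("var", "x"), ("var", "y")))
```

With `O = {"y": "Int"}` the example has type `Int`; with an empty environment it is rejected, because `y` is free and has no type. Since `extend` copies, an inner `let` can shadow an outer binding without disturbing the environment used outside its body.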
Consider the Cool expression
\[ \texttt{let} \; x : T_0 \; \texttt{in} \; (\texttt{let} \; y : T_1 \; \texttt{in} \; E_{x,y}) + (\texttt{let} \; x : T_2 \; \texttt{in} \; F_{x,y}) \] where \(E_{x,y}\) and \(F_{x,y}\) are some Cool expressions that contain occurrences of \(x\) and \(y\)
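Scope: \(y\) and the outer \(x\) are in scope in \(E_{x,y}\); in \(F_{x,y}\) the inner \(x\) hides the outer \(x\), and \(y\) is free (the \(\texttt{let}\) that declares \(y\) covers only \(E_{x,y}\))
This is captured precisely in the typing rule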
The type environment gives types to the free identifiers in the current scope
The type environment is passed down the AST from the root towards the leaves
Types are computed up the AST from the leaves towards the root
Now consider \(\texttt{let}\) with initialization
\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
This rule is weak.
Consider the example:
class C inherits P { ... }
...
let x : P <- new C in ...
...
The previous rule does not allow this code; we say that the rule is too weak or incomplete
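Define a relation \(X \leq Y\) on classes to say that an object of type \(X\) could be used when one of type \(Y\) is acceptable
Define the relation \(\leq\) on classes by: \(X \leq X\); \(X \leq Y\) if \(X\) inherits from \(Y\); and \(X \leq Z\) if \(X \leq Y\) and \(Y \leq Z\)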
New rule:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
Both rules for let are sound
But, more programs type check with this new rule (it is more complete)
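The difference between the two rules can be seen in a short sketch. Everything named here is an assumption for illustration: the `PARENT` table encodes a hypothetical hierarchy in which C inherits P, expressions are tagged tuples, and the `use_subtyping` flag switches between requiring the initializer's type to equal \(T_0\) (the first rule) and requiring only \(T \leq T_0\) (the new rule).

```python
# Hypothetical inheritance table: each class maps to its parent.
PARENT = {"Object": None, "P": "Object", "C": "P"}

class TypeCheckError(Exception):
    pass

def conforms(X, Y):
    """X <= Y: Y is X itself or one of X's ancestors."""
    while X is not None:
        if X == Y:
            return True
        X = PARENT[X]
    return False

def type_of(O, e, use_subtyping=True):
    tag = e[0]
    if tag == "new":
        return e[1]
    if tag == "var":
        if e[1] not in O:
            raise TypeCheckError(f"unbound identifier {e[1]}")
        return O[e[1]]
    if tag == "let":                                  # [Let-Init]
        _, x, T0, init, body = e
        T = type_of(O, init, use_subtyping)           # O |- e0 : T
        ok = conforms(T, T0) if use_subtyping else T == T0
        if not ok:
            raise TypeCheckError(f"{T} is not acceptable for {x} : {T0}")
        return type_of({**O, x: T0}, body, use_subtyping)   # O[T0/x] |- e1 : T1
    raise TypeCheckError(f"no rule for {tag!r}")

# let x : P <- new C in x   (the body x stands in for the elided body)
program = ("let", "x", "P", ("new", "C"), ("var", "x"))
```

`type_of({}, program)` gives `"P"`, while `type_of({}, program, use_subtyping=False)` raises `TypeCheckError`: the same program is accepted by one sound rule and rejected by the other.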
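There is a tension between flexible rules that do not constrain programming and restrictive rules that ensure safety of execution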
A static type system enables a compiler to detect many common programming errors
The cost is that some correct programs are disallowed
But more expressive type systems are also more complex
The dynamic type of an object is the class \(C\) that is used in the \(\texttt{new} \; C\) expression that creates the object
The static type of an expression is a notation that captures all possible dynamic types the expression could take
In early type systems the set of static types corresponds directly with the dynamic types
Soundness theorem: for all expressions \(E\), \(dynamic\_type(E) = static\_type(E)\), that is, in all executions, \(E\) evaluates to values of the type inferred by the compiler.
This gets more complicated in advanced type systems
A variable of static type \(A\) can hold values of static type \(B\), if \(B \leq A\)
class A {...}
class B inherits A {...}
class Main {
x : A <- new A; -- x has static type A
...
x <- new B; -- here x's value has dynamic type B
...
};
Soundness theorem for the Cool type system: \[\forall E. dynamic\_type(E) \leq static\_type(E)\]
Why is this correct?
Consider the following Cool class definitions
class A { a() : Int { 0 }; };
class B inherits A { b() : Int { 1 }; };
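An instance of \(\texttt{B}\) has methods \(\texttt{a}\) and \(\texttt{b}\)
An instance of \(\texttt{A}\) has method \(\texttt{a}\)
A type error occurs if we try to invoke method \(\texttt{b}\) on an instance of \(\texttt{A}\)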
Consider a hypothetical incorrect let rule:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
The following good program does not typecheck:
let x : Int <- 0 in x + 1
Consider a hypothetical incorrect let rule:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T_0 \leq T\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
The following bad program is well typed:
let x : B <- new A in x.b()
Consider a hypothetical incorrect let rule:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O[T/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
The following good program is not well typed:
let x : A <- new B in {... x <- new A; x.a(); }
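Virtually any change in a rule either makes the type system unsound (bad programs are accepted as well typed) or makes it less usable (good programs are rejected)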
But some good programs will be rejected anyway; the notion of a good program is undecidable
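The three hypothetical rules above can be tried out in a small sketch. The class table, the tagged-tuple expression encoding, and the variant names `"no_extend"`, `"flipped"`, and `"init_type"` are all assumptions introduced for this illustration; `"correct"` stands for the [Let-Init] rule with \(T \leq T_0\) and \(O[T_0/x]\). Dispatch is simplified to zero-argument methods, and the elided part of the third program is omitted.

```python
# Hypothetical class table matching the A/B example: B inherits A.
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}
METHODS = {"A": {"a": "Int"}, "B": {"b": "Int"}}  # zero-argument methods only

class TypeCheckError(Exception):
    pass

def conforms(X, Y):
    """X <= Y: Y is X itself or one of X's ancestors."""
    while X is not None:
        if X == Y:
            return True
        X = PARENT[X]
    return False

def method_type(C, m):
    """Return type of method m, looked up in C and then in its ancestors."""
    while C is not None:
        if m in METHODS.get(C, {}):
            return METHODS[C][m]
        C = PARENT[C]
    raise TypeCheckError(f"no method {m}")

def type_of(O, e, variant):
    tag = e[0]
    if tag == "int":
        return "Int"
    if tag == "new":
        return e[1]
    if tag == "var":
        if e[1] not in O:
            raise TypeCheckError(f"unbound identifier {e[1]}")
        return O[e[1]]
    if tag == "add":
        if [type_of(O, sub, variant) for sub in e[1:]] != ["Int", "Int"]:
            raise TypeCheckError("+ needs Int operands")
        return "Int"
    if tag == "call":
        return method_type(type_of(O, e[1], variant), e[2])
    if tag == "assign":
        T1 = type_of(O, e[2], variant)
        if e[1] not in O or not conforms(T1, O[e[1]]):
            raise TypeCheckError(f"bad assignment to {e[1]}")
        return T1
    if tag == "seq":                      # non-empty block: type of the last expression
        return [type_of(O, sub, variant) for sub in e[1]][-1]
    if tag == "let":
        _, x, T0, init, body = e
        T = type_of(O, init, variant)
        # second wrong rule checks T0 <= T instead of T <= T0
        ok = conforms(T0, T) if variant == "flipped" else conforms(T, T0)
        if not ok:
            raise TypeCheckError("initializer does not conform")
        if variant == "no_extend":
            body_env = O                  # first wrong rule: body typed in O
        elif variant == "init_type":
            body_env = {**O, x: T}        # third wrong rule: O[T/x]
        else:
            body_env = {**O, x: T0}       # O[T0/x]
        return type_of(body_env, body, variant)
    raise TypeCheckError(f"no rule for {tag!r}")

def accepts(program, variant):
    """True if the program type checks under the given let rule variant."""
    try:
        type_of({}, program, variant)
        return True
    except TypeCheckError:
        return False

p1 = ("let", "x", "Int", ("int", 0), ("add", ("var", "x"), ("int", 1)))
p2 = ("let", "x", "B", ("new", "A"), ("call", ("var", "x"), "b"))
p3 = ("let", "x", "A", ("new", "B"),
      ("seq", [("assign", "x", ("new", "A")), ("call", ("var", "x"), "a")]))
```

Under these assumptions, `p1` and `p3` are accepted by `"correct"` but rejected by `"no_extend"` and `"init_type"` respectively, and `p2` is rejected by `"correct"` but accepted by `"flipped"`.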
More uses of subtyping:
\[ \frac{ \begin{array}{l} O(id) = T_0\\ O \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Assign]} \]
Let \(O_c(x) = T\) for all attributes \(x : T\) in class \(C\)
Attribute initialization is similar to \(\texttt{let}\), except for the scope of names
\[ \frac{ \begin{array}{l} O_c(id) = T_0\\ O_c \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O_c \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Attr-Init]} \]
Consider: \[\texttt{if} \; e_0 \; \texttt{then} \; e_1 \; \texttt{else} \; e_2 \; \texttt{fi}\]
The result can be either \(e_1\) or \(e_2\)
The dynamic type is either \(e_1\)’s or \(e_2\)’s type
The best we can do is the smallest supertype of the types of both \(e_1\) and \(e_2\)
Consider the class hierarchy
class P {...}
class A inherits P {...}
class B inherits P {...}
and the expression
if ... then new A else new B fi
Its type should allow for the dynamic type to be either \(\texttt{A}\) or \(\texttt{B}\); the smallest supertype is \(\texttt{P}\)
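Define \(lub(X,Y)\) to be the least upper bound of \(X\) and \(Y\). \(lub(X,Y)\) is \(Z\) if \(X \leq Z\) and \(Y \leq Z\) (\(Z\) is an upper bound), and \(Z \leq Z'\) for every \(Z'\) with \(X \leq Z'\) and \(Y \leq Z'\) (\(Z\) is least among the upper bounds)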
In Cool, the least upper bound of two types is their least common ancestor in the inheritance tree
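With single inheritance, the least common ancestor can be found by walking up from one class until reaching an ancestor of the other. A minimal sketch, assuming a hypothetical `PARENT` table for the P/A/B hierarchy (with `Int` added for contrast):

```python
# Hypothetical inheritance table: each class maps to its parent.
PARENT = {"Object": None, "Int": "Object", "P": "Object", "A": "P", "B": "P"}

def ancestors(X):
    """Return [X, parent of X, ..., Object], nearest first."""
    chain = []
    while X is not None:
        chain.append(X)
        X = PARENT[X]
    return chain

def lub(X, Y):
    """Least upper bound of X and Y: their least common ancestor."""
    above_Y = set(ancestors(Y))
    for Z in ancestors(X):        # nearest ancestors of X are tried first
        if Z in above_Y:
            return Z
    # Unreachable when every class descends from Object.
    raise ValueError(f"{X} and {Y} have no common ancestor")
```

Here `lub("A", "B")` is `"P"`, `lub("A", "P")` is `"P"`, and `lub("A", "Int")` is `"Object"`.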
\[ \frac{ \begin{array}{l} O \vdash e_0 : \texttt{Bool}\\ O \vdash e_1 : T_1\\ O \vdash e_2 : T_2\\ \end{array}} {O \vdash \texttt{if} \; e_0 \; \texttt{then} \; e_1 \; \texttt{else} \; e_2 \; \texttt{fi} : lub(T_1, T_2)}\text{[If-Then-Else]} \]
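For the hierarchy with \(\texttt{A}\) and \(\texttt{B}\) inheriting from \(\texttt{P}\), and taking \(\texttt{true}\) as a stand-in for the elided condition (an assumption made only for this illustration), the rule instantiates to:

\[
\frac{
  \begin{array}{l}
    O \vdash \texttt{true} : \texttt{Bool}\\
    O \vdash \texttt{new} \; \texttt{A} : \texttt{A}\\
    O \vdash \texttt{new} \; \texttt{B} : \texttt{B}
  \end{array}}
  {O \vdash \texttt{if} \; \texttt{true} \; \texttt{then} \; \texttt{new} \; \texttt{A} \; \texttt{else} \; \texttt{new} \; \texttt{B} \; \texttt{fi} : lub(\texttt{A}, \texttt{B}) = \texttt{P}}\text{[If-Then-Else]}
\]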
\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O[T_1/x_1] \vdash e_1 : T_1'\\ ...\\ O[T_n/x_n] \vdash e_n : T_n'\\ \end{array}} { \begin{array}{l} O \vdash \texttt{case} \; e_0 \; \texttt{of} \; x_1 : T_1 \; \texttt{=>} \; e_1;\\ ...;\\ x_n : T_n \; \texttt{=>} \; e_n; \; \texttt{esac} : lub(T_1', \ldots, T_n') \end{array} }\text{[Case]} \]
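The [Case] rule can be sketched as typing each branch in its own extended environment and folding \(lub\) over the branch types. The encoding below is an assumption for illustration (tagged tuples, a hypothetical `PARENT` table); only the premises shown in the rule are checked.

```python
from functools import reduce

# Hypothetical inheritance table: each class maps to its parent.
PARENT = {"Object": None, "Int": "Object", "P": "Object", "A": "P", "B": "P"}

def ancestors(X):
    chain = []
    while X is not None:
        chain.append(X)
        X = PARENT[X]
    return chain

def lub(X, Y):
    """Least common ancestor of X and Y in the inheritance tree."""
    above_Y = set(ancestors(Y))
    return next(Z for Z in ancestors(X) if Z in above_Y)

def type_of(O, e):
    tag = e[0]
    if tag == "new":
        return e[1]
    if tag == "var":
        return O[e[1]]                      # KeyError if the identifier is unbound
    if tag == "case":                       # [Case]
        _, e0, branches = e
        type_of(O, e0)                      # O |- e0 : T0 (T0 itself is not constrained)
        # Branch i is typed in O[T_i/x_i]; O is never mutated.
        branch_types = [type_of({**O, x: T}, body) for (x, T, body) in branches]
        return reduce(lub, branch_types)    # lub(T1', ..., Tn')
    raise ValueError(f"no rule for {tag!r}")

# case new A of a : A => a; b : B => b; esac
example = ("case", ("new", "A"),
           [("a", "A", ("var", "a")), ("b", "B", ("var", "b"))])
```

With this table, `type_of({}, example)` is `"P"`; replacing the second branch with `("i", "Int", ("var", "i"))` widens the result to `"Object"`.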
A type environment gives types for free variables. You typecheck a let-body with an environment that has been updated to contain the new let-variable
If an object of type \(X\) could be used when one of type \(Y\) is acceptable then we say \(X\) is a subtype of \(Y\), also written \(X \leq Y\)
A type system is sound if \(\forall E. dynamic\_type(E) \leq static\_type(E)\)