Type Checking

Outline

Typing Rules
Typing Environments
“Let” Rules
Subtyping
Incorrect Rules

Soundness

A type system is sound if, whenever \(\vdash e : T\), then \(e\) evaluates to a value of type \(T\)
We only want sound rules, but some sound rules are better than others. The following rule is sound, but it discards the more precise fact that \(i\) is an integer: \[\frac{i\text{ is an integer}}{\vdash i : \texttt{Object}}\]

Type Checking Proofs

Type checking proves facts \(e: T\)
- Proof is on the structure of the AST
- Proof has the shape of the AST
- One type rule is used for each kind of AST node
In the type rule used for a node \(e\)
- Hypotheses are the proofs of types of \(e\)’s subexpressions
- Conclusion is the type of \(e\)
Types are computed in a bottom-up pass over the AST

Rules for Constants

\[\frac{i\text{ is an integer}}{\vdash i : \texttt{Int}}\text{[Int]}\] \[\frac{}{\vdash true : \texttt{Bool}}\text{[Bool]}\] \[\frac{}{\vdash false : \texttt{Bool}}\text{[Bool]}\] \[\frac{s \text{ is a string constant}}{\vdash s : \texttt{String}}\text{[String]}\]

Rule for `new`

new T produces an object of type T
- Ignore SELF_TYPE for now …
\[ \frac{}{\vdash \texttt{new} \; T : T}\text{[New]} \]

Some Other Rules

Not

\[ \frac{\vdash e : \texttt{Bool}} {\vdash \texttt{not} \; e : \texttt{Bool}}\text{[Not]} \]
Loop

\[ \frac{ \begin{array}{l} \vdash e_1 : \texttt{Bool}\\ \vdash e_2 : T \end{array} } {\vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : \texttt{Object}}\text{[Loop]} \]
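As a minimal sketch of the bottom-up pass, assume a hypothetical encoding of AST nodes as Python tuples (the tags and the `TypeCheckError` name are illustrative and not part of Cool). The constant, `new`, and Not rules then become one branch per node kind, and the hypotheses of a rule become recursive calls on the subexpressions:

```python
class TypeCheckError(Exception):
    """Raised when no typing rule applies to an expression."""

def type_of(e):
    """Return the type of e, checking subexpressions first (bottom-up)."""
    kind = e[0]
    if kind == "int":       # [Int]
        return "Int"
    if kind == "bool":      # [Bool]
        return "Bool"
    if kind == "str":       # [String]
        return "String"
    if kind == "new":       # [New]: new T has type T (SELF_TYPE ignored)
        return e[1]
    if kind == "not":       # [Not]: the hypothesis is the proof for the operand
        if type_of(e[1]) != "Bool":
            raise TypeCheckError("operand of not must be Bool")
        return "Bool"
    raise TypeCheckError(f"no rule for node kind {kind!r}")

print(type_of(("not", ("not", ("bool", True)))))  # Bool
```

The call tree of `type_of` has the same shape as the AST, which is the sense in which the proof has the shape of the AST.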

A Problem

What is the type of a variable reference? \[\frac{x \text{ is an identifier}}{\vdash x : ?}\text{[Var]}\]
The local, structural rule does not carry enough information to give \(x\) a type

A Solution

Put more information in the rules
A type environment gives types for free variables
- A type environment is a function from identifiers to types
- A variable is free in an expression if it is not defined within the expression
Example: in the expression let x : Int in x + y, y is free, but x is not
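The definition of a free variable can be sketched as a small recursive function. The tuple encoding of expressions used here is an assumption made for illustration only:

```python
def free_vars(e):
    """Return the set of identifiers that occur free in e."""
    kind = e[0]
    if kind == "int":
        return set()
    if kind == "var":                 # ("var", name)
        return {e[1]}
    if kind == "add":                 # ("add", left, right)
        return free_vars(e[1]) | free_vars(e[2])
    if kind == "let":                 # ("let", name, declared_type, body)
        _, x, _declared, body = e
        return free_vars(body) - {x}  # x is defined within the let
    raise ValueError(f"unknown node kind {kind!r}")

# let x : Int in x + y
expr = ("let", "x", "Int", ("add", ("var", "x"), ("var", "y")))
print(free_vars(expr))  # {'y'}
```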

Type Environments

Let \(O\) be a function from object identifiers to types
The sentence \(O \vdash e : T\) is read: under the assumption that variables have the types given by \(O\), it is provable that the expression \(e\) has type \(T\)

Type Environments and Rules

The type environment is added to the earlier rules, for example \[ \frac{i\text{ is an integer}}{O \vdash i : \texttt{Int}}\text{[Int]} \]

\[ \frac{ \begin{array}{l} O \vdash e_1 : \texttt{Int}\\ O \vdash e_2 : \texttt{Int} \end{array}} {O \vdash e_1 + e_2 : \texttt{Int}}\text{[Add]} \]

New Rules

And we can now write a rule for variables:

\[ \frac{O(x) = T} {O \vdash x : T}\text{[Var]} \]
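A minimal sketch of these rules, assuming the environment \(O\) is represented as a Python dict from identifier names to type names and expressions as tuples (both are illustrative choices):

```python
class TypeCheckError(Exception):
    """Raised when no typing rule applies."""

def type_of(O, e):
    """Check e under environment O and return its type."""
    kind = e[0]
    if kind == "int":                       # [Int]
        return "Int"
    if kind == "var":                       # [Var]: O(x) = T
        if e[1] not in O:
            raise TypeCheckError(f"unbound identifier {e[1]}")
        return O[e[1]]
    if kind == "add":                       # [Add]: both operands must be Int
        if type_of(O, e[1]) != "Int" or type_of(O, e[2]) != "Int":
            raise TypeCheckError("operands of + must be Int")
        return "Int"
    raise TypeCheckError(f"no rule for node kind {kind!r}")

O = {"x": "Int"}
print(type_of(O, ("add", ("var", "x"), ("int", 1))))  # Int
```

The same environment is passed unchanged to both operands of `+`, matching the two hypotheses of [Add].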

Let

\[ \frac{O[T_0/x] \vdash e_1 : T_1} {O \vdash \texttt{let} \; x : T_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-No-Init]} \]

\(O[T_0/x]\) means “\(O\) modified to map \(x\) to \(T_0\) and behaving as \(O\) on all other arguments”: \[O[T_0/x](x) = T_0\] \[O[T_0/x](y) = O(y)\]
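One way to picture \(O[T_0/x]\) is as a copy of a dict with one entry overridden. The sketch below assumes that representation; the helper name `extend` and the tuple encoding are hypothetical:

```python
def extend(O, x, T):
    """Return O[T/x]: maps x to T and agrees with O elsewhere. O is unchanged."""
    new_O = dict(O)
    new_O[x] = T
    return new_O

def type_of(O, e):
    kind = e[0]
    if kind == "int":
        return "Int"
    if kind == "var":
        return O[e[1]]                          # KeyError if x is unbound
    if kind == "let":                           # [Let-No-Init]
        _, x, T0, body = e                      # ("let", x, T0, body)
        return type_of(extend(O, x, T0), body)  # body is checked under O[T0/x]
    raise ValueError(f"unknown node kind {kind!r}")

O = {"x": "String"}
print(type_of(O, ("let", "x", "Int", ("var", "x"))))  # Int
print(O["x"])                                         # String
```

Because `extend` copies rather than mutates, the outer binding of `x` is still visible once the let has been checked.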

Let Example

Consider the Cool expression

\[ \texttt{let} \; x : T_0 \; \texttt{in} \; (\texttt{let} \; y : T_1 \; \texttt{in} \; E_{x,y}) + (\texttt{let} \; x : T_2 \; \texttt{in} \; F_{x,y}) \] where \(E_{x,y}\) and \(F_{x,y}\) are some Cool expressions that contain occurrences of \(x\) and \(y\)
Scope
- of \(y\) is \(E_{x,y}\)
- of outer \(x\) is \(E_{x,y}\)
- of inner \(x\) is \(F_{x,y}\)
This is captured precisely in the typing rule
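To see how the rule captures these scopes, the sketch below records the environment under which each body would be checked. The `"body"` placeholder nodes standing for \(E_{x,y}\) and \(F_{x,y}\), and the function name, are assumptions for illustration:

```python
def environments(O, e, seen):
    """Walk e and record, for each placeholder body, the environment it sees."""
    kind = e[0]
    if kind == "body":                 # ("body", label): stands for E or F
        seen[e[1]] = dict(O)
    elif kind == "add":                # both operands see the same environment
        environments(O, e[1], seen)
        environments(O, e[2], seen)
    elif kind == "let":                # ("let", x, T, body): body sees O[T/x]
        _, x, T, body = e
        environments({**O, x: T}, body, seen)
    return seen

expr = ("let", "x", "T0",
        ("add",
         ("let", "y", "T1", ("body", "E")),
         ("let", "x", "T2", ("body", "F"))))
seen = environments({}, expr, {})
print(seen["E"])  # {'x': 'T0', 'y': 'T1'}
print(seen["F"])  # {'x': 'T2'}
```

The environment for \(F_{x,y}\) has no entry for \(y\) introduced by this expression, so any occurrence of \(y\) there must be bound further out.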

Notes

The type environment gives types to the free identifiers in the current scope
The type environment is passed down the AST from the root towards the leaves
Types are computed up the AST from the leaves towards the root

Let with Initialization

Now consider let with initialization

\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
This rule is weak.

Let with Initialization

Consider the example:

class C inherits P { ... }
...
let x : P <- new C in ...
...

The previous rule does not allow this code; we say that the rule is too weak or incomplete

Subtyping

Define a relation \(X \leq Y\) on classes to say that:
- An object of type \(X\) could be used when one of type \(Y\) is acceptable, or equivalently
- \(X\) conforms with \(Y\)
- In Cool, this means that \(X\) is a subclass of \(Y\)
Define a relation \(\leq\) on classes
- \(X \leq X\)
- \(X \leq Y\) if \(X\) inherits from \(Y\)
- \(X \leq Z\) if \(X \leq Y\) and \(Y \leq Z\)
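The three clauses (reflexivity, direct inheritance, transitivity) amount to walking up the inheritance chain. A minimal sketch, assuming the hierarchy is given as a hypothetical `parent` dict in which the root `Object` maps to `None`:

```python
def conforms(X, Y, parent):
    """Return True if X <= Y, i.e. Y is X itself or an ancestor of X."""
    T = X
    while T is not None:
        if T == Y:          # reflexive case, or reached by following inherits
            return True
        T = parent[T]       # one inheritance step; repeating gives transitivity
    return False

# class C inherits P; P inherits Object by default
PARENT = {"Object": None, "P": "Object", "C": "P"}
print(conforms("C", "P", PARENT))       # True
print(conforms("C", "Object", PARENT))  # True
print(conforms("P", "C", PARENT))       # False
```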

Let with Initialization (Better)

New rule:

\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
Both rules for let are sound
But, more programs type check with this new rule (it is more complete)
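The three hypotheses of the new rule map onto three steps of a checker. The sketch below assumes a tuple encoding of expressions and a hypothetical `PARENT` map for the class hierarchy from the earlier example:

```python
PARENT = {"Object": None, "P": "Object", "C": "P"}

class TypeCheckError(Exception):
    """Raised when no typing rule applies."""

def conforms(X, Y):
    """Return True if X <= Y in the hierarchy described by PARENT."""
    while X is not None:
        if X == Y:
            return True
        X = PARENT[X]
    return False

def type_of(O, e):
    kind = e[0]
    if kind == "new":
        return e[1]
    if kind == "var":
        return O[e[1]]
    if kind == "let":                        # ("let", x, T0, init, body)
        _, x, T0, init, body = e
        T = type_of(O, init)                 # O |- e0 : T
        if not conforms(T, T0):              # T <= T0
            raise TypeCheckError(f"{T} does not conform to {T0}")
        return type_of({**O, x: T0}, body)   # O[T0/x] |- e1 : T1
    raise TypeCheckError(f"no rule for node kind {kind!r}")

# let x : P <- new C in x
print(type_of({}, ("let", "x", "P", ("new", "C"), ("var", "x"))))  # P
```

Note that the body sees `x` at its declared type `P`, not at the type `C` of the initializer.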

Type System Tug-of-War

There is a tension between
- Flexible rules that do not constrain programming
- Restrictive rules that ensure safety of execution

Expressiveness of Static Type Systems

A static type system enables a compiler to detect many common programming errors
The cost is that some correct programs are disallowed
But more expressive type systems are also more complex

Dynamic and Static Types

The dynamic type of an object is the class \(C\) that is used in the new C expression that creates the object
- A run-time notion
- Even languages that are not statically typed have the notion of dynamic type
The static type of an expression is a notation that captures all possible dynamic types the expression could take
- A compile-time notion

Dynamic and Static Types

In early type systems the set of static types corresponds directly with the dynamic types
Soundness theorem: for all expressions \(E\), \(dynamic\_type(E) = static\_type(E)\), that is, in all executions, \(E\) evaluates to values of the type inferred by the compiler.
This gets more complicated in advanced type systems

Dynamic and Static Types in Cool

A variable of static type \(A\) can hold values of dynamic type \(B\), if \(B \leq A\)

class A {...}
class B inherits A {...}
class Main {
  x : A <- new A; -- x has static type A
  ...
  x <- new B; -- here x's value has dynamic type B
  ...
};

Dynamic and Static Types

Soundness theorem for the Cool type system: \[\forall E. dynamic\_type(E) \leq static\_type(E)\]
Why is this correct?
- For \(E\), compiler uses \(static\_type(E)\)
- All operations that can be used on an object of type \(C\) can also be used on an object of type \(C' \leq C\)
- Subclasses can only add attributes or methods
- Methods can be redefined but with the same types

Subtyping Example

Consider the following Cool class definitions

class A { a() : Int { 0 }; };
class B inherits A { b() : Int { 1 }; };

An instance of B has methods a and b
An instance of A has method a
- A type error occurs if we try to invoke method b on an instance of A

Example of an Incorrect Let Rule (1)

Consider a hypothetical incorrect let rule:

\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
The following good program does not typecheck:
```
let x : Int <- 0 in x + 1
```
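The failure can be reproduced with a small checker in which a flag (a hypothetical parameter named `extend_env`) switches between checking the body under \(O[T_0/x]\) and under plain \(O\). Only `Int` appears, so the premise \(T \leq T_0\) is simplified to an equality test here:

```python
class TypeCheckError(Exception):
    """Raised when no typing rule applies."""

def type_of(O, e, extend_env):
    kind = e[0]
    if kind == "int":
        return "Int"
    if kind == "var":
        if e[1] not in O:
            raise TypeCheckError(f"unbound identifier {e[1]}")
        return O[e[1]]
    if kind == "add":
        left = type_of(O, e[1], extend_env)
        right = type_of(O, e[2], extend_env)
        if left != "Int" or right != "Int":
            raise TypeCheckError("operands of + must be Int")
        return "Int"
    if kind == "let":                              # ("let", x, T0, init, body)
        _, x, T0, init, body = e
        if type_of(O, init, extend_env) != T0:     # simplified T <= T0
            raise TypeCheckError("initializer does not conform")
        body_env = {**O, x: T0} if extend_env else O   # the incorrect rule uses O
        return type_of(body_env, body, extend_env)
    raise TypeCheckError(f"no rule for node kind {kind!r}")

# let x : Int <- 0 in x + 1
prog = ("let", "x", "Int", ("int", 0), ("add", ("var", "x"), ("int", 1)))
print(type_of({}, prog, extend_env=True))   # Int
try:
    type_of({}, prog, extend_env=False)
except TypeCheckError as err:
    print("rejected:", err)                 # rejected: unbound identifier x
```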

Example of an Incorrect Let Rule (2)

Consider a hypothetical incorrect let rule:

\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T_0 \leq T\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]
The following bad program is well typed:
```
let x : B <- new A in x.b()
```
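A sketch of why this rule is unsound, using classes `A` and `B` from the subtyping example. The method tables and the simplified method lookup are assumptions for illustration; this section does not give a dispatch rule, so the lookup below is a stand-in for one:

```python
PARENT = {"Object": None, "A": "Object", "B": "A"}
METHODS = {"A": {"a": "Int"}, "B": {"b": "Int"}}   # methods declared per class

def conforms(X, Y):
    while X is not None:
        if X == Y:
            return True
        X = PARENT[X]
    return False

def lookup_method(T, m):
    """Return the result type of m in T or an ancestor, or None if absent."""
    while T is not None:
        if m in METHODS.get(T, {}):
            return METHODS[T][m]
        T = PARENT[T]
    return None

def check_let(T_init, T0, method, reversed_premise):
    """Check `let x : T0 <- new T_init in x.method()`; None means rejected."""
    ok = conforms(T0, T_init) if reversed_premise else conforms(T_init, T0)
    if not ok:
        return None
    return lookup_method(T0, method)    # x has static type T0 in the body

# let x : B <- new A in x.b()
print(check_let("A", "B", "b", reversed_premise=True))   # Int
print(check_let("A", "B", "b", reversed_premise=False))  # None
print(lookup_method("A", "b"))                           # None
```

The reversed premise accepts the program at type `Int`, yet the object actually created has class `A`, which has no method `b`.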

Example of an Incorrect Let Rule (3)

Consider a hypothetical incorrect let rule:

\[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O[T/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]

The following good program is not well typed:

let x : A <- new B in {... x <- new A; x.a(); }

Typing Rule Notation

The typing rules use very concise notation
They are carefully constructed
Virtually any change in a rule either:
- Makes the type system unsound
- Or, makes the type system less usable (incomplete)
But some good programs will be rejected anyway; the notion of a good program is undecidable

Assignment

More uses of subtyping:

\[ \frac{ \begin{array}{l} O(id) = T_0\\ O \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Assign]} \]
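A minimal sketch of [Assign], assuming the type \(T_1\) of the right-hand side has already been computed and that the hierarchy is supplied as a hypothetical `parent` dict:

```python
def conforms(X, Y, parent):
    """Return True if X <= Y under the inheritance map parent."""
    while X is not None:
        if X == Y:
            return True
        X = parent[X]
    return False

def check_assign(O, name, T1, parent):
    """[Assign]: name <- e1, where T1 is the type already derived for e1."""
    T0 = O[name]                          # O(id) = T0
    if not conforms(T1, T0, parent):      # T1 <= T0
        raise TypeError(f"{T1} does not conform to {T0}")
    return T1                             # the assignment has type T1, not T0

PARENT = {"Object": None, "A": "Object", "B": "A"}
print(check_assign({"x": "A"}, "x", "B", PARENT))  # B
```

Returning \(T_1\) rather than \(T_0\) keeps the more precise type available to any enclosing expression.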

Initialized Attributes

Let \(O_c(x) = T\) for all attributes \(x : T\) in class \(C\)
- \(O_c\) represents the class-wide scope
Attribute initialization is similar to let, except for the scope of names

\[ \frac{ \begin{array}{l} O_c(id) = T_0\\ O_c \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O_c \vdash id : T_0 \; \texttt{<-} \; e_1;}\text{[Attr-Init]} \]

If-Then-Else

Consider: \[\texttt{if} \; e_0 \; \texttt{then} \; e_1 \; \texttt{else} \; e_2 \; \texttt{fi}\]
The result can be either \(e_1\) or \(e_2\)
The dynamic type is either \(e_1\)’s or \(e_2\)’s type
The best we can do is the smallest supertype of the types of both \(e_1\) and \(e_2\)

If-Then-Else Example

Consider the class hierarchy

class P {...}
class A inherits P {...}
class B inherits P {...}

and the expression
```
if ... then new A else new B fi
```
Its type should allow for the dynamic type to be either A or B; the smallest supertype is P

Least Upper Bounds

Define \(lub(X,Y)\) to be the least upper bound of \(X\) and \(Y\). The \(lub(X,Y)\) is \(Z\) if
- \(X \leq Z\) and \(Y \leq Z\) (\(Z\) is an upper bound)
- \(X \leq Z'\) and \(Y \leq Z'\) implies \(Z \leq Z'\) (\(Z\) is least among upper bounds)
In Cool, the least upper bound of two types is their least common ancestor in the inheritance tree
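Since the inheritance graph is a tree, the least upper bound can be found by collecting the ancestors of one type and walking up from the other until the chains meet. A sketch, assuming a hypothetical `parent` dict with `Object` as root:

```python
def ancestors(T, parent):
    """Return [T, parent of T, ..., Object]."""
    chain = []
    while T is not None:
        chain.append(T)
        T = parent[T]
    return chain

def lub(X, Y, parent):
    """Return the least common ancestor of X and Y in the inheritance tree."""
    above_X = set(ancestors(X, parent))
    for T in ancestors(Y, parent):   # nearest ancestors of Y come first
        if T in above_X:
            return T
    raise ValueError("types do not share a root")

PARENT = {"Object": None, "P": "Object", "A": "P", "B": "P"}
print(lub("A", "B", PARENT))       # P
print(lub("A", "P", PARENT))       # P
print(lub("A", "Object", PARENT))  # Object
```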

If-Then-Else Revisited

\[ \frac{ \begin{array}{l} O \vdash e_0 : \texttt{Bool}\\ O \vdash e_1 : T_1\\ O \vdash e_2 : T_2\\ \end{array}} {O \vdash \texttt{if} \; e_0 \; \texttt{then} \; e_1 \; \texttt{else} \; e_2 \; \texttt{fi} : lub(T_1, T_2)}\text{[If-Then-Else]} \]
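A sketch of [If-Then-Else] under the same assumptions as before (tuple-encoded expressions, a hypothetical `PARENT` map for the hierarchy P, A, B):

```python
PARENT = {"Object": None, "Bool": "Object", "P": "Object", "A": "P", "B": "P"}

class TypeCheckError(Exception):
    """Raised when no typing rule applies."""

def lub(X, Y):
    """Least common ancestor of X and Y in the tree described by PARENT."""
    above_X = set()
    T = X
    while T is not None:
        above_X.add(T)
        T = PARENT[T]
    T = Y
    while T not in above_X:      # terminates: Object is above every class
        T = PARENT[T]
    return T

def type_of(O, e):
    kind = e[0]
    if kind == "bool":
        return "Bool"
    if kind == "new":
        return e[1]
    if kind == "if":                            # ("if", e0, e1, e2)
        _, e0, e1, e2 = e
        if type_of(O, e0) != "Bool":            # O |- e0 : Bool
            raise TypeCheckError("predicate must be Bool")
        return lub(type_of(O, e1), type_of(O, e2))
    raise TypeCheckError(f"no rule for node kind {kind!r}")

# if true then new A else new B fi
print(type_of({}, ("if", ("bool", True), ("new", "A"), ("new", "B"))))  # P
```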

Case

The rule for case expressions takes a lub over all branches

\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O[T_1/x_1] \vdash e_1 : T_1'\\ ...\\ O[T_n/x_n] \vdash e_n : T_n'\\ \end{array}} { \begin{array}{l} O \vdash \texttt{case} \; e_0 \; \texttt{of} \; x_1 : T_1 \; \texttt{=>} \; e_1;\\ ...;\\ x_n : T_n \; \texttt{=>} \; e_n; \; \texttt{esac} : lub(T_1', \ldots, T_n') \end{array} }\text{[Case]} \]
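The [Case] rule can be sketched by checking each branch under its own extended environment and folding `lub` over the branch types. The tuple encoding and the `PARENT` map are illustrative assumptions:

```python
from functools import reduce

PARENT = {"Object": None, "P": "Object", "A": "P", "B": "P"}

def lub(X, Y):
    """Least common ancestor of X and Y in the tree described by PARENT."""
    above_X = set()
    T = X
    while T is not None:
        above_X.add(T)
        T = PARENT[T]
    T = Y
    while T not in above_X:
        T = PARENT[T]
    return T

def type_of(O, e):
    kind = e[0]
    if kind == "new":
        return e[1]
    if kind == "var":
        return O[e[1]]
    if kind == "case":               # ("case", e0, [(x1, T1, body1), ...])
        _, e0, branches = e
        type_of(O, e0)               # O |- e0 : T0; any type T0 is accepted
        branch_types = [type_of({**O, x: T}, body)   # O[Ti/xi] |- ei : Ti'
                        for x, T, body in branches]
        return reduce(lub, branch_types)             # lub(T1', ..., Tn')
    raise ValueError(f"unknown node kind {kind!r}")

# case new P of a : A => a; b : B => b; esac
expr = ("case", ("new", "P"),
        [("a", "A", ("var", "a")), ("b", "B", ("var", "b"))])
print(type_of({}, expr))  # P
```

Folding the binary `lub` from left to right gives the n-ary least upper bound because `lub` is associative on a tree.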

Summary

A type environment gives types for free variables. You typecheck a let-body with an environment that has been updated to contain the new let-variable
If an object of type \(X\) could be used when one of type \(Y\) is acceptable then we say \(X\) is a subtype of \(Y\), also written \(X \leq Y\)
A type system is sound if \(\forall E. dynamic\_type(E) \leq static\_type(E)\)
