Type Checking

CSC 310 - Programming Languages

Outline

  • Typing Rules
  • Typing Environments
  • “Let” Rules
  • Subtyping
  • Incorrect Rules

Soundness

  • A type system is sound if, whenever \(\vdash e : T\), then \(e\) evaluates to a value of type \(T\)

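  • We only want sound rules, but some sound rules are better than others. The following rule is sound, since every integer is an Object, but it is far less precise than a rule that assigns the type Int: \[\frac{i\text{ is an integer}}{\vdash i : \texttt{Object}}\]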

Type Checking Proofs

  • Type checking proves facts \(e: T\)

    • Proof is on the structure of the AST

    • Proof has the shape of the AST

    • One type rule is used for each kind of AST node

  • In the type rule used for a node \(e\)

    • Hypotheses are the proofs of types of \(e\)’s subexpressions

    • Conclusion is the type of \(e\)

  • Types are computed in a bottom-up pass over the AST

Rules for Constants

\[\frac{i\text{ is an integer}}{\vdash i : \texttt{Int}}\text{[Int]}\] \[\frac{}{\vdash true : \texttt{Bool}}\text{[Bool]}\] \[\frac{}{\vdash false : \texttt{Bool}}\text{[Bool]}\] \[\frac{s \text{ is a string constant}}{\vdash s : \texttt{String}}\text{[String]}\]

Rule for new

  • new T produces an object of type T

    • Ignore SELF_TYPE for now …

    \[ \frac{}{\vdash \texttt{new} \; T : T}\text{[New]} \]

Some Other Rules

  • Not

    \[ \frac{\vdash e : \texttt{Bool}} {\vdash \texttt{not} \; e : \texttt{Bool}}\text{[Not]} \]

  • Loop

    \[ \frac{ \begin{array}{l} \vdash e_1 : \texttt{Bool}\\ \vdash e_2 : T \end{array} } {\vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : \texttt{Object}}\text{[Loop]} \]
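
The constant, new, and not rules above can be read as a recursive function over the AST: each kind of node has one case, and a node's type is computed only after the types of its subexpressions. The following is a minimal sketch, not the course's implementation; the node classes and the names `type_of` and `TypeCheckError` are hypothetical.

```python
from dataclasses import dataclass


# Hypothetical AST node classes, invented for this sketch.
@dataclass
class IntConst:
    value: int


@dataclass
class BoolConst:
    value: bool


@dataclass
class StrConst:
    value: str


@dataclass
class New:
    type_name: str


@dataclass
class Not:
    operand: object


class TypeCheckError(Exception):
    """Raised when no typing rule applies to a node."""


def type_of(e):
    """Return the type of e; subexpressions are checked first (bottom-up)."""
    if isinstance(e, IntConst):      # [Int]
        return "Int"
    if isinstance(e, BoolConst):     # [Bool]
        return "Bool"
    if isinstance(e, StrConst):      # [String]
        return "String"
    if isinstance(e, New):           # [New], ignoring SELF_TYPE
        return e.type_name
    if isinstance(e, Not):           # [Not]: the hypothesis is checked first
        if type_of(e.operand) != "Bool":
            raise TypeCheckError("operand of not must have type Bool")
        return "Bool"
    raise TypeCheckError(f"no rule for node {e!r}")
```

For instance, `type_of(Not(BoolConst(True)))` yields `"Bool"`, while `type_of(Not(IntConst(1)))` raises `TypeCheckError` because the hypothesis of [Not] cannot be proved.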

A Problem

  • What is the type of a variable reference? \[\frac{x \text{ is an identifier}}{\vdash x : ?}\text{[Var]}\]

  • The local, structural rule does not carry enough information to give \(x\) a type

A Solution

  • Put more information in the rules

  • A type environment gives types for free variables

    • A type environment is a function from identifiers to types

    • A variable is free in an expression if it is not defined within the expression

  • Example: in the expression let x : Int in x + y, y is free, but x is not

Type Environments

  • Let \(O\) be a function from object identifiers to types

  • The sentence \(O \vdash e : T\) is read: under the assumption that variables have the types given by \(O\), it is provable that the expression \(e\) has type \(T\)

Type Environments and Rules

  • The type environment is added to the earlier rules, for example \[ \frac{i\text{ is an integer}}{O \vdash i : \texttt{Int}}\text{[Int]} \]

    \[ \frac{ \begin{array}{l} O \vdash e_1 : \texttt{Int}\\ O \vdash e_2 : \texttt{Int} \end{array}} {O \vdash e_1 + e_2 : \texttt{Int}}\text{[Add]} \]

New Rules

  • And we can now write a rule for variables:

    \[ \frac{O(x) = T} {O \vdash x : T}\text{[Var]} \]
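
As a rough illustration of [Var] and [Add], the environment \(O\) can be modelled as a dictionary from identifiers to type names that is passed to every recursive call. This is only a sketch: the tuple encoding of expressions and the names `type_of` and `TypeCheckError` are assumptions made for the example.

```python
class TypeCheckError(Exception):
    """Raised when no typing rule applies."""


def type_of(env, e):
    """Type e under env, a dict from identifiers to type names.

    Expressions are hypothetical tuples: ("int", 3), ("var", "x"),
    ("add", e1, e2).
    """
    tag = e[0]
    if tag == "int":                 # [Int]
        return "Int"
    if tag == "var":                 # [Var]: O(x) = T
        name = e[1]
        if name not in env:
            raise TypeCheckError(f"{name} has no type in the environment")
        return env[name]
    if tag == "add":                 # [Add]: both operands must be Int
        if type_of(env, e[1]) != "Int" or type_of(env, e[2]) != "Int":
            raise TypeCheckError("+ expects Int operands")
        return "Int"
    raise TypeCheckError(f"no rule for node {tag!r}")
```

With `env = {"x": "Int"}`, the expression `("add", ("var", "x"), ("int", 1))` has type `"Int"`; with an empty environment the same expression is rejected, because \(x\) is free and \(O\) gives it no type.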

Let

\[ \frac{O[T_0/x] \vdash e_1 : T_1} {O \vdash \texttt{let} \; x : T_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-No-Init]} \]

\(O[T_0/x]\) means “\(O\) modified to map \(x\) to \(T_0\) and behaving as \(O\) on all other arguments”: \[O[T_0/x](x) = T_0\] \[O[T_0/x](y) = O(y) \quad \text{for } y \neq x\]
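
As a worked instance, here is how the rules combine to type \(\texttt{let} \; x : \texttt{Int} \; \texttt{in} \; x + 1\) in an arbitrary environment \(O\). The expression is chosen only for illustration. The proof has the same shape as the AST, and the extended environment \(O[\texttt{Int}/x]\) is what allows [Var] to apply to \(x\):

\[ \frac{ \dfrac{ \dfrac{O[\texttt{Int}/x](x) = \texttt{Int}} {O[\texttt{Int}/x] \vdash x : \texttt{Int}}\text{[Var]} \qquad \dfrac{1 \text{ is an integer}} {O[\texttt{Int}/x] \vdash 1 : \texttt{Int}}\text{[Int]} } {O[\texttt{Int}/x] \vdash x + 1 : \texttt{Int}}\text{[Add]} } {O \vdash \texttt{let} \; x : \texttt{Int} \; \texttt{in} \; x + 1 : \texttt{Int}}\text{[Let-No-Init]} \]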

Let Example

  • Consider the Cool Expression

    \[ \texttt{let} \; x : T_0 \; \texttt{in} \; (\texttt{let} \; y : T_1 \; \texttt{in} \; E_{x,y}) + (\texttt{let} \; x : T_2 \; \texttt{in} \; F_{x,y}) \] where \(E_{x,y}\) and \(F_{x,y}\) are Cool expressions that contain occurrences of \(x\) and \(y\)

  • Scope

    • of \(y\) is \(E_{x,y}\)
    • of outer \(x\) is \(E_{x,y}\)
    • of inner \(x\) is \(F_{x,y}\)
  • This is captured precisely in the typing rule
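
One way to see why the rule captures scope is to implement \(O[T_0/x]\) as a copy of the environment rather than an in-place update: the extended environment is visible only to the let body, so the binding disappears once the body has been checked. The sketch below assumes a tuple encoding of expressions and the hypothetical helpers `extend` and `type_of`.

```python
class TypeCheckError(Exception):
    """Raised when no typing rule applies."""


def extend(env, name, typ):
    """Return O[typ/name]: a copy of env that maps name to typ."""
    new_env = dict(env)      # env itself is left unchanged
    new_env[name] = typ
    return new_env


def type_of(env, e):
    tag = e[0]
    if tag == "int":
        return "Int"
    if tag == "var":
        if e[1] not in env:
            raise TypeCheckError(f"{e[1]} is not in scope")
        return env[e[1]]
    if tag == "add":
        if type_of(env, e[1]) != "Int" or type_of(env, e[2]) != "Int":
            raise TypeCheckError("+ expects Int operands")
        return "Int"
    if tag == "let":         # [Let-No-Init]: only the body sees the binding
        _, name, declared, body = e
        return type_of(extend(env, name, declared), body)
    raise TypeCheckError(f"no rule for node {tag!r}")


# let x : Int in (let y : Int in x + y) + (let x : Int in x + y)
example = ("let", "x", "Int",
           ("add",
            ("let", "y", "Int", ("add", ("var", "x"), ("var", "y"))),
            ("let", "x", "Int", ("add", ("var", "x"), ("var", "y")))))

# An inner binding of x hides the outer one inside its own body.
shadowed = ("let", "x", "String", ("let", "x", "Int", ("var", "x")))
```

In `example`, the `y` in the right summand lies outside the scope of `let y`, so it is free there: the expression checks under `{"y": "Int"}` but is rejected under the empty environment.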

Notes

  • The type environment gives types to the free identifiers in the current scope

  • The type environment is passed down the AST from the root towards the leaves

  • Types are computed up the AST from the leaves towards the root

Let with Initialization

  • Now consider let with initialization

    \[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]

  • This rule is weak.

Let with Initialization

  • Consider the example:

    class C inherits P { ... }
    ...
    let x : P <- new C in ...
    ...
  • The previous rule does not allow this code; we say that the rule is too weak or incomplete

Subtyping

  • Define a relation \(X \leq Y\) on classes to say that:

    • An object of type \(X\) could be used when one of type \(Y\) is acceptable, or equivalently
    • \(X\) conforms with \(Y\)
    • In Cool, this means that \(X\) is a subclass of \(Y\)
  • Define a relation \(\leq\) on classes

    • \(X \leq X\)
    • \(X \leq Y\) if \(X\) inherits from \(Y\)
    • \(X \leq Z\) if \(X \leq Y\) and \(Y \leq Z\)
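
The three clauses above (reflexivity, direct inheritance, transitivity) amount to walking up the inheritance chain. A minimal sketch, assuming single inheritance and a hypothetical table `PARENT` that maps each class to its direct parent:

```python
# Hypothetical inheritance table: class -> direct parent (None for the root).
PARENT = {"Object": None, "Int": "Object", "P": "Object", "C": "P"}


def conforms(x, y, parent=PARENT):
    """Return True if x <= y, i.e. y is x itself or an ancestor of x."""
    while x is not None:
        if x == y:           # reflexive case, or an ancestor was reached
            return True
        x = parent[x]        # one inheritance step; repeating gives transitivity
    return False
```

Here `conforms("C", "P")` and `conforms("C", "Object")` hold, while `conforms("P", "C")` does not: the relation is not symmetric.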

Let with Initialization (Better)

  • New rule:

    \[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]

  • Both rules for let are sound

  • But, more programs type check with this new rule (it is more complete)
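
A sketch of the improved rule as code, under the same kind of assumptions as before: a hypothetical `PARENT` table for the hierarchy in which C inherits P, and a tuple encoding of expressions. The three hypotheses of the rule appear as three steps in the `let` case.

```python
# Hypothetical hierarchy: C inherits P.
PARENT = {"Object": None, "P": "Object", "C": "P"}


class TypeCheckError(Exception):
    """Raised when no typing rule applies."""


def conforms(x, y):
    """Return True if x <= y in the inheritance hierarchy."""
    while x is not None:
        if x == y:
            return True
        x = PARENT[x]
    return False


def type_of(env, e):
    tag = e[0]
    if tag == "new":
        return e[1]
    if tag == "var":
        if e[1] not in env:
            raise TypeCheckError(f"{e[1]} is not in scope")
        return env[e[1]]
    if tag == "let":
        _, name, declared, init, body = e
        init_type = type_of(env, init)                  # O |- e0 : T
        if not conforms(init_type, declared):           # T <= T0
            raise TypeCheckError(
                f"{init_type} does not conform to {declared}")
        return type_of({**env, name: declared}, body)   # O[T0/x] |- e1 : T1
    raise TypeCheckError(f"no rule for node {tag!r}")


# let x : P <- new C in x   (the body x is chosen for this example)
ok = ("let", "x", "P", ("new", "C"), ("var", "x"))
# let x : C <- new P in x   (rejected: P does not conform to C)
bad = ("let", "x", "C", ("new", "P"), ("var", "x"))
```

The body is checked with \(x\) at its declared type \(T_0\), not at the initializer's type \(T\), so `type_of({}, ok)` is `"P"` even though the initializer has type C.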

Type System Tug-of-War

  • There is a tension between

    • Flexible rules that do not constrain programming
    • Restrictive rules that ensure safety of execution

Expressiveness of Static Type Systems

  • A static type system enables a compiler to detect many common programming errors

  • The cost is that some correct programs are disallowed

  • But more expressive type systems are also more complex

Dynamic and Static Types

  • The dynamic type of an object is the class \(C\) that is used in the new C expression that creates the object

    • A run-time notion
    • Even languages that are not statically typed have the notion of dynamic type
  • The static type of an expression is a notation that captures all possible dynamic types the expression could take

    • A compile-time notion

Dynamic and Static Types

  • In early type systems the set of static types corresponds directly with the dynamic types

  • Soundness theorem: for all expressions \(E\), \(dynamic\_type(E) = static\_type(E)\), that is, in all executions, \(E\) evaluates to values of the type inferred by the compiler.

  • This gets more complicated in advanced type systems

Dynamic and Static Types in Cool

  • A variable of static type \(A\) can hold values of dynamic type \(B\), if \(B \leq A\)

    class A {...}
    class B inherits A {...}
    class Main {
      x : A <- new A; -- x has static type A
      ...
      x <- new B; -- here x's value has dynamic type B
      ...
    };

Dynamic and Static Types

  • Soundness theorem for the Cool type system: \[\forall E. dynamic\_type(E) \leq static\_type(E)\]

  • Why is this correct?

    • For \(E\), compiler uses \(static\_type(E)\)
    • All operations that can be used on an object of type \(C\) can also be used on an object of type \(C' \leq C\)
    • Subclasses can only add attributes or methods
    • Methods can be redefined but with the same types

Subtyping Example

  • Consider the following Cool class definitions

    class A { a() : Int { 0 }; };
    class B inherits A { b() : Int { 1 }; };
  • An instance of B has methods a and b
  • An instance of A has method a

    • A type error occurs if we try to invoke method b on an instance of A

Example of an Incorrect Let Rule (1)

  • Consider a hypothetical incorrect let rule:

    \[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]

  • The following good program does not typecheck:

    let x : Int <- 0 in x + 1

Example of an Incorrect Let Rule (2)

  • Consider a hypothetical incorrect let rule:

    \[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T_0 \leq T\\ O[T_0/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]

  • The following bad program is well typed:

    let x : B <- new A in x.b()

Example of an Incorrect Let Rule (3)

  • Consider a hypothetical incorrect let rule:

    \[ \frac{ \begin{array}{l} O \vdash e_0 : T\\ T \leq T_0\\ O[T/x] \vdash e_1 : T_1 \end{array}} {O \vdash \texttt{let} \; x : T_0 \; \texttt{<-} \; e_0 \; \texttt{in} \; e_1 : T_1}\text{[Let-Init]} \]

  • The following good program is not well typed:

    let x : A <- new B in {... x <- new A; x.a(); }

Typing Rule Notation

  • The typing rules use very concise notation
  • They are carefully constructed
  • Virtually any change in a rule either:

    • Makes the type system unsound
    • Or, makes the type system less usable (incomplete)
  • But some good programs will be rejected anyway; the notion of a good program is undecidable

Assignment

  • More uses of subtyping:

    \[ \frac{ \begin{array}{l} O(id) = T_0\\ O \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Assign]} \]
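
As a worked instance, take the classes A and B from the earlier example, where B inherits A, and an environment with \(O(x) = \texttt{A}\). The assignment \(x \; \texttt{<-} \; \texttt{new B}\) is then derived as follows. Note that the assignment expression itself has type B, the type of the right-hand side, not the declared type A of \(x\):

\[ \frac{ O(x) = \texttt{A} \qquad \dfrac{}{O \vdash \texttt{new} \; \texttt{B} : \texttt{B}}\text{[New]} \qquad \texttt{B} \leq \texttt{A} } {O \vdash x \; \texttt{<-} \; \texttt{new} \; \texttt{B} : \texttt{B}}\text{[Assign]} \]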

Initialized Attributes

  • Let \(O_c(x) = T\) for all attributes \(x : T\) in class \(C\)

    • \(O_c\) represents the class-wide scope
  • Attribute initialization is similar to let, except for the scope of names

    \[ \frac{ \begin{array}{l} O_c(id) = T_0\\ O_c \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O_c \vdash id : T_0 \; \texttt{<-} \; e_1;}\text{[Attr-Init]} \]

If-Then-Else

  • Consider: \[\texttt{if} \; e_0 \; \texttt{then} \; e_1 \; \texttt{else} \; e_2 \; \texttt{fi}\]

  • The result can be either \(e_1\) or \(e_2\)

  • The dynamic type is either \(e_1\)’s or \(e_2\)’s type

  • The best we can do is the smallest supertype of the types of both \(e_1\) and \(e_2\)

If-Then-Else Example

  • Consider the class hierarchy

    class P {...}
    class A inherits P {...}
    class B inherits P {...}
  • and the expression

    if ... then new A else new B fi
  • Its type should allow for the dynamic type to be either A or B; the smallest supertype is P

Least Upper Bounds

  • Define \(lub(X,Y)\) to be the least upper bound of \(X\) and \(Y\). The \(lub(X,Y)\) is \(Z\) if

    • \(X \leq Z\) and \(Y \leq Z\) (\(Z\) is an upper bound)
    • \(X \leq Z'\) and \(Y \leq Z'\) implies \(Z \leq Z'\) (\(Z\) is least among upper bounds)
  • In Cool, the least upper bound of two types is their least common ancestor in the inheritance tree
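
Since the inheritance hierarchy is a tree, the least common ancestor can be found by collecting the ancestors of one class and then walking up from the other until the first shared class. The following is a sketch under that assumption; the `PARENT` table and the function names are hypothetical.

```python
# Hypothetical hierarchy from the if-then-else example: A and B inherit P.
PARENT = {"Object": None, "P": "Object", "A": "P", "B": "P"}


def ancestors(t, parent=PARENT):
    """Return the chain [t, parent of t, ..., root]."""
    chain = []
    while t is not None:
        chain.append(t)
        t = parent[t]
    return chain


def lub(x, y, parent=PARENT):
    """Return the least common ancestor of x and y in the inheritance tree."""
    upper_bounds_of_x = set(ancestors(x, parent))
    for t in ancestors(y, parent):       # walk upward from y
        if t in upper_bounds_of_x:       # first hit is the least upper bound
            return t
    raise ValueError(f"{x} and {y} have no common ancestor")
```

For this hierarchy `lub("A", "B")` is `"P"`, and when one argument already conforms to the other, as in `lub("A", "P")`, the result is the larger of the two.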

If-Then-Else Revisited

\[ \frac{ \begin{array}{l} O \vdash e_0 : \texttt{Bool}\\ O \vdash e_1 : T_1\\ O \vdash e_2 : T_2\\ \end{array}} {O \vdash \texttt{if} \; e_0 \; \texttt{then} \; e_1 \; \texttt{else} \; e_2 \; \texttt{fi} : lub(T_1, T_2)}\text{[If-Then-Else]} \]
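
Applied to the earlier example, with A and B both inheriting P and the elided condition written as \(e_0\) and assumed to have type Bool, the rule gives:

\[ \frac{ O \vdash e_0 : \texttt{Bool} \qquad \dfrac{}{O \vdash \texttt{new} \; \texttt{A} : \texttt{A}}\text{[New]} \qquad \dfrac{}{O \vdash \texttt{new} \; \texttt{B} : \texttt{B}}\text{[New]} } {O \vdash \texttt{if} \; e_0 \; \texttt{then} \; \texttt{new} \; \texttt{A} \; \texttt{else} \; \texttt{new} \; \texttt{B} \; \texttt{fi} : lub(\texttt{A}, \texttt{B}) = \texttt{P}}\text{[If-Then-Else]} \]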

Case

  • The rule for case expressions takes a lub over all branches

\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O[T_1/x_1] \vdash e_1 : T_1'\\ ...\\ O[T_n/x_n] \vdash e_n : T_n'\\ \end{array}} { \begin{array}{l} O \vdash \texttt{case} \; e_0 \; \texttt{of} \; x_1 : T_1 \; \texttt{=>} \; e_1;\\ ...;\\ x_n : T_n \; \texttt{=>} \; e_n; \; \texttt{esac} : lub(T_1', \ldots, T_n') \end{array} }\text{[Case]} \]
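
The lub over \(n\) branches can be computed by folding the binary lub over the branch types. The sketch below assumes a hypothetical `PARENT` table and a tuple encoding of expressions; each branch body is checked in the environment extended with that branch's own variable, as in the rule.

```python
from functools import reduce

# Hypothetical hierarchy: A and B inherit P; Int inherits Object.
PARENT = {"Object": None, "Int": "Object", "P": "Object", "A": "P", "B": "P"}


def ancestors(t):
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def lub(x, y):
    """Least common ancestor of x and y in the inheritance tree."""
    upper_bounds_of_x = set(ancestors(x))
    return next(t for t in ancestors(y) if t in upper_bounds_of_x)


def type_of(env, e):
    tag = e[0]
    if tag == "new":
        return e[1]
    if tag == "var":
        return env[e[1]]
    if tag == "case":
        _, scrutinee, branches = e
        type_of(env, scrutinee)            # O |- e0 : T0 (e0 must be well typed)
        branch_types = [
            type_of({**env, name: declared}, body)   # O[Ti/xi] |- ei : Ti'
            for name, declared, body in branches
        ]
        return reduce(lub, branch_types)   # lub(T1', ..., Tn')
    raise TypeError(f"no rule for node {tag!r}")


# case new A of a : A => a; b : B => b; esac
two = ("case", ("new", "A"),
       [("a", "A", ("var", "a")), ("b", "B", ("var", "b"))])
# The same expression with a third branch i : Int => i
three = ("case", ("new", "A"),
         [("a", "A", ("var", "a")), ("b", "B", ("var", "b")),
          ("i", "Int", ("var", "i"))])
```

Adding a branch can only move the result up the hierarchy: `two` has type `"P"`, while the extra Int branch in `three` forces the type up to `"Object"`.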

Summary

  • A type environment gives types for free variables. You typecheck a let-body with an environment that has been updated to contain the new let-variable

  • If an object of type \(X\) could be used when one of type \(Y\) is acceptable then we say \(X\) is a subtype of \(Y\), also written \(X \leq Y\)

  • A type system is sound if \(\forall E. dynamic\_type(E) \leq static\_type(E)\)