More Type Checking

Assignment

Review: what are \(\vdash\), \(O\), and \(\leq\)?

\[ \frac{ \begin{array}{l} O(id) = T_0\\ O \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Assign]} \]
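As a rough illustration, the [Assign] rule could be checked as below. This is a minimal Python sketch, not the course's reference implementation: the class table `PARENT`, the helper `conforms`, and `check_assign` are hypothetical names, and the type \(T_1\) of \(e_1\) is assumed to have been inferred already.

```python
# Hypothetical class table: each class maps to its parent (Object is the root).
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}

def conforms(t1, t2):
    """Return True if t1 <= t2, i.e. t2 is t1 or one of its ancestors."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def check_assign(O, ident, t1):
    """[Assign]: O(id) = T0 and T1 <= T0 give (id <- e1) : T1."""
    t0 = O[ident]                     # O(id) = T0
    if not conforms(t1, t0):          # T1 <= T0
        raise TypeError(f"cannot assign {t1} to {ident} : {t0}")
    return t1                         # the assignment has type T1, not T0

O = {"x": "A"}
print(check_assign(O, "x", "B"))      # B conforms to A, so this prints B
```

Note that the result is the type of the right-hand side, which may be more specific than the declared type of the identifier.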

Initialized Attributes

Let \(O_c(x) = T\) for all attributes \(x : T\) in class \(C\)
- \(O_c\) represents the class-wide scope
Attribute initialization is similar to let, except for the scope of names

\[ \frac{ \begin{array}{l} O_c(id) = T_0\\ O_c \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O_c \vdash id : T_0 \; \texttt{<-} \; e_1 ;}\text{[Attr-Init]} \]

Method Dispatch

There is a problem with type checking method calls:

\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O \vdash e_1 : T_1\\ ...\\ O \vdash e_n : T_n\\ \end{array}} {O \vdash e_0.f(e_1, \ldots, e_n) : \; ?}\text{[Dispatch]} \]
We need information about the formal parameters and return type of \(f\)

Notes on Dispatch

In Cool, method and object identifiers live in different name spaces
- A method foo and an object foo can coexist in the same scope
In the type rules, this is reflected by a separate mapping \(M\) for method signatures:

\[M(C, f) = (T_1, \ldots, T_n, T_{ret})\]

which means in class \(C\) there is a method \(f\) where

\[f(x_1 : T_1, \ldots, x_n : T_n) : T_{ret}\]
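One possible representation of the two name spaces, sketched in Python. The classes `A` and `B`, the method `scale`, and the helper `lookup` are made up for illustration; walking up the inheritance chain so that inherited methods are found is an implementation choice assumed here.

```python
# Hypothetical class table: each class maps to its parent (Object is the root).
PARENT = {"Object": None, "A": "Object", "B": "A"}

# M maps (class, method) to the signature (T_1, ..., T_n, T_ret).
M = {("A", "scale"): ("Int", "A")}    # scale(x : Int) : A, declared in A

# O maps object identifiers to types. It is a separate mapping, so an
# object named "scale" does not clash with the method "scale".
O = {"scale": "Int"}

def lookup(M, cls, f):
    """Return M(cls, f), searching the ancestors of cls for inherited methods."""
    while cls is not None:
        if (cls, f) in M:
            return M[(cls, f)]
        cls = PARENT[cls]
    raise TypeError(f"no method {f}")

print(lookup(M, "B", "scale"))        # B inherits scale from A
print(O["scale"])                     # the object identifier is looked up in O
```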

An Extended Typing Judgment

Now we have two environments: \(O\) and \(M\)
The form of the typing judgment is

\[O, M \vdash e : T\]

which can be read as: “with the assumption that the object identifiers have types as given by \(O\) and the method identifiers have signatures as given by \(M\), the expression \(e\) has type \(T\)”

The Method Environment

The method environment must be added to all rules
In most cases, \(M\) is passed down but not actually used
Only the dispatch rule uses \(M\)

The Dispatch Rule Revisited

Steps: check the receiver object, check the actual arguments, then look up the formal argument types \(T_i'\) and compare

\[ \frac{ \begin{array}{l} O, M \vdash e_0 : T_0\\ O, M \vdash e_1 : T_1\\ ...\\ O, M \vdash e_n : T_n\\ M(T_0, f) = (T_1', \ldots, T_n', T_{n+1}')\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M \vdash e_0.f(e_1, \ldots, e_n) : \; T_{n+1}'}\text{[Dispatch]} \]
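The premises of [Dispatch] translate fairly directly into code. Below is a hedged Python sketch: the class table, the method `f`, and the helpers `conforms`, `lookup`, and `check_dispatch` are hypothetical, and the types of \(e_0, \ldots, e_n\) are assumed to have been inferred already.

```python
# Hypothetical class table and method environment.
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}
M = {("A", "f"): ("A", "Int")}        # f(x : A) : Int, declared in A

def conforms(t1, t2):
    """Return True if t1 <= t2."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def lookup(M, cls, f):
    """Return M(cls, f), searching ancestors for inherited methods."""
    while cls is not None:
        if (cls, f) in M:
            return M[(cls, f)]
        cls = PARENT[cls]
    raise TypeError(f"no method {f}")

def check_dispatch(M, t0, f, arg_types):
    """[Dispatch]: return T'_{n+1} if every T_i <= T'_i."""
    *formals, ret = lookup(M, t0, f)  # M(T0, f) = (T'_1, ..., T'_n, T'_{n+1})
    if len(formals) != len(arg_types):
        raise TypeError("wrong number of arguments")
    for actual, formal in zip(arg_types, formals):
        if not conforms(actual, formal):
            raise TypeError(f"{actual} does not conform to {formal}")
    return ret

print(check_dispatch(M, "B", "f", ["B"]))   # prints Int
```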

Static Dispatch

Static dispatch is a variation of normal dispatch
The method is found in the class explicitly named by the programmer (not via \(e_0\))
The inferred type of the dispatch expression must conform to the specified type

Static Dispatch (Continued)

\[ \frac{ \begin{array}{l} O, M \vdash e_0 : T_0\\ O, M \vdash e_1 : T_1\\ ...\\ O, M \vdash e_n : T_n\\ T_0 \leq T\\ M(T, f) = (T_1', \ldots, T_n', T_{n+1}')\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M \vdash e_0@T.f(e_1, \ldots, e_n) : \; T_{n+1}'}\text{[Static Dispatch]} \]
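A sketch of [Static Dispatch] in Python, under the same kind of assumptions as before: the class table, the method `f`, and all function names are hypothetical. The differences from normal dispatch are the extra premise \(T_0 \leq T\) and the lookup in \(T\) rather than \(T_0\).

```python
# Hypothetical class table; B overrides f with the same signature.
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}
M = {("A", "f"): ("Int",), ("B", "f"): ("Int",)}   # f() : Int

def conforms(t1, t2):
    """Return True if t1 <= t2."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def lookup(M, cls, f):
    """Return M(cls, f), searching ancestors for inherited methods."""
    while cls is not None:
        if (cls, f) in M:
            return M[(cls, f)]
        cls = PARENT[cls]
    raise TypeError(f"no method {f}")

def check_static_dispatch(M, t0, t, f, arg_types):
    """[Static Dispatch] for e0@T.f(e1, ..., en)."""
    if not conforms(t0, t):           # T0 <= T
        raise TypeError(f"{t0} does not conform to {t}")
    *formals, ret = lookup(M, t, f)   # the method is found via T, not T0
    if len(formals) != len(arg_types):
        raise TypeError("wrong number of arguments")
    for actual, formal in zip(arg_types, formals):
        if not conforms(actual, formal):
            raise TypeError(f"{actual} does not conform to {formal}")
    return ret

print(check_static_dispatch(M, "B", "A", "f", []))  # prints Int
```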

Flexibility vs. Soundness

Recall that type systems have two conflicting goals:
- Give flexibility to the programmer
- Prevent well-typed programs from “going wrong”
An active line of research is inventing more flexible type systems while preserving soundness

Dynamic and Static Types

The dynamic type of an object is the class C that is used in the new C expression that created it
- A run-time notion
- Even languages that are not statically typed have the notion of dynamic type
The static type of an expression is a notion that captures all possible dynamic types the expression could take
- A compile-time notion

Recall: Soundness

Soundness theorem for the Cool type system: \[\forall E. dynamic\_type(E) \leq static\_type(E)\]
Why is this correct?
- For \(E\), compiler uses \(static\_type(E)\)
- All operations that can be used on an object of type \(C\) can also be used on an object of type \(C' \leq C\)
- Subclasses can only add attributes or methods
- Methods can be redefined but with the same types

An Example

Class Count incorporates a counter; the inc method works for any subclass

class Count {
    i : Int <- 0;
    inc () : Count {
        {
            i <- i + 1;
            self;
        }
    };
};

But, there is disaster lurking in the type system

Continuing Example

Consider a subclass Stock of Count

class Stock inherits Count {
    name() : String {...}; -- name of item
};

And the following use of Stock

class Main {
    a : Stock <- (new Stock).inc(); -- Type checking error
    ...  a.name() ...
};

Post-Mortem

(new Stock).inc() has dynamic type Stock
So it is legitimate to write
```
a : Stock <- (new Stock).inc()
```
But (new Stock).inc() has static type Count
The type checker “loses” type information
This makes inheriting inc useless
- That is, we must redefine inc for each of the subclasses, with a specialized return type
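The loss of information can be reproduced with a small Python sketch of the dispatch and conformance checks. The encoding of the class table and the names `conforms` and `dispatch_type` are assumptions made for illustration only.

```python
# Class table and method environment for the Count/Stock example.
PARENT = {"Object": None, "Count": "Object", "Stock": "Count"}
M = {("Count", "inc"): ("Count",)}    # inc() : Count

def conforms(t1, t2):
    """Return True if t1 <= t2."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def dispatch_type(t0, f):
    """Static type of a zero-argument dispatch e0.f() where e0 : t0."""
    cls = t0
    while cls is not None:
        if (cls, f) in M:
            return M[(cls, f)][-1]    # the declared return type
        cls = PARENT[cls]
    raise TypeError(f"no method {f}")

static = dispatch_type("Stock", "inc")
print(static)                         # prints Count
print(conforms(static, "Stock"))      # prints False: the initialization is rejected
```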

`SELF_TYPE` to the Rescue

We will extend the type system
Insight:
- inc returns self
- Therefore the return value has the same type as self
- Which could be Count or any subtype of Count
- In the case of (new Stock).inc() the type is Stock
We introduce the keyword SELF_TYPE to use for the return value of such functions
- We will also modify the typing rules to handle SELF_TYPE

`SELF_TYPE` to the Rescue (Continued)

SELF_TYPE allows the return type of inc to change when inc is inherited
Modify the declaration of inc to read
```
inc() : SELF_TYPE { ... }
```
The type checker can now prove:
- \(O, M \vdash\) (new Count).inc() : Count
- \(O, M \vdash\) (new Stock).inc() : Stock
The program from before is now well typed

`SELF_TYPE` as a Tool

SELF_TYPE is not a dynamic type
SELF_TYPE is a static type
It helps the type checker to keep better track of types
It enables the type checker to accept more correct programs
In short, having SELF_TYPE increases the expressive power of the type system

`SELF_TYPE` and Dynamic Types

What can the dynamic type of the object returned by inc be?
Answer: whatever the type of self could be

Example: the dynamic type could be Count or any subtype of Count

class A inherits Count { };
class B inherits Count { };
class C inherits Count { };

`SELF_TYPE` and Dynamic Types (Continued)

In general, if SELF_TYPE appears textually in the class \(C\) as the declared type of \(E\) then it denotes the dynamic type of the self expression:

\[dynamic\_type(E) = dynamic\_type(\texttt{self}) \leq C\]
Note: the meaning of SELF_TYPE depends on where it appears
- We write \(\texttt{SELF_TYPE}_C\) to refer to an occurrence of SELF_TYPE in the body of \(C\)

Type Checking

This suggests a typing rule:

\[\texttt{SELF_TYPE}_C \leq C\]
This rule has an important consequence:
- In type checking it is always safe to replace \(\texttt{SELF_TYPE}_C\) with \(C\)
This suggests one way to handle SELF_TYPE: replace all \(\texttt{SELF_TYPE}_C\) with \(C\)
This would be correct but it is like not having SELF_TYPE at all (whoops!)

Operations on `SELF_TYPE`

Recall the operations on types:
- \(T_1 \leq T_2\): \(T_1\) is a subtype of \(T_2\)
- \(lub(T_1, T_2)\): the least-upper bound of \(T_1\) and \(T_2\)
We must extend these operations to handle SELF_TYPE

Extending \(\leq\)

Let \(T\) and \(T'\) be any types except SELF_TYPE. There are four cases in the definition of \(\leq\)

\(\texttt{SELF_TYPE}_C \leq T\) if \(C \leq T\)
- \(\texttt{SELF_TYPE}_C\) can be any subtype of \(C\)
- This includes \(C\) itself
- Thus this is the most flexible rule we can allow
\(\texttt{SELF_TYPE}_C \leq \texttt{SELF_TYPE}_C\)
- \(\texttt{SELF_TYPE}_C\) is the type of the self expression
- In Cool, we never need to compare SELF_TYPEs coming from different classes

Extending \(\leq\) (Continued)

\(T \leq \texttt{SELF_TYPE}_C\) is always false
- Note: \(\texttt{SELF_TYPE}_C\) can denote any subtype of \(C\)
\(T \leq T'\) (according to the rules from before)
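The four cases can be sketched as one function. This is an illustrative Python version, not a prescribed implementation: the string `"SELF_TYPE"` stands for \(\texttt{SELF_TYPE}_C\), the enclosing class \(C\) is passed as the parameter `c`, and the names `class_conforms` and `conforms` are hypothetical.

```python
SELF_TYPE = "SELF_TYPE"
PARENT = {"Object": None, "Count": "Object", "Stock": "Count"}

def class_conforms(t1, t2):
    """The original <= on class names only."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def conforms(t1, t2, c):
    """Extended <=, where SELF_TYPE means SELF_TYPE_c."""
    if t1 == SELF_TYPE and t2 == SELF_TYPE:
        return True                   # SELF_TYPE_C <= SELF_TYPE_C
    if t1 == SELF_TYPE:
        return class_conforms(c, t2)  # SELF_TYPE_C <= T if C <= T
    if t2 == SELF_TYPE:
        return False                  # T <= SELF_TYPE_C is always false
    return class_conforms(t1, t2)     # T <= T' as before

print(conforms(SELF_TYPE, "Count", "Stock"))   # True, since Stock <= Count
print(conforms("Stock", SELF_TYPE, "Stock"))   # False, even for T = C
```

The second call shows why the third case is always false: self could belong to a proper subclass of Stock, and a Stock object would not conform to that subclass.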

Based on these rules, we can extend \(lub\)

Extending \(lub(T, T')\)

Let \(T\) and \(T'\) be any types except SELF_TYPE. Again, there are four cases:

\(lub(\texttt{SELF_TYPE}_C, \texttt{SELF_TYPE}_C) = \texttt{SELF_TYPE}_C\)
\(lub(\texttt{SELF_TYPE}_C, T) = lub(C, T)\)
\(lub(T, \texttt{SELF_TYPE}_C) = lub(T, C)\)
\(lub(T, T')\) defined as before
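A possible Python rendering of the extended \(lub\). The class `Bond` and the helper names are hypothetical; as before, `"SELF_TYPE"` stands for \(\texttt{SELF_TYPE}_C\) and the enclosing class is passed as `c`.

```python
SELF_TYPE = "SELF_TYPE"
# Hypothetical hierarchy: Stock and Bond both inherit from Count.
PARENT = {"Object": None, "Count": "Object", "Stock": "Count", "Bond": "Count"}

def ancestors(t):
    """Return t followed by its superclasses, ending with Object."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain

def lub(t1, t2, c):
    """Extended least upper bound, where SELF_TYPE means SELF_TYPE_c."""
    if t1 == SELF_TYPE and t2 == SELF_TYPE:
        return SELF_TYPE              # lub(SELF_TYPE_C, SELF_TYPE_C)
    if t1 == SELF_TYPE:
        t1 = c                        # lub(SELF_TYPE_C, T) = lub(C, T)
    if t2 == SELF_TYPE:
        t2 = c                        # lub(T, SELF_TYPE_C) = lub(T, C)
    above_t1 = set(ancestors(t1))
    for t in ancestors(t2):           # nearest ancestor of t2 that is also above t1
        if t in above_t1:
            return t

print(lub("Stock", "Bond", "Count"))      # prints Count
print(lub(SELF_TYPE, "Stock", "Bond"))    # lub(Bond, Stock), prints Count
```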

Where Can `SELF_TYPE` Appear in Cool?

The parser checks that SELF_TYPE appears only where a type is expected
But SELF_TYPE is not allowed everywhere a type can appear:
class \(T\) inherits \(T'\) {...}
- \(T, T'\) cannot be SELF_TYPE because SELF_TYPE is never a dynamic type
x : \(T\)
- \(T\) can be SELF_TYPE
- An attribute whose type is \(\texttt{SELF_TYPE}_C\)

Where Can `SELF_TYPE` Appear in Cool? (Continued)

let x : \(T\) in \(E\)
- \(T\) can be SELF_TYPE
- x has type \(\texttt{SELF_TYPE}_C\)
new \(T\)
- \(T\) can be SELF_TYPE
- Creates an object of the same type as self
\(E_0\)@\(T\).m(\(E_1, \ldots, E_n\))
- \(T\) cannot be SELF_TYPE
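One way a checker might enforce these restrictions is a small table of syntactic positions. This Python sketch is only an illustration: the position labels and the function `check_type_position` are hypothetical, and the table covers just the positions listed above.

```python
SELF_TYPE = "SELF_TYPE"

# Whether SELF_TYPE may appear in each position where a type is expected.
SELF_TYPE_ALLOWED = {
    "class name": False,       # class T inherits T' {...}
    "parent class": False,
    "attribute": True,         # x : T
    "let": True,               # let x : T in E
    "new": True,               # new T
    "static dispatch": False,  # E0@T.m(E1, ..., En)
}

def check_type_position(position, type_name):
    """Raise TypeError if SELF_TYPE is used where it is not allowed."""
    if type_name == SELF_TYPE and not SELF_TYPE_ALLOWED[position]:
        raise TypeError(f"SELF_TYPE cannot be used as a {position} type")
    return type_name

print(check_type_position("new", SELF_TYPE))   # accepted, prints SELF_TYPE
```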

Typing Rules for `SELF_TYPE`

Since occurrences of SELF_TYPE depend on the enclosing class we need to carry more context during type checking
New form of the typing judgment:

\[O, M, C \vdash e : T\]

(an expression \(e\) occurring in the body of \(C\) has static type \(T\) given a variable type environment \(O\) and method signatures \(M\))

Type Checking Rules

The next step is to design type rules using SELF_TYPE for each language construct
Most of the rules remain the same except that \(\leq\) and \(lub\) are the new ones
Example:

\[ \frac{ \begin{array}{l} O(id) = T_0\\ O,M,C \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O,M,C \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Assign]} \]

What is Different?

Compare this to the old rule for dispatch

\[ \frac{ \begin{array}{l} O, M, C \vdash e_0 : T_0\\ O, M, C \vdash e_1 : T_1\\ ...\\ O, M, C \vdash e_n : T_n\\ M(T_0, f) = (T_1', \ldots, T_n', T_{n+1}')\\ T_{n+1}' \neq \texttt{SELF_TYPE}\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M, C \vdash e_0.f(e_1, \ldots, e_n) : \; T_{n+1}'} \]

The Big Rule for `SELF_TYPE`

If the return type of the method is SELF_TYPE, then the type of the dispatch is the type \(T_0\) of the dispatch expression \(e_0\):

\[ \frac{ \begin{array}{l} O, M, C \vdash e_0 : T_0\\ O, M, C \vdash e_1 : T_1\\ ...\\ O, M, C \vdash e_n : T_n\\ M(T_0, f) = (T_1', \ldots, T_n', \texttt{SELF_TYPE})\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M, C \vdash e_0.f(e_1, \ldots, e_n) : \; T_0} \]

What is Different?

Note this rule handles the Stock example
Formal parameters cannot be SELF_TYPE
Actual arguments can be SELF_TYPE
- The extended \(\leq\) relation handles this case
The type \(T_0\) of the dispatch expression could be SELF_TYPE
- Which class is used to find the declaration of \(f\)?
- Answer: it is safe to use the class \(C\) where the dispatch appears
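The two dispatch rules and the notes above can be combined in one function. The following Python sketch is an illustration under stated assumptions: `"SELF_TYPE"` stands for \(\texttt{SELF_TYPE}_C\), the enclosing class is passed as `c`, and the class table and helper names are hypothetical.

```python
SELF_TYPE = "SELF_TYPE"
PARENT = {"Object": None, "Count": "Object", "Stock": "Count", "Main": "Object"}
M = {("Count", "inc"): (SELF_TYPE,)}  # inc() : SELF_TYPE

def class_conforms(t1, t2):
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def conforms(t1, t2, c):
    """Extended <=, where SELF_TYPE means SELF_TYPE_c."""
    if t1 == SELF_TYPE and t2 == SELF_TYPE:
        return True
    if t2 == SELF_TYPE:
        return False
    return class_conforms(c if t1 == SELF_TYPE else t1, t2)

def lookup(cls, f):
    while cls is not None:
        if (cls, f) in M:
            return M[(cls, f)]
        cls = PARENT[cls]
    raise TypeError(f"no method {f}")

def check_dispatch(t0, f, arg_types, c):
    """Type of e0.f(e1, ..., en) occurring in class c, where e0 : t0."""
    # If e0 : SELF_TYPE_c, it is safe to look f up in the enclosing class c.
    receiver = c if t0 == SELF_TYPE else t0
    *formals, ret = lookup(receiver, f)
    if len(formals) != len(arg_types):
        raise TypeError("wrong number of arguments")
    for actual, formal in zip(arg_types, formals):
        if not conforms(actual, formal, c):   # actuals may be SELF_TYPE
            raise TypeError(f"{actual} does not conform to {formal}")
    # A declared return type of SELF_TYPE becomes the static type of e0.
    return t0 if ret == SELF_TYPE else ret

print(check_dispatch("Stock", "inc", [], "Main"))   # prints Stock
print(check_dispatch("Count", "inc", [], "Main"))   # prints Count
```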

Static Dispatch

Compare this to the old rule for static dispatch

\[ \frac{ \begin{array}{l} O, M, C \vdash e_0 : T_0\\ O, M, C \vdash e_1 : T_1\\ ...\\ O, M, C \vdash e_n : T_n\\ T_0 \leq T\\ M(T, f) = (T_1', \ldots, T_n', T_{n+1}')\\ T_{n+1}' \neq \texttt{SELF_TYPE}\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M, C \vdash e_0@T.f(e_1, \ldots, e_n) : \; T_{n+1}'} \]

Static Dispatch

If the return type of the method is SELF_TYPE, then we have:

\[ \frac{ \begin{array}{l} O, M, C \vdash e_0 : T_0\\ O, M, C \vdash e_1 : T_1\\ ...\\ O, M, C \vdash e_n : T_n\\ T_0 \leq T\\ M(T, f) = (T_1', \ldots, T_n', \texttt{SELF_TYPE})\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M, C \vdash e_0@T.f(e_1, \ldots, e_n) : \; T_0} \]

Static Dispatch

Why is this rule correct?
If we dispatch a method returning SELF_TYPE in class \(T\), don’t we get back a \(T\)?
Answer: No. SELF_TYPE is the type of the self parameter, which may be a subtype of the class in which the method body appears (not the class in which the call appears)
The static dispatch class cannot be SELF_TYPE
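A hedged Python sketch of static dispatch with SELF_TYPE, using the Count/Stock classes. All function names and the encoding of the class table are assumptions; `"SELF_TYPE"` stands for \(\texttt{SELF_TYPE}_C\) with the enclosing class passed as `c`.

```python
SELF_TYPE = "SELF_TYPE"
PARENT = {"Object": None, "Count": "Object", "Stock": "Count", "Main": "Object"}
M = {("Count", "inc"): (SELF_TYPE,)}  # inc() : SELF_TYPE

def class_conforms(t1, t2):
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def conforms(t1, t2, c):
    """Extended <=, where SELF_TYPE means SELF_TYPE_c."""
    if t1 == SELF_TYPE and t2 == SELF_TYPE:
        return True
    if t2 == SELF_TYPE:
        return False
    return class_conforms(c if t1 == SELF_TYPE else t1, t2)

def lookup(cls, f):
    while cls is not None:
        if (cls, f) in M:
            return M[(cls, f)]
        cls = PARENT[cls]
    raise TypeError(f"no method {f}")

def check_static_dispatch(t0, t, f, arg_types, c):
    """Type of e0@T.f(e1, ..., en) occurring in class c, where e0 : t0."""
    if t == SELF_TYPE:
        raise TypeError("the static dispatch class cannot be SELF_TYPE")
    if not conforms(t0, t, c):        # T0 <= T
        raise TypeError(f"{t0} does not conform to {t}")
    *formals, ret = lookup(t, f)      # the method is found via T
    if len(formals) != len(arg_types):
        raise TypeError("wrong number of arguments")
    for actual, formal in zip(arg_types, formals):
        if not conforms(actual, formal, c):
            raise TypeError(f"{actual} does not conform to {formal}")
    # The result is T0, not T: self inside f is still the object e0.
    return t0 if ret == SELF_TYPE else ret

print(check_static_dispatch("Stock", "Count", "inc", [], "Main"))  # prints Stock
```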

New Rules

There are two new rules using SELF_TYPE

\[\frac{}{O,M,C \vdash \texttt{self} : \texttt{SELF_TYPE}_C}\]

\[\frac{}{O,M,C \vdash \texttt{new SELF_TYPE} : \texttt{SELF_TYPE}_C}\]
There are a number of other places where SELF_TYPE is used

Where is `SELF_TYPE` Illegal in Cool?

In m(x : \(T\) ) : \(T'\), only \(T'\) can be SELF_TYPE

Example: what could go wrong if \(T\) were SELF_TYPE?

class A { comp(x : SELF_TYPE) : Bool {...}; };
class B inherits A {
    b() : Int {...};
    comp(y : SELF_TYPE) : Bool {... y.b() ...};
};
...
let x : A <- new B in ... x.comp(new A); ...
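The failure can be made concrete with a Python analogy. Python has no static types, so this sketch only mimics the run-time behavior of the Cool program above; the method bodies are invented for illustration.

```python
class A:
    def comp(self, x):
        return True

class B(A):
    def b(self):
        return 0

    def comp(self, y):
        # B's body assumes y is at least a B, as SELF_TYPE would promise.
        return y.b() == 0

x = B()                # static type A in the Cool example, dynamic type B
try:
    x.comp(A())        # dispatches to B.comp, which calls b() on an A
    failed = False
except AttributeError:
    failed = True

print(failed)          # prints True: an A object has no method b
```

If SELF_TYPE were allowed as a formal parameter type, the call would type check against the signature in A, since the argument new A conforms to \(\texttt{SELF_TYPE}_A\) when read as A, yet the body run at run time is the one in B.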

Summary of `SELF_TYPE`

The extended \(\leq\) and \(lub\) operations can do a lot of the work; implement them to handle SELF_TYPE
SELF_TYPE can be used only in a few places; be sure it is not used anywhere else
A use of SELF_TYPE always refers to any subtype of the current class
- The exception is the type checking of dispatch
- SELF_TYPE as the return type of an invoked method might have nothing to do with the current class

Why Cover `SELF_TYPE`?

SELF_TYPE is a research idea; it adds more expressiveness to the type system without allowing any “bad” programs
SELF_TYPE itself is not so important (except for the course project)
In practice, there should be a balance between the complexity of the type system and its expressiveness

Type Systems

The rules in this lecture were Cool-specific
- Other languages have very different rules
- We will survey a few more type systems later as time permits
General Themes
- Type rules are defined on the structure of expressions
- Types of variables are modeled by an environment
Type systems are an interplay between flexibility and safety
