More Type Checking
Assignment
Review: what are \(\vdash\), \(O\), and \(\leq\)?
\[ \frac{ \begin{array}{l} O(id) = T_0\\ O \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Assign]} \]
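A minimal Python sketch of how a checker might apply [Assign]. The classes `A` and `B`, the `PARENT` table, and the helpers `conforms` and `check_assign` are assumptions made for illustration and are not part of the lecture.

```python
# Hypothetical class hierarchy: child -> parent, with Object as the root.
PARENT = {"Object": None, "A": "Object", "B": "A"}

def conforms(t1, t2):
    """Return True if t1 <= t2, i.e. t2 is t1 itself or an ancestor of t1."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def check_assign(O, ident, t1):
    """[Assign]: O(id) = T0, e1 already has type T1, and T1 <= T0 must hold.
    The assignment expression then has type T1."""
    t0 = O[ident]
    if not conforms(t1, t0):
        raise TypeError(f"{t1} does not conform to {t0}")
    return t1

O = {"x": "A"}                       # x is declared with type A
print(check_assign(O, "x", "B"))     # B
```

Note that the result is the type of the right-hand side, not the declared type of the identifier.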
Initialized Attributes
Let \(O_c(x) = T\) for all attributes \(x : T\) in class \(C\)
- \(O_c\) represents the class-wide scope
Attribute initialization is similar to `let`, except for the scope of names
\[ \frac{ \begin{array}{l} O_c(id) = T_0\\ O_c \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O_c \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Attr-Init]} \]
Method Dispatch
There is a problem with type checking method calls:
\[ \frac{ \begin{array}{l} O \vdash e_0 : T_0\\ O \vdash e_1 : T_1\\ ...\\ O \vdash e_n : T_n\\ \end{array}} {O \vdash e_0.f(e_1, \ldots, e_n) : \; ?}\text{[Dispatch]} \]
We need information about the formal parameters and return type of \(f\)
Notes on Dispatch
In Cool, method and object identifiers live in different name spaces
- A method `foo` and an object `foo` can coexist in the same scope
In the type rules, this is reflected by a separate mapping \(M\) for method signatures:
\[M(C, f) = (T_1, \ldots, T_n, T_{ret})\]
which means in class \(C\) there is a method \(f\) where
\[f(x_1 : T_1, \ldots, x_n : T_n) : T_{ret}\]
An Extended Typing Judgment
Now we have two environments: \(O\) and \(M\)
The form of the typing judgment is
\[O, M \vdash e : T\]
which can be read as: “with the assumption that the object identifiers have types as given by \(O\) and the method identifiers have signatures as given by \(M\), the expression \(e\) has type \(T\)”
The Method Environment
The method environment must be added to all rules
In most cases, \(M\) is passed down but not actually used
Only the dispatch rule uses \(M\)
The Dispatch Rule Revisited
Steps: check the receiver object, check the actual arguments, then look up the formal argument types \(T_i'\)
\[ \frac{ \begin{array}{l} O, M \vdash e_0 : T_0\\ O, M \vdash e_1 : T_1\\ ...\\ O, M \vdash e_n : T_n\\ M(T_0, f) = (T_1', \ldots, T_n', T_{n+1}')\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M \vdash e_0.f(e_1, \ldots, e_n) : \; T_{n+1}'}\text{[Dispatch]} \]
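The three steps of [Dispatch] can be sketched in Python as follows. This is only an illustration: the classes, the method `scale`, and the helper names are hypothetical, and the sketch assumes that \(M\) already contains an entry for every inherited method.

```python
# Hypothetical class hierarchy: child -> parent.
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}

def conforms(t1, t2):
    """Return True if t1 <= t2."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

# M(C, f) = (T1', ..., Tn', Tret): formal types followed by the return type.
# Assumption: inherited methods are listed explicitly for each subclass.
M = {("A", "scale"): ("Int", "A"), ("B", "scale"): ("Int", "A")}

def check_dispatch(M, t0, f, arg_types):
    """[Dispatch]: t0 is the type of e0, arg_types are the types of e1..en."""
    *formals, ret = M[(t0, f)]
    if len(formals) != len(arg_types):
        raise TypeError("wrong number of arguments")
    for actual, formal in zip(arg_types, formals):
        if not conforms(actual, formal):
            raise TypeError(f"{actual} does not conform to {formal}")
    return ret

print(check_dispatch(M, "B", "scale", ["Int"]))  # A
```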
Static Dispatch
Static dispatch is a variation of normal dispatch
The method is found in the class explicitly named by the programmer (not via \(e_0\))
The inferred type of the dispatch expression must conform to the specified type
Static Dispatch (Continued)
\[ \frac{ \begin{array}{l} O, M \vdash e_0 : T_0\\ O, M \vdash e_1 : T_1\\ ...\\ O, M \vdash e_n : T_n\\ T_0 \leq T\\ M(T, f) = (T_1', \ldots, T_n', T_{n+1}')\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M \vdash e_0@T.f(e_1, \ldots, e_n) : \; T_{n+1}'}\text{[Static Dispatch]} \]
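A hedged Python sketch of [Static Dispatch]. Compared with ordinary dispatch, it adds the check \(T_0 \leq T\) and looks the method up in the named class \(T\). The hierarchy, the method `scale`, and the function names are hypothetical.

```python
# Hypothetical class hierarchy: child -> parent.
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}

def conforms(t1, t2):
    """Return True if t1 <= t2."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

# M(C, f) = (T1', ..., Tn', Tret); only A declares scale in this sketch.
M = {("A", "scale"): ("Int", "A")}

def check_static_dispatch(M, t0, t, f, arg_types):
    """[Static Dispatch] for e0@T.f(e1, ..., en), where t0 is the type of e0."""
    if not conforms(t0, t):                  # T0 <= T
        raise TypeError(f"{t0} does not conform to {t}")
    *formals, ret = M[(t, f)]                # look up f in the named class T
    if len(formals) != len(arg_types):
        raise TypeError("wrong number of arguments")
    for actual, formal in zip(arg_types, formals):
        if not conforms(actual, formal):     # Ti <= Ti'
            raise TypeError(f"{actual} does not conform to {formal}")
    return ret

print(check_static_dispatch(M, "B", "A", "scale", ["Int"]))  # A
```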
Flexibility vs. Soundness
Recall that type systems have two conflicting goals:
- Give flexibility to the programmer
- Prevent valid programs from “going wrong”
An active line of research is inventing more flexible type systems while preserving soundness
Dynamic and Static Types
The dynamic type of an object is the class `C` that is used in the `new C` expression that created it
- A run-time notion
- Even languages that are not statically typed have the notion of dynamic type
The static type of an expression is a notion that captures all possible dynamic types the expression could take
- A compile-time notion
Recall: Soundness
Soundness theorem for the Cool type system: \[\forall E. dynamic\_type(E) \leq static\_type(E)\]
Why is this correct?
- For \(E\), compiler uses \(static\_type(E)\)
- All operations that can be used on an object of type \(C\) can also be used on an object of type \(C' \leq C\)
- Subclasses can only add attributes or methods
- Methods can be redefined but with the same types
An Example
Class `Count` incorporates a counter; the `inc` method works for any subclass

    class Count {
        i : Int <- 0;
        inc () : Count {
            {
                i <- i + 1;
                self;
            }
        };
    };

But, there is disaster lurking in the type system
Continuing Example
Consider a subclass `Stock` of `Count`

    class Stock inherits Count {
        name() : String {...};   -- name of item
    };

And the following use of `Stock`

    class Main {
        a : Stock <- (new Stock).inc();   -- Type checking error
        ... a.name() ...
    };

Post-Mortem
`(new Stock).inc()` has dynamic type `Stock`
So it is legitimate to write `a : Stock <- (new Stock).inc()`
But `(new Stock).inc()` has static type `Count`
The type checker “loses” type information
This makes inheriting `inc` useless
- That is, we must redefine `inc` for each of the subclasses, with a specialized return type
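The loss of information can be reproduced with a small Python model of the checker. This is a sketch under assumptions: the `PARENT` and `M` tables encode the `Count`/`Stock` example by hand, and the helper names are hypothetical.

```python
# Class hierarchy of the example: child -> parent.
PARENT = {"Object": None, "Count": "Object", "Stock": "Count"}

def conforms(t1, t2):
    """Return True if t1 <= t2."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

# inc is declared in Count with return type Count; Stock inherits it unchanged.
M = {("Count", "inc"): ("Count",), ("Stock", "inc"): ("Count",)}

def dispatch_type(t0, f):
    """Static type of e0.f() for a zero-argument method: the declared return type."""
    return M[(t0, f)][-1]

static = dispatch_type("Stock", "inc")   # static type of (new Stock).inc()
print(static)                            # Count
print(conforms(static, "Stock"))         # False
```

Because `Count` does not conform to `Stock`, the initializer of `a : Stock` is rejected even though the object is a `Stock` at run time.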
SELF_TYPE to the Rescue
We will extend the type system
Insight: `inc` returns `self`
- Therefore the return value has the same type as `self`
- Which could be `Count` or any subtype of `Count`
- In the case of `(new Stock).inc()` the type is `Stock`
We introduce the keyword `SELF_TYPE` to use for the return value of such functions
- We will also modify the typing rules to handle `SELF_TYPE`
SELF_TYPE to the Rescue (Continued)
`SELF_TYPE` allows the return type of `inc` to change when `inc` is inherited
Modify the declaration of `inc` to read

    inc() : SELF_TYPE { ... }

The type checker can now prove:
- \(O, M \vdash\) `(new Count).inc() : Count`
- \(O, M \vdash\) `(new Stock).inc() : Stock`
The program from before is now well typed
SELF_TYPE as a Tool
`SELF_TYPE` is not a dynamic type
`SELF_TYPE` is a static type
- It helps the type checker to keep better track of types
- It enables the type checker to accept more correct programs
- In short, having `SELF_TYPE` increases the expressive power of the type system
SELF_TYPE and Dynamic Types
What can the dynamic type of the object returned by `inc` be?
Answer: whatever the type of `self` could be
Example: the dynamic type could be `Count` or any subtype of `Count`

    class A inherits Count { };
    class B inherits Count { };
    class C inherits Count { };

SELF_TYPE and Dynamic Types (Continued)
In general, if `SELF_TYPE` appears textually in the class \(C\) as the declared type of \(E\) then it denotes the dynamic type of the `self` expression:
\[dynamic\_type(E) = dynamic\_type(\texttt{self}) \leq C\]
Note: the meaning of `SELF_TYPE` depends on where it appears
- We write \(\texttt{SELF_TYPE}_C\) to refer to an occurrence of `SELF_TYPE` in the body of \(C\)
Type Checking
This suggests a typing rule:
\[\texttt{SELF_TYPE}_C \leq C\]
This rule has an important consequence:
- In type checking it is always safe to replace \(\texttt{SELF_TYPE}_C\) with \(C\)
This suggests one way to handle `SELF_TYPE`: replace all \(\texttt{SELF_TYPE}_C\) with \(C\)
This would be correct but it is like not having `SELF_TYPE` at all (whoops!)
Operations on SELF_TYPE
Recall the operations on types:
- \(T_1 \leq T_2\): \(T_1\) is a subtype of \(T_2\)
- \(lub(T_1, T_2)\): the least upper bound of \(T_1\) and \(T_2\)
We must extend these operations to handle `SELF_TYPE`
Extending \(\leq\)
Let \(T\) and \(T'\) be any types except `SELF_TYPE`. There are four cases in the definition of \(\leq\)
\(\texttt{SELF_TYPE}_C \leq T\) if \(C \leq T\)
- \(\texttt{SELF_TYPE}_C\) can be any subtype of \(C\)
- This includes \(C\) itself
- Thus this is the most flexible rule we can allow
\(\texttt{SELF_TYPE}_C \leq \texttt{SELF_TYPE}_C\)
- \(\texttt{SELF_TYPE}_C\) is the type of the `self` expression
- In Cool, we never need to compare `SELF_TYPE`s coming from different classes
Extending \(\leq\) (Continued)
\(T \leq \texttt{SELF_TYPE}_C\) is always false
- Note: \(\texttt{SELF_TYPE}_C\) can denote any subtype of \(C\)
\(T \leq T'\) (according to the rules from before)
Based on these rules, we can extend \(lub\)
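The four cases of the extended \(\leq\) can be written down directly. The sketch below is one possible encoding in Python, not the course's reference implementation: \(\texttt{SELF_TYPE}_C\) is represented by the hypothetical tuple `("SELF_TYPE", C)`, and the class table is the `Count`/`Stock` example.

```python
# Class hierarchy of the example: child -> parent.
PARENT = {"Object": None, "Count": "Object", "Stock": "Count"}

def self_type(c):
    """Hypothetical encoding of SELF_TYPE_C as a tagged tuple."""
    return ("SELF_TYPE", c)

def is_self_type(t):
    return isinstance(t, tuple) and t[0] == "SELF_TYPE"

def class_conforms(t1, t2):
    """The old <= on ordinary class names."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def conforms(t1, t2):
    """Extended <= covering the four cases."""
    if is_self_type(t1) and is_self_type(t2):
        # SELF_TYPE_C <= SELF_TYPE_C. Cool never compares different classes
        # here; answering False for them is a choice made by this sketch.
        return t1 == t2
    if is_self_type(t1):
        return class_conforms(t1[1], t2)    # SELF_TYPE_C <= T if C <= T
    if is_self_type(t2):
        return False                         # T <= SELF_TYPE_C is always false
    return class_conforms(t1, t2)            # the rules from before

print(conforms(self_type("Stock"), "Count"))   # True
print(conforms("Stock", self_type("Stock")))   # False
```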
Extending \(lub(T, T')\)
Let \(T\) and \(T'\) be any types except `SELF_TYPE`. Again, there are four cases:
- \(lub(\texttt{SELF_TYPE}_C, \texttt{SELF_TYPE}_C) = \texttt{SELF_TYPE}_C\)
- \(lub(\texttt{SELF_TYPE}_C, T) = lub(C, T)\)
- \(lub(T, \texttt{SELF_TYPE}_C) = lub(T, C)\)
- \(lub(T, T')\) defined as before
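A Python sketch of the extended \(lub\), following the four cases above. As an assumption of this sketch, \(\texttt{SELF_TYPE}_C\) is encoded as the tuple `("SELF_TYPE", C)`; the class table and the helper names are hypothetical.

```python
# Hypothetical class hierarchy: child -> parent.
PARENT = {"Object": None, "Count": "Object", "Stock": "Count", "Main": "Object"}

def ancestors(t):
    """Chain of classes from t up to Object, inclusive."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain

def class_lub(t1, t2):
    """Old lub: the first ancestor of t1 that is also an ancestor of t2."""
    seen = set(ancestors(t2))
    for t in ancestors(t1):
        if t in seen:
            return t
    raise ValueError("no common ancestor")  # not reached when Object is the root

def is_self_type(t):
    return isinstance(t, tuple) and t[0] == "SELF_TYPE"

def lub(t1, t2):
    """Extended lub covering the four cases."""
    if is_self_type(t1) and is_self_type(t2):
        return t1                         # assumes both refer to the same class C
    if is_self_type(t1):
        return class_lub(t1[1], t2)       # lub(SELF_TYPE_C, T) = lub(C, T)
    if is_self_type(t2):
        return class_lub(t1, t2[1])       # lub(T, SELF_TYPE_C) = lub(T, C)
    return class_lub(t1, t2)              # defined as before

print(lub(("SELF_TYPE", "Stock"), "Count"))  # Count
print(lub("Main", ("SELF_TYPE", "Stock")))   # Object
```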
Where Can SELF_TYPE Appear in Cool?
The parser checks that `SELF_TYPE` appears only where a type is expected
But `SELF_TYPE` is not allowed everywhere a type can appear:
`class` \(T\) `inherits` \(T'\) `{...}`
- \(T, T'\) cannot be `SELF_TYPE` because `SELF_TYPE` is never a dynamic type
`x :` \(T\)
- \(T\) can be `SELF_TYPE`
- An attribute whose type is \(\texttt{SELF_TYPE}_C\)
Where Can SELF_TYPE Appear in Cool?
`let x :` \(T\) `in` \(E\)
- \(T\) can be `SELF_TYPE`
- `x` has type \(\texttt{SELF_TYPE}_C\)
`new` \(T\)
- \(T\) can be `SELF_TYPE`
- Creates an object of the same type as `self`
`m@`\(T(E_1, \ldots, E_n)\)
- \(T\) cannot be `SELF_TYPE`
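One way a checker might enforce these restrictions is a small lookup table. The sketch below is only an illustration: the position labels and the function name are hypothetical, and the table covers just the constructs listed above.

```python
# Which syntactic positions accept SELF_TYPE, per the cases listed above.
# The keys are hypothetical labels chosen for this sketch.
SELF_TYPE_ALLOWED = {
    "class_name": False,             # class T inherits T' {...}
    "parent_name": False,            # class T inherits T' {...}
    "attribute_type": True,          # x : T
    "let_type": True,                # let x : T in E
    "new_type": True,                # new T
    "static_dispatch_type": False,   # m@T(E1, ..., En)
}

def check_type_position(type_name, position):
    """Raise TypeError if SELF_TYPE is written where it is not allowed."""
    if type_name == "SELF_TYPE" and not SELF_TYPE_ALLOWED[position]:
        raise TypeError(f"SELF_TYPE cannot be used as {position}")
    return type_name

print(check_type_position("SELF_TYPE", "let_type"))  # SELF_TYPE
```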
Typing Rules for SELF_TYPE
Since occurrences of `SELF_TYPE` depend on the enclosing class we need to carry more context during type checking
New form of the typing judgment:
\[O, M, C \vdash e : T\]
(an expression \(e\) occurring in the body of \(C\) has static type \(T\) given a variable type environment \(O\) and method signatures \(M\))
Type Checking Rules
The next step is to design type rules using `SELF_TYPE` for each language construct
Most of the rules remain the same except that \(\leq\) and \(lub\) are the new ones
Example:
\[ \frac{ \begin{array}{l} O(id) = T_0\\ O,M,C \vdash e_1 : T_1\\ T_1 \leq T_0 \end{array}} {O,M,C \vdash id \; \texttt{<-} \; e_1 : T_1}\text{[Assign]} \]
What is Different?
Compare this to the old rule for dispatch
\[ \frac{ \begin{array}{l} O, M,C \vdash e_0 : T_0\\ O, M,C \vdash e_1 : T_1\\ ...\\ O, M,C \vdash e_n : T_n\\ M(T_0, f) = (T_1', \ldots, T_n', T_{n+1}')\\ T_{n+1}' \neq \texttt{SELF_TYPE}\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M,C \vdash e_0.f(e_1, \ldots, e_n) : \; T_{n+1}'} \]
The Big Rule for SELF_TYPE
If the return type of the method is `SELF_TYPE`, then the type of the dispatch is the type \(T_0\) of the dispatch expression \(e_0\):
\[ \frac{ \begin{array}{l} O, M,C \vdash e_0 : T_0\\ O, M,C \vdash e_1 : T_1\\ ...\\ O, M,C \vdash e_n : T_n\\ M(T_0, f) = (T_1', \ldots, T_n', \texttt{SELF_TYPE})\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M,C \vdash e_0.f(e_1, \ldots, e_n) : \; T_0} \]
What is Different?
- Note this rule handles the `Stock` example
- Formal parameters cannot be `SELF_TYPE`
Actual arguments can be `SELF_TYPE`
- The extended \(\leq\) relation handles this case
The type \(T_0\) of the dispatch expression could be `SELF_TYPE`
- Which class is used to find the declaration of \(f\)?
- Answer: it is safe to use the class where the dispatch appears
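A Python sketch of dispatch with a `SELF_TYPE` return type, applied to the `Stock` example. The encoding is an assumption of this sketch: a declared return type of `SELF_TYPE` is the string `"SELF_TYPE"`, a static type \(\texttt{SELF_TYPE}_C\) is the tuple `("SELF_TYPE", C)`, and argument types are taken to be ordinary class names to keep the example short.

```python
# Class hierarchy of the example: child -> parent.
PARENT = {"Object": None, "Count": "Object", "Stock": "Count"}
SELF_TYPE = "SELF_TYPE"   # marker for a declared SELF_TYPE return type

# inc now returns SELF_TYPE; Stock inherits the declaration unchanged.
M = {("Count", "inc"): (SELF_TYPE,), ("Stock", "inc"): (SELF_TYPE,)}

def conforms(t1, t2):
    """<= on ordinary class names (arguments only, in this sketch)."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = PARENT[t1]
    return False

def check_dispatch(t0, f, arg_types):
    """Type of e0.f(e1, ..., en), where t0 is the static type of e0."""
    # If T0 is SELF_TYPE_C, look f up in C, the class where the dispatch appears.
    lookup_class = t0[1] if isinstance(t0, tuple) else t0
    *formals, ret = M[(lookup_class, f)]
    if len(formals) != len(arg_types):
        raise TypeError("wrong number of arguments")
    for actual, formal in zip(arg_types, formals):
        if not conforms(actual, formal):
            raise TypeError(f"{actual} does not conform to {formal}")
    # A SELF_TYPE return type yields T0; any other return type is used as declared.
    return t0 if ret == SELF_TYPE else ret

print(check_dispatch("Count", "inc", []))  # Count
print(check_dispatch("Stock", "inc", []))  # Stock
```

With this result type, the initializer `a : Stock <- (new Stock).inc()` passes the conformance check.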
Static Dispatch
Compare this to the old rule for static dispatch
\[ \frac{ \begin{array}{l} O, M,C \vdash e_0 : T_0\\ O, M,C \vdash e_1 : T_1\\ ...\\ O, M,C \vdash e_n : T_n\\ T_0 \leq T\\ M(T, f) = (T_1', \ldots, T_n', T_{n+1}')\\ T_{n+1}' \neq \texttt{SELF_TYPE}\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M,C \vdash e_0@T.f(e_1, \ldots, e_n) : \; T_{n+1}'} \]
Static Dispatch
If the return type of the method is `SELF_TYPE`, then we have:
\[ \frac{ \begin{array}{l} O, M,C \vdash e_0 : T_0\\ O, M,C \vdash e_1 : T_1\\ ...\\ O, M,C \vdash e_n : T_n\\ T_0 \leq T\\ M(T, f) = (T_1', \ldots, T_n', \texttt{SELF_TYPE})\\ T_i \leq T_i' \; \text{for} \; 1 \leq i \leq n \end{array}} {O, M,C \vdash e_0@T.f(e_1, \ldots, e_n) : \; T_0} \]
Static Dispatch
Why is this rule correct?
If we dispatch a method returning `SELF_TYPE` in class \(T\), don’t we get back a \(T\)?
Answer: No. `SELF_TYPE` is the type of the `self` parameter, which may be a subtype of the class in which the method body appears (not the class in which the call appears)
The static dispatch class cannot be `SELF_TYPE`
New Rules
There are two new rules using `SELF_TYPE`
\[\frac{}{O,M,C \vdash \texttt{self} : \texttt{SELF_TYPE}_C}\]
\[\frac{}{O,M,C \vdash \texttt{new SELF_TYPE} : \texttt{SELF_TYPE}_C}\]
There are a number of other places where `SELF_TYPE` is used
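The two new rules are axioms that depend only on the enclosing class \(C\). A minimal Python sketch, assuming \(\texttt{SELF_TYPE}_C\) is encoded as the hypothetical tuple `("SELF_TYPE", C)` and that `new T` for an ordinary class `T` has type `T` as in the standard Cool rule for `new`:

```python
def self_type(c):
    """Hypothetical encoding of SELF_TYPE_C as a tagged tuple."""
    return ("SELF_TYPE", c)

def type_of_self(current_class):
    """O, M, C |- self : SELF_TYPE_C"""
    return self_type(current_class)

def type_of_new(type_name, current_class):
    """O, M, C |- new SELF_TYPE : SELF_TYPE_C.
    For an ordinary class name T, new T is assumed to have type T."""
    if type_name == "SELF_TYPE":
        return self_type(current_class)
    return type_name

print(type_of_self("Count"))              # ('SELF_TYPE', 'Count')
print(type_of_new("SELF_TYPE", "Count"))  # ('SELF_TYPE', 'Count')
print(type_of_new("Stock", "Count"))      # Stock
```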
Where is SELF_TYPE Illegal in Cool?
In `m(x :` \(T\)`) :` \(T'\), only \(T'\) can be `SELF_TYPE`
Example: what could go wrong if \(T\) were `SELF_TYPE`?

    class A {
        comp(x : SELF_TYPE) : Bool {...};
    };
    class B inherits A {
        b() : Int {...};
        comp(y : SELF_TYPE) : Bool {... y.b() ...};
    };
    ...
    let x : A <- new B in ... x.comp(new A); ...

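The run-time failure can be mimicked in Python, which performs no static checking. This is only an analogy: the method bodies are invented for the sketch (the Cool example elides them), and Python's `AttributeError` stands in for the "method not found" error that the type system is supposed to rule out.

```python
class A:
    def comp(self, x):
        # Hypothetical body; the Cool example leaves it unspecified.
        return True

class B(A):
    def b(self):
        return 0

    def comp(self, y):
        # B's override assumes y is at least a B and calls b() on it.
        return y.b() == 0

x = B()                  # static type A in the Cool example, dynamic type B
try:
    x.comp(A())          # dispatches to B.comp, but the argument has no method b
    went_wrong = False
except AttributeError:
    went_wrong = True

print(went_wrong)        # True
```

The call looks acceptable when checked against `A`, where the argument `new A` matches the receiver's static type, yet the method that actually runs is the one from `B`.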
Summary of SELF_TYPE
The extended \(\leq\) and \(lub\) operations can do a lot of the work; implement them to handle `SELF_TYPE`
`SELF_TYPE` can be used only in a few places; be sure it is not used anywhere else
A use of `SELF_TYPE` always refers to any subtype of the current class
- The exception is the type checking of dispatch
- `SELF_TYPE` as the return type in an invoked method might have nothing to do with the current class
Why Cover SELF_TYPE?
`SELF_TYPE` is a research idea; it adds more expressiveness to the type system without allowing any “bad” programs
`SELF_TYPE` itself is not so important (except for the course project)
In practice, there should be a balance between the complexity of the type system and its expressiveness
Type Systems
The rules in this lecture were Cool-specific
- Other languages have very different rules
- We will survey a few more type systems later as time permits
General Themes
- Type rules are defined on the structure of expressions
- Types of variables are modeled by an environment
Types are an interplay between flexibility and safety