Operational Semantics
Outline
Operational semantics is a precise way of specifying how to evaluate a program
A formal semantics tells you what each expression means
Meaning depends on context: a variable environment will map variables to memory locations and a store will map memory locations to values
Motivation
The meaning of an expression is what happens when it is evaluated
The definition of a programming language:
The tokens \(\Rightarrow\) lexical analysis
The grammar \(\Rightarrow\) syntactic analysis
The typing rules \(\Rightarrow\) semantic analysis
The evaluation rules \(\Rightarrow\) interpretation
Assembly Language Description of Semantics
Assembly language descriptions of language implementation have too many irrelevant details
Which way the stack grows
How integers are represented on a particular machine
The particular instruction set of the architecture
We need a complete but not overly restrictive specification
Programming Language Semantics
There are many ways to specify programming language semantics
They are all equivalent, but some are more suitable to various tasks than others
Operational semantics
Describes the evaluation of programs on an abstract machine
Most useful for specifying implementations
Other Kinds of Semantics
Denotational semantics
The meaning of a program is expressed as a mathematical object
Elegant but quite complicated
Axiomatic semantics
Useful for checking that programs satisfy certain correctness properties
The foundation of many program verification systems
Introduction to Operational Semantics
Once again we introduce a formal notation using logical rules of inference
Recall the typing judgement \[Context \vdash e : T\] (in the given \(Context\), expression \(e\) has type \(T\))
We try something similar for evaluation \[Context \vdash e : v\] (in the given \(Context\), expression \(e\) evaluates to value \(v\))
Example Operational Semantics Inference Rule
\[ \frac{\begin{array}{l} Context \; \vdash e_1 : 5 \\ Context \; \vdash e_2 : 7 \end{array} } {Context \; \vdash e_1 + e_2 : 12} \]
In general, the result of evaluating an expression depends on the result of evaluating its subexpressions
The logical rules specify everything that is needed to evaluate an expression
What Contexts are Needed?
Contexts are needed to handle variables
Consider the evaluation of
y <- x + 1
We need to keep track of values of variables
We need to allow variables to change their values during evaluation
We track variables and their values with:
An environment: tells us the memory address at which the value of a variable is stored
A store: tells us the contents of a memory location
Variable Environments
A variable environment is a map from variable names to locations
It tells us in which memory location the value of a variable is stored; locations are memory addresses
Environment tracks in-scope variables only
Example environment: \[E = [a : l_1, b : l_2 ]\]
To look up a variable \(a\) in environment \(E\), we write \(E(a)\)
Stores
A store maps memory locations to values
Example store: \[S = [l_1 \rightarrow 5, l_2 \rightarrow 7 ]\]
To look up the contents of a location \(l_1\) in store \(S\), we write \(S(l_1)\)
To perform an assignment of 23 to location \(l_1\), we write \(S[23/l_1]\); this denotes a new store \(S'\) such that \(S'(l_1) = 23\) and \(S'(l) = S(l)\) if \(l \neq l_1\)
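A minimal Python sketch of the two maps and of the update \(S[23/l_1]\). Representing locations as strings and both maps as dictionaries is an assumption of this sketch, and `update` is a hypothetical helper name.

```python
# Environment: variable name -> location; store: location -> value.
# Locations are modelled as plain strings (an assumption of this sketch).
E = {"a": "l1", "b": "l2"}
S = {"l1": 5, "l2": 7}

def update(store, loc, value):
    """Return the new store S[value/loc]; the original store is left untouched."""
    new_store = dict(store)
    new_store[loc] = value
    return new_store

S2 = update(S, "l1", 23)

print(S[E["a"]])   # S(E(a)): the double lookup, in the old store
print(S2["l1"])    # the updated location
print(S2["l2"])    # every other location agrees with S
```

Building a new dictionary instead of mutating `S` mirrors the notation: \(S[23/l_1]\) denotes a new store \(S'\), and \(S\) itself is unchanged.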
Cool Values
All values in Cool are objects
To denote a Cool object we use the notation \(X(a_1 = l_1, \ldots, a_n = l_n)\) where
- \(X\) is the dynamic type of the object (type tag)
- \(a_i\) are the attributes (including those inherited)
- \(l_i\) are the locations where the values of the attributes are stored
Cool Values (Continued)
Special cases (without named attributes)
Int(5)
Bool(true)
String(4, "Cool")
There is a special value void that is a member of all types
- No operations can be performed on it, except for the test isvoid
- Concrete implementations might use NULL here
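One possible way to model these values in Python, as a sketch only. The class and field names (`Obj`, `tag`, `attrs`, `length`, `text`) are hypothetical, and using a dedicated `Void` class for void is a choice of this sketch, not something the notation prescribes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Bool:
    value: bool

@dataclass(frozen=True)
class String:
    length: int
    text: str

@dataclass(frozen=True)
class Void:
    pass

@dataclass(frozen=True)
class Obj:
    tag: str       # X, the dynamic type of the object
    attrs: tuple   # pairs (a_i, l_i): attribute name and its location

def isvoid(v):
    """The only test allowed on void."""
    return Bool(isinstance(v, Void))

five = Int(5)
cool = String(4, "Cool")
point = Obj("Point", (("x", "l1"), ("y", "l2")))
```

Note that `Obj` stores locations, not attribute values; the values live in the store.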
Operational Rules
The evaluation judgement is \[so, E, S \vdash e : v, S'\] read:
Given \(so\) the current value of self
And \(E\) the current environment
And \(S\) the current store
If the evaluation of \(e\) terminates, then
The returned value is \(v\)
And the new store is \(S'\)
Notes
The “result” of evaluating an expression is both a value and also a new store
Changes to the store model side-effects, that is, assignments to mutable variables
The variable environment does not change
The operational semantics allows for non-terminating evaluations
We define one rule for each kind of expression
Example Operational Semantics for Base Values
\[\frac{}{so, E, S \vdash true : Bool(true), S}\]
\[\frac{}{so, E, S \vdash false : Bool(false), S}\]
\[\frac{i \; \text{is an integer literal}}{so, E, S \vdash i : Int(i), S}\]
\[\frac{\begin{array}{l} s \; \text{is a string literal} \\ n \; \text{is the length of} \; s \end{array}}{so, E, S \vdash s : String(n, s), S}\]
Note: no side effects in these cases
Bool, Int, and String act here as constructors that wrap a primitive value in the corresponding Cool object
Example Operational Semantics of Variable References
\[\frac{\begin{array}{l} E(id) = l_{id} \\ S(l_{id}) = v \end{array} } {so, E, S \vdash id : v, S}\]
Note the double lookup of variables
First from name to location (compile time)
Then from location to value (run time)
The store does not change
Example Operational Semantics of Assignment
\[\frac{\begin{array}{l} so, E, S \vdash e : v, S_1 \\ E(id) = l_{id} \\ S_2 = S_1[v/l_{id}] \end{array} } {so, E, S \vdash id \; \texttt{<-} \; e : v, S_2}\]
A three-step process
Evaluate the right-hand side to obtain a value \(v\) and a new store \(S_1\)
Fetch the location of the assigned variable
The result is the value \(v\) and an updated store \(S_2\) in which that location holds \(v\)
The environment does not change
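The variable-reference and assignment rules can be sketched as a small Python evaluator. Assumptions of this sketch: expressions are tagged tuples, values are plain Python integers instead of Cool objects, \(so\) is omitted because neither rule uses it, and `eval_expr` is a hypothetical name.

```python
def eval_expr(expr, E, S):
    """Return (value, new_store); expressions are tagged tuples."""
    kind = expr[0]
    if kind == "int":                      # literal: no side effects
        return expr[1], S
    if kind == "var":                      # E(id) = l_id, then S(l_id) = v
        return S[E[expr[1]]], S
    if kind == "plus":
        v1, S1 = eval_expr(expr[1], E, S)
        v2, S2 = eval_expr(expr[2], E, S1)
        return v1 + v2, S2
    if kind == "assign":                   # id <- e
        _, name, rhs = expr
        v, S1 = eval_expr(rhs, E, S)       # step 1: evaluate the right-hand side
        l_id = E[name]                     # step 2: fetch the location
        S2 = {**S1, l_id: v}               # step 3: S2 = S1[v/l_id]
        return v, S2
    raise ValueError(f"unknown expression: {kind}")

# y <- x + 1, with x stored at l1 and y at l2
E = {"x": "l1", "y": "l2"}
S = {"l1": 4, "l2": 0}
value, S_after = eval_expr(("assign", "y", ("plus", ("var", "x"), ("int", 1))), E, S)
```

The environment `E` is only read, never rebuilt, which matches the note that assignment leaves the environment unchanged.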
Example Operational Semantics of Conditionals
\[\frac{\begin{array}{l} so, E, S \vdash e_1 : Bool(true), S_1 \\ so, E, S_1 \vdash e_2 : v, S_2 \end{array} } {so, E, S \vdash \texttt{if} \; e_1 \; \texttt{then} \; e_2 \; \texttt{else} \; e_3 : v, S_2}\]
The “threading” of the store enforces an evaluation sequence
\(e_1\) must be evaluated first to produce \(S_1\)
Then \(e_2\) can be evaluated
The result of evaluating \(e_1\) is a boolean
The typing rules ensure this fact
There is another similar rule for \(Bool(false)\)
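For reference, the \(Bool(false)\) case can be written by analogy with the rule above, assuming it evaluates \(e_3\) in the store \(S_1\) produced by the condition:

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : Bool(false), S_1 \\ so, E, S_1 \vdash e_3 : v, S_2 \end{array} } {so, E, S \vdash \texttt{if} \; e_1 \; \texttt{then} \; e_2 \; \texttt{else} \; e_3 : v, S_2}\]

In either rule only one branch appears among the premises, so the other branch is never evaluated.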
Example Operational Semantics of Sequences
\[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ so, E, S_1 \vdash e_2 : v_2, S_2 \\ \ldots \\ so, E, S_{n-1} \vdash e_n : v_n, S_n \end{array} } {so, E, S \vdash \{ e_1; \ldots; e_n \} : v_n, S_n}\]
Again, the “threading” of the store enforces an evaluation sequence
Only the last value is used
But, all the side-effects are collected
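A short Python sketch of the store threading in a sequence. The tuple encoding of expressions and the name `eval_expr` are assumptions of this sketch; values are plain integers and \(so\) is omitted.

```python
def eval_expr(expr, E, S):
    """Return (value, new_store); expressions are tagged tuples."""
    kind = expr[0]
    if kind == "int":
        return expr[1], S
    if kind == "var":
        return S[E[expr[1]]], S
    if kind == "assign":
        v, S1 = eval_expr(expr[2], E, S)
        return v, {**S1, E[expr[1]]: v}
    if kind == "block":                        # { e_1; ...; e_n }
        value = None
        for sub in expr[1]:
            value, S = eval_expr(sub, E, S)    # S_{i-1} goes in, S_i comes out
        return value, S                        # only v_n is kept
    raise ValueError(f"unknown expression: {kind}")

E = {"x": "l1", "y": "l2"}
S0 = {"l1": 0, "l2": 0}
program = ("block", [("assign", "x", ("int", 3)),
                     ("assign", "y", ("var", "x")),
                     ("int", 42)])
value, S_n = eval_expr(program, E, S0)
```

The second assignment sees `x = 3` only because it runs in the store produced by the first one; the values of the first two expressions are dropped, but their effects remain in `S_n`.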
Example Operational Semantics of Loops
\[\frac{so, E, S \vdash e_1 : Bool(false), S_1 } {so, E, S \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_1}\]
If \(e_1\) evaluates to \(Bool(false)\), then the loop terminates immediately
With the side-effects from the evaluation of \(e_1\)
And with (arbitrary) result value \(void\)
The typing rules ensure that \(e_1\) evaluates to a boolean
Example Operational Semantics of Loops
\[\frac{\begin{array}{l} so, E, S \vdash e_1 : Bool(true), S_1 \\ so, E, S_1 \vdash e_2 : v, S_2 \\ so, E, S_2 \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_3 \end{array} } {so, E, S \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_3}\]
Note the sequencing (\(S \rightarrow S_1 \rightarrow S_2 \rightarrow S_3\))
Note how looping is expressed
- Evaluation of “while ...” is expressed in terms of the evaluation of itself in another state
The result of evaluating \(e_2\) is discarded; only the side-effect is preserved
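Both loop rules can be sketched in Python as a recursive case of an evaluator. Assumptions of this sketch: tagged tuples for expressions, Python integers and booleans as values, `None` standing in for void, \(so\) omitted, and a hypothetical `lt` comparison added so the example has a condition to test.

```python
VOID = None  # stand-in for Cool's void (an assumption of this sketch)

def eval_expr(expr, E, S):
    """Return (value, new_store); expressions are tagged tuples."""
    kind = expr[0]
    if kind == "int":
        return expr[1], S
    if kind == "var":
        return S[E[expr[1]]], S
    if kind == "plus":
        v1, S1 = eval_expr(expr[1], E, S)
        v2, S2 = eval_expr(expr[2], E, S1)
        return v1 + v2, S2
    if kind == "lt":
        v1, S1 = eval_expr(expr[1], E, S)
        v2, S2 = eval_expr(expr[2], E, S1)
        return v1 < v2, S2
    if kind == "assign":
        v, S1 = eval_expr(expr[2], E, S)
        return v, {**S1, E[expr[1]]: v}
    if kind == "while":
        _, cond, body = expr
        c, S1 = eval_expr(cond, E, S)
        if not c:                          # rule for Bool(false)
            return VOID, S1
        _, S2 = eval_expr(body, E, S1)     # body value is discarded
        return eval_expr(expr, E, S2)      # the same loop, in store S2
    raise ValueError(f"unknown expression: {kind}")

# while x < 3 loop x <- x + 1 pool
E = {"x": "l1"}
loop = ("while", ("lt", ("var", "x"), ("int", 3)),
        ("assign", "x", ("plus", ("var", "x"), ("int", 1))))
result, S_final = eval_expr(loop, E, {"l1": 0})
```

The recursive call on `expr` itself corresponds to the third premise of the \(Bool(true)\) rule. A loop whose condition never becomes false makes this recursion run forever (in practice, until Python's recursion limit), which reflects the fact that the rules give no result for non-terminating evaluations.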
Example Operational Semantics of Let Expressions
\[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ so, ?, ? \vdash e_2 : v_2, S_2 \end{array} } {so, E, S \vdash \texttt{let} \; id : T \; \texttt{<-} \; e_1 \; \texttt{in} \; e_2: v_2, S_2}\]
What is the context in which \(e_2\) must be evaluated?
Environment like \(E\), but with a new binding of \(id\) to a fresh location \(l_{new}\)
Store like \(S_1\), but with \(l_{new}\) mapped to \(v_1\)
Example Operational Semantics of Let Expressions
We write \(l_{new} = newloc(S)\) to say that \(l_{new}\) is a location that is not already used in \(S\)
- Think of \(newloc\) as the dynamic memory allocation function (or reserving stack space)
The operational rule for let: \[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ l_{new} = newloc(S_1) \\ so, E[l_{new}/id], S_1[v_1/l_{new}] \vdash e_2 : v_2, S_2 \end{array} } {so, E, S \vdash \texttt{let} \; id : T \; \texttt{<-} \; e_1 \; \texttt{in} \; e_2: v_2, S_2}\]
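A Python sketch of the let rule. Assumptions of this sketch: locations are integers, `newloc` picks one more than the largest location in use, expressions are tagged tuples, values are plain integers, and \(so\) is omitted.

```python
def newloc(S):
    """Return a location not already used in S (locations are ints here)."""
    return max(S, default=-1) + 1

def eval_expr(expr, E, S):
    """Return (value, new_store); expressions are tagged tuples."""
    kind = expr[0]
    if kind == "int":
        return expr[1], S
    if kind == "var":
        return S[E[expr[1]]], S
    if kind == "plus":
        v1, S1 = eval_expr(expr[1], E, S)
        v2, S2 = eval_expr(expr[2], E, S1)
        return v1 + v2, S2
    if kind == "let":                          # let id <- e1 in e2
        _, name, init, body = expr
        v1, S1 = eval_expr(init, E, S)         # evaluated in the old E
        l_new = newloc(S1)
        E_body = {**E, name: l_new}            # E[l_new/id]
        S_body = {**S1, l_new: v1}             # S1[v1/l_new]
        return eval_expr(body, E_body, S_body)
    raise ValueError(f"unknown expression: {kind}")

# let x <- 5 in (let x <- x + 1 in x + x)
inner = ("let", "x", ("plus", ("var", "x"), ("int", 1)),
         ("plus", ("var", "x"), ("var", "x")))
value, S_after = eval_expr(("let", "x", ("int", 5), inner), {}, {})
```

The inner initializer `x + 1` still sees the outer `x`, because it is evaluated in \(E\) before the new binding is added. The caller's environment is not modified, so the binding goes out of scope when the let finishes, while the allocated locations stay in the returned store.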
Balancing Act
Now we are going to do some very difficult rules
This may initially seem tricky
- How could that possibly work?
- What is going on here?
With time, these rules can actually be elegant
Operational Semantics of new
Consider the expression
new T
Informal semantics
Allocate new locations to hold the values for all attributes of an object of class T
Initialize those locations with the default values of attributes
Evaluate the initializers and set the resulting attribute values
Return the newly allocated object
Default Values
For each class A there is a default value denoted by \(D_A\)
- \(D_{Int} = Int(0)\)
- \(D_{Bool} = Bool(false)\)
- \(D_{String} = String(0, "")\)
- \(D_{A} = void\) (for any other class A)
More Notation
For a class A we write \[class(A) = (a_1 : T_1 \leftarrow e_1, \ldots, a_n : T_n \leftarrow e_n)\] where
- \(a_i\) are the attributes (including inherited ones)
- \(T_i\) are their declared types
- \(e_i\) are the initializers
Operational Semantics of new
Observation: new SELF_TYPE allocates an object with the same dynamic type as self
\[ \frac{ \begin{array}{l} T_0 = \left\{ \begin{array}{rl} X & \text{if}\ T = {\tt SELF\_TYPE}\ \text{and}\ so = X(\dots) \\ T & \text{otherwise} \end{array} \right. \\ class(T_{0}) = (a_{1} : T_{1} \leftarrow e_{1} , \dots , a_{n} : T_{n} \leftarrow e_{n}) \\ l_{i} = newloc(S_{1}), \text{for}\ i = 1 \dots n\ \text{and each}\ l_{i}\ \text{is distinct} \\ v_{1} = T_{0}(a_{1} = l_{1}, \dots , a_{n} = l_{n}) \\ S_{2} = S_{1}[D_{T_{1}}/l_{1}, \dots , D_{T_{n}}/l_{n}] \\ v_{1}, [a_{1} : l_{1}, \dots , a_{n} : l_{n}], S_{2} \vdash \{ a_{1} \leftarrow e_{1}; \dots ; a_{n} \leftarrow e_{n}; \} : v_{2}, S_{3} \end{array} } {so, E, S_{1} \vdash \texttt{new}\ T : v_{1}, S_{3}}\text{[New]} \]
Operational Semantics of new
The first three lines allocate the object
The rest of the lines initialize it
State in which the initializers are evaluated:
- self is the current object
- Only the attributes are in scope
- Starting values of the attributes are the default ones
Side-effects of initialization are kept (in \(S_3\))
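The [New] rule can be sketched in Python as follows. Everything named here is an assumption of the sketch: the `Point` class and its initializers, the `CLASS_TABLE` and `DEFAULTS` tables, integer locations, plain Python values instead of `Int(...)` objects, and `None` standing in for void. Attributes without an initializer are marked with `None` and simply keep their default.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    tag: str       # dynamic type
    attrs: tuple   # ((attribute name, location), ...)

VOID = None
DEFAULTS = {"Int": 0, "Bool": False, "String": ""}   # any other class: void

# class(A) as a list of (attribute, declared type, initializer or None)
CLASS_TABLE = {
    "Point": [("x", "Int", ("int", 3)),
              ("y", "Int", ("plus", ("var", "x"), ("int", 1))),
              ("next", "Point", None)],
}

def newloc(S):
    return max(S, default=-1) + 1

def eval_expr(expr, so, E, S):
    kind = expr[0]
    if kind == "int":
        return expr[1], S
    if kind == "var":
        return S[E[expr[1]]], S
    if kind == "plus":
        v1, S1 = eval_expr(expr[1], so, E, S)
        v2, S2 = eval_expr(expr[2], so, E, S1)
        return v1 + v2, S2
    if kind == "new":
        T = expr[1]
        T0 = so.tag if T == "SELF_TYPE" else T
        attrs = CLASS_TABLE[T0]
        S2, E_obj = dict(S), {}
        for name, typ, _ in attrs:             # allocate and store defaults
            loc = newloc(S2)
            E_obj[name] = loc
            S2[loc] = DEFAULTS.get(typ, VOID)
        v1 = Obj(T0, tuple(E_obj.items()))
        S3 = S2
        for name, _, init in attrs:            # run initializers with self = v1
            if init is not None:
                v, S3 = eval_expr(init, v1, E_obj, S3)
                S3 = {**S3, E_obj[name]: v}
        return v1, S3
    raise ValueError(f"unknown expression: {kind}")

p, S_p = eval_expr(("new", "Point"), None, {}, {})
q, S_q = eval_expr(("new", "SELF_TYPE"), p, {}, S_p)
```

The initializers run with only the attribute environment `E_obj` in scope, so the initializer of `y` can read `x` but nothing from the caller's environment. Allocating and writing the default in the same loop is a shortcut that keeps the locations distinct; the rule states allocation and default initialization as separate premises.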
Operational Semantics of Method Dispatch
Consider the expression \(e_0.f(e_1, \ldots, e_n)\)
Informal semantics:
- Evaluate the arguments in order \(e_1, \ldots, e_n\)
- Evaluate \(e_0\) to the target object
- Let \(X\) be the dynamic type of the target object
- Fetch from \(X\) the definition of \(f\) (with \(n\) args)
- Create \(n\) new locations and an environment that maps \(f\)’s formal arguments to those locations
- Initialize the locations with the actual arguments
- Set self to the target object and evaluate \(f\)’s body
More Notation
For a class \(A\) and a method \(f\) of \(A\) (possibly inherited) we write: \[imp(A, f) = (x_1, \ldots, x_n, e_{body})\]
where
- \(x_i\) are the names of the formal arguments
- \(e_{body}\) is the body of the method
Operational Semantics of Dispatch
\[ \frac{ \begin{array}{l} so, E, S_{1} \vdash e_{1} : v_{1}, S_{2} \\ so, E, S_{2} \vdash e_{2} : v_{2}, S_{3} \\ \vdots \\ so, E, S_{n} \vdash e_{n} : v_{n}, S_{n+1} \\ so, E, S_{n+1} \vdash e_{0} : v_{0}, S_{n+2} \\ v_{0} = X(a_{1} = l_{a_{1}} , \dots , a_{m} = l_{a_{m}}) \\ imp(X,f) = (x_{1}, \dots , x_{n}, e_{n+1}) \\ l_{x_{i}} = newloc(S_{n+2}), \text{for}\ i = 1 \dots n\ \text{and each}\ l_{x_{i}}\ \text{is distinct} \\ S_{n+3} = S_{n+2}[v_{1}/l_{x_{1}} , \dots , v_{n}/l_{x_{n}}] \\ v_{0}, [a_{1} : l_{a_{1}}, \dots , a_{m} : l_{a_{m}}, x_{1} : l_{x_{1}} , \dots , x_{n} : l_{x_{n}}], S_{n+3} \vdash e_{n+1} : v_{n+1} , S_{n+4} \end{array}} {so, E, S_{1} \vdash e_{0}.f(e_{1}, \dots , e_{n}) : v_{n+1}, S_{n+4}}\text{[Dispatch]} \]
Operational Semantics of Dispatch
The body of the method is invoked with
- \(E\) mapping the formal arguments and self’s attributes
- \(S\) like the caller’s except with the actual arguments bound to the locations allocated for formals
The notion of the activation record is implicit
The semantics of static dispatch is similar except the implementation of \(f\) is taken from the specified class
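A Python sketch of the [Dispatch] rule. The `Counter` class, its `add` method, and the `IMP` table are hypothetical; locations are integers, values are plain Python integers, and `None` stands in for void. `IMP` is keyed directly by (class, method), so inheritance lookup is left out of this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    tag: str       # dynamic type X
    attrs: tuple   # ((attribute name, location), ...)

# imp(X, f) = (formal names, body)
IMP = {
    ("Counter", "add"): (("n",), ("assign", "total",
                                  ("plus", ("var", "total"), ("var", "n")))),
}

def newloc(S):
    return max(S, default=-1) + 1

def eval_expr(expr, so, E, S):
    kind = expr[0]
    if kind == "int":
        return expr[1], S
    if kind == "var":
        return S[E[expr[1]]], S
    if kind == "plus":
        v1, S1 = eval_expr(expr[1], so, E, S)
        v2, S2 = eval_expr(expr[2], so, E, S1)
        return v1 + v2, S2
    if kind == "assign":
        v, S1 = eval_expr(expr[2], so, E, S)
        return v, {**S1, E[expr[1]]: v}
    if kind == "dispatch":                     # e0.f(e1, ..., en)
        _, target, method, args = expr
        values = []
        for arg in args:                       # arguments first, in order
            v, S = eval_expr(arg, so, E, S)
            values.append(v)
        v0, S = eval_expr(target, so, E, S)    # then the target object
        if v0 is None:
            raise RuntimeError("dispatch on void")
        formals, body = IMP[(v0.tag, method)]
        E_body, S_body = dict(v0.attrs), dict(S)
        for x, v in zip(formals, values):      # fresh location per formal
            loc = newloc(S_body)
            E_body[x] = loc
            S_body[loc] = v
        return eval_expr(body, v0, E_body, S_body)   # self = v0
    raise ValueError(f"unknown expression: {kind}")

counter = Obj("Counter", (("total", 0),))
E = {"c": 1}
S = {0: 10, 1: counter}
call = ("dispatch", ("var", "c"), "add", [("int", 5)])
result, S_after = eval_expr(call, None, E, S)
```

The body runs in `E_body`, which contains only the target's attributes and the formals, so the caller's variable `c` is not visible inside `add`. The pair `E_body` and the fresh locations play the role of the implicit activation record.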
Runtime Errors
There are some runtime errors that the type checker does not try to prevent
Dispatch on void
Division by zero
Substring out of range
Heap overflow
In such cases, the execution must abort gracefully
Conclusions
Operational rules are very precise; nothing is left unspecified
Operational rules contain a lot of details
Most languages do not have a well-specified operational semantics
When portability is important, an operational semantics becomes essential