Operational Semantics

Outline

Operational semantics is a precise way of specifying how to evaluate a program
A formal semantics tells you what each expression means
Meaning depends on context: a variable environment will map variables to memory locations and a store will map memory locations to values

Motivation

The meaning of an expression is what happens when it is evaluated
The definition of a programming language:
- The tokens \(\Rightarrow\) lexical analysis
- The grammar \(\Rightarrow\) syntactic analysis
- The typing rules \(\Rightarrow\) semantic analysis
- The evaluation rules \(\Rightarrow\) interpretation

Assembly Language Description of Semantics

Assembly language descriptions of language implementation have too many irrelevant details
- Which way the stack grows
- How integers are represented on a particular machine
- The particular instruction set of the architecture
We need a complete but not overly restrictive specification

Programming Language Semantics

There are many ways to specify programming language semantics
They are all equivalent, but some are more suitable to various tasks than others
Operational semantics
- Describes the evaluation of programs on an abstract machine
- Most useful for specifying implementations

Other Kinds of Semantics

Denotational semantics
- The meaning of a program is expressed as a mathematical object
- Elegant but quite complicated
Axiomatic semantics
- Useful for checking that programs satisfy certain correctness properties
- The foundation of many program verification systems

Introduction to Operational Semantics

Once again we introduce a formal notation using logical rules of inference
Recall the typing judgement \[Context \vdash e : T\] (in the given \(Context\), expression \(e\) has type \(T\))
We try something similar for evaluation \[Context \vdash e : v\] (in the given \(Context\), expression \(e\) evaluates to value \(v\))

Example Operational Semantics Inference Rule

\[ \frac{\begin{array}{l} Context \; \vdash e_1 : 5 \\ Context \; \vdash e_2 : 7 \end{array} } {Context \; \vdash e_1 + e_2 : 12} \]

In general, the result of evaluating an expression depends on the result of evaluating its subexpressions
The logical rules specify everything that is needed to evaluate an expression

What Contexts are Needed?

Contexts are needed to handle variables
Consider the evaluation of y <- x + 1
- We need to keep track of values of variables
- We need to allow variables to change their values during evaluation
We track variables and their values with:
- An environment: tells us the memory address at which the value of a variable is stored
- A store: tells us the contents of a memory location

Variable Environments

A variable environment is a map from variable names to locations
It tells us in which memory location the value of a variable is stored; locations are memory addresses
Environment tracks in-scope variables only
Example environment: \[E = [a : l_1, b : l_2 ]\]
To look up a variable \(a\) in environment \(E\), we write \(E(a)\)

Stores

A store maps memory locations to values
Example store: \[S = [l_1 \rightarrow 5, l_2 \rightarrow 7 ]\]
To look up the contents of a location \(l_1\) in store \(S\), we write \(S(l_1)\)
To perform an assignment of 23 to location \(l_1\), we write \(S[23/l_1]\); this denotes a new store \(S'\) such that \(S'(l_1) = 23\) and \(S'(l) = S(l)\) if \(l \neq l_1\)
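The environment and store above can be sketched in Python. This is only an illustration: it assumes dictionaries for \(E\) and \(S\) and strings such as "l1" for locations, and `update` is a hypothetical helper standing in for the notation \(S[v/l]\).

```python
# Environment: variable name -> location. Store: location -> value.
E = {"a": "l1", "b": "l2"}
S = {"l1": 5, "l2": 7}

def update(store, value, loc):
    """Model S[value/loc]: build a new store and leave the old one untouched."""
    new_store = dict(store)
    new_store[loc] = value
    return new_store

# S' = S[23/l1]: l1 now holds 23, every other location is unchanged.
S_prime = update(S, 23, E["a"])
print(S[E["a"]], S_prime["l1"], S_prime["l2"])
```

Returning a fresh dictionary rather than mutating `S` mirrors the way the rules name each successive store separately.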

Cool Values

All values in Cool are objects
To denote a Cool object we use the notation \(X(a_1 = l_1, \ldots, a_n = l_n)\) where
- \(X\) is the dynamic type of the object (type tag)
- \(a_i\) are the attributes (including those inherited)
- \(l_i\) are the locations where the values of the attributes are stored

Cool Values (Continued)

Special cases (without named attributes)
- Int(5)
- Bool(true)
- String(4, "Cool")
There is a special value void that is a member of all types
- No operations can be performed on it
- Except for the test isvoid
- Concrete implementations might use NULL here
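One possible way to model these values in Python is shown below. The class names (`CoolObject`, `CoolInt`, `CoolBool`, `CoolString`) and the choice of `None` for void are assumptions made for this sketch, not part of Cool itself.

```python
from dataclasses import dataclass, field

@dataclass
class CoolObject:
    tag: str                                    # dynamic type X
    attrs: dict = field(default_factory=dict)   # attribute a_i -> location l_i

@dataclass
class CoolInt:
    value: int

@dataclass
class CoolBool:
    value: bool

@dataclass
class CoolString:
    length: int
    value: str

VOID = None  # assumption: None stands in for void, much like NULL

def isvoid(v):
    """The only operation permitted on void is this test."""
    return v is VOID

point = CoolObject("Point", {"x": "l1", "y": "l2"})   # Point(x = l1, y = l2)
greeting = CoolString(4, "Cool")                      # String(4, "Cool")
```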

Operational Rules

The evaluation judgement is \[so, E, S \vdash e : v, S'\] read:
- Given \(so\) the current value of self
- And \(E\) the current environment
- And \(S\) the current store
- If the evaluation of \(e\) terminates, then
- The returned value is \(v\)
- And the new store is \(S'\)

Notes

The “result” of evaluating an expression is both a value and also a new store
Changes to the store model side-effects, that is, assignments to mutable variables
The variable environment does not change
The operational semantics allows for non-terminating evaluations
We define one rule for each kind of expression

Example Operational Semantics for Base Values

\[\frac{}{so, E, S \vdash true : Bool(true), S}\]
\[\frac{}{so, E, S \vdash false : Bool(false), S}\]
\[\frac{i \; is \; an \; integer \; literal}{so, E, S \vdash i : Int(i), S}\]
\[\frac{\begin{array}{l} s \; is \; a \; string \; literal \\ n \; is \; the \; length \; of \; s \end{array} }{so, E, S \vdash s : String(n, s), S}\]

Note: no side effects in these cases
Bool, Int, and String represent type constructors of some sort

Example Operational Semantics of Variable References

\[\frac{\begin{array}{l} E(id) = l_{id} \\ S(l_{id}) = v \end{array} } {so, E, S \vdash id : v, S}\]

Note the double lookup of variables
- First from name to location (compile time)
- Then from location to value (run time)
The store does not change
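The double lookup can be sketched in Python as follows, assuming dictionaries for \(E\) and \(S\) and plain Python values in place of Cool values. The function name `eval_id` is hypothetical.

```python
def eval_id(name, E, S):
    """so, E, S |- id : v, S  (the store comes back unchanged)."""
    loc = E[name]      # first lookup:  E(id) = l_id
    value = S[loc]     # second lookup: S(l_id) = v
    return value, S

E = {"x": "l1"}
S = {"l1": 42}
v, S_after = eval_id("x", E, S)
print(v)
```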

Example Operational Semantics of Assignment

\[\frac{\begin{array}{l} so, E, S \vdash e : v, S_1 \\ E(id) = l_{id} \\ S_2 = S_1[v/l_{id}] \end{array} } {so, E, S \vdash id \; \texttt{<-} \; e : v, S_2}\]

A three-step process
- Evaluate the right-hand side to obtain a value \(v\) and a new store \(S_1\)
- Fetch the location of the assigned variable
- The result is the value \(v\) and an updated store \(S_2\)
The environment does not change
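As an illustration, the assignment rule can be turned into a small Python evaluator. This sketch assumes expressions encoded as tuples (a made-up encoding), dictionaries for \(E\) and \(S\), and Python integers in place of Cool `Int` objects. It evaluates `y <- x + 1`.

```python
def evaluate(e, E, S):
    """Return (value, new_store) for a tiny expression language."""
    kind = e[0]
    if kind == "int":                      # integer literal
        return e[1], S
    if kind == "id":                       # variable reference
        return S[E[e[1]]], S
    if kind == "plus":                     # e1 + e2, threading the store
        v1, S1 = evaluate(e[1], E, S)
        v2, S2 = evaluate(e[2], E, S1)
        return v1 + v2, S2
    if kind == "assign":                   # id <- e
        _, name, rhs = e
        v, S1 = evaluate(rhs, E, S)        # premise 1: evaluate the right-hand side
        loc = E[name]                      # premise 2: E(id) = l_id
        S2 = {**S1, loc: v}                # premise 3: S2 = S1[v/l_id]
        return v, S2
    raise ValueError(f"unknown expression: {kind}")

E = {"x": "l1", "y": "l2"}
S = {"l1": 4, "l2": 0}
v, S2 = evaluate(("assign", "y", ("plus", ("id", "x"), ("int", 1))), E, S)
print(v, S2)
```

The environment `E` is passed along unchanged; only the store is rebuilt.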

Example Operational Semantics of Conditionals

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : Bool(true), S_1 \\ so, E, S_1 \vdash e_2 : v, S_2 \end{array} } {so, E, S \vdash \texttt{if} \; e_1 \; \texttt{then} \; e_2 \; \texttt{else} \; e_3 : v, S_2}\]

The “threading” of the store enforces an evaluation sequence
- \(e_1\) must be evaluated first to produce \(S_1\)
- Then \(e_2\) can be evaluated
The result of evaluating \(e_1\) is a boolean
- The typing rules ensure this fact
- There is another similar rule for \(Bool(false)\)
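For reference, the companion rule for the false case can be written by analogy with the rule above. It is a reconstruction that follows the same notation: the condition evaluates to \(Bool(false)\), so \(e_3\) is evaluated in \(S_1\) instead of \(e_2\).

```latex
\[\frac{\begin{array}{l} so, E, S \vdash e_1 : Bool(false), S_1 \\ so, E, S_1 \vdash e_3 : v, S_2 \end{array} } {so, E, S \vdash \texttt{if} \; e_1 \; \texttt{then} \; e_2 \; \texttt{else} \; e_3 : v, S_2}\]
```

In both rules only one branch appears among the premises, so the branch that is not taken contributes no side-effects.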

Example Operational Semantics of Sequences

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ so, E, S_1 \vdash e_2 : v_2, S_2 \\ \ldots \\ so, E, S_{n-1} \vdash e_n : v_n, S_n \end{array} } {so, E, S \vdash \{ e_1; \ldots; e_n \} : v_n, S_n}\]

Again, the “threading” of the store enforces an evaluation sequence
Only the last value is used
But, all the side-effects are collected
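The threading of the store through a sequence might look like this in Python. The sketch assumes a tuple encoding of expressions and dictionaries for \(E\) and \(S\), and it covers only the few expression kinds the example needs.

```python
def evaluate(e, E, S):
    """Return (value, new_store)."""
    kind = e[0]
    if kind == "int":
        return e[1], S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "assign":
        v, S1 = evaluate(e[2], E, S)
        return v, {**S1, E[e[1]]: v}
    if kind == "block":                    # { e1; ...; en }, assumed non-empty
        value, store = None, S
        for sub in e[1]:                   # e_i runs in the store left by e_(i-1)
            value, store = evaluate(sub, E, store)
        return value, store                # v_n and S_n
    raise ValueError(f"unknown expression: {kind}")

E = {"x": "l1", "y": "l2"}
S = {"l1": 0, "l2": 0}
prog = ("block", [("assign", "x", ("int", 1)),
                  ("assign", "y", ("id", "x")),
                  ("int", 99)])
v, S_n = evaluate(prog, E, S)
print(v, S_n)
```

The value of the block is that of the last expression, while both assignments remain visible in the final store.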

Example Operational Semantics of Loops

\[\frac{so, E, S \vdash e_1 : Bool(false), S_1 } {so, E, S \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_1}\]

If \(e_1\) evaluates to \(Bool(false)\), then the loop terminates immediately
- With the side-effects from the evaluation of \(e_1\)
- And with (arbitrary) result value \(void\)
The typing rules ensure that \(e_1\) evaluates to a boolean

Example Operational Semantics of Loops

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : Bool(true), S_1 \\ so, E, S_1 \vdash e_2 : v, S_2 \\ so, E, S_2 \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_3 \end{array} } {so, E, S \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_3}\]

Note the sequencing (\(S \rightarrow S_1 \rightarrow S_2 \rightarrow S_3\))
Note how looping is expressed
- Evaluation of “while ...” is expressed in terms of the evaluation of itself in another state
The result of evaluating \(e_2\) is discarded; only the side-effect is preserved
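Both loop rules can be sketched together in Python. The sketch assumes a tuple encoding of expressions, Python booleans and integers as values, and `None` as a stand-in for void. The recursive call mirrors the third premise of the true rule, so very long loops would hit Python's recursion limit.

```python
def evaluate(e, E, S):
    """Return (value, new_store)."""
    kind = e[0]
    if kind == "int":
        return e[1], S
    if kind == "id":
        return S[E[e[1]]], S
    if kind in ("plus", "lt"):
        v1, S1 = evaluate(e[1], E, S)
        v2, S2 = evaluate(e[2], E, S1)
        return (v1 + v2 if kind == "plus" else v1 < v2), S2
    if kind == "assign":
        v, S1 = evaluate(e[2], E, S)
        return v, {**S1, E[e[1]]: v}
    if kind == "while":                    # while e1 loop e2 pool
        _, cond, body = e
        b, S1 = evaluate(cond, E, S)
        if not b:                          # false rule: result is void, store is S1
            return None, S1
        _, S2 = evaluate(body, E, S1)      # the body's value is discarded
        return evaluate(e, E, S2)          # the same loop, evaluated in store S2
    raise ValueError(f"unknown expression: {kind}")

E = {"i": "l1"}
S = {"l1": 0}
loop = ("while", ("lt", ("id", "i"), ("int", 3)),
        ("assign", "i", ("plus", ("id", "i"), ("int", 1))))
v, S_final = evaluate(loop, E, S)
print(v, S_final)
```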

Example Operational Semantics of Let Expressions

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ so, ?, ? \vdash e_2 : v_2, S_2 \end{array} } {so, E, S \vdash \texttt{let} \; id : T \; \texttt{<-} \; e_1 \; \texttt{in} \; e_2: v_2, S_2}\]

What is the context in which \(e_2\) must be evaluated?
- Environment like \(E\), but with a new binding of \(id\) to a fresh location \(l_{new}\)
- Store like \(S_1\), but with \(l_{new}\) mapped to \(v_1\)

Example Operational Semantics of Let Expressions

We write \(l_{new} = newloc(S)\) to say that \(l_{new}\) is a location that is not already used in \(S\)
- Think of \(newloc\) as the dynamic memory allocation function (or reserving stack space)
The operational rule for let: \[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ l_{new} = newloc(S_1) \\ so, E[l_{new}/id], S_1[v_1/l_{new}] \vdash e_2 : v_2, S_2 \end{array} } {so, E, S \vdash \texttt{let} \; id : T \; \texttt{<-} \; e_1 \; \texttt{in} \; e_2: v_2, S_2}\]
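A possible Python rendering of the let rule is given below. It assumes a tuple encoding of expressions and dictionaries for \(E\) and \(S\). The `newloc` shown here is a hypothetical implementation that generates names "l1", "l2", ... and skips any already in the store.

```python
import itertools

_fresh = itertools.count(1)

def newloc(S):
    """Return a location name that is not yet used in store S."""
    while True:
        loc = f"l{next(_fresh)}"
        if loc not in S:
            return loc

def evaluate(e, E, S):
    """Return (value, new_store)."""
    kind = e[0]
    if kind == "int":
        return e[1], S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "plus":
        v1, S1 = evaluate(e[1], E, S)
        v2, S2 = evaluate(e[2], E, S1)
        return v1 + v2, S2
    if kind == "let":                          # let id : T <- e1 in e2
        _, name, init, body = e
        v1, S1 = evaluate(init, E, S)
        l_new = newloc(S1)
        E_body = {**E, name: l_new}            # E[l_new/id]
        S_body = {**S1, l_new: v1}             # S1[v1/l_new]
        return evaluate(body, E_body, S_body)
    raise ValueError(f"unknown expression: {kind}")

# The inner x shadows the outer one inside the body only.
E, S = {"x": "l1"}, {"l1": 10}
prog = ("let", "x", ("int", 1), ("plus", ("id", "x"), ("int", 2)))
v, S2 = evaluate(prog, E, S)
print(v, S2, E)
```

The caller's environment is never modified, so the outer binding of `x` is back in force once the let finishes, even though the fresh location stays in the store.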

Balancing Act

Now we are going to do some very difficult rules
This may initially seem tricky
- How could that possibly work?
- What is going on here?
With time, these rules can actually be elegant

Operational Semantics of `new`

Consider the expression new T
Informal semantics
- Allocate new locations to hold the values for all attributes of an object of class T
- Initialize those locations with the default values of attributes
- Evaluate the initializers and set the resulting attribute values
- Return the newly allocated object

Default Values

For each class A there is a default value denoted by \(D_A\)
- \(D_{Int} = Int(0)\)
- \(D_{Bool} = Bool(false)\)
- \(D_{String} = String(0, "")\)
- \(D_{A} = void\) (for any other class \(A\))
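The default values could be tabulated in Python as follows. The tagged tuples and the use of `None` for void are assumptions of this sketch, and `default_value` is a hypothetical name.

```python
def default_value(type_name):
    """Return D_T, using tagged tuples as stand-ins for Cool values."""
    defaults = {
        "Int": ("Int", 0),
        "Bool": ("Bool", False),
        "String": ("String", 0, ""),
    }
    # Any other class defaults to void, modelled here as None.
    return defaults.get(type_name)

print(default_value("Int"), default_value("Main"))
```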

More Notation

For a class A we write \[class(A) = (a_1 : T_1 \leftarrow e_1, \ldots, a_n : T_n \leftarrow e_n)\]

where
- \(a_i\) are the attributes (including inherited ones)
- \(T_i\) are their declared types
- \(e_i\) are the initializers

Operational Semantics of `new`

Observation: new SELF_TYPE allocates an object with the same dynamic type as self

\[ \frac{ \begin{array}{l} T_0 = \left\{ \begin{array}{rl} X & \text{if}\ T = {\tt SELF\_TYPE}\ \text{and}\ so = X(\dots) \\ T & \text{otherwise} \end{array} \right. \\ class(T_{0}) = (a_{1} : T_{1} \leftarrow e_{1} , \dots , a_{n} : T_{n} \leftarrow e_{n}) \\ l_{i} = newloc(S_{1}), \text{for}\ i = 1 \dots n\ \text{and each}\ l_{i}\ \text{is distinct} \\ v_{1} = T_{0}(a_{1} = l_{1}, \dots , a_{n} = l_{n}) \\ S_{2} = S_{1}[D_{T_{1}}/l_{1}, \dots , D_{T_{n}}/l_{n}] \\ v_{1}, [a_{1} : l_{1}, \dots , a_{n} : l_{n}], S_{2} \vdash \{ a_{1} \leftarrow e_{1}; \dots ; a_{n} \leftarrow e_{n}; \} : v_{2}, S_{3} \end{array} } {so, E, S_{1} \vdash \texttt{new}\ T : v_{1}, S_{3}}\text{[New]} \]

Operational Semantics of `new`

The first three lines allocate the object
The rest of the lines initialize it
State in which the initializers are evaluated:
- self is the newly allocated object \(v_1\)
- Only the attributes are in scope
- The starting values of the attributes are the defaults
Side-effects of initialization are kept (in \(S_3\))
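The allocation and initialization steps can be sketched in Python. Everything named here is an assumption made for illustration: the class table `CLASSES`, the `DEFAULTS` table of plain Python values, the tuple encoding of expressions, and the representation of an object as a pair (tag, attribute-to-location dictionary). Attributes without an initializer are simply left at their default.

```python
import itertools

_fresh = itertools.count(1)

def newloc(S):
    """Return a location name that is not yet used in store S."""
    while True:
        loc = f"l{next(_fresh)}"
        if loc not in S:
            return loc

# Hypothetical class table: class name -> [(attribute, type, initializer or None)].
CLASSES = {"Main": [], "Point": [("x", "Int", ("int", 3)), ("y", "Int", None)]}
DEFAULTS = {"Int": 0, "Bool": False, "String": ""}   # other classes: None (void)

def evaluate(e, so, E, S):
    """Return (value, new_store)."""
    kind = e[0]
    if kind == "int":
        return e[1], S
    if kind == "assign":
        v, S1 = evaluate(e[2], so, E, S)
        return v, {**S1, E[e[1]]: v}
    if kind == "new":
        T = e[1]
        T0 = so[0] if T == "SELF_TYPE" else T      # dynamic type of self
        attrs = CLASSES[T0]
        S2, E_attrs = dict(S), {}
        for name, typ, _ in attrs:                 # allocate, store the default
            loc = newloc(S2)
            E_attrs[name] = loc
            S2[loc] = DEFAULTS.get(typ)
        v1 = (T0, E_attrs)                         # the new object
        S3 = S2
        for name, _, init in attrs:                # run initializers, self = v1
            if init is not None:
                _, S3 = evaluate(("assign", name, init), v1, E_attrs, S3)
        return v1, S3
    raise ValueError(f"unknown expression: {kind}")

main = ("Main", {})
v, S_out = evaluate(("new", "Point"), main, {}, {})
print(v, S_out)
```

Note that the initializers run with only the attribute environment `E_attrs` in scope, not the caller's environment.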

Operational Semantics of Method Dispatch

Consider the expression \(e_0.f(e_1, \ldots, e_n)\)
Informal semantics:
- Evaluate the arguments in order \(e_1, \ldots, e_n\)
- Evaluate \(e_0\) to the target object
- Let \(X\) be the dynamic type of the target object
- Fetch from \(X\) the definition of \(f\) (with \(n\) args)
- Create \(n\) new locations and an environment that maps \(f\)’s formal arguments to those locations
- Initialize the locations with the actual arguments
- Set self to the target object and evaluate \(f\)’s body

More Notation

For a class \(A\) and a method \(f\) of \(A\) (possibly inherited) we write: \[imp(A, f) = (x_1, \ldots, x_n, e_{body})\]

where
- \(x_i\) are the names of the formal arguments
- \(e_{body}\) is the body of the method

Operational Semantics of Dispatch

\[ \frac{ \begin{array}{l} so, E, S_{1} \vdash e_{1} : v_{1}, S_{2} \\ so, E, S_{2} \vdash e_{2} : v_{2}, S_{3} \\ \vdots \\ so, E, S_{n} \vdash e_{n} : v_{n}, S_{n+1} \\ so, E, S_{n+1} \vdash e_{0} : v_{0}, S_{n+2} \\ v_{0} = X(a_{1} = l_{a_{1}} , \dots , a_{m} = l_{a_{m}}) \\ imp(X,f) = (x_{1}, \dots , x_{n}, e_{n+1}) \\ l_{x_{i}} = newloc(S_{n+2}), \text{for}\ i = 1 \dots n\ \text{and each}\ l_{x_{i}}\ \text{is distinct} \\ S_{n+3} = S_{n+2}[v_{1}/l_{x_{1}} , \dots , v_{n}/l_{x_{n}}] \\ v_{0}, [a_{1} : l_{a_{1}}, \dots , a_{m} : l_{a_{m}}, x_{1} : l_{x_{1}} , \dots , x_{n} : l_{x_{n}}], S_{n+3} \vdash e_{n+1} : v_{n+1} , S_{n+4} \end{array}} {so, E, S_{1} \vdash e_{0}.f(e_{1}, \dots , e_{n}) : v_{n+1}, S_{n+4}}\text{[Dispatch]} \]

Operational Semantics of Dispatch

The body of the method is invoked with
- \(E\) mapping the formal arguments and self’s attributes
- \(S\) like the caller’s except with the actual arguments bound to the locations allocated for formals
The notion of the activation record is implicit
The semantics of static dispatch is similar except the implementation of \(f\) is taken from the specified class
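Dynamic dispatch can be sketched in Python along the same lines. The method table `IMP`, the tuple encoding of expressions, and the object representation (tag, attribute-to-location dictionary) are all assumptions of this sketch, and inheritance is not modelled. The check for a void target is not a premise of the rule and is included only to show where that runtime error would surface.

```python
import itertools

_fresh = itertools.count(1)

def newloc(S):
    """Return a location name that is not yet used in store S."""
    while True:
        loc = f"l{next(_fresh)}"
        if loc not in S:
            return loc

# Hypothetical method table modelling imp(X, f) = (x_1, ..., x_n, e_body).
IMP = {
    ("Counter", "add"): (["n"], ("assign", "total",
                                 ("plus", ("id", "total"), ("id", "n")))),
}

def evaluate(e, so, E, S):
    """Return (value, new_store)."""
    kind = e[0]
    if kind == "int":
        return e[1], S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "plus":
        v1, S1 = evaluate(e[1], so, E, S)
        v2, S2 = evaluate(e[2], so, E, S1)
        return v1 + v2, S2
    if kind == "assign":
        v, S1 = evaluate(e[2], so, E, S)
        return v, {**S1, E[e[1]]: v}
    if kind == "dispatch":                         # e0.f(e1, ..., en)
        _, target, fname, args = e
        values, store = [], S
        for arg in args:                           # arguments first, in order
            v, store = evaluate(arg, so, E, store)
            values.append(v)
        v0, store = evaluate(target, so, E, store) # then the target object
        if v0 is None:
            raise RuntimeError("dispatch on void")
        tag, attr_locs = v0
        formals, body = IMP[(tag, fname)]          # chosen by the dynamic type
        E_body, store = dict(attr_locs), dict(store)
        for x, v in zip(formals, values):          # fresh location per formal
            loc = newloc(store)
            E_body[x] = loc
            store[loc] = v
        return evaluate(body, v0, E_body, store)   # self is the target object
    raise ValueError(f"unknown expression: {kind}")

counter = ("Counter", {"total": "lt"})
E = {"c": "lc"}
S = {"lc": counter, "lt": 10}
v, S_out = evaluate(("dispatch", ("id", "c"), "add", [("int", 5)]), None, E, S)
print(v, S_out["lt"])
```

The body runs in an environment built only from the target's attributes and the formals, so the caller's variable `c` is not visible inside the method.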

Runtime Errors

There are some runtime errors that the type checker does not try to prevent
- Dispatch on void
- Division by zero
- Substring out of range
- Heap overflow
In such cases, the execution must abort gracefully
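One way an interpreter might abort gracefully is to raise a dedicated exception at the point of failure and catch it at the top level. The names `CoolRuntimeError`, `substr`, and `run` are hypothetical and chosen only for this sketch.

```python
class CoolRuntimeError(Exception):
    """Hypothetical exception for errors the type checker cannot rule out."""

def substr(s, start, length):
    """Return the substring of s, or signal an out-of-range request."""
    if start < 0 or length < 0 or start + length > len(s):
        raise CoolRuntimeError("substring out of range")
    return s[start:start + length]

def run(thunk):
    """Run a computation; on a runtime error, report it and stop cleanly."""
    try:
        return ("ok", thunk())
    except CoolRuntimeError as err:
        return ("aborted", str(err))

good = run(lambda: substr("Cool", 1, 2))
bad = run(lambda: substr("Cool", 3, 5))
print(good, bad)
```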

Conclusions

Operational rules are very precise; nothing is left unspecified
Operational rules contain a lot of details
Most languages do not have a well specified operational semantics
When portability is important, an operational semantics becomes essential
