Operational Semantics

CSC 310 - Programming Languages

Outline

  • Operational semantics is a precise way of specifying how to evaluate a program

  • A formal semantics tells you what each expression means

  • Meaning depends on context: a variable environment will map variables to memory locations and a store will map memory locations to values

Motivation

  • The meaning of an expression is what happens when it is evaluated

  • The definition of a programming language:

    • The tokens \(\Rightarrow\) lexical analysis

    • The grammar \(\Rightarrow\) syntactic analysis

    • The typing rules \(\Rightarrow\) semantic analysis

    • The evaluation rules \(\Rightarrow\) interpretation

Assembly Language Description of Semantics

  • Assembly language descriptions of language implementation have too many irrelevant details

    • Which way the stack grows

    • How integers are represented on a particular machine

    • The particular instruction set of the architecture

  • We need a complete but not overly restrictive specification

Programming Language Semantics

  • There are many ways to specify programming language semantics

  • They are all equivalent, but some are better suited to particular tasks than others

  • Operational semantics

    • Describes the evaluation of programs on an abstract machine

    • Most useful for specifying implementations

Other Kinds of Semantics

  • Denotational semantics

    • The meaning of a program is expressed as a mathematical object

    • Elegant but quite complicated

  • Axiomatic semantics

    • Useful for checking that programs satisfy certain correctness properties

    • The foundation of many program verification systems

Introduction to Operational Semantics

  • Once again we introduce a formal notation using logical rules of inference

  • Recall the typing judgement \[Context \vdash e : T\] (in the given \(Context\), expression \(e\) has type \(T\))

  • We try something similar for evaluation \[Context \vdash e : v\] (in the given \(Context\), expression \(e\) evaluates to value \(v\))

Example Operational Semantics Inference Rule

\[ \frac{\begin{array}{l} Context \; \vdash e_1 : 5 \\ Context \; \vdash e_2 : 7 \end{array} } {Context \; \vdash e_1 + e_2 : 12} \]

  • In general, the result of evaluating an expression depends on the result of evaluating its subexpressions

  • The logical rules specify everything that is needed to evaluate an expression

What Contexts are Needed?

  • Contexts are needed to handle variables

  • Consider the evaluation of y <- x + 1

    • We need to keep track of values of variables

    • We need to allow variables to change their values during evaluation

  • We track variables and their values with:

    • An environment: tells us the memory address at which the value of a variable is stored

    • A store: tells us the contents of each memory location

Variable Environments

  • A variable environment is a map from variable names to locations

  • It tells us in which memory location the value of a variable is stored; locations = memory addresses

  • Environment tracks in-scope variables only

  • Example environment: \[E = [a : l_1, b : l_2 ]\]

  • To look up a variable \(a\) in environment \(E\), we write \(E(a)\)

Stores

  • A store maps memory locations to values

  • Example store: \[S = [l_1 \rightarrow 5, l_2 \rightarrow 7 ]\]

  • To look up the contents of a location \(l_1\) in store \(S\), we write \(S(l_1)\)

  • To perform an assignment of 23 to location \(l_1\), we write \(S[23/l_1]\); this denotes a new store \(S'\) such that \(S'(l_1) = 23\) and \(S'(l) = S(l)\) if \(l \neq l_1\)
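
  • A minimal Python sketch of the two maps, assuming locations are plain integers and values are plain numbers (the names `lookup` and `update` are illustrative, not part of the notation above). The update builds a new store rather than mutating the old one, matching the reading of \(S[23/l_1]\) as a new store \(S'\).

```python
# Hypothetical representation: locations are plain integers.
E = {"a": 1, "b": 2}   # environment: variable name -> location
S = {1: 5, 2: 7}       # store: location -> value


def lookup(env, store, name):
    """Double lookup: name -> location -> value, i.e. S(E(name))."""
    return store[env[name]]


def update(store, loc, value):
    """Return the new store S[value/loc]; the old store is left untouched."""
    new_store = dict(store)
    new_store[loc] = value
    return new_store


S2 = update(S, E["a"], 23)   # S' = S[23/l_1]
print(lookup(E, S, "a"), lookup(E, S2, "a"), lookup(E, S2, "b"))
```

  • Every location other than the updated one keeps its old contents, and the original store `S` is still available afterwards.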

Cool Values

  • All values in Cool are objects

  • To denote a Cool object we use the notation \(X(a_1 = l_1, \ldots, a_n = l_n)\) where

    • \(X\) is the dynamic type of the object (type tag)
    • \(a_i\) are the attributes (including those inherited)
    • \(l_i\) are the locations where the values of the attributes are stored

Cool Values (Continued)

  • Special cases (without named attributes)

    • Int(5)
    • Bool(true)
    • String(4, "Cool")
  • There is a special value void that is a member of all types

    • No operations can be performed on it
    • Except for the test isvoid
    • Concrete implementations might use NULL here
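
  • One possible way to model these values in Python, as a sketch only. The class names mirror the notation above; `Obj`, `make_string`, `isvoid`, and the choice of `None` for void are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    value: int


@dataclass(frozen=True)
class Bool:
    value: bool


@dataclass(frozen=True)
class String:
    length: int
    value: str


@dataclass(frozen=True)
class Obj:
    tag: str      # dynamic type X
    attrs: dict   # attribute name a_i -> location l_i


VOID = None       # assumption: void is modelled by Python's None


def make_string(s):
    """Build String(n, s), where n is the length of s."""
    return String(len(s), s)


def isvoid(v):
    """The only operation permitted on void."""
    return v is VOID


point = Obj("Point", {"x": 1, "y": 2})   # Point(x = l_1, y = l_2)
print(make_string("Cool"), isvoid(VOID), isvoid(Int(5)))
```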

Operational Rules

  • The evaluation judgement is \[so, E, S \vdash e : v, S'\] read:

    • Given \(so\) the current value of self

    • And \(E\) the current environment

    • And \(S\) the current store

    • If the evaluation of \(e\) terminates, then

    • The returned value is \(v\)

    • And the new store is \(S'\)

Notes

  • The “result” of evaluating an expression is both a value and also a new store

  • Changes to the store model side-effects, that is, assignments to mutable variables

  • The variable environment does not change

  • The operational semantics allows for non-terminating evaluations

  • We define one rule for each kind of expression

Example Operational Semantics for Base Values

\[\frac{}{so, E, S \vdash true : Bool(true), S}\] \[\frac{}{so, E, S \vdash false : Bool(false), S}\] \[\frac{i \; is \; an \; integer \; literal}{so, E, S \vdash i : Int(i), S}\] \[\frac{s \; is \; a \; string \; literal \quad n \; is \; the \; length \; of \; s}{so, E, S \vdash s : String(n, s), S}\]

  • Note: no side effects in these cases

  • Bool, Int, and String represent type constructors of some sort

Example Operational Semantics of Variable References

\[\frac{\begin{array}{l} E(id) = l_{id} \\ S(l_{id}) = v \end{array} } {so, E, S \vdash id : v, S}\]

  • Note the double lookup of variables

    • First from name to location (compile time)

    • Then from location to value (run time)

  • The store does not change
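
  • The rules for base values and variable references can be sketched as a small Python evaluator. This is an illustration under assumptions: expressions are tuples such as `("int", 5)` or `("id", "x")`, values are tagged tuples, and `evaluate` returns the pair (value, new store).

```python
def evaluate(so, E, S, e):
    """Return (v, S') such that  so, E, S |- e : v, S'."""
    kind = e[0]
    if kind == "bool":            # true / false
        return ("Bool", e[1]), S
    if kind == "int":             # integer literal
        return ("Int", e[1]), S
    if kind == "str":             # string literal, stored with its length
        return ("String", len(e[1]), e[1]), S
    if kind == "id":              # variable reference
        loc = E[e[1]]             # E(id) = l_id
        return S[loc], S          # S(l_id) = v; the store is returned as-is
    raise ValueError(f"unsupported expression: {e!r}")


E = {"x": 1}
S = {1: ("Int", 5)}
value, S_after = evaluate(None, E, S, ("id", "x"))
print(value, S_after is S)
```

  • In every case the store that comes back is the very store that went in, which is the "no side effects" observation above.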

Example Operational Semantics of Assignment

\[\frac{\begin{array}{l} so, E, S \vdash e : v, S_1 \\ E(id) = l_{id} \\ S_2 = S_1[v/l_{id}] \end{array} } {so, E, S \vdash id \; \texttt{<-} \; e : v, S_2}\]

  • A three-step process

    • Evaluate the right-hand side to obtain a value \(v\) and a new store \(S_1\)

    • Fetch the location of the assigned variable

    • The result is the value \(v\) and an updated store

  • The environment does not change
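
  • A sketch of the assignment rule in Python, applied to the earlier example y <- x + 1. The tuple encoding of expressions and the `"plus"` case are assumptions for illustration; no rule for addition is given in these notes.

```python
def evaluate(so, E, S, e):
    """Return (v, S') such that  so, E, S |- e : v, S'."""
    kind = e[0]
    if kind == "int":
        return ("Int", e[1]), S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "plus":                     # assumed arithmetic rule
        v1, S1 = evaluate(so, E, S, e[1])
        v2, S2 = evaluate(so, E, S1, e[2])
        return ("Int", v1[1] + v2[1]), S2
    if kind == "assign":                   # ("assign", id, e)
        _, name, rhs = e
        v, S1 = evaluate(so, E, S, rhs)    # so, E, S |- e : v, S_1
        loc = E[name]                      # E(id) = l_id
        S2 = {**S1, loc: v}                # S_2 = S_1[v/l_id]
        return v, S2
    raise ValueError(f"unsupported expression: {e!r}")


E = {"x": 1, "y": 2}
S = {1: ("Int", 5), 2: ("Int", 0)}
expr = ("assign", "y", ("plus", ("id", "x"), ("int", 1)))   # y <- x + 1
v, S2 = evaluate(None, E, S, expr)
print(v, S2)
```

  • The environment `E` is passed through unchanged; only the store differs between the input and the result.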

Example Operational Semantics of Conditionals

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : Bool(true), S_1 \\ so, E, S_1 \vdash e_2 : v, S_2 \end{array} } {so, E, S \vdash \texttt{if} \; e_1 \; \texttt{then} \; e_2 \; \texttt{else} \; e_3 : v, S_2}\]

  • The “threading” of the store enforces an evaluation sequence

    • \(e_1\) must be evaluated first to produce \(S_1\)

    • Then \(e_2\) can be evaluated

  • The result of evaluating \(e_1\) is a boolean

    • The typing rules ensure this fact

    • There is another similar rule for \(Bool(false)\)
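
  • A Python sketch covering both conditional rules (the \(Bool(true)\) rule shown and its \(Bool(false)\) counterpart). The tuple encoding is an assumption. The example uses a condition with a side effect so that the threading of \(S_1\) into the chosen branch is visible.

```python
def evaluate(so, E, S, e):
    """Return (v, S') such that  so, E, S |- e : v, S'."""
    kind = e[0]
    if kind == "bool":
        return ("Bool", e[1]), S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "assign":
        v, S1 = evaluate(so, E, S, e[2])
        return v, {**S1, E[e[1]]: v}
    if kind == "if":                            # ("if", e1, e2, e3)
        _, cond, then_branch, else_branch = e
        b, S1 = evaluate(so, E, S, cond)        # e1 is evaluated first
        branch = then_branch if b == ("Bool", True) else else_branch
        return evaluate(so, E, S1, branch)      # the branch sees S_1, not S
    raise ValueError(f"unsupported expression: {e!r}")


E = {"done": 1}
S = {1: ("Bool", False)}
# if (done <- true) then done else false
expr = ("if", ("assign", "done", ("bool", True)), ("id", "done"), ("bool", False))
v, S2 = evaluate(None, E, S, expr)
print(v, S2)
```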

Example Operational Semantics of Sequences

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ so, E, S_1 \vdash e_2 : v_2, S_2 \\ \ldots \\ so, E, S_{n-1} \vdash e_n : v_n, S_n \end{array} } {so, E, S \vdash \{ e_1; \ldots; e_n \} : v_n, S_n}\]

  • Again, the “threading” of the store enforces an evaluation sequence

  • Only the last value is used

  • But, all the side-effects are collected
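
  • The sequence rule can be sketched as a loop that feeds each store into the next subexpression. The tuple encoding and the `"block"` tag are assumptions for illustration.

```python
def evaluate(so, E, S, e):
    """Return (v, S') such that  so, E, S |- e : v, S'."""
    kind = e[0]
    if kind == "int":
        return ("Int", e[1]), S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "assign":
        v, S1 = evaluate(so, E, S, e[2])
        return v, {**S1, E[e[1]]: v}
    if kind == "block":                          # ("block", [e1, ..., en]), n >= 1
        value = None
        for sub in e[1]:
            value, S = evaluate(so, E, S, sub)   # S_{i-1} goes in, S_i comes out
        return value, S                          # only v_n is returned
    raise ValueError(f"unsupported expression: {e!r}")


E = {"x": 1}
S0 = {1: ("Int", 0)}
# { x <- 1; x <- 2; x }
block = ("block", [("assign", "x", ("int", 1)),
                   ("assign", "x", ("int", 2)),
                   ("id", "x")])
v, S_final = evaluate(None, E, S0, block)
print(v, S_final)
```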

Example Operational Semantics of Loops

\[\frac{so, E, S \vdash e_1 : Bool(false), S_1 } {so, E, S \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_1}\]

  • If \(e_1\) evaluates to \(Bool(false)\), then the loop terminates immediately

    • With the side-effects from the evaluation of \(e_1\)

    • And with (arbitrary) result value \(void\)

  • The typing rules ensure that \(e_1\) evaluates to a boolean

Example Operational Semantics of Loops

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : Bool(true), S_1 \\ so, E, S_1 \vdash e_2 : v, S_2 \\ so, E, S_2 \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_3 \end{array} } {so, E, S \vdash \texttt{while} \; e_1 \; \texttt{loop} \; e_2 \; \texttt{pool} : void, S_3}\]

  • Note the sequencing (\(S \rightarrow S_1 \rightarrow S_2 \rightarrow S_3\))

  • Note how looping is expressed

    • Evaluation of “while ...” is expressed in terms of the evaluation of itself in another state
  • The result of evaluating \(e_2\) is discarded; only the side-effect is preserved
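
  • Both loop rules can be sketched in Python with a recursive call that mirrors the third premise of the \(Bool(true)\) rule. The tuple encoding, the `"lt"` and `"plus"` cases, and `None` for void are assumptions for illustration; a real interpreter would likely iterate instead of recursing.

```python
def evaluate(so, E, S, e):
    """Return (v, S') such that  so, E, S |- e : v, S'."""
    kind = e[0]
    if kind == "int":
        return ("Int", e[1]), S
    if kind == "id":
        return S[E[e[1]]], S
    if kind in ("plus", "lt"):                  # assumed arithmetic and comparison rules
        v1, S1 = evaluate(so, E, S, e[1])
        v2, S2 = evaluate(so, E, S1, e[2])
        if kind == "plus":
            return ("Int", v1[1] + v2[1]), S2
        return ("Bool", v1[1] < v2[1]), S2
    if kind == "assign":
        v, S1 = evaluate(so, E, S, e[2])
        return v, {**S1, E[e[1]]: v}
    if kind == "while":                         # ("while", e1, e2)
        _, cond, body = e
        b, S1 = evaluate(so, E, S, cond)
        if b == ("Bool", False):
            return None, S1                     # void, keeping the effects of e1
        _, S2 = evaluate(so, E, S1, body)       # value of e2 is discarded
        return evaluate(so, E, S2, e)           # the same loop in state S_2
    raise ValueError(f"unsupported expression: {e!r}")


E = {"x": 1}
S = {1: ("Int", 0)}
# while x < 3 loop x <- x + 1 pool
loop = ("while", ("lt", ("id", "x"), ("int", 3)),
        ("assign", "x", ("plus", ("id", "x"), ("int", 1))))
v, S_final = evaluate(None, E, S, loop)
print(v, S_final)
```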

Example Operational Semantics of Let Expressions

\[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ so, ?, ? \vdash e_2 : v_2, S_2 \end{array} } {so, E, S \vdash \texttt{let} \; id : T \; \texttt{<-} \; e_1 \; \texttt{in} \; e_2: v_2, S_2}\]

  • What is the context in which \(e_2\) must be evaluated?

    • Environment like \(E\), but with a new binding of \(id\) to a fresh location \(l_{new}\)

    • Store like \(S_1\), but with \(l_{new}\) mapped to \(v_1\)

Example Operational Semantics of Let Expressions

  • We write \(l_{new} = newloc(S)\) to say that \(l_{new}\) is a location that is not already used in \(S\)

    • Think of \(newloc\) as the dynamic memory allocation function (or reserving stack space)
  • The operational rule for let: \[\frac{\begin{array}{l} so, E, S \vdash e_1 : v_1, S_1 \\ l_{new} = newloc(S_1) \\ so, E[l_{new}/id], S_1[v_1/l_{new}] \vdash e_2 : v_2, S_2 \end{array} } {so, E, S \vdash \texttt{let} \; id : T \; \texttt{<-} \; e_1 \; \texttt{in} \; e_2: v_2, S_2}\]
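
  • A Python sketch of the let rule. It assumes integer locations, so that `newloc` can simply return one more than the largest location in use, and a tuple encoding of expressions; both are illustrative choices. The example shadows an outer `x` to show that the extended environment is used only for the body.

```python
def newloc(S):
    """Return a location not already used in S (assumes integer locations)."""
    return max(S, default=0) + 1


def evaluate(so, E, S, e):
    """Return (v, S') such that  so, E, S |- e : v, S'."""
    kind = e[0]
    if kind == "int":
        return ("Int", e[1]), S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "let":                           # ("let", id, T, e1, e2)
        _, name, _declared_type, init, body = e
        v1, S1 = evaluate(so, E, S, init)       # so, E, S |- e1 : v1, S_1
        l_new = newloc(S1)                      # l_new = newloc(S_1)
        E_body = {**E, name: l_new}             # E[l_new/id]
        S_body = {**S1, l_new: v1}              # S_1[v1/l_new]
        return evaluate(so, E_body, S_body, body)
    raise ValueError(f"unsupported expression: {e!r}")


E = {"x": 1}
S = {1: ("Int", 0)}
# let x : Int <- 7 in x
v, S2 = evaluate(None, E, S, ("let", "x", "Int", ("int", 7), ("id", "x")))
print(v, S2, E)
```

  • After the let, `E` still maps `x` to its original location; the fresh location remains in the resulting store because the rule never removes it.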

Balancing Act

  • Now we turn to some more difficult rules

  • These may initially seem tricky

    • How could that possibly work?
    • What is going on here?
  • With time, these rules can actually be elegant

Operational Semantics of new

  • Consider the expression new T

  • Informal semantics

    • Allocate new locations to hold the values for all attributes of an object of class T

    • Initialize those locations with the default values of attributes

    • Evaluate the initializers and set the resulting attribute values

    • Return the newly allocated object

Default Values

  • For each class A there is a default value denoted by \(D_A\)

    • \(D_{Int} = Int(0)\)
    • \(D_{Bool} = Bool(false)\)
    • \(D_{String} = String(0, "")\)
    • \(D_{A} = void\) (for any other class \(A\))

More Notation

  • For a class A we write \[class(A) = (a_1 : T_1 \leftarrow e_1, \ldots, a_n : T_n \leftarrow e_n)\]

    where

    • \(a_i\) are the attributes (including inherited ones)
    • \(T_i\) are their declared types
    • \(e_i\) are the initializers

Operational Semantics of new

  • Observation: new SELF_TYPE allocates an object with the same dynamic type as self

\[ \frac{ \begin{array}{l} T_0 = \left\{ \begin{array}{rl} X & \text{if}\ T = {\tt SELF\_TYPE}\ \text{and}\ so = X(\dots) \\ T & \text{otherwise} \end{array} \right. \\ class(T_{0}) = (a_{1} : T_{1} \leftarrow e_{1} , \dots , a_{n} : T_{n} \leftarrow e_{n}) \\ l_{i} = newloc(S_{1}), \text{for}\ i = 1 \dots n\ \text{and each}\ l_{i}\ \text{is distinct} \\ v_{1} = T_{0}(a_{1} = l_{1}, \dots , a_{n} = l_{n}) \\ S_{2} = S_{1}[D_{T_{1}}/l_{1}, \dots , D_{T_{n}}/l_{n}] \\ v_{1}, [a_{1} : l_{1}, \dots , a_{n} : l_{n}], S_{2} \vdash \{ a_{1} \leftarrow e_{1}; \dots ; a_{n} \leftarrow e_{n}; \} : v_{2}, S_{3} \end{array} } {so, E, S_{1} \vdash \texttt{new}\ T : v_{1}, S_{3}}\text{[New]} \]

Operational Semantics of new

  • The first three lines allocate the object

  • The rest of the lines initialize it

  • State in which the initializers are evaluated:

    • self is the current object
    • Only the attributes are in scope
    • Starting values of attributes are the default ones
  • Side-effects of initialization are kept (in \(S_3\))
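
  • A Python sketch of the [New] rule. Everything named here is an assumption for illustration: the `CLASSES` table stands in for \(class(T)\), `DEFAULTS` for \(D_T\), objects are small dicts, void is `None`, and an attribute with no initializer (written `None` in the table) simply keeps its default.

```python
# Hypothetical class table: class(T) = (a_i : T_i <- e_i)
CLASSES = {
    "Point": [("x", "Int", ("int", 3)),
              ("y", "Int", ("id", "x")),     # initializer reads attribute x
              ("next", "Point", None)],      # no initializer
}
DEFAULTS = {"Int": ("Int", 0), "Bool": ("Bool", False), "String": ("String", 0, "")}


def newloc(S):
    """Return a location not already used in S (assumes integer locations)."""
    return max(S, default=0) + 1


def evaluate(so, E, S, e):
    """Return (v, S') such that  so, E, S |- e : v, S'."""
    kind = e[0]
    if kind == "int":
        return ("Int", e[1]), S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "new":                                  # ("new", T)
        T0 = so["class"] if e[1] == "SELF_TYPE" else e[1]
        attrs = CLASSES[T0]
        S2, locs = dict(S), {}
        for name, typ, _ in attrs:                     # allocate and set defaults
            locs[name] = newloc(S2)
            S2[locs[name]] = DEFAULTS.get(typ)         # None (void) for other classes
        v1 = {"class": T0, "attrs": locs}
        S3 = S2
        for name, _, init in attrs:                    # a_i <- e_i, in order
            if init is not None:
                v, S3 = evaluate(v1, locs, S3, init)   # self = v1, only attributes in scope
                S3 = {**S3, locs[name]: v}
        return v1, S3
    raise ValueError(f"unsupported expression: {e!r}")


obj, S3 = evaluate(None, {}, {}, ("new", "Point"))
print(obj, S3)
```

  • Evaluating `("new", "SELF_TYPE")` with `so` set to this object would allocate another `Point`, matching the observation about the dynamic type of self.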

Operational Semantics of Method Dispatch

  • Consider the expression \(e_0.f(e_1, \ldots, e_n)\)

  • Informal semantics:

    • Evaluate the arguments in order \(e_1, \ldots, e_n\)
    • Evaluate \(e_0\) to the target object
    • Let \(X\) be the dynamic type of the target object
    • Fetch from \(X\) the definition of \(f\) (with \(n\) args)
    • Create \(n\) new locations and an environment that maps \(f\)’s formal arguments to those locations
    • Initialize the locations with the actual arguments
    • Set self to the target object and evaluate \(f\)’s body

More Notation

  • For a class \(A\) and a method \(f\) of \(A\) (possibly inherited) we write: \[imp(A, f) = (x_1, \ldots, x_n, e_{body})\]

    where

    • \(x_i\) are the names of the formal arguments
    • \(e_{body}\) is the body of the method

Operational Semantics of Dispatch

\[ \frac{ \begin{array}{l} so, E, S_{1} \vdash e_{1} : v_{1}, S_{2} \\ so, E, S_{2} \vdash e_{2} : v_{2}, S_{3} \\ \vdots \\ so, E, S_{n} \vdash e_{n} : v_{n}, S_{n+1} \\ so, E, S_{n+1} \vdash e_{0} : v_{0}, S_{n+2} \\ v_{0} = X(a_{1} = l_{a_{1}} , \dots , a_{m} = l_{a_{m}}) \\ imp(X,f) = (x_{1}, \dots , x_{n}, e_{n+1}) \\ l_{x_{i}} = newloc(S_{n+2}), \text{for}\ i = 1 \dots n\ \text{and each}\ l_{x_{i}}\ \text{is distinct} \\ S_{n+3} = S_{n+2}[v_{1}/l_{x_{1}} , \dots , v_{n}/l_{x_{n}}] \\ v_{0}, [a_{1} : l_{a_{1}}, \dots , a_{m} : l_{a_{m}}, x_{1} : l_{x_{1}} , \dots , x_{n} : l_{x_{n}}], S_{n+3} \vdash e_{n+1} : v_{n+1} , S_{n+4} \end{array}} {so, E, S_{1} \vdash e_{0}.f(e_{1}, \dots , e_{n}) : v_{n+1}, S_{n+4}}\text{[Dispatch]} \]

Operational Semantics of Dispatch

  • The body of the method is invoked with

    • \(E\) mapping the formal arguments and self’s attributes
    • \(S\) like the caller’s except with the actual arguments bound to the locations allocated for formals
  • The notion of the activation record is implicit

  • The semantics of static dispatch is similar except the implementation of \(f\) is taken from the specified class
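
  • A Python sketch of the [Dispatch] rule. The `METHODS` table stands in for \(imp(X, f)\) and is assumed to already contain inherited methods; the `Counter` class, the tuple encoding, the `"plus"` case, and the error raised for dispatch on void are all illustrative assumptions.

```python
# Hypothetical method table: imp(X, f) = (x_1, ..., x_n, e_body)
METHODS = {
    ("Counter", "add"): (["n"],
                         ("assign", "count", ("plus", ("id", "count"), ("id", "n")))),
}


def newloc(S):
    """Return a location not already used in S (assumes integer locations)."""
    return max(S, default=0) + 1


def evaluate(so, E, S, e):
    """Return (v, S') such that  so, E, S |- e : v, S'."""
    kind = e[0]
    if kind == "int":
        return ("Int", e[1]), S
    if kind == "id":
        return S[E[e[1]]], S
    if kind == "plus":                            # assumed arithmetic rule
        v1, S1 = evaluate(so, E, S, e[1])
        v2, S2 = evaluate(so, E, S1, e[2])
        return ("Int", v1[1] + v2[1]), S2
    if kind == "assign":
        v, S1 = evaluate(so, E, S, e[2])
        return v, {**S1, E[e[1]]: v}
    if kind == "dispatch":                        # ("dispatch", e0, f, [e1, ..., en])
        _, target, f, args = e
        values = []
        for arg in args:                          # arguments first, in order
            v, S = evaluate(so, E, S, arg)
            values.append(v)
        v0, S = evaluate(so, E, S, target)        # then the target object
        if v0 is None:
            raise RuntimeError("dispatch on void")
        formals, body = METHODS[(v0["class"], f)]  # chosen by dynamic type
        E_body = dict(v0["attrs"])                # attributes of the target ...
        for x, v in zip(formals, values):         # ... plus fresh locations for formals
            loc = newloc(S)
            E_body[x] = loc
            S = {**S, loc: v}
        return evaluate(v0, E_body, S, body)      # self is the target object
    raise ValueError(f"unsupported expression: {e!r}")


counter = {"class": "Counter", "attrs": {"count": 2}}
E = {"c": 1}
S = {1: counter, 2: ("Int", 10)}
v, S_final = evaluate(None, E, S, ("dispatch", ("id", "c"), "add", [("int", 5)]))
print(v, S_final[2])
```

  • The caller's environment `E` plays no part in evaluating the body; the body sees only the target's attributes and the formals, which is where the implicit activation record lives.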

Runtime Errors

  • There are some runtime errors that the type checker does not try to prevent

    • Dispatch on void

    • Division by zero

    • Substring out of range

    • Heap overflow

  • In such cases, the execution must abort gracefully

Conclusions

  • Operational rules are very precise; nothing is left unspecified

  • Operational rules contain a lot of details

  • Most languages do not have a well-specified operational semantics

  • When portability is important, an operational semantics becomes essential