Stanford Compiler Week 7 Operational Semantics

课程主页：

https://www.edx.org/course/compilers

课程视频：

https://www.bilibili.com/video/BV1NE411376V?p=20&t=6

备注：

图片均来自于课件。

这次回顾操作语义。

语义概述

我们必须为每个Cool表达式指定在求值时会发生什么
- 这是表达的“含义”。
编程语言的定义：
- token对应词法分析；
- 语法对应句法分析；
- 类型规则对应语义分析；
- 评估规则对应代码生成与优化。
我们其实间接指定了评估规则：
- 将Cool编译为堆栈机；
- 然后利用堆栈机的评估规则。
该评估规则这是完整的描述
- 但是为什么不够好？
实现语言的汇编语言描述有不相关的细节，例如
- 是否使用堆栈机？
- 堆栈以哪种方式增长？
- 如何表示整数？
- 体系结构的特定指令集。
我们需要完整的描述
- 但不是过于严格的规范。
指定语义有许多方法
- 它们同样强大
- 适用的场景不同
操作语意
- 通过执行规则描述程序评估
  - 在抽象机器上
- 对特定实现最有用
- 这就是我们用于Cool的语义描述

指定语意的方法

指称语义（Denotational semantics）
- 程序的含义是数学函数
公理语义（Axiomatic semantics）
- 通过逻辑公式描述程序行为
  - 如果在满足X的状态下开始，则在满足Y的状态下结束
  - X，Y公式
- 建立了许多程序验证系统

操作语意

引入符号

我们再次引入形式符号
逻辑推理规则，如类型检查
回忆类型判断
$\text { Context } \vdash \mathrm{e}: \mathrm{C}$
该符号表示，在给定的上下文中，表达式$\mathrm e$具有类型$\mathrm C$
我们使用类似的方法进行评估
$\text { Context } \vdash \mathrm{e}: \mathrm{v}$
在给定的上下文中，表达式$\mathrm e$的取值为$\mathrm v$

例子

$\frac{ \text { Context } \vdash \mathrm{e}_{1}: 5 \\ \text { Context } \vdash \mathrm{e}_{2}: 7 } { \text { Context } \vdash \mathrm{e}_{1}+\mathrm{e}_{2}: 12}$

该表达式的含义是，在给定上下文中，如果$\mathrm {e_1}$值为$5$，$\mathrm {e_2}$值为$7$，则$\mathrm{e_1+e_2}$值为$12$。

环境和存储

考虑对$\mathrm{y\leftarrow x + 1}$ 求值
我们通过以下方式跟踪变量及其值：
- 环境：变量在内存中的位置
  - $\mathrm {vars \to memory\ locations}$
  - 变量环境将变量映射到位置
    - 跟踪作用域的变量
    - 告诉我们这些变量在哪里
    - 形式$\mathrm{E}=\left[\mathrm{a}: \mathrm{I}_{1}, \mathrm{~b}: \mathrm{I}_{2}\right]$
- 存储：内存中的内容
  - $\mathrm{memory\ locations\to values}$
  - 存储将内存位置映射到值
    $\mathrm{S}=\left[\mathrm{l}_{1} \rightarrow 5, \mathrm{I}_{2} \rightarrow 7\right]$
  - $\mathrm{S}^{\prime}=\mathrm{S}\left[12 / \mathrm{I}_{1}\right]$定义了满足如下条件的存储$\mathrm S’$
    $\mathrm{S'(l)}=\begin{cases} 12, & \mathrm{l= l_1}\\ \mathrm{S(l)}, & \mathrm{l\neq l_1} \end{cases}$

Cool中的语义

Cool中的值都是对象
- 所有对象都是某个类的实例
$\mathrm{X\left(a_{1}=I_{1}, \ldots, a_{n}=I_{n}\right)}$是一个Cool的对象，其中
- $\mathrm{X}$是对象的类
- $\mathrm{a_i}$是属性（包括继承的属性）
- $\mathrm{l_i}$是$\mathrm{a_i}$值的存储位置
特殊情况（无属性的类）
- $\mathrm{Int(5)}$：整数5
- $\mathrm{Bool(true)}$：布尔变量true
- $\text { String(4, “Cool”) }$：长度为4的字符串“Cool”
有一个类型为$\mathrm {Object}$的特殊值$\mathrm {void}$
- 无法对其执行任何操作
- 除了测试$\mathrm {isvoid}$
- 具体的实现可能使用$\mathrm {NULL}$

完整的符号

评估判断为
- 含义为
  - 给定$\mathrm{so}$的当前值$\mathrm{self}$
  - 当前变量环境$\mathrm{E}$
  - 当前存储$\mathrm{S}$
  - 如果$\mathrm{e}$的评估终止，则
    - $\mathrm{e}$的值为$\mathrm v$
    - 新的存储为$\mathrm{S’}$
评估的“结果”是值和新的存储
- 新存储建模了副作用
有些事情没有改变
- 变量的环境
- $\mathrm {Self}$的值
- 操作语义允许不终止的评估

Cool语意I

规则1

$\begin{aligned} &\frac{} {\text {so, } \mathrm{E}, \mathrm{S} \vdash \text {true : Bool(true), } \mathrm{S} }\\ \\ & \frac{} {\text {so, } \mathrm{E}, \mathrm{S} \vdash \text {false : Bool(false), } \mathrm{S} }\\ \\ &\frac{\text {i is an integer literal} } {\text {so, } \mathrm{E}, \mathrm{S} \vdash \mathrm{i}: \operatorname{Int}(\mathrm{i}), \mathrm{S} }\\ \\ & \frac{ \mathrm{s} \text { is a string literal } \\ \mathrm{n} \text { is the length of } \mathrm{s} } { \mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \mathrm{s}: \text { String }(\mathrm{n}, \mathrm{s}), \mathrm{S} } \end{aligned}$

规则2

$\begin{aligned} &\frac{ \mathrm{E}(\mathrm{id})=\mathrm{I}_{\mathrm{id} } \\ \mathrm{S}\left(\mathrm{I}_{\mathrm{id} }\right)=\mathrm{v} } { \mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \mathrm{id}: \mathrm{v}, \mathrm{S} }\\ \\ &\frac{} {\mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \text {self}: \mathrm{so}, \mathrm{S} }\\ \\ &\frac{ \text {so, } \mathrm{E}, \mathrm{S} \vdash \mathrm{e}: \mathrm{v}, \mathrm{S}_{1} \\ \mathrm{E}(\mathrm{id})=\mathrm{I}_{\mathrm{id} } \\ \mathrm{S}_{2}=\mathrm{S}_{1}\left[\mathrm{v} / \mathrm{I}_{\mathrm{id} }\right] } {\mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \mathrm{id} \leftarrow \mathrm{e}: \mathrm{v}, \mathrm{S}_{2} }\\ \\ &\frac{ \text { so, } \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{1}: \mathrm{v}_{1}, \mathrm{~S}_{1} \\ \text { so, } \mathrm{E}, \mathrm{S}_{1} \vdash \mathrm{e}_{2}: \mathrm{v}_{2}, \mathrm{~S}_{2} } { \mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{1}+\mathrm{e}_{2}: \mathrm{v}_{1}+\mathrm{v}_{2}, \mathrm{~S}_{2} } \\\\ &\frac{ \text { so, } \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{1}: \mathrm{v}_{1}, \mathrm{S}_{1} \\ \text { so, } \mathrm{E}, \mathrm{S}_{1} \vdash \mathrm{e}_{2}: \mathrm{v}_{2}, \mathrm{~S}_{2} \\ \ldots\\ \text { so, } \mathrm{E}, \mathrm{S_{n-1} } \vdash \mathrm{e_n}: \mathrm{v_n}, \mathrm{S_n} } { \mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \{\mathrm{e_1;\ldots; e_n;}\}: \mathrm{v_{n} }, \mathrm{S_n} } \end{aligned}$

例子

考虑表达式

$\{\mathrm{X\to 7+5;4;}\}$

计算过程：

第一步

$\frac{ \begin{aligned} \mathrm{so,[x:I],[I\leftarrow 0]\vdash 7: Int(7),[I\leftarrow 0]}\\ \mathrm{so,[x:I],[I\leftarrow 0]\vdash 5: Int(5),[I\leftarrow 0]} \end{aligned} } {\mathrm{so,[x:I],[I\leftarrow 0]\vdash 7+5: Int(12),[I\leftarrow 0]} }$

第二步

注意到

$\mathrm {[I\leftarrow 0](12/I) =[I\leftarrow 12]}$

所以可得

$\frac {\mathrm{so,[x:I],[I\leftarrow 0]\vdash 7+5: Int(12),[I\leftarrow 12]; so,[x:I],[I\leftarrow 12]\vdash 4: Int(4),[I\leftarrow 12]} } { \mathrm{so,[x:I],[I\leftarrow 0]\vdash \{x\leftarrow 7 + 5;4;\}\vdash Int(4),[I\leftarrow 12]} }$

规则3

if表达式

$\begin{array}{c} \mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{1}: \text { Bool(true) }, \mathrm{S}_{1} \\ \mathrm{so}, \mathrm{E}, \mathrm{S}_{1} \vdash \mathrm{e}_{2}: \mathrm{v}, \mathrm{S}_{2} \\ \hline \mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \text {if } \mathrm{e}_{1} \text { then } \mathrm{e}_{2} \text { else } \mathrm{e}_{3} \ \mathrm{fi}: \mathrm{v}, \mathrm{S}_{2} \end{array}$

while表达式

$\frac{\text { so, } \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{1}: \mathrm{Bool}(\mathrm{false}), \mathrm{S}_{1} }{\mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \text { while } \mathrm{e}_{1} \text { loop } \mathrm{e}_{2} \text { pool }: \text { void, } \mathrm{S}_{1} }$ $\begin{array}{c} \text { so, } \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{1}: \text { Bool(true) }, \mathrm{S}_{1} \\ \mathrm{so}, \mathrm{E}, \mathrm{S}_{1} \vdash \mathrm{e}_{2}: \mathrm{v}, \mathrm{S}_{2} \\ \mathrm{so}, \mathrm{E}, \mathrm{S}_{2} \vdash \text {while } \mathrm{e}_{1} \text { loop } \mathrm{e}_{2} \text { pool : void, } \mathrm{S}_{3} \\ \hline \mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \text {while } \mathrm{e}_{1} \text { loop } \mathrm{e}_{2} \text { pool : void, } \mathrm{S}_{3} \end{array}$

let表达式

$\begin{aligned} \frac{ \text { so, } \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{1}: \mathrm{v}_{1},\ \mathrm{S}_{1} \\ \text { so, ?, ? } \vdash \mathrm{e}_{2}: \mathrm{v_2,}\ \mathrm{S}_{2} } { \text { so, } \mathrm{E}, \mathrm{S} \vdash \text {let id }: \mathrm{T} \leftarrow \mathrm{e}_{1} \text { in } \mathrm{e}_{2}: \mathrm{v}_{2}, \mathrm{S}_{2} } \end{aligned}$

在什么情况下应该评估$\mathrm{e_2}$？
- 类似$\mathrm{E}$的环境，新的$\mathrm{id}$绑定到新位置$\mathrm{l_{new} }$
- 像$\mathrm{S_1}$一样存储，但$\mathrm{l_{new} }$映射到$\mathrm{v_1}$

解决方式

我们用$\mathrm{I_{new}=newloc(S) }$表示$\mathrm{I_\text{new} }$是$\mathrm S$中尚未使用的位置
- $\mathrm{newloc}$类似内存分配函数

$\begin{aligned} \frac{ \mathrm { so, E, S \vdash e_{1}: v_{1}, S_{1} } \\ \mathrm{I_{new }= newloc (S_{1})} \\ \mathrm{ so, E[I_{ {new } } / { id }], S_{1}[v_{1} / I_{ {new } }] \vdash e_{2}: v_{2}, S_{2} } } { \mathrm{ so, E, S \vdash let\ id : T \leftarrow e_{1} \ in\ e_{2}: v_{2}, S_{2} } } \end{aligned}$

习题

选择最后一项。

Cool语意II

规则4

$\mathrm{new\ T}$的非正式语义
- 分配位置以保存类$\mathrm{T}$对象的所有属性
  - 本质上，分配一个新对象
- 使用默认值设置属性
- 评估初始化程序并设置结果属性值
- 返回新分配的对象
对于类$\mathrm{A}$，存在初始值$\mathrm{D_A}$
- $\mathrm{D_{int} = Int(0)}$
- $\mathrm{D_{\text{bool} } = Bool(false)}$
- $\mathrm{D_{string} = String(0, “”)}$
- $\mathrm{D_A = void}$（对于其他类$\mathrm{A}$）
对于类$\mathrm{A}$，我们记作
- 其中$\mathrm{a_i}$是属性（包括继承的属性）。
- $\mathrm{T_i}$是属性申明的类型。
- $\mathrm{e_i}$是初始化列表。
对于包含继承的类
```
class A{
    a1, a2;
}

class B{
    b1, b2;
}

class C{
    c1, c2;
}
```
那么类$\mathrm C$记作

定义

$\begin{array}{l} \mathrm{ T_{0}=\text { if }\left(T==\operatorname{SELF}_{-} T Y P E \text { and } s o=X(\ldots)\right) \text { then } X \text { else } T \\ \operatorname{class}\left(T_{0}\right)=\left(a_{1}: T_{1} \leftarrow e_{1,} \ldots, a_{n}: T_{n} \leftarrow e_{n}\right) \\ I_{i}=\text { newloc}(S) \text { for } i=1, \ldots, n \\ v=T_{0}\left(a_{1}=I_{1,} \ldots,a_{n}=I_{n}\right) \\ S_{1}=S\left[D_{T1} / I_{1, }\ldots, D_{T n} / I_{n}\right] \\ E^{\prime}=\left[a_{1}: I_{1}, \ldots, a_{n}: I_{n}\right] \\ v_{1}, E^{\prime}, S_{1} \vdash\left\{a_{1} \leftarrow e_{1} ; \ldots ; a_{n} \leftarrow e_{n} ;\right\}: v_{n}, S_{2} } \\ \hline \mathrm{s o, E, S \vdash \text { new } T: v, S_{2} } \end{array}$

前三个步骤分配对象
其余步骤将其初始化
- 通过评估任务序列
评估初始化的状态
- self是当前的对象
- 只有属性在作用域内（与类型相同）
- 属性的初始值为默认值

规则5

$\mathrm{e_{0} . f\left(e_{1}, \ldots, e_{n}\right)}$的非正式语义
- 以$\mathrm{e_1,\ldots,e_n}$的顺序评估参数
- 将$\mathrm{e_0}$评估为目标对象
- 令$\mathrm{X}$为目标对象的动态类型
- 从$\mathrm{X}$获取$\mathrm{f}$的定义（带有$\mathrm{n}$个参数）
- 创建$\mathrm{n}$个新位置以及一个将$\mathrm{f}$的形式参数映射到这些位置的环境
- 使用实际参数初始化位置
- 将自己设置为目标对象并评估$\mathrm{f}$的主体
对于类$\mathrm{A}$和$\mathrm{A}$的方法$\mathrm{f}$（可能是继承的）：
- 定义$\mathrm{ {impl}(A, f)=\left(x_{1}, \ldots, x_{n}, e_{\text {body} }\right)}$，其中
  - $\mathrm{x_i}$是形式参数的名称
  - $\mathrm{e_{body} }$是方法的主体

定义

$\begin{aligned} &\mathrm{ \mathrm{so}, \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{1}: \mathrm{v}_{1}, \mathrm{~S}_{1}\\ \mathrm{so}, \mathrm{E}, \mathrm{S}_{1} \vdash \mathrm{e}_{2}: \mathrm{v}_{2}, \mathrm{S}_{2}\\ \ldots\\ so, \mathrm{E}, \mathrm{S}_{\mathrm{n}-1} \vdash \mathrm{e}_{\mathrm{n} }: \mathrm{v}_{\mathrm{n} }, \mathrm{S}_{\mathrm{n} }\\ \mathrm{so}, \mathrm{E}, \mathrm{S}_{\mathrm{n} } \vdash \mathrm{e}_{0}: \mathrm{v}_{0}, \mathrm{~S}_{\mathrm{n}+1}\\ v_{0}=X\left(a_{1}=I_{1}, \ldots, a_{m}=I_{m}\right)\\ \operatorname{impl}(X, f)=\left(x_{1}, \ldots, x_{n}, e_{\text {body } }\right)\\ I_{x i}=\text {newloc}\left(S_{n+1}\right) \text { for } i=1, \ldots, n\\ E^{\prime}=\left[a_{1}: I_{1}, \ldots, a_{m}: I_{m}\right]\left[x_{1} / I_{x 1}, \ldots, x_{n} / I_{x n}\right]\\ \mathrm{S}_{\mathrm{n}+2}=\mathrm{S}_{\mathrm{n}+1}\left[\mathrm{v}_{1} / \mathrm{l}_{\mathrm{x} 1}, \ldots, \mathrm{v}_{\mathrm{n} } / \mathrm{I}_{\mathrm{xn} }\right]\\ \mathrm{v}_{0}, \mathrm{E}^{\prime}, \mathrm{S}_{\mathrm{n}+2} \vdash \mathrm{e}_{\text {body} }: \mathrm{v}, \mathrm{S}_{\mathrm{n}+3} }\\ \hline & \text {so, } \mathrm{E}, \mathrm{S} \vdash \mathrm{e}_{0} .\mathrm{f}\left(\mathrm{e}_{1}, \ldots, \mathrm{e}_{\mathrm{n} }\right): \mathrm{v}, \mathrm{S}_{\mathrm{n}+3} \end{aligned}$

小结

使用以下方法调用方法的主体
- $\mathrm{E}$映射形式参数和自己的属性
- $\mathrm{S}$与调用者类似，但实际参数绑定到分配给形式的位置
栈帧的概念是隐含的
- 为实际参数分配新的位置
静态分派的语义相似

习题

选择第3项。

其他

操作规则并未涵盖所有情况。请考虑分派示例：

$\begin{array}{l} \ldots\\ \mathrm{ \mathrm{so}, \mathrm{E}, \mathrm{S}_{\mathrm{n} } \vdash \mathrm{e}_{0}: \mathrm{v}_{0}, \mathrm{S}_{\mathrm{n}+1} \\ \mathrm{v}_{0}=\mathrm{X}\left(\mathrm{a}_{1}=\mathrm{I}_{1}, \ldots, \mathrm{a}_{\mathrm{m} }=I_{\mathrm{m} }\right) \\ \operatorname{impl}(\mathrm{X}, \mathrm{f})=\left(\mathrm{x}_{1}, \ldots, \mathrm{x}_{\mathrm{n} }, \mathrm{e}_{\mathrm{body} }\right) }\\ \ldots \end{array}$

小结

有一些类型检查器无法阻止的运行时错误
- 无效分派
- 除以0
- 子字符串超出范围
- 堆溢出
在这种情况下，执行必须优雅地中止
- 带有错误消息，而不是段错误
操作规则非常精确和详细
- 没有留下任何未指明的东西
- 请仔细阅读
大多数语言没有明确指定的操作语义
当可移植性很重要时，操作语义就变得至关重要