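Tidying up some old notes... 😪
Two years ago I had basically finished the second half of this course, with only a few loose ends left. I took my notes directly on an iPad at the time and meant to reorganize them at some point, but kept putting it off (my procrastination really is unbeatable 😵). I happen to have some free time lately, so I might as well get those notes in order.
One more thing: since my notes back then were copied straight from the instructor's slides, they are in English, while my own commentary was written in Chinese around them, which probably makes for slightly tiring reading (never mind, as long as I'm comfortable writing it 😁). The overall flow stays the same as in the original notes.
Alright, enough chit-chat.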
Unit 7.1 Program Compilation Preview
A. Two-tier compilation(like Java)
Java compiler
The VM code is an abstraction: write once, run anywhere.
B. Jack compilation(Java-like)
Jack compiler
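This unit mainly covers some characteristics of the Jack language: Jack is very similar to Java and is also designed around the idea of two-tier compilation. Both first translate code written in the high-level language into virtual machine code (VM code). In Java, the VM code takes the form of bytecode, which executes fairly quickly; in Jack, the VM code is text written in a VM language, which is simple and intuitive to understand.
In addition, the first two modules of the second half of the course are devoted to building a VM translator.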
Unit 7.2 VM Abstraction: The Stack
A. Stack
⭐Stack operations:
- push: add a plate at the stack’s top.
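- pop: remove the top plate.
The first thing introduced is the abstract data structure used most in the virtual machine: the stack (everyone knows this one inside out 😂). It has two operations, push and pop, which I won't explain here.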
B. Stack arithmetic
Apply a function f on the stack:
- Pops the arguments from the stack.
- Computes f on the arguments.
- Pushes the result onto the stack.
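This part mainly introduces how to do arithmetic with a stack, which I also won't go into, and then leads into the idea of abstraction versus implementation.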
Abstraction/Implementation:
- The high-level language is an abstraction.
- It can be implemented by a stack machine.
- The stack machine is an abstraction.
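- It can be implemented by … Stay tuned.
In short, the high-level language abstraction can be implemented with a stack, and the stack in turn can be implemented by lower-level means (a top-down approach to problem solving).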
C. The Stack machine model
Stack machine, manipulated by:
- Arithmetic/Logical commands
- Memory segment commands
- Branching commands
- Function commands
The whole stack machine model consists of these four parts, which correspond to the four kinds of commands in the VM language.
D. Arithmetic/Logical commands
Command | Expression | Return value |
---|---|---|
add | x + y | integer |
sub | x - y | integer |
neg | -y | integer |
eq | x == y | boolean |
gt | x > y | boolean |
lt | x < y | boolean |
and | x and y | boolean |
or | x or y | boolean |
not | not y | boolean |
Observation: Any arithmetic or logical expression can be expressed and evaluated by applying some sequence of the above on a stack.
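The table lists all the arithmetic and logical commands supported by the course's VM language. Any arithmetic or logical expression can be evaluated on a stack by applying these nine basic commands in a suitable order.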
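To make the observation concrete, here is a minimal Python sketch of a stack evaluator for the nine commands. The function name `run` is my own, only `push constant n` is modelled for getting values onto the stack, and 16-bit overflow is ignored. It follows the Hack VM convention that true is -1 and false is 0.

```python
def run(commands):
    """Evaluate VM-style commands on a Python list used as the stack."""
    TRUE, FALSE = -1, 0  # Hack VM convention: true is -1, false is 0
    binary = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "eq": lambda x, y: TRUE if x == y else FALSE,
        "gt": lambda x, y: TRUE if x > y else FALSE,
        "lt": lambda x, y: TRUE if x < y else FALSE,
        "and": lambda x, y: x & y,
        "or": lambda x, y: x | y,
    }
    unary = {"neg": lambda y: -y, "not": lambda y: ~y}
    stack = []
    for command in commands:
        parts = command.split()
        if parts[0] == "push":            # assumption: only "push constant n"
            stack.append(int(parts[2]))
        elif parts[0] in binary:
            y = stack.pop()               # y is the topmost value
            x = stack.pop()               # x sits just below it
            stack.append(binary[parts[0]](x, y))
        elif parts[0] in unary:
            stack.append(unary[parts[0]](stack.pop()))
        else:
            raise ValueError(f"unsupported command: {command}")
    return stack


# (7 + 8) - 5 written as a command sequence
result = run(["push constant 7", "push constant 8", "add",
              "push constant 5", "sub"])
```

Note the operand order: `y` is popped first, so `sub` computes `x - y` with `x` being the value pushed earlier.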
Unit 7.3 VM Abstraction: Memory Segments
A. Variable kinds
a. Argument variables <-(mapping)-> argument segments
b. Local variables <-(mapping)-> local segments
c. Static variables <-(mapping)-> static segments
There are three kinds of variables, each mapped onto its own virtual memory segment.
B. Memory segments
8 segments: argument/local/static/constant/this/that/pointer/temp
The VM language syntax: push/pop segment i
❕ Attention: i is a non-negative integer.
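The VM language has two memory access commands, push and pop, with the syntax shown above. Also, all eight memory segments are virtual; they do not physically exist.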
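As a rough mental model (not how the real implementation works), the abstraction can be sketched in Python as one stack plus a set of named segments. The class name `VMMemory` and its layout are my own invention for illustration; unset segment entries are assumed to read as 0.

```python
class VMMemory:
    """Toy model of the VM abstraction: one stack plus named virtual segments."""

    SEGMENTS = ("argument", "local", "static", "this", "that", "pointer", "temp")

    def __init__(self):
        self.stack = []
        self.segments = {name: {} for name in self.SEGMENTS}

    def push(self, segment, i):
        """push segment i: copy segment[i] onto the top of the stack."""
        if i < 0:
            raise ValueError("i must be a non-negative integer")
        if segment == "constant":
            self.stack.append(i)          # the constant segment just yields i
        else:
            self.stack.append(self.segments[segment].get(i, 0))

    def pop(self, segment, i):
        """pop segment i: remove the top of the stack and store it in segment[i]."""
        if i < 0:
            raise ValueError("i must be a non-negative integer")
        if segment == "constant":
            raise ValueError("pop constant is not a valid command")
        self.segments[segment][i] = self.stack.pop()


memory = VMMemory()
memory.push("constant", 17)   # stack: [17]
memory.pop("local", 2)        # local[2] = 17, stack: []
memory.push("local", 2)       # stack: [17]
```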
Unit 7.4 VM Implementation: the Stack
A. The VM abstraction
As we mentioned earlier, the 8 segments that we have seen are completely imaginary.
B. Pointer manipulation
For example:
D = *p // D becomes 23
can be translated to:
@p
A=M
D=M
This part mainly covers the concept of pointers and their basic operations. With some background in C, it is fairly straightforward.
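For anyone without the C background, the three assembly instructions can be traced in Python by treating the RAM as a list. The addresses 3 and 10 are made up for this illustration; only the value 23 comes from the example above.

```python
ram = [0] * 32   # a tiny stand-in for the Hack RAM
p = 3            # hypothetical address where the variable p lives
ram[p] = 10      # p holds an address: it points at RAM[10]
ram[10] = 23     # the cell p points at holds 23

a = p            # @p   : A = address of p
a = ram[a]       # A=M  : A = value stored in p, i.e. the target address
d = ram[a]       # D=M  : D = *p
```

The point is the double lookup: the first `M` reads the pointer itself, the second reads what it points at.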
C. Stack machine
Assumptions:
- SP stored in RAM[0]
- Stack base addr = 256
Under these assumptions, on the Hack platform SP is the stack pointer, stored in RAM[0] (physical address 0), and the stack starts at address 256 by default.
So push constant 17 can be expressed in logical pseudocode:
*SP = 17
SP++
Further, translated to Hack assembly:
@17 // D = 17
D=A
@SP // *SP = D
A=M
M=D
@SP // SP++
M=M+1
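This part mainly explains how to translate a push command into the assembly language learned earlier. Once this is understood, the same approach generalizes to all the arithmetic commands.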
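In a translator, the seven assembly lines above become a template with the constant filled in. A minimal Python sketch, with `push_constant` being a name I made up rather than anything from the course API:

```python
def push_constant(value):
    """Return Hack assembly lines implementing 'push constant value'."""
    if value < 0:
        raise ValueError("constants must be non-negative")
    return [
        f"@{value}",  # D = value
        "D=A",
        "@SP",        # *SP = D
        "A=M",
        "M=D",
        "@SP",        # SP++
        "M=M+1",
    ]


assembly = "\n".join(push_constant(17))
```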
D. VM translator perspective
VM code is translated into assembly code by the VM translator.
So, what is the VM translator?
a. A program that translates VM code into machine language.
b. Each VM command generates several assembly commands.
Unit 7.5 VM Implementation: Memory Segments
A. Implementing local
LCL is a predefined pointer like SP (it lives in RAM[1]), and the value it holds is the base address of the local segment.
Our goals:
pop local i   -- VM Translator --> addr = LCL + i, SP--, *addr = *SP
push local i  -- VM Translator --> addr = LCL + i, *SP = *addr, SP++
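This part asks us to think about how to turn pop/push commands into assembly code. The block above is the pseudocode given by the instructor; it can be used as a guide for translating by hand and then carrying the result over into the project code. That is roughly the process. The actual code is all in the project, so I won't expand on it here.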
B. Implement local, argument, this, that
When translating the high-level code of some method into VM code, the compiler:
a. maps the method’s local and argument variables onto the local and argument segments
b. maps the object fields and the array entries that the method is currently processing onto the this and that segments
So local, argument, this and that are implemented in precisely the same way.
In other words, whatever works for local can be reused as-is for argument, this and that, with only the base pointer changing.
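Since the four segments differ only in their base pointer, one pair of templates can cover them all. The following Python sketch is one possible hand translation of the pseudocode, not the course's reference solution. The function names are mine, and using R13 as scratch space for the target address is my own choice (R13 to R15 are the general-purpose registers left free for the translator).

```python
# Predefined Hack symbols that hold each segment's base address
BASE_SYMBOL = {"local": "LCL", "argument": "ARG", "this": "THIS", "that": "THAT"}


def push_segment(segment, i):
    """push segment i: addr = base + i, *SP = *addr, SP++"""
    base = BASE_SYMBOL[segment]
    return [
        f"@{i}", "D=A", f"@{base}", "A=D+M", "D=M",  # D = *(base + i)
        "@SP", "A=M", "M=D",                         # *SP = D
        "@SP", "M=M+1",                              # SP++
    ]


def pop_segment(segment, i):
    """pop segment i: addr = base + i, SP--, *addr = *SP"""
    base = BASE_SYMBOL[segment]
    return [
        f"@{i}", "D=A", f"@{base}", "D=D+M",  # D = base + i
        "@R13", "M=D",                        # R13 = addr (scratch register)
        "@SP", "AM=M-1", "D=M",               # SP--, D = *SP
        "@R13", "A=M", "M=D",                 # *addr = D
    ]


pop_local_2 = pop_segment("local", 2)
push_that_5 = push_segment("that", 5)
```

The pop side needs the scratch register because D cannot hold both the target address and the popped value at once.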
C. Memory segment: constant
When translating the high-level code of some method into VM code, the compiler:
c. translates high-level operations involving constants into VM operations involving the constant segment.
push constant i ---> *SP = i, SP++
The constant segment supports only the push command; there is no pop.
D. Memory segment: static
When translating the high-level code of some method into VM code, the compiler:
d. maps the static variables that the method sees onto the static segment.
The challenge: static variables should be seen by all the methods in a program.
Solution: Store them in some “global space”.
- Have the VM translator translate each VM reference static i (in file Foo.vm) into an assembly reference Foo.i.
- Following assembly, the Hack assembler will map these references onto RAM[16], RAM[17], …, RAM[255].
- Therefore, the entries of the static segment will end up being mapped onto RAM[16], RAM[17], …, RAM[255], in the order in which they appear in the program.
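This part explains the implementation of the static memory segment, which stores static variables. In effect, the virtual static segment is mapped onto the physical address range 16-255.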
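The Foo.i naming trick is easy to sketch in Python. The helper names below are hypothetical; the translator only has to emit the symbol, and the Hack assembler takes care of assigning the actual RAM address from 16 upward.

```python
from pathlib import PurePath


def static_symbol(vm_path, i):
    """Map 'static i' in file Foo.vm to the assembly symbol Foo.i."""
    return f"{PurePath(vm_path).stem}.{i}"


def push_static(vm_path, i):
    """push static i: *SP = Foo.i, SP++"""
    return [f"@{static_symbol(vm_path, i)}", "D=M",
            "@SP", "A=M", "M=D", "@SP", "M=M+1"]


def pop_static(vm_path, i):
    """pop static i: SP--, Foo.i = *SP"""
    return ["@SP", "AM=M-1", "D=M",
            f"@{static_symbol(vm_path, i)}", "M=D"]


symbol = static_symbol("Foo.vm", 3)
```

Because the file name is part of the symbol, static 3 in Foo.vm and static 3 in another file end up as different variables.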
E. Memory segment: temp
When translating the high-level code of some method into VM code, the compiler:
e. Sometimes needs to use some variables for temporary storage.
f. Our VM provides 8 such temporary variables
temp implementation: mapped onto RAM locations 5 to 12.
pop temp i   -- VM Translator --> addr = 5 + i, SP--, *addr = *SP
push temp i  -- VM Translator --> addr = 5 + i, *SP = *addr, SP++
The temp segment has eight entries in total, mapped onto addresses 5-12.
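Because the base address 5 is a fixed number rather than a pointer stored in RAM, the target address can be computed at translation time, which makes the emitted code shorter than for local. A small Python sketch under that reading, with names of my own choosing:

```python
TEMP_BASE = 5   # temp 0 lives at RAM[5]
TEMP_SIZE = 8   # temp 0..7 map onto RAM[5]..RAM[12]


def temp_address(i):
    """Resolve 'temp i' to its fixed RAM address at translation time."""
    if not 0 <= i < TEMP_SIZE:
        raise ValueError("temp index must be in 0..7")
    return TEMP_BASE + i


def push_temp(i):
    """push temp i: *SP = RAM[5 + i], SP++"""
    return [f"@{temp_address(i)}", "D=M", "@SP", "A=M", "M=D", "@SP", "M=M+1"]


def pop_temp(i):
    """pop temp i: SP--, RAM[5 + i] = *SP"""
    return ["@SP", "AM=M-1", "D=M", f"@{temp_address(i)}", "M=D"]


last_temp = temp_address(7)
```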
F. Memory segment: pointer
When translating the high-level code of some method into VM code, the compiler:
g. generates code that keeps track of the base addresses of the this and that segments using the pointer segment.
A fixed, 2-place segment:
- accessing pointer 0 should result in accessing THIS
- accessing pointer 1 should result in accessing THAT
push pointer 0/1 -- VM Translator --> *SP = THIS/THAT, SP++
THIS holds the base address of the current object, and THAT holds the base address of the current array.
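What makes pointer interesting is that writing to it moves where the that (or this) segment lives. Here is a tiny Python simulation of that effect, with the RAM as a list. The helper names and the values 3040 and 46 are arbitrary choices of mine; THIS and THAT sitting in RAM[3] and RAM[4] is the standard Hack mapping.

```python
ram = [0] * 4096
SP, THIS, THAT = 0, 3, 4   # RAM[0] holds SP, RAM[3] THIS, RAM[4] THAT
ram[SP] = 256              # stack base address


def push_constant(value):
    ram[ram[SP]] = value            # *SP = value
    ram[SP] += 1                    # SP++


def pop_pointer(i):
    ram[SP] -= 1                    # SP--
    ram[(THIS, THAT)[i]] = ram[ram[SP]]   # THIS/THAT = *SP


def pop_that(i):
    ram[SP] -= 1                    # SP--
    ram[ram[THAT] + i] = ram[ram[SP]]     # *(THAT + i) = *SP


push_constant(3040)
pop_pointer(1)      # THAT now points at RAM[3040]
push_constant(46)
pop_that(2)         # stores 46 at RAM[3040 + 2]
```

After the four commands the stack is empty again and the value has landed at RAM[3042], purely because of what was written through pointer 1.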
Summary
In this chapter, we need to implement:
- Arithmetic/Logical commands: add, sub, neg, eq, gt, lt, and, or, not.
- Memory access commands: pop/push segment i.
Unit 7.6 The VM Emulator
Typical uses of the VM Emulator:
- Running (compiled) Jack programs
- Testing programs (systematically)
- Experimenting with VM commands
- Observing the VM internals (stack, memory and segments)
This unit mainly introduces how to use the VM Emulator, which works much like the earlier tools.
Unit 7.7 VM Implementation on the Hack Platform
In order to write a VM translator, we must be familiar with:
- the source language
- the target language
- the VM mapping on the target platform
A. Source: VM language
Arithmetic/Logical commands: add, sub, neg, eq, gt, lt, and, or, not
Memory access commands
pop/push segment i
B. Target: symbolic Hack code
A instructions and C instructions.
C. Standard VM mapping on the Hack Platform
VM mapping decisions:
- How to map the VM’s data structures using the host hardware platform
- How to express the VM’s commands using the host machine language
Standard mapping:
- Specifies how to do the mapping in an agreed-upon way
- Benefits:
- Compatibility with other software systems
- Standard testing
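This unit mainly covers what is needed to build a VM translator. Three things are required:
- familiarity with the source language
- familiarity with the target language
- familiarity with the address mapping on the target platform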
Unit 7.8 VM Translator: Proposed Implementation
Proposed design:
- Parser: parses each VM command into its lexical elements
- CodeWriter: write the assembly code that implements the parsed command
- Main: drives the process(VMTranslator)
Main(VMTranslator):
Input: fileName.vm
Output: fileName.asm
Main logic:
- Constructs a Parser to handle the input file
- Constructs a CodeWriter to handle the output file
- Marches through the input file, parsing each line and generating code from it.
This unit mainly introduces the VM translator's functionality and design approach. For the detailed design, see the instructor's lectures or the provided course materials.
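The three-part design can be sketched as a small Python skeleton. This is only an outline under my own assumptions, not the API proposed in the lectures: the parser is a generator, the `CodeWriter` is a stub that records a comment where the real assembly would go, and all names here are made up for illustration.

```python
from pathlib import PurePath

ARITHMETIC = {"add", "sub", "neg", "eq", "gt", "lt", "and", "or", "not"}


def parse(lines):
    """Parser: yield (kind, arg1, arg2) per VM command, skipping blanks and comments."""
    for line in lines:
        code = line.split("//", 1)[0].strip()   # drop comments and whitespace
        if not code:
            continue
        parts = code.split()
        if parts[0] in ARITHMETIC and len(parts) == 1:
            yield ("arithmetic", parts[0], None)
        elif parts[0] in ("push", "pop") and len(parts) == 3:
            yield (parts[0], parts[1], int(parts[2]))
        else:
            raise ValueError(f"unsupported VM command: {code}")


class CodeWriter:
    """CodeWriter stub: collects output lines; real assembly is left out here."""

    def __init__(self):
        self.lines = []

    def write(self, kind, arg1, arg2):
        text = arg1 if kind == "arithmetic" else f"{kind} {arg1} {arg2}"
        self.lines.append(f"// {text}")   # a real translator emits Hack assembly here


def output_name(vm_path):
    """Main: fileName.vm becomes fileName.asm."""
    return str(PurePath(vm_path).with_suffix(".asm"))


def translate(vm_lines):
    """Main: march through the input, parse each command, hand it to the writer."""
    writer = CodeWriter()
    for kind, arg1, arg2 in parse(vm_lines):
        writer.write(kind, arg1, arg2)
    return writer.lines


listing = translate(["// demo", "push constant 7 // seven", "", "add"])
```

Emitting the original VM command as a comment before each block of assembly is also handy for debugging the generated file later.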
Unit 7.9 Building the VM Translator, Part I
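This unit mainly covers the tasks for this chapter and how to test them. Once the VM translator is done, translate the .vm file in each directory under Project7 into an .asm file, then load the translated .asm in the CPUEmulator together with the .tst script. After it runs, compare the generated .out file against the .cmp file.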
Unit 7.10 Perspectives
- You mentioned that the VM is a rather old idea. How old is it?
- 1970s - 1980s
- How close is our VM to Java’s JVM?
- Both languages are stack-based, both have push and pop commands, and both access memory through virtual memory segments instead of symbolic variables.
- About efficiency
In reality, developers of VM translators work very hard to generate low-level code that is as tight and efficient as possible. And that's something that, up until now in nand2tetris part 2, we have completely ignored.
Also, the stack architecture that we use so carefully in this course is not really a necessary ingredient of a virtual machine.
Unit 7.11 Project
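As mentioned earlier, the task for this chapter is to implement a simplified VM translator that only needs to support the nine arithmetic/logical commands and the push/pop commands. Following the instructor's design, the VM translator consists of two parts, a Parser and a CodeWriter. I won't repeat what each one does here; just look at the code.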
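While reading the course textbook, I noticed that it requires the VM translator to handle both a single .vm file and all the .vm files in a directory, saving the result as a single .asm file, so the file above already handles file name and directory parsing on Windows. However, after looking at what this chapter actually needs to test, I found they are all single files, so I simply skipped the multi-file code for now. A little laziness~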
Before writing the CodeWriter module, I recommend first writing out the VM-to-Hack-assembly translations by hand, then merging them into the actual project code.
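As an example of what such hand translations might look like once moved into code, here is a Python sketch for add and eq. These are my own translations rather than the project's actual code, and the function and label names are invented. The comparison commands need a jump, so each call generates a fresh label to keep labels unique across the output file.

```python
import itertools

_label_ids = itertools.count()   # source of unique label suffixes


def translate_add():
    """add: pop y, then replace x (the new top) with x + y."""
    return [
        "@SP", "AM=M-1", "D=M",   # SP--, D = y
        "A=A-1",                  # A = address of x
        "M=D+M",                  # x = x + y
    ]


def translate_eq():
    """eq: pop y, then replace x with -1 (true) if x == y, else 0 (false)."""
    label = f"EQ_END_{next(_label_ids)}"
    return [
        "@SP", "AM=M-1", "D=M",   # SP--, D = y
        "A=A-1", "D=M-D",         # D = x - y
        "M=-1",                   # tentatively store true
        f"@{label}", "D;JEQ",     # if x - y == 0, keep true and skip ahead
        "@SP", "A=M-1", "M=0",    # otherwise overwrite with false
        f"({label})",
    ]


first_eq = translate_eq()
second_eq = translate_eq()
```

gt and lt follow the same shape with `D;JGT` or `D;JLT` in place of `D;JEQ`, while sub, and and or only change the last line of the add template.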
Summary
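To wrap up, this chapter implemented a VM language translator that translates the Hack platform's VM language into Hack assembly language. It currently supports only the arithmetic commands and push/pop, so it is only half finished; the next chapter fills in the remaining features.
When working on this project, I suggest starting from the .vm files the instructor provides: first run the VM code in the VMEmulator to see its actual behavior, then hand-translate the matching assembly based on that behavior, and finally merge it into the project.
Also, the designs for some commands are really quite similar, so you can just follow the same pattern. That said, if you're not familiar with the course's assembly language, it will probably still be somewhat hard.
Alright, that's it for now.
PS: Mixing Chinese and English while writing these notes felt a bit odd; it kept breaking my train of thought 🤔...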