Nand2Tetris_Part1_06

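It has been several years, and I am finally about to finish Part 1 🤣...

The final week's content is about the assembler.
I took notes on an iPad during this week's lectures, so each section below uses those notes directly, with a few additions or changes where needed.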

Unit 6.1 Assembly Languages and Assemblers

Week Target: use an assembler to translate assembly language into binary code.
Basic Concept:

  1. The “Assembler” is software
  2. First software layer above the hardware

Basic Logic:

  1. Read the next Assembly language command
  2. Break it into the different fields it is composed of
  3. Lookup the binary code for each field
  4. Combine these codes into a single machine language command
  5. Output this machine language command
example
1. Load  R1, 18 <- char array == strings
👇
2. load R1 18
v s t <- 3 strings
👇
3. Command Table <- use a table to translate the command into machine language directly
Load 11001
R1 01
18 000010010
... ...
👇
4. Combination; sometimes extra bits need to be added.
Load + R1 + 18 = 1100101000010010
👇
5. Output
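
The five steps above can be sketched in C++ roughly as follows. This only illustrates the toy `Load R1, 18` example, not the Hack language. The function name `translateToy`, the 9-bit operand width, and the two tiny lookup tables are assumptions derived from the bit patterns in the example.

```cpp
#include <bitset>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: translates one toy command such as "Load R1, 18".
// The tables mirror the example above; a real instruction set has more entries.
std::string translateToy(const std::string& line) {
    static const std::map<std::string, std::string> opcodes = {{"Load", "11001"}};
    static const std::map<std::string, std::string> registers = {{"R1", "01"}};

    // Steps 1-2: read the command and break it into its fields.
    std::string cleaned = line;
    for (char& c : cleaned) {
        if (c == ',') c = ' ';  // treat the comma as a plain separator
    }
    std::istringstream fields(cleaned);
    std::string op, reg;
    unsigned value = 0;
    fields >> op >> reg >> value;

    // Step 3: look up the binary code of each field.
    // Step 4: combine the codes into a single 16-bit machine command.
    // (.at() throws std::out_of_range for unknown mnemonics.)
    return opcodes.at(op) + registers.at(reg) + std::bitset<9>(value).to_string();
}
```

Step 5 (output) is left to the caller, for example by printing the returned string on its own line.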

About symbols: the assembler must replace names with addresses, using a symbol-to-address table that works like the command table.

Unit 6.2 The Hack Assembly Language: A Translator’s Perspective

Hack language specification:

  1. A-instruction
    Symbolic syntax: @value
    Binary syntax: 0valueInBinary
    Example: @21 == 0000000000010101
  2. C-instruction
    Symbolic syntax: dest=comp;jump
    Binary syntax: 111 ac1c2c3c4c5c6 d1d2d3 j1j2j3
  3. Pre-defined symbols
    R0, R1, R2...(refer to the book for more information)
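
As a minimal sketch of the A-instruction rule, assuming the operand is a decimal constant (symbols are handled later), the encoding could look like this. The name `encodeAInstruction` is made up for illustration.

```cpp
#include <bitset>
#include <string>

// Hypothetical helper: encodes "@value" (decimal constant only) as a
// 16-bit A-instruction: a leading 0 followed by the 15-bit value.
std::string encodeAInstruction(const std::string& instruction) {
    // Skip the leading '@' and parse the remaining decimal digits.
    unsigned long value = std::stoul(instruction.substr(1));
    // std::bitset<15> keeps only the low 15 bits of the value.
    return "0" + std::bitset<15>(value).to_string();
}
```

With this sketch, `@21` yields `0000000000010101`, matching the example above.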

Assembly program elements:

  • White space(Ignore it, we don’t need to care ✅)
    • Empty lines / indentation
    • Line comments
    • In-line comments
  • Instructions
    • A-instructions
    • C-instructions
  • Symbols (⭐⭐⭐ Challenges)
    • References
    • Label declaration
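
Ignoring white space and comments can be done before any real parsing. The helper below is a sketch under the assumption that a Hack instruction never needs internal white space, so all of it can be dropped. `cleanLine` is a hypothetical name.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: removes a trailing "//" comment and all white space.
// Returns an empty string for lines that carry no instruction.
std::string cleanLine(std::string line) {
    std::string::size_type comment = line.find("//");
    if (comment != std::string::npos) {
        line.erase(comment);  // drop the comment up to the end of the line
    }
    line.erase(std::remove_if(line.begin(), line.end(),
                              [](unsigned char c) { return std::isspace(c) != 0; }),
               line.end());
    return line;
}
```

A caller would skip every line for which the result is empty and pass the rest on to the instruction and symbol handling.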

The plan ahead:

  1. Develop a basic assembler that translates symbol-less Hack programs (next unit)
  2. Develop the ability to handle symbols (the unit after next)
  3. Morph the basic assembler into an assembler that translates any Hack program (the unit after that)

Unit 6.3 The Assembly Process: Handling Instructions

  1. Translating A-instructions
Two situations:
1. If value is a decimal constant, generate the equivalent 15-bit binary constant
2. If value is a symbol, it is handled later (Unit 6.4)
  2. Translating C-instructions
example
MD=D+1
👇
dst = MD, comp = D+1, jmp = null
👇
dst = 011, comp = 0011111, jmp = 000, C-flag = 111
👇
C-flag + comp + dst + jmp
111 0011111 011 000
  3. The overall assembly logic

Assembly_Logic_1
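
The C-instruction translation from this unit can be sketched with three lookup tables. This is an illustrative version, not the project code shown later. It assumes white space and comments are already removed, and the name `encodeCInstruction` is hypothetical. The bit patterns are the ones from the Hack specification.

```cpp
#include <map>
#include <string>

// Hypothetical helper: translates a symbol-free C-instruction such as
// "MD=D+1" or "D;JGT" into its 16-bit binary form.
std::string encodeCInstruction(const std::string& instruction) {
    static const std::map<std::string, std::string> compTable = {
        {"0", "0101010"},   {"1", "0111111"},   {"-1", "0111010"},
        {"D", "0001100"},   {"A", "0110000"},   {"M", "1110000"},
        {"!D", "0001101"},  {"!A", "0110001"},  {"!M", "1110001"},
        {"-D", "0001111"},  {"-A", "0110011"},  {"-M", "1110011"},
        {"D+1", "0011111"}, {"A+1", "0110111"}, {"M+1", "1110111"},
        {"D-1", "0001110"}, {"A-1", "0110010"}, {"M-1", "1110010"},
        {"D+A", "0000010"}, {"D+M", "1000010"},
        {"D-A", "0010011"}, {"D-M", "1010011"},
        {"A-D", "0000111"}, {"M-D", "1000111"},
        {"D&A", "0000000"}, {"D&M", "1000000"},
        {"D|A", "0010101"}, {"D|M", "1010101"}};
    static const std::map<std::string, std::string> destTable = {
        {"", "000"},  {"M", "001"},  {"D", "010"},  {"MD", "011"},
        {"A", "100"}, {"AM", "101"}, {"AD", "110"}, {"AMD", "111"}};
    static const std::map<std::string, std::string> jumpTable = {
        {"", "000"},    {"JGT", "001"}, {"JEQ", "010"}, {"JGE", "011"},
        {"JLT", "100"}, {"JNE", "101"}, {"JLE", "110"}, {"JMP", "111"}};

    // Split "dest=comp;jump"; dest and jump are both optional.
    std::string dest, comp = instruction, jump;
    std::string::size_type eq = comp.find('=');
    if (eq != std::string::npos) {
        dest = comp.substr(0, eq);
        comp = comp.substr(eq + 1);
    }
    std::string::size_type semi = comp.find(';');
    if (semi != std::string::npos) {
        jump = comp.substr(semi + 1);
        comp = comp.substr(0, semi);
    }
    // .at() throws std::out_of_range for an unknown mnemonic.
    return "111" + compTable.at(comp) + destTable.at(dest) + jumpTable.at(jump);
}
```

For `MD=D+1` this sketch returns `1110011111011000`: the prefix 111, then comp 0011111, dest 011 and jump 000.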

Unit 6.4 The Assembly Process: Handling Symbols

Symbols: Pre-defined symbols, Label symbols, Variable symbols.

Pre-defined symbols

Pre-defined symbols represent special memory locations.

Pre-defined_symbols

@preDefinedSymbol translates to @value, which is an ordinary A-instruction.

Label symbols

Label symbols represent destinations of goto instructions.
Three points:

  1. Used to label destinations of goto commands
  2. Declared by the pseudo-command (xxx), which generates no machine code; it is only recorded in the symbol table.
  3. This directive defines the symbol xxx to refer to the memory location holding the next instruction in the program, like this:
example
Symbol   Value
LOOP 4
STOP 18
END 22

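So, to translate a label symbol:
@labelSymbol = @value, where the value is the address (instruction number) of the instruction that follows the label declaration. Label declarations, comments and empty lines are not counted.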
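
A minimal sketch of how label addresses could be collected, assuming the input lines are already stripped of comments and white space. `collectLabels` is a hypothetical name, and a plain `std::map` stands in for the symbol table.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical first pass: maps every "(LABEL)" to the address of the
// instruction that follows it.
std::map<std::string, std::uint16_t> collectLabels(const std::vector<std::string>& lines) {
    std::map<std::string, std::uint16_t> labels;
    std::uint16_t address = 0;  // ROM address of the next real instruction
    for (const std::string& line : lines) {
        if (line.size() >= 2 && line.front() == '(' && line.back() == ')') {
            // A declaration takes no address; it names the next instruction.
            labels[line.substr(1, line.size() - 2)] = address;
        } else if (!line.empty()) {
            ++address;  // only A- and C-instructions occupy an address
        }
    }
    return labels;
}
```

The counter advances only on real instructions, which is why a label's value generally differs from its line number in the source file.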

Variable symbols

Variable symbols represent memory locations where the programmer wants to keep values. They are a little harder to translate.

First of all, any symbol xxx appearing in an assembly program which is
not pre-defined and is not defined elsewhere using the (xxx) directive is treated as a variable.

Secondly, each variable is assigned a unique memory address, starting at 16 (not an arbitrary number 👀: addresses 0 to 15 are already taken by R0 to R15).

So, translate variable symbols in two steps:

  1. If you see the symbol for the first time, assign it a unique memory address.
  2. Replace the variable symbol with its value, which is looked up in a symbol table.

Now, how do we get the Symbol Table?
3 steps:

  1. Initialization: Add the pre-defined symbols
  2. First pass: Add the label symbols
  3. Second pass: Add the variable symbols
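
The three stages above can be sketched with one small structure. This is an assumption-laden illustration rather than the project's `SymbolTable` class: `SymbolResolver`, `table`, `nextVariable` and `resolve` are made-up names, and labels are expected to be inserted into `table` by a first pass before `resolve` is called.

```cpp
#include <cctype>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical symbol table holding predefined symbols, labels and variables.
struct SymbolResolver {
    std::map<std::string, std::uint16_t> table;
    std::uint16_t nextVariable = 16;  // R0-R15 occupy addresses 0-15

    SymbolResolver() {
        // Initialization: add the predefined symbols.
        for (int i = 0; i < 16; ++i) {
            table["R" + std::to_string(i)] = static_cast<std::uint16_t>(i);
        }
        table["SP"] = 0;
        table["LCL"] = 1;
        table["ARG"] = 2;
        table["THIS"] = 3;
        table["THAT"] = 4;
        table["SCREEN"] = 16384;
        table["KBD"] = 24576;
    }

    // Second pass: returns the address for the operand of an A-instruction.
    // Decimal constants are parsed; unknown names become new variables.
    std::uint16_t resolve(const std::string& operand) {
        if (!operand.empty() && std::isdigit(static_cast<unsigned char>(operand[0]))) {
            return static_cast<std::uint16_t>(std::stoul(operand));
        }
        auto it = table.find(operand);
        if (it == table.end()) {
            it = table.insert(std::make_pair(operand, nextVariable++)).first;
        }
        return it->second;
    }
};
```

Because labels are already in the table when the second pass starts, any name still missing at that point can safely be treated as a variable.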

Unit 6.5 Developing a Hack Assembler: Proposed Software Architecture

Sub-tasks that need to be done:

Reading and parsing commands
Steps:
a. Start reading a file with a given name
b. Move to the next command in the file
c. Get the fields of current command
Tips: No need to understand the meaning of anything.

Converting mnemonics -> code
Use the string API of Java/C++/Python to parse the instructions.
Tips: No need to worry about how the mnemonic fields were obtained.

Handling symbols
Use the symbol table mentioned earlier to translate symbols.
Tips: No need to worry about what these symbols mean.

Overall logic:

  • Initialization: Parser and Symbol Table
  • First Pass: Read all commands, only paying attention to labels and updating the symbol table
  • Second Pass: Restart reading and translating commands
  • Main Loop:
    • Get the next Assembly Language Command and parse it
    • For A-Commands: Translate symbols to binary addresses
    • For C-Commands: get code for each part and put them together
    • Output the resulting machine language command

Unit 6.6 Project 6 Overview(Programming Option)

Proposed design:

  1. Parser
  2. Code
  3. Symbol Table
  4. Main

Proposed Implementation:

  1. Staged development
    • Develop a basic assembler that translates assembly programs without symbols
    • Develop the ability to handle symbols
    • Morph the basic assembler into an assembler that can translate any assembly program
  2. Two types of test programs are supplied: versions with symbols and symbol-free versions (file names ending in L)
Add.asm
Max.asm MaxL.asm
Rectangle.asm RectangleL.asm
Pong.asm PongL.asm

Unit 6.6 Project 6 Overview(without Programming)

Translate the *.asm programs manually ✍. This is a boring process,
so it is better to solve the problem by programming.

Unit 6.7 Project

As mentioned earlier, the assembler can be implemented in Java, C++ or Python. I used C++ for this project. I also had some template files for the project, so I only needed to fill in the methods of the provided classes.
🎈 Here are the files that have been implemented (they can handle A/C instructions and symbols):

Assembler.h
#ifndef ASSEMBLER_H
#define ASSEMBLER_H

#include <cstdint> // uint16_t
#include <string>
#include "SymbolTable.h"
using namespace std;

class Assembler {
public:
/** Instruction types */
enum InstructionType {
A_INSTRUCTION,
C_INSTRUCTION,
L_INSTRUCTION,
NULL_INSTRUCTION
};

/** C-instruction destinations */
enum InstructionDest {
A,
D,
M,
AM,
AD,
MD,
AMD,
NULL_DEST
};

/** C-instruction jump conditions */
enum InstructionJump {
JLT,
JGT,
JEQ,
JLE,
JGE,
JNE,
JMP,
NULL_JUMP
};

/** C-instruction computations/op-codes */
enum InstructionComp {
CONST_0,
CONST_1,
CONST_NEG_1,
VAL_A,
VAL_M,
VAL_D,
NOT_A,
NOT_M,
NOT_D,
NEG_A,
NEG_M,
NEG_D,
A_ADD_1,
M_ADD_1,
D_ADD_1,
A_SUB_1,
M_SUB_1,
D_SUB_1,
D_ADD_A,
D_ADD_M,
D_SUB_A,
D_SUB_M,
A_SUB_D,
M_SUB_D,
D_AND_A,
D_AND_M,
D_OR_A,
D_OR_M,
NULL_COMP
};

static uint16_t variableSymbolCount;

/** Practical Assignment 5 methods */
Assembler();
~Assembler();

void buildSymbolTable(SymbolTable* symbolTable, string instructions[], int numOfInst);
string generateMachineCode(SymbolTable* symbolTable, string instructions[], int numOfInst);

InstructionType parseInstructionType(string instruction);
InstructionDest parseInstructionDest(string instruction);
InstructionJump parseInstructionJump(string instruction);
InstructionComp parseInstructionComp(string instruction);
string parseSymbol(string instruction);

string translateDest(InstructionDest dest);
string translateJump(InstructionJump jump);
string translateComp(InstructionComp comp);
string translateSymbol(string symbol, SymbolTable* symbolTable);
};

#endif /* ASSEMBLER_H */
Assembler.cpp
#include "Assembler.h"
#include "SymbolTable.h"

#include <string>
#include <iostream>
#include <algorithm>

using namespace std;

uint16_t Assembler::variableSymbolCount = 16;

/**
* Assembler constructor
*/
Assembler::Assembler() {
// Your code here
}

/**
* Assembler destructor
*/
Assembler::~Assembler() {
// Your code here
}

/**
* Assembler first pass; populates symbol table with label locations.
* @param instructions An array of the assembly language instructions.
* @param symbolTable The symbol table to populate.
*/
void Assembler::buildSymbolTable(SymbolTable* symbolTable, string instructions[], int numOfInst) {
// Your code here
uint16_t lineNumber = 0;
for(int i = 0; i < numOfInst; i++) {
if(instructions[i][0] == '(') {
string t = instructions[i].substr(1, instructions[i].size() - 2);
symbolTable->addSymbol(t, lineNumber);
} else {
lineNumber++;
}
}
}

/**
* Assembler second pass; Translates a set of instructions to machine code.
* @param instructions An array of the assembly language instructions to be converted to machine code.
* @param symbolTable The symbol table to reference/update.
* @return A string containing the generated machine code as lines of 16-bit binary instructions.
*/
string Assembler::generateMachineCode(SymbolTable* symbolTable, string instructions[], int numOfInst) {
// Your code here
// Only process A and C instructions
string ret = "";
for(int i = 0; i < numOfInst; i++) {
InstructionType t = parseInstructionType(instructions[i]);
if(t == A_INSTRUCTION || t == L_INSTRUCTION) {
ret = ret + '0' + translateSymbol(parseSymbol(instructions[i]), symbolTable) + '\n';
} else if(t == C_INSTRUCTION) {
ret += "111"; // C instructions prefix
ret = ret + translateComp(parseInstructionComp(instructions[i]));
ret = ret + translateDest(parseInstructionDest(instructions[i]));
ret = ret + translateJump(parseInstructionJump(instructions[i]));
ret += '\n'; // one 16-bit instruction per output line
}
}
return ret;
}

/**
* Parses the type of the provided instruction
* @param instruction The assembly language representation of an instruction.
* @return The type of the instruction (A_INSTRUCTION, C_INSTRUCTION, L_INSTRUCTION, NULL)
*/
Assembler::InstructionType Assembler::parseInstructionType(string instruction) {
// Your code here:
if(instruction[0] == '@' && ('0' <= instruction[1] && instruction[1] <= '9')) return A_INSTRUCTION;
else if(instruction[0] == '@' && !('0' <= instruction[1] && instruction[1] <= '9')) return L_INSTRUCTION;
else if(instruction[0] =='(') return NULL_INSTRUCTION;
else return C_INSTRUCTION;
}

/**
* Parses the destination of the provided C-instruction
* @param instruction The assembly language representation of a C-instruction.
* @return The destination of the instruction (A, D, M, AM, AD, MD, AMD, NULL)
*/
Assembler::InstructionDest Assembler::parseInstructionDest(string instruction) {
// Your code here:
InstructionDest ret = NULL_DEST;
string::size_type idx = instruction.find("=");
if(idx == string::npos) return ret;
string dest = instruction.substr(0, idx);
if(dest == "A") ret = A;
else if(dest == "D") ret = D;
else if(dest == "M") ret = M;
else if(dest == "AM") ret = AM;
else if(dest == "AD") ret = AD;
else if(dest == "MD") ret = MD;
else if(dest == "AMD") ret = AMD;
return ret;
}

/**
* Parses the jump condition of the provided C-instruction
* @param instruction The assembly language representation of a C-instruction.
* @return The jump condition for the instruction (JLT, JGT, JEQ, JLE, JGE, JNE, JMP, NULL)
*/
Assembler::InstructionJump Assembler::parseInstructionJump(string instruction) {
// Your code here:
// for example if "JLT" appears in the jump field, return the enum label JLT
if (instruction.find("JLT") != string::npos) {
return JLT;
} else if(instruction.find("JGT") != string::npos) {
return JGT;
} else if(instruction.find("JEQ") != string::npos) {
return JEQ;
} else if(instruction.find("JLE") != string::npos) {
return JLE;
} else if(instruction.find("JGE") != string::npos) {
return JGE;
} else if(instruction.find("JNE") != string::npos) {
return JNE;
} else if(instruction.find("JMP") != string::npos) {
return JMP;
}
return NULL_JUMP;
}

/**
* Parses the computation/op-code of the provided C-instruction
* @param instruction The assembly language representation of a C-instruction.
* @return The computation/op-code of the instruction (CONST_0, ... ,D_ADD_A , ... , NULL)
*/
Assembler::InstructionComp Assembler::parseInstructionComp(string instruction) {
// Your code here:
// for example if "0" appears in the comp field, return CONST_0
InstructionComp ret = NULL_COMP; // fallback for an unrecognised comp field
string::size_type idx1 = instruction.find("=");
string::size_type idx2 = instruction.find(";");
string comp;
if(idx1 != string::npos && idx2 != string::npos) {
// substr takes (start, length), not (start, end)
comp = instruction.substr(idx1 + 1, idx2 - idx1 - 1);
} else if(idx1 == string::npos && idx2 != string::npos) {
comp = instruction.substr(0, idx2);
} else if(idx1 != string::npos && idx2 == string::npos) {
comp = instruction.substr(idx1 + 1);
} else {
comp = instruction;
}
if ("0" == comp) {
ret = CONST_0;
} else if("1" == comp) {
ret = CONST_1;
} else if("-1" == comp) {
ret = CONST_NEG_1;
} else if("A" == comp) {
ret = VAL_A;
} else if("M" == comp) {
ret = VAL_M;
} else if("D" == comp) {
ret = VAL_D;
} else if("!A" == comp) {
ret = NOT_A;
} else if("!M" == comp) {
ret = NOT_M;
} else if("!D" == comp) {
ret = NOT_D;
} else if("-A" == comp) {
ret = NEG_A;
} else if("-M" == comp) {
ret = NEG_M;
} else if("-D" == comp) {
ret = NEG_D;
} else if("A+1" == comp) {
ret = A_ADD_1;
} else if("M+1" == comp) {
ret = M_ADD_1;
} else if("D+1" == comp) {
ret = D_ADD_1;
} else if("A-1" == comp) {
ret = A_SUB_1;
} else if("M-1" == comp) {
ret = M_SUB_1;
} else if("D-1" == comp) {
ret = D_SUB_1;
} else if("D+A" == comp) {
ret = D_ADD_A;
} else if("D+M" == comp) {
ret = D_ADD_M;
} else if("D-A" == comp) {
ret = D_SUB_A;
} else if("D-M" == comp) {
ret = D_SUB_M;
} else if("A-D" == comp) {
ret = A_SUB_D;
} else if("M-D" == comp) {
ret = M_SUB_D;
} else if("D&A" == comp) {
ret = D_AND_A;
} else if("D&M" == comp) {
ret = D_AND_M;
} else if("D|A" == comp) {
ret = D_OR_A;
} else if("D|M" == comp) {
ret = D_OR_M;
}
return ret;
}

/**
* Parses the symbol of the provided A/L-instruction
* @param instruction The assembly language representation of a A/L-instruction.
* @return A string containing either a label name (L-instruction),
* a variable name (A-instruction), or a constant integer value (A-instruction)
*/
string Assembler::parseSymbol(string instruction) {
// Your code here:
return instruction.substr(1);
}

/**
* Generates the binary bits of the dest part of a C-instruction
* @param dest The destination of the instruction
* @return A string containing the 3 binary dest bits that correspond to the given dest value.
*/
string Assembler::translateDest(InstructionDest dest) {
// Your code here:
string ret;
switch(dest) {
case A: ret = "100"; break;
case D: ret = "010"; break;
case M: ret = "001"; break;
case AM: ret = "101"; break;
case AD: ret = "110"; break;
case MD: ret = "011"; break;
case AMD: ret = "111"; break;
case NULL_DEST: ret = "000"; break;
default: break;
}
return ret;
}

/**
* Generates the binary bits of the jump part of a C-instruction
* @param jump The jump condition for the instruction
* @return A string containing the 3 binary jump bits that correspond to the given jump value.
*/
string Assembler::translateJump(InstructionJump jump) {
// Your code here:
string ret;
switch(jump) {
case JLT: ret = "100"; break;
case JGT: ret = "001"; break;
case JEQ: ret = "010"; break;
case JLE: ret = "110"; break;
case JGE: ret = "011"; break;
case JNE: ret = "101"; break;
case JMP: ret = "111"; break;
case NULL_JUMP: ret = "000"; break;
default: break;
}
return ret;
}

/**
* Generates the binary bits of the computation/op-code part of a C-instruction
* @param comp The computation/op-code for the instruction
* @return A string containing the 7 binary computation/op-code bits that correspond to the given comp value.
*/
string Assembler::translateComp(InstructionComp comp) {
// Your code here:
string ret;
if (CONST_0 == comp) {
ret = "0101010";
} else if(CONST_1 == comp) {
ret = "0111111";
} else if(CONST_NEG_1 == comp) {
ret = "0111010";
} else if(VAL_A == comp) {
ret = "0110000";
} else if(VAL_M == comp) {
ret = "1110000";
} else if(VAL_D == comp) {
ret = "0001100";
} else if(NOT_A == comp) {
ret = "0110001";
} else if(NOT_M == comp) {
ret = "1110001";
} else if(NOT_D == comp) {
ret = "0001101";
} else if(NEG_A == comp) {
ret = "0110011";
} else if(NEG_M == comp) {
ret = "1110011";
} else if(NEG_D == comp) {
ret = "0001111";
} else if(A_ADD_1 == comp) {
ret = "0110111";
} else if(M_ADD_1 == comp) {
ret = "1110111";
} else if(D_ADD_1 == comp) {
ret = "0011111";
} else if(A_SUB_1 == comp) {
ret = "0110010";
} else if(M_SUB_1 == comp) {
ret = "1110010";
} else if(D_SUB_1 == comp) {
ret = "0001110";
} else if(D_ADD_A == comp) {
ret = "0000010";
} else if(D_ADD_M == comp) {
ret = "1000010";
} else if(D_SUB_A == comp) {
ret = "0010011";
} else if(D_SUB_M == comp) {
ret = "1010011";
} else if(A_SUB_D == comp) {
ret = "0000111";
} else if(M_SUB_D == comp) {
ret = "1000111";
} else if(D_AND_A == comp) {
ret = "0000000";
} else if(D_AND_M == comp) {
ret = "1000000";
} else if(D_OR_A == comp) {
ret = "0010101";
} else if(D_OR_M == comp) {
ret = "1010101";
}
return ret;
}

/**
* Generates the binary bits for an A-instruction, parsing the value, or looking up the symbol name.
* @param symbol A string containing either a label name, a variable name, or a constant integer value
* @param symbolTable The symbol table for looking up label/variable names
* @return A string containing the 15 binary bits that correspond to the given symbol.
*/
string Assembler::translateSymbol(string symbol, SymbolTable* symbolTable) {
// Your code here:
uint16_t n;
if(!('0' <= symbol[0] && symbol[0] <= '9')) {
if(symbolTable->getSymbol(symbol) == -1) {
n = variableSymbolCount;
symbolTable->addSymbol(symbol, variableSymbolCount);
variableSymbolCount++;
} else {
n = symbolTable->getSymbol(symbol);
}
} else {
n = stoi(symbol);
}
string binNum = "";
for(int i = 0; i < 15; i++) {
if(n & 1) binNum += '1';
else binNum += '0';
n >>= 1;
}
reverse(binNum.begin(), binNum.end());
return binNum;
}
SymbolTable.h
#ifndef SYMBOL_TABLE_H
#define SYMBOL_TABLE_H

#include <cstdint> // this contains uint16_t
#include <map> // indexable dictionary
#include <string> // process c++ string

using namespace std;

class SymbolTable {
public:
map<string, uint16_t> hashMap;
SymbolTable();
~SymbolTable();

void addSymbol(string symbol, uint16_t value);
int getSymbol(string symbol);
};

#endif /* SYMBOL_TABLE_H */
SymbolTable.cpp
#include "SymbolTable.h"

#include <string>

/**
* Symbol Table constructor
*/
SymbolTable::SymbolTable() {
// add pre-defined symbols
hashMap["R0"] = 0;
hashMap["R1"] = 1;
hashMap["R2"] = 2;
hashMap["R3"] = 3;
hashMap["R4"] = 4;
hashMap["R5"] = 5;
hashMap["R6"] = 6;
hashMap["R7"] = 7;
hashMap["R8"] = 8;
hashMap["R9"] = 9;
hashMap["R10"] = 10;
hashMap["R11"] = 11;
hashMap["R12"] = 12;
hashMap["R13"] = 13;
hashMap["R14"] = 14;
hashMap["R15"] = 15;
hashMap["SCREEN"] = 16384;
hashMap["KBD"] = 24576;
hashMap["SP"] = 0;
hashMap["LCL"] = 1;
hashMap["ARG"] = 2;
hashMap["THIS"] = 3;
hashMap["THAT"] = 4;
}

/**
* Symbol Table destructor
*/
SymbolTable::~SymbolTable() {}

/**
* Adds a symbol to the symbol table
* @param symbol The name of the symbol
* @param value The address for the symbol
*/
void SymbolTable::addSymbol(string symbol, uint16_t value) {
// Your code here
hashMap[symbol] = value;
}

/**
* Gets a symbol from the symbol table
* @param symbol The name of the symbol
* @return The address for the symbol or -1 if the symbol isn't in the table
*/
int SymbolTable::getSymbol(string symbol) {
// Your code here
if(hashMap.find(symbol) != hashMap.end()) return hashMap[symbol];
return -1;
}
Main.cpp
#include <algorithm> // std::remove
#include <fstream>
#include <iostream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

#include "Assembler.h"
#include "SymbolTable.h"

using namespace std;

int main(int argc, char** argv) {
if (argc > 1) {
SymbolTable symbolTable;
Assembler assembler;
vector<string> instructionList;

// Open file
fstream file;
string fname = argv[1];
file.open(argv[1], ios::in);
if (file.is_open()) {
// Read line-by-line
string line;
while (getline(file, line)) {
/* remove comments */
string::size_type idx = line.find("//"); // find start of "//"
string lineRmComm = line.substr(0, idx);
/* remove spaces, tabs and carriage returns */
string::iterator str_iter = remove_if(lineRmComm.begin(), lineRmComm.end(),
[](char c) { return c == ' ' || c == '\t' || c == '\r'; });
lineRmComm.erase(str_iter, lineRmComm.end());
if (lineRmComm.size() == 0) continue; // skip lines left empty after cleaning
instructionList.push_back(lineRmComm);
}
file.close();
}
// Get array of instructions (vector storage is contiguous, so no copy is needed)
string* instructions = instructionList.data();
int numOfInst = static_cast<int>(instructionList.size());

// First pass
assembler.buildSymbolTable(&symbolTable, instructions, numOfInst);
// Second pass
string code = assembler.generateMachineCode(&symbolTable, instructions, numOfInst);
// Print output
cout << code << endl;
}
}

Unit 6.8 Perspectives

Three questions this week:

  1. Can the symbolic Hack language be improved without changing the binary code, i.e. the machine language underlying the symbolic level?
  • There are two layers of expression here: the symbolic level and the binary code. We can make the symbolic level more programmer-friendly while leaving the binary untouched. For example, D=M[100] could be translated into @100 followed by D=M.
  2. Will I ever have to use an assembler outside school?
  • Vvvvvery rarely.
  3. How was the first assembler actually written?
  • At first, something simply had to be compiled by hand. You write the assembler in a symbolic language and, for the first time only, translate it by hand into the machine language of your computer. This translation is extremely time-consuming and annoying, but conceptually it only needs to be done once.
    After that you have machine code that runs your assembler, your compiler, or any other high-level tool you want.
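
The D=M[100] idea from the first question amounts to a small rewriting step in front of the normal assembler. The sketch below is hypothetical: `expandMemoryAccess` and the `dest=M[address]` syntax are not part of the Hack language; they only illustrate how such an extension could be expanded into standard instructions.

```cpp
#include <string>
#include <vector>

// Hypothetical macro expansion: rewrites "dest=M[address]" into the two
// standard Hack instructions "@address" and "dest=M".
// Any other line is passed through unchanged.
std::vector<std::string> expandMemoryAccess(const std::string& line) {
    std::string::size_type open = line.find("=M[");
    std::string::size_type close = line.rfind(']');
    if (open == std::string::npos || close == std::string::npos || close <= open + 3) {
        return {line};  // not the extended form (or an empty address)
    }
    std::string dest = line.substr(0, open);
    std::string address = line.substr(open + 3, close - (open + 3));
    return {"@" + address, dest + "=M"};
}
```

Since the output consists only of ordinary A- and C-instructions, the binary level stays exactly as it is.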