Programming Assumptions

3. FAQs

Q: Am I going to build a full-scale assembler?

A: No. This assignment builds a mini-assembler which, compared with a full-scale assembler, makes the following simplifications:

- It implements a subset of instructions. A number of real and synthetic instructions and storage directives are skipped, such as the floating-point instructions and the branch instructions with annul bits. See section 1.2 for the required instructions.
- For some instructions, it recognizes only a subset of their formats. See section 1.2 for the required formats.
- It supports the reserved section names (.text, .data, and .bss) but not user-defined section names.
- The relocation records it can generate are limited to four basic relocation types with no relocation addend. This means that, in most cases, an expression contains only immediate values. See section 2.3 for the four circumstances under which an expression contains a symbol.

Q: Do I need to check for syntactic errors in the assembly language program?

A: No. You should assume that all the instructions passed from the input interface are syntactically correct. Checking the assembly language program for syntactic errors is the responsibility of the input interface. For example, no mnemonic with an invalid format will be passed, and no directive with an invalid type of argument will be passed. The errors you are responsible for checking are illustrated by the testing programs.

Q: Do I need to handle different representations of integers?

A: No. The input interface recognizes the different integer representations that can appear in an assembly language instruction, such as octal (040), hexadecimal (0x500), and ASCII character codes ('a'), and translates them into the corresponding integer values before passing them to you.
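
To make the translation concrete, here is a minimal sketch of how such literals map to integer values. It is not the input interface's actual code: the function name parse_int_literal is hypothetical, and the sketch assumes single-character literals of the form 'a' with no escape sequences.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only: converts one integer
 * literal to its value. Returns 1 on success and 0 if the text is
 * not a literal this sketch recognizes. */
int parse_int_literal(const char *text, long *out)
{
    assert(text != NULL && out != NULL);

    /* Character literal such as 'a': the value is the character code. */
    if (text[0] == '\'' && text[1] != '\0' &&
        text[2] == '\'' && text[3] == '\0') {
        *out = (unsigned char)text[1];
        return 1;
    }

    /* Base 0 lets strtol pick the radix from the prefix: "0x" means
     * hexadecimal, a leading "0" means octal, anything else decimal. */
    char *end = NULL;
    long value = strtol(text, &end, 0);
    if (end == text || *end != '\0') {
        return 0; /* nothing parsed, or trailing junk after the number */
    }
    *out = value;
    return 1;
}
```

With this sketch, "040" yields 32, "0x500" yields 1280, and "'a'" yields 97 on an ASCII system. Your assembler only ever sees the resulting values.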

Q: Why is there no `()` in an expression?

A: The expression passed from the input interface does not contain any () sub-expressions. Instead, its tree structure expresses the operation precedence forced by ().
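
As an illustration of why a tree needs no parentheses, the sketch below evaluates a small expression tree bottom-up. The node layout (struct expr, enum expr_kind) and the function eval_expr are hypothetical and are not the types supplied by the input interface; the sketch also assumes the expression contains only immediate values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout, for illustration only. */
enum expr_kind { EXPR_VALUE, EXPR_ADD, EXPR_SUB, EXPR_MUL };

struct expr {
    enum expr_kind kind;
    long value;               /* used when kind == EXPR_VALUE */
    const struct expr *left;  /* operands of a binary operator */
    const struct expr *right;
};

/* Evaluates the tree bottom-up. The operands of a node are always
 * evaluated before the node itself, so the grouping is carried by
 * the shape of the tree rather than by parentheses. */
long eval_expr(const struct expr *e)
{
    assert(e != NULL);
    switch (e->kind) {
    case EXPR_VALUE:
        return e->value;
    case EXPR_ADD:
        return eval_expr(e->left) + eval_expr(e->right);
    case EXPR_SUB:
        return eval_expr(e->left) - eval_expr(e->right);
    case EXPR_MUL:
        return eval_expr(e->left) * eval_expr(e->right);
    }
    return 0; /* not reached for a well-formed tree */
}
```

In this representation, (2 + 3) * 4 is a multiplication node whose left child is the addition node, while 2 + 3 * 4 is an addition node whose right child is the multiplication node. The two trees evaluate to 20 and 14 respectively.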

Q: Why is the symbol table displayed by `elfdump` a bit different from my symbol table?

A: First, the symbol table of the object file might be longer than your symbol table. This is because, after the two-pass assembly, the output interface adds extra symbols for the names of the sections.
Second, the sequence number of each symbol in the symbol table of the object file might differ from that in your symbol table. This is because the output interface sorts the symbols and assigns new sequence numbers to them after the two-pass assembly. The same applies to the section indices: the output interface sorts the sections and assigns new section indices to them after the two-pass assembly.
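
Because of the reordering and the extra entries, the two tables are best compared by symbol name rather than by sequence number. The following is a minimal sketch of such a comparison. The record struct sym_entry and the functions find_symbol and tables_agree are hypothetical, and the sketch assumes that only a name and a value are compared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol record; real tables hold more fields. */
struct sym_entry {
    const char *name;
    unsigned long value;
};

/* Returns the index of `name` in `table`, or -1 if it is absent. */
int find_symbol(const struct sym_entry *table, size_t count, const char *name)
{
    assert(table != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* Returns 1 if every symbol in `mine` also appears in `dumped` with the
 * same value. Position is ignored, and extra entries in `dumped`
 * (such as section-name symbols) are allowed. */
int tables_agree(const struct sym_entry *mine, size_t mine_count,
                 const struct sym_entry *dumped, size_t dumped_count)
{
    for (size_t i = 0; i < mine_count; i++) {
        int j = find_symbol(dumped, dumped_count, mine[i].name);
        if (j < 0 || dumped[j].value != mine[i].value) {
            return 0;
        }
    }
    return 1;
}
```

A check of this kind passes even when the dumped table lists the same symbols in a different order and contains additional entries such as .text or .data.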

Q: Do I need to free memory?

A: You don't need to free the instruction list, the symbol table, or the sections. The output interface does that for you. However, you are responsible for freeing any other memory you allocate.
