X86lite Specification
Overview
X86lite is an abstract, 64-bit signed integer-only subset of the Intel X86-64 machine architecture. The X86lite instruction set is tiny by comparison to full X86, yet it still provides a sufficient compilation target for the COS320 course compiler projects.
This document explains the X86lite machine model and its instruction set, and is intended for use as a reference manual. Information about the full X86 architecture can be found on the Intel web pages. The COS320 course infrastructure provides OCaml interfaces for manipulating X86lite assembly programs and tools for creating X86-64 executables using the system assembler and linker.
X86lite Machine State
The X86lite machine state consists of sixteen general-purpose 64-bit registers, an instruction pointer register that can only be manipulated indirectly by control flow instructions, three condition flags, and a memory consisting of 2^64 bytes. Instructions are represented as 8-byte words using an unspecified fixed-length encoding.
Register file
The 16 64-bit registers in X86lite and their common uses in the full X86 architecture are given below. In X86lite, most of the registers can be used for general purpose calculation, but some X86lite instructions make special use of some of the registers; see the instruction descriptions below.
Register | Description / common use on X86 |
---|---|
`rax` | General purpose accumulator |
`rbx` | Base register, pointer to data |
`rcx` | Counter register for strings and loops |
`rdx` | Data register for I/O |
`rsi` | Pointer register, string source register |
`rdi` | Pointer register, string destination register |
`rbp` | Base pointer, points to the stack frame |
`rsp` | Stack pointer, points to the value at the top of the stack |
`r8` through `r15` | General purpose |
In addition, the instruction pointer register `rip` contains the address of the next instruction to execute. The address in `rip` is used to load the next instruction to execute, then `rip` is increased by the size of the instruction (always 8 bytes, since we use a fixed-length encoding), and then the instruction is executed.
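The fetch, increment, execute order described above can be sketched in Python. This is only an illustration: the `state` dictionary, the `program` mapping from addresses to Python callables, and the example addresses are hypothetical and are not part of the course infrastructure.

```python
INS_SIZE = 8  # every X86lite instruction occupies 8 bytes


def step(state, program):
    """Run one fetch / increment / execute cycle.

    `state` is a dict holding an integer 'rip'; `program` maps instruction
    addresses to callables that take the state (both are assumptions).
    """
    instruction = program[state["rip"]]                 # 1. fetch via rip
    state["rip"] = (state["rip"] + INS_SIZE) % 2**64    # 2. advance rip
    instruction(state)                                  # 3. execute (may overwrite rip)


def nop(state):
    pass


def jump_back(state):
    # A control-flow instruction is the only way rip changes "by hand".
    state["rip"] = 0x400000


program = {0x400000: nop, 0x400008: jump_back}
state = {"rip": 0x400000}
step(state, program)  # executes nop; rip is now 0x400008
step(state, program)  # executes jump_back; rip is reset to 0x400000
```

Because `rip` is advanced before the instruction runs, an instruction that reads `rip` sees the address of its successor.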
Condition flags
The X86 architecture provides conditional branch and conditional move instructions. The processor maintains a set of bit-sized flags to keep track of conditions arising from arithmetic and comparison operations. These condition flags are tested by the conditional jump and move instructions; the flags are set by the arithmetic instructions. X86lite provides only three condition flags (the full X86 architecture has several more).
Condition flag | Description |
---|---|
`OF` | Overflow: set when the result is too big or too small to fit in a 64-bit value, and cleared otherwise. This is overflow/underflow for signed (two’s complement) 64-bit arithmetic. |
`SF` | Sign: equal to the most significant bit of the result. |
`ZF` | Zero: set if the result is 0. |
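As a rough illustration of the three flags, the following Python sketch computes them for a 64-bit signed addition. The helper names `to_signed` and `flags_for_add` are hypothetical, and the sketch uses the general "does not fit in 64 bits" notion of overflow given in the table; the overflow rule of a specific X86lite instruction may be stated differently.

```python
MASK64 = (1 << 64) - 1


def to_signed(x):
    """Interpret the low 64 bits of x as a two's complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def flags_for_add(dest, src):
    """Return (result, flags) for the 64-bit signed sum dest + src."""
    exact = to_signed(dest) + to_signed(src)  # unbounded Python integer
    result = exact & MASK64                   # the bits that fit in 64 bits
    flags = {
        "OF": exact != to_signed(result),     # the true sum did not fit
        "SF": bool(result >> 63),             # most significant bit
        "ZF": result == 0,
    }
    return result, flags


# Largest positive value plus one overflows into the sign bit.
overflowed = flags_for_add(2**63 - 1, 1)
# -1 + 1 is exactly zero and does not overflow.
zero = flags_for_add(MASK64, 1)
```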
Memory, addresses, and the stack
The X86lite memory consists of 2^64 bytes numbered `0x0000000000000000` through `0xffffffffffffffff`. All of the X86lite instructions operate on 8-byte quadwords, but memory is byte-addressable. That is, unaligned memory accesses are legal.
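A minimal Python sketch of byte-addressable memory with quadword accesses follows. The `Memory` class is hypothetical. It assumes little-endian byte order, as on real X86-64 hardware, and that unwritten bytes read as 0; neither assumption is stated in this document.

```python
MASK64 = (1 << 64) - 1


class Memory:
    """Sparse byte-addressable memory (illustrative only)."""

    def __init__(self):
        self.bytes = {}  # address -> byte value

    def store_quad(self, addr, value):
        # Assumption: little-endian layout, lowest byte at the lowest address.
        for i, b in enumerate((value & MASK64).to_bytes(8, "little")):
            self.bytes[addr + i] = b

    def load_quad(self, addr):
        raw = bytes(self.bytes.get(addr + i, 0) for i in range(8))
        return int.from_bytes(raw, "little")


mem = Memory()
mem.store_quad(0x1003, 0x1122334455667788)  # unaligned store is legal
unaligned = mem.load_quad(0x1003)           # reads back the same quadword
overlapping = mem.load_quad(0x1000)         # aligned load sees part of it
```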
The only general-purpose register that is treated specially by the X86lite ISA is `rsp`, which contains the address of the top of the stack of the executing program. By convention on X86 machines, the program stack starts at the high addresses of virtual memory and grows toward the low addresses. Instructions like `pushq`, `popq`, `callq`, and `retq` increment and decrement `rsp` as needed to maintain this invariant.
X86lite Operands and Condition Codes
This section describes the operands and condition codes used by the X86lite instruction set.
Operands
X86lite instructions manipulate data stored in memory or in registers. The values operated on by a given instruction are described by operands, which are constant values like integers and statically known memory addresses, or dynamic values such as the contents of a register or a computed memory address.
Operands can take one of several forms, described below:
Operand kind | Description |
---|---|
`Imm` | An immediate, constant literal of size 64 bits, or a symbolic label that is resolved by the assembler/linker/loader to a 64-bit constant. Label values typically denote targets of jump and call instructions. |
`Reg` | One of the sixteen machine registers, or `rip`. |
`Ind1` | An indirect address consisting only of a displacement by a literal or symbolic label immediate value. |
`Ind2` | An indirect reference to an address held in a register. |
`Ind3` | An indirect reference to an offset of an address held in a register. |
In its full generality, an X86 indirect reference operand consists of three optional components:
[base : reg] [index : reg, scale : int64] [disp : (int64 | Label)]
The effective address denoted by an indirect address is calculated by:
addr(Ind) = base + (index * scale) + disp
In the formula above, a missing optional component’s value is 0. For the purposes of X86lite, we disregard the index and scale parts, which yields the three combinations given by `Ind1`, `Ind2`, and `Ind3` described above.
When an `Ind` operand is used as a value (not a location), the operand denotes Mem[addr(Ind)], the contents of the machine memory at the effective address denoted by `Ind`.
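The effective-address formula, restricted to the base and displacement parts that X86lite keeps, can be sketched as follows. The function name `effective_address` and the `regs` dictionary are hypothetical; reducing the sum modulo 2^64 is an assumption made to keep addresses within the address space.

```python
MASK64 = (1 << 64) - 1


def effective_address(regs, base=None, disp=0):
    """addr(Ind) = base + disp, where a missing component counts as 0."""
    base_value = regs[base] if base is not None else 0
    return (base_value + disp) & MASK64  # assumption: wrap modulo 2^64


regs = {"rsp": 0x7FFF0000}
addr_ind1 = effective_address(regs, disp=0x1000)          # displacement only
addr_ind2 = effective_address(regs, base="rsp")           # register only
addr_ind3 = effective_address(regs, base="rsp", disp=-8)  # register plus offset
```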
Condition codes
The X86lite `cmpq SRC1, SRC2` instruction is used to compare two 64-bit operands (`SRC1` and `SRC2`). It works by subtracting `SRC1` from `SRC2` (i.e., `SRC2 - SRC1`) and setting the condition flags according to the result (the actual result of the subtraction is ignored).
The X86lite conditional branch (`J`) and conditional set (`setb`) instructions specify condition codes that look at the condition flags to determine whether or not the condition is satisfied. The six condition codes and their interpretation in terms of condition flags are given in the following table:
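Condition code | Description |
---|---|
`e` | Equals: This condition holds when `ZF` is set. |
`ne` | Not equals: This condition holds when `ZF` is not set. |
`l` | (Signed) less than: This condition holds when `SF` and `OF` differ. |
`le` | (Signed) less than or equal: This condition holds when (`SF` and `OF` differ) or `ZF` is set. |
`g` | (Signed) greater than: This condition holds when (not `le`) holds, i.e. `SF` equals `OF` and `ZF` is not set. |
`ge` | (Signed) greater than or equal: This condition holds when (not `l`) holds, i.e. `SF` equals `OF`. |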
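To see how `cmpq` and the condition codes fit together, here is a small Python sketch. The names `cmpq_flags` and `holds` are hypothetical, and the flag interpretations are the standard X86 ones. Overflow is computed as "the exact difference does not fit in 64 bits"; the per-instruction overflow rules of X86lite may differ in corner cases.

```python
MASK64 = (1 << 64) - 1


def to_signed(x):
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def cmpq_flags(src1, src2):
    """Flags after `cmpq SRC1, SRC2`, which computes SRC2 - SRC1."""
    exact = to_signed(src2) - to_signed(src1)
    result = exact & MASK64  # the result itself is discarded by cmpq
    return {
        "OF": exact != to_signed(result),
        "SF": bool(result >> 63),
        "ZF": result == 0,
    }


def holds(cc, flags):
    """Evaluate a condition code against a set of flags."""
    less = flags["SF"] != flags["OF"]
    table = {
        "e": flags["ZF"],
        "ne": not flags["ZF"],
        "l": less,
        "le": less or flags["ZF"],
        "g": not (less or flags["ZF"]),
        "ge": not less,
    }
    return table[cc]


# cmpq 5, 3 asks how SRC2 = 3 relates to SRC1 = 5, so "l" holds.
three_vs_five = cmpq_flags(5, 3)
# The most negative value compared against 1 overflows, yet "l" still holds.
min_vs_one = cmpq_flags(1, 1 << 63)
```

Note the operand order: the condition describes `SRC2` relative to `SRC1`.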
X86lite Instructions
There are only about 20 instructions in the X86lite architecture. Together, they provide basic signed arithmetic over 64-bit integers, logical operations, data movement between registers and memory, and control-flow operations for branches and jumps. In general, instructions that involve two operands must not use two memory (`Ind`) operands.
When an operand appears on the right-hand side of the ← symbol in the instruction descriptions below, it is interpreted as a value, computed as described above. When an operand appears on the left-hand side of a ← symbol, it is interpreted as a location. The location of a register operand is the register itself; the location of an `Ind` operand is the memory location at address addr(Ind). Immediate values and labels do not denote valid locations.
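The distinction between values and locations can be sketched in Python. Everything here is hypothetical: operands are modeled as tuples, and memory is simplified to a dictionary from addresses to whole quadwords, which ignores overlapping unaligned accesses.

```python
MASK64 = (1 << 64) - 1


def address_of(state, operand):
    """addr(Ind) for an operand of the form ("ind", base_or_None, disp)."""
    _, base, disp = operand
    base_value = state["regs"][base] if base is not None else 0
    return (base_value + disp) & MASK64


def value_of(state, operand):
    """Interpret an operand on the right-hand side of the arrow."""
    kind = operand[0]
    if kind == "imm":
        return operand[1] & MASK64
    if kind == "reg":
        return state["regs"][operand[1]]
    if kind == "ind":
        return state["mem"].get(address_of(state, operand), 0)
    raise ValueError("unknown operand kind: " + kind)


def store_to(state, operand, value):
    """Interpret an operand on the left-hand side of the arrow."""
    kind = operand[0]
    if kind == "reg":
        state["regs"][operand[1]] = value & MASK64
    elif kind == "ind":
        state["mem"][address_of(state, operand)] = value & MASK64
    else:
        raise ValueError("immediates and labels are not valid locations")


state = {"regs": {"rsp": 0x10000}, "mem": {}}
slot = ("ind", "rsp", -8)
store_to(state, slot, value_of(state, ("imm", 42)))  # like a move of 42 into the slot
```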
In the following tables, the Flags column indicates which condition flags are affected by the operation. The symbol `---` means that no condition flags are set (i.e., the operation does not modify the condition flags). The presence of the symbols `SF`, `ZF`, and `OF` indicates that these flags are set as described in the condition flags section. A `*` next to a flag indicates special handling. Note that overflow conditions for all arithmetic operations are defined per instruction.
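Arithmetic Instructions
Instruction | Operation | Flags | Comments |
---|---|---|---|
`negq DEST` | `DEST ← -DEST` | `OF SF ZF` | Two’s complement negation. Note that flag `OF` is set when `DEST` is the minimum 64-bit value. |
`addq SRC, DEST` | `DEST ← DEST + SRC` | `OF SF ZF` | Signed integer addition. |
`subq SRC, DEST` | `DEST ← DEST - SRC` | `OF SF ZF` | Signed integer subtraction. This operation can be computed using arithmetic negation and addition. |
`imulq SRC, Reg` | `Reg ← Reg * SRC` | | Signed integer multiply. |
`incq DEST` | `DEST ← DEST + 1` | `OF SF ZF` | Computes the 64-bit successor of `DEST`. |
`decq DEST` | `DEST ← DEST - 1` | `OF SF ZF` | Computes the 64-bit predecessor of `DEST`. |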
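The remark that subtraction can be computed using negation and addition can be checked with a short Python sketch of 64-bit wraparound arithmetic. The function names mirror the instruction mnemonics but are hypothetical helpers; only result values are modeled, not the condition flags.

```python
MASK64 = (1 << 64) - 1


def to_signed(x):
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def negq(x):
    return (-x) & MASK64          # two's complement negation


def addq(src, dest):
    return (dest + src) & MASK64  # sum truncated to 64 bits


def subq(src, dest):
    return addq(negq(src), dest)  # dest - src as dest + (-src)


def incq(dest):
    return addq(1, dest)


def decq(dest):
    return subq(1, dest)


difference = subq(5, 3)        # 3 - 5, stored as a 64-bit pattern
min_int = 1 << 63
negated_min = negq(min_int)    # the minimum value negates to itself
```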
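Logic Instructions
Instruction | Operation | Flags | Comments |
---|---|---|---|
`notq DEST` | `DEST ← ~DEST` | `---` | One’s complement (logical) negation. |
`andq SRC, DEST` | `DEST ← DEST & SRC` | | Logical AND. |
`orq SRC, DEST` | `DEST ← DEST \| SRC` | | Logical OR. |
`xorq SRC, DEST` | `DEST ← DEST ^ SRC` | | Logical XOR. |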
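Bit-manipulation Instructions
Instruction | Operation | Flags | Comments |
---|---|---|---|
`sarq AMT, DEST` | `DEST ← DEST >> AMT` | | Arithmetic shift of `DEST` to the right by `AMT`. |
`shlq AMT, DEST` | `DEST ← DEST << AMT` | | Bitwise shift of `DEST` to the left by `AMT`. |
`shrq AMT, DEST` | `DEST ← DEST >>> AMT` | | Bitwise shift of `DEST` to the right by `AMT`. |
`setb CC DEST` | lower byte of `DEST` ← if `CC` then 1 else 0 | `---` | If condition code `CC` holds, set the lower byte of `DEST` to 1; otherwise set it to 0. |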
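The difference between the arithmetic and the bitwise right shift is easiest to see on a value with the top bit set. The Python helpers below are hypothetical, model only the result value (not the condition flags), and assume a shift amount between 0 and 63.

```python
MASK64 = (1 << 64) - 1


def to_signed(x):
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def sarq(amt, dest):
    # Arithmetic shift right: the sign bit is replicated into the top bits.
    return (to_signed(dest) >> amt) & MASK64


def shrq(amt, dest):
    # Bitwise (logical) shift right: zeros are shifted into the top bits.
    return (dest & MASK64) >> amt


def shlq(amt, dest):
    # Bitwise shift left: bits shifted past bit 63 are dropped.
    return (dest << amt) & MASK64


top_bit = 0x8000000000000000
arithmetic = sarq(4, top_bit)  # 0xF800000000000000
logical = shrq(4, top_bit)     # 0x0800000000000000
```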
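Data-movement Instructions
Instruction | Operation | Flags | Comments |
---|---|---|---|
`leaq Ind, DEST` | `DEST ← addr(Ind)` | `---` | Load effective address of `Ind` into `DEST`. |
`movq SRC, DEST` | `DEST ← SRC` | `---` | Copy the value of `SRC` into `DEST`. |
`pushq SRC` | `rsp ← rsp - 8; Mem[rsp] ← SRC` | `---` | Push a 64-bit value onto the stack: decrement `rsp` by 8, then store `SRC` at the new top of the stack. |
`popq DEST` | `DEST ← Mem[rsp]; rsp ← rsp + 8` | `---` | Pop the top of the stack into `DEST`, then increment `rsp` by 8. |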
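The `pushq` and `popq` operations listed above translate almost directly into code. In this Python sketch the `state` dictionary is hypothetical and memory is simplified to a dictionary from addresses to whole quadwords.

```python
MASK64 = (1 << 64) - 1


def pushq(state, value):
    state["rsp"] = (state["rsp"] - 8) & MASK64    # rsp <- rsp - 8
    state["mem"][state["rsp"]] = value & MASK64   # Mem[rsp] <- SRC


def popq(state):
    value = state["mem"][state["rsp"]]            # DEST <- Mem[rsp]
    state["rsp"] = (state["rsp"] + 8) & MASK64    # rsp <- rsp + 8
    return value


state = {"rsp": 0x10000, "mem": {}}
pushq(state, 1)
pushq(state, 2)
rsp_after_pushes = state["rsp"]  # the stack grew toward lower addresses
first = popq(state)              # last value pushed comes off first
second = popq(state)
```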
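Control-flow and condition Instructions
Instruction | Operation | Flags | Comments |
---|---|---|---|
`cmpq SRC1, SRC2` | `SRC2 - SRC1` | `OF SF ZF` | Compare `SRC1` to `SRC2`: compute `SRC2 - SRC1` and set the condition flags according to the result, which is otherwise discarded. |
`jmp SRC` | `rip ← SRC` | `---` | |
`callq SRC` | `pushq %rip; rip ← SRC` | `---` | Call a procedure: Push the program counter (which, at this point, contains the address of the instruction after the `callq`) onto the stack, then set `rip` to `SRC`. |
`retq` | `popq %rip` | `---` | Return from a procedure: Pop the current top of the stack into `rip`. |
`J CC SRC` | if `CC` then `rip ← SRC` | `---` | Conditional jump: If the condition code `CC` holds, set `rip` to `SRC`; otherwise continue with the next instruction. |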
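A call and its matching return can be sketched in Python as a push and a pop of `rip`. The `state` dictionary and the example addresses are hypothetical, memory is simplified to a dictionary of quadwords, and `rip` is assumed to have already been advanced past the `callq` when the call executes.

```python
MASK64 = (1 << 64) - 1


def callq(state, target):
    # rip already holds the address of the instruction after the callq.
    state["rsp"] = (state["rsp"] - 8) & MASK64
    state["mem"][state["rsp"]] = state["rip"]   # push the return address
    state["rip"] = target                       # jump to the procedure


def retq(state):
    state["rip"] = state["mem"][state["rsp"]]   # pop the return address into rip
    state["rsp"] = (state["rsp"] + 8) & MASK64


# A callq located at 0x400008 has just been fetched, so rip is 0x400010.
state = {"rip": 0x400010, "rsp": 0x10000, "mem": {}}
callq(state, 0x400100)
inside = (state["rip"], state["rsp"])   # now executing the callee
retq(state)
after = (state["rip"], state["rsp"])    # back after the call, stack restored
```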