AARCH64 Assembly
- An assembler is written for one specific CPU architecture, often (though not always) by the manufacturer of that architecture.
- Assemblers are programs that convert assembly language code into machine code (binary instructions that a computer’s CPU can execute). Modern assemblers often use a two-pass process to ensure accurate translation and efficient use of labels and symbols.
Pass One: Building the Symbol Table
- Reading the Input: During the first pass, the assembler reads through the entire assembly source code.
- Tracking Locations: It keeps track of the location (address) of each piece of data and each instruction. This is often managed using a location counter, which starts at the beginning of the memory space and increments as instructions and data are processed.
- Identifying Labels: When it encounters a label (e.g., `Image:`), it notes the current value of the location counter and associates that value with the label.
- Building the Symbol Table: The assembler builds a symbol table that maps each label to its corresponding address or numerical value. For example, if the label `Image` appears at address `0x0040`, the symbol table would record this association.
Pass Two: Generating Machine Code
- Reading the Input Again: During the second pass, the assembler reads through the assembly source code once more.
- Converting Instructions: It converts assembly instructions and data declarations into machine code. For example, an instruction like `MOV R1, Image` might be translated into a specific binary pattern that the CPU can execute.
- Using the Symbol Table: Whenever a label or symbol is encountered in an instruction or data declaration, the assembler uses the symbol table created in the first pass to supply the correct numerical value. For instance, if `Image` corresponds to address `0x0040`, this value is used in the generated machine code.
Labels in Assembly
- Labels: Labels are named markers used to indicate specific addresses in the program. They can refer to the location of instructions, data, functions, or blocks of code.
- GNU Assembly Syntax: In GNU assembly syntax, a label is defined by ending the label name with a colon (e.g., `Image:`). The colon is not part of the label; it serves as a syntactic marker.
Example
Consider a simplified example to illustrate this process:
Assembly Source Code:
START: MOV R0, #0 ; 0x0000: initialize R0 to 0
MOV R1, #5 ; 0x0004: initialize R1 to 5
LDR R2, Image ; 0x0008: load the word at label Image into R2
...
Image: .WORD 0x1234 ; 0x000C: define a word of data at label Image
Pass One (Building Symbol Table):
- `START` is at address `0x0000`
- `Image` is at address `0x000C` (assuming the instructions above occupy bytes up to address `0x000B`)
Symbol Table:
START -> 0x0000
Image -> 0x000C
Pass Two (Generating Machine Code):
- `MOV R0, #0` -> binary instruction for MOV R0, #0
- `MOV R1, #5` -> binary instruction for MOV R1, #5
- `LDR R2, Image` -> binary instruction for LDR R2, 0x000C (using the address from the symbol table)
- `.WORD 0x1234` -> binary representation of 0x1234 at address 0x000C
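The two passes can be sketched in a few lines of Python. This is a toy model, not how a real assembler is implemented: it assumes every statement occupies 4 bytes, and the names `split_label`, `first_pass`, and `second_pass` are made up for illustration. Instead of real machine code it emits the statement text with labels replaced by addresses.

```python
# Toy two-pass assembler for the example above (illustrative only).
# Assumption: every instruction or .WORD occupies exactly 4 bytes.
SOURCE = """
START: MOV R0, #0
MOV R1, #5
LDR R2, Image
Image: .WORD 0x1234
"""

STATEMENT_SIZE = 4  # bytes per statement (assumed)


def split_label(line):
    """Return (label or None, rest of the statement)."""
    head, sep, tail = line.partition(":")
    if sep and " " not in head.strip():
        return head.strip(), tail.strip()
    return None, line.strip()


def first_pass(source):
    """Pass one: record the location counter value for each label."""
    symbols, location = {}, 0
    for line in source.strip().splitlines():
        label, statement = split_label(line)
        if label:
            symbols[label] = location
        if statement:
            location += STATEMENT_SIZE
    return symbols


def second_pass(source, symbols):
    """Pass two: emit (address, statement) with labels resolved to addresses."""
    output, location = [], 0
    for line in source.strip().splitlines():
        _, statement = split_label(line)
        if not statement:
            continue
        mnemonic, _, operands = statement.partition(" ")
        resolved = [
            f"0x{symbols[op]:04X}" if op in symbols else op
            for op in (o.strip() for o in operands.split(","))
        ]
        output.append((location, f"{mnemonic} {', '.join(resolved)}"))
        location += STATEMENT_SIZE
    return output


symbols = first_pass(SOURCE)
listing = second_pass(SOURCE, symbols)
for address, text in listing:
    print(f"0x{address:04X}: {text}")
```

The second pass can resolve `Image` even though the label is defined after its first use, which is the reason for reading the source twice.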
Directives
Important Directives Related to Symbols
- .equ or EQU
  - Purpose: Define a constant value that can be used throughout the program.
  - Syntax: `.equ symbol, value` in the GNU Assembler, or `symbol EQU value` in assemblers that use the `EQU` form (such as Arm's armasm)
  - Example: `.equ MAX_SIZE, 100` or `MAX_SIZE EQU 100`
  - Usage: The symbol `MAX_SIZE` can be used wherever the value `100` is needed. Changing the value in the definition changes it everywhere.
- .set
  - Purpose: Similar to `.equ`; the symbol can be redefined later in the code. In the GNU Assembler, `.set` and `.equ` are synonyms, so both allow redefinition.
  - Syntax: `.set symbol, value`
  - Example: `.set COUNTER, 0`
  - Usage: The symbol `COUNTER` can be updated to a new value later in the code if needed.
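- .define
  - Purpose: Often used in macro assembly to define constants or macros. This form comes from other assemblers; the GNU Assembler does not document a `.define` directive.
  - Syntax: `symbol .define value`
  - Example: `PI .define 3.14`
  - Usage: The symbol `PI` can be used in place of the numerical value `3.14`.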
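- .def
  - Purpose: Define a symbol, often used in older assembly languages. In the GNU Assembler, `.def` has a different job (it starts a block of COFF debugging information), so this form applies to other assemblers.
  - Syntax: `symbol .def value`
  - Example: `BUFFER_SIZE .def 256`
  - Usage: The symbol `BUFFER_SIZE` is used to represent the value `256`.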
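As a rough illustration of what defining a symbol means, the sketch below substitutes constants into the lines that follow their definition. It is a simplified model that only understands the GNU-style `.equ name, value` and `.set name, value` forms with integer values; the function name `expand_constants` is hypothetical, and a real assembler resolves symbols while encoding instructions rather than by rewriting text.

```python
import re


def expand_constants(lines):
    """Replace symbols defined with .equ/.set by their most recent value."""
    symbols, output = {}, []
    for line in lines:
        match = re.fullmatch(r"\s*\.(?:equ|set)\s+(\w+)\s*,\s*(\w+)\s*", line)
        if match:
            # A later definition of the same name overwrites the earlier one.
            symbols[match.group(1)] = int(match.group(2), 0)
            continue
        output.append(
            re.sub(
                r"\b\w+\b",
                lambda m: str(symbols.get(m.group(0), m.group(0))),
                line,
            )
        )
    return output


program = [
    ".equ MAX_SIZE, 100",
    "cmp x0, #MAX_SIZE",
    ".set COUNTER, 0",
    "mov x1, #COUNTER",
    ".set COUNTER, 1",
    "mov x2, #COUNTER",
]
expanded = expand_constants(program)
print("\n".join(expanded))
```

Each use of `COUNTER` picks up the value from the most recent definition above it, which is the redefinition behaviour described for `.set`.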
There are four elements to assembly syntax: labels, directives, instructions, and comments. Directives are used mainly to define symbols, allocate storage, and control the behavior of the assembler. The most common assembler directives were introduced in this chapter, but there are many other directives available in the GNU Assembler.
Directives can also be used to declare macros. Macros are expanded by the assembler and may generate multiple statements. Careful use of macros can automate some simple tasks, allowing several lines of assembly code to be replaced with a single macro invocation.
A conceptual explanation of how most modern CPUs work
- Since there is only a limited number of registers, most program data is stored in memory and loaded whenever the CPU needs it.
Load, store and branch instructions
The part of engineering/programming that deals with communicating with the hardware is defined by the ISA (instruction set architecture), which is basically the set of instructions that a CPU can execute.
- The most important parts of the CPU are the ALU (arithmetic logic unit), the CU (control unit), and the registers.
- The AArch64 architecture provides registers called User Registers.
These are layers of `Width` x `Size` bit cells (bit stores), where `Width` is normally a multiple of 8 and `Size` is arbitrary.
General Purpose Register (GPR)
- each register is capable of storing 64 bits of data
- `W0`-`W30` are 32-bit views of the same registers: each one is the least significant 32 bits, used when you don't need the whole 64 bits of that register.
- `X0`-`X30`, also called extended registers, are used when you need the full 64 bits of storage.
- When the width doesn't matter, writing `R20`, for example, is enough to refer to the register. This generic name is a documentation convention; assembler source still has to say `X20` or `W20`.
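The relationship between the two views can be modelled with plain integers. This is only a sketch: `read_w` and `write_w` are made-up helper names, and the register is represented as a Python integer. It relies on one architectural rule, namely that writing a `W` register clears the upper 32 bits of the corresponding `X` register.

```python
MASK32 = (1 << 32) - 1


def read_w(x_value):
    """Reading Wn returns the least significant 32 bits of Xn."""
    return x_value & MASK32


def write_w(x_value, new_w):
    """Writing Wn replaces the low 32 bits and zeroes the upper 32 bits of Xn."""
    return new_w & MASK32


x20 = 0x1122334455667788
print(hex(read_w(x20)))               # low half of the register
print(hex(write_w(x20, 0xAABBCCDD)))  # old upper half is gone
```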
The frame pointer
- It stores the base address of the current stack frame, which makes it possible to keep track of the frames of the caller and the callee of a given function. By convention this is `X29` on AArch64; the choice comes from the calling convention rather than the hardware.
PState Register
This is a specialized register that keeps track of the program state and holds the flags set by earlier instructions: for example, whether the last result was Negative or Zero, or whether the last operation produced a Carry or oVerflowed. Other fields are important for the OS but not for us.
N - negative
Z - Zero
C - Carry
V - oVerflow
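To make the four flags concrete, here is a small model of a flag-setting addition. It is a sketch under stated assumptions: `add_with_flags` is a hypothetical helper, operands are treated as 64-bit values, and the flag definitions are the usual two's-complement ones (carry out of the top bit, signed overflow when both inputs share a sign that the result does not).

```python
def add_with_flags(a, b, bits=64):
    """Return (result, flags) for a flag-setting addition on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    full = a + b
    result = full & mask
    flags = {
        "N": bool(result & sign),                      # top bit of the result
        "Z": result == 0,                              # result is zero
        "C": full > mask,                              # unsigned carry out
        "V": bool(~(a ^ b) & (a ^ result) & sign),     # signed overflow
    }
    return result, flags


print(add_with_flags(1, -1))                  # 1 + (-1)
print(add_with_flags(0x7FFFFFFFFFFFFFFF, 1))  # largest signed value + 1
```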
Stack Pointer
- The stack pointer is used to hold the address where the stack ends. This is commonly referred to as the top of the stack, although on most systems the stack grows downwards and the stack pointer really refers to the lowest address in the stack.
Program Counter
- This counter holds the address of the next instruction that the processor will execute.
- Every AArch64 instruction is 32 bits wide (4 bytes x 8 bits = 32 bits), so the processor automatically increments this register by 4 after each instruction to reach the address of the next one.
- Placing a new address in this register makes the processor fetch the next instruction from that address. On AArch64 the program counter cannot be written like a general-purpose register, so this is done with branch instructions.
Register Addressing:
This mode uses the contents of a register as the memory address for load/store operations.
LDR X1, [X2] ; Load 8 bytes from the address in X2 into X1
STR X1, [X2] ; Store 8 bytes from X1 into the address in X2
Signed Immediate Offset:
This mode adds a signed immediate value to a base register to calculate the memory address.
LDR X1, [X2, #0x50] ; Load 8 bytes from the address (X2 + 0x50) into X1
STR X1, [X2, #-0x50] ; Store 8 bytes from X1 into the address (X2 - 0x50)
Unsigned Immediate Scaled Offset:
This mode adds an unsigned immediate byte offset to a base register. The offset must be a multiple of the access size, because the instruction stores it divided by that size in a 12-bit field.
LDR X1, [X2, #0x7FF8] ; Load 8 bytes from the address (X2 + 0x7FF8) into X1; encoded as 0xFFF << 3
STR W1, [X2, #0x3FFC] ; Store 4 bytes from W1 into the address (X2 + 0x3FFC); encoded as 0xFFF << 2
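The scaling can be checked with a short calculation. This is a sketch, with `encode_scaled_offset` as a made-up name: it takes the byte offset written in the source and the access size in bytes, and returns the value that would go into the 12-bit field.

```python
def encode_scaled_offset(byte_offset, access_size):
    """Return the 12-bit field for an unsigned scaled offset, or raise ValueError."""
    if byte_offset < 0 or byte_offset % access_size:
        raise ValueError("offset must be a non-negative multiple of the access size")
    imm12 = byte_offset // access_size
    if imm12 > 0xFFF:
        raise ValueError("offset does not fit in 12 bits after scaling")
    return imm12


print(hex(encode_scaled_offset(0x7FF8, 8)))  # 8-byte access (X register)
print(hex(encode_scaled_offset(0x3FFC, 4)))  # 4-byte access (W register)
```

Both examples above use the largest offset their access size allows, which is why they encode to the same field value.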
Pre-indexed Immediate Offset:
This mode adds a signed immediate value to a base register and updates the base register with the new address before accessing memory.
LDR X1, [X2, #0x10]! ; Set X2 to (X2 + 0x10), then load 8 bytes from the new address in X2 into X1
STR X1, [X2, #-0x10]! ; Set X2 to (X2 - 0x10), then store 8 bytes from X1 into the new address in X2
Post-indexed Immediate Offset:
This mode uses a base register for the memory address and then updates the base register with a signed immediate value after the memory access.
LDR X1, [X2], #0x10 ; Load 8 bytes from the address in X2 into X1, then set X2 to (X2 + 0x10)
STR X1, [X2], #-0x10 ; Store 8 bytes from X1 into the address in X2, then set X2 to (X2 - 0x10)
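The difference between pre-indexed and post-indexed addressing is only the order of the update and the access. The sketch below models memory as a Python dictionary keyed by address; the function names and the sample contents are assumptions made for the example.

```python
def ldr_pre_index(memory, base, offset):
    """Model of LDR Xt, [Xn, #offset]!  ->  (loaded value, updated base)."""
    base += offset            # update the base first
    return memory[base], base  # then access memory at the new address


def ldr_post_index(memory, base, offset):
    """Model of LDR Xt, [Xn], #offset  ->  (loaded value, updated base)."""
    value = memory[base]       # access memory at the old address
    return value, base + offset  # then update the base


memory = {0x1000: 111, 0x1010: 222}
print(ldr_pre_index(memory, 0x1000, 0x10))
print(ldr_post_index(memory, 0x1000, 0x10))
```

Both forms leave the base register at the same final value; they differ in which address the data comes from.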
Register Offset:
This mode adds the contents of a shifted or extended register to a base register to calculate the memory address.
LDR X1, [X2, X3, LSL #3] ; Load 8 bytes from the address (X2 + (X3 << 3)) into X1
STR W1, [X2, X3, LSL #2] ; Store 4 bytes from W1 into the address (X2 + (X3 << 2))
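Register offset addressing is what makes array indexing cheap: the shift amount matches the element size. A minimal sketch of the address calculation, with `register_offset_address` as a hypothetical helper name:

```python
def register_offset_address(base, index, shift):
    """Address used by [Xn, Xm, LSL #shift]: base + (index << shift), wrapped to 64 bits."""
    return (base + (index << shift)) & ((1 << 64) - 1)


# Element 5 of an array of 8-byte values that starts at 0x2000.
print(hex(register_offset_address(0x2000, 5, 3)))
```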
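Literal:
This mode addresses memory within one megabyte of the program counter using a signed offset encoded in the instruction. The assembler computes the offset from a label; the program counter cannot be written as a base register.
LDR X1, =label ; Pseudo-instruction: load the address of the label into X1 (the assembler places the address in a literal pool)
LDR X1, label ; Load 8 bytes from the PC-relative address of label into X1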
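The one-megabyte limit follows from the encoding: the offset is stored as a signed 19-bit count of 4-byte words. The sketch below shows the calculation an assembler has to do for a literal load; the function name `encode_literal_offset` and the example addresses are assumptions for illustration.

```python
def encode_literal_offset(pc, target):
    """Return the 19-bit field for a PC-relative literal load, or raise ValueError."""
    delta = target - pc
    if delta % 4:
        raise ValueError("literal must be 4-byte aligned relative to the instruction")
    imm19 = delta // 4
    if not -(1 << 18) <= imm19 < (1 << 18):
        raise ValueError("literal is more than 1 MB away")
    return imm19 & 0x7FFFF  # two's-complement value in 19 bits


print(hex(encode_literal_offset(0x1000, 0x1010)))  # literal 16 bytes ahead
print(hex(encode_literal_offset(0x1000, 0x0FFC)))  # literal 4 bytes behind
```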
Loading and storing
Load and store instructions can be grouped into those
- dealing with one register (single register)
- dealing with two registers (register pair)
- atomic operations
Loading and storing happen only between registers and memory (more often than not the RAM), so there are only two basic instructions for it:
- LDR - Load from memory to Register
- STR - Store contents of register TO memory
- Load and store instructions follow a common form, `<opcode>{size} R, <addr>`, where
  - opcode is either `LDR` or `STR`
  - size is optional and is one of `b` for byte, `h` for halfword, `sb` for signed byte, `sh` for signed halfword, or `sw` for signed word (the signed forms exist only for loads)
  - R is the register to be loaded or stored
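The size suffix decides how many bytes are read and whether the value is zero-extended or sign-extended to fill the register. Here is a small model of that, assuming little-endian memory and a 64-bit destination register; `load` is a hypothetical helper, not an instruction.

```python
def load(memory, address, size, signed=False):
    """Read `size` bytes (little-endian) and extend the value to 64 bits."""
    raw = int.from_bytes(memory[address:address + size], "little", signed=signed)
    return raw & ((1 << 64) - 1)


memory = bytes([0xFF, 0x80, 0x34, 0x12])
print(hex(load(memory, 0, 1)))               # like ldrb: zero-extended byte
print(hex(load(memory, 0, 1, signed=True)))  # like ldrsb: sign-extended byte
print(hex(load(memory, 2, 2)))               # like ldrh: zero-extended halfword
```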
Branching instructions
Branch instructions allow you to change the address of the next instruction to be executed. They are used to implement loops, if-then structures, subroutines, and other flow control structures. There are five instructions related to branching:
- Branch,
- Branch to Register,
- Branch and Link (subroutine call),
- Compare and Branch, and
- Form program-counter-relative Address.
b.<cond> target
- where `<cond>` is one of the condition codes, such as `EQ` or `NE`
- `target` is the label of the instruction to jump to, such as the start of a function or a loop
Branch instructions alter the normal sequential flow of execution by making the processor jump to a different instruction address. This allows for conditional and unconditional changes in the execution path.
Types of Branch Instructions
Branch instructions can be broadly categorized into:
- Unconditional Branches: These instructions always result in a jump to a new address, regardless of any conditions. Example: `B` (Branch), called `JMP` (Jump) on some other architectures.
- Conditional Branches: These instructions result in a jump only if a specific condition is met. Examples: `B.EQ` (Branch if Equal), `B.NE` (Branch if Not Equal).
Components of Branch Instructions
A typical branch instruction consists of:
- Opcode: Specifies the type of branch (e.g., `B`, `B.EQ`).
- Condition (for conditional branches): A specific condition that needs to be met for the branch to occur (e.g., comparison of two values).
- Target Address: The address to which control should jump if the branch is taken.
Mechanism of Branching
- Fetch: The CPU fetches the branch instruction from memory.
- Decode: The CPU decodes the instruction to determine the type of branch and the target address.
- Evaluate Condition (for conditional branches): The CPU evaluates the condition specified in the instruction.
  - If the condition is true, the Program Counter (PC) is updated to the target address.
  - If the condition is false, the PC continues to the next sequential instruction.
- Execute: The CPU executes the jump to the target address if the branch is taken.
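The fetch, decode, evaluate, execute steps can be imitated with a tiny interpreter loop. This is a conceptual sketch, not a model of a real pipeline: instructions are tuples of `(opcode, condition, target)`, each instruction is assumed to be 4 bytes long, and the `run` function, the `"eq"` flag name, and the step limit are all made up for the example.

```python
def run(program, flags, max_steps=100):
    """Return the list of PC values visited while stepping through `program`."""
    pc, trace = 0, []
    for _ in range(max_steps):
        index = pc // 4
        if not 0 <= index < len(program):
            break                                   # ran off the end of the program
        trace.append(pc)                            # fetch
        opcode, condition, target = program[index]  # decode
        taken = opcode == "b" and (condition is None or flags[condition])  # evaluate
        pc = target if taken else pc + 4            # execute: jump or fall through
    return trace


program = [
    ("nop", None, None),  # address 0
    ("b", "eq", 12),      # address 4: branch to 12 if the eq condition holds
    ("nop", None, None),  # address 8: skipped when the branch is taken
    ("nop", None, None),  # address 12
]
print(run(program, {"eq": True}))
print(run(program, {"eq": False}))
```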
Data processing
All opcodes seen so far | What it does |
---|---|
adc | adding with carry |
add | adding |
adr | address of a label |
adrp | address of a page |
and | bitwise and |
asr | arithmetic shift right |
b | branch |
bic | bitwise clear |
bl | branch and link |
blr | branch with link to register |
br | branch to register |
cbnz | compare and branch if not zero |
cbz | compare and branch if zero |
ccmn | conditional compare negative |
ccmp | conditional compare |
cinc | conditional increment |
cinv | conditional invert |
cls | count leading sign bits |
clz | count leading zeros |
cmn | compare negative |
cmp | compare |
cneg | conditional negate |
csel | conditional select |
cset | conditional set |
csetm | conditional set mask |
csinc | conditional select increment |
csinv | conditional select invert |
csneg | conditional select negate |
eon | exclusive or not |
eor | exclusive or |
ldp | load pair |
ldr | load register |
ldur | load register unscaled |
lsl | logical shift left |
lsr | logical shift right |
madd | multiply and add |
mneg | multiply and negate |
mov | move |
movk | move with keep |
movn | move with not |
movz | move with zero |
mrs | move from system register |
msr | move to system register |
msub | multiply and subtract |
mul | multiply |
mvn | bitwise not |
neg | negate |
ngc | negate with carry |
nop | no operation |
orn | or not |
orr | or |
ret | return |
ror | rotate right |
sbc | subtract with carry |
sdiv | signed divide |
smaddl | signed multiply and add long |
smnegl | signed multiply and negate long |
smsubl | signed multiply and subtract long |
smulh | signed multiply high |
smull | signed multiply long |
stp | store pair |
str | store register |
stur | store register unscaled |
sub | subtract |
svc | supervisor call |
tbnz | test bit and branch if not zero |
tbz | test bit and branch if zero |
tst | test |
udiv | unsigned divide |
umaddl | unsigned multiply and add long |
umnegl | unsigned multiply and negate long |
umsubl | unsigned multiply and subtract long |
umulh | unsigned multiply high |
umull | unsigned multiply long |