9. Arrays and Addressing Modes
Now that we can make decisions and create loops, let’s learn how to work with collections of data. Arrays are fundamental to programming, and x86-64 provides powerful addressing modes to access them efficiently.
We have already seen the addressing modes in an earlier chapter; here we apply them to arrays.
What are Arrays?
In assembly, as in computer science generally, an array is simply a contiguous block of memory holding multiple elements of the same data type.
Here is how you define an array:
section .data
bytes db 10, 20, 30, 40, 50 ; Array of 5 bytes
words dw 1000, 2000, 3000, 4000 ; Array of 4 words (2 bytes each)
dwords dd 1, 2, 3, 4, 5, 6 ; Array of 6 doublewords (4 bytes each)
qwords dq 100, 200, 300, 400 ; Array of 4 quadwords (8 bytes each)
; A string is also an array!
message db 'Hello', 0 ; Array of characters
Addressing Modes for Array Access
x86-64 provides several ways to calculate memory addresses for array elements.
1. Direct Addressing
Access elements using fixed offsets from the array base.
mov al, [bytes] ; First element (bytes[0] = 10)
mov bl, [bytes + 1] ; Second element (bytes[1] = 20)
mov cl, [bytes + 2] ; Third element (bytes[2] = 30)
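With elements wider than one byte, the fixed offset has to be the index multiplied by the element size. The following is a minimal sketch of that idea using the words array defined earlier; it assumes a Linux target assembled with nasm -f elf64 and linked with ld, which is an assumption about the build setup rather than something stated above.

```nasm
; Sketch: fixed offsets into a word array (assumes Linux, nasm -f elf64, ld)
section .data
    words dw 1000, 2000, 3000, 4000   ; 2 bytes per element

section .text
    global _start

_start:
    mov ax, [words]           ; words[0] = 1000 (offset 0 * 2)
    mov bx, [words + 2]       ; words[1] = 2000 (offset 1 * 2)
    mov cx, [words + 3 * 2]   ; words[3] = 4000 (3 * 2 is folded by the assembler)

    mov rax, 60               ; sys_exit
    mov rdi, 0                ; exit status 0
    syscall
```

Note that [words + 1] would not be words[1]; it would read two bytes starting in the middle of the first element.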
2. Register Indirect
Use a register as a pointer to traverse the array.
mov rsi, bytes ; RSI points to start of array
mov al, [rsi] ; bytes[0]
inc rsi ; Move to next element
mov bl, [rsi] ; bytes[1]
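inc rsi advances the pointer by one byte, which is only right for byte arrays. For wider elements the pointer has to move by the element size. Here is a small sketch using the dwords array from earlier, under the same assumed build setup (Linux, nasm -f elf64, ld); it returns the result as the exit status so it can be checked with echo $?.

```nasm
; Sketch: pointer traversal of a doubleword array (4 bytes per element)
section .data
    dwords dd 1, 2, 3, 4, 5, 6

section .text
    global _start

_start:
    mov rsi, dwords     ; RSI points to dwords[0]
    mov eax, [rsi]      ; EAX = dwords[0] = 1
    add rsi, 4          ; advance by one 4-byte element
    mov ebx, [rsi]      ; EBX = dwords[1] = 2
    add rsi, 4          ; advance again
    add ebx, [rsi]      ; EBX = 2 + dwords[2] = 5

    mov edi, ebx        ; exit status = 5
    mov rax, 60         ; sys_exit
    syscall
```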
3. Indexed Addressing
The most flexible method: it combines a base, an index, and a scale factor in a single memory operand.
Syntax: [base + index * scale + displacement]
Where:
- base: Register holding the array's base address
- index: Index register (the array subscript)
- scale: Element size in bytes (must be 1, 2, 4, or 8)
- displacement: Optional constant offset in bytes
section .data
arr dd 10, 20, 30, 40, 50 ; 32-bit integers (4 bytes each)
section .text
global _start
_start:
mov rbx, arr ; Base address
mov rsi, 2 ; Index (we want arr[2])
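; Access arr[2] = 30
mov eax, [rbx + rsi * 4] ; Base + (index * element_size) = arr + 8
; Exit cleanly so execution does not run past the end of the code
mov rax, 60
mov rdi, 0
syscall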
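To show the same addressing mode inside a loop, and with a displacement, here is a short sketch that sums the array by index. The constant count and the label sum_loop are names chosen for this illustration, and the build setup (Linux, nasm -f elf64, ld) is assumed.

```nasm
; Sketch: summing a dword array with [base + index * scale (+ displacement)]
section .data
    arr   dd 10, 20, 30, 40, 50
    count equ 5                       ; number of elements (illustrative constant)

section .text
    global _start

_start:
    mov rbx, arr                      ; base address
    xor rsi, rsi                      ; index = 0
    xor eax, eax                      ; sum = 0

sum_loop:
    add eax, [rbx + rsi * 4]          ; sum += arr[index]
    inc rsi                           ; index++
    cmp rsi, count
    jb  sum_loop                      ; repeat while index < count

    ; EAX = 10 + 20 + 30 + 40 + 50 = 150, RSI = 5
    mov ecx, [rbx + rsi * 4 - 4]      ; displacement -4 selects arr[index - 1] = 50

    mov edi, eax                      ; exit status = 150
    mov rax, 60                       ; sys_exit
    syscall
```

Because the index register counts elements rather than bytes, the loop body stays the same if the element size changes; only the scale factor does.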
Let’s now look at some practical array examples.
Example 1: Summing a Byte Array
section .data
numbers db 5, 10, 15, 20, 25, 30
length equ $ - numbers ; Calculate array length
section .text
global _start
_start:
mov rsi, numbers ; Pointer to array start
mov rcx, length ; Counter = number of elements
xor rax, rax ; Sum = 0
sum_loop:
movzx rbx, byte [rsi] ; Load byte, zero-extend to 64 bits
add rax, rbx ; Add to sum
inc rsi ; Move to next element
loop sum_loop ; Decrement RCX and loop if not zero
; RAX now contains the sum (5+10+15+20+25+30 = 105)
mov rax, 60
mov rdi, 0
syscall
You must have noticed the $ - numbers expression.
length equ $ - numbers is an assemble-time calculation (nothing is computed at runtime):
- $ represents the current assembly position, which here is the address just past the array
- numbers is the start address of the array
- $ - numbers calculates (address after array) - (address of array start) = total bytes
- equ makes length a constant equal to 6 (since we have 6 bytes)
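$ - label always yields a size in bytes, which equals the element count only for byte arrays. For wider elements, one common approach is to divide by the element size. The sketch below uses the hypothetical names dwords_size and dwords_len and assumes the same Linux, nasm -f elf64, ld setup.

```nasm
; Sketch: element count of a doubleword array at assembly time
section .data
    dwords      dd 1, 2, 3, 4, 5, 6
    dwords_size equ $ - dwords          ; 24 (size in bytes)
    dwords_len  equ ($ - dwords) / 4    ; 24 / 4 = 6 (number of elements)

section .text
    global _start

_start:
    mov rdi, dwords_len                 ; exit status = 6
    mov rax, 60                         ; sys_exit
    syscall
```

Since equ emits no bytes, $ has the same value on both equ lines.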
The syntax byte [rsi] is a memory operand with an explicit size specifier. Writing byte [...] tells the assembler to read exactly 1 byte from that memory location.
Without a size specifier the operand is ambiguous: movzx accepts both byte and word sources, so the assembler cannot tell how many bytes to read from [rsi] and reports an error.
movzx is a new instruction: Move with Zero eXtend. It copies a smaller value into a larger register and fills the upper bits with zeros.
Why do we need it? We are reading a byte (8 bits) but adding it to rax (64 bits). A plain mov bl, [rsi] would change only the low 8 bits of rbx and leave the upper 56 bits holding whatever was there before, so add rax, rbx could produce an incorrect sum.
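The difference is easy to see when the destination register already holds other data. In this sketch, value is a hypothetical one-byte variable and RBX is deliberately filled with ones first; the build setup (Linux, nasm -f elf64, ld) is assumed.

```nasm
; Sketch: plain byte load versus movzx into a 64-bit register
section .data
    value db 200                  ; 0xC8

section .text
    global _start

_start:
    mov rsi, value
    mov rbx, -1                   ; RBX = 0xFFFFFFFFFFFFFFFF (stand-in for stale data)
    mov bl, [rsi]                 ; only BL changes: RBX = 0xFFFFFFFFFFFFFFC8
    movzx rcx, byte [rsi]         ; whole register set: RCX = 0x00000000000000C8

    mov rdi, rcx                  ; exit status = 200
    mov rax, 60                   ; sys_exit
    syscall
```

Adding RBX to a running sum here would add a huge unrelated value, while RCX contributes exactly 200.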
Example 2: String Length Calculation
section .data
string db "Hello, Assembly!", 0
section .text
global _start
_start:
mov rdi, string ; String pointer
xor rcx, rcx ; Counter = 0
strlen_loop:
cmp byte [rdi], 0 ; Check for null terminator
je strlen_done ; Found end of string
inc rdi ; Move to next character
inc rcx ; Increment count
jmp strlen_loop
strlen_done:
; RCX now contains string length (16)
mov rax, 60
mov rdi, 0
syscall
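The same loop can also be written with indexed addressing, so that the counter doubles as the index and the string pointer is left unchanged. This is one possible variant rather than a replacement for the version above, again assuming Linux, nasm -f elf64, and ld.

```nasm
; Sketch: string length using [base + index] (scale of 1 is implied for bytes)
section .data
    string db "Hello, Assembly!", 0

section .text
    global _start

_start:
    mov rdi, string               ; base pointer, never modified
    xor rcx, rcx                  ; index and length counter = 0

strlen_loop:
    cmp byte [rdi + rcx], 0       ; is string[rcx] the null terminator?
    je  strlen_done
    inc rcx                       ; next character
    jmp strlen_loop

strlen_done:
    mov rdi, rcx                  ; exit status = 16
    mov rax, 60                   ; sys_exit
    syscall
```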
Addressing modes make array access elegant and efficient! The ability to combine base addresses, indexes, and scaling factors in a single instruction is what makes x86-64 assembly powerful for data processing.
Next Up: We’ll learn about Multiplication and Division Instructions - essential for more complex calculations and working with arrays of different element sizes!