5. Data Representation and Basic Arithmetic Instructions

Now that you’ve written your first assembly program, let’s dive deeper into how data is handled and manipulated. This chapter builds on our understanding of data representation and introduces the fundamental arithmetic operations that form the core of computational logic.

Working with Different Data Sizes

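In Intel syntax, the operand size is determined by the register used, not by instruction suffixes. When no register fixes the size (for example, when storing an immediate value to memory), NASM expects an explicit size keyword such as byte, word, dword, or qword. This makes understanding register sizes crucial: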

mov al, 0x12          ; 8-bit operation (byte)
mov ax, 0x1234        ; 16-bit operation (word)
mov eax, 0x12345678   ; 32-bit operation (doubleword)
mov rax, 0x123456789ABCDEF0  ; 64-bit operation (quadword)

Important: Writing to a 32-bit register automatically zeroes the upper 32 bits of the corresponding 64-bit register. Writes to 8-bit and 16-bit registers leave the remaining bits untouched.

mov eax, 0xFFFFFFFF   ; EAX = 0xFFFFFFFF, RAX = 0x00000000FFFFFFFF
mov rax, 0xFFFFFFFF   ; RAX = 0x00000000FFFFFFFF (same result!)
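
To see the difference between partial-register writes for yourself, here is a minimal sketch (not part of the original program set) that you can assemble with nasm and step through in a debugger. The constants are arbitrary values chosen so that each change is easy to spot:

```nasm
; Partial-register writes: only the 32-bit write clears the upper half of RAX
section .text
    global _start

_start:
    mov rax, 0x1122334455667788  ; RAX = 0x1122334455667788
    mov al, 0xFF                 ; RAX = 0x11223344556677FF (only the low byte changes)
    mov ax, 0xAAAA               ; RAX = 0x112233445566AAAA (only the low 16 bits change)
    mov eax, 0x1                 ; RAX = 0x0000000000000001 (upper 32 bits zeroed)

    ; Exit syscall (Linux)
    mov rax, 60                  ; syscall: exit
    xor edi, edi                 ; status 0
    syscall
```

Step through it with si and watch RAX after each mov.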

I recommend using GDB with the Pwndbg extension. It's an excellent debugger setup that gives you a clear, real-time view of what your assembly instructions are doing.

Essential Commands in GDB:

break _start      # Set breakpoint at entry point
run               # Start program
si                # Step one instruction (steps into calls)
ni                # Step one instruction, stepping over calls
info registers    # Show all register values
x/10x $rsp        # Examine 10 hex values at the stack pointer (use x/10gx for 8-byte units)
context           # Pwndbg: show registers, code, stack

Basic Arithmetic Instructions

Core Arithmetic Instructions

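  • add: Adds the source operand to the destination operand and stores the result in the destination.
  • sub: Subtracts the source operand from the destination operand and stores the result in the destination.
  • mul: Unsigned multiply. Multiplies the accumulator (RAX for a 64-bit operand) by the source operand and stores the double-width product in RDX:RAX.
  • imul: Signed multiply. The one-operand form works like mul; the two- and three-operand forms (such as imul rax, rbx) store a truncated result in the destination register.
  • div: Unsigned divide. Divides the double-width value in RDX:RAX by the source operand; the quotient goes to RAX and the remainder to RDX.
  • idiv: Signed divide. Works like div on a signed dividend in RDX:RAX, so RAX must be sign-extended into RDX first (for example with cqo).
  • inc: Increments an operand by 1.
  • dec: Decrements an operand by 1.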
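
The RDX:RAX pairing used by mul and div is easier to see with values that overflow 64 bits. The following is a small sketch with operands I picked for illustration (2^63 times 4, then divided by 8); the register values in the comments follow from the arithmetic:

```nasm
section .text
    global _start

_start:
    ; Unsigned multiply: RDX:RAX = RAX * RBX
    mov rax, 0x8000000000000000  ; 2^63
    mov rbx, 4
    mul rbx                      ; Product is 2^65: RDX = 2, RAX = 0

    ; Unsigned divide: RDX:RAX / RCX (dividend is still 2^65)
    mov rcx, 8
    div rcx                      ; Quotient RAX = 0x4000000000000000 (2^62), remainder RDX = 0

    ; inc and dec change the operand by exactly 1
    mov rax, 9
    inc rax                      ; RAX = 10
    dec rax                      ; RAX = 9

    ; Exit syscall (Linux)
    mov rax, 60                  ; syscall: exit
    xor edi, edi                 ; status 0
    syscall
```

Note that div raises a divide error if the quotient does not fit in RAX, which is why the divisor here is larger than the value left in RDX.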

Addition and Subtraction:

section .data
    num1 dq 20       ; Define a 64-bit integer with value 20
    num2 dq 4        ; Define a 64-bit integer with value 4

section .text
    global _start

_start:
    mov rax, [num1]  ; Load num1 into rax register
    add rax, [num2]  ; rax = rax + num2 (20 + 4 = 24)

    sub rax, 5       ; rax = rax - 5 (24 - 5 = 19)

    mov rbx, 3       ; Multiplier
    imul rax, rbx    ; rax = rax * rbx (19 * 3 = 57)

    mov rdx, 0       ; Clear rdx: div uses rdx:rax as the 128-bit dividend
    mov rcx, 7       ; Divisor
    div rcx          ; Divide rdx:rax by rcx; quotient in rax (57/7=8), remainder in rdx (1)

    ; Exit syscall (Linux)
    mov rax, 60      ; syscall: exit
    xor rdi, rdi     ; status 0
    syscall
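
The program above clears rdx and uses div, which is only correct for unsigned values. As a hedged sketch of the signed counterpart, the snippet below divides -57 by 7 (my own example values), using cqo to sign-extend rax into rdx before idiv:

```nasm
section .text
    global _start

_start:
    mov rax, -57     ; Signed dividend
    cqo              ; Sign-extend rax into rdx:rax (rdx = 0xFFFFFFFFFFFFFFFF)
    mov rcx, 7       ; Divisor
    idiv rcx         ; Quotient in rax (-8), remainder in rdx (-1)

    ; Exit syscall (Linux)
    mov rax, 60      ; syscall: exit
    xor edi, edi     ; status 0
    syscall
```

idiv truncates toward zero, so the remainder takes the sign of the dividend.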

Let’s assemble and link this code, then load it in GDB:

nasm -f elf64 arithmetic.s -o arithmetic.o
ld arithmetic.o -o arithmetic

# Load it in GDB
gdb ./arithmetic

Step through it one instruction at a time and watch how the registers change.

For quick experiments and rapid prototyping, I recommend rappel, an assembly REPL:

git clone https://github.com/yrp604/rappel.git
cd rappel
CC=clang make

Start rappel:

$ ./bin/rappel 
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=0000000000400001 rsp=00007ffc33654340 rbp=0000000000000000
 r8=0000000000000000  r9=0000000000000000 r10=0000000000000000
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
[cf=0, zf=0, of=0, sf=0, pf=0, af=0, df=0]
cs=0033  ss=002b  ds=0000  es=0000  fs=0000  gs=0000            efl=00000202
> 

# Now you can enter basic assembly instructions
> mov rax, 0x1234
rax=0000000000001234 rbx=0000000000000000 rcx=0000000000000000
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=0000000000400006 rsp=00007ffc33654340 rbp=0000000000000000
 r8=0000000000000000  r9=0000000000000000 r10=0000000000000000
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
[cf=0, zf=0, of=0, sf=0, pf=0, af=0, df=0]
cs=0033  ss=002b  ds=0000  es=0000  fs=0000  gs=0000            efl=00000202
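
The flags line in that output makes rappel handy for exploring how arithmetic affects the status flags. As a small sketch, try entering the following instructions one at a time at the > prompt (leave the comments out; they describe what I would expect the flags line to show based on how add works):

```nasm
mov al, 0xff     ; mov does not change any flags
add al, 1        ; AL wraps around to 0x00: cf=1, zf=1
mov al, 0x7f     ; Largest positive signed 8-bit value
add al, 1        ; AL = 0x80: of=1 (signed overflow), sf=1, cf=0
```

The first add overflows as an unsigned operation (carry), the second as a signed one (overflow).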

SASM

If you’re coming from modern IDEs like VSCode and prefer a graphical interface for writing and debugging assembly code, SASM (SimpleASM) is an excellent choice!

Installation on Ubuntu/Debian:

sudo apt update
sudo apt install sasm

Getting Started with SASM

  1. Open SASM and create a new file
  2. Choose your assembler (NASM for Intel syntax)
  3. Write your code in the editor
  4. Build and Run with one click
  5. Debug using the integrated debugger

Sample SASM setup:

%include "io64.inc"    ; SASM-specific I/O macros

section .text
global CMAIN
CMAIN:
    ; Your code here
    mov eax, 0
    ret

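In SASM, CMAIN is the default entry point that the IDE expects. It is a macro from io64.inc that expands to the C main function, because SASM links your program with GCC and the C runtime.

This makes it slightly different from the _start label we used with plain nasm and ld: the C runtime provides _start, performs its setup, and then calls CMAIN, which is why the program can finish with ret instead of an exit syscall. Don’t worry about the difference too much; the assembly concepts you learn with _start transfer directly to SASM. The main differences are the entry point and how I/O is handled.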

io64.inc is SASM’s include file of ready-to-use I/O macros for x86-64; io.inc is its counterpart for 32-bit x86. Instead of writing system calls yourself, you can use these simple macros. The io64.inc file is located at /usr/share/sasm/include/io64.inc.

Here are some useful macros you’ll find in io64.inc:

; PRINTING
PRINT_STRING "Hello"      ; Print a string
PRINT_CHAR 'A'            ; Print a single character
PRINT_DEC 8, rax          ; Print decimal number from RAX (8 bytes)
PRINT_HEX 8, rbx          ; Print hexadecimal number from RBX

; INPUT
GET_CHAR al               ; Read a character into AL
GET_STRING buffer, size   ; Read a string into buffer
GET_DEC 8, rax            ; Read decimal number into RAX

; FORMATTING
NEWLINE                   ; Print newline character
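
To show how these macros fit together, here is a minimal sketch that reads a number, doubles it, and prints the result. It assumes GET_DEC and PRINT_DEC behave as listed above and that the program is built inside SASM; the doubling itself is just an arbitrary operation chosen for the example:

```nasm
%include "io64.inc"

section .text
global CMAIN
CMAIN:
    GET_DEC 8, rax      ; Read a decimal number from input into rax
    add rax, rax        ; Double it
    PRINT_DEC 8, rax    ; Print the result as a decimal number
    NEWLINE
    xor eax, eax        ; Return 0
    ret
```

Type the input number into SASM's Input panel before running the program.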

If you open the file, you will see the macros that SASM predefines. It’s fine if the definitions don’t make sense yet; we will learn how to define and import functions in a later post.

Example: SASM vs Raw System Calls

With SASM io64.inc:

%include "io64.inc"

section .data
    name db "Alice", 0

section .text
global CMAIN
CMAIN:
    PRINT_STRING "Hello, "
    PRINT_STRING name
    NEWLINE
    xor eax, eax        ; Return 0 from CMAIN
    ret
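
With raw system calls:

For comparison, this is a sketch of roughly the same output written without io64.inc, using the Linux write and exit system calls directly and built with nasm and ld as earlier in this post. The labels msg and msg_len are names I chose for the example, and the greeting is stored as a single string for brevity:

```nasm
section .data
    msg db "Hello, Alice", 10    ; 10 = newline character
    msg_len equ $ - msg          ; Length of msg in bytes

section .text
    global _start

_start:
    mov rax, 1          ; syscall: write
    mov rdi, 1          ; fd 1 = stdout
    mov rsi, msg        ; Address of the buffer
    mov rdx, msg_len    ; Number of bytes to write
    syscall

    mov rax, 60         ; syscall: exit
    xor edi, edi        ; status 0
    syscall
```

The macro version hides the syscall numbers, the file descriptor, and the length calculation, which is the convenience io64.inc is offering.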

To build, press Ctrl + F9 or click the Build icon. To debug, press F5 or choose Debug -> Debug.

To show the Registers and Memory views, press Ctrl + R and Ctrl + M. The GDB command prompt at the bottom of the window accepts regular GDB commands.

This post is licensed under CC BY 4.0 by the author.