Home of the Plackyhacker

64-bit Custom Shellcode Part 3

MessageBox Shellcode

Writing shellcode to display a message box might seem a bit pointless but it includes all the elements needed to write more complex shellcode, such as a reverse shell.

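To display a message box we are required to do the following:

  1. Locate kernel32.dll and resolve the VMA of GetProcAddress.
  2. Use GetProcAddress to resolve the VMA of LoadLibraryA.
  3. Use LoadLibraryA to load user32.dll.
  4. Use GetProcAddress to resolve the VMA of MessageBoxA.
  5. Call MessageBoxA.

We have discussed all of these in previous sections. The shellcode is presented below in small chunks.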

Whilst working through these sections you should test each chunk with WinDbg Preview, using the workflow presented or your own. As you build the shellcode up, insert breakpoints and examine registers and memory to ensure that what you expect to be there is there.

Groundwork

The first chunk has been discussed in depth and will not be discussed further. It locates kernel32.dll, resolves the VMA of GetProcAddress and prepares us for the rest of the shellcode:

BITS 64
SECTION .text
global main
            
main:
  push  rbp                       ;
  and   rsp, 0FFFFFFFFFFFFFFF0h   ; Align the stack to a multiple of 16 bytes
  mov   rbp, rsp                  ;
  sub   rsp, 0x64                 ; 100 bytes of shadow space
                
find_kernel32:
  xor rcx, rcx                    ; RCX = 0
  mov rax, [gs:rcx + 0x60]        ; RAX = PEB
  mov rax, [rax + 0x18]           ; RAX = PEB->Ldr
  mov rsi, [rax + 0x20]           ; RSI = PEB->Ldr.InMemOrder
  lodsq                           ; RAX = Second module(NTDLL)
  xchg rax, rsi                   ; RAX = RSI, RSI = RAX
  lodsq                           ; RAX = Third(kernel32)
  mov rbx, [rax + 0x20]           ; RBX = Base address
            
get_function_address:
  lea rsi, [rel get_function + 0x41414141] ; RSI = get_function address + 0x41414141
                                  ;  (the large offset keeps NULL bytes out of the displacement)
  sub rsi, 0x41414141             ; RSI = get_function address
  mov [rbp-0x20], rsi             ; [RBP-0x20] = get_function address
  jmp start                       ;
            
get_function:
  xor r8, r8                      ; R8 = 0
  mov r8d, [rbx + 0x3c]           ; R8D = DOS->e_lfanew offset
  mov rdx, r8                     ; RDX = DOS->e_lfanew
  add rdx, rbx                    ; RDX = PE Header
            
  add rdx, 0x44                   ; add 0x44 to RDX to avoid null bytes
  add rdx, 0x44                   ; add 0x44 to RDX to avoid null bytes
            
  mov r8d, [rdx]                  ; R8D = Offset export table - was [rdx + 0x88]
  add r8, rbx                     ; R8 = Export table
  xor rsi, rsi                    ; Clear RSI
  mov esi, [r8 + 0x20]            ; RSI = Offset namestable
  add rsi, rbx                    ; RSI = Names table
  xor rcx, rcx                    ; RCX = 0
next_function_name:
  inc rcx                         ; Increment the ordinal
  xor rax, rax                    ; RAX = 0
  mov eax, [rsi + rcx * 4]        ; Get name offset
  add rax, rbx                    ; Get function name
  cmp qword [rax], r9             ; Does it match the function name in R9 ?
  jnz next_function_name          ;
                
found_function:
  xor rsi, rsi                    ; RSI = 0
  mov esi, [r8 + 0x24]            ; ESI = Offset ordinals
  add rsi, rbx                    ; RSI = Ordinals table
  mov cx, [rsi + rcx * 2]         ; Number of function
  xor rsi, rsi                    ; RSI = 0
  mov esi, [r8 + 0x1c]            ; Offset address table
  add rsi, rbx                    ; RSI = Address table
  xor rdx, rdx                    ; RDX = 0
  mov edx, [rsi + rcx * 4]        ; EDX = Pointer(offset)
  add rdx, rbx                    ; RDX = Function Address
  mov rdi, rdx                    ; Save Function Address in RDI
  ret                             ;
            
start:
  get_getprocaddress:
    mov r9, 0x41636f7250746547      ; GetProcA (in ASCII AcorPteG)
    call QWORD [rbp-0x20]           ; CALL get_function
    mov [rbp-0x18], rdi             ; [RBP-0x18] = *GetProcAddress
    

This shellcode snippet can be used as a foundation for a lot of user land shellcode.
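To make the name-matching loop a little more concrete, here is a minimal Python sketch of what `cmp qword [rax], r9` is doing: only the first eight bytes of each export name are compared against the constant in r9. The export list is made up for illustration and is not the real kernel32 export order, and the zero padding of short names is a simplification.

```python
# Sketch of the next_function_name loop. The export list below is hypothetical.
R9 = 0x41636F7250746547  # "GetProcA" read as a little-endian qword


def first_qword(name: bytes) -> int:
    """Return the first 8 bytes of a name as a little-endian integer.

    Short names are zero-padded here for simplicity; in real memory the
    bytes after the terminator belong to whatever string comes next.
    """
    padded = (name + b"\x00" * 8)[:8]
    return int.from_bytes(padded, "little")


def find_name_index(names, wanted):
    # Mirrors the assembly: RCX is incremented before the first compare,
    # so index 0 is never examined.
    for index in range(1, len(names)):
        if first_qword(names[index]) == wanted:
            return index
    return None


hypothetical_exports = [
    b"AcquireSRWLockExclusive",
    b"GetPriorityClass",
    b"GetProcAddress",
    b"GetProcessId",
]
match = find_name_index(hypothetical_exports, R9)
print(match, hypothetical_exports[match])
```

Because only eight bytes are compared, the first export whose name begins with "GetProcA" wins. This is worth keeping in mind when reusing get_function for names that share a long prefix.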

LoadLibraryA

The next snippet will get the VMA of LoadLibraryA. We will do this by calling the GetProcAddress function. The syntax for this call is shown below:

FARPROC GetProcAddress(
  [in] HMODULE hModule,
  [in] LPCSTR  lpProcName
);

Remember, when calling a function in 64-bit Windows the first parameter is passed in via the rcx register, and the second parameter is passed in using the rdx register:

  1. The hModule parameter is passed in the rcx register.
  2. The lpProcName parameter is passed in the rdx register.

hModule is the module/DLL where the function resides, in this case kernel32.dll (we already have the base address for this module in rbx). lpProcName is a pointer to a string; this means we need to place "LoadLibraryA" somewhere in memory and put a pointer to it in rdx:

call_getprocaddress_loadlibrarya:
  mov [rbp-0x28], rbx             ; [RBP-0x28] = Kernel32 base address
  mov rcx, [rbp-0x28]             ; RCX = hModule = Kernel32 base address
  mov rax, 0x41797261             ;
  push rax                        ;
  mov rax, 0x7262694c64616f4c     ;
  push rax                        ;
  mov rdx, rsp                    ; RDX = lpProcName = LoadLibraryA  
  sub rsp, 0x2c                   ; Allocate stack space for the function call 
                                  ;  (+ alignment)
  call [rbp-0x18]                 ; CALL GetProcAddress
  add rsp, 0x2c                   ; Clean up allocated space
  add rsp, 0x10                   ; Clean up LoadLibraryA on stack
  mov [rbp-0x30], rax             ; [RBP-0x30] = *LoadLibraryA

On line 2 we store the base address of kernel32 in the memory at [rbp-0x28]. This isn't strictly necessary in this instance, but there might be times when we need to reference it later.

On line 3 the base address is placed in to rcx which is the hModule parameter.

The string "LoadLibraryA" is pushed on to the stack in reverse order and a pointer to the string is moved in to rdx via rsp (which points to the string that was just pushed on to the stack). This is done on lines 4 through 8.
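Working out the reversed immediates by hand is error-prone, so a small helper can generate them. The sketch below is my own illustration (the name `push_string` is hypothetical): it splits a string into 8-byte chunks, converts each chunk to a little-endian integer, and emits the mov/push pairs starting with the last chunk.

```python
def push_string(text: str):
    """Return NASM lines that build `text` (NUL-terminated) on the stack."""
    data = text.encode("ascii")
    chunks = [data[i:i + 8] for i in range(0, len(data), 8)]
    lines = []
    if len(data) % 8 == 0:
        # No spare byte in the last chunk for the terminator,
        # so push an all-zero qword first.
        lines += ["xor rax, rax", "push rax"]
    # The last chunk is pushed first so the string reads forwards from RSP.
    for chunk in reversed(chunks):
        value = int.from_bytes(chunk, "little")
        lines += [f"mov rax, {value:#x}", "push rax"]
    return lines


for line in push_string("LoadLibraryA"):
    print(line)
```

For "LoadLibraryA" this reproduces the two immediates used in the snippet above. When the final chunk is shorter than eight bytes, its unused upper bytes are zero, which supplies the string terminator.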

The call is made on line 11 and the return value (the VMA of the function) is stored in [rbp-0x30] on line 14.

We now have a reference to the LoadLibraryA function.
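The value 0x2c in `sub rsp, 0x2c` can look arbitrary. The Windows x64 calling convention expects rsp to be 16-byte aligned at the point of the call and expects at least 32 bytes of shadow space for the callee. The following sketch (my own arithmetic check, not part of the shellcode) tracks rsp modulo 16 through the prologue, the string pushes and the 0x2c adjustment.

```python
def rsp_mod16_at_call(pushes: int, local: int = 0x64, scratch: int = 0x2C) -> int:
    """Return RSP modulo 16 just before the CALL instruction."""
    rsp = 0              # after `and rsp, 0FFFFFFFFFFFFFFF0h` RSP is a multiple of 16
    rsp -= local         # sub rsp, 0x64 in the prologue
    rsp -= 8 * pushes    # each push moves RSP down by 8 bytes
    rsp -= scratch       # sub rsp, 0x2c before the call
    return rsp % 16


print(rsp_mod16_at_call(pushes=2))  # two qwords pushed for "LoadLibraryA"
print(rsp_mod16_at_call(pushes=4))  # four qwords pushed for a longer string
```

With an even number of qword pushes the total adjustment is a multiple of 16, and 0x2c (44 bytes) is large enough to contain the 32 bytes of shadow space. An odd number of pushes would leave the stack misaligned by 8 bytes, so the adjustment would need to change.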

User32.DLL

This snippet uses the LoadLibraryA function to load the user32.dll module in to memory and return the base address. We will use this to locate the VMA for the MessageBoxA function. The syntax for this call is shown below:

HMODULE LoadLibraryA(
  [in] LPCSTR lpLibFileName
);

This call is very simple and has only one parameter, a pointer to a string, which is moved in to the rcx register:

call_loadlibrarya_user32.dll:
  mov rax, 0x6c6c                 ; PUSH user32.dll
  push rax                        ;
  mov rax, 0x642e323372657375     ;
  push rax                        ;
  mov rcx, rsp                    ; RCX = lpLibFileName = user32.dll
  sub rsp, 0x2c                   ; Allocate stack space for the function call (+ alignment)
  call [rbp-0x30]                 ; CALL LoadLibraryA
  add rsp, 0x2c                   ; Clean up allocated space
  add rsp, 0x10                   ; Clean up user32.dll on stack

Lines 2 through 6 push the string on to the stack and move the pointer in to the rcx register. Line 8 calls LoadLibraryA (remember from the previous section that [rbp-0x30] holds the address of the LoadLibraryA function). The function returns the base address of the module in the rax register, which we will use next.

MessageBoxA

The next snippet gets the VMA of MessageBoxA by calling the GetProcAddress function. This should be familiar by now and does not require any explanation:

call_getprocaddress_messageboxa:
  mov rcx, rax                    ; RCX = hModule = User32 base address
  mov rax, 0x41786f               ;
  push rax                        ;
  mov rax, 0x426567617373654d     ;
  push rax                        ;
  mov rdx, rsp                    ; RDX = lpProcName = MessageBoxA  
  sub rsp, 0x2c                   ; Allocate stack space for the function call (+ alignment)
  call [rbp-0x18]                 ; CALL GetProcAddress
  add rsp, 0x2c                   ; Clean up allocated space
  add rsp, 0x10                   ; Clean up MessageBoxA on stack
  mov r10, rax                    ; R10 = *MessageBoxA

Finally we make the call to MessageBoxA. The syntax for the call is shown below:

int MessageBoxA(
  [in, optional] HWND   hWnd,
  [in, optional] LPCSTR lpText,
  [in, optional] LPCSTR lpCaption,
  [in]           UINT   uType
);

When calling the MessageBoxA function:

  1. The hWnd parameter is passed in the rcx register.
  2. The lpText parameter is passed in the rdx register.
  3. The lpCaption parameter is passed in the r8 register.
  4. The uType parameter is passed in the r9 register.

The shellcode is shown below:

; the address of MessageBoxA is in r10
call_messagebox:
  xor  rax, rax                     ; RAX = 0
  mov  rcx, rax                     ; RCX = hWnd = NULL
  mov  r9,  rax                     ; R9 = uType = 0 (MB_OK, the default)
  mov  rax, 0x6e6f697461            ; PUSH string on to the stack
  push rax                          ; .
  mov  rax, 0x74696f6c70784520      ; .
  push rax                          ; .
  mov  rax, 0x73776f646e695720      ; .
  push rax                          ; .
  mov  rax, 0x6465636e61766441      ; .
  push rax                          ; ---
  mov  rdx, rsp                     ; RDX = lpText
  mov  r8,  rdx                     ; R8 = lpCaption
  sub  rsp, 0x2c                    ; align the stack for the call
  call r10                          ; CALL R10 (MessageBoxA)
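Before running the shellcode it can be reassuring to confirm what text the four immediates actually spell. The sketch below is a small checking aid of my own (the function name `pushed_string` is hypothetical): it rebuilds the string from the values in the order they are pushed.

```python
def pushed_string(qwords):
    """Rebuild the string from immediates listed in push order."""
    # The last value pushed sits at the lowest address, so reverse the order.
    raw = b"".join(q.to_bytes(8, "little") for q in reversed(qwords))
    # Stop at the first NUL, as a C string reader would.
    return raw.split(b"\x00", 1)[0].decode("ascii")


message = pushed_string([
    0x6E6F697461,
    0x74696F6C70784520,
    0x73776F646E695720,
    0x6465636E61766441,
])
print(message)
```

Since rdx and r8 both point at the same stack address, this one string is used as both the message text and the caption.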

I will leave it to the reader to implement the TerminateProcess call. This should be straightforward using the same techniques presented.

Avoiding NULL

The presence of NULL characters in payloads can be detrimental as they can prematurely terminate string-based operations, leading to unintended consequences or truncation of data.
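As a rough illustration of the truncation problem, the sketch below uses Python's ctypes to read a buffer back the way a C string function such as strcpy would see it. The payload bytes are arbitrary example values chosen for this sketch.

```python
import ctypes

# Arbitrary example bytes containing two NULs in the middle.
payload = b"\x48\x83\xec\x28\xb8\x6c\x6c\x00\x00\x50"

buffer = ctypes.create_string_buffer(payload)  # every byte plus a trailing NUL
as_c_string = buffer.value                     # stops at the first NUL

print(len(payload), len(as_c_string))
print(as_c_string.hex())
```

Everything after the first NUL byte is lost, so a payload delivered through a string copy would arrive incomplete.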

Analysing Shellcode

Using the workflow we developed we can compile our shellcode into a raw .bin file. We can then analyse this file with a tool in my GitHub repository. You will need to install the keystone and capstone engines and the rich framework using pip:

pip.exe install keystone-engine
pip.exe install capstone
pip.exe install rich

We can compile our MessageBox shellcode from the previous section:

nasm -f bin -o messagebox.bin messagebox.asm

We can then use the Bad Character tool to check for null bytes:

python.exe .\bad-char-check.py --raw ..\x64\messagebox.bin --badchars "0x00" --scroll 10 --platform x64

We can scroll through the shellcode until we find that there are two NULL bytes at 0x10d8.

We can see the offending shellcode in our assembly:

call_loadlibrarya_user32.dll:
  mov rax, 0x6c6c                 ; PUSH user32.dll
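If you just want a quick list of offsets without the disassembly view, a few lines of Python are enough. This is a minimal sketch and not the bad-char-check.py tool itself; the sample bytes are one valid encoding of `mov eax, 0x6c6c` followed by `push rax`, used here as an assumption about what an assembler may emit.

```python
def find_bad_bytes(data: bytes, bad: set, base: int = 0):
    """Return (offset, byte) pairs for every bad byte found in data."""
    return [(base + offset, byte) for offset, byte in enumerate(data) if byte in bad]


# b8 6c 6c 00 00 = mov eax, 0x6c6c ; 50 = push rax
sample = bytes.fromhex("b86c6c0000" "50")
hits = find_bad_bytes(sample, {0x00})

for offset, byte in hits:
    print(f"bad byte {byte:#04x} at offset {offset:#x}")
```

To scan a real file, the same function can be given the result of `pathlib.Path("messagebox.bin").read_bytes()`.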

Removing NULL Bytes

Various techniques exist to eliminate NULL bytes from shellcode, often requiring imaginative approaches to ensure that the payload remains free of these characters.

The shellcode below presents a way in which we can achieve the same objective but avoid NULL bytes:

call_loadlibrarya_user32.dll:
  mov rax, 0x4141adad                 ; PUSH user32.dll
  mov rcx, 0x41414141                 ; RCX = 0x41414141
  sub rax, rcx                        ; RAX = 0x6c6c

After testing the .bin file again, we see that the NULL bytes have been eliminated.
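Finding a suitable pair of values for the subtraction can be automated. The sketch below is my own helper (the names `split_for_sub` and `has_bad` are hypothetical): it tries repeated-byte subtrahends such as 0x41414141 and returns the first pair in which neither 32-bit value contains a bad byte. It keeps both values in the positive signed 32-bit range, on the assumption that they should fit a 32-bit immediate.

```python
def has_bad(value: int, bad: set) -> bool:
    """True if any of the four little-endian bytes of value is a bad byte."""
    return any(byte in bad for byte in value.to_bytes(4, "little"))


def split_for_sub(target: int, bad: set):
    """Find (minuend, subtrahend) with minuend - subtrahend == target."""
    for fill in range(0x41, 0x80):
        subtrahend = fill * 0x01010101      # e.g. 0x41414141
        minuend = target + subtrahend
        if minuend > 0x7FFFFFFF:
            continue                        # would not fit a positive imm32
        if not has_bad(minuend, bad) and not has_bad(subtrahend, bad):
            return minuend, subtrahend
    return None


minuend, subtrahend = split_for_sub(0x6C6C, {0x00})
print(f"mov rax, {minuend:#x}")
print(f"mov rcx, {subtrahend:#x}")
print("sub rax, rcx")
```

For 0x6c6c with only 0x00 as a bad byte, the first candidate already works and matches the values used above. Adding more bad bytes to the set makes the search move on to other fill values.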

As we work through eliminating the null bytes it would be prudent to insert a breakpoint and test the shellcode to ensure that we achieve the same outcome:

call_loadlibrarya_user32.dll:
  mov rax, 0x4141adad             ; PUSH user32.dll
  mov rcx, 0x41414141             ; RCX = 0x41414141
  sub rax, rcx                    ; RAX = 0x6c6c
  push rax                        ;
  mov rax, 0x642e323372657375     ;
  push rax                        ;
  mov rcx, rsp                    ; RCX = lpLibFileName = user32.dll
  sub rsp, 0x2c                   ; Allocate stack space for the function call 
                                  ; (+ alignment)
  int3                            ; our breakpoint
  call [rbp-0x30]                 ; CALL LoadLibraryA
  add rsp, 0x2c                   ; Clean up allocated space
  add rsp, 0x10                   ; Clean up user32.dll on stack

The breakpoint has been inserted at line 11. If we recompile this and debug it in WinDbg Preview, we hit the breakpoint. We can use the da @rcx command to show that lpLibFileName is still correct:

0:000> g
(a24.1618): Break instruction exception - code 80000003 (first chance)
x64+0x10f8:
00007ff6`46c810f8 cc              int     3
0:000> da @rcx
00000038`cd6ffc3c  "user32.dll"

Don't forget to remove any int3 instructions when you want to use your shellcode for real.

Bad Characters

This technique can be used to eliminate bad characters too, but sometimes the encoding of an instruction (its opcode bytes) contains a bad character.

Let us imagine that 0x31 is a bad character:

python.exe .\bad-char-check.py --raw ..\x64\x64.bin --badchars "0x00 0x31" --scroll 10 --platform x64

The xor rcx, rcx instruction assembles to 48 31 C9, which contains the bad character, so we can replace it with:

find_kernel32:
  ;xor rcx, rcx                   ; removed
  mov rcx, -0x01                  ;
  inc rcx                         ; RCX = 0

We can debug in WinDbg Preview and confirm that rcx is zero:

0:000> g
(2134.b6c): Break instruction exception - code 80000003 (first chance)
x64+0x1016:
00007ff6`f5f51016 cc              int     3
0:000> r rcx
rcx=0000000000000000
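The same check can be done on paper before reaching for the debugger. The sketch below lists the byte encodings of the three instructions involved, as I understand the standard x64 encodings (confirm them against your own assembler output), and flags any that contain a bad character. It also confirms that -1 plus 1 wraps to zero in a 64-bit register.

```python
# Encodings are assumptions to be confirmed against your assembler's listing.
ENCODINGS = {
    "xor rcx, rcx": bytes.fromhex("4831c9"),
    "mov rcx, -1": bytes.fromhex("48c7c1ffffffff"),
    "inc rcx": bytes.fromhex("48ffc1"),
}
BAD = {0x00, 0x31}

# Map each instruction to the bad bytes found in its encoding, if any.
flagged = {
    asm: sorted(set(code) & BAD)
    for asm, code in ENCODINGS.items()
    if set(code) & BAD
}

# mov rcx, -1 followed by inc rcx, truncated to 64 bits.
rcx = (-1 + 1) & 0xFFFFFFFFFFFFFFFF

print(flagged)
print(hex(rcx))
```

Only the original xor is flagged. The replacement pair costs seven extra bytes, which is the usual trade-off when avoiding a bad character by changing instructions.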

There are many different ways to replace bad characters in your shellcode, and sometimes it is more beneficial to change your shellcode than to rely upon decoding it in memory. This is particularly true if the memory to which your shellcode has been copied does not have write permissions.

