In this post we will analyze the linux/x86/exec shellcode that comes bundled with Metasploit, using GDB to observe its behavior at runtime. First, we check which options are needed to generate a sample with msfvenom.
root@kali:~# msfvenom -p linux/x86/exec --payload-options
Options for payload/linux/x86/exec:
Name: Linux Execute Command
Module: payload/linux/x86/exec
Platform: Linux
Arch: x86
Needs Admin: No
Total size: 36
Rank: Normal
Provided by:
vlad902 <vlad902@gmail.com>
Basic options:
Name Current Setting Required Description
---- --------------- -------- -----------
CMD yes The command string to execute
Description:
Execute an arbitrary command
Advanced options for payload/linux/x86/exec:
...
As expected, the only option we need to set is CMD, the command to be executed when the shellcode is run. We choose something like whoami for simplicity’s sake.
We can then generate the payload with msfvenom by using the following command:
root@kali:~# msfvenom -p linux/x86/exec CMD=/bin/whoami -f c
No platform was selected, choosing Msf::Module::Platform::Linux from the payload
No Arch selected, selecting Arch: x86 from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 47 bytes
Final size of c file: 224 bytes
unsigned char buf[] =
"\x6a\x0b\x58\x99\x52\x66\x68\x2d\x63\x89\xe7\x68\x2f\x73\x68"
"\x00\x68\x2f\x62\x69\x6e\x89\xe3\x52\xe8\x0c\x00\x00\x00\x2f"
"\x62\x69\x6e\x2f\x77\x68\x6f\x61\x6d\x69\x00\x57\x53\x89\xe1"
"\xcd\x80";
We add the -f c parameter to request output formatted as a C array. This way we can paste it directly into our little C wrapper, which will come in handy for running the shellcode under GDB.
#include <stdio.h>
#include <string.h>
unsigned char code[] =
"\x6a\x0b\x58\x99\x52\x66\x68\x2d\x63\x89\xe7\x68\x2f\x73\x68"
"\x00\x68\x2f\x62\x69\x6e\x89\xe3\x52\xe8\x0c\x00\x00\x00\x2f"
"\x62\x69\x6e\x2f\x77\x68\x6f\x61\x6d\x69\x00\x57\x53\x89\xe1"
"\xcd\x80";
int main() {
    /* strlen stops at the first null byte, so this prints 15, not 47 */
    printf("Shellcode length: %zu\n", strlen((char *)code));
    /* treat the byte array as a function and jump into it */
    int (*ret)() = (int(*)()) code;
    ret();
}
We then compile this file and check that it’s working:
$ gcc -ggdb -o shellcode-wrapper-exec shellcode-wrapper-exec.c -fno-stack-protector -z execstack -m32
$ ./shellcode-wrapper-exec
Shellcode length: 15
root
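The wrapper reports a length of 15 even though msfvenom announced 47 bytes. This is because strlen stops at the first null byte, and the payload contains one at offset 15 (the terminator of the pushed /sh string). The following is a minimal sketch of that difference; the names payload, PAYLOAD_SIZE and length_until_nul are our own, chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same bytes as the msfvenom output above. */
static const unsigned char payload[] =
    "\x6a\x0b\x58\x99\x52\x66\x68\x2d\x63\x89\xe7\x68\x2f\x73\x68"
    "\x00\x68\x2f\x62\x69\x6e\x89\xe3\x52\xe8\x0c\x00\x00\x00\x2f"
    "\x62\x69\x6e\x2f\x77\x68\x6f\x61\x6d\x69\x00\x57\x53\x89\xe1"
    "\xcd\x80";

/* Real payload size: sizeof minus the terminator the compiler
   appends to the string literal. */
#define PAYLOAD_SIZE (sizeof(payload) - 1)

/* Hypothetical helper: number of bytes before the first 0x00,
   which is what strlen would report for this buffer. */
size_t length_until_nul(const unsigned char *buf, size_t size) {
    assert(buf != NULL);
    const unsigned char *nul = memchr(buf, 0, size);
    return nul ? (size_t)(nul - buf) : size;
}
```

Calling length_until_nul(payload, PAYLOAD_SIZE) gives the same figure the wrapper prints, while PAYLOAD_SIZE matches the size reported by msfvenom.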
We now open the generated executable with GDB and set a breakpoint at the start of our shellcode, which we can reference by the variable name code, as in the C wrapper listed above.
$ gdb -q shellcode-wrapper-exec
Reading symbols from shellcode-wrapper-exec...done.
(gdb) break *&code
Breakpoint 1 at 0x2040
Let’s now run the program until the breakpoint is reached, and at that point list the assembly instructions.
(gdb) r
Starting program: /home/shellcode/5-linux-x86-exec-analysis/shellcode-wrapper-exec
Shellcode length: 15
Breakpoint 1, 0x56557040 in code ()
(gdb) disas
Dump of assembler code for function code:
=> 0x56557040 <+0>: push 0xb
0x56557042 <+2>: pop eax
0x56557043 <+3>: cdq
0x56557044 <+4>: push edx
0x56557045 <+5>: pushw 0x632d
0x56557049 <+9>: mov edi,esp
0x5655704b <+11>: push 0x68732f
0x56557050 <+16>: push 0x6e69622f
0x56557055 <+21>: mov ebx,esp
0x56557057 <+23>: push edx
0x56557058 <+24>: call 0x56557069 <code+41>
0x5655705d <+29>: das
0x5655705e <+30>: bound ebp,QWORD PTR [ecx+0x6e]
0x56557061 <+33>: das
0x56557062 <+34>: ja 0x565570cc
0x56557064 <+36>: outs dx,DWORD PTR ds:[esi]
0x56557065 <+37>: popa
0x56557066 <+38>: ins DWORD PTR es:[edi],dx
0x56557067 <+39>: imul eax,DWORD PTR [eax],0xe1895357
0x5655706d <+45>: int 0x80
0x5655706f <+47>: add BYTE PTR [eax],al
End of assembler dump.
Let’s focus on the first bunch of instructions:
push 0xb
pop eax
cdq
push edx
pushw 0x632d
mov edi,esp
The first two instructions put 0xb in EAX using three bytes, instead of the five that the typical mov eax,0xb instruction would take. We will come back to the 0xb value later, just keep it in mind! The cdq instruction might look strange at first glance. We find some more information in the Intel® 64 and IA-32 Architectures Software Developer’s Manual.
The CDQ instruction copies the sign (bit 31) of the value in the EAX register into every bit position in the EDX register
Therefore, since EAX holds the positive value 0xb, using cdq in this case is a nice one-byte solution to zeroing out the EDX register. These zeroes are next put on the stack via push edx. The following pushw instruction puts a word (16 bits, that is, two bytes) on the stack, and then the mov instruction is used to save the stack pointer into EDI.
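To make the cdq trick concrete, here is a small C model of what the instruction does to EDX. It is only a sketch: cdq_edx and cdq_examples are names we made up, and the function models nothing but the sign extension described in the manual excerpt above.

```c
#include <assert.h>
#include <stdint.h>

/* Model of CDQ: returns the value EDX receives for a given EAX,
   i.e. bit 31 of EAX replicated into all 32 bits. */
uint32_t cdq_edx(uint32_t eax) {
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0x00000000u;
}

/* Usage examples for the model. */
void cdq_examples(void) {
    assert(cdq_edx(0x0000000bu) == 0x00000000u); /* the shellcode's case */
    assert(cdq_edx(0x80000000u) == 0xFFFFFFFFu); /* sign bit set */
}
```

The trick therefore only works as a way of zeroing EDX when the sign bit of EAX is known to be clear, which is the case right after loading 0xb.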
Let’s examine the stack at this point by placing a new breakpoint on the mov edi,esp instruction, that is, right after the pushw, and letting the program execution continue.
(gdb) break *0x56557049
Breakpoint 2 at 0x56557049
(gdb) c
Continuing.
Breakpoint 2, 0x56557049 in code ()
(gdb) x/8x $esp
0xffffd616: 0x2d 0x63 0x00 0x00 0x00 0x00 0x9d 0x55
We can observe the 0x632d word at the top of the stack (stored in little-endian order as 0x2d 0x63), followed by the four 0x00 bytes that were pushed from EDX. What is achieved with this? Let’s try printing those characters as a string:
(gdb) x/s $esp
0xffffd616: "-c"
Interesting! So once the mov instruction executes, EDI will hold the address of the string -c. We head back to the assembly code.
push 0x68732f
push 0x6e69622f
mov ebx,esp
These instructions put some characters on the stack and then save the corresponding pointer into EBX. Let’s examine those in the same way.
(gdb) break *0x56557055
Breakpoint 3 at 0x56557055
(gdb) c
Continuing.
Breakpoint 3, 0x56557055 in code ()
(gdb) x/s $esp
0xffffd60e: "/bin/sh"
Things start to make sense: once this mov instruction executes, /bin/sh and -c are accessible through EBX and EDI respectively.
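The way those pushes turn into readable strings can be reproduced with a toy model of a downward-growing, little-endian stack. This is just a sketch under our own assumptions: the struct stack_model type and the stack_init, stack_push and stack_top helpers are hypothetical names, and the 0xcc filler is arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Tiny model of a downward-growing stack; sp is an index into mem. */
struct stack_model {
    uint8_t mem[32];
    size_t sp;
};

void stack_init(struct stack_model *s) {
    memset(s->mem, 0xcc, sizeof s->mem); /* filler so pushed zeroes stand out */
    s->sp = sizeof s->mem;
}

/* Push `size` bytes (2 or 4) of `value`, least significant byte first. */
void stack_push(struct stack_model *s, uint32_t value, size_t size) {
    assert(size == 2 || size == 4);
    assert(s->sp >= size);
    s->sp -= size;
    for (size_t i = 0; i < size; i++)
        s->mem[s->sp + i] = (uint8_t)(value >> (8 * i));
}

/* Pointer to the current top of the stack, like reading ESP. */
const char *stack_top(const struct stack_model *s) {
    return (const char *)&s->mem[s->sp];
}
```

Replaying the shellcode's pushes in order (four zero bytes, the word 0x632d, then 0x0068732f and 0x6e69622f) and reading stack_top after the second and fourth push yields the same -c and /bin/sh strings that GDB printed.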
push edx
call 0x56557069 <code+41>
These instructions first push four zero bytes onto the stack (remember that EDX was zeroed out at the very beginning), and then execution is transferred to another address via the call instruction. This instruction has one particularity that will be useful for the following steps: the address of the instruction that follows it is pushed onto the stack as the return address, so that execution can be resumed once the called procedure is finished (using the ret instruction).
Therefore, since the call instruction is at address 0x56557058 and is five bytes long, 0x56557058+5, that is 0x5655705d, will be pushed onto the stack. As we could previously see, the instructions shown by GDB from that address do not make much sense. What do we actually have there?
(gdb) x/s 0x5655705d
0x5655705d <code+29>: "/bin/whoami"
Exactly, that is our command! Let’s now follow the execution from the destination address of the call instruction to find out how this is used.
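Both addresses involved in the call can be derived from its raw bytes, e8 0c 00 00 00: the opcode 0xe8 is followed by a 32-bit displacement that is relative to the next instruction. The sketch below works this out; the helper names are ours and the constants are the ones from the GDB session above.

```c
#include <assert.h>
#include <stdint.h>

enum { CALL_REL32_LEN = 5 }; /* opcode 0xe8 + 4-byte displacement */

/* Address of the instruction after the call; this is what gets pushed. */
uint32_t call_return_address(uint32_t call_addr) {
    return call_addr + CALL_REL32_LEN;
}

/* Destination of the call: the displacement is added to the return address. */
uint32_t call_target(uint32_t call_addr, int32_t rel32) {
    return call_return_address(call_addr) + (uint32_t)rel32;
}

/* Values taken from the disassembly in this post. */
void call_example(void) {
    assert(call_return_address(0x56557058u) == 0x5655705du);
    assert(call_target(0x56557058u, 0x0c) == 0x56557069u);
}
```

The displacement 0x0c is 12, which is exactly the length of /bin/whoami plus its terminating null byte: the call simply jumps over the embedded string while leaving its address on the stack.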
(gdb) break *0x56557069
Breakpoint 4 at 0x56557069
(gdb) c
Continuing.
Breakpoint 4, 0x56557069 in code ()
(gdb) x/8i $eip
=> 0x56557069 <code+41>: push edi
0x5655706a <code+42>: push ebx
0x5655706b <code+43>: mov ecx,esp
0x5655706d <code+45>: int 0x80
0x5655706f <code+47>: add BYTE PTR [eax],al
0x56557071: add BYTE PTR [eax],al
0x56557073: add BYTE PTR [eax],al
0x56557075: add BYTE PTR [eax],al
Having found an int 0x80 instruction, we can say that there is a system call in there. Let’s extract the assembly code we are interested in:
push edi
push ebx
mov ecx,esp
int 0x80
This sets up the stack and the ECX register for the system call and then performs the call via int 0x80. Which call is this? Remember the 0xb value in EAX? That register indicates which specific syscall is being invoked, and 0xb (11) is the number of the execve syscall on 32-bit x86 Linux.
By checking the Linux manpages (online at http://man7.org/linux/man-pages/man2/execve.2.html), we see that execve takes three arguments: the first is a pointer to the path of the program to be executed, the second is a pointer to an array with its arguments, and the third is a pointer to an array of environment variables to be used by the program during its execution.
With int 0x80, these arguments are passed in the EBX, ECX and EDX registers, respectively.
Let’s check the current status of these registers:
EBX: contains the address of /bin/sh, the program to be executed by execve.
ECX: check the mov ecx,esp instruction. At this point the stack contains the addresses of /bin/sh, -c, and /bin/whoami, followed by four null bytes. By placing the stack pointer into ECX, this effectively stores the address of a three-element, null-terminated array which defines the arguments for our program (note that the first element is the name of the program itself, again).
EDX: contains 0x0, and this is fine since we are not interested in environment variables.
Therefore, the shellcode execution results will be equivalent to the output of the /bin/sh -c /bin/whoami command.
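For comparison, this is roughly what the shellcode does when expressed in C. It is a sketch rather than a decompilation: build_argv and exec_command are names we made up, and passing NULL as the environment mirrors the zeroed EDX (Linux accepts this, other systems may not).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Fill argv the way the shellcode lays it out on the stack:
   pointers to "/bin/sh", "-c" and the command, then a NULL terminator. */
void build_argv(const char *cmd, const char *argv[4]) {
    assert(cmd != NULL);
    argv[0] = "/bin/sh";
    argv[1] = "-c";
    argv[2] = cmd;
    argv[3] = NULL;
}

/* Replace the current process image, as the int 0x80 does.
   Returns only if execve fails. */
int exec_command(const char *cmd) {
    const char *argv[4];
    build_argv(cmd, argv);
    /* EBX = argv[0], ECX = argv, EDX = NULL (no environment). */
    return execve(argv[0], (char *const *)argv, NULL);
}
```

Calling exec_command("/bin/whoami") from a small main should behave like the wrapper above, with the C library issuing the execve syscall on our behalf instead of a hand-written int 0x80.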
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://www.securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-964