1. Introduction
2. Analyzing a 'strange' binary file
3. Analyzing a 'strange' process
4. Security Forensics using DTrace
5. Conclusion
--[ Introduction
The art of security forensics requires lots of patience, creativity and
observation. You may not always be successful in your endeavours but
constantly 'sharpening' your skills by hands-on practicing, learning a
couple more things here and there in advance will definitely help.
In this article I'd like to share my personal experience in analyzing
suspicious binary files and processes that you may find on the system. We
will use only standard, out of the box, UNIX utilities. The output for all
the examples in the article is provided for Solaris OS.
--[ Analyzing a 'strange' binary file
During your investigation you may encounter some executable (binary) files
whose purpose in your system you don't understand. When you try to read
this file it displays 'garbage'. You cannot recognize this file by name
and you are not sure if you saw it before.
Unfortunately, you cannot read the binary file with more, cat, pg, vi or
other utilities that you normally use for text files. You will need other
tools. In order to read such files, I use the following tools: strings,
file, ldd, adb, and others.
Let's assume, for example, that we found a file called cr1 in the /etc
directory. The first command to run on this file is strings(1). This will
show all printable strings in the object or binary file:
$ strings cr1 | more
%s %s %s%s%s -> %s%s%s (%.*s)
Version: 2.3
Usage: dsniff [-cdmn] [-i interface] [-s snaplen] [-f services]
[-t trigger[,...]] [-r|-w savefile] [expression]
...
/usr/local/lib/dsniff.magic
/usr/local/lib/dsniff.services
...
The output is very long, so some of it has been omitted. But you can see
that it shows that this is actually a dsniff tool masquerading as cr1.
Sometimes you may not be so lucky in finding the name of the program,
version, and usage inside the file. If you still don't know what this file
can do, try to run strings with the 'a' flag, or just '-'. With these
options, strings will look everywhere in the file for strings. If this flag
is omitted, strings only looks in the initialized data space of the object
file:
$ strings cr1 | more
Try to compare this against the output from known binaries to get an idea
of what the program might be.
Alternatively, you can use the nm(1) command to print a name list of an
object file:
$ /usr/ccs/bin/nm -p cr1 | more
cr1:
[Index] Value Size Type Bind Other Shndx Name
[180] |0 | 0| FILE | LOCL | 0 |ABS | decode_smtp.c
[2198] |160348| 320| FUNC | GLOB | 0 | 9 | decode_sniffer
Note that the output of this command may contain thousands of lines,
depending on the size of the object file. You can run nm through pipe to
more or pg, or redirect the output to the file for further analysis.
To check the runtime linker symbol table - calls of shared library routines,
use nm with the '-Du' options, where -D displays the symbol table used by
ld.so.1 and is present even in stripped dynamic executables, and -u prints
a long listing for each undefined symbol.
You can also dump selected parts of any binary file with the dump(1) or
elfdump(1) utilities. The following command will dump the strings table of
cr1 binary:
$ /usr/ccs/bin/dump -c ./cr1 | more
You may use the following options to dump various parts of the file:
-c Dump the string table(s).
-C Dump decoded C++ symbol table names.
-D Dump debugging information.
-f Dump each file header.
-h Dump the section headers.
-l Dump line number information.
-L Dump dynamic linking information and static shared library
information, if available.
-o Dump each program execution header.
-r Dump relocation information.
-s Dump section contents in hexadecimal.
-t Dump symbol table entries.
Note: To display internal version information contained within an ELF file,
use the pvs(1) utility.
If you are still not sure what the file is, run the command file(1):
$ file cr1
cr1: ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+
Required, UltraSPARC1 Extensions Required, dynamically linked, not
stripped
Based on this output, we can tell that this is an executable file for SPARC
that requires the availability of libraries loaded by the OS (dynamically
linked). This file also is not stripped, which means that the symbol tables
were not removed from the compiled binary. This will help us a lot when we
do further analysis.
Note: To strip the symbols, do strip <my_file>.
The file command could also tell us that the binary file is statically
linked, with debug output or stripped.
Statically linked means that all functions are included in the binary, but
results in a larger executable. Debug output - includes debugging symbols,
like variable names, functions, internal symbols, source line numbers, and
source file information. If the file is stripped, its size is much smaller.
The file command identifies the type of a file using, among other tests, a
test for whether the file begins with a certain magic number (see the
/etc/magic file). A magic number is a numeric or string constant that
indicates the file type. See magic(4) for an explanation of the format of
/etc/magic.
If you still don't know what this file is used for, try to guess this by
taking a look at which shared libraries are needed by the binary using
ldd(1) command:
$ ldd cr1
...
libsocket.so.1 => /usr/lib/libsocket.so.1
librpcsvc.so.1 => /usr/lib/librpcsvc.so.1
...
This output tells us that this application requires network share libraries
(libsocket.so.1 and librpcsvc.so.1).
The adb(1) debugger can also be very useful. For example, the following
output shows step-by-step execution of the binary in question:
# adb cr1
:s
adb: target stopped at:
ld.so.1`_rt_boot: ba,a +0xc
<ld.so.1`_rt_boot+0xc>
,5:s
adb: target stopped at:
ld.so.1`_rt_boot+0x58: st %l1, [%o0 + 8]
You can also analyze the file, or run it and see how it actually works. But
be careful when you run an application because you don't know yet what to
expect. For example:
# adb cr1
:r
Using device /dev/hme0 (promiscuous mode)
192.168.2.119 -> web TCP D=22 S=1111 Ack=2013255208
Seq=1407308568 Len=0 Win=17520
web -> 192.168.2.119 TCP D=1111 S=22 Push Ack=1407308568
We can see that this program is a sniffer. See adb(1) for more information
of how to use the debugger.
If you decide to run a program anyway, you can use truss(1). The truss
command allows you to run a program while outputting system calls and
signals.
Note: truss produces lots of output. Redirect the output to the file:
$ truss -f -o cr.out ./cr1
listening on hme0
^C
$
Now you can easily examine the output file cr.out.
As you can see, many tools and techniques can be used to analyze a strange
file. Not all files are easy to analyze. If a file is a statically linked
stripped binary, it would be much more difficult to find what a file
(program) is up to. If you cannot tell anything about a file using simple
tools like strings and ldd, try to debug it and use truss. Experience using
and analyzing the output of these tools, together with a good deal of
patience, will reward you with success.
--[ Analyzing a 'strange' process
What do you do if you find a process that is running on your system, but
you don't know what it is doing? Yes, in UNIX everything is a file, even a
process! There may be situations in which the application runs on the
system but a file is deleted. In this situation the process will still run
because a link to the process exists in the /proc/[PID]/object/a.out
directory, but you may not find the process by its name running the find(1)
command.
For example, let's assume that we are going to investigate the process
ID 22889 from the suspicious srg application that we found running on our
system:
# ps -ef | more
UID PID PPID C STIME TTY TIME CMD
...
root 22889 16318 0 10:09:25 pts/1 0:00 ./srg
...
Sometimes it is as easy as running the strings(1) command against the
/proc/[PID]/object/a.out to identify the process.
# strings /proc/22889/object/a.out | more
...
TTY-Watcher version %s
Usage: %s [-c]
-c turns on curses interface
NOTE: Running without root privileges will only allow you to monitor
yourself.
...
We can see that this command is a TTY-Watcher application that can see all
keystrokes from any terminal on the system.
Suppose we were not able to use strings to identify what this process is
doing. We can examine the process using other tools.
You may want to suspend the process until you will figure out what it is.
For example, run kill -STOP 22889 as root. Check the results. We will look
for 'T' which indicates the process that was stopped:
# /usr/ucb/ps | grep T
root 22889 0.0 0.7 3784 1720 pts/1 T 10:09:25 0:00 ./srg
Resume the process if necessary with kill -CONT <PID>
To further analyze the process, we will create a \core dump\ of variables
and stack of the process:
# gcore 22889
gcore: core.22889 dumped
Here, 22889 is the process ID (PID). Examine results of the core.22889 with
strings:
# strings core.22889 | more
...
TTY-Watcher version %s
Usage: %s [-c]
-c turns on curses interface
NOTE: Running without root privileges will only allow you to monitor
yourself.
...
You may also use coreadm(1M) to analyze the core.22889 file. The coreadm
tool provides an interface for managing the parameters that affect core
file creation. The coreadm command modifies the /etc/coreadm.conf file.
This file is read at boot time and sets the global parameters for core
dump creation.
First, let's set our core filenames to be of the form core.<PROC NAME>.<PID>.
We'll do this only for all programs we execute in this shell (the $$
notation equates to the PID of our current shell):
$ coreadm -p core.%f.%p $$
The %f indicates that the program name will be included, and the %p
indicates that the PID will be appended to the core filename.
You may also use adb to analyze the process. If you don't have the object
file, use the /proc/[PID]/object/a.out. You can use a core file for the
process dumped by gcore or specify a '-' as a core file. If a dash (-) is
specified for the core file, adb will use the system memory to execute the
object file. You can actually run the object file under the adb control (it
could also be dangerous because you don't know for sure what this
application is supposed to do!):
# adb /proc/22889/object/a.out -
main:b
:r
breakpoint at:
main: save %sp, -0xf8, %sp
...
:s
stopped at:
main+4: clr %l0
:s
stopped at:
main+8: sethi %hi(0x38400), %o0
$m
? map
...
b11 = ef632f28 e11 = ef6370ac f11 = 2f28 `/usr/lib/libsocket.so.1'
$q
We start the session by setting a breakpoint at the beginning of main() and
then begin execution of a.out by giving adb the ':r' command to run.
Immediately, we stop at main(), where our breakpoint was set. Next, we list
the first instruction from the object file. The ':s' command tells adb to
step, executing only one assembly instruction at a time.
Note: Consult the book Panic!, by Drake and Brown, for more information on
how to use adb to analyze core dumps.
To analyze the running process, use truss:
# truss -vall -f -o /tmp/outfile -p 22889
# more /tmp/outfile
On other UNIX systems, where available, you may trace a process by using the
ltrace or strace commands. To start the trace, type ltrace -p <PID>.
To view the running process environment, you may use the following:
# /usr/ucb/ps auxeww 22889
USER PID %CPU %MEM SZ RSS TT S START TIME COMMAND
root 22889 0.0 0.4 1120 896 pts/1 S 14:15:27 0:00 -
sh _=/usr/bin/csh
MANPATH=/usr/share/man:/usr/local/man HZ=
PATH=/usr/sbin:/usr/bin:/usr/local/bin:/usr/ccs/bin:/usr/local/sbin:
/opt/NSCPcom/ LOGNAME=root SHELL=/bin/ksh HOME=/
LD_LIBRARY_PATH=/usr/openwin/lib:/usr/local/lib TERM=xterm TZ=
The /usr/ucb directory contains SunOS/BSD compatibility package commands. The
/usr/ucb/ps command displays information about processes. We used the
following options (from the man for ps(1B)):
-a Include information about processes owned by others.
-u Display user-oriented output. This includes fields USER, %CPU,o
%MEM, SZ, RSS and START as described below.
-x Include processes with no controlling terminal.
-e Display the environment as well as the arguments to the command.
-w Use a wide output format (132 columns rather than 80); if repeated,
that is, -ww, use arbitrarily wide output. This information is
used to decide how much of long commands to print.
To view the memory address type:
# ps -ealf | grep 22889
F S UID PID PPID C PRI NI ADDR SZ WCHAN
STIME TTY TIME CMD
8 S root 3401 22889 0 41 20 615a3b40 474 60ba32e6 14:16:49
pts/1 0:00 ./srg
To view the memory usage, type:
# ps -e -opid,vsz,rss,args
PID VSZ RSS COMMAND
...
22889 3792 1728 ./srg
We can see that the ./srg uses 3,792 K of virtual memory, 1,728 of which
have been allocated from physical memory.
You can use the /etc/crash(1M) utility to examine the contents of a proc
structure of the running process:
# /etc/crash
dumpfile = /dev/mem, namelist = /dev/ksyms, outfile = stdout
> p
PROC TABLE SIZE = 3946
SLOT ST PID PPID PGID SID UID PRI NAME FLAGS
...
66 s 22889 16318 16337 24130 0 58 srg load
> p -f 66
PROC TABLE SIZE = 3946
SLOT ST PID PPID PGID SID UID PRI NAME FLAGS
66 s 22889 16318 16337 24130 0 58 srg load
Session: sid: 24130, ctty: vnode(60b8f3ac) maj( 24) min( 1)
...
>
After invoking the crash utility, we used the p function to get the process
table slot (66, in this case). Then, to dump the proc structure for process
PID 22889, we again used the p utility, with the '-f' flag and the process
table slot number.
Like the process structure, the uarea contains supporting data for signals,
including an array that defines the disposition for each possible signal.
The signal disposition tells the operating system what to do in the event
of a signal - ignore it, catch it and invoke a user-defined signal handler,
or take the default action. To dump a process's uarea:
> u 66
PER PROCESS USER AREA FOR PROCESS 66
PROCESS MISC:
command: srg, psargs: ./srg
start: Mon Jun 3 08:56:40 2002
mem: 6ad, type: exec su-user
vnode of current directory: 612daf48
...
>
The 'u' function takes a process table slot number as an argument.
To dump the address space of a process, type:
# /usr/proc/bin/pmap -x 22889
To obtain a list of process's open files, use the /usr/proc/bin/pfiles
command:
# /usr/proc/bin/pfiles 22889
The command lists the process name and PID for the process' open files. Note
that various bits of information are provided on each open file, including
the file type, file flags, mode bits, and size.
If you cannot find a binary file and the process is on the memory only, you
can still use methods described for analyzing suspicious binary files above
against the process's object file. For example:
# /usr/ccs/bin/nm a.out | more
a.out:
[Index] Value Size Type Bind Other Shndx Name
...
[636] | 232688| 4|OBJT |GLOB |0 |17 |Master_utmp
[284] | 234864| 20|OBJT |GLOB |0 |17 |Mouse_status
You may also use mdb(1) - a modular debugger to analyze the process:
# mdb -p 22889
Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libuutil.so.1 ]
> :bjects
BASE LIMIT SIZE NAME
10000 62000 52000 ./srg
ff3b0000 ff3dc000 2c000 /lib/ld.so.1
ff370000 ff37c000 c000 /lib/libsocket.so.1
ff280000 ff312000 92000 /lib/libnsl.so.1
--[ Security Forensics using DTrace
Solaris 10 has introduced a new tool for Dynamic Tracing in the OS
environment - dtrace. This is a very powerful tool that allows system
administrators to observe and debug the OS behaviour or even to dynamically
modify the kernel. Dtrace has its own C/C++ like programming language called
'D language' and comes with many different options that I am not going to
discuss here. Consult dtrace(1M) man pages and
http://docs.sun.com/app/docs/doc/817-6223 for more information.
Although this tool has been designed primarily for developers and
administrators, I will explain how one can use dtrace for analyzing
suspicious files and process.
We will work on a case study, as followes. For example, let's assume that we
are going to investigate the process ID 968 from the suspicious srg
application that we found running on our system.
By typing the following at the command-line, you will list all files that
this particular process opens at the time of our monitoring. Let it run for
a while and terminate with Control-C:
# dtrace -n syscall:pen:entry'/pid == 968/
{ printf("%s%s",execname,copyinstr(arg0)); }'
dtrace: description 'syscall:pen*:entry' matched 2 probes
^C
CPU ID FUNCTION:NAME
0 14 open:entry srg /var/ld/ld.config
0 14 open:entry srg /lib/libdhcputil.so.1
0 14 open:entry srg /lib/libsocket.so.1
0 14 open:entry srg /lib/libnsl.so.1
D language comes with its own terminology, which I will try to address here
briefly.
The whole 'syscall:pen:entry' construction is called a 'probe' and
defines a location or activity to which dtrace binds a request to perform
a set of 'actions'. The 'syscall' element of the probe is called a 'provider'
and, in our case, permits to enable probes on 'entry' (start) to any 'open'
Solaris system call ('open' system call instracts the kernel to open a file
for reading or writing).
The so-called 'predicate' - /pid == 968/ uses the predefined dtrace
variable 'pid', which always evaluates to the process ID associated with
the thread that fired the corresponding probe.
The 'execname' and 'copyinstr(arg0)' are called 'actions' and define the
name of the current process executable file and convert the first integer
argument of the system call (arg0) into a string format respectively. The
printf's action uses the same syntax as in C language and serves for the
same purpose - to format the output.
Each D program consists of a series of 'clauses', each clause describing one
or more probes to enable, and an optional set of actions to perform when the
probe fires. The actions are listed as a series of statements enclosed in
curly braces { } following the probe name. Each statement ends with a
semicolon (.
You may want to read the Introduction from Solaris Tracing Guide
(http://docs.sun.com/app/docs/doc/817-6223) for more options and to
understand the syntax.
Note: As the name suggests, the dtrace (Dynamic Trace) utility will show you
the information about a chnaging process - in dynamic. That is, if the
process is idle (doesn't do any system calls or opens new files), you won't
be able to get any information. To analyze the process, either restart it or
use methods described in the previous two sections of this paper.
Second, we will use the following command-line construction to list all
system calls for 'srg'. Let it run for a while and terminate by Control-C:
# dtrace -n 'syscall:::entry /execname == "srg"/ { @num[probefunc] =
count(); }'
dtrace: description 'syscall:::entry ' matched 226 probes
^C
pollsys 1
getrlimit 1
connect 1
setsockopt 1
...
You may recognize some of the building elements of this small D program. In
addition, this clause defines an array named 'num' and assigns the
appropriate member 'probefunc' (executed system call's function) the namber
of times these particular functions have been called (count()).
Using dtrace we can easily emulate all utilities we have used in the
previous sections to analyze suspicious binary files and processes. But
dtrace is much more powerful tool and may provide one with more
functionality: for example, you can dynamically monitor the stack of the
process in question:
# dtrace -n 'syscall:::entry/execname == "srg"/{ustack()}'
0 286 lwp_sigmask:entry
libc.so.1`__systemcall6+0x20
libc.so.1`pthread_sigmask+0x1b4
libc.so.1`sigprocmask+0x20
srg`srg_alarm+0x134
srg`scan+0x400
srg`net_read+0xc4
srg`main+0xabc
srg`_start+0x108
Based on all our investigation (see the list of opened files, syscalls,
and the stack examination above), we may positively conclude that srg is a
network based application. Does it write to the network? Let's check it by
constructing the following clause:
# dtrace -n 'mib:ip::/execname == "srg"/{@[execname]=count()}'
dtrace: description 'mib:ip::' matched 412 probes
dtrace: aggregation size lowered to 2m
^C
srg 520
It does. We used 'mib' provider to find out if our application transmits
to the network.
Could it be just a sniffer or a netcat-liked application that is bounded
to a specific port? Let's run dtrace in the truss(1) like fashion to answer
this question (inspired by Brendan Gregg's dtruss utility ):
#!/usr/bin/sh
#
dtrace='
inline string cmd_name = "'$1'";
/*
** Save syscall entry info
*/
syscall:::entry
/execname == cmd_name/
{
/* set start details */
self->start = timestamp;
self->arg0 = arg0;
self->arg1 = arg1;
self->arg2 = arg2;
}
/* Print data */
syscall::write:return,
syscall:write:return,
syscall::*read*:return
/self->start/
{
printf("%s(0x%X, \"%S\", 0x%X)\t\t = %d\n",probefunc,self->arg0,
stringof(copyin(self->arg1,self->arg2)),self->arg2,(int)arg0);
self->arg0 = arg0;
self->arg1 = arg1;
self->arg2 = arg2;
}
'
# Run dtrace
/usr/sbin/dtrace -x evaltime=exec -n "$dtrace" >&2
Save it as truss.d, change the permissions to executable and run:
# ./truss.d srg
0 13 write:return write(0x1, " sol10 -
> 192.168.2.119 TCP D=3138 S=22 Ack=713701289 Seq=3755926338 Len=0
Win=49640\n8741 Len=52 Win=16792\n\0", 0x5B) = 91
0 13 0 13
write:return write(0x1, "192.168.2.111 -> 192.168.2.1 UDP D=1900
S=21405 LEN=140\n\0", 0x39) = 57
^C
Looks like a sniffer to me, with probably some remote logging (remember the
network transmission by ./srg discovered by the 'mib' provider above!).
You can actually write pretty sophisticated programs for dtrace using D
language.
Take a look at /usr/demo/dtrace for some examples.
You may also use dtrace for other forensic activities. Below is an example
of more complex script that allows monitoring of who fires the suspicious
application and starts recording of all the files opened by the process:
#!/usr/bin/sh
command=$1
/usr/sbin/dtrace -n '
inline string COMMAND = "'$command'";
#pragma D option quiet
/*
** Print header
*/
dtrace:::BEGIN
{
/* print headers */
printf("%-20s %5s %5s %5s %s\n","START_TIME","UID","PID","PPID","ARGS");
}
/*
** Print exec event
*/
syscall::exec:return, syscall::exece:return
/(COMMAND == execname)/
{
/* print data */
printf("%-20Y %5d %5d %5d %s\n",walltimestamp,uid,pid,ppid,
stringof(curpsinfo->pr_psargs));
s_pid = pid;
}
/*
** Print open files
*/
syscall:pen*:entry
/pid == s_pid/
{
printf("%s\n",copyinstr(arg0));
}
'
Save this script as wait.d, change the permissions to executable
'chmod +x wait.d' and run:
# ./wait.d srg
START_TIME UID PID PPID ARGS
2005 May 16 19:51:20 100 1582 1458 ./srg
/var/ld/ld.config
/lib/libnsl.so.1
/lib/libsocket.so.1
/lib/libresolv.so.2
...
^C
Once the srg is started you will see the output.
However, the real power of dtrace comes from the fact that you can do
things with it that won't be possible without writing a comprehensive
C program. For example, the shellsnoop application written by Brendan Gregg
(http://users.tpg.com.au/adsln4yb/DTrace/shellsnoop) allows you to use
dtrace at the capacity of ttywatcher!
It is not possible to show all capabilities of dtrace in such a small
presentation of this amazing utility. Dtrace is a very powerful as well a
complex tool with virtually endless capabilities. Although Sun insists that
you don't have to have a 'deep understanding of the kernel for DTrace to be
useful', the knowledge of Solaris internals is a real asset. Taking a look
at the include files in /usr/include/sys/ directory may help you to write
complex D scripts and give you more of an understanding of how Solaris 10
is implemented.
--[ Conclusion
Be creative and observant. Apply all your knowledge and experience for
analyzing suspicious binary files and processes. Also, be patient and have
a sense of humour!
|=-------------------------=[ 0x03-2 ]=----------------------------------=|
|=----------=[ TCP Timestamp To count Hosts behind NAT ]=----------------=|
|=-----------------------------------------------------------------------=|
|=-------------=[ Elie aka Lupin (lupin@zonart.net) ]=-------------------=|
Table of Contents
*=*=*=*=*=*=*=*=*
1.0 - Introduction
2.0 - Time has something to tell us
- 2.1 Past history
- 2.2 Present
- 2.3 Back to the begin of timestamp history
- 2.4 Back to school
- 2.5 Back to the NAT
- 2.6 Let's do PAT
- 2.7 Time to fightback
3.0 History has something to tell us
- 3.1 Which class ?
- 3.2 So were does it come from ?
- 3.3 How do you find it ?
- 3.4 Back to the future
- 4 Learning from the past aka conclusion
- A Acknowledgements
- B Proof of concept
--[ 1.0 - Introduction
This article is about TCP timestamp option. This option is used to
offer a new way for counting host beyond a NAT and enhanced host
fingerprinting. More deeply, this article tries to give a new vision of a
class of bug known has "Design error". The bug described here, deserves
interest for the following reasons.
- It's new.
- It affects every platform since it is related to the specification
rather than implementation.
- It's a good way to explain how some specifications can be broken.
The article is organized has follow : First I will explain what's
wrong about TCP timestamp. Then I will describe How to exploit it, the
limitations of this exploitation and a way to avoid it. In the second part
I will talk about the origin of this error and why it will happen again. At
the end I will give a proof of concept and greeting as usual.
--[ 2.0 - Time has something to tell us
----[ 2.1 - Past history
Fingerprinting and Nat detection have been an active field for long
time. Since you read phrack you already know the old school TCP/IP
fingerprinting by Fyodor.
You may also know p0f (Passive of fingerprinting) by M. Zalewski. With
the version 2 he has done a wonderful tool, introducing clever ways to know
if a host uses the NAT mechanism by analyzing TCP packet option. If you are
interested in this tool (and you should !) read his paper :
"Dr. Jekyll had something to Hyde"[5].
In fact the technique described here is related to p0f in the way, that
like p0f, it can be totally passive.
To be complete about NAT detection, I need to mention that AT&T has
done research on counting host behind a NAT[1]. Their work focus on IP ID,
assuming that this value is incremental in some OS. In fact they are mainly
talking about Windows box which increment IP ID by 256 for each packet.
Discovered by Antirez[7], Nmap[6] has used this fact for a long time
(option -sI).
Now that we know what we are talking about it's time to explain what's
going on.
----[ 2.2 - Present
NAT was designed to face the IP address depletion. It is also used to
hide multiple hosts behind a single IP. The TCP timestamp option[2] is
improperly handled by the IP Network Address Translator (NAT) mechanism[3].
In other words even scrubbing from pf doesn't rewrite the timestamp option.
Until now this property of the NAT has been useless (in the security point
of view). It is interesting to point out that the timestamp option by itself
has already been used for information disclosure. Let's take a quick look
at timestamp security history
----[ 2.3 - Back to the beginning of timestamp history
In the past the timestamp has been used to calculate the uptime of a
computer[4]. Any one who had try the TCP fingerprint option (-O) of Nmap
has been impressed by a line like this one :
"Uptime 36.027 days (since Tue May 25 11:12:31 2004)".
Of course their is no black magic behind that, only two facts :
- Time goes back only in movie (sorry boys...)
- Every OS increments the timestamp by one every n milliseconds.
So if you know the OS, you know how often the OS increment the timestamp
option. All you have to do to know the uptime is to apply a trivial math
formula :
timestamp / num inc by sec = uptime in sec
Has you can notice this formula does not take into account the warp
around of integer. Here we know two information : the actual timestamp and
the number of increments by second. This can only be done because we know
the OS type. Let's see how we can improve this technique to do it without
knowing the OS.
----[ 2.4 - Back to school
Remember a long time ago at school, you heard about affine function.
A basic example of it is :
"y = Ax + B"
where A is the slope and B the initial point.
The graphic representation of it is straight line. From timestamp point of
view this can be express has the follow :
timestamp = numincbysec * sec + intial number
When you do active fingerprinting you get the timestamp and know the
numincbysec by guessing the OS.
Now let's suppose you can't guess the OS. In this case you don't know
the slope and can't guess the uptime. Here is an other way to know the
slope of the OS. You need to get the computer timestamp twice. Name it ts1
and ts2 and name the time (in sec) where you gather it t1 and t2.
With thoses informations, it is trivial to find the slope since we have the
following equationnal system:
ts1 = A*s1 + B
ts2 = A*s2 + B
which is solved by the following equation :
ts1 - ts2 = A*(s1 - s2) <=> A = (ts1 - ts2) / (s1 - s2)
An imediate application of this idea can be implemented in active scanner:
requeste twice the timestamp to verify that the slope is the
same as the one guessed.
This can be use to defeat some anti-fingerprint tools. It also can be used
as a standalone fingerprinting technic but will not be accurate has the TCP
or ICMP one.
Now that we have the theory ready, let's go back to NAT.
----[ 2.5 - Back to the NAT
Let's make the connection with the NAT. Since the timestamp option is
not rewritten by NAT, we can count the number of host behind the NAT using
the following algorithm :
1. for each host already discovered verifying is the packet belong to it
straight line equation. each host has a unique straight line equation
until two host have booted at the same second.
2. otherwise add the packet to unmatched packet : a new host beyond NAT is
detected.
Look to the proof of concept if you need to make things more clear.
This simple algorithm has a lot of room for improvement. It has been keeped
has simple has possible for clarity. As you can see timestamp option can be
used to count host beyond a NAT in a reliable manner. It will also giveo
indication of the OS class.
----[ 2.6 - Let's do PAT
PAT (Port Address Translation) is used to provide service on a box
behind a NAT.
The question is how do I know that the port is forwarded?
Well timestamp is once again your friend. If for two different ports the
slope of timestamp differs then there is a PAT and the OS of the two
computers is different. If the timestamp gathered from the two ports does
not belong to the same straight line, then it's the same OS but not the
same computer.
Another interesting use of PAT is the round robin. Until now their were
no way to know if such mechanism is used. By comparing the different
timestamps gathered you can determine how many hosts are beyond a single
port. This might be an interesting functionality to add to an active
scanner.
----[ 2.7 - Time to fight back
Since playing with this option can give valuable information there is
some limitation to this technique. Mainly Windows box does not use timestamp
option when they establish connection[8] unless you activate it. This
limitation only affects passive analysis, if you use timestamp when
you connect to a windows it will use it too. Moreover many tweaks software
activate the TCP extension in windows.
To be completed on the subject I had to mention that it seems that TCP
extension does not exist on win 9X.
One other problem is the time gap. In passive mode there can be a
desynchronization between computers due to computer desynchronization or
network lags. In the proof of concept this phenomenon can occur. To handle
it you need not to rely on the computer clock but on timestamp itself.
What can we do against this ? Since no vendor except Microsoft (1)
(Thanks Magnus) has answer to me, the following workaround may not be
available. Here is a theoric way to patch this problem.
1. Disabling tcp timestamp. This is the worse solution since we will need
it with fast network[2].
2. Make NAT rewrite the timestamp and changing The NAT RFC.
3. Changing the RFC to specify that the timestamp option needs to have a
random increment. Modifying each implementation to reflect this change.
The a clean way to fix this thing because it's does not rely on an
external system (the NAT computer in this case).
Well I have to try to be as complete as possible for this technical
part. The next part will be more "philosophic" since it deals with the
cause instead of the consequence.
--[ 3 - History has something to tell us
In this part I will try to focus on why we have this situation and what
we can do about it. Here I am not talking about the timestamp option by
itself but about the interaction between the timestamp option and the NAT
mechanism.
----[ 3.1 - Which class ?
First question is what is this bug? This bug belongs to the design
error class. To be more precise this bug exists because protocol
specification overlap. IP was designed to be a one on one protocol: one
client talks to one server. NAT violates this specification by allowing
multiple to one. By itself this violation has caused so many problems that
I lost the count of it, but it is pretty sure that the most recurrent
problem is the FTP transfer. If you use FTP you know what I mean (other can
look at netfilter ftp conntrack).
----[ 3.2 - So were does it come from ?
FTP problem is a good example to explain the origin of the overlap
specification problem. FTP was specified to work over a one to one
reliable connexion (TCP in fact). NAT was designed to modify IP. So due to
protocol dependency it also alter TCP and therefor FTP.
During NAT specification it was not taken into account that every
protocol that relies on IP, can conflict with the modified specification.
To tell the truth ,even if the people that design the NAT mechanism have
ever wanted to ensure that every protocol that relies on IP can work with
the NAT they couldn't make it.
Why ? because specification are RFC and RFC are in english. English is
not a good way to specify things especially if you have a dependency graph
for the specification.
For example many programming languages have formal specifications.
Which is a more full proof way. The reason of this lack of formal
specification resides on the history of Internet[9]. At this time writing a
simple text was good enough. Nowadays it can be very problematic.
----[ 3.3 - How do you find it ?
The big question is, how do I find this bug ?. Well I found this
problem by formalizing a part of the TCP RFC and confronts the result of
this analysis to real execution traces. My analyzer (2) warned me about a
timestamp that was less than the previous one and as you know time does not
go back...
I check out why and found this problem. What's interesting here is that
the start point to find the bug is the specification rather than the
implementation as it usually does to find a buffer overflow for example.
----[ 3.4 - Back to the future
So from now on, what will happen ? Well more design errors will be
found because we cannot change the past and we need to live with it. It is
not reasonable to say that we can wipe off all that TCP stuff and start a
new thing from scratch. Internet and network are simply too big to move
just like that. Just think for one second about the IP v6 deployment and
you will be convinced. All we can do is try to be as careful as possible
when designing a new extension or a protocol. Trying to ensure that this
new stuff does not conflicts with previous specification or breaks
dependence. We can also try to formalize the protocols as much as we can to
try and detect errors before they cause problems. Sadly patching is mainly
our primary option for the coming years.
--[ 4.0 - Learning from the past aka conclusion
The past tells us that protocol is not well enough specified and leads
to errors (bug, conflict...). It may be time to change our habits and try
something in ad equation with our time. For example to design things with
security in mind. In this article I have tried to show you that by simply
understanding specification and with the help of some basic math you can:
- Find a flaw with a worldwide impact.
- Exploit this flaw in an elegant manner by the means of a simple theory.
- Extend fingerprint state of art.
I hope this will help to convince you that theory and formal tools are a
necessary part of the computer security field. Next time I will focus on
simple formal method to find bug. I hope you will be here .
--[ A Acknowledgements
First I would like to thank Romain Bottier for his help and his
patience. I also want to thank Plops and Poluc for having faith in me. See
guys we made it!
I also want to say that I take great care about non disclosure policy.
I have informed major vendors (Kernel.org, freeBSD, OpenBSD, Cisco...) a
month ago. As I said I did not get any feedback so I assume they do not
care.
References
*=*=*=*=*=
[1] AT&T Steven M. Bellovin. A Technique for Counting NATted Hosts
http://www.research.att.com/~smb/papers/fnat.pdf
[2] Jacobson, Braden, & Borman. RFC 1323 :TCP Extensions for High
Performance .
[3] K. Egevang, Cray Communications, P. Francis. RFC 1631 : The IP
Network Address Translator (NAT).
[4] Bret McDanel. TCP Timestamping - Obtaining System Uptime Remotely
originally posted to Bugtraq Security Mailing List on March 11, 2001.
[5] Michal Zalewski. p0f 2r. Jekyll had something to Hyde.
[6] Fyodor. Nmap - Free Security Scanner For Network Exploration &
Security Audits.
[7] Antirez. dumbscan original BUGTRAQ posting (18 Dec 1998)
[8] Microsoft. TCP timestamp in windows : KB224829.
[9] Hafner, Katie, Matthew Lyon. Where Wizards Stay Up Late: The Origins
of the Internet.
FootNotes
*=*=*=*=*=
(1) Microsoft point of view is that NAT is not a security mechanism so they
do not want to patch.
(2) If you are interested about my analyzer. I hope to publish soon a
theoric paper on how it works. I also hope to release one day a version
of it. To the question did I find other interesting things, the answer
is: maybe I need to check out more deeply.
--[ B - Proof of concept
/*
* Proof Of Concept : counting host behind a NAT using timestamp
* To compile this file, you will need the libpcap
* Copyright Elie Bursztein (lupin@zonart.net)
* Successfully compiled on FreeBSD 5.X and Linux 2.6.X
*
* $gcc natcount.c -o natcount -I/usr/local/include -L/usr/local/lib
* -lpcap
*/
#define __USE_BSD 1
#include <sys/time.h>
#include <time.h>
#include <netinet/in.h>
#include <net/ethernet.h>
#ifdef __FreeBSD__
# include <netinet/in_systm.h>
#endif /* __FreeBSD__ */
#ifdef __linux__
# include <linux/if_ether.h>
#endif /* __linux__ */
#include <netinet/ip.h>
#include <stdlib.h>
#include <string.h>
#include <pcap.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <net/if.h>
#include <netinet/tcp.h>
#include <netinet/udp.h>
#ifdef __linux__
# define th_off doff
#endif /* __linux__ */
u_int32_t addr = 0;
/* chain lists structures */
typedef struct listes_s {
struct listes_s *next;
void *elt;
} listes_t;
/* Structures for TCP options */
typedef struct { u_int32_t ts, ts_r; } timestamp_t;
typedef struct { timestamp_t *ts; } tcp_opt_t;
/* Structures for datas storage */
typedef struct { u_int32_t from, first_timestamp; struct timeval
first_seen; } machine_t;
typedef struct { u_int32_t host, nat; struct timeval first_seen; }
nat_box_t;
#define TIMESTAMP_ERROR_MARGIN 0.5
#define DELAY 1
/*
* List functions
*/
int add_in_list(listes_t **list, void * elt) {
listes_t *lst;
lst = malloc(sizeof (listes_t));
lst->next = *list;
lst->elt = elt;
*list = lst;
return (1);
}
void show_nated(listes_t *list) {
nat_box_t *nat;
struct in_addr addr;
printf("-- Begin of nated IP list --\n");
while (list)
{
nat = (nat_box_t *) list->elt;
if (nat->nat > 1) {
addr.s_addr = nat->host;
printf("I've guess %i computers sharing the same IP address
(%s)\n", nat->nat, inet_ntoa(addr));
}
list = list->next;
}
printf("-- End of nated IP list --\n");
}
/*
* Function used to get all TCP options
* Simple TCP options parser
*/
int tcp_option_parser(const u_char *options,
tcp_opt_t *parsed,
unsigned int size) {
u_int8_t kind, len, i;
bzero(parsed, sizeof(tcp_opt_t));
i = 0;
kind = *(options + i);
while (kind != 0) /* EO */
{
switch (kind) {
case 1: i++; break; /* NOP byte */
case 2: i += 4; break;
case 3: i += 3; break;
case 4: i += 2; break;
case 5: /* skipping SACK options */
len = (*options + ++i) - 1;
i += len;
break;
case 6: i += 6; break;
case 7: i += 6; break;
case 8:
i += 2;
parsed->ts = (timestamp_t *) (options + i);
i += 8;
return (1);
break;
default:
i++;
}
kind = *(options + i);
}
return (0);
}
/*
* Most interesting function ... Here we can know if a TCP packet is
* coming from someone we already know !
* Algo :
* finc (seconds) = current_packet_time - first_packet_time <- time
* between 2 packets
* ts_inc = inc_table * finc <- our supposed timestamp increment
* between 2 packets
* new_ts = first_timestamp + ts_inc <- new = timestamp we should have
* now !
*
* Now we just have to compare new_ts with current timestamp
* We can authorize an error margin of 0.5%
*
* Our inc_table contain timestamp increment per second for most
* Operating System
*/
int already_seen(machine_t *mach, tcp_opt_t *opt,
struct timeval temps)
{
int inc_table[4] = {2, 10, 100, 1000};
unsigned int new_ts;
float finc, tmp, ts_inc;
int i, diff;
finc = ((temps.tv_sec - mach->first_seen.tv_sec) * 1000000.
+ (temps.tv_usec - mach->first_seen.tv_usec)) / 1000000.;
for (i = 0; i < 4; i++) {
ts_inc = inc_table * finc;
new_ts = ts_inc + mach->first_timestamp;
diff = ntohl(opt->ts->ts) - new_ts;
if (diff == 0) { /* Perfect shoot ! */
return (2);
}
tmp = 100. - (new_ts * 100. / ntohl(opt->ts->ts));
if (tmp < 0.)
tmp *= -1.;
if (tmp <= TIMESTAMP_ERROR_MARGIN) { /* Update timestamp and time */
mach->first_seen = temps;
mach->first_timestamp = ntohl(opt->ts->ts);
return (1);
}
}
return (0);
}
/*
* Simple function to check if an IP address is already in our list
* If not, it's only a new connection
*/
int is_in_list(listes_t *lst, u_int32_t addr) {
machine_t *mach;
while (lst) {
mach = (machine_t *) lst->elt;
if (mach->from == addr)
return (1);
lst = lst->next;
}
return (0);
}
/*
* This function should be call if a packet from an IP address have been
* found,
* is address is already in the list, but doesn't match any timestamp
* value
*/
int update_nat(listes_t *list, u_int32_t addr)
{
nat_box_t *box;
while (list)
{
box = (nat_box_t *) list->elt;
if (box->host == addr)
{
box->nat++;
return (1);
}
list = list->next;
}
return (0);
}
int check_host(listes_t **list, listes_t **nat, u_int32_t
from,
tcp_opt_t *opt, struct timeval temps) {
listes_t *lst;
machine_t *mach;
int found, zaped;
found = zaped = 0;
lst = *list;
while (lst && !(found)) {
mach = (machine_t *) lst->elt;
if (mach->from == from) {
if ( temps.tv_sec - mach->first_seen.tv_sec > DELAY ) {
found = already_seen(mach, opt, temps);
} else zaped = 1;
}
lst = lst->next;
}
if (!(zaped) && !(found)) {
mach = malloc(sizeof (machine_t));
mach->from = from;
mach->first_seen = temps;
mach->first_timestamp = ntohl(opt->ts->ts);
add_in_list(list, mach);
update_nat(*nat, from);
show_nated(*nat);
return (1);
}
return (0);
}
void callback_sniffer(u_char *useless,
const struct pcap_pkthdr* pkthdr,
const u_char *packet)
{
static listes_t *list_machines = 0;
static listes_t *list_nat = 0;
const struct ip *ip_h;
const struct tcphdr *tcp_h;
tcp_opt_t tcp_opt;
machine_t *mach;
nat_box_t *nat;
struct in_addr my_addr;
ip_h = (struct ip *) (packet + sizeof(struct ether_header));
if (ip_h->ip_p == IPPROTO_TCP)
{
tcp_h = (struct tcphdr *) (packet + sizeof(struct ether_header) +
sizeof(struct ip));
if (tcp_h->th_off * 4 > 20) {
if (tcp_option_parser((u_char *) (packet + sizeof(struct
ether_header)
+ sizeof(struct ip) +
sizeof(struct tcphdr)),
&tcp_opt, tcp_h->th_off * 4 - 20))
{
if (is_in_list(list_machines, (ip_h->ip_src).s_addr)) {
check_host(&list_machines, &list_nat, (u_int32_t)
(ip_h->ip_src).s_addr, &tcp_opt, pkthdr->ts);
} else {
if (ntohl(tcp_opt.ts->ts) != 0)
{
addr = (ip_h->ip_src).s_addr;
my_addr.s_addr = addr;
mach = malloc(sizeof (machine_t));
mach->from = (ip_h->ip_src).s_addr;
mach->first_seen = pkthdr->ts;
mach->first_timestamp = ntohl(tcp_opt.ts->ts);
nat = malloc(sizeof (nat_box_t));
nat->host = (u_int32_t) (ip_h->ip_src).s_addr;
nat->nat = 1;
nat->first_seen = mach->first_seen;
add_in_list(&list_machines, mach);
add_in_list(&list_nat, nat);
}
}
}
}
}
}
int main(int ac, char *argv[])
{
pcap_t *sniff;
char errbuf[PCAP_ERRBUF_SIZE];
struct bpf_program fp;
char *device;
bpf_u_int32 maskp, netp;
struct in_addr my_ip_addr;
char filter[250];
if (getuid() != 0) {
printf("You must be root to use this tool.\n");
exit (2);
}
if (--ac != 1)
{
printf("Usage: ./natcount xl0\n");
return (1);
}
device = (++argv)[0];
pcap_lookupnet(device, &netp, &maskp, errbuf);
my_ip_addr.s_addr = (u_int32_t) netp;
printf("Using interface %s IP : %s\n", device, inet_ntoa(my_ip_addr));
if ((sniff = pcap_open_live(device, BUFSIZ, 1, 1000, errbuf)) == NULL)
{
printf("ERR: %s\n", errbuf);
exit(1);
}
bzero(filter, 250);
snprintf(filter, 250, "not src net %s", inet_ntoa(my_ip_addr));
if(pcap_compile(sniff,&fp, filter, 0, netp) == -1) {
fprintf(stderr,"Error calling pcap_compile\n");
exit(1);
}
if(pcap_setfilter(sniff,&fp) == -1) {
fprintf(stderr,"Error setting filter\n");
exit(1);
}
pcap_loop(sniff, -1, callback_sniffer, NULL);
return (0);
}
|=-----------------------------=[ 0x03-3 ]=------------------------------=|
|=---=[ All Hackers Need To Know About Elliptic Curve Cryptography ]=----=|
|=-----------------------------------------------------------------------=|
|=----------------------------=[ f86c9203 ]=-----------------------------=|
---[ Contents
0 - Abstract
1 - Algebraical Groups and Cryptography
2 - Finite Fields, Especially Binary Ones
3 - Elliptic Curves and their Group Structure
4 - On the Security of Elliptic Curve Cryptography
5 - The ECIES Public Key Encryption Scheme
6 - The XTEA Block Cipher, CBC-MAC and Davies-Meyer Hashing
7 - Putting Everything Together: The Source Code
8 - Conclusion
9 - Outlook
A - Appendix: Literature
B - Appendix: Code
---[ 0 - Abstract
Public key cryptography gained a lot of popularity since its invention
three decades ago. Asymmetric crypto systems such as the RSA
encryption scheme, the RSA signature scheme and the Diffie-Hellman Key
Exchange (DH) are well studied and play a fundamental role in modern
cryptographic protocols like PGP, SSL, TLS, SSH.
The three schemes listed above work well in practice, but they still
have a major drawback: the data structures are large, i.e. secure
systems have to deal with up to 2048 bit long integers. These are
easily handled by modern desktop computers; by contrast embedded
devices, handhelds and especially smartcards reach their computing
power limits quickly. As a second problem, of course, the
transportation of large integers "wastes" bandwidth. In 2048 bit
systems an RSA signature takes 256 bytes; that's quite a lot,
especially for slow communication links.
As an alternative to RSA, DH and suchlike the so called Elliptic Curve
Cryptography (ECC) was invented in the mid-eighties. The theory behind
it is very complicated and much more difficult than doing calculations
on big integers. This resulted in a delayed adoption of ECC systems
although their advantages over the classic cryptographic building
blocks are overwhelming: key lengths and the necessary processing
power are much smaller (secure systems start with 160 bit keys). Thus,
whenever CPU, memory or bandwidth are premium resources, ECC is a good
alternative to RSA and DH.
This article has two purposes:
1. It is an introduction to the theory of Elliptic Curve Cryptography.
Both, the mathematical background and the practical implementability
are covered.
2. It provides ready-to-use source code. The C code included and
described in this article (about 500 lines in total) contains a
complete secure public key crypto system (including symmetric
components: a block cipher, a hash function and a MAC) and is
released to the public domain.
The code doesn't link against external libraries, be they of
bigint, cryptographic or other flavour; an available libc is
sufficient. This satisfies the typical hacker need for compact and
independent programs that have to work in "inhospitable"
environments; rootkits and backdoors seem to be interesting
applications.
As mentioned above the theory behind EC cryptography is rather
complex. To keep this article brief and readable by J. Random Hacker
only the important results are mentioned, theorems are not proven,
nasty details are omitted. If on the other hand you are into maths and
want to become an ECC crack I encourage to start reading [G2ECC] or
[ECIC].
---[ 1 - Algebraical Groups and Cryptography
Definition. A set G together with an operation G x G -> G denoted by
'+' is called an (abelian algebraical) group if the following axioms
hold:
G1. The operation '+' is associative and commutative:
(a + b) + c = a + (b + c) for all a,b,c in G
a + b = b + a for all a,b in G
G2. G contains a neutral element '0' such that
a + 0 = a = 0 + a for all a in G
G3. For each element 'a' in G there exists an "inverse element",
denoted by '-a', such that
a + (-a) = 0.
For a group G the number of elements in G is called the group order,
denoted by |G|.
Example. The sets Z, Q and R of integers, rational numbers and real
numbers, respectively, form groups of infinite order in respect to
their addition operation. The sets Q* and R* (Q and R without 0) also
form groups in respect to multiplication (with 1 being the neutral
element and 1/x the inverse of x).
Definition. Let G be a group with operation '+'. A (nonempty) subset H
of G is called a subgroup of G if H is a group in respect to the same
operation '+'.
Example. Z is a subgroup of Q is a subgroup of R in respect to '+'.
In respect to '*' Q* is a subgroup of R*.
Theorem (Lagrange). Let G be a group of finite order and H be a
subgroup of G. Then |H| properly divides |G|.
It follows that if G has prime order, G has only two subgroups,
namely {0} and G itself.
We define the "scalar multiplication" of a natural number k with a
group element g as follows:
k * g := g + g + ... + g + g
\____ k times ____/
Theorem. For a finite group G and an element g in G the set of all
elements k * g (k natural) forms a subgroup of G. This subgroup
is named the "cyclic subgroup generated by g".
Thus a prime order group is generated by any of its nonzero elements.
We now introduce the Diffie-Hellman Key Exchange protocol: let G be a
prime order group and g a nonzero element. Let two players, called
Alice and Bob respectively, do the following:
1. Alice picks a (secret) random natural number 'a', calculates
P = a * g and sends P to Bob.
2. Bob picks a (secret) random natural number 'b', calculates
Q = b * g and sends Q to Alice.
3. Alice calculates S = a * Q = a * (b * g).
4. Bob calculates T = b * P = b * (a * g).
By definition of the scalar multiplication it is apparent that S =
T. Therefore after step 4 Alice and Bob possess the same value S. The
eavesdropper Eve, who recorded the exchanged messages P and Q, is able
to calculate the same value if she manages to determine 'a' or
'b'. This problem (calculating 'a' from G, g and 'a * g') is called
the group's Discrete Logarithm Problem (DLP).
In groups where DLP is too 'hard' to be practically solvable it is
believed to be out of reach for eavesdroppers to determine the value
S, hence Alice and Bob can securely establish a secret key which can
be used to protect further communication.
If an attacker is able to intercept the transmission of P and Q and to
replace both by the group's neutral element, obviously Alice and Bob
are forced to obtain S = 0 = T as shared key. This has to be
considered a successful break of the crypto system. Therefore both
Alice and Bob have to make sure that the received elements Q and P,
respectively, indeed do generate the original group.
The presented DH scheme may also serve as public key encryption scheme
(called ElGamal encryption scheme): let Alice pick a random natural
number 'a' as private key. The element P = a * g is the corresponding
public key. If Bob wants to encrypt a message for her, he picks a
random number 'b', symmetrically encrypts the message with key T = b *
P and transmits the cipher text along with Q = b * g to Alice. She
can reconstruct T = S via S = a * Q and then decrypt the message.
Note the direct relationship between this and the DH scheme!
Conclusion: Cryptographers are always seeking for finite prime order
groups with hard DLP. This is where elliptic curves come into play:
they induce algebraical groups, some of them suitable for DH and
ElGamal crypto systems. Moreover the elliptic curve arithmetic
(addition, inversion) is implementable in a relatively efficient way.
You will find more information about groups and their properties in
[Groups], [Lagrange], [CyclicGroups] and [GroupTheory]. Read more
about the DLP, DH key exchange and ElGamal encryption in [DLP], [DH]
and [ElGamal].
---[ 2 - Finite Fields, Especially Binary Ones
Definition. A set F together with two operations F x F -> F named
'+' and '*' is called a field if the following axioms hold:
F1. (F, +) forms a group
F2. (F*, *) forms a group (where F* is F without the
'+'-neutral element '0')
F3. For all a,b,c in G the distributive law holds:
a * (b + c) = (a * b) + (a * c)
For 'a + (-b)' we write shorter 'a - b'. Accordingly we write 'a / b'
when we multiply 'a' with the '*'-inverse of b.
To put it clearly: a field is a structure with addition, substraction,
multiplication and division that work the way you are familiar with.
Example. The sets Q and R are fields.
Theorem. For each natural m there exists a (finite) field GF(2^m) with
exactly 2^m elements. Fields of this type are called binary fields.
Elements of binary fields GF(2^m) can efficiently be represented by
bit vectors of length m. The single bits may be understood as
coefficients of a polynomial of degree < m. To add two field elements
g and h just carry out the polynomial addition g + h (this means: the
addition is done element-wise, i.e. the bit vectors are XORed
together). The multiplication is a polynomial multiplication modulo a
certain fixed reduction polynomial p: the elements' product is the
remainder of the polynomial division (g * h) / p.
The fact that field addition just consists of a bitwise XOR already
indicates that in binary fields F each element is its own additive
inverse, that is: a + a = 0 for all a in F. For a,b in F as
consequence 2*a*b = a*b + a*b = 0 follows, what leads to the (at the
first glance surprising) equality
(a + b)^2 = a^2 + b^2 for all a,b in F.
More about finite fields and their arithmetical operations can be
found in [FiniteField], [FieldTheory], [FieldTheoryGlossary] and
especially [FieldArithmetic].
---[ 3 - Elliptic Curves and their Group Structure
Definition. Let F be a binary field and 'a' and 'b' elements in F.
The set E(a, b) consisting of an element 'o' (the "point at
infinity") plus all pairs (x, y) of elements in F that satisfy
the equation
y^2 + x*y = x^3 + a*x^2 + b
is called the set of points of the binary elliptic curve E(a, b).
Theorem. Let E = E(a, b) be the point set of a binary elliptic curve
over the field F = GF(2^m). Then
1. E consists of approximately 2^m elements.
2. If (x, y) is a point on E (meaning x and y satisfy the above
equation) then (x, y + x) is also a point on E.
3. If two points P = (x1, y1) and Q = (x2, y2) on E with x1 != x2 are
connected by a straight line (something of the form y = m*x + b),
then there is exactly one third point R = (x3, y3) on E that is
also on this line. This induces a natural mapping fP, Q) -> R,
sometimes called chord-and-tangent mapping.
Exercise. Prove the second statement.
The chord-and-tangent mapping 'f' is crucial for the group structure
given naturally on elliptic curves:
a) The auxiliary element 'o' will serve as neutral element which may
be added to any curve point without effect.
b) For each point P = (x, y) on the curve we define the point
-P := (x, y + x) to be its inverse.
c) For two points P = (x1, y1) and Q = (x2, y2) the sum 'P + Q'
is defined as -f(P, Q).
It can be shown that the set E together with the point addition '+'
and the neutral element 'o' defacto has group structure. If the
curve's coefficients 'a' and 'b' are carefully chosen, there exist
points on E that generate a prime order group of points for which the
DLP is hard. Based on these groups secure crypto systems can be built.
The point addition on curves over the field R can be visualized. See
[EllipticCurve] for some nice images.
In ECC implementations it is essential to have routines for point
addition, doubling, inversion, etc. We present pseudocode for the
most important ones:
Let (x, y) be a point on the elliptic curve E(a, b). The point
(x', y') := 2 * (x, y) can be computed by
l = x + (y / x)
x' = l^2 + l + a
y' = x^2 + l*x' + x'
return (x', y')
For two points P = (x1, y1), Q = (x2, y2) the sum (x3, y3) = P + Q
can be computed by
l = (y2 + y1) / (x2 + x1)
x3 = l^2 + l + x1 + x2 + a
y3 = l(x1 + x3) + x3 + y1
return (x3, y3)
Some special cases where the point at infinity 'o' has to be
considered have been omitted here. Have a look at [PointArith] for
complete pseudocode routines. But nevertheless we see that point
arithmetic is easy and straight forward to implement. A handful of
field additions, multiplications plus a single division do the job.
The existence of routines that do point doubling and addition is
sufficient to be able to build an efficient "scalar multiplier": a
routine that multiplies a given curve point P by any given natural
number k. The double-and-add algorithm works as follows:
H := 'o'
let n be the number of the highest set bit in k
while(n >= 0) {
H = 2 * H;
if the nth bit in k is set:
H = H + P;
n--;
}
return H;
Example. Suppose you want to calculate k*P for k = 11 = 1011b. Then
n is initialized to 3 and H calculated as
H = 2 * (2 * (2 * (2 * 'o' + P)) + P) + P
= 2 * (2 * (2 * P) + P) + P
= 2 * (5 * P) + P
= 11 * P
Some elliptic curves that are suitable for cryptographic purposes have
been standardized. NIST recommends 15 curves (see [NIST]), among them
five binary ones called B163, B233, B283, B409 and B571. The
parameters of B163 are the following ([NISTParams]):
Field: GF(2^163)
Reduction poly: p(t) = t^163 + t^7 + t^6 + t^3 + 1
Coefficient a: 1
Coefficient b: 20a601907b8c953ca1481eb10512f78744a3205fd
x coordinate of g: 3f0eba16286a2d57ea0991168d4994637e8343e36
y coordinate of g: 0d51fbc6c71a0094fa2cdd545b11c5c0c797324f1
group order: 2 * 5846006549323611672814742442876390689256843201587
The field size is 2^163, the corresponding symmetric security level is
about 80 bits (see chapter 4). The field elements are given in
hexadecimal, the curve's order in decimal form as h * n, where h (the
"cofactor") is small and n is a large prime number. The point g is
chosen in a way that the subgroup generated by g has order n.
The source code included in this article works with B163. It can
easily be patched to support any other binary NIST curve; for this it
is sufficient to alter just 6 lines.
Exercise. Try it out: patch the sources to get a B409 crypto
system. You will find the curve's parameters in [NISTParams].
Read [EllipticCurve], [PointArith] and [DoubleAndAdd] for further
information.
---[ 4 - On the Security of Elliptic Curve Cryptography
We learned that the security of the DH key exchange is based on the
hardness of the DLP in the underlying group. Algorithms are known that
determine discrete logarithms in arbitrary groups; for this task no
better time complexity bound is known than that for Pollard's "Rho
Method" ([PollardRho]):
Theorem. Let G be a finite (cyclic) group. Then there exists an
algorithm that solves DLP in approximately sqrt(|G|) steps (and low
memory usage).
For elliptic curves no DLP solving algorithm is known that performs
better than the one mentioned above. Thus it is believed that the
ECCDLP is "fully exponential" with regard to the bit-length of
|G|. RSA and classical DH systems can, by contrast, be broken in
"subexponential" time. Hence their key lengths must be larger than
those for ECC systems to achieve the same level of security.
We already saw that elliptic curves over GF(2^m) contain about 2^m
points. Therefore DLP can be solved in about sqrt(2^m) steps, that is
2^(m/2). We conclude that m-bit ECC systems are equivalent to
(m/2)-bit symmetric ciphers in measures of security.
The following table compares equivalent key sizes for various crypto
systems.
ECC key size | RSA key size | DH key size | AES key size
-------------+--------------+-------------+-------------
160 | 1024 | 1024 | (80)
256 | 3072 | 3072 | 128
384 | 7680 | 7680 | 192
512 | 15360 | 15360 | 256
---[ 5 - The ECIES Public Key Encryption Scheme
Earlier we presented the DH Key Exchange and the ElGamal public key
crypto system built on top of it. The Elliptic Curve Integrated
Encryption Scheme (ECIES, see ANSI X9.63) is an enhancement of ElGamal
encryption specifically designed for EC groups. ECIES provides
measures to defeat active attacks like the one presented above.
Let E be an elliptic curve of order h * n with n a large prime
number. Let G be a subgroup of E with |G| = n. Choose a point P in G
unequal to 'o'.
We start with ECIES key generation:
Alice picks as private key a random number 'd' with 1 <= d < n;
She distributes the point Q := d * P as public key.
If Bob wants to encrypt a message m for Alice he proceeds as follows:
1. Pick a random number 'k' with 1 <= k < n.
2. Compute Z = h * k * Q.
3. If Z = 'o' goto step 1.
4. Compute R = k * P.
5. Compute (k1, k2) = KDF(Z, R) (see below).
6. Encrypt m with key k1.
7. Calculate the MAC of the ciphertext using k2 as MAC key.
8. Transmit R, the cipher text and the MAC to Alice.
Alice decrypts the cipher text using the following algorithm:
1. Check that R is a valid point on the elliptic curve.
2. Compute Z = h * d * R.
3. Check Z != 'o'.
4. Compute (k1, k2) = KDF(Z, R) (see below).
5. Check the validity of the MAC using key k2.
6. Decrypt m using key k1.
If any of the checks fails: reject the message as forged.
KDF is a key derivation function that produces symmetric keys k1, k2
from a pair of elliptic curve points. Just think of KDF being the
cryptographic hash function of your choice.
ECIES offers two important features:
1. If an attacker injects a curve point R that does not generate a
large group (this is the case in the attack mentioned above), this
is detected in steps 2 und 3 of the decryption process (the
cofactor plays a fundamental role here).
2. The message is not only encrypted in a secure way, it is also
protected from modification by a MAC.
Exercise. Implement a DH key exchange. Let E be a binary elliptic
curve or order h * n. Let G be a subgroup of E with |G| = n. Choose a
point g in G unequal to 'o'. Let Alice and Bob proceed as follows:
1. Alice picks a random number 'a' with 1 <= a < n and sends P = a * g
to Bob.
2. Bob picks a random number 'b' with 1 <= b < n and sends Q = b * g
to Alice.
3. Alice checks that Q is a point on the curve that generates a group
of order n (see the ECIES_public_key_validation routine). Alice
calculates S = a * Q.
4. Bob checks that P is a point on the curve that generates a group of
ordern n. He calculates T = b * P.
If everything went OK the equality S = T should hold.
---[ 6 - The XTEA Block Cipher, CBC-MAC and Davies-Meyer Hashing
XTEA is the name of a patent-free secure block cipher invented by
Wheeler and Needham in 1997. The block size is 64 bits, keys are 128
bits long. The main benefit of XTEA over its competitors AES, Twofish,
etc is the compact description of the algorithm:
void encipher(unsigned long m[], unsigned long key[])
{
unsigned long sum = 0, delta = 0x9E3779B9;
int i;
for(i = 0; i < 32; i++) {
m[0] += ((m[1] << 4 ^ m[1] >> 5) + m[1]) ^ (sum + key[sum & 3]);
sum += delta;
m[1] += ((m[0] << 4 ^ m[0] >> 5) + m[0]) ^ (sum + key[sum >> 11 & 3]);
}
}
Let E be a symmetric encryption function with block length n,
initialized with key k. The CBC-MAC of a message m is calculated as
follows:
1. Split m in n-bit-long submessages m1, m2, m3, ...
2. Calculate the intermediate values t0 = E(length(m)),
t1 = E(m1 XOR t0), t2 = E(m2 XOR t1), t3 = E(m3 XOR t2), ...
3. Return the last value obtained in step 2 as MAC(k, m) and
discard t0, t1, t2, ...
Next we show how a block cipher can be used to build a cryptographic
hash function using the "Davies-Meyer" construction. Let m be the
message that is to be hashed. Let E(key,block) be a symmetric
encryption function with block length n and key length l.
1. Split m in l-bit-long submessages m1, m2, m3, ...
2. Calculate the intermediate values h1 = E(m1, 0), h2 = E(m2, h1) XOR
h1, h3 = E(m3, h2) XOR h2, ...
3. If h is the last intermediate value obtained in step 2 return
E(length(m), h) XOR h as hash value and discard h1, h2, h3, ...
The code included in this article uses the block cipher XTEA in
counter mode (CTR) for encryption, a CBC-MAC garantees message
authenticity; finally KDF (see chapter 5) is implemented using XTEA in
Davies-Meyer mode.
Read [XTEA] and [DMhashing] to learn more about the XTEA block cipher
and the Davies-Meyer construction.
---[ 7 - Putting Everything Together: The Source Code
The public domain source code you find at the end of this document
implements the ECIES public key encryption system over the curve
B163. The code is commented, but we outline the design here.
1. The central data structure is a bit vector of fixed but "long"
length. It is the base data type used to represent field elements
and suchlike. The dedicated typedef is called bitstr_t.
Appropriate routines for bit manipulation, shifting, bitcounting,
importing from an ASCII/HEX representation, etc do exist.
2. The functions with "field_" prefix do the field arithmetic:
addition, multiplication and calculation of the multiplicative
inverse of elements are the important routines.
3. ECC points are represented as pairs of elem_t (an alias for
bitstr_t), the special point-at-infinity as the pair (0,0). The
functions prefixed with "point_" act on elliptic curve points and
implement basic point operations: point addition, point doubling,
etc.
4. The function "point_mult" implements the double-and-add algorithm
to compute "k * (x,y)" in the way described in chapter 3 .
5. The "XTEA"-prefixed functions implement the XTEA block cipher,
but also the CBC-MAC and the Davies-Meyer construction.
6. The "ECIES_"-routines do the ECIES related work.
ECIES_generate_key_pair() generates a private/public key pair,
ECIES_public_key_validation() checks that a given point is
on the curve and generates a group of order "n".
ECIES_encryption/ECIES_decryption do what their names imply.
7. A demonstration of the main ECIES functionalities is given in the
program's main() section.
The code may be compiled like this:
gcc -O2 -o ecc ecc.c
---[ 8 - Conclusion
We have seen how crypto systems are built upon algebraical groups that
have certain properties. We further gave an introduction into a special
class of appropriate groups and their theory, namely to the binary
elliptic curves. Finally we presented the secure public key encryption
scheme ECIES (together with necessary symmetrical components). All
this is implemented in the source code included in this article.
We recall that besides security the central design goal of the code
was compactness, not speed or generality. Libraries specialized on EC
cryptography benefit from assembler hand-coded field arithmetic
routines and easily perform a hundred times faster than this code.
If compactness is not essential for your application you might opt for
linking against one of the following ECC capable free crypto libraries
instead:
Crypto++ (C++) http://www.eskimo.com/~weidai/cryptlib.html
Mecca (C) http://point-at-infinity.org/mecca/
LibTomCrypt (C) http://libtomcrypt.org/
borZoi (C++/Java) http://dragongate-technologies.com/products.html
---[ 9 - Outlook
You have learned a lot about elliptic curves while reading this
article, but there still remains a bunch of unmentioned ideas. We
list some important ones:
1. Elliptic curves can be defined over other fields than binary ones.
Let p be a prime number and Z_p the set of nonnegative integers
smaller than p. Then Z_p forms a finite field (addition and
multiplication have to be understood modulo p, see
[ModularArithmetic] and [FiniteField]).
For these fields the elliptic curve E(a, b) is defined to be the
set of solutions of the equation
y^2 = x^3 + ax + b
plus the point at infinity 'o'. Of course point addition and
doubling routines differ from that given above, but essentially
these "prime curves" form an algebraical group in a similar way as
binary curves do. It is not that prime curves are more or less
secure than binary curves. They just offer another class of groups
suitable for cryptographic purposes.
NIST recommends five prime curves: P192, P224, P256, P384 and P521.
2. In this article we presented the public key encryption scheme
ECIES. It should be mentioned that ECC-based signature schemes
(see [ECDSA]) and authenticated key establishment protocols ([MQV])
do also exist. The implementation is left as exercise to the
reader.
3. Our double-and-add point multiplicator is very rudimentary. Better
ones can do the "k * P" job in half the time. We just give the idea
of a first improvement:
Suppose we want to calculate 15 * P for a curve point P. The
double-and-add algorithm does this in the following way:
15 * P = 2 * (2 * (2 * (2 * 'o' + P) + P) + P) + P
This takes three point doublings and three point additions
(calculations concerning 'o' are not considered).
We could compute 15 * P in a cleverer fashion:
15 * P = 16 * P - P = 2 * 2 * 2 * 2 * P - P
This takes four doublings plus a single addition; hence we may
expect point multiplicators using this trick to be better
performers than the standard double-and-add algorithm. In practice
this trick can speed up the point multiplication by about 30%.
See [NAF] for more information about this topic.
4. In implementations the most time consuming field operation is
always the element inversion. We saw that both the point addition
and the point doubling routines require one field division each.
There is a trick that reduces the amount of divisions in a full "k
* P" point multiplication to just one. The idea is to represent the
curve point (x,y) as triple (X,Y,Z) where x = X/Z, y = Y/Z. In this
"projective" representation all field divisions can by deferred to
the very end of the point multiplication, where they are carried
out in a single inversion.
Different types of coordinate systems of the projective type
are presented in [CoordSys].
2. Analyzing a 'strange' binary file
3. Analyzing a 'strange' process
4. Security Forensics using DTrace
5. Conclusion
--[ Introduction
The art of security forensics requires lots of patience, creativity and
observation. You may not always be successful in your endeavours but
constantly 'sharpening' your skills by hands-on practicing, learning a
couple more things here and there in advance will definitely help.
In this article I'd like to share my personal experience in analyzing
suspicious binary files and processes that you may find on the system. We
will use only standard, out of the box, UNIX utilities. The output for all
the examples in the article is provided for Solaris OS.
--[ Analyzing a 'strange' binary file
During your investigation you may encounter some executable (binary) files
whose purpose in your system you don't understand. When you try to read
this file it displays 'garbage'. You cannot recognize this file by name
and you are not sure if you saw it before.
Unfortunately, you cannot read the binary file with more, cat, pg, vi or
other utilities that you normally use for text files. You will need other
tools. In order to read such files, I use the following tools: strings,
file, ldd, adb, and others.
Let's assume, for example, that we found a file called cr1 in the /etc
directory. The first command to run on this file is strings(1). This will
show all printable strings in the object or binary file:
$ strings cr1 | more
%s %s %s%s%s -> %s%s%s (%.*s)
Version: 2.3
Usage: dsniff [-cdmn] [-i interface] [-s snaplen] [-f services]
[-t trigger[,...]] [-r|-w savefile] [expression]
...
/usr/local/lib/dsniff.magic
/usr/local/lib/dsniff.services
...
The output is very long, so some of it has been omitted. But you can see
that it shows that this is actually a dsniff tool masquerading as cr1.
Sometimes you may not be so lucky in finding the name of the program,
version, and usage inside the file. If you still don't know what this file
can do, try to run strings with the 'a' flag, or just '-'. With these
options, strings will look everywhere in the file for strings. If this flag
is omitted, strings only looks in the initialized data space of the object
file:
$ strings cr1 | more
Try to compare this against the output from known binaries to get an idea
of what the program might be.
Alternatively, you can use the nm(1) command to print a name list of an
object file:
$ /usr/ccs/bin/nm -p cr1 | more
cr1:
[Index] Value Size Type Bind Other Shndx Name
[180] |0 | 0| FILE | LOCL | 0 |ABS | decode_smtp.c
[2198] |160348| 320| FUNC | GLOB | 0 | 9 | decode_sniffer
Note that the output of this command may contain thousands of lines,
depending on the size of the object file. You can run nm through pipe to
more or pg, or redirect the output to the file for further analysis.
To check the runtime linker symbol table - calls of shared library routines,
use nm with the '-Du' options, where -D displays the symbol table used by
ld.so.1 and is present even in stripped dynamic executables, and -u prints
a long listing for each undefined symbol.
You can also dump selected parts of any binary file with the dump(1) or
elfdump(1) utilities. The following command will dump the strings table of
cr1 binary:
$ /usr/ccs/bin/dump -c ./cr1 | more
You may use the following options to dump various parts of the file:
-c Dump the string table(s).
-C Dump decoded C++ symbol table names.
-D Dump debugging information.
-f Dump each file header.
-h Dump the section headers.
-l Dump line number information.
-L Dump dynamic linking information and static shared library
information, if available.
-o Dump each program execution header.
-r Dump relocation information.
-s Dump section contents in hexadecimal.
-t Dump symbol table entries.
Note: To display internal version information contained within an ELF file,
use the pvs(1) utility.
If you are still not sure what the file is, run the command file(1):
$ file cr1
cr1: ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+
Required, UltraSPARC1 Extensions Required, dynamically linked, not
stripped
Based on this output, we can tell that this is an executable file for SPARC
that requires the availability of libraries loaded by the OS (dynamically
linked). This file also is not stripped, which means that the symbol tables
were not removed from the compiled binary. This will help us a lot when we
do further analysis.
Note: To strip the symbols, do strip <my_file>.
The file command could also tell us that the binary file is statically
linked, with debug output or stripped.
Statically linked means that all functions are included in the binary, but
results in a larger executable. Debug output - includes debugging symbols,
like variable names, functions, internal symbols, source line numbers, and
source file information. If the file is stripped, its size is much smaller.
The file command identifies the type of a file using, among other tests, a
test for whether the file begins with a certain magic number (see the
/etc/magic file). A magic number is a numeric or string constant that
indicates the file type. See magic(4) for an explanation of the format of
/etc/magic.
If you still don't know what this file is used for, try to guess this by
taking a look at which shared libraries are needed by the binary using
ldd(1) command:
$ ldd cr1
...
libsocket.so.1 => /usr/lib/libsocket.so.1
librpcsvc.so.1 => /usr/lib/librpcsvc.so.1
...
This output tells us that this application requires network share libraries
(libsocket.so.1 and librpcsvc.so.1).
The adb(1) debugger can also be very useful. For example, the following
output shows step-by-step execution of the binary in question:
# adb cr1
:s
adb: target stopped at:
ld.so.1`_rt_boot: ba,a +0xc
<ld.so.1`_rt_boot+0xc>
,5:s
adb: target stopped at:
ld.so.1`_rt_boot+0x58: st %l1, [%o0 + 8]
You can also analyze the file, or run it and see how it actually works. But
be careful when you run an application because you don't know yet what to
expect. For example:
# adb cr1
:r
Using device /dev/hme0 (promiscuous mode)
192.168.2.119 -> web TCP D=22 S=1111 Ack=2013255208
Seq=1407308568 Len=0 Win=17520
web -> 192.168.2.119 TCP D=1111 S=22 Push Ack=1407308568
We can see that this program is a sniffer. See adb(1) for more information
of how to use the debugger.
If you decide to run a program anyway, you can use truss(1). The truss
command allows you to run a program while outputting system calls and
signals.
Note: truss produces lots of output. Redirect the output to the file:
$ truss -f -o cr.out ./cr1
listening on hme0
^C
$
Now you can easily examine the output file cr.out.
As you can see, many tools and techniques can be used to analyze a strange
file. Not all files are easy to analyze. If a file is a statically linked
stripped binary, it would be much more difficult to find what a file
(program) is up to. If you cannot tell anything about a file using simple
tools like strings and ldd, try to debug it and use truss. Experience using
and analyzing the output of these tools, together with a good deal of
patience, will reward you with success.
--[ Analyzing a 'strange' process
What do you do if you find a process that is running on your system, but
you don't know what it is doing? Yes, in UNIX everything is a file, even a
process! There may be situations in which the application runs on the
system but a file is deleted. In this situation the process will still run
because a link to the process exists in the /proc/[PID]/object/a.out
directory, but you may not find the process by its name running the find(1)
command.
For example, let's assume that we are going to investigate the process
ID 22889 from the suspicious srg application that we found running on our
system:
# ps -ef | more
UID PID PPID C STIME TTY TIME CMD
...
root 22889 16318 0 10:09:25 pts/1 0:00 ./srg
...
Sometimes it is as easy as running the strings(1) command against the
/proc/[PID]/object/a.out to identify the process.
# strings /proc/22889/object/a.out | more
...
TTY-Watcher version %s
Usage: %s [-c]
-c turns on curses interface
NOTE: Running without root privileges will only allow you to monitor
yourself.
...
We can see that this command is a TTY-Watcher application that can see all
keystrokes from any terminal on the system.
Suppose we were not able to use strings to identify what this process is
doing. We can examine the process using other tools.
You may want to suspend the process until you will figure out what it is.
For example, run kill -STOP 22889 as root. Check the results. We will look
for 'T' which indicates the process that was stopped:
# /usr/ucb/ps | grep T
root 22889 0.0 0.7 3784 1720 pts/1 T 10:09:25 0:00 ./srg
Resume the process if necessary with kill -CONT <PID>
To further analyze the process, we will create a \core dump\ of variables
and stack of the process:
# gcore 22889
gcore: core.22889 dumped
Here, 22889 is the process ID (PID). Examine results of the core.22889 with
strings:
# strings core.22889 | more
...
TTY-Watcher version %s
Usage: %s [-c]
-c turns on curses interface
NOTE: Running without root privileges will only allow you to monitor
yourself.
...
You may also use coreadm(1M) to analyze the core.22889 file. The coreadm
tool provides an interface for managing the parameters that affect core
file creation. The coreadm command modifies the /etc/coreadm.conf file.
This file is read at boot time and sets the global parameters for core
dump creation.
First, let's set our core filenames to be of the form core.<PROC NAME>.<PID>.
We'll do this only for all programs we execute in this shell (the $$
notation equates to the PID of our current shell):
$ coreadm -p core.%f.%p $$
The %f indicates that the program name will be included, and the %p
indicates that the PID will be appended to the core filename.
You may also use adb to analyze the process. If you don't have the object
file, use the /proc/[PID]/object/a.out. You can use a core file for the
process dumped by gcore or specify a '-' as a core file. If a dash (-) is
specified for the core file, adb will use the system memory to execute the
object file. You can actually run the object file under the adb control (it
could also be dangerous because you don't know for sure what this
application is supposed to do!):
# adb /proc/22889/object/a.out -
main:b
:r
breakpoint at:
main: save %sp, -0xf8, %sp
...
:s
stopped at:
main+4: clr %l0
:s
stopped at:
main+8: sethi %hi(0x38400), %o0
$m
? map
...
b11 = ef632f28 e11 = ef6370ac f11 = 2f28 `/usr/lib/libsocket.so.1'
$q
We start the session by setting a breakpoint at the beginning of main() and
then begin execution of a.out by giving adb the ':r' command to run.
Immediately, we stop at main(), where our breakpoint was set. Next, we list
the first instruction from the object file. The ':s' command tells adb to
step, executing only one assembly instruction at a time.
Note: Consult the book Panic!, by Drake and Brown, for more information on
how to use adb to analyze core dumps.
To analyze the running process, use truss:
# truss -vall -f -o /tmp/outfile -p 22889
# more /tmp/outfile
On other UNIX systems, where available, you may trace a process by using the
ltrace or strace commands. To start the trace, type ltrace -p <PID>.
To view the running process environment, you may use the following:
# /usr/ucb/ps auxeww 22889
USER PID %CPU %MEM SZ RSS TT S START TIME COMMAND
root 22889 0.0 0.4 1120 896 pts/1 S 14:15:27 0:00 -
sh _=/usr/bin/csh
MANPATH=/usr/share/man:/usr/local/man HZ=
PATH=/usr/sbin:/usr/bin:/usr/local/bin:/usr/ccs/bin:/usr/local/sbin:
/opt/NSCPcom/ LOGNAME=root SHELL=/bin/ksh HOME=/
LD_LIBRARY_PATH=/usr/openwin/lib:/usr/local/lib TERM=xterm TZ=
The /usr/ucb directory contains SunOS/BSD compatibility package commands. The
/usr/ucb/ps command displays information about processes. We used the
following options (from the man for ps(1B)):
-a Include information about processes owned by others.
-u Display user-oriented output. This includes fields USER, %CPU,o
%MEM, SZ, RSS and START as described below.
-x Include processes with no controlling terminal.
-e Display the environment as well as the arguments to the command.
-w Use a wide output format (132 columns rather than 80); if repeated,
that is, -ww, use arbitrarily wide output. This information is
used to decide how much of long commands to print.
To view the memory address type:
# ps -ealf | grep 22889
F S UID PID PPID C PRI NI ADDR SZ WCHAN
STIME TTY TIME CMD
8 S root 3401 22889 0 41 20 615a3b40 474 60ba32e6 14:16:49
pts/1 0:00 ./srg
To view the memory usage, type:
# ps -e -opid,vsz,rss,args
PID VSZ RSS COMMAND
...
22889 3792 1728 ./srg
We can see that the ./srg uses 3,792 K of virtual memory, 1,728 of which
have been allocated from physical memory.
You can use the /etc/crash(1M) utility to examine the contents of a proc
structure of the running process:
# /etc/crash
dumpfile = /dev/mem, namelist = /dev/ksyms, outfile = stdout
> p
PROC TABLE SIZE = 3946
SLOT ST PID PPID PGID SID UID PRI NAME FLAGS
...
66 s 22889 16318 16337 24130 0 58 srg load
> p -f 66
PROC TABLE SIZE = 3946
SLOT ST PID PPID PGID SID UID PRI NAME FLAGS
66 s 22889 16318 16337 24130 0 58 srg load
Session: sid: 24130, ctty: vnode(60b8f3ac) maj( 24) min( 1)
...
>
After invoking the crash utility, we used the p function to get the process
table slot (66, in this case). Then, to dump the proc structure for process
PID 22889, we again used the p utility, with the '-f' flag and the process
table slot number.
Like the process structure, the uarea contains supporting data for signals,
including an array that defines the disposition for each possible signal.
The signal disposition tells the operating system what to do in the event
of a signal - ignore it, catch it and invoke a user-defined signal handler,
or take the default action. To dump a process's uarea:
> u 66
PER PROCESS USER AREA FOR PROCESS 66
PROCESS MISC:
command: srg, psargs: ./srg
start: Mon Jun 3 08:56:40 2002
mem: 6ad, type: exec su-user
vnode of current directory: 612daf48
...
>
The 'u' function takes a process table slot number as an argument.
To dump the address space of a process, type:
# /usr/proc/bin/pmap -x 22889
To obtain a list of process's open files, use the /usr/proc/bin/pfiles
command:
# /usr/proc/bin/pfiles 22889
The command lists the process name and PID for the process' open files. Note
that various bits of information are provided on each open file, including
the file type, file flags, mode bits, and size.
If you cannot find a binary file and the process is on the memory only, you
can still use methods described for analyzing suspicious binary files above
against the process's object file. For example:
# /usr/ccs/bin/nm a.out | more
a.out:
[Index] Value Size Type Bind Other Shndx Name
...
[636] | 232688| 4|OBJT |GLOB |0 |17 |Master_utmp
[284] | 234864| 20|OBJT |GLOB |0 |17 |Mouse_status
You may also use mdb(1) - a modular debugger to analyze the process:
# mdb -p 22889
Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libuutil.so.1 ]
> :bjects
BASE LIMIT SIZE NAME
10000 62000 52000 ./srg
ff3b0000 ff3dc000 2c000 /lib/ld.so.1
ff370000 ff37c000 c000 /lib/libsocket.so.1
ff280000 ff312000 92000 /lib/libnsl.so.1
--[ Security Forensics using DTrace
Solaris 10 has introduced a new tool for Dynamic Tracing in the OS
environment - dtrace. This is a very powerful tool that allows system
administrators to observe and debug the OS behaviour or even to dynamically
modify the kernel. Dtrace has its own C/C++ like programming language called
'D language' and comes with many different options that I am not going to
discuss here. Consult dtrace(1M) man pages and
http://docs.sun.com/app/docs/doc/817-6223 for more information.
Although this tool has been designed primarily for developers and
administrators, I will explain how one can use dtrace for analyzing
suspicious files and process.
We will work on a case study, as followes. For example, let's assume that we
are going to investigate the process ID 968 from the suspicious srg
application that we found running on our system.
By typing the following at the command-line, you will list all files that
this particular process opens at the time of our monitoring. Let it run for
a while and terminate with Control-C:
# dtrace -n syscall:pen:entry'/pid == 968/
{ printf("%s%s",execname,copyinstr(arg0)); }'
dtrace: description 'syscall:pen*:entry' matched 2 probes
^C
CPU ID FUNCTION:NAME
0 14 open:entry srg /var/ld/ld.config
0 14 open:entry srg /lib/libdhcputil.so.1
0 14 open:entry srg /lib/libsocket.so.1
0 14 open:entry srg /lib/libnsl.so.1
D language comes with its own terminology, which I will try to address here
briefly.
The whole 'syscall:pen:entry' construction is called a 'probe' and
defines a location or activity to which dtrace binds a request to perform
a set of 'actions'. The 'syscall' element of the probe is called a 'provider'
and, in our case, permits to enable probes on 'entry' (start) to any 'open'
Solaris system call ('open' system call instracts the kernel to open a file
for reading or writing).
The so-called 'predicate' - /pid == 968/ uses the predefined dtrace
variable 'pid', which always evaluates to the process ID associated with
the thread that fired the corresponding probe.
The 'execname' and 'copyinstr(arg0)' are called 'actions' and define the
name of the current process executable file and convert the first integer
argument of the system call (arg0) into a string format respectively. The
printf's action uses the same syntax as in C language and serves for the
same purpose - to format the output.
Each D program consists of a series of 'clauses', each clause describing one
or more probes to enable, and an optional set of actions to perform when the
probe fires. The actions are listed as a series of statements enclosed in
curly braces { } following the probe name. Each statement ends with a
semicolon (.
You may want to read the Introduction from Solaris Tracing Guide
(http://docs.sun.com/app/docs/doc/817-6223) for more options and to
understand the syntax.
Note: As the name suggests, the dtrace (Dynamic Trace) utility will show you
the information about a chnaging process - in dynamic. That is, if the
process is idle (doesn't do any system calls or opens new files), you won't
be able to get any information. To analyze the process, either restart it or
use methods described in the previous two sections of this paper.
Second, we will use the following command-line construction to list all
system calls for 'srg'. Let it run for a while and terminate by Control-C:
# dtrace -n 'syscall:::entry /execname == "srg"/ { @num[probefunc] =
count(); }'
dtrace: description 'syscall:::entry ' matched 226 probes
^C
pollsys 1
getrlimit 1
connect 1
setsockopt 1
...
You may recognize some of the building elements of this small D program. In
addition, this clause defines an array named 'num' and assigns the
appropriate member 'probefunc' (executed system call's function) the namber
of times these particular functions have been called (count()).
Using dtrace we can easily emulate all utilities we have used in the
previous sections to analyze suspicious binary files and processes. But
dtrace is much more powerful tool and may provide one with more
functionality: for example, you can dynamically monitor the stack of the
process in question:
# dtrace -n 'syscall:::entry/execname == "srg"/{ustack()}'
0 286 lwp_sigmask:entry
libc.so.1`__systemcall6+0x20
libc.so.1`pthread_sigmask+0x1b4
libc.so.1`sigprocmask+0x20
srg`srg_alarm+0x134
srg`scan+0x400
srg`net_read+0xc4
srg`main+0xabc
srg`_start+0x108
Based on all our investigation (see the list of opened files, syscalls,
and the stack examination above), we may positively conclude that srg is a
network based application. Does it write to the network? Let's check it by
constructing the following clause:
# dtrace -n 'mib:ip::/execname == "srg"/{@[execname]=count()}'
dtrace: description 'mib:ip::' matched 412 probes
dtrace: aggregation size lowered to 2m
^C
srg 520
It does. We used 'mib' provider to find out if our application transmits
to the network.
Could it be just a sniffer or a netcat-liked application that is bounded
to a specific port? Let's run dtrace in the truss(1) like fashion to answer
this question (inspired by Brendan Gregg's dtruss utility ):
#!/usr/bin/sh
#
dtrace='
inline string cmd_name = "'$1'";
/*
** Save syscall entry info
*/
syscall:::entry
/execname == cmd_name/
{
/* set start details */
self->start = timestamp;
self->arg0 = arg0;
self->arg1 = arg1;
self->arg2 = arg2;
}
/* Print data */
syscall::write:return,
syscall:write:return,
syscall::*read*:return
/self->start/
{
printf("%s(0x%X, \"%S\", 0x%X)\t\t = %d\n",probefunc,self->arg0,
stringof(copyin(self->arg1,self->arg2)),self->arg2,(int)arg0);
self->arg0 = arg0;
self->arg1 = arg1;
self->arg2 = arg2;
}
'
# Run dtrace
/usr/sbin/dtrace -x evaltime=exec -n "$dtrace" >&2
Save it as truss.d, change the permissions to executable and run:
# ./truss.d srg
0 13 write:return write(0x1, " sol10 -
> 192.168.2.119 TCP D=3138 S=22 Ack=713701289 Seq=3755926338 Len=0
Win=49640\n8741 Len=52 Win=16792\n\0", 0x5B) = 91
0 13 0 13
write:return write(0x1, "192.168.2.111 -> 192.168.2.1 UDP D=1900
S=21405 LEN=140\n\0", 0x39) = 57
^C
Looks like a sniffer to me, with probably some remote logging (remember the
network transmission by ./srg discovered by the 'mib' provider above!).
You can actually write pretty sophisticated programs for dtrace using D
language.
Take a look at /usr/demo/dtrace for some examples.
You may also use dtrace for other forensic activities. Below is an example
of more complex script that allows monitoring of who fires the suspicious
application and starts recording of all the files opened by the process:
#!/usr/bin/sh
command=$1
/usr/sbin/dtrace -n '
inline string COMMAND = "'$command'";
#pragma D option quiet
/*
** Print header
*/
dtrace:::BEGIN
{
/* print headers */
printf("%-20s %5s %5s %5s %s\n","START_TIME","UID","PID","PPID","ARGS");
}
/*
** Print exec event
*/
syscall::exec:return, syscall::exece:return
/(COMMAND == execname)/
{
/* print data */
printf("%-20Y %5d %5d %5d %s\n",walltimestamp,uid,pid,ppid,
stringof(curpsinfo->pr_psargs));
s_pid = pid;
}
/*
** Print open files
*/
syscall:pen*:entry
/pid == s_pid/
{
printf("%s\n",copyinstr(arg0));
}
'
Save this script as wait.d, change the permissions to executable
'chmod +x wait.d' and run:
# ./wait.d srg
START_TIME UID PID PPID ARGS
2005 May 16 19:51:20 100 1582 1458 ./srg
/var/ld/ld.config
/lib/libnsl.so.1
/lib/libsocket.so.1
/lib/libresolv.so.2
...
^C
Once the srg is started you will see the output.
However, the real power of dtrace comes from the fact that you can do
things with it that won't be possible without writing a comprehensive
C program. For example, the shellsnoop application written by Brendan Gregg
(http://users.tpg.com.au/adsln4yb/DTrace/shellsnoop) allows you to use
dtrace at the capacity of ttywatcher!
It is not possible to show all capabilities of dtrace in such a small
presentation of this amazing utility. Dtrace is a very powerful as well a
complex tool with virtually endless capabilities. Although Sun insists that
you don't have to have a 'deep understanding of the kernel for DTrace to be
useful', the knowledge of Solaris internals is a real asset. Taking a look
at the include files in /usr/include/sys/ directory may help you to write
complex D scripts and give you more of an understanding of how Solaris 10
is implemented.
--[ Conclusion
Be creative and observant. Apply all your knowledge and experience for
analyzing suspicious binary files and processes. Also, be patient and have
a sense of humour!
|=-------------------------=[ 0x03-2 ]=----------------------------------=|
|=----------=[ TCP Timestamp To count Hosts behind NAT ]=----------------=|
|=-----------------------------------------------------------------------=|
|=-------------=[ Elie aka Lupin (lupin@zonart.net) ]=-------------------=|
Table of Contents
*=*=*=*=*=*=*=*=*
1.0 - Introduction
2.0 - Time has something to tell us
- 2.1 Past history
- 2.2 Present
- 2.3 Back to the begin of timestamp history
- 2.4 Back to school
- 2.5 Back to the NAT
- 2.6 Let's do PAT
- 2.7 Time to fightback
3.0 History has something to tell us
- 3.1 Which class ?
- 3.2 So were does it come from ?
- 3.3 How do you find it ?
- 3.4 Back to the future
- 4 Learning from the past aka conclusion
- A Acknowledgements
- B Proof of concept
--[ 1.0 - Introduction
This article is about TCP timestamp option. This option is used to
offer a new way for counting host beyond a NAT and enhanced host
fingerprinting. More deeply, this article tries to give a new vision of a
class of bug known has "Design error". The bug described here, deserves
interest for the following reasons.
- It's new.
- It affects every platform since it is related to the specification
rather than implementation.
- It's a good way to explain how some specifications can be broken.
The article is organized has follow : First I will explain what's
wrong about TCP timestamp. Then I will describe How to exploit it, the
limitations of this exploitation and a way to avoid it. In the second part
I will talk about the origin of this error and why it will happen again. At
the end I will give a proof of concept and greeting as usual.
--[ 2.0 - Time has something to tell us
----[ 2.1 - Past history
Fingerprinting and Nat detection have been an active field for long
time. Since you read phrack you already know the old school TCP/IP
fingerprinting by Fyodor.
You may also know p0f (Passive of fingerprinting) by M. Zalewski. With
the version 2 he has done a wonderful tool, introducing clever ways to know
if a host uses the NAT mechanism by analyzing TCP packet option. If you are
interested in this tool (and you should !) read his paper :
"Dr. Jekyll had something to Hyde"[5].
In fact the technique described here is related to p0f in the way, that
like p0f, it can be totally passive.
To be complete about NAT detection, I need to mention that AT&T has
done research on counting host behind a NAT[1]. Their work focus on IP ID,
assuming that this value is incremental in some OS. In fact they are mainly
talking about Windows box which increment IP ID by 256 for each packet.
Discovered by Antirez[7], Nmap[6] has used this fact for a long time
(option -sI).
Now that we know what we are talking about it's time to explain what's
going on.
----[ 2.2 - Present
NAT was designed to face the IP address depletion. It is also used to
hide multiple hosts behind a single IP. The TCP timestamp option[2] is
improperly handled by the IP Network Address Translator (NAT) mechanism[3].
In other words even scrubbing from pf doesn't rewrite the timestamp option.
Until now this property of the NAT has been useless (in the security point
of view). It is interesting to point out that the timestamp option by itself
has already been used for information disclosure. Let's take a quick look
at timestamp security history
----[ 2.3 - Back to the beginning of timestamp history
In the past the timestamp has been used to calculate the uptime of a
computer[4]. Any one who had try the TCP fingerprint option (-O) of Nmap
has been impressed by a line like this one :
"Uptime 36.027 days (since Tue May 25 11:12:31 2004)".
Of course their is no black magic behind that, only two facts :
- Time goes back only in movie (sorry boys...)
- Every OS increments the timestamp by one every n milliseconds.
So if you know the OS, you know how often the OS increment the timestamp
option. All you have to do to know the uptime is to apply a trivial math
formula :
timestamp / num inc by sec = uptime in sec
Has you can notice this formula does not take into account the warp
around of integer. Here we know two information : the actual timestamp and
the number of increments by second. This can only be done because we know
the OS type. Let's see how we can improve this technique to do it without
knowing the OS.
----[ 2.4 - Back to school
Remember a long time ago at school, you heard about affine function.
A basic example of it is :
"y = Ax + B"
where A is the slope and B the initial point.
The graphic representation of it is straight line. From timestamp point of
view this can be express has the follow :
timestamp = numincbysec * sec + intial number
When you do active fingerprinting you get the timestamp and know the
numincbysec by guessing the OS.
Now let's suppose you can't guess the OS. In this case you don't know
the slope and can't guess the uptime. Here is an other way to know the
slope of the OS. You need to get the computer timestamp twice. Name it ts1
and ts2 and name the time (in sec) where you gather it t1 and t2.
With thoses informations, it is trivial to find the slope since we have the
following equationnal system:
ts1 = A*s1 + B
ts2 = A*s2 + B
which is solved by the following equation :
ts1 - ts2 = A*(s1 - s2) <=> A = (ts1 - ts2) / (s1 - s2)
An imediate application of this idea can be implemented in active scanner:
requeste twice the timestamp to verify that the slope is the
same as the one guessed.
This can be use to defeat some anti-fingerprint tools. It also can be used
as a standalone fingerprinting technic but will not be accurate has the TCP
or ICMP one.
Now that we have the theory ready, let's go back to NAT.
----[ 2.5 - Back to the NAT
Let's make the connection with the NAT. Since the timestamp option is
not rewritten by NAT, we can count the number of host behind the NAT using
the following algorithm :
1. for each host already discovered verifying is the packet belong to it
straight line equation. each host has a unique straight line equation
until two host have booted at the same second.
2. otherwise add the packet to unmatched packet : a new host beyond NAT is
detected.
Look to the proof of concept if you need to make things more clear.
This simple algorithm has a lot of room for improvement. It has been keeped
has simple has possible for clarity. As you can see timestamp option can be
used to count host beyond a NAT in a reliable manner. It will also giveo
indication of the OS class.
----[ 2.6 - Let's do PAT
PAT (Port Address Translation) is used to provide service on a box
behind a NAT.
The question is how do I know that the port is forwarded?
Well timestamp is once again your friend. If for two different ports the
slope of timestamp differs then there is a PAT and the OS of the two
computers is different. If the timestamp gathered from the two ports does
not belong to the same straight line, then it's the same OS but not the
same computer.
Another interesting use of PAT is the round robin. Until now their were
no way to know if such mechanism is used. By comparing the different
timestamps gathered you can determine how many hosts are beyond a single
port. This might be an interesting functionality to add to an active
scanner.
----[ 2.7 - Time to fight back
Since playing with this option can give valuable information there is
some limitation to this technique. Mainly Windows box does not use timestamp
option when they establish connection[8] unless you activate it. This
limitation only affects passive analysis, if you use timestamp when
you connect to a windows it will use it too. Moreover many tweaks software
activate the TCP extension in windows.
To be completed on the subject I had to mention that it seems that TCP
extension does not exist on win 9X.
One other problem is the time gap. In passive mode there can be a
desynchronization between computers due to computer desynchronization or
network lags. In the proof of concept this phenomenon can occur. To handle
it you need not to rely on the computer clock but on timestamp itself.
What can we do against this ? Since no vendor except Microsoft (1)
(Thanks Magnus) has answer to me, the following workaround may not be
available. Here is a theoric way to patch this problem.
1. Disabling tcp timestamp. This is the worse solution since we will need
it with fast network[2].
2. Make NAT rewrite the timestamp and changing The NAT RFC.
3. Changing the RFC to specify that the timestamp option needs to have a
random increment. Modifying each implementation to reflect this change.
The a clean way to fix this thing because it's does not rely on an
external system (the NAT computer in this case).
Well I have to try to be as complete as possible for this technical
part. The next part will be more "philosophic" since it deals with the
cause instead of the consequence.
--[ 3 - History has something to tell us
In this part I will try to focus on why we have this situation and what
we can do about it. Here I am not talking about the timestamp option by
itself but about the interaction between the timestamp option and the NAT
mechanism.
----[ 3.1 - Which class ?
First question is what is this bug? This bug belongs to the design
error class. To be more precise this bug exists because protocol
specification overlap. IP was designed to be a one on one protocol: one
client talks to one server. NAT violates this specification by allowing
multiple to one. By itself this violation has caused so many problems that
I lost the count of it, but it is pretty sure that the most recurrent
problem is the FTP transfer. If you use FTP you know what I mean (other can
look at netfilter ftp conntrack).
----[ 3.2 - So were does it come from ?
FTP problem is a good example to explain the origin of the overlap
specification problem. FTP was specified to work over a one to one
reliable connexion (TCP in fact). NAT was designed to modify IP. So due to
protocol dependency it also alter TCP and therefor FTP.
During NAT specification it was not taken into account that every
protocol that relies on IP, can conflict with the modified specification.
To tell the truth ,even if the people that design the NAT mechanism have
ever wanted to ensure that every protocol that relies on IP can work with
the NAT they couldn't make it.
Why ? because specification are RFC and RFC are in english. English is
not a good way to specify things especially if you have a dependency graph
for the specification.
For example many programming languages have formal specifications.
Which is a more full proof way. The reason of this lack of formal
specification resides on the history of Internet[9]. At this time writing a
simple text was good enough. Nowadays it can be very problematic.
----[ 3.3 - How do you find it ?
The big question is, how do I find this bug ?. Well I found this
problem by formalizing a part of the TCP RFC and confronts the result of
this analysis to real execution traces. My analyzer (2) warned me about a
timestamp that was less than the previous one and as you know time does not
go back...
I check out why and found this problem. What's interesting here is that
the start point to find the bug is the specification rather than the
implementation as it usually does to find a buffer overflow for example.
----[ 3.4 - Back to the future
So from now on, what will happen ? Well more design errors will be
found because we cannot change the past and we need to live with it. It is
not reasonable to say that we can wipe off all that TCP stuff and start a
new thing from scratch. Internet and network are simply too big to move
just like that. Just think for one second about the IP v6 deployment and
you will be convinced. All we can do is try to be as careful as possible
when designing a new extension or a protocol. Trying to ensure that this
new stuff does not conflicts with previous specification or breaks
dependence. We can also try to formalize the protocols as much as we can to
try and detect errors before they cause problems. Sadly patching is mainly
our primary option for the coming years.
--[ 4.0 - Learning from the past aka conclusion
The past tells us that protocol is not well enough specified and leads
to errors (bug, conflict...). It may be time to change our habits and try
something in ad equation with our time. For example to design things with
security in mind. In this article I have tried to show you that by simply
understanding specification and with the help of some basic math you can:
- Find a flaw with a worldwide impact.
- Exploit this flaw in an elegant manner by the means of a simple theory.
- Extend fingerprint state of art.
I hope this will help to convince you that theory and formal tools are a
necessary part of the computer security field. Next time I will focus on
simple formal method to find bug. I hope you will be here .
--[ A Acknowledgements
First I would like to thank Romain Bottier for his help and his
patience. I also want to thank Plops and Poluc for having faith in me. See
guys we made it!
I also want to say that I take great care about non disclosure policy.
I have informed major vendors (Kernel.org, freeBSD, OpenBSD, Cisco...) a
month ago. As I said I did not get any feedback so I assume they do not
care.
References
*=*=*=*=*=
[1] AT&T Steven M. Bellovin. A Technique for Counting NATted Hosts
http://www.research.att.com/~smb/papers/fnat.pdf
[2] Jacobson, Braden, & Borman. RFC 1323 :TCP Extensions for High
Performance .
[3] K. Egevang, Cray Communications, P. Francis. RFC 1631 : The IP
Network Address Translator (NAT).
[4] Bret McDanel. TCP Timestamping - Obtaining System Uptime Remotely
originally posted to Bugtraq Security Mailing List on March 11, 2001.
[5] Michal Zalewski. p0f 2r. Jekyll had something to Hyde.
[6] Fyodor. Nmap - Free Security Scanner For Network Exploration &
Security Audits.
[7] Antirez. dumbscan original BUGTRAQ posting (18 Dec 1998)
[8] Microsoft. TCP timestamp in windows : KB224829.
[9] Hafner, Katie, Matthew Lyon. Where Wizards Stay Up Late: The Origins
of the Internet.
FootNotes
*=*=*=*=*=
(1) Microsoft point of view is that NAT is not a security mechanism so they
do not want to patch.
(2) If you are interested about my analyzer. I hope to publish soon a
theoric paper on how it works. I also hope to release one day a version
of it. To the question did I find other interesting things, the answer
is: maybe I need to check out more deeply.
--[ B - Proof of concept
/*
* Proof Of Concept : counting host behind a NAT using timestamp
* To compile this file, you will need the libpcap
* Copyright Elie Bursztein (lupin@zonart.net)
* Successfully compiled on FreeBSD 5.X and Linux 2.6.X
*
* $gcc natcount.c -o natcount -I/usr/local/include -L/usr/local/lib
* -lpcap
*/
#define __USE_BSD 1
#include <sys/time.h>
#include <time.h>
#include <netinet/in.h>
#include <net/ethernet.h>
#ifdef __FreeBSD__
# include <netinet/in_systm.h>
#endif /* __FreeBSD__ */
#ifdef __linux__
# include <linux/if_ether.h>
#endif /* __linux__ */
#include <netinet/ip.h>
#include <stdlib.h>
#include <string.h>
#include <pcap.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <net/if.h>
#include <netinet/tcp.h>
#include <netinet/udp.h>
#ifdef __linux__
# define th_off doff
#endif /* __linux__ */
u_int32_t addr = 0;
/* chain lists structures */
typedef struct listes_s {
struct listes_s *next;
void *elt;
} listes_t;
/* Structures for TCP options */
typedef struct { u_int32_t ts, ts_r; } timestamp_t;
typedef struct { timestamp_t *ts; } tcp_opt_t;
/* Structures for datas storage */
typedef struct { u_int32_t from, first_timestamp; struct timeval
first_seen; } machine_t;
typedef struct { u_int32_t host, nat; struct timeval first_seen; }
nat_box_t;
#define TIMESTAMP_ERROR_MARGIN 0.5
#define DELAY 1
/*
* List functions
*/
int add_in_list(listes_t **list, void * elt) {
listes_t *lst;
lst = malloc(sizeof (listes_t));
lst->next = *list;
lst->elt = elt;
*list = lst;
return (1);
}
void show_nated(listes_t *list) {
nat_box_t *nat;
struct in_addr addr;
printf("-- Begin of nated IP list --\n");
while (list)
{
nat = (nat_box_t *) list->elt;
if (nat->nat > 1) {
addr.s_addr = nat->host;
printf("I've guess %i computers sharing the same IP address
(%s)\n", nat->nat, inet_ntoa(addr));
}
list = list->next;
}
printf("-- End of nated IP list --\n");
}
/*
* Function used to get all TCP options
* Simple TCP options parser
*/
int tcp_option_parser(const u_char *options,
tcp_opt_t *parsed,
unsigned int size) {
u_int8_t kind, len, i;
bzero(parsed, sizeof(tcp_opt_t));
i = 0;
kind = *(options + i);
while (kind != 0) /* EO */
{
switch (kind) {
case 1: i++; break; /* NOP byte */
case 2: i += 4; break;
case 3: i += 3; break;
case 4: i += 2; break;
case 5: /* skipping SACK options */
len = (*options + ++i) - 1;
i += len;
break;
case 6: i += 6; break;
case 7: i += 6; break;
case 8:
i += 2;
parsed->ts = (timestamp_t *) (options + i);
i += 8;
return (1);
break;
default:
i++;
}
kind = *(options + i);
}
return (0);
}
/*
* Most interesting function ... Here we can know if a TCP packet is
* coming from someone we already know !
* Algo :
* finc (seconds) = current_packet_time - first_packet_time <- time
* between 2 packets
* ts_inc = inc_table * finc <- our supposed timestamp increment
* between 2 packets
* new_ts = first_timestamp + ts_inc <- new = timestamp we should have
* now !
*
* Now we just have to compare new_ts with current timestamp
* We can authorize an error margin of 0.5%
*
* Our inc_table contain timestamp increment per second for most
* Operating System
*/
int already_seen(machine_t *mach, tcp_opt_t *opt,
struct timeval temps)
{
int inc_table[4] = {2, 10, 100, 1000};
unsigned int new_ts;
float finc, tmp, ts_inc;
int i, diff;
finc = ((temps.tv_sec - mach->first_seen.tv_sec) * 1000000.
+ (temps.tv_usec - mach->first_seen.tv_usec)) / 1000000.;
for (i = 0; i < 4; i++) {
ts_inc = inc_table * finc;
new_ts = ts_inc + mach->first_timestamp;
diff = ntohl(opt->ts->ts) - new_ts;
if (diff == 0) { /* Perfect shoot ! */
return (2);
}
tmp = 100. - (new_ts * 100. / ntohl(opt->ts->ts));
if (tmp < 0.)
tmp *= -1.;
if (tmp <= TIMESTAMP_ERROR_MARGIN) { /* Update timestamp and time */
mach->first_seen = temps;
mach->first_timestamp = ntohl(opt->ts->ts);
return (1);
}
}
return (0);
}
/*
* Simple function to check if an IP address is already in our list
* If not, it's only a new connection
*/
int is_in_list(listes_t *lst, u_int32_t addr) {
machine_t *mach;
while (lst) {
mach = (machine_t *) lst->elt;
if (mach->from == addr)
return (1);
lst = lst->next;
}
return (0);
}
/*
* This function should be call if a packet from an IP address have been
* found,
* is address is already in the list, but doesn't match any timestamp
* value
*/
int update_nat(listes_t *list, u_int32_t addr)
{
nat_box_t *box;
while (list)
{
box = (nat_box_t *) list->elt;
if (box->host == addr)
{
box->nat++;
return (1);
}
list = list->next;
}
return (0);
}
int check_host(listes_t **list, listes_t **nat, u_int32_t
from,
tcp_opt_t *opt, struct timeval temps) {
listes_t *lst;
machine_t *mach;
int found, zaped;
found = zaped = 0;
lst = *list;
while (lst && !(found)) {
mach = (machine_t *) lst->elt;
if (mach->from == from) {
if ( temps.tv_sec - mach->first_seen.tv_sec > DELAY ) {
found = already_seen(mach, opt, temps);
} else zaped = 1;
}
lst = lst->next;
}
if (!(zaped) && !(found)) {
mach = malloc(sizeof (machine_t));
mach->from = from;
mach->first_seen = temps;
mach->first_timestamp = ntohl(opt->ts->ts);
add_in_list(list, mach);
update_nat(*nat, from);
show_nated(*nat);
return (1);
}
return (0);
}
void callback_sniffer(u_char *useless,
const struct pcap_pkthdr* pkthdr,
const u_char *packet)
{
static listes_t *list_machines = 0;
static listes_t *list_nat = 0;
const struct ip *ip_h;
const struct tcphdr *tcp_h;
tcp_opt_t tcp_opt;
machine_t *mach;
nat_box_t *nat;
struct in_addr my_addr;
ip_h = (struct ip *) (packet + sizeof(struct ether_header));
if (ip_h->ip_p == IPPROTO_TCP)
{
tcp_h = (struct tcphdr *) (packet + sizeof(struct ether_header) +
sizeof(struct ip));
if (tcp_h->th_off * 4 > 20) {
if (tcp_option_parser((u_char *) (packet + sizeof(struct
ether_header)
+ sizeof(struct ip) +
sizeof(struct tcphdr)),
&tcp_opt, tcp_h->th_off * 4 - 20))
{
if (is_in_list(list_machines, (ip_h->ip_src).s_addr)) {
check_host(&list_machines, &list_nat, (u_int32_t)
(ip_h->ip_src).s_addr, &tcp_opt, pkthdr->ts);
} else {
if (ntohl(tcp_opt.ts->ts) != 0)
{
addr = (ip_h->ip_src).s_addr;
my_addr.s_addr = addr;
mach = malloc(sizeof (machine_t));
mach->from = (ip_h->ip_src).s_addr;
mach->first_seen = pkthdr->ts;
mach->first_timestamp = ntohl(tcp_opt.ts->ts);
nat = malloc(sizeof (nat_box_t));
nat->host = (u_int32_t) (ip_h->ip_src).s_addr;
nat->nat = 1;
nat->first_seen = mach->first_seen;
add_in_list(&list_machines, mach);
add_in_list(&list_nat, nat);
}
}
}
}
}
}
int main(int ac, char *argv[])
{
pcap_t *sniff;
char errbuf[PCAP_ERRBUF_SIZE];
struct bpf_program fp;
char *device;
bpf_u_int32 maskp, netp;
struct in_addr my_ip_addr;
char filter[250];
if (getuid() != 0) {
printf("You must be root to use this tool.\n");
exit (2);
}
if (--ac != 1)
{
printf("Usage: ./natcount xl0\n");
return (1);
}
device = (++argv)[0];
pcap_lookupnet(device, &netp, &maskp, errbuf);
my_ip_addr.s_addr = (u_int32_t) netp;
printf("Using interface %s IP : %s\n", device, inet_ntoa(my_ip_addr));
if ((sniff = pcap_open_live(device, BUFSIZ, 1, 1000, errbuf)) == NULL)
{
printf("ERR: %s\n", errbuf);
exit(1);
}
bzero(filter, 250);
snprintf(filter, 250, "not src net %s", inet_ntoa(my_ip_addr));
if(pcap_compile(sniff,&fp, filter, 0, netp) == -1) {
fprintf(stderr,"Error calling pcap_compile\n");
exit(1);
}
if(pcap_setfilter(sniff,&fp) == -1) {
fprintf(stderr,"Error setting filter\n");
exit(1);
}
pcap_loop(sniff, -1, callback_sniffer, NULL);
return (0);
}
|=-----------------------------=[ 0x03-3 ]=------------------------------=|
|=---=[ All Hackers Need To Know About Elliptic Curve Cryptography ]=----=|
|=-----------------------------------------------------------------------=|
|=----------------------------=[ f86c9203 ]=-----------------------------=|
---[ Contents
0 - Abstract
1 - Algebraical Groups and Cryptography
2 - Finite Fields, Especially Binary Ones
3 - Elliptic Curves and their Group Structure
4 - On the Security of Elliptic Curve Cryptography
5 - The ECIES Public Key Encryption Scheme
6 - The XTEA Block Cipher, CBC-MAC and Davies-Meyer Hashing
7 - Putting Everything Together: The Source Code
8 - Conclusion
9 - Outlook
A - Appendix: Literature
B - Appendix: Code
---[ 0 - Abstract
Public key cryptography gained a lot of popularity since its invention
three decades ago. Asymmetric crypto systems such as the RSA
encryption scheme, the RSA signature scheme and the Diffie-Hellman Key
Exchange (DH) are well studied and play a fundamental role in modern
cryptographic protocols like PGP, SSL, TLS, SSH.
The three schemes listed above work well in practice, but they still
have a major drawback: the data structures are large, i.e. secure
systems have to deal with up to 2048 bit long integers. These are
easily handled by modern desktop computers; by contrast embedded
devices, handhelds and especially smartcards reach their computing
power limits quickly. As a second problem, of course, the
transportation of large integers "wastes" bandwidth. In 2048 bit
systems an RSA signature takes 256 bytes; that's quite a lot,
especially for slow communication links.
As an alternative to RSA, DH and suchlike the so called Elliptic Curve
Cryptography (ECC) was invented in the mid-eighties. The theory behind
it is very complicated and much more difficult than doing calculations
on big integers. This resulted in a delayed adoption of ECC systems
although their advantages over the classic cryptographic building
blocks are overwhelming: key lengths and the necessary processing
power are much smaller (secure systems start with 160 bit keys). Thus,
whenever CPU, memory or bandwidth are premium resources, ECC is a good
alternative to RSA and DH.
This article has two purposes:
1. It is an introduction to the theory of Elliptic Curve Cryptography.
Both, the mathematical background and the practical implementability
are covered.
2. It provides ready-to-use source code. The C code included and
described in this article (about 500 lines in total) contains a
complete secure public key crypto system (including symmetric
components: a block cipher, a hash function and a MAC) and is
released to the public domain.
The code doesn't link against external libraries, be they of
bigint, cryptographic or other flavour; an available libc is
sufficient. This satisfies the typical hacker need for compact and
independent programs that have to work in "inhospitable"
environments; rootkits and backdoors seem to be interesting
applications.
As mentioned above the theory behind EC cryptography is rather
complex. To keep this article brief and readable by J. Random Hacker
only the important results are mentioned, theorems are not proven,
nasty details are omitted. If on the other hand you are into maths and
want to become an ECC crack I encourage to start reading [G2ECC] or
[ECIC].
---[ 1 - Algebraical Groups and Cryptography
Definition. A set G together with an operation G x G -> G denoted by
'+' is called an (abelian algebraical) group if the following axioms
hold:
G1. The operation '+' is associative and commutative:
(a + b) + c = a + (b + c) for all a,b,c in G
a + b = b + a for all a,b in G
G2. G contains a neutral element '0' such that
a + 0 = a = 0 + a for all a in G
G3. For each element 'a' in G there exists an "inverse element",
denoted by '-a', such that
a + (-a) = 0.
For a group G the number of elements in G is called the group order,
denoted by |G|.
Example. The sets Z, Q and R of integers, rational numbers and real
numbers, respectively, form groups of infinite order in respect to
their addition operation. The sets Q* and R* (Q and R without 0) also
form groups in respect to multiplication (with 1 being the neutral
element and 1/x the inverse of x).
Definition. Let G be a group with operation '+'. A (nonempty) subset H
of G is called a subgroup of G if H is a group in respect to the same
operation '+'.
Example. Z is a subgroup of Q is a subgroup of R in respect to '+'.
In respect to '*' Q* is a subgroup of R*.
Theorem (Lagrange). Let G be a group of finite order and H be a
subgroup of G. Then |H| properly divides |G|.
It follows that if G has prime order, G has only two subgroups,
namely {0} and G itself.
We define the "scalar multiplication" of a natural number k with a
group element g as follows:
k * g := g + g + ... + g + g
\____ k times ____/
Theorem. For a finite group G and an element g in G the set of all
elements k * g (k natural) forms a subgroup of G. This subgroup
is named the "cyclic subgroup generated by g".
Thus a prime order group is generated by any of its nonzero elements.
We now introduce the Diffie-Hellman Key Exchange protocol: let G be a
prime order group and g a nonzero element. Let two players, called
Alice and Bob respectively, do the following:
1. Alice picks a (secret) random natural number 'a', calculates
P = a * g and sends P to Bob.
2. Bob picks a (secret) random natural number 'b', calculates
Q = b * g and sends Q to Alice.
3. Alice calculates S = a * Q = a * (b * g).
4. Bob calculates T = b * P = b * (a * g).
By definition of the scalar multiplication it is apparent that S =
T. Therefore after step 4 Alice and Bob possess the same value S. The
eavesdropper Eve, who recorded the exchanged messages P and Q, is able
to calculate the same value if she manages to determine 'a' or
'b'. This problem (calculating 'a' from G, g and 'a * g') is called
the group's Discrete Logarithm Problem (DLP).
In groups where DLP is too 'hard' to be practically solvable it is
believed to be out of reach for eavesdroppers to determine the value
S, hence Alice and Bob can securely establish a secret key which can
be used to protect further communication.
If an attacker is able to intercept the transmission of P and Q and to
replace both by the group's neutral element, obviously Alice and Bob
are forced to obtain S = 0 = T as shared key. This has to be
considered a successful break of the crypto system. Therefore both
Alice and Bob have to make sure that the received elements Q and P,
respectively, indeed do generate the original group.
The presented DH scheme may also serve as public key encryption scheme
(called ElGamal encryption scheme): let Alice pick a random natural
number 'a' as private key. The element P = a * g is the corresponding
public key. If Bob wants to encrypt a message for her, he picks a
random number 'b', symmetrically encrypts the message with key T = b *
P and transmits the cipher text along with Q = b * g to Alice. She
can reconstruct T = S via S = a * Q and then decrypt the message.
Note the direct relationship between this and the DH scheme!
Conclusion: Cryptographers are always seeking for finite prime order
groups with hard DLP. This is where elliptic curves come into play:
they induce algebraical groups, some of them suitable for DH and
ElGamal crypto systems. Moreover the elliptic curve arithmetic
(addition, inversion) is implementable in a relatively efficient way.
You will find more information about groups and their properties in
[Groups], [Lagrange], [CyclicGroups] and [GroupTheory]. Read more
about the DLP, DH key exchange and ElGamal encryption in [DLP], [DH]
and [ElGamal].
---[ 2 - Finite Fields, Especially Binary Ones
Definition. A set F together with two operations F x F -> F named
'+' and '*' is called a field if the following axioms hold:
F1. (F, +) forms a group
F2. (F*, *) forms a group (where F* is F without the
'+'-neutral element '0')
F3. For all a,b,c in G the distributive law holds:
a * (b + c) = (a * b) + (a * c)
For 'a + (-b)' we write shorter 'a - b'. Accordingly we write 'a / b'
when we multiply 'a' with the '*'-inverse of b.
To put it clearly: a field is a structure with addition, substraction,
multiplication and division that work the way you are familiar with.
Example. The sets Q and R are fields.
Theorem. For each natural m there exists a (finite) field GF(2^m) with
exactly 2^m elements. Fields of this type are called binary fields.
Elements of binary fields GF(2^m) can efficiently be represented by
bit vectors of length m. The single bits may be understood as
coefficients of a polynomial of degree < m. To add two field elements
g and h just carry out the polynomial addition g + h (this means: the
addition is done element-wise, i.e. the bit vectors are XORed
together). The multiplication is a polynomial multiplication modulo a
certain fixed reduction polynomial p: the elements' product is the
remainder of the polynomial division (g * h) / p.
The fact that field addition just consists of a bitwise XOR already
indicates that in binary fields F each element is its own additive
inverse, that is: a + a = 0 for all a in F. For a,b in F as
consequence 2*a*b = a*b + a*b = 0 follows, what leads to the (at the
first glance surprising) equality
(a + b)^2 = a^2 + b^2 for all a,b in F.
More about finite fields and their arithmetical operations can be
found in [FiniteField], [FieldTheory], [FieldTheoryGlossary] and
especially [FieldArithmetic].
---[ 3 - Elliptic Curves and their Group Structure
Definition. Let F be a binary field and 'a' and 'b' elements in F.
The set E(a, b) consisting of an element 'o' (the "point at
infinity") plus all pairs (x, y) of elements in F that satisfy
the equation
y^2 + x*y = x^3 + a*x^2 + b
is called the set of points of the binary elliptic curve E(a, b).
Theorem. Let E = E(a, b) be the point set of a binary elliptic curve
over the field F = GF(2^m). Then
1. E consists of approximately 2^m elements.
2. If (x, y) is a point on E (meaning x and y satisfy the above
equation) then (x, y + x) is also a point on E.
3. If two points P = (x1, y1) and Q = (x2, y2) on E with x1 != x2 are
connected by a straight line (something of the form y = m*x + b),
then there is exactly one third point R = (x3, y3) on E that is
also on this line. This induces a natural mapping fP, Q) -> R,
sometimes called chord-and-tangent mapping.
Exercise. Prove the second statement.
The chord-and-tangent mapping 'f' is crucial for the group structure
given naturally on elliptic curves:
a) The auxiliary element 'o' will serve as neutral element which may
be added to any curve point without effect.
b) For each point P = (x, y) on the curve we define the point
-P := (x, y + x) to be its inverse.
c) For two points P = (x1, y1) and Q = (x2, y2) the sum 'P + Q'
is defined as -f(P, Q).
It can be shown that the set E together with the point addition '+'
and the neutral element 'o' defacto has group structure. If the
curve's coefficients 'a' and 'b' are carefully chosen, there exist
points on E that generate a prime order group of points for which the
DLP is hard. Based on these groups secure crypto systems can be built.
The point addition on curves over the field R can be visualized. See
[EllipticCurve] for some nice images.
In ECC implementations it is essential to have routines for point
addition, doubling, inversion, etc. We present pseudocode for the
most important ones:
Let (x, y) be a point on the elliptic curve E(a, b). The point
(x', y') := 2 * (x, y) can be computed by
l = x + (y / x)
x' = l^2 + l + a
y' = x^2 + l*x' + x'
return (x', y')
For two points P = (x1, y1), Q = (x2, y2) the sum (x3, y3) = P + Q
can be computed by
l = (y2 + y1) / (x2 + x1)
x3 = l^2 + l + x1 + x2 + a
y3 = l(x1 + x3) + x3 + y1
return (x3, y3)
Some special cases where the point at infinity 'o' has to be
considered have been omitted here. Have a look at [PointArith] for
complete pseudocode routines. But nevertheless we see that point
arithmetic is easy and straight forward to implement. A handful of
field additions, multiplications plus a single division do the job.
The existence of routines that do point doubling and addition is
sufficient to be able to build an efficient "scalar multiplier": a
routine that multiplies a given curve point P by any given natural
number k. The double-and-add algorithm works as follows:
H := 'o'
let n be the number of the highest set bit in k
while(n >= 0) {
H = 2 * H;
if the nth bit in k is set:
H = H + P;
n--;
}
return H;
Example. Suppose you want to calculate k*P for k = 11 = 1011b. Then
n is initialized to 3 and H calculated as
H = 2 * (2 * (2 * (2 * 'o' + P)) + P) + P
= 2 * (2 * (2 * P) + P) + P
= 2 * (5 * P) + P
= 11 * P
Some elliptic curves that are suitable for cryptographic purposes have
been standardized. NIST recommends 15 curves (see [NIST]), among them
five binary ones called B163, B233, B283, B409 and B571. The
parameters of B163 are the following ([NISTParams]):
Field: GF(2^163)
Reduction poly: p(t) = t^163 + t^7 + t^6 + t^3 + 1
Coefficient a: 1
Coefficient b: 20a601907b8c953ca1481eb10512f78744a3205fd
x coordinate of g: 3f0eba16286a2d57ea0991168d4994637e8343e36
y coordinate of g: 0d51fbc6c71a0094fa2cdd545b11c5c0c797324f1
group order: 2 * 5846006549323611672814742442876390689256843201587
The field size is 2^163, the corresponding symmetric security level is
about 80 bits (see chapter 4). The field elements are given in
hexadecimal, the curve's order in decimal form as h * n, where h (the
"cofactor") is small and n is a large prime number. The point g is
chosen in a way that the subgroup generated by g has order n.
The source code included in this article works with B163. It can
easily be patched to support any other binary NIST curve; for this it
is sufficient to alter just 6 lines.
Exercise. Try it out: patch the sources to get a B409 crypto
system. You will find the curve's parameters in [NISTParams].
Read [EllipticCurve], [PointArith] and [DoubleAndAdd] for further
information.
---[ 4 - On the Security of Elliptic Curve Cryptography
We learned that the security of the DH key exchange is based on the
hardness of the DLP in the underlying group. Algorithms are known that
determine discrete logarithms in arbitrary groups; for this task no
better time complexity bound is known than that for Pollard's "Rho
Method" ([PollardRho]):
Theorem. Let G be a finite (cyclic) group. Then there exists an
algorithm that solves DLP in approximately sqrt(|G|) steps (and low
memory usage).
For elliptic curves no DLP solving algorithm is known that performs
better than the one mentioned above. Thus it is believed that the
ECCDLP is "fully exponential" with regard to the bit-length of
|G|. RSA and classical DH systems can, by contrast, be broken in
"subexponential" time. Hence their key lengths must be larger than
those for ECC systems to achieve the same level of security.
We already saw that elliptic curves over GF(2^m) contain about 2^m
points. Therefore DLP can be solved in about sqrt(2^m) steps, that is
2^(m/2). We conclude that m-bit ECC systems are equivalent to
(m/2)-bit symmetric ciphers in measures of security.
The following table compares equivalent key sizes for various crypto
systems.
ECC key size | RSA key size | DH key size | AES key size
-------------+--------------+-------------+-------------
160 | 1024 | 1024 | (80)
256 | 3072 | 3072 | 128
384 | 7680 | 7680 | 192
512 | 15360 | 15360 | 256
---[ 5 - The ECIES Public Key Encryption Scheme
Earlier we presented the DH Key Exchange and the ElGamal public key
crypto system built on top of it. The Elliptic Curve Integrated
Encryption Scheme (ECIES, see ANSI X9.63) is an enhancement of ElGamal
encryption specifically designed for EC groups. ECIES provides
measures to defeat active attacks like the one presented above.
Let E be an elliptic curve of order h * n with n a large prime
number. Let G be a subgroup of E with |G| = n. Choose a point P in G
unequal to 'o'.
We start with ECIES key generation:
Alice picks as private key a random number 'd' with 1 <= d < n;
She distributes the point Q := d * P as public key.
If Bob wants to encrypt a message m for Alice he proceeds as follows:
1. Pick a random number 'k' with 1 <= k < n.
2. Compute Z = h * k * Q.
3. If Z = 'o' goto step 1.
4. Compute R = k * P.
5. Compute (k1, k2) = KDF(Z, R) (see below).
6. Encrypt m with key k1.
7. Calculate the MAC of the ciphertext using k2 as MAC key.
8. Transmit R, the cipher text and the MAC to Alice.
Alice decrypts the cipher text using the following algorithm:
1. Check that R is a valid point on the elliptic curve.
2. Compute Z = h * d * R.
3. Check Z != 'o'.
4. Compute (k1, k2) = KDF(Z, R) (see below).
5. Check the validity of the MAC using key k2.
6. Decrypt m using key k1.
If any of the checks fails: reject the message as forged.
KDF is a key derivation function that produces symmetric keys k1, k2
from a pair of elliptic curve points. Just think of KDF being the
cryptographic hash function of your choice.
ECIES offers two important features:
1. If an attacker injects a curve point R that does not generate a
large group (this is the case in the attack mentioned above), this
is detected in steps 2 und 3 of the decryption process (the
cofactor plays a fundamental role here).
2. The message is not only encrypted in a secure way, it is also
protected from modification by a MAC.
Exercise. Implement a DH key exchange. Let E be a binary elliptic
curve or order h * n. Let G be a subgroup of E with |G| = n. Choose a
point g in G unequal to 'o'. Let Alice and Bob proceed as follows:
1. Alice picks a random number 'a' with 1 <= a < n and sends P = a * g
to Bob.
2. Bob picks a random number 'b' with 1 <= b < n and sends Q = b * g
to Alice.
3. Alice checks that Q is a point on the curve that generates a group
of order n (see the ECIES_public_key_validation routine). Alice
calculates S = a * Q.
4. Bob checks that P is a point on the curve that generates a group of
ordern n. He calculates T = b * P.
If everything went OK the equality S = T should hold.
---[ 6 - The XTEA Block Cipher, CBC-MAC and Davies-Meyer Hashing
XTEA is the name of a patent-free secure block cipher invented by
Wheeler and Needham in 1997. The block size is 64 bits, keys are 128
bits long. The main benefit of XTEA over its competitors AES, Twofish,
etc is the compact description of the algorithm:
void encipher(unsigned long m[], unsigned long key[])
{
unsigned long sum = 0, delta = 0x9E3779B9;
int i;
for(i = 0; i < 32; i++) {
m[0] += ((m[1] << 4 ^ m[1] >> 5) + m[1]) ^ (sum + key[sum & 3]);
sum += delta;
m[1] += ((m[0] << 4 ^ m[0] >> 5) + m[0]) ^ (sum + key[sum >> 11 & 3]);
}
}
Let E be a symmetric encryption function with block length n,
initialized with key k. The CBC-MAC of a message m is calculated as
follows:
1. Split m in n-bit-long submessages m1, m2, m3, ...
2. Calculate the intermediate values t0 = E(length(m)),
t1 = E(m1 XOR t0), t2 = E(m2 XOR t1), t3 = E(m3 XOR t2), ...
3. Return the last value obtained in step 2 as MAC(k, m) and
discard t0, t1, t2, ...
Next we show how a block cipher can be used to build a cryptographic
hash function using the "Davies-Meyer" construction. Let m be the
message that is to be hashed. Let E(key,block) be a symmetric
encryption function with block length n and key length l.
1. Split m in l-bit-long submessages m1, m2, m3, ...
2. Calculate the intermediate values h1 = E(m1, 0), h2 = E(m2, h1) XOR
h1, h3 = E(m3, h2) XOR h2, ...
3. If h is the last intermediate value obtained in step 2 return
E(length(m), h) XOR h as hash value and discard h1, h2, h3, ...
The code included in this article uses the block cipher XTEA in
counter mode (CTR) for encryption, a CBC-MAC garantees message
authenticity; finally KDF (see chapter 5) is implemented using XTEA in
Davies-Meyer mode.
Read [XTEA] and [DMhashing] to learn more about the XTEA block cipher
and the Davies-Meyer construction.
---[ 7 - Putting Everything Together: The Source Code
The public domain source code you find at the end of this document
implements the ECIES public key encryption system over the curve
B163. The code is commented, but we outline the design here.
1. The central data structure is a bit vector of fixed but "long"
length. It is the base data type used to represent field elements
and suchlike. The dedicated typedef is called bitstr_t.
Appropriate routines for bit manipulation, shifting, bitcounting,
importing from an ASCII/HEX representation, etc do exist.
2. The functions with "field_" prefix do the field arithmetic:
addition, multiplication and calculation of the multiplicative
inverse of elements are the important routines.
3. ECC points are represented as pairs of elem_t (an alias for
bitstr_t), the special point-at-infinity as the pair (0,0). The
functions prefixed with "point_" act on elliptic curve points and
implement basic point operations: point addition, point doubling,
etc.
4. The function "point_mult" implements the double-and-add algorithm
to compute "k * (x,y)" in the way described in chapter 3 .
5. The "XTEA"-prefixed functions implement the XTEA block cipher,
but also the CBC-MAC and the Davies-Meyer construction.
6. The "ECIES_"-routines do the ECIES related work.
ECIES_generate_key_pair() generates a private/public key pair,
ECIES_public_key_validation() checks that a given point is
on the curve and generates a group of order "n".
ECIES_encryption/ECIES_decryption do what their names imply.
7. A demonstration of the main ECIES functionalities is given in the
program's main() section.
The code may be compiled like this:
gcc -O2 -o ecc ecc.c
---[ 8 - Conclusion
We have seen how crypto systems are built upon algebraical groups that
have certain properties. We further gave an introduction into a special
class of appropriate groups and their theory, namely to the binary
elliptic curves. Finally we presented the secure public key encryption
scheme ECIES (together with necessary symmetrical components). All
this is implemented in the source code included in this article.
We recall that besides security the central design goal of the code
was compactness, not speed or generality. Libraries specialized on EC
cryptography benefit from assembler hand-coded field arithmetic
routines and easily perform a hundred times faster than this code.
If compactness is not essential for your application you might opt for
linking against one of the following ECC capable free crypto libraries
instead:
Crypto++ (C++) http://www.eskimo.com/~weidai/cryptlib.html
Mecca (C) http://point-at-infinity.org/mecca/
LibTomCrypt (C) http://libtomcrypt.org/
borZoi (C++/Java) http://dragongate-technologies.com/products.html
---[ 9 - Outlook
You have learned a lot about elliptic curves while reading this
article, but there still remains a bunch of unmentioned ideas. We
list some important ones:
1. Elliptic curves can be defined over other fields than binary ones.
Let p be a prime number and Z_p the set of nonnegative integers
smaller than p. Then Z_p forms a finite field (addition and
multiplication have to be understood modulo p, see
[ModularArithmetic] and [FiniteField]).
For these fields the elliptic curve E(a, b) is defined to be the
set of solutions of the equation
y^2 = x^3 + ax + b
plus the point at infinity 'o'. Of course point addition and
doubling routines differ from that given above, but essentially
these "prime curves" form an algebraical group in a similar way as
binary curves do. It is not that prime curves are more or less
secure than binary curves. They just offer another class of groups
suitable for cryptographic purposes.
NIST recommends five prime curves: P192, P224, P256, P384 and P521.
2. In this article we presented the public key encryption scheme
ECIES. It should be mentioned that ECC-based signature schemes
(see [ECDSA]) and authenticated key establishment protocols ([MQV])
do also exist. The implementation is left as exercise to the
reader.
3. Our double-and-add point multiplicator is very rudimentary. Better
ones can do the "k * P" job in half the time. We just give the idea
of a first improvement:
Suppose we want to calculate 15 * P for a curve point P. The
double-and-add algorithm does this in the following way:
15 * P = 2 * (2 * (2 * (2 * 'o' + P) + P) + P) + P
This takes three point doublings and three point additions
(calculations concerning 'o' are not considered).
We could compute 15 * P in a cleverer fashion:
15 * P = 16 * P - P = 2 * 2 * 2 * 2 * P - P
This takes four doublings plus a single addition; hence we may
expect point multiplicators using this trick to be better
performers than the standard double-and-add algorithm. In practice
this trick can speed up the point multiplication by about 30%.
See [NAF] for more information about this topic.
4. In implementations the most time consuming field operation is
always the element inversion. We saw that both the point addition
and the point doubling routines require one field division each.
There is a trick that reduces the amount of divisions in a full "k
* P" point multiplication to just one. The idea is to represent the
curve point (x,y) as triple (X,Y,Z) where x = X/Z, y = Y/Z. In this
"projective" representation all field divisions can by deferred to
the very end of the point multiplication, where they are carried
out in a single inversion.
Different types of coordinate systems of the projective type
are presented in [CoordSys].