Sunday, July 24, 2016

Hello World under the Hood




The first program almost everybody starts with is “hello world!”. Here, using a ‘for’ loop, I make my computer display the greeting twice. Looking at the code, it is easy to figure out what it does.

But that is not exactly how hackers look at it. There is some magic happening there.

What do computers really do? They obediently follow a program (a recipe defined step by step), the same way we humans follow a master’s orders.
The difference is that they can do the same thing over and over again (loops) without breaking a sweat, and they neither get bored nor complain, because they are machines. Dumb without a program, but boy, are they fast once you give them one to run!
But how do they do that? What’s under the hood?

#include <stdio.h>

int main(void) {

    int i;
    for (i = 0; i < 2; ++i) 
        printf("Hello, world!\n");    

    return 0;
}

Source code (like the listing above) is a set of instructions that must be translated (compiled) into the machine language that the CPU understands. All machine instructions are in fact strings of ones and zeros, and together they form what is called machine code. The CPU decodes them and carries them out, which means operating on data.
Here is what my first program does.

A look into Computer Guts
Compile and run the code. See what it does.
pi@tron:~/dh $ gcc -g -o hello hello.c 
pi@tron:~/dh $ 
pi@tron:~/dh $ ./hello
Hello, world!
Hello, world!
pi@tron:~/dh $
Compiling with -g adds debugging information, which lets gdb show the source code and step through it.
Inside the computer the code operates at a different level.
First GDB lesson courtesy of beej.us and a few other places on the net.
My first program up close looks like this:
$ objdump -M intel -D hello | grep -A20 main.:
000000000040052d <main>:
  40052d:  55                    push   rbp
  40052e:  48 89 e5              mov    rbp,rsp
  400531:  48 83 ec 10           sub    rsp,0x10
  400535:  c7 45 fc 00 00 00 00  mov    DWORD PTR [rbp-0x4],0x0
  40053c:  eb 0e                 jmp    40054c
  40053e:  bf e4 05 40 00        mov    edi,0x4005e4
  400543:  e8 c8 fe ff ff        call   400410
  400548:  83 45 fc 01           add    DWORD PTR [rbp-0x4],0x1
  40054c:  83 7d fc 01           cmp    DWORD PTR [rbp-0x4],0x1
  400550:  7e ec                 jle    40053e
  400552:  b8 00 00 00 00        mov    eax,0x0
  400557:  c9                    leave
  400558:  c3                    ret
  400559:  0f 1f 80 00 00 00 00  nop    DWORD PTR [rax+0x0]

0000000000400560 :
  400560:  41 57                 push   r15
  400562:  41 89 ff              mov    r15d,edi
  400565:  41 56                 push   r14
  400567:  49 89 f6              mov    r14,rsi
It is beautiful. Looking at the code like that.
I want gdb to disassemble code using Intel syntax rather than the default AT&T syntax. Here’s my ~/.gdbinit file that takes care of it (gdb accepts set disassembly as an abbreviation of set disassembly-flavor):
$ cat ~/.gdbinit 
set disassembly intel
CPU registers are sort of like quick access variables. Here they are:
$ gdb -q hello
Reading symbols from hello...done.
(gdb) break main
Breakpoint 1 at 0x400535: file hello.c, line 6.
(gdb) run
Starting program: /home/jaro/Desktop/Jun-Dec-2016/14.Code/aoh/hello 

Breakpoint 1, main () at hello.c:6
6     for(i = 0; i < 2; ++i)
(gdb) info registers
rax            0x40052d 4195629
rbx            0x0 0
rcx            0x0 0
rdx            0x7fffffffe178 140737488347512
rsi            0x7fffffffe168 140737488347496
rdi            0x1 1
rbp            0x7fffffffe080 0x7fffffffe080
rsp            0x7fffffffe070 0x7fffffffe070
r8             0x7ffff7dd4e80 140737351863936
r9             0x7ffff7dea530 140737351951664
r10            0x7fffffffdf10 140737488346896
r11            0x7ffff7a36e50 140737348070992
r12            0x400440 4195392
r13            0x7fffffffe160 140737488347488
r14            0x0 0
r15            0x0 0
rip            0x400535 0x400535 
eflags         0x202 [ IF ]
cs             0x33 51
ss             0x2b 43
ds             0x0 0
es             0x0 0
fs             0x0 0
gs             0x0 0
(gdb)
Program listed in gdb:
(gdb) list
1 #include <stdio.h>
2 
3 int main(void) {
4 
5     int i;
6     for(i = 0; i < 2; ++i)
7         printf("hello, world.\n");
8 
9     return 0;
10 }
(gdb)
Now time to disassemble the code:
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
=> 0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
The arrow points to the first instruction to be executed, the one at the address held in rip (the program counter register). The command x, used later on, stands for examine.
(gdb) info register rip
rip            0x400535 0x400535 
(gdb)
Funny thing is that I am no programmer. Not at all. I am just starting. But I do know music when I hear it. Just like this one below:
Move the value 0 into the 4 bytes of memory at the address held in rbp minus 4 (the offset is shown in hex). That is where the variable i lives.


=> 0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
On to the next instruction (nexti). It is a jmp: once executed, it jumps to the memory location 0x40054c, where the next instruction must be executed. The rip register (instruction pointer) always holds the address of the next instruction to execute.


(gdb) nexti
0x000000000040053c 6     for(i = 0; i < 2; ++i)
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
=> 0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
Here we go. Jump baby, jump!


(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
=> 0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
Is the value of the variable i still lower than 2 (our loop condition)? The compiler has turned i < 2 into the equivalent i <= 1, so cmp compares i with 1 and the next instruction, jle, says: jump if less or equal (that is, not greater). It relies on flags: it jumps when SF<>OF or ZF == 1.
Let’s see those CPU flags after the comparison has executed. Here SF is set and OF is not, so SF<>OF holds and the jump will be taken.


(gdb) nexti
0x0000000000400550 6     for(i = 0; i < 2; ++i)
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
=> 0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb) info register eflags
eflags         0x297 [ CF PF AF SF IF ]
(gdb)
Jumping to the target location, then!
(gdb) nexti
7         printf("hello, world.\n");
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
=> 0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
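What is at this location? The string to be printed, of course. Its address is placed into the Destination Index (EDI) register, the lower half of rdi, which carries the first argument of a function call on 64-bit Linux.
Let’s read what is at the location 0x4005e4. The gdb command x/12x means: examine 12 units in hex starting at the given location (here: 0x4005e4). The output below is in single bytes; appending b, as in x/12xb, asks for bytes explicitly.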


(gdb) x/12x 0x4005e4
0x4005e4: 0x68 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x77
0x4005ec: 0x6f 0x72 0x6c 0x64
(gdb) 
Hex 0x68 is lower case ‘h’, 0x65 is lower case ‘e’, and so on. The string is now ready to be used by the C standard library function printf(). Call the function and the magic happens.
(gdb) nexti
0x0000000000400543 7         printf("hello, world.\n");
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
=> 0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
Go baby! Display it! The call goes through the Procedure Linkage Table (PLT), which locates the library code to use (libc.so.x).
The next instruction goes back to the for loop and adds 1 to the variable i.
(gdb) nexti
hello, world.
6     for(i = 0; i < 2; ++i)
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
=> 0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
Once this add has executed, the variable i is 1 (0 + 1), so the same process is repeated. In order to jump out of the loop, the variable i must no longer be less than 2. One more pass through the loop and we finish the job.
=> 0x0000000000400550 <+35>: jle    0x40053e 
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb) info register eflags
eflags         0x202 [ IF ]
(gdb)
Now neither SF<>OF nor ZF == 1 holds: i is 2, and 2 - 1 is neither negative nor zero. We do NOT jump but fall through to the next instruction, which puts the value 0 into the accumulator register (eax), where the return value of main goes.
(gdb) nexti
9     return 0;
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
=> 0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
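We clean stuff up: eax now holds the return value 0 (exit status 0 means success), and leave tears down the stack frame of main. Then ret leaves the party and returns to the startup code that called main, which hands the exit status back to the operating system.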


(gdb) nexti
10 }
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c 
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
=> 0x0000000000400557 <+42>: leave
   0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb) nexti
0x0000000000400558 10 }
(gdb) disassemble main
Dump of assembler code for function main:
   0x000000000040052d <+0>: push   rbp
   0x000000000040052e <+1>: mov    rbp,rsp
   0x0000000000400531 <+4>: sub    rsp,0x10
   0x0000000000400535 <+8>: mov    DWORD PTR [rbp-0x4],0x0
   0x000000000040053c <+15>: jmp    0x40054c
   0x000000000040053e <+17>: mov    edi,0x4005e4
   0x0000000000400543 <+22>: call   0x400410
   0x0000000000400548 <+27>: add    DWORD PTR [rbp-0x4],0x1
   0x000000000040054c <+31>: cmp    DWORD PTR [rbp-0x4],0x1
   0x0000000000400550 <+35>: jle    0x40053e
   0x0000000000400552 <+37>: mov    eax,0x0
   0x0000000000400557 <+42>: leave
=> 0x0000000000400558 <+43>: ret
End of assembler dump.
(gdb)
Music's over and it's midnight on Saturday. Time to go to bed.
