Assembler Basics: Registers

When learning Assembler and machine language, processor registers are essential. Whenever you want to calculate something in Assembler, you normally load the values into registers first, because arithmetic instructions work on registers (on x86, at most one operand may come directly from memory). Let's start with a hello world program in C:

hello.c:

#include <stdio.h>
int main()
{
	printf("hello world");
} 

Compile and link it with:

gcc hello.c
Have a look at the <main> function with the command:
objdump -d -Mintel a.out
Note that objdump -Mintel (Intel syntax) puts the destination operand on the left and the source operand on the right, while the default AT&T syntax (objdump -Matt) does it the other way round. Here is the <main> function:

    1149:	f3 0f 1e fa          	endbr64 
    114d:	55                   	push   rbp
    114e:	48 89 e5             	mov    rbp,rsp
    1161:	48 8d 05 9c 0e 00 00 	lea    rax,[rip+0xe9c]        # 2004 <_IO_stdin_used+0x4>
    1168:	48 89 c7             	mov    rdi,rax
    116b:	b8 00 00 00 00       	mov    eax,0x0
    1170:	e8 db fe ff ff       	call   1050 <printf@plt>
    1175:	b8 00 00 00 00       	mov    eax,0x0
    117a:	c9                   	leave  
    117b:	c3                   	ret 

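Here is an explanation of what the instructions do:
endbr64
Marks this address as a valid target for indirect branches. It is part of Intel's Control-flow Enforcement Technology (CET) and is here for security purposes; processors without CET treat it as a no-op.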
push   rbp
Saves the base pointer register (RBP) on the stack. It can be restored later with a pop instruction.
mov    rbp,rsp
Copies rsp (stack pointer) into rbp (base pointer), so base pointer = stack pointer now. If main had local variables, they would be placed on the stack relative to this address.
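lea    rax,[rip+0xe9c]
Loads the effective address "instruction pointer plus 0xe9c" into register RAX. While the lea executes, rip already points at the next instruction (0x1168), so the result is 0x2004. This is exactly where "hello world" is in memory.
mov    rdi,rax
Copies register RAX (memory address of "hello world") into the destination index register RDI. Under the x86-64 Linux (System V) calling convention, RDI carries the first function argument.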
mov    eax,0x0
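Sets EAX to 0. Before the call, this tells the variadic function printf (via AL) that no vector registers carry arguments. After the call, the same instruction sets the return value of main to 0.
call   1050 <printf@plt>
Calls the printf@plt stub, which in turn jumps to glibc's printf function. printf then inspects the registers to find out what to print: RDI holds the address of the string. It outputs the string up to the first occurrence of a 0 byte.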
leave
Sets RSP = RBP and then restores the base pointer from the stack, just as a pop rbp would.
ret
Returns to the calling function.

Debugging it

Now let's debug it and run one machine language instruction after the other. We use the GNU Debugger:
thorsten@tweedleburg:~$ gdb a.out 
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...
(No debugging symbols found in a.out)
We want the executable to get loaded and get its virtual address space, but not run to the end, so we set a breakpoint at the main routine and start running it:

(gdb) break main
Breakpoint 1 at 0x1151
(gdb) run
Starting program: /home/thorsten/a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, 0x0000555555555151 in main ()
Aha! We have finally been loaded into RAM and our memory addresses are around 0x0000555555555151. Let's see the instructions that will be executed next by using the command disassemble:

(gdb) disassemble
Dump of assembler code for function main:
   0x0000555555555149 <+0>:	endbr64 
   0x000055555555514d <+4>:	push   %rbp
   0x000055555555514e <+5>:	mov    %rsp,%rbp
=> 0x0000555555555151 <+8>:	lea    0xeac(%rip),%rax        # 0x555555556004
   0x0000555555555158 <+15>:	mov    %rax,%rdi
   0x000055555555515b <+18>:	mov    $0x0,%eax
   0x0000555555555160 <+23>:	call   0x555555555050 <printf@plt>
   0x0000555555555165 <+28>:	mov    $0x0,%eax
   0x000055555555516a <+33>:	pop    %rbp
   0x000055555555516b <+34>:	ret    
End of assembler dump.
(gdb) 
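The arrow ( => ) shows that we are about to execute the lea instruction, load effective address. Note that gdb prints AT&T syntax by default, so the source operand is on the left here; the command set disassembly-flavor intel switches gdb to the Intel syntax used above. This dump also differs slightly from the objdump listing (lea sits at offset +8 and the function ends with pop rbp instead of leave), so it presumably comes from a slightly different build.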
