This is the sixth post in the series. If you have not read the previous posts, I recommend doing so first, as this one builds on them and skips details they already explain.
I'm learning all of this as I write, so if you find any mistakes or have suggestions or improvements, please let me know in the comments.
In this post, we will see how the main function is converted to assembly.
Let us consider the simplest use of the main function:
I will skip the parts explained in previous posts and start from the main function itself.
    .globl main
    .type main, @function
main:
.LFB0:
    .cfi_startproc
    endbr64
    pushq %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq %rsp, %rbp
    .cfi_def_cfa_register 6
    nop
    popq %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size main, .-main
Let us see this, line by line.
.globl main declares main as a global symbol, which means it is visible to other object files during linking.
.type main, @function
This sets the type of the symbol main to function.
main: declares the label main, which is used by the _start function, which does the initial setup for program execution.
This is explained a bit in the following excerpt:
_start is the entry point of your program. That is, the address of that symbol is the address jumped to on program start. Normally, the function with the name _start is supplied by a file called crt0.o which contains the startup code for the C runtime environment. It…
.LFB0: is the label indicating Local Function Beginning, and the 0 indicates that it is the first function in the file. Similarly, the label .LFE0: indicates the end of local function 0.
CFI stands for Call Frame Information, which is used when unwinding the stack, for example when handling exceptions or errors.
.cfi_startproc marks the start of a procedure/function.
endbr64 is an assembly instruction that stands for End Branch 64. It is used to detect invalid jumps: the first instruction executed after an indirect jump or an indirect call must be an endbr64, which tells the processor that this is a valid and intended place to jump to or call. If it is not found, the jump/call is treated as invalid and a fault is generated. On processors where this protection is unavailable or disabled, the instruction behaves as a no-op.
It stands for "End Branch 64 bit" -- or more precisely, Terminate Indirect Branch in 64 bit
Here is the operation:
IF EndbranchEnabled(CPL) & EFER.LMA = 1 & CS.L = 1
    IF CPL = 3
    THEN
        IA32_U_CET.TRACKER = IDLE
        IA32_U_CET.SUPPRESS = 0
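To make the idea of an indirect call a bit more concrete, here is a minimal C sketch. The names `add_one` and `call_indirect` are made up for this illustration and do not come from the program above. Calling through a function pointer is the kind of call a compiler typically emits as an indirect call (at least without optimization), and that is the case where the target is expected to begin with endbr64.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee: when built with GCC's -fcf-protection, a function
   like this would normally start with endbr64, as main does above. */
int add_one(int x) {
    return x + 1;
}

/* Hypothetical caller: the target is only known at run time, so the
   compiler typically emits an indirect call through a register. */
int call_indirect(int (*fn)(int), int arg) {
    assert(fn != NULL);   /* guard against a null function pointer */
    return fn(arg);
}
```

For example, `call_indirect(add_one, 41)` returns 42. Whether the call really stays indirect depends on the compiler and optimization level, so treat this as an illustration rather than a guarantee.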
pushq %rbp pushes the current value of the base pointer register onto the stack.
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
These again are CFI directives. The first declares that the Call Frame Address is now at offset 16 from the stack pointer, because the pushq just moved the stack pointer down by 8 bytes. The second declares that the previous value of register 6 (%rbp) is stored at offset -16 from the Call Frame Address.
As the DWARF spec says in section 6.4:
[...] The call frame is identified by an address on the stack. We refer to this address as the Canonical Frame Address or CFA. Typically, the CFA is defined to be the value of the stack pointer at the call site in…
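The CFA bookkeeping can be sketched in C as a small calculation. This is only a toy model under stated assumptions: the `cfa_rule` struct and `compute_cfa` function are hypothetical names invented here, and the register numbers follow the x86-64 DWARF mapping where 6 is %rbp and 7 is %rsp.

```c
#include <assert.h>
#include <stdint.h>

/* DWARF register numbers on x86-64: 6 is %rbp, 7 is %rsp. */
enum { DWARF_RBP = 6, DWARF_RSP = 7 };

/* Hypothetical model of a CFA rule: CFA = value(reg) + offset. */
struct cfa_rule {
    int reg;
    int64_t offset;
};

/* Compute the CFA from a rule and the current register values. */
uint64_t compute_cfa(struct cfa_rule rule, uint64_t rbp, uint64_t rsp) {
    assert(rule.reg == DWARF_RBP || rule.reg == DWARF_RSP);
    uint64_t base = (rule.reg == DWARF_RBP) ? rbp : rsp;
    return base + (uint64_t)rule.offset;
}
```

With a made-up stack pointer of 0x1000 at function entry, the rule (7, 8) gives a CFA of 0x1008. After pushq %rbp the stack pointer is 0xff8, and the rule (7, 16) from .cfi_def_cfa_offset 16 still gives 0x1008. The saved %rbp then sits at CFA - 16 = 0xff8, which matches .cfi_offset 6, -16.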
movq %rsp, %rbp
This copies the current value of the stack pointer into the base pointer, so the base of the new stack frame is the current stack top.
NOTE: this is in AT&T syntax, where the operand order is opcode source, destination.
.cfi_def_cfa_register 6 states that from this point onward, register 6 (%rbp) will be used to compute the Call Frame Address.
nop indicates No Operation. It is usually used either as padding, so that the following instructions land on addresses that are favorable for the processor cache, or as a placeholder that helps when debugging functions. A discussion on it can be found at https://www.reddit.com/r/C_Programming/comments/ecr9pp/why_does_gcc_include_nooperation_nop_assembly/.
popq %rbp pops the stack top into the %rbp register. The stack top contained only the original value of %rbp, saved there by the pushq instruction, which is now restored.
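The push, move, and pop sequence can be mimicked with a toy model in C. Everything here is a hypothetical sketch: the `machine` struct, its tiny 16-slot stack, and the helper names are invented for illustration, and the stack pointer is modeled as an array index rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical toy machine: a tiny stack plus the two registers involved. */
struct machine {
    uint64_t stack[STACK_WORDS];
    unsigned sp;      /* index of the current stack top; grows downward */
    uint64_t rbp;
};

/* pushq %rbp: move the stack top down one slot, then store rbp there. */
void push_rbp(struct machine *m) {
    assert(m->sp > 0);
    m->sp -= 1;
    m->stack[m->sp] = m->rbp;
}

/* movq %rsp, %rbp: copy the stack pointer into the base pointer. */
void mov_rsp_to_rbp(struct machine *m) {
    m->rbp = m->sp;
}

/* popq %rbp: load rbp from the stack top, then move the top back up. */
void pop_rbp(struct machine *m) {
    assert(m->sp < STACK_WORDS);
    m->rbp = m->stack[m->sp];
    m->sp += 1;
}
```

Running the three helpers in order on a machine whose rbp starts at some arbitrary value leaves both rbp and sp exactly as they were before, which is the point of the prologue and epilogue pair.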
.cfi_def_cfa 7, 8
ret
.cfi_endproc
The first CFI directive, .cfi_def_cfa, says to take the value of register 7 (%rsp) and add offset 8 to it to get the Call Frame Address.
After it comes ret, which returns from a procedure/function, and then the .cfi_endproc directive marks the end of the function.
After the .LFE0 label, we have
.size main, .-main
This declares the size of the symbol main, which is computed using the expression .-main. Here . indicates the current value of the address pointer, and the address of the main label is subtracted from it to get the size of the main function.
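The subtraction behind .-main can be sketched in C as a pointer difference. This is an illustration under assumptions: the byte values are the usual x86-64 encodings of the instructions in the listing, written out by hand rather than taken from an actual object file, and `main_code` and `symbol_size` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Machine code for the instructions in the listing, using their usual
   x86-64 encodings (assumed here, not dumped from a real build). */
static const unsigned char main_code[] = {
    0xf3, 0x0f, 0x1e, 0xfa,   /* endbr64         */
    0x55,                     /* pushq %rbp      */
    0x48, 0x89, 0xe5,         /* movq %rsp, %rbp */
    0x90,                     /* nop             */
    0x5d,                     /* popq %rbp       */
    0xc3                      /* ret             */
};

/* Mirrors ".-main": the address just past the last byte minus the
   address of the first byte. */
size_t symbol_size(const unsigned char *start, const unsigned char *end) {
    assert(end >= start);
    return (size_t)(end - start);
}
```

Calling `symbol_size(main_code, main_code + sizeof main_code)` yields 11 for this byte sequence. The CFI directives and labels do not add to that count, since they emit no instruction bytes into the function body.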
After this comes the part we have seen in previous posts.
This is how the main function is converted to assembly when compiling from C.
Again, I'm learning all of this as I write, so if you find any mistakes or have suggestions or improvements, please let me know in the comments.
- For more information on CFI, check out https://www.imperialviolet.org/2017/01/18/cfi.html and https://sourceware.org/binutils/docs/as/CFI-directives.html#CFI-directives