YJDoc2

Under the Hood : The Main Function

Hello!
This is the sixth post in the series. If you have not read the previous posts, I would recommend you do, as this one builds on them and skips the details they already explain.

I'm learning all of this as I write, so if you find any mistakes or have any suggestions or improvements, please let me know in the comments.

In this post, we will see how the main function is converted to assembly.

A simple main function

Let us consider the simplest possible main function:

void main(){}

This generated the assembly code (shown in the original post as an image titled "Main Function Assembly"), which we will now take a look at.

I will be skipping the part which was explained in previous posts, and will start from the main function itself.

Main Function

    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    endbr64
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    nop
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main

Let us see this, line by line.

.globl main

Declares main as a global symbol, which means it will be visible to other linked files as well.
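To connect this back to C, here is a small sketch of how linkage maps to symbol visibility. The function names are hypothetical and only for illustration: as far as I understand, GCC emits a .globl directive for an ordinary (non-static) function, and omits it for a static one.

```c
/* Sketch: how C linkage maps to symbol visibility.
   All function names here are hypothetical, for illustration only. */

/* External linkage: GCC emits ".globl visible_to_linker" for this one,
   so other object files can refer to it. */
int visible_to_linker(void) {
    return 1;
}

/* Internal linkage: no .globl directive is emitted, so the symbol
   stays local to this translation unit. */
static int local_only(void) {
    return 2;
}

/* Calls both, so the static function is actually used. */
int sum_of_both(void) {
    return visible_to_linker() + local_only();
}
```

Compiling this with gcc -S and searching the output for .globl should show the directive for visible_to_linker and sum_of_both, but not for local_only.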

.type main, @function

This sets the type of the symbol main to function, which is recorded in the object file's symbol table.

main:

Declares the label main. The startup code that begins at _start does the initial setup for program execution and then calls this label.
This is explained a bit in the following excerpt:

The symbol _start is the entry point of your program. That is, the address of that symbol is the address jumped to on program start. Normally, the function with the name _start is supplied by a file called crt0.o which contains the startup code for the C runtime environment. It…

.LFB0:

This label marks the Local Function Beginning; the 0 indicates that this is the first function in the file. Similarly, the label .LFE0: marks the end of local function 0.

Next are,

.cfi_startproc
endbr64

CFI stands for Call Frame Information. It is used when unwinding the stack, for example by debuggers or when handling exceptions or errors.
The .cfi_startproc directive marks the start of a procedure/function.
endbr64 is an assembly instruction, End Branch 64. It is used to detect invalid jumps: when the processor's indirect branch tracking is enabled, the first instruction executed after an indirect jump or an indirect call must be an endbr64, which tells the processor that this is a valid and intended place to jump to or call. If it is not found, the jump/call is treated as invalid and a fault is generated.
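To make "indirect call" concrete, here is a minimal C sketch; the names add_one and call_indirect are made up for this example. The call through the function pointer is the kind of branch that endbr64 guards. I am assuming GCC with branch protection enabled (-fcf-protection), which is the default on some distributions and is why endbr64 appears at the top of main above.

```c
/* Sketch of an indirect call. Names are hypothetical. */

/* Pointer to a function taking an int and returning an int. */
typedef int (*unary_fn)(int);

/* A possible target of an indirect call. With -fcf-protection,
   GCC places endbr64 at the start of functions like this one. */
int add_one(int x) {
    return x + 1;
}

/* The call through fn is indirect: the target address is only known
   at run time, so it compiles to something like "call *%rdx"
   rather than "call add_one". */
int call_indirect(unary_fn fn, int arg) {
    return fn(arg);
}
```

For example, call_indirect(add_one, 41) lands on the first instruction of add_one, which is where the processor expects to find endbr64 when tracking is enabled.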

It stands for "End Branch 64 bit" -- or more precisely, Terminate Indirect Branch in 64 bit

Intel has a document about this instruction.

Here is the operation:

IF EndbranchEnabled(CPL) & EFER.LMA = 1 & CS.L = 1
  IF CPL = 3
  THEN
    IA32_U_CET.TRACKER = IDLE
    IA32_U_CET.SUPPRESS = 0

pushq   %rbp

This pushes the current value of the base pointer register (%rbp) onto the stack, which also moves the stack pointer down by 8 bytes.

.cfi_def_cfa_offset 16
.cfi_offset 6, -16

These again are CFI directives. The first declares that, after the push, the Canonical Frame Address (CFA) is at an offset of 16 from the stack pointer. The second declares that the previous value of register 6 (%rbp in the x86-64 DWARF register numbering) is stored at an offset of -16 from the CFA.

As the DWARF spec says in section 6.4:

[...] The call frame is identified by an address on the stack. We refer to this address as the Canonical Frame Address or CFA. Typically, the CFA is defined to be the value of the stack pointer at the call site in…

movq    %rsp, %rbp

This copies the current value of the stack pointer into the base pointer, so the base of the new stack frame is the current top of the stack.
NOTE: this is in AT&T syntax, where the order is opcode source, destination.

.cfi_def_cfa_register 6

This states that from this point onward, register 6 (%rbp) is the register used to compute the CFA, with the offset left unchanged.

nop

This is No Operation: an instruction that does nothing. It is usually used either as padding, so that the following instructions land on addresses that suit the processor's cache, or as a spot that can help when debugging functions. A discussion on it can be found at https://www.reddit.com/r/C_Programming/comments/ecr9pp/why_does_gcc_include_nooperation_nop_assembly/.

popq    %rbp

This pops the top of the stack into the %rbp register. The stack top held the original value of %rbp, saved by the earlier pushq instruction, so that value is now restored.
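The push, move, and pop steps can be sketched as a toy simulation in C. This is only a model under my own assumptions: the ToyCpu struct, the STACK_BASE address, and the helper names are all made up, and real hardware obviously has far more state.

```c
#include <stdint.h>

/* Hypothetical address just above the toy stack. */
#define STACK_BASE 0x1000u

/* Toy model of the two registers plus a small stack. */
typedef struct {
    uint64_t rsp;         /* stack pointer: address of the stack top */
    uint64_t rbp;         /* base (frame) pointer */
    uint64_t memory[16];  /* stack memory, one 8-byte slot per element */
} ToyCpu;

/* Maps a toy stack address to its slot; the stack grows downward. */
static uint64_t *slot(ToyCpu *cpu, uint64_t addr) {
    return &cpu->memory[(STACK_BASE - addr) / 8 - 1];
}

/* pushq: move the stack pointer down by 8, then store the value. */
void pushq(ToyCpu *cpu, uint64_t value) {
    cpu->rsp -= 8;
    *slot(cpu, cpu->rsp) = value;
}

/* popq: load the value at the stack top, then move the pointer up by 8. */
uint64_t popq(ToyCpu *cpu) {
    uint64_t value = *slot(cpu, cpu->rsp);
    cpu->rsp += 8;
    return value;
}

/* Runs the body of the empty main and returns 1 if both registers
   end up exactly as they were on entry. */
int run_empty_main(uint64_t caller_rbp) {
    /* On entry, rsp points at the return address pushed by the call. */
    ToyCpu cpu = { .rsp = STACK_BASE - 8, .rbp = caller_rbp, .memory = {0} };
    uint64_t entry_rsp = cpu.rsp;

    pushq(&cpu, cpu.rbp);   /* pushq %rbp      */
    cpu.rbp = cpu.rsp;      /* movq %rsp, %rbp */
                            /* nop             */
    cpu.rbp = popq(&cpu);   /* popq %rbp       */

    return cpu.rbp == caller_rbp && cpu.rsp == entry_rsp;
}
```

Whatever value the caller had in %rbp, the function should hand it back unchanged, which is the whole point of the push/pop pair.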

.cfi_def_cfa 7, 8
ret
.cfi_endproc

The first CFI directive, .cfi_def_cfa, says that from here on the CFA is found by taking the value of register 7 (%rsp) and adding an offset of 8 to it.
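Putting the CFI directives together, the arithmetic can be checked with a short C sketch. It relies on two things that are not spelled out in the listing: on x86-64, DWARF register 6 is %rbp and register 7 is %rsp, and the rule in effect right at .cfi_startproc is CFA = %rsp + 8. The type and function names are my own.

```c
#include <stdint.h>

/* DWARF register numbers on x86-64. */
typedef enum { REG_RBP = 6, REG_RSP = 7 } DwarfReg;

/* A CFA rule: "CFA = value of reg + offset". */
typedef struct {
    DwarfReg reg;
    int64_t offset;
} CfaRule;

uint64_t compute_cfa(CfaRule rule, uint64_t rsp, uint64_t rbp) {
    uint64_t base = (rule.reg == REG_RSP) ? rsp : rbp;
    return base + (uint64_t)rule.offset;
}

/* Walks through the empty main and returns 1 if every rule
   yields the same CFA, which is what the unwinder relies on. */
int cfa_is_stable(uint64_t entry_rsp, uint64_t caller_rbp) {
    uint64_t rsp = entry_rsp, rbp = caller_rbp;
    CfaRule rule = { REG_RSP, 8 };       /* rule at .cfi_startproc */
    uint64_t cfa = compute_cfa(rule, rsp, rbp);

    rsp -= 8;                            /* pushq %rbp */
    rule.offset = 16;                    /* .cfi_def_cfa_offset 16 */
    if (compute_cfa(rule, rsp, rbp) != cfa) return 0;
    if (rsp != cfa - 16) return 0;       /* .cfi_offset 6, -16 */

    rbp = rsp;                           /* movq %rsp, %rbp */
    rule.reg = REG_RBP;                  /* .cfi_def_cfa_register 6 */
    if (compute_cfa(rule, rsp, rbp) != cfa) return 0;

    rbp = caller_rbp;                    /* popq %rbp restores rbp ... */
    rsp += 8;                            /* ... and moves rsp back up */
    rule = (CfaRule){ REG_RSP, 8 };      /* .cfi_def_cfa 7, 8 */
    return compute_cfa(rule, rsp, rbp) == cfa;
}
```

The directives change only how the CFA is computed; the CFA itself stays fixed for the whole function, even though %rsp and %rbp move around.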

Next is ret, which returns from the procedure/function, and then the .cfi_endproc directive marks the end of the function.
After the .LFE0 label, we have

.size   main, .-main

This declares the size of the symbol main. The size is given by the expression .-main, where . is the current address at this point in the output; the address of the main label is subtracted from it, giving the size of the main function in bytes.
After this comes the part we have seen in previous posts.
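As a rough worked example of the .-main arithmetic, the instructions of this main can be written out as bytes and measured with pointer subtraction. The byte values below are the usual x86-64 encodings of these instructions, which is my assumption about what the assembler produces and is not shown in the listing; main_bytes and main_size are made-up names.

```c
#include <stddef.h>

/* Assumed machine code for the empty main, one instruction per line. */
static const unsigned char main_bytes[] = {
    0xF3, 0x0F, 0x1E, 0xFA,  /* endbr64         */
    0x55,                    /* pushq %rbp      */
    0x48, 0x89, 0xE5,        /* movq %rsp, %rbp */
    0x90,                    /* nop             */
    0x5D,                    /* popq %rbp       */
    0xC3                     /* ret             */
};

/* Mirrors ".-main": current position minus the start of the function. */
size_t main_size(void) {
    const unsigned char *start = main_bytes;                    /* "main" */
    const unsigned char *end = main_bytes + sizeof main_bytes;  /* "."    */
    return (size_t)(end - start);
}
```

Under these assumed encodings the function is 11 bytes long, which is the value .size would record for main.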

This is how the main function is converted to assembly when compiling from C.

Again, I'm learning all of this as I write, so if you find any mistakes or have any suggestions or improvements, please let me know in the comments.

Thank you !
