灰豆

Posted on • Originally published at Medium

Let’s Understand Chrome V8 — Chapter 7: Stack Frame

Original source: https://medium.com/@huidou/lets-understand-chrome-v8-chapter-7-stack-frame-bb3fa3b7ad5
Welcome to other chapters of Let’s Understand Chrome V8


In this article, we look at the stack frames of two common kinds of function call: one that passes fewer arguments than the function declares, and one that passes more. We explain why these mismatched calls do not cause a stack overflow and still run and return correct results.

1. Introduction

A stack frame stores the arguments, the callee's return address, local variables, and saved registers. The steps to create a stack frame are:

(1) Push the callee's arguments onto the stack, if there are any.
(2) Push the return address.
(3) Enter the callee and push EBP.
(4) Set EBP = ESP.
(5) Reserve stack space for local variables, if there are any.
(6) Push the registers that must be preserved.
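The steps can be modeled with a small sketch. This is a toy model, not V8 code: the stack is a plain array, ESP and EBP are array indexes, and every name (callWithFrame, regs, the slot labels) is hypothetical. It follows the conventional x86 prologue, in which the value saved on entry to the callee is the caller's EBP.

```javascript
// Toy model of building a stack frame. The stack is an array that grows with
// push; ESP is the index of the top slot and EBP is a saved index.
function callWithFrame(stack, regs, args, returnAddress, localCount, savedRegisterNames) {
  // Push the arguments.
  for (const arg of args) stack.push({ kind: "argument", value: arg });
  // Push the return address.
  stack.push({ kind: "return address", value: returnAddress });
  // Save the caller's EBP on entry to the callee.
  stack.push({ kind: "saved EBP", value: regs.ebp });
  // EBP = ESP: the frame pointer now points at the saved EBP slot.
  regs.ebp = stack.length - 1;
  // Reserve space for local variables.
  for (let i = 0; i < localCount; i++) stack.push({ kind: "local", value: undefined });
  // Save the registers the callee must preserve.
  for (const name of savedRegisterNames) stack.push({ kind: "saved " + name, value: regs[name] });
  regs.esp = stack.length - 1;
  return stack;
}

const regs = { ebp: -1, esp: -1, ebx: 7 };
const frame = callWithFrame([], regs, [91], "after call add42", 1, ["ebx"]);
const frameLayout = frame.map((slot) => slot.kind);
console.log(frameLayout);
// [ 'argument', 'return address', 'saved EBP', 'local', 'saved ebx' ]
```

Because EBP stays fixed for the lifetime of the frame, arguments sit on one side of it and locals on the other, so both can be reached with a constant offset.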
Here’s an example:

function add42(x) {
  return x + 42;
}
add42(91);

Figure 1 shows the call stack when add42 is executed.

[Figure 1: the call stack when add42 is executed]
When add42 is called, Ignition generates its bytecode, sets up the stack frame, enters add42's stack space, and finally returns to the caller.

[Figure 2: the stack frame of add42]
Figure 2 shows the stack frame of add42, which is built by InterpreterPushArgsThenCall (its code appears in section 4). Let’s look at the following two cases:

  • call add42 without an argument: add42()

  • call add42 with three arguments: add42(91,1,2)

In sloppy mode, both calls execute normally, and add42(91,1,2) returns the expected result, 133. How does the stack frame make this work?
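The two cases are easy to observe from JavaScript itself. The short check below only uses the add42 function from this article; the variable names are introduced here for illustration.

```javascript
function add42(x) {
  return x + 42;
}

// Fewer arguments than parameters: x is undefined, and undefined + 42 is NaN.
const noArgument = add42();

// More arguments than parameters: the body only reads x, so 1 and 2 are unused.
const threeArguments = add42(91, 1, 2);

console.log(Number.isNaN(noArgument)); // true
console.log(threeArguments);           // 133
console.log(add42.length);             // 1, the number of declared parameters
```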

2. Register, Bytecode

Before diving into the stack frame, let’s look at the registers and bytecode, which makes the stack easier to understand. The bytecode for the call add42(91) is below.

0d              LdaUndefined              //Load Undefined to accumulator.
26 f9           Star r2                   //Store accumulator to r2.
13 01 00        LdaGlobal [1]             //Load global value(the add42's address) to accumulator. 
26 fa           Star r1                   //Store accumulator to r1.
0c 03           LdaSmi [91]               //Load smi-int 91 to accumulator. 
26 f8           Star r3                   //Store accumulator to r3.
5f fa f9 02     CallNoFeedback r1, r2-r3  //Call add42, r2 and r3 are arguments. 

V8 encodes each register operand as a single byte, written here in hexadecimal: fa is register r1, f9 is r2, and so on.

In the last line, 5f fa f9 02 CallNoFeedback r1, r2-r3, fa is r1, f9 is r2, and 02 is a length. Together, f9 and 02 mean a register list of length 2 starting at r2, that is, r2 and r3.
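As a rough illustration, the register-list operand can be expanded like this. The mapping from operand byte to register name is inferred only from the three pairs in the listing above (fa is r1, f9 is r2, f8 is r3); the helper names are hypothetical, and the constant may differ in other V8 versions.

```javascript
// Assumption: register operands decrease by one per register, and 0xfa is r1,
// as observed in the listing above.
const OPERAND_OF_R1 = 0xfa;

// Map an operand byte to a register name such as "r2".
function registerName(operand) {
  return "r" + (OPERAND_OF_R1 - operand + 1);
}

// Expand a (first operand, length) pair into the registers it covers.
function expandRegisterList(firstOperand, length) {
  const names = [];
  for (let i = 0; i < length; i++) {
    names.push(registerName(firstOperand - i));
  }
  return names;
}

const callee = registerName(0xfa);
const callArguments = expandRegisterList(0xf9, 0x02);
console.log(callee, callArguments); // r1 [ 'r2', 'r3' ]
```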

The bytecode of add42 is below:

25 02             Ldar a0          //Load the first argument to accumulator.
40 2a 00          AddSmi [42]      //accumulator = accumulator+42
ab                Return           //return

In the first line, the parameter a0 is encoded as 02, which is its offset in the stack frame shown in Figure 2. Read as a two's-complement byte, fa is -6, and FP-6 is exactly register r1; likewise, FP+2 is exactly the argument 91.

By adding an offset to the FP pointer, you can reach parameters and registers quickly, which simplifies the design of the bytecode.

Register addressing formula: r[i] = FP - 2 - kFixedFrameHeaderSize - i.

Arguments addressing formula: a[i] = FP + 2 + parameter_count - i - 1.
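The two formulas can be checked against the operand bytes in the listings. In the sketch below, the header size of 3 slots is an assumption: it is the value that makes the formula agree with the article's "FP-6 is r1", and it is not taken from V8's source. The function names are hypothetical.

```javascript
// Assumption: inferred from this article's numbers (r1 at FP-6), not from V8.
const kFixedFrameHeaderSize = 3;

// Read one operand byte as a signed 8-bit (two's-complement) value.
function toSignedByte(byte) {
  return byte > 0x7f ? byte - 0x100 : byte;
}

// r[i] = FP - 2 - kFixedFrameHeaderSize - i, returned as an offset from FP.
function registerOffset(i) {
  return -2 - kFixedFrameHeaderSize - i;
}

// a[i] = FP + 2 + parameter_count - i - 1, returned as an offset from FP.
function argumentOffset(i, parameterCount) {
  return 2 + parameterCount - i - 1;
}

console.log(registerOffset(1), toSignedByte(0xfa)); // -6 -6
console.log(registerOffset(2), toSignedByte(0xf9)); // -7 -7
console.log(argumentOffset(0, 1));                  // 2
```

With one declared parameter, a0 lands at FP+2, matching the 02 operand of Ldar a0.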

3. Adaptor Frame

Although the adaptor frame has been removed from V8, studying it still helps in understanding the stack frame.

add42()
add42(91,1,2)

We know that add42() returns NaN and add42(91,1,2) returns 133. Figure 3 shows the adaptor frame.

[Figure 3: the adaptor frame]
The adaptor frame has two important members:

  • Number of arguments: the number of parameters the function declares, taken from the JSFunction. For add42(x), it is 1.

  • Arguments slot: the slots that store the arguments. The number of slots equals the Number of arguments.

For add42(), one argument is missing, so the slot is filled with Undefined; see the left side of Figure 3.

For add42(91,1,2), three arguments are passed but Number of arguments is 1, so the slot is filled with the first argument, 91.

No matter how many arguments are passed in, add42 always receives the arguments it expects, so it executes correctly and returns the expected value.

The adaptor frame uses Number of arguments and the arguments slot to normalize mismatched arguments so that a function can be called correctly.
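The normalization can be sketched in a few lines. This is only a model of the idea described above, not V8's implementation; adaptArguments is a hypothetical helper, and add42.length stands in for the Number of arguments stored in the JSFunction.

```javascript
// Build exactly formalParameterCount slots: copy the arguments that exist,
// fill the rest with undefined, and ignore any extras.
function adaptArguments(actualArguments, formalParameterCount) {
  const slots = [];
  for (let i = 0; i < formalParameterCount; i++) {
    slots.push(i < actualArguments.length ? actualArguments[i] : undefined);
  }
  return slots;
}

function add42(x) {
  return x + 42;
}

const tooFew = adaptArguments([], add42.length);          // [ undefined ]
const tooMany = adaptArguments([91, 1, 2], add42.length); // [ 91 ]

console.log(add42(...tooFew));  // NaN
console.log(add42(...tooMany)); // 133
```

Either way the callee sees exactly one slot, which is why its bytecode can use a fixed offset for a0.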

Why was the adaptor frame removed? To improve performance.
With the adaptor frame, calling a function creates two stack frames: the adaptor frame and the callee's frame. That is too time-consuming.

4. Stack Frame

The new stack frame removes the adaptor frame but still uses Number of arguments, as shown in Figure 4.

[Figure 4: the new stack frame]
The new stack frame is designed to meet the four requirements below:

(1) Access parameters and registers through the FP pointer plus an offset.

(2) Normalize mismatched arguments, so that add42() and add42(91,1,2) are both called correctly.

(3) Unwind the stack correctly when the callee returns.

(4) Drop the adaptor frame to reduce the number of frames built per call.

Let’s see how each of these four requirements is met.

For (1): The push order and the operand encoding are unchanged, so this requirement is met.

For (2): Number of arguments is still in the frame. When the number of arguments passed is >= Number, only args[0] through args[Number-1] are kept. When it is < Number, Undefined is used for each missing parameter.

For (3): As Figure 4 shows, this requirement is met.

For (4): The adaptor frame is gone.

The code that builds the stack frame is below.

void Builtins::Generate_InterpreterPushArgsThenCallImpl(
    MacroAssembler* masm, ConvertReceiverMode receiver_mode,
    InterpreterPushArgsMode mode) {
  DCHECK(mode != InterpreterPushArgsMode::kArrayFunction);
  // ----------- S t a t e -------------
  //  -- rax : the number of arguments (not including the receiver)
  //  -- rbx : the address of the first argument to be pushed. Subsequent
  //           arguments should be consecutive above this, in the same order as
  //           they are to be pushed onto the stack.
  //  -- rdi : the target to call (can be any Object).
  // -----------------------------------
  Label stack_overflow;

  if (mode == InterpreterPushArgsMode::kWithFinalSpread) {
    // The spread argument should not be pushed.
    __ decl(rax);
  }

  __ leal(rcx, Operand(rax, 1));  // Add one for receiver.

  // Add a stack check before pushing arguments.
  __ StackOverflowCheck(rcx, &stack_overflow);

  // Pop return address to allow tail-call after pushing arguments.
  __ PopReturnAddressTo(kScratchRegister);

  if (receiver_mode == ConvertReceiverMode::kNullOrUndefined) {
    // Don't copy receiver.
    __ decq(rcx);
  }
  //...............omit
  __ bind(&stack_overflow);
  {
    __ TailCallRuntime(Runtime::kThrowStackOverflow);
    // This should be unreachable.
    __ int3();
  }
}
//=======================================
//======separation==========================
//=======================================
void Builtins::Generate_InterpreterPushArgsThenConstructImpl(
    MacroAssembler* masm, InterpreterPushArgsMode mode) {
  // ----------- S t a t e -------------
  //  -- rax : the number of arguments (not including the receiver)
  //  -- rdx : the new target (either the same as the constructor or
  //           the JSFunction on which new was invoked initially)
  //  -- rdi : the constructor to call (can be any Object)
  //  -- rbx : the allocation site feedback if available, undefined otherwise
  //  -- rcx : the address of the first argument to be pushed. Subsequent
  //           arguments should be consecutive above this, in the same order as
  //           they are to be pushed onto the stack.
  // -----------------------------------
  Label stack_overflow;

  // Add a stack check before pushing arguments.
  __ StackOverflowCheck(rax, &stack_overflow);

  // Pop return address to allow tail-call after pushing arguments.
  __ PopReturnAddressTo(kScratchRegister);

  if (mode == InterpreterPushArgsMode::kWithFinalSpread) {
    // The spread argument should not be pushed.
    __ decl(rax);
  }

  // rcx and r8 will be modified.
  GenerateInterpreterPushArgs(masm, rax, rcx, r8);

  // Push slot for the receiver to be constructed.
  __ Push(Immediate(0));

  if (mode == InterpreterPushArgsMode::kWithFinalSpread) {
    // Pass the spread in the register rbx.
    __ movq(rbx, Operand(rcx, -kSystemPointerSize));
    // Push return address in preparation for the tail-call.
    __ PushReturnAddressFrom(kScratchRegister);
  } else {
    __ PushReturnAddressFrom(kScratchRegister);
    __ AssertUndefinedOrAllocationSite(rbx);
  }

//omit.....................

  // Throw stack overflow exception.
  __ bind(&stack_overflow);
  {
    __ TailCallRuntime(Runtime::kThrowStackOverflow);
    // This should be unreachable.
    __ int3();
  }
}

The code above shows how the stack frame is built. It also includes some architecture-specific preparation; see builtins-x64.cc for details.
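The visible part of the listing follows a simple order: count the slots needed (the arguments plus one for the receiver), check for overflow, and only then push. The toy model below mirrors that order. It is not a translation of the builtin: the stack is an array with a made-up capacity, pushArgsThenCall is a hypothetical name, and the push order is arbitrary because the real pushing code is in the omitted part of the listing.

```javascript
// Toy model: check for room first, so nothing is pushed when the check fails.
function pushArgsThenCall(stack, capacity, receiver, args) {
  const slotsNeeded = args.length + 1; // add one for the receiver
  if (stack.length + slotsNeeded > capacity) {
    throw new RangeError("Maximum call stack size exceeded");
  }
  stack.push(receiver);
  for (const arg of args) stack.push(arg);
  return stack;
}

const stack = [];
pushArgsThenCall(stack, 4, undefined, [91]); // uses 2 of the 4 slots

let overflowError = null;
try {
  pushArgsThenCall(stack, 4, undefined, [1, 2, 3]); // needs 4 more slots
} catch (error) {
  overflowError = error;
}

console.log(stack.length);                        // 2
console.log(overflowError instanceof RangeError); // true
```

Checking before pushing means a failed call leaves the stack exactly as it was, which is the reason the builtin performs StackOverflowCheck up front.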

void Builtins::Generate_Call(MacroAssembler* masm, ConvertReceiverMode mode) {
  // ----------- S t a t e -------------
  //  -- rax : the number of arguments (not including the receiver)
  //  -- rdi : the target to call (can be any Object)
  // -----------------------------------
  StackArgumentsAccessor args(rax);

  Label non_callable;
  __ JumpIfSmi(rdi, &non_callable);
  __ LoadMap(rcx, rdi);
  __ CmpInstanceTypeRange(rcx, FIRST_JS_FUNCTION_TYPE, LAST_JS_FUNCTION_TYPE);
  __ Jump(masm->isolate()->builtins()->CallFunction(mode),
          RelocInfo::CODE_TARGET, below_equal);

  __ CmpInstanceType(rcx, JS_BOUND_FUNCTION_TYPE);
  __ Jump(BUILTIN_CODE(masm->isolate(), CallBoundFunction),
          RelocInfo::CODE_TARGET, equal);

  // Check if target has a [[Call]] internal method.
  __ testb(FieldOperand(rcx, Map::kBitFieldOffset),
           Immediate(Map::Bits1::IsCallableBit::kMask));
  __ j(zero, &non_callable, Label::kNear);

  // Check if target is a proxy and call CallProxy external builtin
  __ CmpInstanceType(rcx, JS_PROXY_TYPE);
  __ Jump(BUILTIN_CODE(masm->isolate(), CallProxy), RelocInfo::CODE_TARGET,
          equal);

  // 2. Call to something else, which might have a [[Call]] internal method (if
  // not we raise an exception).

  // Overwrite the original receiver with the (original) target.
  __ movq(args.GetReceiverOperand(), rdi);
  // Let the "call_as_function_delegate" take care of the rest.
  __ LoadNativeContextSlot(rdi, Context::CALL_AS_FUNCTION_DELEGATE_INDEX);
  __ Jump(masm->isolate()->builtins()->CallFunction(
              ConvertReceiverMode::kNotNullOrUndefined),
          RelocInfo::CODE_TARGET);

  // 3. Call to something that is not callable.
  __ bind(&non_callable);
  {
    FrameScope scope(masm, StackFrame::INTERNAL);
    __ Push(rdi);
    __ CallRuntime(Runtime::kThrowCalledNonCallable);
  }
}

The code above is the entry point for executing the callee; from here, add42's bytecode is executed.
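The branches in Generate_Call correspond to kinds of call target that can be produced from plain JavaScript: an ordinary function, a bound function, a proxy, and a value that is not callable. The snippet below only shows the observable result of each kind of call; it does not inspect which builtin V8 actually takes, and the variable names are introduced here for illustration.

```javascript
function add42(x) {
  return x + 42;
}

const bound = add42.bind(null, 91);   // a bound function
const proxied = new Proxy(add42, {}); // a proxy around a callable target

const results = {
  plain: add42(91),
  bound: bound(),
  proxy: proxied(91),
};

// Calling a non-callable value (here a small integer) throws a TypeError.
let nonCallableError = null;
try {
  const notAFunction = 42;
  notAFunction();
} catch (error) {
  nonCallableError = error;
}

console.log(results);                               // { plain: 133, bound: 133, proxy: 133 }
console.log(nonCallableError instanceof TypeError); // true
```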


Okay, that wraps it up for this share. I’ll see you guys next time, take care!

Please reach out to me if you have any issues. WeChat: qq9123013 Email: v8blink@outlook.com
