In this post, we will dive head-first into what Python bytecode is, how to view it, and how to read and understand it.
[Cover Photo by Jeppe Hove Jensen on Unsplash]
Like many people who have worked with Python, you may only be used to seeing Python source code (.py files). But what happens to that source code so that it can be executed by a CPU?
Like many other interpreted programming languages, Python first compiles its source code to an intermediate bytecode format, which is then executed by the Python runtime. The standard CPython interpreter runs each bytecode instruction in a virtual machine rather than converting the program to native CPU instructions ahead of time. For imported modules, the bytecode is cached in .pyc files inside a __pycache__ directory, and the runtime reuses these files on later runs.
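As a rough illustration of the compile step, the built-in compile() function turns source text into a code object whose co_code attribute holds the raw bytecode, and importlib.util.cache_from_source() reports where a cached .pyc file would be stored. The source string and the file name example.py are made up for this sketch.

```python
import importlib.util

# Hypothetical one-line module, used only for illustration.
source = "answer = 6 * 7\n"

# compile() produces a code object; nothing is executed yet.
code_obj = compile(source, "example.py", "exec")

# co_code holds the raw bytecode as a bytes object.
print(type(code_obj.co_code), len(code_obj.co_code))

# Executing the code object runs its bytecode in the interpreter.
namespace = {}
exec(code_obj, namespace)
print(namespace["answer"])

# Where the compiled module would be cached (the exact path varies by version).
print(importlib.util.cache_from_source("example.py"))
```

The exact bytes and the cache file name differ between Python versions, so treat the printed values as indicative only.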
The Python standard library provides the dis module, which exposes an API for disassembling Python bytecode into human-readable instructions.
Official Docs: dis — Disassembler for Python bytecode.
We can use the dis.dis(obj) function from this module to print the disassembled bytecode of the object passed in as an argument.
Below is an example of a simple hello_world() function which has been disassembled using the dis() function.
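A minimal sketch of such an example follows. The function body is an assumption (a single print call), since any simple function would do.

```python
import dis


def hello_world():
    # Assumed body for illustration: print a greeting.
    print("Hello, World!")


# Print the disassembled bytecode of hello_world to stdout.
dis.dis(hello_world)
```

The instructions printed depend on the Python version in use, so the output on your machine may not match listings found elsewhere.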
The bytecode output is composed of the following properties.
- The line number of the Python code that the current block of bytecode corresponds to.
- The offset of the instruction within the bytecode sequence.
- The opcode of the instruction.
- The oparg, which is the argument for the opcode, where applicable.
- Where possible, the resolved oparg value.
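The same properties can be read programmatically. The sketch below uses dis.get_instructions() and dis.findlinestarts() to collect them as tuples; hello_world is an assumed stand-in function, and the tuple layout is just one way of arranging the fields.

```python
import dis


def hello_world():
    print("Hello, World!")


# Map each bytecode offset that begins a source line to that line number.
line_starts = dict(dis.findlinestarts(hello_world.__code__))

rows = []
for instr in dis.get_instructions(hello_world):
    rows.append((
        line_starts.get(instr.offset),  # source line, only where a new line starts
        instr.offset,                   # byte offset of the instruction
        instr.opname,                   # human-readable opcode name
        instr.arg,                      # raw oparg, or None if the opcode takes none
        instr.argrepr,                  # resolved oparg as text, or "" if not applicable
    ))

for row in rows:
    print(row)
```

Each printed tuple corresponds to one line of the dis.dis() output, in the same column order as the list above.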
Let's step through a simple (perhaps somewhat contrived) example and outline what is being performed by each bytecode instruction.
The following function simply takes two arguments, x & y, and returns the sum of the two provided arguments.
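A sketch of that function and its disassembly is shown here. The name add and the parameters x and y come from the text; the one-line body is the obvious reading of the description.

```python
import dis


def add(x, y):
    # Return the sum of the two arguments.
    return x + y


# Print the bytecode instructions that make up add().
dis.dis(add)
```

Opcode names vary by interpreter version. For instance, versions before Python 3.11 show BINARY_ADD for the addition, while 3.11 and later show BINARY_OP, and newer releases may also combine or rename the load instructions.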
The first two LOAD_FAST instructions push the x & y arguments provided to the add function onto the evaluation stack. The opargs provided to the LOAD_FAST instructions reference the index of the values to be loaded in the function's local variables.
The BINARY_ADD instruction then pops the top two items from the evaluation stack (x & y) and sums the two values. The result of the calculation is then pushed onto the top of the stack.
RETURN_VALUE then returns the value from the top of the evaluation stack to the caller and exits the function.
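To make the stack behaviour concrete, here is a toy evaluator for just these three instructions. It is a simplified model for illustration, not how CPython is implemented; the run function and the program list are hypothetical names, and the opcode names follow the ones used in this post.

```python
def run(instructions, local_vars):
    """Execute a tiny, made-up subset of stack instructions."""
    stack = []
    for opname, oparg in instructions:
        if opname == "LOAD_FAST":
            # Push the local variable stored at index oparg.
            stack.append(local_vars[oparg])
        elif opname == "BINARY_ADD":
            # Pop the top two values and push their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "RETURN_VALUE":
            # Hand the top of the stack back to the caller.
            return stack.pop()
        else:
            raise ValueError(f"unsupported instruction: {opname}")
    raise RuntimeError("program ended without RETURN_VALUE")


# The instruction sequence described above for add(x, y).
program = [
    ("LOAD_FAST", 0),       # push x
    ("LOAD_FAST", 1),       # push y
    ("BINARY_ADD", None),   # pop both, push x + y
    ("RETURN_VALUE", None), # return the top of the stack
]

result = run(program, [2, 3])
print(result)  # 5
```

Passing [2, 3] as the local variables mirrors calling add(2, 3): index 0 holds x and index 1 holds y.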