Note: this article was originally published on my personal blog. Check it out here.
When writing a program, making sure it works the way it's supposed to is a crucial part of the process. This means you have to test your code, and you almost always end up with bugs that you have to debug.
When you are using an IDE, you often have nice tools to do that, with the ability to place breakpoints, observe the call stack, inspect the content of the locals... But sometimes you find yourself in a situation where your program lives somewhere else and you can't use those nice modern tools, so you have to find another solution.
If you are using Python, I have some good news for you: the standard library comes with a basic debugger included, pdb, which we'll discover today.
What is pdb?
pdb, as in Python DeBugger, is an interactive source code debugger for Python programs. It is part of the standard library, which means that if you have Python, you have it. Like any debugger, it offers breakpoints, context and stack inspection, and code evaluation.
How to use it?
There are basically three ways to run a program under pdb. The first one is using the run() function inside a REPL, the second is putting a breakpoint inside your code with the set_trace() function. The last one also sets a breakpoint, but this time using the builtin breakpoint() function.
Assume you have the following piece of code you want to debug:
# mymodule.py
from math import sqrt

def compute(a: int, b: int) -> float:
    c = a ** 2 + b ** 2
    c = sqrt(c)
    return c

def launch():
    result = compute(3, 4)
    print(result)

if __name__ == '__main__':
    launch()
Using run() inside a REPL
The first way to use pdb is by using the run() function inside a REPL:
>>> from mymodule import launch
>>> import pdb
>>> pdb.run("launch()")
> <string>(1)<module>()
(Pdb)
After passing the statement to execute (as a string) to the run() function, we find ourselves in a different shell, as we can see with the (Pdb) prompt. That's the pdb shell, where we can explore our code, set breakpoints, show the stack... We'll talk about how to interact with it later.
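Since the (Pdb) prompt simply reads commands from an input stream, a session like the one above can also be scripted, which is handy for experimenting. The following is only a minimal sketch, not the usual workflow: it feeds two commands to a pdb.Pdb instance through an in-memory stream instead of the keyboard. The names `commands`, `output` and `namespace` are made up for the example, and `compute()` is a shortened copy of the function from mymodule.py.

```python
import io
import pdb
from math import sqrt


def compute(a: int, b: int) -> float:
    return sqrt(a ** 2 + b ** 2)


# Commands we would normally type at the (Pdb) prompt, one per line.
commands = io.StringIO("p compute(3, 4)\ncontinue\n")
# Everything pdb would normally print to the terminal ends up here.
output = io.StringIO()

# Namespace in which the statement given to run() is executed.
namespace = {"compute": compute}

debugger = pdb.Pdb(stdin=commands, stdout=output, nosigint=True, readrc=False)
debugger.run("result = compute(3, 4)", namespace)

print(output.getvalue())    # the recorded session, prompts included
print(namespace["result"])  # the statement ran to completion after "continue"
```

The module-level pdb.run() used above does the same thing with a default Pdb instance that reads from and writes to your terminal.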
Setting a breakpoint with set_trace()
Another way is to put a breakpoint inside your code with the set_trace() function. When you put it inside your code, Python will pause the execution each time it reaches it, and will launch a pdb shell for you to interact with. Now assume I put a breakpoint inside the previous code at the beginning of the launch() function:
  from math import sqrt
+ import pdb

  def compute(a: int, b: int) -> float:
      c = a ** 2 + b ** 2
      c = sqrt(c)
      return c

  def launch():
+     pdb.set_trace()
      result = compute(3, 4)
      print(result)

  if __name__ == '__main__':
      launch()
If I launch this updated script with the regular python (and not inside a REPL):
> python mymodule.py
> /home/user/mymodule.py(11)launch()
-> result = compute(3, 4)
(Pdb)
We end up with basically the same prompt, where we can interact with pdb.
Using the builtin breakpoint() function
There's a third way to run pdb. It is quite similar to the previous one, but this time we'll be using a builtin function: breakpoint(). This function has existed since Python 3.7, and acts as some kind of proxy to your debugger, allowing you to switch between various Python debuggers without having to update your code.
Its default behavior is to call pdb.set_trace(), which is what matters for us here:
  from math import sqrt
- import pdb

  def compute(a: int, b: int) -> float:
      c = a ** 2 + b ** 2
      c = sqrt(c)
      return c

  def launch():
-     pdb.set_trace()
+     breakpoint()
      result = compute(3, 4)
      print(result)

  if __name__ == '__main__':
      launch()
If I launch this updated script with the regular python (and not inside a REPL):
> python mymodule.py
> /home/user/mymodule.py(11)launch()
-> result = compute(3, 4)
(Pdb)
We can see that we end up with the same output as when we were using pdb.set_trace().
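To make the "proxy" idea a bit more concrete: breakpoint() does not call the debugger directly, it calls sys.breakpointhook(), and that hook can be replaced. The snippet below is a small sketch of that mechanism, where `recording_hook` and `calls` are names invented for the example. It swaps the hook for a function that only records that it was called, so no debugger is started.

```python
import sys

calls = []


def recording_hook(*args, **kwargs):
    """Stand-in for a debugger: just remember that we were called."""
    calls.append((args, kwargs))


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    breakpoint()  # forwarded to recording_hook() instead of pdb.set_trace()
finally:
    # Always restore the previous hook so later breakpoint() calls behave normally.
    sys.breakpointhook = original_hook

print(calls)
```

In day-to-day use you rarely touch the hook yourself; the default hook is what ends up calling pdb.set_trace().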
Interacting with pdb
As you may have noticed, all interaction with pdb must be done using the keyboard, as is the case with everything that happens inside a terminal. As such, there are a few built-in commands that allow you to interact with it. Let's go over the most interesting ones.
The official documentation of the pdb module presents a list of all the available commands.
Note on the syntax
Some of them have both a short and a long syntax, which means you can use either one of them for the same result. They are written like this, with the rest of the long form between parentheses: my(command). In this example, my is the short version, while mycommand is the long one.
Note: all the excerpts shown below have been reduced for the sake of readability. If you try them yourself, you might see a bit more verbosity, but the core stays the same.
l(ist)/ll
These two commands are really similar: they both print source code around where you currently are:
- l(ist) will print 11 lines around the current line being executed (i.e. 5 lines before, the line being executed, and 5 lines after)
- ll will print all the source code for the current function
The current line in the current frame is indicated by ->.
(Pdb) ll
4 def compute(a: int, b: int) -> float:
5 c = a ** 2 + b ** 2
6 c = sqrt(c)
7 -> return c
(Pdb) list
2 from pdb import set_trace
3
4 def compute(a: int, b: int) -> float:
5 c = a ** 2 + b ** 2
6 c = sqrt(c)
7 -> return c
8
9 def launch():
10 set_trace()
11 result = compute(3, 4)
12 print(result)
(Pdb)
c(ont(inue))
This command will simply continue the execution of the program until the next breakpoint, the end of the program, or an error is raised.
(Pdb) ll
4 def compute(a: int, b: int) -> float:
5 c = a ** 2 + b ** 2
6 c = sqrt(c)
7 -> return c
(Pdb) continue
5.0
user@computer ~/ >
q(uit)
This command will simply stop the execution of the program and exit right away.
user@computer ~/ > python mymodule.py
-> result = compute(3, 4)
(Pdb) step
--Call--
-> def compute(a: int, b: int) -> float:
(Pdb) next
-> c = a ** 2 + b ** 2
(Pdb) next
-> c = sqrt(c)
(Pdb) quit
Traceback (most recent call last):
File "/home/user/mymodule.py", line 15, in <module>
launch()
File "/home/user/mymodule.py", line 11, in launch
result = compute(3, 4)
File "/home/user/mymodule.py", line 6, in compute
c = sqrt(c)
File "/home/user/mymodule.py", line 6, in compute
c = sqrt(c)
File "/usr/lib/python3.7/bdb.py", line 88, in trace_dispatch
return self.dispatch_line(frame)
File "/usr/lib/python3.7/bdb.py", line 113, in dispatch_line
if self.quitting: raise BdbQuit
bdb.BdbQuit
Warning: Be careful when using this command, especially when you are using handlers that need to be properly closed, like file handlers or database connections. If the code you are debugging doesn't properly close them in case of error, you might end up with side effects because those handlers weren't properly closed.
See below for an explanation of the s(tep) and n(ext) commands.
p/pp
These two commands are really similar. They both evaluate a given expression in the current context and print its value. p prints the repr() of the result (which is why the string below is shown with its quotes), while pp pretty-prints it using the pprint module.
(Pdb) p f"a={a}, b={b}"
'a=3, b=4'
(Pdb) pp list(range((a+b)**2))
[0,
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
...,
40,
41,
42,
43,
44,
45,
46,
47,
48]
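Outside of the debugger, the difference between the two can be reproduced with plain Python. This is only an approximation of what p and pp display, assuming p shows the repr() of the value and pp formats it with the pprint module; the variable names are made up for the example.

```python
import pprint

a, b = 3, 4
value = list(range((a + b) ** 2))

# Roughly what `p` shows: the repr() of the value, on a single line.
p_output = repr(value)

# Roughly what `pp` shows: the same value formatted by pprint, which wraps
# a list that does not fit on one line by putting one item per line.
pp_output = pprint.pformat(value)

print(p_output)
print(pp_output)
```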
n(ext)/s(tep)
These two commands are almost the same. They allow you to move forward in your program, one line at a time, by executing the next line.
Their behavior differs when the next step to execute is a call to a function:
- n(ext) will simply execute the function
- s(tep) will enter the function and stop at the function signature (i.e. the def myFunc(*args, **kwargs):), allowing you to go further using n(ext) or s(tep).
(Pdb) ll
9 def launch():
10 set_trace()
11 -> result = compute(3, 4)
12 print(result)
(Pdb) step
--Call--
-> def compute(a: int, b: int) -> float:
(Pdb) next
-> c = a ** 2 + b ** 2
(Pdb) next
-> c = sqrt(c)
(Pdb)
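The difference between the two commands can also be observed in a scripted session. The sketch below is just an illustration under a few assumptions: `run_session` is a helper invented for the example, and commands are fed to pdb.Pdb through an in-memory stream instead of being typed. It runs the same statement twice, once answering with step and once with next, and checks whether pdb announced entering the function with its --Call-- marker.

```python
import io
import pdb
from math import sqrt


def compute(a: int, b: int) -> float:
    c = a ** 2 + b ** 2
    return sqrt(c)


def run_session(commands: str) -> str:
    """Run `compute(3, 4)` under pdb, feeding it the given commands."""
    output = io.StringIO()
    debugger = pdb.Pdb(stdin=io.StringIO(commands), stdout=output,
                       nosigint=True, readrc=False)
    debugger.run("compute(3, 4)", {"compute": compute})
    return output.getvalue()


step_session = run_session("step\ncontinue\n")
next_session = run_session("next\ncontinue\n")

# step stops inside compute(), so pdb prints the --Call-- marker.
print("--Call--" in step_session)
# next executes compute() as a whole, so the marker never shows up.
print("--Call--" in next_session)
```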
w(here)
This command prints the stack trace, with the most recent frame at the bottom.
(Pdb) where
/home/user/mymodule.py(15)<module>()
-> launch()
> /home/user/mymodule.py(11)launch()
-> result = compute(3, 4)
(Pdb)
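The same ordering (oldest frame first, most recent frame last) is what the standard traceback module uses. As a rough illustration outside of pdb, and assuming a simplified compute() that takes no arguments, the snippet below collects the names of the functions on the call stack:

```python
import traceback


def compute():
    # extract_stack() lists the frames from the oldest call to the current one.
    stack = traceback.extract_stack()
    return [frame.name for frame in stack]


def launch():
    return compute()


names = launch()
# The last two entries are the caller and the function we are currently in.
print(names[-2:])
```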
u(p)/d(own)
These two commands allow you to move up and down inside the stack. u(p) will go to an older frame, i.e. up in the list, while d(own) will go to a newer one, i.e. one closer to the bottom of the list.
(Pdb) up
-> launch()
(Pdb) list
10 set_trace()
11 result = compute(3, 4)
12 print(result)
13
14 if __name__ == '__main__':
15 -> launch()
[EOF]
(Pdb) down
-> result = compute(3, 4)
(Pdb) list
6 c = sqrt(c)
7 return c
8
9 def launch():
10 set_trace()
11 -> result = compute(3, 4)
12 print(result)
13
14 if __name__ == '__main__':
15 launch()
[EOF]
(Pdb)
r(eturn)
This command is useful when you are inside a function. It will continue the execution until the current function returns (or raises).
(Pdb) ll
4 def compute(a: int, b: int) -> float:
5 -> c = a ** 2 + b ** 2
6 c = sqrt(c)
7 return c
(Pdb) return
--Return--
> /home/user/mymodule.py(7)compute()->5.0
-> return c
(Pdb) ll
4 def compute(a: int, b: int) -> float:
5 c = a ** 2 + b ** 2
6 c = sqrt(c)
7 -> return c
(Pdb)
a(rgs)
This command is really useful when you are inside the scope of a function, as it will print the arguments this function has been called with.
(Pdb) args
a = 3
b = 4
interact
The interact command can be seen as a more powerful way to inspect the current context. It starts a classic REPL whose global namespace contains all the global and local names of the current scope, meaning you can interact with them just like in a regular Python console.
(Pdb) interact
*interactive*
>>> from pprint import pprint
>>> pprint(locals())
{'__annotations__': {},
'__builtins__': <module 'builtins' (built-in)>,
'__cached__': None,
'__doc__': None,
'__file__': '/home/user/mymodule.py',
'__loader__': <_frozen_importlib_external.SourceFileLoader object at 0x107bf5ca0>,
'__name__': '__main__',
'__package__': None,
'__return__': 5.0,
'__spec__': None,
'a': 3,
'b': 4,
'c': 5.0,
'compute': <function compute at 0x107c41310>,
'launch': <function launch at 0x107e0e940>,
'pprint': <function pprint at 0x1082c6c10>,
'set_trace': <function set_trace at 0x1082df700>,
'sqrt': <built-in function sqrt>}
>>> # Ctrl+D to exit and return to the pdb console
now exiting InteractiveConsole...
(Pdb)
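The console started by interact is built on the standard code module. To give an idea of how such a console works, here is a minimal sketch that is not pdb's actual implementation: it creates an InteractiveConsole over a hand-made namespace (`scope`, a name invented for the example) and feeds it one line, as if it had been typed at the >>> prompt.

```python
import code

# Names that would be visible in the frame being debugged (made up here).
scope = {"a": 3, "b": 4}

console = code.InteractiveConsole(locals=scope)
# push() feeds one line of source to the console and executes it
# once the statement is complete.
console.push("c = (a ** 2 + b ** 2) ** 0.5")

# The assignment happened inside the namespace given to the console.
print(scope["c"])
```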
ipdb, the evolution
Do you know IPython? It is a modern Python console that extends the capabilities of the classic builtin Python shell by offering introspection, tab completion, syntax highlighting, as well as history. If you don't know it, I can't recommend it enough. More information can be found on its GitHub page.
Based on this evolved console, some people have designed an evolved version of pdb that uses this REPL to bring all the cool features of IPython inside pdb: it is called ipdb.
Once you have installed it using pip, like any regular Python package, you can use it the same way you would use pdb:
  from math import sqrt
- import pdb
+ import ipdb

  def compute(a: int, b: int) -> float:
      c = a ** 2 + b ** 2
      c = sqrt(c)
      return c

  def launch():
-     pdb.set_trace()
+     ipdb.set_trace()
      result = compute(3, 4)
      print(result)

  if __name__ == '__main__':
      launch()
Once run, the commands are the same as with pdb, except with IPython formatting applied (which you can't really see with the markdown formatting used here; you'll have to either trust me or try it yourself):
user@computer ~/ > python mymodule.py
10 set_trace()
---> 11 result = compute(3, 4)
12 print(result)
ipdb> ll
9 def launch():
10 set_trace()
---> 11 result = compute(3, 4)
12 print(result)
13
ipdb> step
--Call--
3
----> 4 def compute(a: int, b: int) -> float:
5 c = a ** 2 + b ** 2
ipdb> next
4 def compute(a: int, b: int) -> float:
----> 5 c = a ** 2 + b ** 2
6 c = sqrt(c)
ipdb> args
a = 3
b = 4
ipdb>
Note on running ipdb with breakpoint()
As said above, since Python 3.7, there's the builtin function breakpoint() that lets you call a Python debugger using a single command. The default behavior of this function is to run pdb.set_trace(). If you are using ipdb, you can change this behavior so that it uses this debugger instead.
To do that you'll have to set the environment variable PYTHONBREAKPOINT to ipdb.set_trace, so that Python will know which debugger to use:
  from math import sqrt
- import ipdb

  def compute(a: int, b: int) -> float:
      c = a ** 2 + b ** 2
      c = sqrt(c)
      return c

  def launch():
-     ipdb.set_trace()
+     breakpoint()
      result = compute(3, 4)
      print(result)

  if __name__ == '__main__':
      launch()
user@computer ~/ > export PYTHONBREAKPOINT=ipdb.set_trace
user@computer ~/ > python mymodule.py
10 set_trace()
---> 11 result = compute(3, 4)
12 print(result)
ipdb>
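The same environment variable can also be used to turn breakpoints off: with PYTHONBREAKPOINT set to 0, the default hook makes breakpoint() a no-op. The snippet below is a small sketch of that behavior; it sets the variable from within Python only to keep the example self-contained (in practice you would export it in your shell, as above), and it assumes the default sys.breakpointhook is still in place and that Python was not started with an option that ignores environment variables.

```python
import os

previous = os.environ.get("PYTHONBREAKPOINT")
os.environ["PYTHONBREAKPOINT"] = "0"
try:
    # The default hook reads PYTHONBREAKPOINT on every call;
    # "0" tells it to return immediately without starting a debugger.
    breakpoint()
    reached = True
finally:
    # Put the environment back the way we found it.
    if previous is None:
        del os.environ["PYTHONBREAKPOINT"]
    else:
        os.environ["PYTHONBREAKPOINT"] = previous

print(reached)
```

This can be convenient when a forgotten breakpoint() call must not block a program running somewhere without a terminal.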
Some drawbacks
Now, like every nice tool, pdb, and its modern version ipdb, have some issues.
An interactive terminal is required
This may sound obvious, but I can assure you that sometimes you forget about this and are reminded the hard way when you are trying to debug an application that's inside a Docker container.
To be a bit more precise, to be able to properly use (i)pdb, you need a terminal in which you have access to both its input and output file descriptors, so that you can send commands to (i)pdb and see the output of the debugger.
This means that if you are trying to debug an application that's running inside a container, a VM, or even a Kubernetes pod, and where you don't have any access beyond the logs, you'll have to find another solution.
In this case, there are some pdb extensions that allow you to connect to the debugger using telnet, but that will be the subject of a future article.
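One way to avoid being caught off guard is to check whether a terminal is actually attached before dropping into the debugger. The following is only a sketch of that idea: `has_interactive_terminal` is a helper invented for the example, not something pdb provides, and launch() is a simplified version of the function used throughout this article.

```python
import io
import sys
from typing import TextIO


def has_interactive_terminal(stdin: TextIO, stdout: TextIO) -> bool:
    """Return True only if both streams are connected to a terminal."""
    return stdin.isatty() and stdout.isatty()


def launch() -> None:
    # Only stop in the debugger when someone can actually type commands.
    if has_interactive_terminal(sys.stdin, sys.stdout):
        breakpoint()
    print("running launch()")


# In-memory streams are not terminals, much like the redirected streams
# of a process whose only visible output is its logs.
print(has_interactive_terminal(io.StringIO(), io.StringIO()))
```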
The interface is not very user-friendly
As you may have noticed, the interface isn't really user-friendly, if we can even call it a real user interface. You have to remember where you are in your code, you can't quickly see the content of your current context, setting new breakpoints within (i)pdb is kinda hard...
This may be enough for some small applications, but once you start to debug more complex programs, this might become hard to use, if not impossible. In this case you may find some more advanced tools more useful, like pudb. pudb is a console-based visual debugger for Python that is absolutely awesome, but I'll save the presentation for a future article.