An Introduction to Using Jupyter Notebooks

#jupyter #collaboration #python

Jupyter Notebooks are a popular way to collaborate and share code, reports and data analyses. Unfortunately, using them is fundamentally different from using both ordinary websites and conventional IDEs, so the first encounter can be confusing. This article gives you a quick overview of what you need to know to use an existing Notebook successfully.

Who is this for?

This introduction is aimed at people with a modest technical background. I use some very basic Python code in my running example, but all of the concepts explained here are independent of that.

I was never able to find an introduction to Jupyter Notebooks that jumps right to the usage. Instead, readers first have to plow through different sections on conceptual ideas, setup and creating content. You will definitely need this knowledge once you decide to create and share your own Notebooks, and I can highly recommend the introduction to Jupyter on realpython.com and the official documentation. But for now let’s focus on how to use that wonderful Notebook you found online.

Running Example

We will use a small Jupyter Notebook I’ve created for this introduction. It is shared through Binder. Binder takes Jupyter Notebooks that are stored in a public Git repository and provides a web server and some resources to run them. If you are interested in publishing your own Notebooks with Binder, I can recommend this well-written guide on Medium.

The running example is self-contained and contains some code examples, so you can just use that, but this article will give you more details on usage and behavior.

Architecture of Jupyter Notebooks

A Notebook consists of multiple cells. Each cell can be run individually and in arbitrary order, or you can run all cells one after the other.

A cell is either text or code: If you run a text cell, the markdown contained in that cell is rendered and displayed. If you run a code cell, the code in that cell is executed on the current state of the Notebook. Any output of that cell is displayed right below the cell.
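To make the cell structure more tangible, here is a minimal sketch of how a Notebook file stores its cells. An .ipynb file is a JSON document whose "cells" list holds entries with a "cell_type" of either "markdown" or "code". The notebook content and the helper name summarize_cells are made up for illustration; only the JSON layout follows the Notebook file format.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 JSON layout.
# The cell contents are invented for this illustration.
NOTEBOOK_JSON = """
{
  "cells": [
    {"cell_type": "markdown", "metadata": {},
     "source": ["# Hello\\n", "Some *text*."]},
    {"cell_type": "code", "execution_count": null, "metadata": {},
     "outputs": [], "source": ["x = 1 + 1\\n", "x"]}
  ],
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 4
}
"""


def summarize_cells(notebook_text):
    """Return (cell_type, source) pairs for every cell in the notebook."""
    notebook = json.loads(notebook_text)
    return [
        (cell["cell_type"], "".join(cell["source"]))
        for cell in notebook["cells"]
    ]


for cell_type, source in summarize_cells(NOTEBOOK_JSON):
    print(f"[{cell_type}]")
    print(source)
```

The text cell only carries markdown source, while the code cell additionally has slots for its outputs and its execution number.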

First: Running Code

You can run a selected cell by hitting Shift+Enter or clicking the Run ▶-Button in the toolbar.

To signal that the cell was run, the Notebook will add a number such as [1]: to the left of the cell. This number increases with each run and shows whether and in which order cells were run. In addition, the Notebook selects the next cell. This makes it easy to rapidly execute multiple cells one after another: Just quickly press Shift+Enter multiple times.

Second: Understanding the Notebook’s Global State

Each Notebook has a single state that is shared between all cells. This state lives in the Notebook's kernel, the process that actually executes the code. Whenever you execute a cell, it modifies that state by running functions and setting variable values. Usually, the cells of a Notebook are meant to be executed top-to-bottom, but a cell's position in the Notebook has no influence on the program state: only the order in which cells are executed does! Since each cell works on the current global state, running the same cell multiple times may produce different results if its code depends on the global state.

Take the example in the image above: the cell in the example executes the Python code below.

try:
    run_count += 1
except NameError:
    run_count = 0

On the first execution, the variable run_count does not exist, so Python raises a NameError and the except branch sets run_count = 0. In subsequent executions of the same cell, run_count does exist and the cell increases its value by 1. This small example shows how a single cell can depend on the global state and show different behavior across multiple executions. Make sure to remember this when you play around with Notebooks. The running example contains one more example that illustrates the importance of execution order.
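If you want to reproduce this behavior outside of a Notebook, the following sketch imitates it in plain Python. It runs the cell's source several times with exec on one shared dictionary, which stands in for the global state. The names CELL_SOURCE and run_cell_repeatedly are my own, and a real kernel is of course more involved than a single dictionary.

```python
# The source of the cell from the example, stored as a string.
CELL_SOURCE = """
try:
    run_count += 1
except NameError:
    run_count = 0
"""


def run_cell_repeatedly(source, times):
    """Run the same cell source several times on one shared namespace."""
    namespace = {}  # stands in for the Notebook's global state
    history = []
    for _ in range(times):
        exec(source, namespace)  # every run sees the state left by the last one
        history.append(namespace["run_count"])
    return history


print(run_cell_repeatedly(CELL_SOURCE, 4))  # [0, 1, 2, 3]
```

The code of the cell never changes, yet every run leaves a different value behind.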

To wrap this up, it is very important you understand two things:

The order of cell execution is important when you start to experiment with Notebooks.
A single cell may be executed multiple times and will always work on the current global state.

Each time a code cell is run, the Notebook puts an increasing number in brackets to the left of the cell, for example [4]:. This number shows you the order in which the cells were run and makes it easy to check whether everything was run in the intended order.
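Both points, and the role of the bracketed number, can be sketched with a tiny stand-in for a kernel. TinyKernel, the three cells, and run_in_order are hypothetical names invented for this illustration. The sketch is heavily simplified and only assumes that a kernel keeps one namespace and one counter.

```python
class TinyKernel:
    """A toy model of a kernel: one shared namespace plus a run counter."""

    def __init__(self):
        self.namespace = {}
        self.execution_count = 0

    def run(self, source):
        """Execute a cell's source and return the number shown next to it."""
        exec(source, self.namespace)
        self.execution_count += 1
        return self.execution_count


# Three made-up cells that depend on each other through the variable price.
CELL_A = "price = 10"
CELL_B = "price = price * 2"
CELL_C = "total = price + 1"


def run_in_order(cells):
    """Run the cells on a fresh kernel; return their labels and the total."""
    kernel = TinyKernel()
    labels = [f"[{kernel.run(source)}]:" for source in cells]
    return labels, kernel.namespace["total"]


print(run_in_order([CELL_A, CELL_B, CELL_C]))          # total is 21
print(run_in_order([CELL_A, CELL_C, CELL_B]))          # total is 11
print(run_in_order([CELL_A, CELL_B, CELL_B, CELL_C]))  # total is 41
```

The same three cells give three different totals, depending only on the order and number of runs. The labels [1]: to [4]: are what lets you reconstruct that order afterwards.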

Third: Modification

All cells in a Jupyter Notebook can be modified. You can add a new cell by pressing Alt+Enter, which runs the selected cell and inserts an empty one below it. So go ahead, just change some code and run it! Notebooks are all about exploring and experimenting.

Saving changes

One last word of advice: When you use a service like Binder, you use a temporary instance of a Jupyter Notebook. This is a good thing, because you can change whatever you want and it will have no effect on the original Notebook. The flip side is that your changes will not be saved on the web, because your instance is deleted after some inactivity. To save your changes, you have to download the Notebook’s content through the menu: File→Download as. This gives you a selection of different formats: you can download the Notebook as-is (Notebook); you can download the content as a Python file (Python); or you can download the content as text in different formats (e.g., AsciiDoc, HTML, or LaTeX).
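To give you a rough idea of what the difference between the Notebook and Python formats means, here is a small sketch that pulls only the code cells out of a Notebook file and joins them into one script. This is not how Jupyter's own export is implemented, and the real Python download differs in detail. The function code_cells_as_script and the example notebook are assumptions made for this illustration.

```python
import json


def code_cells_as_script(notebook_text):
    """Join the source of all code cells of a notebook into one script."""
    notebook = json.loads(notebook_text)
    chunks = []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue  # text cells are skipped in this simplified sketch
        source = cell["source"]
        # The file format stores source as a list of lines or a single string.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks) + "\n"


# A made-up notebook with one text cell and two code cells.
EXAMPLE_NOTEBOOK = json.dumps({
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "execution_count": 1, "metadata": {},
         "outputs": [], "source": ["x = 2\n", "y = x * 3"]},
        {"cell_type": "code", "execution_count": 2, "metadata": {},
         "outputs": [], "source": "print(y)"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
})

print(code_cells_as_script(EXAMPLE_NOTEBOOK))
```

Note that a script like this always runs top-to-bottom, so the execution order you used in the Notebook is not preserved.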

Disclaimer: “Jupyter” and the Jupyter logos are trademarks or registered trademarks of NumFOCUS.