MariaZentsova

How to share Jupyter Notebook on GitHub with Poetry

Poetry is a dependency management and packaging tool for Python. It makes it easy to create and share data science work with others without worrying about version conflicts.

Poetry creates a virtual environment for each project, lets you add and track packages, and can even publish your work to PyPI. Fellow data scientists can reproduce a Jupyter notebook just by running poetry install, which recreates the same virtual environment.

1. Why use Poetry?

You might wonder why you should use Poetry at all. For me, there were several reasons.

  • Great dependency resolver
    With conda I sometimes ran into odd dependency issues that forced me to spend time on StackOverflow looking through posts for a fix. Poetry has a built-in dependency resolver that finds a compatible set of package versions when one exists.

  • Built for both dependency management and packaging
    Poetry not only tracks packages and resolves dependency conflicts, it also helps you publish packages to PyPI. So it's very helpful to learn if you plan to release your own package at some point.

  • Lots of people are using it
    Many great data science notebooks are packaged with poetry, so it is useful to understand how it works and to have it installed.

2. Installing poetry

The recommended way of installing poetry is via a custom installer, which will isolate the package from the rest of the system.

On macOS and Linux:

curl -sSL https://install.python-poetry.org | python3 -

On Windows:

(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -

To make sure it's installed correctly, let's check the poetry version.

$ poetry --version
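The same check can be done from Python, which is handy inside a setup script. This is a minimal sketch: the helper name poetry_version is made up for this example, and it only assumes that the installer placed a poetry executable somewhere on your PATH.

```python
import shutil
import subprocess


def poetry_version():
    """Return the output of `poetry --version`, or None if Poetry is not usable."""
    # shutil.which looks the executable up on PATH, like the shell would.
    executable = shutil.which("poetry")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


version = poetry_version()
print(version or "Poetry was not found on PATH")
```

If this reports that Poetry was not found right after installing, the installer's bin directory probably still needs to be added to PATH.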

3. Creating a new project

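The poetry new command scaffolds a new project. Poetry then creates a dedicated virtual environment for it the first time we add or install packages.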

$ poetry new jupyter-demo

The project consists of several files. The most important one is pyproject.toml, which lists all project dependencies.

[Screenshot: folder structure of the new Poetry project]

4. Installing Jupyter

To add packages to a Poetry project, we use the poetry add command.

$ poetry add jupyter ipykernel

Once Jupyter is installed, let's add a couple of packages often used in data science.

$ poetry add pandas tensorflow

To uninstall a package, we can use the poetry remove command.

$ poetry remove tensorflow

Poetry allows us to see details of a particular package installed in our virtual environment by using poetry show with the package name.

$ poetry show pandas

[Screenshot: output of poetry show pandas]

5. Track your packages

While we could always look up the packages in the pyproject.toml file, we can also list them on the command line: poetry show --tree displays all packages together with their dependencies.

[Screenshot: output of poetry show --tree]

To check whether we're using the latest versions available on PyPI, just run poetry show --latest.

[Screenshot: output of poetry show --latest]

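For compatibility with other tools, Poetry can also export dependencies to a requirements.txt file. On newer Poetry releases the export command comes from the poetry-plugin-export plugin, which may need to be installed first.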

$ poetry export -f requirements.txt --output requirements.txt
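If you want to inspect the exported file programmatically, a rough sketch like the one below can pull out the pinned versions. It assumes the export contains lines of the form name==version, optionally followed by an environment marker after a semicolon, and it skips comments and option lines such as --hash. The sample lines and the helper name pinned_versions are made up for illustration.

```python
# Illustrative requirements.txt content; package versions are assumptions.
SAMPLE_REQUIREMENTS = """\
# exported requirements
ipykernel==6.13.0 ; python_version >= "3.9"
jupyter==1.0.0 ; python_version >= "3.9"
pandas==1.4.2 ; python_version >= "3.9"
"""


def pinned_versions(requirements_text: str) -> dict[str, str]:
    """Map package names to pinned versions for `name==version` lines."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments and option lines (for example --hash=...).
        if not line or line.startswith("#") or line.startswith("--"):
            continue
        # Drop the environment marker and any trailing line continuation.
        requirement = line.split(";")[0].rstrip("\\ ").strip()
        if "==" not in requirement:
            continue
        name, version = requirement.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


pins = pinned_versions(SAMPLE_REQUIREMENTS)
for name, version in pins.items():
    print(f"{name} is pinned to {version}")
```

This is deliberately simple and not a full requirements parser; for anything beyond a quick look, a dedicated parsing library is the safer choice.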

6. Run the Jupyter notebook

To run any executable from our newly created environment, we use the poetry run command. Let's spin up a Jupyter server.

$ poetry run jupyter notebook

Copy one of the URLs from your command line and paste it into the browser.

[Screenshot: Jupyter server URLs in the terminal]

Now we have a Jupyter environment running and can create a new notebook. I've just created Test_notebook.ipynb in the jupyter_demo folder.

[Screenshot: the new notebook open in Jupyter]
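A quick way to confirm that the notebook really runs inside the project's virtual environment is to paste a cell like the following into it. This is a small sketch using only the standard library; the helper name describe_environment is just for this example. Comparing sys.prefix with sys.base_prefix is the usual way to detect a virtual environment.

```python
import sys


def describe_environment() -> dict:
    """Collect basic facts about the interpreter running this notebook."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a virtual environment, prefix differs from base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

When the server was started with poetry run jupyter notebook, the executable path should point into the environment Poetry created for the project rather than to the system Python.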

7. Publish Notebook on GitHub

Once the notebook is ready and the environment with all packages is created, it's time to share your work on GitHub!

First, we need to lock the package versions we use with the poetry lock command, which records them in the poetry.lock file, so people who work with our code can create the same virtual environment.

$ poetry lock

Once the packages are locked, you can initialise a local git repository, add all files and make the initial commit.

$ git init -b main
$ git add .
$ git commit -m "My first commit"

Then create a repository in the GitHub UI without initialising it (no README, .gitignore or license), and copy the repo URL.

$ git remote add origin <YOUR_REPO_URL>
$ git push -u origin main

This pushes all your files to GitHub, so other people can easily access your work and set up the same environment.
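Since reproducibility depends on pyproject.toml and poetry.lock being part of the repository, it can be worth checking that both exist in the project folder before sharing. The sketch below is one way to do that; the helper name missing_project_files is hypothetical, and the demo runs in a temporary folder so it does not touch a real project.

```python
import tempfile
from pathlib import Path

# Files that poetry install relies on to rebuild the environment.
REQUIRED_FILES = ("pyproject.toml", "poetry.lock")


def missing_project_files(project_dir) -> list[str]:
    """Return the required files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demo in a throwaway folder that only contains pyproject.toml.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "pyproject.toml").write_text("")
    demo_missing = missing_project_files(demo_dir)
    print("Missing files:", demo_missing)
```

In a real project you would call missing_project_files(".") from the project root and expect an empty list.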

8. Run notebook from existing poetry project

To recreate a notebook published by somebody else, we just need to clone the repo and set up the virtual environment by running poetry install.

$ git clone <YOUR_REPO_URL>
$ cd <your_repo>
$ poetry install

Once the files are cloned and the virtual environment is created, we can spin up the notebook, just like before.

$ poetry run jupyter notebook

This is how, in a few easy steps, we can share and reproduce Jupyter notebooks with Poetry. I hope this will inspire you to give Poetry a try!
