I've written a number of Python packages over time, really just for my own use, and I'm usually so snowed under just getting things done that every time I stopped to consider sharing one so it could be pip installed, I searched online and immediately got lost among the tens, maybe hundreds, of guides, tutorials and options: what looked like enough material for a doctoral dissertation to wade through. And all of it looked scary, talking about starting from scratch (not from an existing package), wanting me to add and write a dozen files and understand this and that, and worse, there are old ways, new ways, alternate ways ... Aaargh.
But hey, we're in a snap 3-day lockdown here (thanks COVID!) and I was tidying up some projects and tried again ... this time, experience in hand (and less hair), I figured I would put my blinkers on and use just the official tutorial:
And I was mildly pleased. I guess my expectations had dropped but hey, it proved manageable and I did succeed in publishing a few packages. But even this tutorial left me wasting far too much time trying to work out how stuff works and I wanted to write it down ASAP, for my own sake, and well, here we are ... share it.
So here's a better (IMHO) guide to packaging your Python project and publishing it (better because it says what I wanted to know, and I'm making different mistakes and it has its own shortcomings that I don't notice ;-).
1. Start with a package
Yep, I already have packages ... a good few of them. I want to publish them. Anyhow, a package of course is just a folder with an __init__.py file in it, and if it's a small package (as many of mine are) that's all it is. Nothing more, nothing less, than a folder with an __init__.py in it that provides some classes and/or functions.
Sometimes there might be a few more .py files in the folder beside the __init__.py. Minor detail, just what happens when it's a little too big to fit conveniently into the one file.
Key thing here is we're not starting from scratch, but have a package.
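To make the "folder with an __init__.py" point concrete, here is a minimal sketch that builds such a folder in a temporary directory and imports it. The package name and the greet function are hypothetical, made up purely for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name and contents, purely for illustration.
PACKAGE_NAME = "demo_sketch_package"
INIT_SOURCE = 'def greet(who):\n    return f"Hello, {who}!"\n'

with tempfile.TemporaryDirectory() as parent:
    # A package is nothing more than a folder holding an __init__.py.
    package_dir = Path(parent) / PACKAGE_NAME
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(INIT_SOURCE)

    # Make the parent folder importable, then import the package by name.
    sys.path.insert(0, parent)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module(PACKAGE_NAME)
        greeting = package.greet("world")
    finally:
        sys.path.remove(parent)

print(greeting)
```

Nothing here is specific to publishing; it only shows that the folder plus the one file is already a working package.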
2. Get the tools
You only need two tools as it happens and they are:
- build, which by default creates from your package a .tar.gz file and a .whl file, which are what pip wants/needs to install it
- twine, which publishes to pypi.org
and so for prep:
pip install build twine
Not too bad. Step 2 was easy.
3. Prep the few extra files a package wants
Not many, don't fret.
- README.md: a simple markdown file with a welcome message and whatever you want to add. What is this package, how do you use it? I use Typora but you can write it in any text editor and it can be as brief or in depth as you like. It is what's shown on the pypi.org page for your package, so it's also your ad, if you like, for your package.
- LICENSE.md: not sure you need it, but worth doing and easy as Py. I am beastly careless in this space and just love the Hippocratic License. Download the markdown version and save it as LICENSE.md.
That's it! And to think it seemed so scary in the past.
4. But no! There's more - just a little more, not much.
The tutorial recommends that you lay out your folder like this (yes, I've simplified it a bit):
the-folder-I-keep-it-in/
├── LICENSE
├── README.md
├── pyproject.toml
├── setup.cfg
└── src/
    └── my_package/
        └── __init__.py
The things to note are:
- You don't need to put it in a src folder, but why not? If it ain't broke don't fix it. The src strategy means you can just drag and drop your package from where it was into src ... done. The stuff above it is the publishing kit ...
- pyproject.toml and setup.cfg tell build and twine what to do. We'll come back to these shortly. The first is just a standard file to tell build that we're going to use twine in a roundabout way ;-) and the second one describes your package so twine can publish it (purists may argue with this neat division, but let them).
- the-folder-I-keep-it-in can have any name you like. It won't change a thing with the build or publish. I actually call it my-package (in this example). As to why, keep reading. It's just convenient, that's all.
- my_package should use underscores between words, yes, do it. There's a bizarre confusion in the Python world between my_package and my-package. What, why, when?
This is described nowhere, and I had to work this out with a lot of trial and error and hair pulling, alas. But here's what I got for you.
my_package: just stick to this, don't waver, never waver, use only this ;-). I kid ye not. Using my-package in either the folder under src or in setup.cfg will cause you grief during the build, publish, install and test.
Once it's published it will appear on pypi.org as my-package and people will install it with pip install my-package, but use it with import my_package. That's just the way it is, that's the convention, don't rock the boat. All you need to know is that you don't have to lift a finger to make that happen, just stick with my_package in the src folder and in setup.cfg.
But of course, the-folder-I-keep-it-in is irrelevant here and I call it my-package just because that's what the package is called. The only other exception is the GitHub repo if you're using one (and I do); that too can be my-package, and is in my case. In fact, later you'll see I can exploit that for a nice two-line install script.
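For the curious, the my_package / my-package split can be sketched in a few lines. This is only an illustration under stated assumptions: the normalization rule is the one from PEP 503 (package indexes treat runs of -, _ and . in a project name as equivalent), and the identifier check is plain Python. The function names are my own, not part of any packaging tool.

```python
import keyword
import re


def normalize_project_name(name: str) -> str:
    """PEP 503 style normalization: collapse runs of -, _ and . into one hyphen."""
    return re.sub(r"[-_.]+", "-", name).lower()


def is_importable_name(name: str) -> bool:
    """True if the name could appear after the import keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ("my_package", "my-package"):
    print(candidate, normalize_project_name(candidate), is_importable_name(candidate))
```

Both spellings normalize to the same project name for pip install, but only the underscore spelling is a valid name for import, which is consistent with the advice to use my_package for the folder.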
pyproject.toml is easy. Just copy the standard. Put this in it:
[build-system]
requires = [
    "setuptools>=42",
    "wheel"
]
build-backend = "setuptools.build_meta"
and be done with it. Ask no more. It's build internals, and unless you're super keen on digging deeper, let it rest. This just means that when you run python3 -m build in your package folder, it knows what to do (if you don't have this file it will ask for one). What it does is create a dist folder and drop two files in it. These are what twine uploads.
setup.cfg is not hard either and here's my minimalist take and the clarifications that I felt were missing elsewhere:
[metadata]
name = my_package
version = 0.1
author = my name
author_email = my email address
description = My little package
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/me/my-package
project_urls =
    Bug Tracker = https://github.com/me/my-package/issues
classifiers =
    Programming Language :: Python :: 3
    License :: Freely Distributable
    Operating System :: OS Independent
    Development Status :: 4 - Beta
    Framework :: Django :: 3.2
    Intended Audience :: System Administrators
    Topic :: Software Development :: Libraries :: Python Modules

[options]
install_requires =
    other_package1 >= 0.1.1
    other_package2 >= 2.0.1
package_dir =
    = src
packages = find:
python_requires = >=3.6

[options.packages.find]
where = src
And here's what I felt I should have known:
- name should be my_package, not my-package. Just believe me. Things go weird if it says my-package. Experiment if you like; I wish I hadn't needed to and that the tutorial was clear here.
- install_requires wants one indented line per requirement with relatively familiar syntax (similar to the output of pip freeze, another one of those mysteriously named Python commands that actually means pip show-me-whats-installed). This is completely missed in the tutorial.
- package_dir is weird, yes, but forget it. Like install_requires it has a list of one liners beneath it, in this case just one. The one liners map package names to folders somehow in the internal complexities of setuptools, details most of us don't care about or want to know about when publishing our simple one-file package. The tutorial tells us that this line maps the "noname" package to the src folder, and that the "noname" package (that nothingness before the = sign) is a code name for the overarching root package, so the src folder becomes the mystical "root package". Do most of us actually care about this? What is a "root package" anyhow? Nah, let's leave it for the boffins, and just accept this is the odd way of telling twine that our package is in the src folder.
- packages = find: has nothing missing after find:. No. That's just the syntax, live with it. Refer back to the intro, re: my sentiments on the unnecessarily befuddling, cryptic nature of Python package publication ... Ditto the where = src, just accept it.
- The classifiers are a bit fiddly; they have to come from the list of allowed classifiers. And they bothersomely lack a clear way of saying you're using the Hippocratic License (which I just happen to love).
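If you want to sanity check the one-indented-line-per-entry syntax before building, here is a minimal sketch. It assumes only that setup.cfg is an INI-style file that the standard library configparser can read; the trimmed-down config text and the list_setting helper are my own illustration, not something build or twine provides.

```python
from configparser import ConfigParser

# A trimmed-down copy of the setup.cfg above, inlined so the sketch is self-contained.
SETUP_CFG = """\
[metadata]
name = my_package
version = 0.1

[options]
install_requires =
    other_package1 >= 0.1.1
    other_package2 >= 2.0.1
package_dir =
    = src
packages = find:

[options.packages.find]
where = src
"""


def list_setting(parser: ConfigParser, section: str, key: str) -> list:
    """Split a multi-line setting into its non-empty, stripped lines."""
    raw = parser.get(section, key, fallback="")
    return [line.strip() for line in raw.splitlines() if line.strip()]


parser = ConfigParser()
parser.read_string(SETUP_CFG)

requirements = list_setting(parser, "options", "install_requires")
package_dir = list_setting(parser, "options", "package_dir")

print(parser.get("metadata", "name"), requirements, package_dir)
```

Each indented line under install_requires comes back as one requirement, and the odd-looking "= src" line is simply the single entry under package_dir.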
6. The Importance and Catches with Testing
Publishing is as simple as:
python3 -m twine upload dist/*
BUT, it's committal. Once you've published there appears to be no way of undoing it, and it consumes the filenames you used (which also means the version you have in setup.cfg, as that gets built into the filenames in dist).
And so, testing first is critical. And pypi.org provides testpypi at https://test.pypi.org/ that you can publish to freely, as often as you need to get it right.
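To see why the version gets used up along with the filenames, here is a rough sketch of the two names build drops into dist. It assumes a pure-Python package (hence the py3-none-any wheel tag) and the usual name-version naming pattern; the function is hypothetical and your own files may carry different tags.

```python
from typing import List


def expected_dist_files(name: str, version: str) -> List[str]:
    """Guess the source archive and wheel filenames for a pure-Python package."""
    stem = f"{name}-{version}"
    # Assumption: a pure-Python wheel, tagged py3-none-any.
    return [f"{stem}.tar.gz", f"{stem}-py3-none-any.whl"]


print(expected_dist_files("my_package", "0.1"))
```

Since the version is part of both filenames, the only way to get fresh filenames is a fresh version number.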
The main things that demand a retry are in my experience:
- You look at it on pypi and README.md has issues. Either typos, or code lines that are too long and render badly, etc. Either way, you get to see how it's going to be presented on pypi and can adjust your README to look nice.
- Your test install with pip doesn't work. Which actually doesn't happen now that I have a workflow, but happened a lot while I was trying to work out all that setup.cfg syntax that the tutorial deigns to gloss over.
To publish to the test site it's just a small variant:
python3 -m twine upload --repository testpypi dist/*
So testing is great. A lifesaver. But it caused me some modest grief too (the flip side of the same coin).
Firstly, you need to create an account on the site, and I did that, but I always use Bitwarden to generate large random passwords for me, a habit (that we should all have).
twine, when used as above, prompts for username and password. These long random passwords of mine are not easy to type, so I usually copy/paste, but alas pasting the password does not work. I tried and tried.
Fortunately they can be provided on the command line as in:
python3 -m twine upload --repository testpypi -u $username -p $password dist/*
and I saved this in a file called test-publish that reads:
#!/bin/bash
# Load $username and $password, then upload everything in dist/ to testpypi
source ~/.auth/pypi.auth
python3 -m twine upload --repository testpypi --verbose -u "$username" -p "$password" dist/*
Secondly, you can't republish. At all. You need to increment the version in setup.cfg and rebuild before you can republish. Slows things down some. Not least because of the time and energy spent searching online for ways and means to republish. Some online sources suggest --skip-existing does the trick, but it doesn't, not for me, and it's not clear what it does or what it's for, and maybe I just misread that. C'est la vie.
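Since every republish needs a new version, bumping it can be scripted. This is a minimal sketch, assuming the version line looks like the one in the setup.cfg above (plain dotted numbers); bump_version is a hypothetical helper of my own, not something build or twine provides.

```python
import re

VERSION_LINE = re.compile(r"^(version\s*=\s*)(\d+(?:\.\d+)*)[ \t]*$", re.MULTILINE)


def bump_version(cfg_text: str) -> str:
    """Increment the last numeric component of the first version line."""

    def _bump(match):
        parts = match.group(2).split(".")
        parts[-1] = str(int(parts[-1]) + 1)
        return match.group(1) + ".".join(parts)

    return VERSION_LINE.sub(_bump, cfg_text, count=1)


example = "[metadata]\nname = my_package\nversion = 0.1\n\n[options]\n"
print(bump_version(example))
```

To use it for real you would read setup.cfg, pass the text through bump_version and write it back before running the build script.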
Thirdly, the dependencies listed under install_requires in setup.cfg don't work, presumably because, when testing, the required packages aren't on https://test.pypi.org/. But it took a bit of head scratching and try and try again to convince myself of that, as I was trying, believe it or not, to validate the syntax for just that setting, as it's not described in the tutorial, which quickly sent me looking at that warren of other sources again. I do wish that testpypi would look at pypi for requirements as a fallback so this test cycle could be complete.
7. A Standard Workflow
OK, so having gone through that all now, like most folk eventually do, I have a standard template (the last package I published). I now routinely use five tiny little two line shell scripts to make life easy for myself.
Basically a build script, and two publish and install scripts.
A script to build:
#!/bin/bash
# Clear out old builds (-f so an empty dist doesn't complain), then build afresh
rm -f dist/*
python3 -m build
A script to test publishing:
#!/bin/bash
# Load $username and $password, then upload everything in dist/ to testpypi
source ~/.auth/pypi.auth
python3 -m twine upload --repository testpypi --verbose -u "$username" -p "$password" dist/*
A script to install the test publish (test installing) - noting that errors here about requirements that cannot be met are expected:
#!/bin/bash
# The package name is the name of the folder this script lives in
package=$(basename "$(dirname "$(readlink -f "$0")")")
python -m pip install --index-url https://test.pypi.org/simple/ "$package"
A script to publish properly:
#!/bin/bash
# Load $username and $password, then upload everything in dist/ to pypi proper
source ~/.auth/pypi.auth
python3 -m twine upload --verbose -u "$username" -p "$password" dist/*
A script to install the package properly:
#!/bin/bash
# The package name is the name of the folder this script lives in
package=$(basename "$(dirname "$(readlink -f "$0")")")
python -m pip install "$package"
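The basename/dirname/readlink line in the install scripts is the trick that makes naming the folder my-package pay off: the script works out the package name from the folder it lives in. Here is the same idea as a small Python sketch; the function name and the example path are hypothetical.

```python
from pathlib import Path


def package_from_script(script_path: str) -> str:
    """Return the name of the folder that holds the given script."""
    # resolve() follows symlinks, much like readlink -f in the shell version.
    return Path(script_path).resolve().parent.name


print(package_from_script("/home/me/projects/my-package/install"))
```

Inside a real script you would pass __file__ rather than a hard-coded path.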
A basic example of that together you can visit at:
and see here:
I hope that helps someone save all the learning hassle, and publish something easily, by just adding 4 files to a folder (a README.md to write, a LICENSE.md to download, a pyproject.toml to copy, and a setup.cfg to tune) and maybe 5 tiny little helper bash scripts, and in no time a test and then publish cycle is underway.
Top comments (6)
The reason you can use an underscore but not a hyphen in the name is that it must be a valid identifier for the import statement, and hyphens are not allowed in Python identifiers. The directory name inside src must match so must not use hyphens or any other characters that are disallowed in identifiers.
As for why hyphens are otherwise preferred over underscores for the name used to install from pypi, I'm not as certain but I have a guess.... It is likely due to the general preference of hyphens instead of underscores in URLs. See for example these guidelines from Google: developers.google.com/search/docs/.... They don't explain why that is the preference but it could be related to the fact that underscores are not allowed in domains, so even though they are allowed in the rest of a URL you have greater visual consistency if you also avoid them in the rest of a URL.
This isn't unique to Python. I use Java more than Python. Java package names and module names can use underscores but not hyphens. Generally, when you publish Java artifacts to Maven Central, the artifact name is often the same as either the Java module (if modules are in use) or a Java package contained in the artifact except using hyphens rather than underscores if you have a reason to use either. I'm not actually sure if underscores are actually disallowed in artifact names or if it is a strong convention to use hyphens. The file name of a jar on Maven Central includes the artifact name, the version, and an identifier all separated by hyphens, so by using hyphens in an artifact name when separation is needed looks nicer since it is consistent with rest of filename.
It also has the benefit that if you have a site dedicated to it, you can use the artifact name in the domain if you use hyphens, which you can't do if there are underscores. Here is an example.... I have a Java library named rho-mu (with a hyphen) so the artifact name and corresponding jar file use a hyphen. But the jar contains a Java module named rho_mu with an underscore. The website for the project uses the artifact name in the domain: https://rho-mu.cicirello.org. An underscore would not have been allowed there even though it could otherwise be used elsewhere in the URL.
Thanks for the considered appraisal. It is indeed likely that the - norms arise out of a need for URLs, so the package name for example is needed in a URL like: github.com/bernd-wechner/my-package
That said, you misread me a little, in that it is not the specifics of the wherefores and whys that are my central observation or complaint, so much as the enormous unnecessary complexity that would-be contributors are exposed to to this day, not least in a language that is currently at the arguable peak of popularity.
But as you've given a moment to specifics I will add some of my specific test results that led me to pull my hair out and write down these notes (results which I did not include as the article is long enough as is). Consider the two configurables, the name of the folder under src and the same declared in setup.cfg. There are 4 variations to explore of - and _ use, and there are two outputs from build: a .tar.gz and a .whl file. Put this in the context of the official tutorial in which:
That is, they use - not _. Bear that in mind as you examine these four:
name = my_package
name = my_package
name = my-package
name = my-package
Using - never works and, yes, is the recommended name in the official tutorial
_) the second publishes fine but cannot be installed and used. Go figure.
In conclusion the official tutorial, one of the last havens we have in a world that is replete (as noted in my intro) with an already befuddling number of tutes and a cacophony of research material (to which I've only added, I admit), is both a) wrong (it suggests using a name that does not work) and b) completely ignores the issue (let alone others, like how to define requirements).
On top of which, befuddling to me is how that tutorial provides no ready clues on how to contribute to improving it. In so many other contexts today, such material is anything from an open wiki to sporting feedback buttons or notes on how to help improve the documentation. This one is wrong and the best it offers is a tiny "Found a bug?" link in the footer that jumps to an Issues list at:
Given we've come this far (and are still in lockdown here ;-) I may just look at filing an issue or PRing a fix over there for the doc.
But the bemusement goes further. Setuptools for example has (finally) evolved to the point where you can use just a setup.cfg file, with a basic setup.py assumed if it's missing. Next step will be for build to simply assume that basic pyproject.toml if it's missing. For one of the most popular languages, and a community-based one at that, it would be nice if sharing packages became much much easier.
Wow. That's weird that options that won't work actually produce something and in some cases even publish to pypi. If you try to do the equivalent in Java, either directory name or package name or both, you'll get syntax errors when you compile.
Totally agree. It's rather frustrating how complex it is and moreover that the official tutorial suggests something that plain doesn't work.
Not being allowed to replace or remove a version is also not just a pypi thing. Maven Central also doesn't allow this. Once it is public, other packages might depend on it. Removing or even replacing it can then break other people's projects.
That's all good and well, and easy enough to understand, but still falls short of awesome ;-). There's public and there's public. In the extreme, there's public and got lots of people using it, and there's public just published now and ooops, made a mistake, let's fix it.
To help with the latter case testpypi was born and that rocks! And yet it falls short of awesome too, as we cannot test the install_requires there (that could be fixed by having pip more smartly try pypi if testpypi doesn't have a package, easily generalised to: if the repository is testX and an install_requires package cannot be found, try the repository X).
But pypi could also be smarter. Allowing for two steps like many publishing media do. Push to pypi (visible publicly perhaps, maybe installable only with your account credentials) and then Releasing, making fully public. OR alternately keeping track of all installs (downloads and from where the request came) and if there are no downloads from source IPs different to the one that uploaded, then allow an overwrite (an oops style fix).
All just thoughts in the stunning and still very surprising complexity of publishing Python packages.