DEV Community

Discussion on: Publishing a Python Package: What I Wish the Maze of Tutorials Covered

Vincent A. Cicirello

The reason you can use an underscore but not a hyphen in the name is that it must be a valid identifier for the import statement, and hyphens are not allowed in Python identifiers. The directory name inside src must match, so it must not use hyphens or any other characters that are disallowed in identifiers.
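
As a quick sanity check of the identifier rule, Python's built-in str.isidentifier() reports whether a string could serve as a name in an import statement. Here is a minimal sketch; the helper name is_importable_name and the two sample names are purely illustrative:

```python
import keyword


def is_importable_name(name: str) -> bool:
    """Return True if `name` is a valid Python identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Hypothetical package names, one with an underscore and one with a hyphen.
for candidate in ("my_package", "my-package"):
    print(candidate, is_importable_name(candidate))
```

The underscore variant passes and the hyphen variant does not, which is the same constraint the directory name under src has to satisfy.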

As for why hyphens are otherwise preferred over underscores for the name used to install from PyPI, I'm not as certain, but I have a guess: it is likely due to the general preference for hyphens over underscores in URLs. See, for example, these guidelines from Google: developers.google.com/search/docs/.... They don't explain the reason for that preference, but it could be related to the fact that underscores are not allowed in domain names, so even though they are allowed in the rest of a URL, you get greater visual consistency by avoiding them there too.

This isn't unique to Python. I use Java more than Python. Java package names and module names can use underscores but not hyphens. When you publish Java artifacts to Maven Central, the artifact name is often the same as either the Java module (if modules are in use) or a Java package contained in the artifact, except that it uses hyphens rather than underscores where a separator is needed. I'm not sure whether underscores are actually disallowed in artifact names or whether hyphens are just a strong convention. The file name of a jar on Maven Central includes the artifact name, the version, and an identifier, all separated by hyphens, so using hyphens within the artifact name looks nicer because it is consistent with the rest of the filename.

It also has the benefit that, if you have a site dedicated to the project, you can use the artifact name in the domain when it uses hyphens, which you can't do with underscores. Here is an example: I have a Java library named rho-mu (with a hyphen), so the artifact name and corresponding jar file use a hyphen. But the jar contains a Java module named rho_mu with an underscore. The website for the project uses the artifact name in the domain: https://rho-mu.cicirello.org. An underscore would not have been allowed there even though it could otherwise be used elsewhere in the URL.

Bernd Wechner

Thanks for the considered appraisal. It is indeed likely that the - norms arise from the needs of URLs, since the package name ends up in a URL like: github.com/bernd-wechner/my-package

That said, you misread me a little, in that it is not the specifics of the whys and wherefores that are my central observation or complaint, so much as the enormous and unnecessary complexity that would-be contributors are exposed to, to this day, not least in a language that is currently at the arguable peak of its popularity.

But as you've given a moment to specifics, I will add some of the specific test results that led me to pull my hair out and write down these notes (results I did not include in the article, as it is long enough as is). Consider the two configurables: the name of the folder under src and the name declared in setup.cfg. There are 4 variations of - vs _ use to explore, and there are two outputs from build: a .tar.gz and a .whl file. Put this in the context of the official tutorial, in which:

name = example-pkg-YOUR-USERNAME-HERE

That is, they use - not _. Bear that in mind as you examine these four build outputs:

Using: src/my_package and name = my_package
Produces: my_package-0.1.tar.gz and my_package-0.1-py3-none-any.whl

Using: src/my-package and name = my_package
Produces: my_package-0.1.tar.gz and my_package-0.1-py3-none-any.whl

Using: src/my_package and name = my-package
Produces: my-package-0.1.tar.gz and my_package-0.1-py3-none-any.whl

Using: src/my-package and name = my-package
Produces: my-package-0.1.tar.gz and my_package-0.1-py3-none-any.whl

Key observations:

  1. Two scenarios produce both files using the _
  2. The use of name with - never works and yet is the recommended name in the official tutorial
  3. Of the two that build to the same apparent result (files using _), the second publishes fine but cannot be installed and used. Go figure.
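
Part of what makes these outputs look inconsistent may be that packaging tools normalize names. Below is a rough sketch, assuming the name normalization described in PEP 503 (used when comparing project names on the index) and the filename escaping described in PEP 427 (used for wheel filenames). The helper names are hypothetical and not part of any library:

```python
import re


def normalize_project_name(name: str) -> str:
    """PEP 503 style: collapse runs of '-', '_' and '.' into one hyphen, then lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def escape_wheel_component(name: str) -> str:
    """PEP 427 style: replace runs of characters other than word characters and '.' with '_'."""
    return re.sub(r"[^\w\d.]+", "_", name, flags=re.UNICODE)


# Both spellings refer to the same project once normalized for the index.
print(normalize_project_name("my_package"), normalize_project_name("my-package"))

# A hyphenated name is escaped to an underscore in a wheel filename.
print(escape_wheel_component("my-package"))
```

Under those assumptions a hyphenated name always becomes my_package in the .whl filename, which would be consistent with the four wheel names listed above. It does not by itself explain the install failure in the third observation.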

In conclusion, the official tutorial is one of the last havens we have in a world that is replete (as noted in my intro) with an already befuddling number of tutes and a cacophony of research material (to which, I admit, I've only added). Yet it is both a) wrong (it suggests using a name that does not work) and b) completely silent on the issue (let alone others, like how to define requirements).

On top of which, what befuddles me is that the tutorial provides no ready clues on how to contribute to improving it. In so many other contexts today, such material is anything from an open wiki to a page sporting feedback buttons or notes on how to help improve the documentation. This one is wrong, and the best it offers is a tiny "Found a bug?" link in the footer that jumps to an Issues list at:

github.com/pypa/packaging.python.o...

Given we've come this far (and are still in lockdown here ;-), I may just look at filing an issue or PRing a fix over there for the doc.

But the bemusement goes further. Setuptools, for example, has (finally) evolved to the point where you can use just a setup.cfg file, with a basic setup.py assumed if it's missing. The next step would be for build to simply assume a basic pyproject.toml if it's missing. For one of the most popular languages, and a community-based one at that, it would be nice if sharing packages became much, much easier.

Vincent A. Cicirello

Wow. It's weird that options that won't work actually produce something and in some cases even publish to PyPI. If you try to do the equivalent in Java, with the directory name or the package name or both, you'll get syntax errors when you compile.

Bernd Wechner

Totally agree. It's rather frustrating how complex it is and moreover that the official tutorial suggests something that plain doesn't work.