DEV Community

Cover image for Dead Simple Python: Project Structure and Imports

Dead Simple Python: Project Structure and Imports

Jason C. McDonald on January 15, 2019

Like the articles? Buy the book! Dead Simple Python by Jason C. McDonald is available from No Starch Press. The worst part of tutorials is alwa...
Collapse
 
sandordargo profile image
Sandor Dargo

Great article, Jason

I cannot figure out a problem, maybe you can share your ideas.

I have the following structure:

.
├── cmake_project_creator
│   ├── dependency.py
│   ├── directory_factory.py
│   ├── directory.py
│   ├── include_directory.py
│   ├── __init__.py
│   ├── project_creator.py
│   ├── source_directory.py
│   └── test_directory.py
├── examples
│   ├── dual.json
│   ├── nested_dual.json
│   └── single.json
├── README.md
└── tests
    ├── __init__.py
    ├── test_dependency.py
    ├── test_directory_factory.py
    ├── test_directory.py
    ├── test_include_directory.py
    ├── test_project_creator.py
    ├── test_source_directory.py
    └── test_test_directory.py
Enter fullscreen mode Exit fullscreen mode

The main entry point is cmake_project_creator/project_creator.py asking for a couple of parameters.

If I try to invoke it from Pycharm, everything is fine.
The tests running by nosetests --with-coverage --cover-erase running fine. But if I try to invoke cmake_project_creator/project_creator.py from the terminal, this is what I get:

sdargo@mymachine (master) ~/personal/dev/project_creator $ python cmake_project_creator/project_creator.py -c
Traceback (most recent call last):
  File "cmake_project_creator/project_creator.py", line 6, in <module>
    from cmake_project_creator import directory_factory
ModuleNotFoundError: No module named 'cmake_project_creator'
Enter fullscreen mode Exit fullscreen mode

Do you have any idea what can be the issue?

Collapse
 
codemouse92 profile image
Jason C. McDonald

Absolutely. Your package needs a dedicated entry point for any imports off cmake_project_creator to work.

Add __main__.py to cmake_project_creator/. Your __main__.py file should look something like this:

from . import project_creator

if __name__ == "__main__":
    project_creator.WHATEVER_YOUR_INITIAL_FUNCTION_IS

Then, you can invoke the package directly with...

python3 -m cmake_project_creator

Collapse
 
sandordargo profile image
Sandor Dargo

Thanks a lot, Jason! This partly solved my problem.

Now I can run for example python3 -m cmake_project_creator -c where -c is a parameter and it works like a charm. But after adding the correct shebang and execution rights, I still cannot simply run ./cmake_project_creator/project_creator.py -c as I have the same failure of :

Traceback (most recent call last):
  File "./cmake_project_creator/project_creator.py", line 8, in <module>
    from cmake_project_creator import directory_factory
ModuleNotFoundError: No module named 'cmake_project_creator'

Do I really have to manipulate sys.path for that?

Thread Thread
 
codemouse92 profile image
Jason C. McDonald • Edited

As a rule, never ever ever ever EVER manipulate sys.path to solve Python import issues. It has some pretty serious and nasty side-effects that can break other programs.

You shouldn't invoke modules within a package like this. Instead, I'd recommend adding command-line argument support to your __main__.py, via argparse.

With __main__.py becoming the dedicated entry point, you should update it further to have a dedicated main() function, like this:

def main():
    # Do whatever....

if __name__ == '__main__':
    main()

The sole entry point to your package should be python3 -m cmake_project_creator, or an entry point script that invokes cmake_project_creator.main()

Thread Thread
 
sandordargo profile image
Sandor Dargo

Ok, thanks. Yes, I've been already using argparse to get the CL arguments.
So one option is to use the -m option and the other way I managed to make it work is to add the repo-root to the PYTHONPATH, which could be done by a setup.py and most probably it would be OK to have it in a virtualenv.

Thanks once more!

Thread Thread
 
codemouse92 profile image
Jason C. McDonald

Well, like I said, changing the path is always wrong. Yes, even in a virtualenv, especially since you can't guarantee that it'll always be run in one by another user. So, you only have one option, being the one I described. But, shrug, I've said my piece.

Thread Thread
 
sandordargo profile image
Sandor Dargo • Edited

I got your point and at the same time, in general, I don't believe in "having only one option". My problem with invoking a product with -m is twofold. One, it's not at all user-friendly, and the other is that it's leaking an abstraction. The product is implemented as a module with that name.

Following your recommendation not to change any path variable, I found to way to overcome this.
1) I wrap the python3 -m cmake_project_creator into a shell script. As such users don't have to bother with -m, not even with pretending the module or script name with python3. On the other hand, it's not very portable (what about Win users for example?), this might or might not be acceptable. In my case, it would be.
2) I managed to invoke the module with runpy.run_module("cmake_project_creator", run_name='__main__') from another python script that given a correct shebang I can simply call ./run.py <args>. To me this seems ideal as I keep the invocation (from a user perspective) as simple as possible and as portable as possible and I encapsulate both the module name and the fact that the product is implemented as a module.

PS: The product is going to be completely free, with the word product I only want to emphasize that it's meant to be used by people who might not even know with python -m is or python at all.

Thread Thread
 
codemouse92 profile image
Jason C. McDonald

That's why you have an entry point script, or even several, as I alluded to earlier. You can use your setup.py to provide those, and those scripts can even be named such as you describe. But editing the Python path is still always wrong, for technical reasons.

Python quite often is meant to have only one right way of doing something. The language is built that way.

As I haven't yet been able to write the article on setup.py, please read this article by Chris Warrick: chriswarrick.com/blog/2014/09/15/p...

Thread Thread
 
sandordargo profile image
Sandor Dargo

Thanks for the recommendation. I'm definitely going to read it as that's pretty much my next thing to do, understand what I need to put in the setup.py. Thanks again!

Collapse
 
thehoffy73 profile image
Ashley Hoff • Edited

Hi!
Firstly great article. This has been one of the clearest examples of how it should be done. Thanks.

I am not sure whether this is an edge case, but I have a structure that looks like this:

generateandsend/
├── __init__.py
├── __main__.py
│
├── generatedata/
│   ├── __init__.py
│   └── generate_data.py
│ 
├── senddata/
│   ├── __init__.py
│   └── send_data.py
│ 
├── utilities/
│   ├── __init__.py
│   └── local_utilities.py
│ 
├── Readme.md
└── License.md

Both generate_data.py and send_data.py reference functions in local_utilities.py.

The issue I have, more often then not, I would be calling send_data.py or generate_data.py

I know that if I call either them specifically, I will need to add a reference to be able to import local_utilities.

Does this go against the general accepted practice? Would it be better to either separate them into different projects (I would like to keep all the code together) or use an argparser in __main__ and call the respective module using args?

Thanks
Ashley

Collapse
 
codemouse92 profile image
Jason C. McDonald • Edited

Hi Ashley,

Sorry for the tremendous delay in reply. So, just to be clear, you're wanting to be able to call generate_data.py and send_data.py directly, and those are supposed to be able to import a module from elsewhere in the project?

If so, I would actually consider why you want to execute those modules directly. If you're simply wanting to be able to execute the two separately from the command-line, it may be worth fleshing out __main__.py to accept a command-line argument, so python3 -m generateandsend send or python3 -m generateandsend generate will execute what you want. That'll also be the easiest solution. That way, you're always executing the top-level package (generateandsend)

In fact, I'm not entirely sure off the top of my head how to get multiple projects to talk to one another within a shared directory! I know it has to do with PYTHONPATH, but I think that will necessitate more research on my part. ;)

Collapse
 
thehoffy73 profile image
Ashley Hoff

Thanks for replying (& no problem on the delay - we all have a life to live!).

I have thought about this more and agree - Why is it that I want to call them separately, where a parameter will suffice. So, I have abandoned the idea and gone with the python3 -m generateandsend generate approach

Cheers for the reply though. Appreciated.

Collapse
 
thehoffy73 profile image
Ashley Hoff

I also have one more question - if I wanted to include an ini/configuration file in a resources folder, how would I import it?

Thanks

Collapse
 
codemouse92 profile image
Jason C. McDonald • Edited

I like to put all such non-code files in a project subdirectory (not a package) called resources, and then use the built-in package pkg_resources to access it.

For example, in my omission project, the module omission/game/content_loader.py needs to load the text file omission/resources/content/content.txt. I do that with...

import pkg_resources

class ContentLoader(object):

    def __init__(self):
        """
        Open the file and load the contents in.
        """

        # ...

        path = pkg_resources.resource_filename(
            __name__,
            os.path.join(os.pardir, 'resources', 'content', 'content.txt')
        )

        with open(path, 'rt', encoding='utf-8') as content_file:
            raw_content = content_file.read()

        # ...

Simple as that!

P.S. If you find yourself needing to access files outside of your project directory, say, in the user's home directory, I recommend the package appdirs.

Thread Thread
 
thehoffy73 profile image
Ashley Hoff

Again, thanks for the reply.

I've had a play with this one. Considering I am dealing with an ini file, it appears that configparser does what I want. This is the snippet I've come up with:

def conf():
    config = configparser.ConfigParser(converters={'list': lambda x: [i.strip() for i in x.split(',')]},
                                       allow_no_value=True)
    config.read('generateandsend/Resources/generateandsend.ini')
    section = config['test']
    string_a = section.get('StringA', None)
    string_b = section.get('StringB', None)

    return string_a, string_b

Is hard coding the relative path in that way frowned apon?

Thread Thread
 
codemouse92 profile image
Jason C. McDonald • Edited

Is hard coding the relative path in that way frowned apon?

Most certainly, especially because you have to account for differences in path format between operating systems.

I'd recommend incorporating pkg_resources into your approach above.

def conf():
    config = configparser.ConfigParser(converters={'list': lambda x: [i.strip() for i in x.split(',')]},
                                       allow_no_value=True)
    path = pkg_resources.resource_filename(
            __name__,
            os.path.join(os.pardir, 'Resources', 'generateandsend.ini')
        )
    config.read(path)
    section = config['test']
    string_a = section.get('StringA', None)
    string_b = section.get('StringB', None)

    return string_a, string_b

I believe that will work? You'll have to check how config.read() handles an absolute path.

Thread Thread
 
thehoffy73 profile image
Ashley Hoff

Beautiful. Thanks. I had to massage it a little and remove os.pardir, as it was giving me a false directory on my windows machine (C:\tmp\generateandsend\..\Resources\generateandsend.ini).

The resultant path variable now looks like:

path = pkg_resources.resource_filename(__name__, os.path.join('Resources', 'dummy.ini'))

I just need to test this on my Linux box

Cheers again. Send the bill to...... 😉

Collapse
 
mkaut profile image
mkaut

Great introduction.
I have one question: you have tests inside the project directory, while this guide places both docs and tests into the git root. Are there any up- or down-sides to either of the choices?

Collapse
 
codemouse92 profile image
Jason C. McDonald

My method just makes the imports a lot easier. You'll notice that the guide you linked to requires some complex imports for the tests to work, whereas my approach requires nothing of the sort, since tests are part of the module.

I suppose if you absolutely don't want to ship tests as part of your finished product, that might justify the other approach. That said, I prefer to always ship tests in the project; it makes debugging on another system a lot more feasible.

Collapse
 
mkaut profile image
mkaut

Good point, thanks.

So, in your approach, how do you import, let's say game_item.py from test_game_item.py?
And does it then have to be run from a specific folder (omission-git, omission-git/omission/, or omission-git/omission/tests) or does it work from all the above?

Thread Thread
 
codemouse92 profile image
Jason C. McDonald

Within omission/tests/test_game_item.py, I would import that other module via...

import omission.game.game_item

I always run python -m omission or pytest omission from within omission-git.

Collapse
 
grzegorzkrug profile image
Grzegorz Krug • Edited

My tree example:

---src
init.py
main.py
------game
---------cards98.py

------reinforced
---------rl_agent.py

------supervised

Readme.md
License.md

I can not reach parent module, from rl_agent.py

I added some init.py but it does not solves.
I have tried:
from game.cards98 import GameCards98
from src.game.cards98 import GameCards98

And all I got is ModuleNotFoundError: No module named 'src'
This works fin in pycharm, but not in idle :/

Collapse
 
codemouse92 profile image
Jason C. McDonald • Edited

This structure should work:

src/
├── __init__.py
├── __main__.py
│
├── game/
│   ├── __init__.py
│   └── cards98.py
│ 
├── reinforced/
│   ├── __init__.py
│   └── rl_agent.py
│ 
├── supervised/
│   └── __init__.py
│ 
├── Readme.md
└── License.md

You need __init__.py under each directory that you want to use as a package.

Then, from rl_agent.py, you should be able to use this import:

from src.game.cards98 import GameCards98
Collapse
 
grzegorzkrug profile image
Grzegorz Krug • Edited

I know it should work, but it does not. I got
__init__.py everywhere and __main.py__ in top level.
Do I need to run it with -m param? I am definitely missing something. I was running scripts from top level to combine modules, but It can get messy sometimes :P

This is my repo: Github Cards98

Thread Thread
 
codemouse92 profile image
Jason C. McDonald

It shouldn't be messy. But, yes, you'd need to invoke your top-level package (not your top-level script.)

python3 -m src

By the by, I recommend renaming src to your project name, cards98, and then renaming the subpackage by the same name to something like game.

Thread Thread
 
grzegorzkrug profile image
Grzegorz Krug

Yes, now it is working. python -m cards98
Well... but only for invoking top level in console. It does not work for normal execution, like clicking 2 times __main.py__ with mouse.
This also makes debugging and testing harder, cause I have to change it always in __main__.py. Where can I use it? I think it just complicates everything.

Thanks for help in understanding this

Thread Thread
 
codemouse92 profile image
Jason C. McDonald • Edited

This is, to my knowledge, the official (and only) way to structure a Python project. There are two to create an executable file to start everything.

Option 1: Native Script

Many Python projects offer a Bash script (on UNIX-like systems) or a Windows .bat file that will run the python3 -m cards98 command. I see this a lot.

Option 2: Python Script

This is the method I'd recommend, as it's the most portable.

Outside of your top-level package, you can write a separate Python script, such as run.py or cards98.py, and then use that to execute your main function.

For example, in cards98/__main__.py, you can put this...

def main():
    # The logic for starting your application.

if __name__ == "__main__":
    main()

And then, outside of the cards98 package, create the file cards98.py, with the following:

#!/usr/bin/env python3
from cards98.__main__ import main
main()

To start your Python application, just double-click cards98.py.

P.S. Thanks for bringing up this situation. I realized I never addressed it in the book!

Collapse
 
rhymes profile image
rhymes

Hi Jason, nice article!

Just a question: I've noticed you didn't talk about namespace packages. Is it because it might be outside the scope of a "dead simple" intro?

I'm mentioning it because I believe they are a simpler concept for a new developer, as in: folders are packages, if you need initialization code for such package, add a __init__.py, otherwise you can't totally ignore the file. I'm over simplifying here of course.

Thank you!

Collapse
 
codemouse92 profile image
Jason C. McDonald

That was something I actually didn't know about. Thanks for the link! It is probably more advanced than I want to go in the article series, but thanks for parking it in a comment anyhow. I'll look at this again later, and see if it might be worth adding to the guide after all. Thank you!

Collapse
 
rhymes profile image
rhymes

An example:

➜ tree
.
└── smart_door
    └── open.py

1 directory, 1 file

➜  cat smart_door/open.py
print("I have opened")

➜  python
Python 3.7.2 (default, Jan 13 2019, 22:54:07)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> from smart_door import open
I have opened

You can read more about it here.

Collapse
 
kylerconway profile image
Kyle R. Conway

Thank you so much for this article. Hard to overstate how helpful this is for someone who feels relatively competent at the language but completely inexperienced at building something sane looking or structured appropriately.

Collapse
 
fossilx profile image
Tony

Nice article!

While this is probably beyond the scope of this article, one useful addition for those that need to create packages frequently would be to look into using cookiecutter. It lets you create a "package template". While these templates can be simple, they can also include support for many dev tools such as docker, travis-ci, sphinx, doctests (via pytest/nose/etc), etc.

Once the cookiecutter template is ready, you run a quick wizard and it generates the project directory/files for you. There are also a bunch of templates already available, some of which are specialized for specific tasks (such as data analysis).

For more info:
cookiecutter.readthedocs.io/en/lat...
github.com/audreyr/cookiecutter

Collapse
 
dubst3pp4 profile image
Marc Hanisch • Edited

Such a great article, thank you very much. I've just dived into Python, having used multiple languages before. But these are exactly the explanations needed by Python newcomers to get a better understanding how things work in Python.

Collapse
 
sobolevn profile image
Nikita Sobolev • Edited

Thanks for this article! It is very useful for beginners.

I would like to suggest to mention wemake-python-styleguide in one of the future articles. In my practice, it is very helpful for beginners, since it enforce insane rule to struct and clean your code. That's what stimulates learning progress!

Anyway, great series. Waiting for the next articles.

Collapse
 
jtaala profile image
Jay Ta'ala • Edited

Finally, a straight-forward description that is really well explained. As a Java developer getting into python, I've always been frustrated when asking python devs where I am about solid source code structure for python projects that makes sense and won't lead to the apparent mess of what I've seen in lots of python code (they usually just say I'm being too much of an uptight "java" dev... and I should just create a .py file and start "hacking" away).

Cheers,

Jay.

Collapse
 
codemouse92 profile image
Jason C. McDonald • Edited

I certainly apologize on behalf of the Python community for your being treated like that! "Just start hacking" is, I believe, what someone says when they really don't know the answer, and are afraid you're going to find them out.

In my experience, the #python Freenode IRC room isn't like that most of the time. We have frequent conversations about proper Python project structure; most of this article came out of those conversations. Of course, as with any community, it depends on which people you encounter, but I would recommend checking that room out in general.

Collapse
 
photoniker profile image
Johann Krauter

Hi together,

I have some "beauty" buggy behaviour with importing typing hints in my docstring. I'm using Sphinx with the intersphinx extension to build a docu based on the typing hints and docstring of my code.
By using the extension "intersphinx" and the intersphinx_mapping you can map "python, numpy, matplotlib" docu references in your docu.

In the screenshot you see: First parameter is a np.ndarray ("import numpy as np") type without any references. The second parameter has the references to the python docu.
Python docu references

When I import numpy as "import numpy" and type in the docstring numpy.ndarray, I get the docu references in the build html docu.
Numpy docu references

Can somebody example why it does not go with the np.ndarray typehint?

Collapse
 
excelite profile image
Benjamin Ewert

thanks a lot for your article, this was exactly what i was looking for since this topic is skipped by basically everyone else...

You said that the topic with the import of app.py instead of using main.py is out of scope... Do you plan on writing up something that is picking this topic up?

I'm quite interested in the reasoning of this approach, do you have any reference by chance?

Collapse
 
codemouse92 profile image
Jason C. McDonald

I don't think there are any formal guidelines on the topic, to be honest. My use of app.py has a lot to do with separation of concerns; I put my GUI startup code in app.py, and my non-GUI startup code in __main__.py. I can't really point to something that says this is "right" or "wrong"...it just works out pretty well for my project. It's something that has to be considered on a project-by-project basis, really.

Collapse
 
cbrintnall profile image
Christian Brintnall

Nice article, you dropped a _ in this snippet:

import smart door
smart_door.open()
smart_door.close()

Just so you know!

Collapse
 
codemouse92 profile image
Jason C. McDonald

Thanks for catching that! I just went back and fixed it. :)

Collapse
 
kurtz1993 profile image
Luis Hernández

That's an excellent explanation. I can understand the project structure better with this.

Thanks for your article, Jason. Looking forward to read the following ones :D

Collapse
 
heitrix profile image
Crocodile Forest

I've just start learning to code. I love this series. Thank you for making this.

Collapse
 
felixcoutinho profile image
Felix Coutinho

Nice Article @codemouse92

Collapse
 
ardunster profile image
Anna R Dunster

Nice article, definitely gives me some things to think about as I approach my next project (my first 'real' project, so to speak)

Collapse
 
adir1661 profile image
adir abargil

i wonder where to place the venv directory in this project structure?

Collapse
 
codemouse92 profile image
Jason C. McDonald • Edited

venv always belongs at the top-most level of the repository (make sure you untrack it via .gitignore!), or else outside of the repository altogether.

Collapse
 
ammachado profile image
Adriano Machado

Nice article, Jason. The code shown in this article is available somewhere? I'd like to check some of the inner details.

Collapse
 
codemouse92 profile image
Jason C. McDonald

Here you are, although I'll warn you that it's being heavily restructured at the moment.

GitHub logo mousepawmedia / omission

A game with a deceptively difficult premise: find the missing letter.

Omission

A game where you find what letter has been removed from a passage.

Content Notes

The content for this game has been derived from "Bartlett's Familiar Quotations".

For brevity, the source of each quote has been omitted - both title and author.

Sections previously italicized have been replaced with CAPS for easier display.

Passages have been trimmed and rearranged to be no more than 4-5 lines.

Authors

  • Jason C. McDonald
  • Jarek G. Thomas
  • Anne McDonald (Content)
  • Jane McArthur (Content)
  • tshirtman (omission.spec)

Thanks to the following:

  • tshirtman (Freenode/#kivy): Pyinstaller help
  • pabs (OFTC/#packaging): Debian packaging help

Dependencies

  • Python3
  • Kivy >= 1.10
  • appdirs >= 1.4.3

Installing

To install from source, see BUILDING.md.

Contributions

We do NOT accept pull requests through GitHub If you would like to contribute code, please read our Contribution Guide.

All contributions are licensed to us under the MousePaw Media Terms of Development.

License

Omission is licensed…

Collapse
 
guillermochussir profile image
Guillermo Chussir

Excellent guide! Thanks!

Collapse
 
feruzoripov profile image
Feruz Oripov

Hi Jason, when will you publish the next part of tutorial ?

Collapse
 
codemouse92 profile image
Jason C. McDonald • Edited

I don't have a specific schedule in mind, and I'm balancing a few things, but I hope to have the next published later this week, or early next.

That said, I cannot promise any particular timeline beyond that. It all depends on how some other pieces of my life go. It might be really quick, or I might only post every other week. Can't say for sure.

Given how popular this series is, though, it's a very high priority of mine to update and finish.

Collapse
 
zichis profile image
Ezichi Ebere Ezichi

Nice article!!!

Collapse
 
marlonmarcello profile image
Marlon Ugocioni Marcello

This series has been awesome Jason, thanks!

Collapse
 
thisisthemurph profile image
Michael Murphy

This is the article I needed! Beautifully written, thank you very much.

Collapse
 
kalyan2809 profile image
Kalyan

Very nice article!