
How Specific Are You With Your Imports and Why?


Hey all! I am curious as to how the rest of the community imports their dependencies and why they choose to do it that way.

For instance, I generally find myself either bringing in the library itself, e.g. import pandas as pd, or a class within the library, e.g. from requests import Session. This has always worked for me, but I've heard others swear that you should only import the methods or functions you need, for performance and/or security reasons.

So how do you bring in the dependencies you need, and what is your reasoning behind doing it that way?

 

There are only good reasons to be specific about what parts of your dependencies you're using. To name just two: it helps readability by signalling up front what you're doing with those dependencies, and it makes your statements shorter.

I can hear someone saying "but those lines at the top are so verbose and they must be maintained!" to which I say "Your poor, auto-complete-less hands." We have tools for that if it's so upsetting.

 

I'm not sure about "security reasons" (?) but there is a difference, albeit an insignificant one unless you are in a loop.

This is what happens at the moment of the import statement:

[image: python import statement]
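As a rough illustration of that point, here is a small sketch (the standard-library json module is just an arbitrary example, not something from the thread) suggesting that from module import name still loads the whole module; the two forms differ mainly in which name ends up bound in your namespace.

```python
import sys

# Bind only one function from the module into this namespace.
from json import dumps

# The whole module was still loaded and cached in sys.modules,
# even though only `dumps` was bound as a name here.
module_loaded = "json" in sys.modules

# The imported function is the very same object the module holds.
same_object = dumps is sys.modules["json"].dumps

print(module_loaded, same_object)  # True True
```

So importing a single function does not, by itself, appear to save load time or memory compared with importing the module.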

The only case where performance might actually matter is if you loop a lot of times and look up a function through its module on every iteration. Taking from this answer - stackoverflow.com/a/33642848/4186181 - I ran the two example functions on Python 3.6.6 and I get these timings:

In [2]: %timeit tight_loop_slow(10000000)
2.14 s ± 33.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit tight_loop_fast(10000000)
1.63 s ± 43.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

So yeah, if you're looping a lot, consider aliasing the function.
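For anyone who wants to try this without opening the link, here is a minimal sketch of what such a pair of functions might look like. The function names match the timings above, but the bodies are my own assumption (using math.sqrt as the example call), not necessarily the code from the linked answer.

```python
import math
import timeit


def tight_loop_slow(iterations):
    """Look up `sqrt` on the `math` module on every iteration."""
    result = 0.0
    for i in range(iterations):
        result += math.sqrt(i)
    return result


def tight_loop_fast(iterations):
    """Alias the function to a local name once, before the loop."""
    sqrt = math.sqrt
    result = 0.0
    for i in range(iterations):
        result += sqrt(i)
    return result


if __name__ == "__main__":
    n = 1_000_000
    for func in (tight_loop_slow, tight_loop_fast):
        seconds = timeit.timeit(lambda: func(n), number=5)
        print(f"{func.__name__}: {seconds:.2f} s")
```

The local alias skips a global lookup plus an attribute lookup per iteration, which is where any difference would come from. Writing from math import sqrt at the top of the file removes only the attribute lookup, since sqrt is then still a global name.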

 

So there is a performance benefit after all? That is very cool. The difference seems a bit on the academic side in this example, but since most of what I do deals in datasets that seem to get larger by the day, an academic difference over a few iterations can be an enormous difference over thousands or more!

As far as security goes, I've never really been sure what was meant by that either. I was at a coding meetup in Portland, ME, where I live, and some other developers were talking about it. They used different languages than I did, mostly C, C++, and Java, and seemed to reference the fact that importing things you didn't need would leave those modules unintentionally usable from within the code that imported but didn't use them. I've never been all that low level and, though I have worked in a C-derivative language (C#), it was very early on in my career and only for about a year, so I never got that deep with it. Either way, thanks for pointing out those docs! It really makes me think about what I am actually doing with imports that I take for granted so much at this point.

 

mostly C, C++, and Java, and seemed to reference the fact that importing things you didn't need would leave those modules unintentionally usable from within the code that imported but didn't use them

Ok, got it. I'm not sure that's applicable to Python, because you bring the runtime with you when you deploy the app and because you can import modules at runtime, which means you can execute anything in the standard library and in the libraries packaged with the app. It also depends on what kind of security we're talking about, but I'm not a real expert on the subject...
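To make the "import modules at runtime" part concrete, here is a minimal sketch using importlib from the standard library. The choice of base64 is purely illustrative; the point is only that the module name is a plain string, so it does not need to appear in any import statement at the top of the file.

```python
import importlib

# The module name could come from configuration or other input at
# runtime; "base64" is just an illustrative choice.
module_name = "base64"
module = importlib.import_module(module_name)

encoded = module.b64encode(b"hello")
print(encoded)  # b'aGVsbG8='
```

In other words, leaving a module out of your import lines does not make it unreachable from your code.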

 

from mymodule import funca
from mymodule import funcb
from somemodule import afunc

I always use this syntax, with the exception of pandas and numpy, which I import as import pandas as pd and import numpy as np, respectively. The reason is not performance or security but three other reasons.

First, the autocompletion in Jupyter Notebooks (my default development environment) does not work all the time.

Second, if you use the * syntax you have no way of knowing what you actually import: if you write from mymodule import *, you could potentially have imported a lot of functions that you don't want or need.
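A quick way to see this for yourself is to run a star import into a scratch namespace and list what arrived. This is only a sketch, and it uses the standard-library math module as a stand-in for mymodule.

```python
import math

# Run the star import inside a scratch namespace so we can inspect
# exactly which names it brings in.
namespace = {}
exec("from math import *", namespace)

imported = sorted(name for name in namespace if not name.startswith("_"))
print(len(imported), "names imported, for example:", imported[:5])

# One of them quietly shadows a built-in of the same name.
shadows_builtin = namespace["pow"] is math.pow
print(shadows_builtin)  # True
```

After the star import, calling pow() in that namespace gives you math.pow rather than the built-in, which is the kind of surprise explicit imports avoid.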

Third, it saves you some typing later on. If you write from sklearn.metrics import mean_squared_error at the beginning of the file (or wherever you do your imports), you later only have to write mean_squared_error() when you call the function.

 

I have heard this method used many times before and that is what I was wondering! For me, it usually depends on the situation and how familiar I am with what I am working with. If I'm doing something I've done a million times, I typically just import the exact functions I need from a given class. But if I'm not sure, I'll bring in an entire class or even an entire module. Even in times when I DO know exactly what is going on, I'll bring in most of the class. Example being:

from bs4 import BeautifulSoup

I am also wondering how developers in other languages approach this kind of thing. For instance, when I was a C# developer we rarely imported an entire class and NEVER an entire namespace, simply because System.Web and the like were so humongous that it would take FOREVER to compile. Python tends to be a lot lighter and more modular, so I've not experienced that kind of specificity in it despite almost a decade using it. Thanks for the input! I love hearing the way other devs do things because there is always something I can take away from the tried and true practices of others!


kaelscion profile image
I'm Jake Cahill. Lifetime Pythonista, web scraping and automation expert. Enjoy books. Love my wife, dog, and cat, and think AI and Julia are pretty nifty