I was inspired by @rpalo's quest to uncover gems in Python's standard library
So if my next post about some lesser-known part of the standard library (or whatever I want to talk about) is about a string method, are you going to start your comment with "2018: Python engineers have heard about strings"?
If you re-read the intro you'll notice I said the code works from Python 3.2 onwards; I'm well aware of that.
So it's "not a solution in real life", even though people (myself included) have been using the underlying APIs these executors build on in production for years?
So now you're saying that everyone who has Ruby and Python code in production using multiple threads or processes, and doing fine, doesn't exist, because you decided threads and processes are not robust enough to deal with "real tasks"?
Please tell me in which part of my post I made a comparison with other languages. The whole post is about a piece of functionality that has been in the standard library for years and that some people might not know about.
Great article. I was not aware of `concurrent.futures` in the standard library. I use `gevent` in Python 2 for lightweight threads. It uses green threads under the hood. The API is very simple to use. If you have a lot of I/O-bound tasks, e.g. downloading over 100 files or making a lot of requests concurrently, this library is very useful.

I don't think you're the first one; the various "What's New in Python" documents have contained a lot of gems over the years ;)
Yeah, gevent is nice, though I've never been a huge fan of cooperative multitasking as a concurrency paradigm. It's true, as you say, that it has its uses for I/O-bound tasks.
The objective of the post was to uncover a hidden gem in the standard library. Regarding async I/O, you might want to take a look at asyncio in Python 3's standard library, and uvloop outside of it!
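For the curious, here is a minimal sketch of what the uvloop swap looks like (assuming Python 3.7+ and `pip install uvloop`; the coroutine body is just a placeholder):

```python
import asyncio

import uvloop  # assumption: pip install uvloop

# Make asyncio.run() use uvloop's faster event loop implementation.
uvloop.install()

async def main():
    # Placeholder for real async I/O work (HTTP calls, DB queries, ...).
    await asyncio.sleep(0.1)
    return "done"

print(asyncio.run(main()))
```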
`uvloop` is excellent! I have used it for Python 3-based web services.

Nice! You should write a post about it! :-)
All in due time :)
I agree, have a look at asyncio; that seems to be the future. Or better to say, it's already the present!
Map is fantastic, but I like being able to properly track the progress of my tasks using tqdm; for that I use the following pattern:
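A minimal sketch of that pattern (here `do_work` and the inputs are placeholders, not the original code):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

from tqdm import tqdm  # assumption: pip install tqdm

def do_work(item):
    ...  # placeholder for the real task

items = range(100)  # placeholder inputs

with ThreadPoolExecutor(max_workers=8) as executor:
    futures = [executor.submit(do_work, item) for item in items]
    # Wrapping as_completed with tqdm advances the progress bar as tasks finish.
    results = [f.result() for f in tqdm(as_completed(futures), total=len(futures))]
```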
Yeah, there's a lot of neat stuff in that module. I chose map because it's probably the quick-and-dirty way to turn a sequential workload into a parallel one.
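To illustrate, a minimal sketch of that switch (the URL list and `fetch_url` are made up for the example):

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request

URLS = ["https://example.com", "https://example.org"]  # made-up inputs

def fetch_url(url):
    with urllib.request.urlopen(url, timeout=10) as conn:
        return conn.read()

# Sequential version: results = list(map(fetch_url, URLS))
# Parallel version: the same call, routed through the executor.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(fetch_url, URLS))
```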
I have a program in production that does a mass upload of images to S3 using futures and `as_completed`.

Didn't know about tqdm though, thanks for the tip!
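Roughly, that kind of uploader can look like this; a sketch assuming boto3 with credentials already configured (the bucket name and paths are made up):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

import boto3  # assumption: pip install boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"  # made up for the example
paths = ["images/001.jpg", "images/002.jpg"]  # made up for the example

def upload(path):
    s3.upload_file(path, BUCKET, path)  # reuse the local path as the S3 key
    return path

with ThreadPoolExecutor(max_workers=8) as executor:
    futures = {executor.submit(upload, p): p for p in paths}
    # as_completed yields each future as soon as its upload finishes.
    for future in as_completed(futures):
        print(f"uploaded {future.result()}")
```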
tqdm is a life changer!
To be welcomed, you shouldn't start a comment on a Python article with something like "Python sucks, use Java".
Everybody here knows the good and bad sides of Python, and sure, it's not a million-requests-per-second-per-core language, thank you, K.O.
As for company examples, remember EVE Online, which runs tens of thousands of players in a single world on Python 2.7.
Laughter is always good, it releases endorphins.
@rhymes, please don't pay too much attention when people with Russian names criticise your work. A high level of criticism is a heritage of the Soviet model of education.
We suffer from self-criticism, too :)
On the other hand, Westerners are sometimes too supportive and won't tell you the bitter truth, trying not to hurt anybody.
As for the question, I've seen Python code that reaches 100k requests per minute per core.
pawelmhm.github.io/asyncio/python/...
But long before you hit those numbers in a real app, you will hit the limits of the DB, the disk, or some other part of the code, so...
Just add three things to your post:
This I said in the intro :D
This I didn't know, I don't pay attention to what's going on in Python 2 anymore :D
Thanks for the info!
That is true, though in this context it's yes and no; it depends on the number of cores.
Python has a GIL, which simplifies its C API by ensuring that at most one thread is executing Python bytecode at any given time.

Python, though, has a 1:1 threading model, where each Python thread is mapped onto an OS thread.

If the code is I/O bound, Python effectively releases the GIL, so the threads waiting for I/O run free of it.

How many threads are running at the same time? One per core. I have 4 cores, so there are effectively 4 units of independent code running in parallel. The context switching (concurrency) does happen, but there are up to 4 different threads running in parallel.
I hope this clears it up.
As the famous Go slides you linked state: concurrency is about dealing with lots of things at once (and you can be concurrent with only one thread and one process, see Node for example); parallelism is about doing multiple things at once.
In this case we have concurrent tasks that can be run in parallel if you have more than one core.
In Python there's only one situation in which multiple threads achieve parallelism, and that's when they are waiting for I/O on multiple cores (and that's exactly why I chose this example :D). If you want CPU-bound parallelism in Python you have to use processes.
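A minimal sketch of the CPU-bound case (`cpu_heavy` is a made-up stand-in for real work):

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Pure-Python number crunching: it holds the GIL, so threads wouldn't help.
    return sum(i * i for i in range(n))

if __name__ == "__main__":  # guard needed where workers are spawned (e.g. Windows)
    # Each worker process has its own interpreter and its own GIL.
    with ProcessPoolExecutor() as executor:
        print(list(executor.map(cpu_heavy, [10**6] * 4)))
```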
Having 4 cores is irrelevant, as the Python threads can only utilize one logical core due to the GIL. The only possible way to release the GIL is using the C API, but it’s not advisable.
Python's I/O primitives are written with the C API.
They release the GIL when they are waiting for I/O, so in those moments when you're waiting for I/O, your program, for a little while, is technically parallel.
A few moments of pure parallelism ;-)
So no, having multiple cores is not completely irrelevant.
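A quick way to see it, using `time.sleep` as a stand-in for blocking I/O (it releases the GIL the same way):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def wait(seconds):
    time.sleep(seconds)  # blocks in C and releases the GIL, like real I/O
    return seconds

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    list(executor.map(wait, [1, 1, 1, 1]))
# The four 1-second waits overlap: total is ~1s, not ~4s.
print(f"elapsed: {time.perf_counter() - start:.1f}s")
```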
Awesome! I haven’t done much async/concurrent Python and I want to learn more about it. Thanks for sharing :)
P.S. I’m curious to see if series can span multiple users or not. What happens if you add ‘series: The Python Standard Library’ to your front matter? Does it link it to my post? Or does it start your own series?
I just checked the source code, series are linked to a single user :-)
github.com/thepracticaldev/dev.to/...
Probably for the best. Wouldn't want just anybody hijacking your series.
Great Article!
Thank you Lucas!