Bosco Noronha

Multithreading vs Multiprocessing in Python 🐍

tldr;

The Python threading module uses threads, while the multiprocessing module uses processes. Threads in one process share the same memory heap, whereas each process has its own separate heap. That makes sharing information and object instances harder between processes than between threads. The flip side is that, because threads share a heap, several of them can write to the same memory location at once. CPython's global interpreter lock (GIL) is a mutex that guards against this inside the interpreter by letting only one thread execute Python bytecode at a time.
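The GIL protects the interpreter's own state, not your program's data, so a read-modify-write such as counter += 1 on a shared variable still needs a lock of its own. Here is a minimal sketch of that idea; the names counter, counter_lock, add_many and run_workers are made up for illustration.

```python
import threading

counter = 0                      # shared by every thread in this process
counter_lock = threading.Lock()  # hypothetical name; guards `counter`

def add_many(times):
    """Increment the shared counter `times` times, one locked step at a time."""
    global counter
    for _ in range(times):
        with counter_lock:       # only one thread may update at a time
            counter += 1

def run_workers(n_threads=4, times=10_000):
    """Start n_threads workers, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=add_many, args=(times,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(run_workers())
```

With the lock in place, every increment is applied, so the final count equals n_threads * times.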

What’s Multithreading?

The threading module is lightweight, shares memory between threads, helps keep a UI responsive, and works well for I/O-bound applications. However, threads can't be killed from the outside and are subject to the GIL.
See the threading library in the Python docs.
Multiple threads live in the same process and the same memory space. Each thread does a specific task and has its own code path, stack memory, and instruction pointer, but all of them share heap memory. Because of that sharing, a misbehaving thread (for example, one with a memory leak) can hurt the other threads and the parent process.

import threading

def calc_square(number):
    print('Square:', number * number)

def calc_quad(number):
    print('Quad:', number * number * number * number)

if __name__ == "__main__":
    number = 7
    thread1 = threading.Thread(target=calc_square, args=(number,))
    thread2 = threading.Thread(target=calc_quad, args=(number,))
    # Start both threads; they run concurrently
    thread1.start()
    thread2.start()
    # Wait for both threads to finish before the main thread
    # (this program) continues
    thread1.join()
    thread2.join()

# Running tasks concurrently can reduce total execution time, mainly when
# the tasks spend their time waiting on I/O; the GIL limits CPU-bound gains
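To see where threads actually save time, here is a minimal sketch that uses time.sleep as a stand-in for an I/O wait (sleeping releases the GIL, much like a blocking network or disk call). The function names fake_download, run_sequential and run_threaded are hypothetical.

```python
import threading
import time

def fake_download(seconds):
    """Stand-in for an I/O-bound task; sleeping releases the GIL."""
    time.sleep(seconds)

def run_sequential(n_tasks, seconds):
    """Run the tasks one after another and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(n_tasks):
        fake_download(seconds)
    return time.perf_counter() - start

def run_threaded(n_tasks, seconds):
    """Run each task in its own thread and return the elapsed time."""
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_download, args=(seconds,))
        for _ in range(n_tasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"sequential: {run_sequential(4, 0.5):.2f}s")
    print(f"threaded:   {run_threaded(4, 0.5):.2f}s")
```

The sequential run has to wait for each sleep in turn, while the threaded run overlaps the waits, so it should finish in roughly the time of a single task.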

What’s Multiprocessing?

The multiprocessing library gives each process its own memory space, can use multiple CPU cores, and bypasses the GIL limitation in CPython. Child processes can be killed (for example, a function call running in a child process can be terminated), and the library is easy to use. The caveats are a larger memory footprint and inter-process communication (IPC) that is a little more complicated and carries more overhead.
Check out the multiprocessing library in the Python docs.

import multiprocessing

def calc_square(number):
    # Rebinds the module-level name, but only inside the child process
    global result
    print('Square:', number * number)
    result = number * number
    print(result)

def calc_quad(number):
    print('Quad:', number * number * number * number)

if __name__ == "__main__":
    number = 7
    result = None
    p1 = multiprocessing.Process(target=calc_square, args=(number,))
    p2 = multiprocessing.Process(target=calc_quad, args=(number,))
    p1.start()
    p2.start()
    # Wait for both child processes to finish
    p1.join()
    p2.join()

    # Prints None: each process has its own memory, so the value
    # assigned in the child never reaches the parent's result
    print(result)
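One way to get a value back into the parent is to send it over an explicit channel such as multiprocessing.Queue. This is a minimal sketch, not the only option; the extra results parameter and the ("square", ...) / ("quad", ...) labels are my own additions for illustration.

```python
import multiprocessing

def calc_square(number, results):
    """Compute the square in the child and send it back through the queue."""
    results.put(("square", number * number))

def calc_quad(number, results):
    """Compute the fourth power in the child and send it back."""
    results.put(("quad", number ** 4))

if __name__ == "__main__":
    number = 7
    results = multiprocessing.Queue()  # IPC channel shared with the children
    p1 = multiprocessing.Process(target=calc_square, args=(number, results))
    p2 = multiprocessing.Process(target=calc_quad, args=(number, results))
    p1.start()
    p2.start()
    # Read one item per child before joining, so the children
    # can flush their queued data and exit cleanly
    collected = dict(results.get() for _ in range(2))
    p1.join()
    p2.join()
    print(collected["square"], collected["quad"])
```

Labelling each item matters because the two children may finish in either order. This is the extra IPC work that threads avoid by sharing memory.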

As an exercise, run these programs and measure the difference in execution time between the threaded version, the multiprocessing version, and a plain sequential version that uses neither library.
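A rough starting point for that measurement could look like the sketch below. It swaps the tiny square and quad functions for a heavier CPU-bound stand-in (cpu_task, a made-up name) so the timings are large enough to compare, and it uses the concurrent.futures executors rather than raw Thread and Process objects. Actual numbers will depend on your machine and core count.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_task(n):
    """CPU-bound stand-in: sum of the squares of the integers below n."""
    return sum(i * i for i in range(n))

def time_sequential(inputs):
    """Run cpu_task over inputs one by one; return (results, seconds)."""
    start = time.perf_counter()
    results = [cpu_task(n) for n in inputs]
    return results, time.perf_counter() - start

def time_pool(executor_cls, inputs):
    """Run cpu_task over inputs with the given executor class and time it."""
    start = time.perf_counter()
    with executor_cls(max_workers=len(inputs)) as pool:
        results = list(pool.map(cpu_task, inputs))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    inputs = [2_000_000] * 4
    _, seq = time_sequential(inputs)
    _, thr = time_pool(ThreadPoolExecutor, inputs)
    _, proc = time_pool(ProcessPoolExecutor, inputs)
    print(f"sequential: {seq:.2f}s  threads: {thr:.2f}s  processes: {proc:.2f}s")
```

On a standard CPython build I would expect the thread timing to land close to the sequential one for this kind of work, since the GIL serializes the bytecode, while the process timing should shrink on a multi-core machine.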

This is my first technical blog post, so let me know if you found it interesting to read.

Original post here: https://medium.com/@nbosco/multithreading-vs-multiprocessing-in-python-c7dc88b50b5b

Oldest comments (5)

Tim Armstrong

You've got a bug in your code for both the threading and processing examples: You're passing "number" as an arg to calc_quad but calc_quad accepts no args. As a result "number" is undefined in the function.

Jacopo

In an attempt to see the differences between processes and threads, I tried adding the result variable to the first example as well. Since threads share the same memory heap, I was expecting print(result) to output the square. Instead, just like the multiprocessing example, it printed None. Can you or someone else explain to me why this happens?

m-dah

Declare the "result" variable as global in main and threads so it does not create a new local variable for each thread.

N N K Teja

👍
You may have to pass number to calc_quad:
calc_quad(): => calc_quad(number):

Also print('Quad': => print('Quad:'

linehammer

The threading module uses threads, the multiprocessing module uses processes. The difference is that threads run in the same memory space, while processes have separate memory. This makes it a bit harder to share objects between processes with multiprocessing. Since threads use the same memory, precautions have to be taken or two threads will write to the same memory at the same time. This is what the global interpreter lock is for. Spawning processes is a bit slower than spawning threads. Once they are running, there is not much difference. More...net-informations.com/python/iq/mul...