You may have heard the terms "parallelization" or "concurrency", which refer to scheduling tasks to run in parallel or concurrently (at the same time) to save time and resources. This is a common practice in asynchronous programming, where coroutines are used to execute tasks concurrently.
Threading in Python is used to run multiple tasks at the same time, which saves time and resources and increases efficiency.
Although multithreading can save time and resources by executing multiple tasks at the same time, using it in code can also lead to safety and reliability issues, such as race conditions.
In this article, you'll learn what threading is in Python and how you can use it to run multiple tasks concurrently.
What is Threading?
Threading, as previously stated, refers to the concurrent execution of multiple tasks in a single process. In Python, this is accomplished with the threading module.
Threads are smaller units of a program that run concurrently and share the same memory space.
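To make the "same memory space" point concrete, here is a minimal sketch in which two threads append to one list. The names shared_items, add_items, and run_shared_memory_demo are made up for illustration, and the sketch uses start() and join(), which are explained later in this article.

```python
import threading

# Hypothetical shared list: both threads write to this one object
shared_items = []

def add_items(label, count):
    # Each thread appends to the list that lives in the process's memory
    for i in range(count):
        shared_items.append(f"{label}-{i}")

def run_shared_memory_demo():
    shared_items.clear()
    first = threading.Thread(target=add_items, args=("a", 3))
    second = threading.Thread(target=add_items, args=("b", 3))
    first.start()
    second.start()
    first.join()
    second.join()
    # Sorted, because the order in which the threads append is not fixed
    return sorted(shared_items)

print(run_shared_memory_demo())
# ['a-0', 'a-1', 'a-2', 'b-0', 'b-1', 'b-2']
```

Both threads see the same list without any copying, which is convenient but is also the reason shared data needs protection, as the Lock section discusses.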
How to Create Threads and Execute Concurrently
Python's threading module provides a high-level interface for creating and managing threads in Python programs.
Create and Start Thread
A thread can be created using the Thread class provided by the threading module. You create an instance of Thread and then start it by calling its .start() method.
import threading
# Creating Target Function
def num_gen(num):
    for n in range(num):
        print("Thread: ", n)
# Main Code of the Program
if __name__ == "__main__":
    print("Statement: Creating and Starting a Thread.")
    thread = threading.Thread(target=num_gen, args=(3,))
    thread.start()
    print("Statement: Thread Execution Finished.")
A thread is created by instantiating the Thread class with a target parameter that takes a callable object (in this case, the num_gen function) and an args parameter that accepts a list or tuple of arguments (in this case, 3).
This tells Thread to run the num_gen() function and pass 3 as an argument.
If you run the code, you'll get the following output:
Statement: Creating and Starting a Thread.
Statement: Thread Execution Finished.
Thread: 0
Thread: 1
Thread: 2
Notice that the second Statement line was printed before the thread's output. Why does this happen?
The thread starts executing concurrently with the main program, and the main program does not wait for the thread to finish before continuing its execution. That's why the final print statement ran before the thread had finished.
To understand this, look at the execution flow of the program:
First, the "Statement: Creating and Starting a Thread." print statement is executed.
Then the thread is created and started using thread.start().
The thread starts executing concurrently with the main program.
The "Statement: Thread Execution Finished." print statement is executed by the main program.
The thread continues and prints its output.
The thread and the main program run independently, which is why their execution order is not fixed.
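The flow above can be reproduced in a small sketch that records the order of events in a list instead of printing them. The names events, slow_task, and run_without_waiting are hypothetical, and the 0.2-second sleep is an assumption used only to stand in for real work so that the main program reliably gets ahead.

```python
import threading
import time

events = []  # records the order in which things happen

def slow_task():
    time.sleep(0.2)  # simulate work so the main program runs ahead
    events.append("thread finished")

def run_without_waiting():
    events.clear()
    thread = threading.Thread(target=slow_task)
    thread.start()
    # This line runs while the thread is still sleeping
    events.append("main continued")
    # Wait here only so the function can return a complete record
    thread.join()
    return list(events)

print(run_without_waiting())
# ['main continued', 'thread finished']
```

Without the sleep, either entry could come first, which is exactly what "not fixed" means here.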
join() Method - The Saviour
After seeing the above, you might wonder how to suspend the execution of the main program until the thread has finished executing.
This is what the join() method is for: it blocks the calling thread (here, the main program) until the thread it is called on terminates.
import threading
# Creating Target Function
def num_gen(num):
    for n in range(num):
        print("Thread: ", n)
# Main Code of the Program
if __name__ == "__main__":
    print("Statement: Creating and Starting a Thread.")
    thread = threading.Thread(target=num_gen, args=(3,))
    thread.start()
    thread.join()
    print("Statement: Thread Execution Finished.")
After creating and starting a thread, the join() method is called on the Thread instance (thread). Now run the code, and you'll get the following output.
Statement: Creating and Starting a Thread.
Thread: 0
Thread: 1
Thread: 2
Statement: Thread Execution Finished.
As can be seen, the "Statement: Thread Execution Finished." print statement is executed after the thread terminates.
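join() also accepts an optional timeout argument in seconds. It always returns None, so you need is_alive() to find out whether the thread actually finished. The sketch below assumes a hypothetical slow_task() that sleeps for half a second; the function names and durations are illustrative only.

```python
import threading
import time

def slow_task():
    time.sleep(0.5)  # stand-in for work that takes a while

def join_with_timeout():
    thread = threading.Thread(target=slow_task)
    thread.start()
    # Wait at most 0.1 seconds; join() returns None either way
    thread.join(timeout=0.1)
    # The timeout expired first, so the thread is still running
    still_running = thread.is_alive()
    # Now wait without a timeout until the thread terminates
    thread.join()
    finished = not thread.is_alive()
    return still_running, finished

print(join_with_timeout())
# (True, True)
```

A timeout is useful when the main program should stay responsive instead of blocking indefinitely on a slow thread.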
Daemon Threads
Daemon threads run in the background and are terminated abruptly when the main program exits, whether or not they have completed their work.
You can create a daemon thread by passing the daemon parameter when instantiating the Thread class. It takes a boolean value that indicates whether the thread is a daemon (True) or not (False).
import threading
import time
def daemon_thread():
    while True:
        print("Daemon thread is running.")
        time.sleep(1)
        print("Daemon thread finished executing.")
if __name__ == "__main__":
    thread1 = threading.Thread(target=daemon_thread, daemon=True)
    thread1.start()
    print("Main program exiting.")
A thread is created by instantiating the Thread class with the daemon_thread function as its target, and the daemon parameter is set to True to mark it as a daemon thread.
The daemon_thread() function is an infinite loop that prints a statement, sleeps for one second, and then prints another statement.
Now when you run the above code, you'll get the following output.
Daemon thread is running.Main program exiting.
You can see that as soon as the main program exits, the daemon thread terminates. The two messages can end up on the same line because both threads write to standard output at nearly the same time.
While the daemon_thread() function is sleeping inside the loop, the concurrently running main program exits, so daemon_thread() never reaches its next print statement, as can be seen in the output.
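Besides the constructor argument, the daemon flag can be set through the thread's daemon attribute, as long as this happens before start() is called. The following is a minimal sketch; background_task and worker are hypothetical names chosen for illustration.

```python
import threading
import time

def background_task():
    # Runs until the program exits; the sleep keeps the loop cheap
    while True:
        time.sleep(0.05)

worker = threading.Thread(target=background_task)
# The flag must be set before start(); changing it afterwards raises RuntimeError
worker.daemon = True
worker.start()

print(worker.daemon)      # True
print(worker.is_alive())  # True
# The program can end here even though background_task() never returns
```

Because daemon threads are stopped abruptly, they are a poor fit for work that must finish cleanly, such as writing a file.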
threading.Lock - Avoiding Race Conditions
Threads, as you know, run concurrently in a program. If your program has multiple threads, they may access the same shared resource or enter the same critical section of code at the same time. When the result depends on the unpredictable timing of those accesses, you have a race condition.
This is where Lock comes into play. It is a synchronization primitive that prevents multiple threads from accessing a particular section of code or a resource simultaneously.
A thread calls the acquire() method to acquire the Lock and the release() method to release it.
import threading
# Creating Lock instance
lock = threading.Lock()
data = ""
def read_file():
    global data
    with open("sample.txt", "r") as file:
        for info in file:
            data += "\n" + info
def lock_task():
    lock.acquire()
    try:
        read_file()
    finally:
        # Release the Lock even if read_file() raises an error
        lock.release()
if __name__ == "__main__":
    thread1 = threading.Thread(target=lock_task)
    thread2 = threading.Thread(target=lock_task)
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
    # Printing the data read from the file
    print(f"Data: {data}")
First, a Lock is created using threading.Lock() and stored in the lock variable.
An empty string (data) is created to store the information read by both threads.
The read_file() function reads the information from the sample.txt file and adds it to data.
The lock_task() function is created, and when it is called, the following events occur:
The lock.acquire() method acquires the Lock if it is available; otherwise, the calling thread waits until the Lock is released.
Once the Lock is acquired, the thread executes the read_file() function.
After the read_file() function finishes executing, the lock.release() method releases the Lock to make it available again for other threads.
Within the if __name__ == "__main__" block, two threads are created, thread1 and thread2, and both run the lock_task() function.
Both threads run concurrently and attempt to execute the read_file() function at the same time, but only one thread can enter read_file() at a time because of the Lock.
The main program waits for both threads to finish because of thread1.join() and thread2.join().
Then the print statement prints the information read from the file.
Data:
Hello there! Welcome to GeekPython.
Hello there! Welcome to GeekPython.
As can be seen in the output, only one thread at a time reads the file. Because there were two threads, the file was read twice: in this run, first by thread1 and then by thread2.
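A Lock can also be used as a context manager: the with statement acquires it on entry and releases it on exit, even if the body raises an exception. The sketch below applies that to a shared counter instead of a file; counter, increment, and run_counter are hypothetical names, and the thread and iteration counts are arbitrary.

```python
import threading

lock = threading.Lock()
counter = 0  # shared state that every thread updates

def increment(times):
    global counter
    for _ in range(times):
        # Acquired on entry, released on exit, even if an error occurs
        with lock:
            counter += 1

def run_counter(num_threads=4, times=10_000):
    global counter
    counter = 0
    threads = [
        threading.Thread(target=increment, args=(times,))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter

print(run_counter())
# 40000
```

With the Lock in place, every increment is applied, so the total always equals the number of threads multiplied by the number of iterations.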
Semaphore Objects in Threading
A Semaphore allows you to limit the number of threads that can access a shared resource simultaneously. It has two methods:
acquire(): A thread can acquire the semaphore if it is available. When a thread acquires the semaphore, the semaphore's count is decremented if it is greater than zero. If the count is zero, the thread waits until the semaphore is available.
release(): After using the resource, the thread releases the semaphore, which increments the count. This means the shared resource is available again.
A semaphore is used to limit access to shared resources, preventing resource exhaustion and ensuring controlled access to resources with limited capacity.
import threading
# Creating a semaphore
sem = threading.Semaphore(2)
def thread_task(num):
    print(f"Thread {num}: Waiting")
    # Acquire the semaphore
    sem.acquire()
    print(f"Thread {num}: Acquired the semaphore")
    # Simulate some work
    for _ in range(5):
        print(f"Thread {num}: In process")
    # Release the semaphore when done
    sem.release()
    print(f"Thread {num}: Released the semaphore.")
if __name__ == "__main__":
    thread1 = threading.Thread(target=thread_task, args=(1,))
    thread2 = threading.Thread(target=thread_task, args=(2,))
    thread3 = threading.Thread(target=thread_task, args=(3,))
    thread1.start()
    thread2.start()
    thread3.start()
    thread1.join()
    thread2.join()
    thread3.join()
    print("All threads have finished.")
In the above code, Semaphore is instantiated with the integer value 2, which means at most two threads can hold the semaphore at the same time.
Three threads are created, and all of them run the thread_task() function. Because only two threads may hold the semaphore at once, up to two threads proceed past sem.acquire() together, and a third thread has to wait until one of them releases the semaphore.
Thread 1: Waiting
Thread 1: Acquired the semaphore
Thread 1: In process
Thread 1: In process
Thread 1: In process
Thread 1: In process
Thread 1: In processThread 2: Waiting
Thread 2: Acquired the semaphore
Thread 1: Released the semaphore.
Thread 2: In process
Thread 2: In process
Thread 3: WaitingThread 2: In process
Thread 3: Acquired the semaphore
Thread 3: In process
Thread 2: In process
Thread 2: In process
Thread 2: Released the semaphore.
Thread 3: In process
Thread 3: In process
Thread 3: In process
Thread 3: In process
Thread 3: Released the semaphore.
All threads have finished.
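Printed output makes the limit hard to verify, so here is a sketch that measures it instead. It records the highest number of threads that were inside the guarded section at once. The names limited_task, run_limited, active, and peak are hypothetical, the extra Lock exists only to protect the bookkeeping, and the short sleep stands in for real work. Like Lock, a Semaphore works with the with statement.

```python
import threading
import time

sem = threading.Semaphore(2)
state_lock = threading.Lock()  # protects the two counters below
active = 0  # threads currently inside the guarded section
peak = 0    # highest value that active has reached

def limited_task():
    global active, peak
    with sem:  # acquired on entry, released on exit
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)  # simulate some work
        with state_lock:
            active -= 1

def run_limited(num_threads=5):
    global active, peak
    active = peak = 0
    threads = [threading.Thread(target=limited_task) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return peak

print(run_limited())  # never more than 2
```

Five threads are started, but the recorded peak cannot exceed the value passed to Semaphore.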
Using ThreadPoolExecutor to Execute Tasks from a Pool of Worker Threads
The ThreadPoolExecutor is part of the concurrent.futures module and is used to execute multiple tasks concurrently. Using ThreadPoolExecutor, you can run multiple tasks or functions concurrently without having to manually create and manage threads.
from concurrent.futures import ThreadPoolExecutor
# Creating pool of 4 threads
executor = ThreadPoolExecutor(max_workers=4)
# Function to evaluate square number
def square_num(num):
    print(f"Square of {num}: {num * num}.")
task1 = executor.submit(square_num, 5)
task2 = executor.submit(square_num, 2)
task3 = executor.submit(square_num, 55)
task4 = executor.submit(square_num, 4)
# Wait for tasks to complete and then shutdown
executor.shutdown()
The above code creates a ThreadPoolExecutor with a maximum of 4 worker threads, which means the thread pool can have at most 4 worker threads executing tasks concurrently.
Four tasks are submitted to the ThreadPoolExecutor using the submit method with the square_num() function and various arguments. Each task executes the function with the specified argument and prints the output.
In the end, the shutdown method is called so that the ThreadPoolExecutor shuts down after the tasks are completed and its resources are freed.
You don't have to call the shutdown method explicitly if you create the ThreadPoolExecutor using the with statement.
from concurrent.futures import ThreadPoolExecutor
# Task
def square_num(num):
    print(f"Square of {num}: {num * num}.")
# Using ThreadPoolExecutor as context manager
with ThreadPoolExecutor(max_workers=4) as executor:
    task1 = executor.submit(square_num, 5)
    task2 = executor.submit(square_num, 2)
    task3 = executor.submit(square_num, 55)
    task4 = executor.submit(square_num, 4)
In the above code, the ThreadPoolExecutor is used with the with statement. When the with block is exited, the ThreadPoolExecutor is automatically shut down and its resources are released.
Both versions produce the same result.
Square of 5: 25.
Square of 2: 4.
Square of 55: 3025.
Square of 4: 16.
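The examples above print inside the task and ignore what submit() returns. submit() returns a Future object, and its result() method waits for the task and gives back the function's return value. The sketch below is a variant of square_num() that returns the square instead of printing it; squares_with_submit and squares_with_map are hypothetical helper names. It also shows executor.map(), which yields results in the order of the inputs.

```python
from concurrent.futures import ThreadPoolExecutor

def square_num(num):
    # Return the value so the caller can collect it
    return num * num

def squares_with_submit(numbers):
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(square_num, n) for n in numbers]
        # result() blocks until that particular task has finished
        return [future.result() for future in futures]

def squares_with_map(numbers):
    with ThreadPoolExecutor(max_workers=4) as executor:
        # map() yields results in the same order as the inputs
        return list(executor.map(square_num, numbers))

print(squares_with_submit([5, 2, 55, 4]))
# [25, 4, 3025, 16]
print(squares_with_map([5, 2, 55, 4]))
# [25, 4, 3025, 16]
```

Returning values this way keeps the results in a predictable order even though the tasks themselves may finish in any order.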
Common Functions in Threading
The threading module provides numerous functions, and some of them are explained below.
Getting Main and Current Thread
The threading module has main_thread() and current_thread() functions, which are used to get the main thread and the currently running thread, respectively.
import threading
def task():
    for _ in range(2):
        # Getting the current thread name
        print(f"Current Thread: {threading.current_thread().name} is running.")
# Getting the main thread name
print(f"Main thread : {threading.main_thread().name} started.")
thread1 = threading.Thread(target=task)
thread2 = threading.Thread(target=task)
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print(f"Main thread : {threading.main_thread().name} finished.")
Because the main_thread() and current_thread() functions return a Thread object, threading.main_thread().name is used to get the name of the main thread and threading.current_thread().name is used to get the name of the current thread.
Main thread : MainThread started.
Current Thread: Thread-1 (task) is running.
Current Thread: Thread-1 (task) is running.
Current Thread: Thread-2 (task) is running.
Current Thread: Thread-2 (task) is running.
Main thread : MainThread finished.
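Default names such as Thread-1 (task) are not very descriptive. The Thread class accepts a name argument, and current_thread().name then reports that name from inside the thread. In this sketch, the names "downloader" and "parser", along with seen_names and run_named, are invented for illustration.

```python
import threading

seen_names = []  # names reported from inside the threads

def task():
    seen_names.append(threading.current_thread().name)

def run_named(names):
    seen_names.clear()
    # Give each thread a custom name instead of the default one
    threads = [threading.Thread(target=task, name=name) for name in names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Sorted, because the threads may report in any order
    return sorted(seen_names)

print(run_named(["downloader", "parser"]))
# ['downloader', 'parser']
```

Meaningful names make log messages and debugging output much easier to follow when several threads are running.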
Monitoring Currently Active Threads
The threading.enumerate() function returns a list of the Thread objects that are currently alive. The list excludes terminated threads and threads that have not started yet, but it always includes the main thread, even if the main thread has terminated.
If you only want the number of Thread objects that are currently alive, you can use the threading.active_count() function.
import threading
def task():
    print(f"Current Thread : {threading.current_thread().name} is running.")
# Getting the main thread name
print(f"Main thread : {threading.main_thread().name} started.")
threads_list = []
for _ in range(5):
    thread = threading.Thread(target=task)
    thread.start()
    threads_list.append(thread)
    # Getting the active thread count
    print(f"\nActive Thread Count: {threading.active_count()}")
for thread in threads_list:
    thread.join()
print(f"Main thread : {threading.main_thread().name} finished.")
# Getting the active thread count
print(f"Active Thread Count: {threading.active_count()}")
# Getting the list of active threads
for active in threading.enumerate():
    print(f"Active Thread List: {active.name}")
Output
Main thread : MainThread started.
Current Thread : Thread-1 (task) is running.
Active Thread Count: 2
Current Thread : Thread-2 (task) is running.
Active Thread Count: 2
Current Thread : Thread-3 (task) is running.
Active Thread Count: 2
Current Thread : Thread-4 (task) is running.
Active Thread Count: 2
Current Thread : Thread-5 (task) is running.
Active Thread Count: 1
Main thread : MainThread finished.
Active Thread Count: 1
Active Thread List: MainThread
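In the output above, the threads finish so quickly that the active count rarely rises above 2. The sketch below keeps each thread alive for a moment with a short sleep (an assumption standing in for real work) so that threading.enumerate() can observe all of them. The worker- names and the names_while_running function are hypothetical.

```python
import threading
import time

def task():
    time.sleep(0.2)  # keep the thread alive long enough to be listed

def names_while_running(count=3):
    threads = [
        threading.Thread(target=task, name=f"worker-{i}") for i in range(count)
    ]
    for thread in threads:
        thread.start()
    # All workers are still sleeping, so enumerate() lists every one of them
    alive = sorted(
        t.name for t in threading.enumerate() if t.name.startswith("worker-")
    )
    for thread in threads:
        thread.join()
    # Terminated threads no longer appear in the list
    remaining = [
        t.name for t in threading.enumerate() if t.name.startswith("worker-")
    ]
    return alive, remaining

print(names_while_running())
# (['worker-0', 'worker-1', 'worker-2'], [])
```

Filtering by name prefix keeps the main thread and any unrelated threads out of the result.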
Getting Thread ID
import threading
import time
def task():
    print(f"Thread {threading.get_ident()} is running.")
    time.sleep(1)
    print(f"Thread {threading.get_ident()} is terminated.")
print("Main thread started.")
threads_list = []
for _ in range(5):
    thread = threading.Thread(target=task)
    thread.start()
    threads_list.append(thread)
for thread in threads_list:
    thread.join()
print("Main thread finished.")
Every thread running in a process is assigned an identifier, and the threading.get_ident() function is used to retrieve the identifier of the currently running thread.
Main thread started.
Thread 9824 is running.
Thread 7188 is running.
Thread 4616 is running.
Thread 3264 is running.
Thread 7716 is running.
Thread 7716 is terminated.
Thread 9824 is terminated.
Thread 7188 is terminated.Thread 4616 is terminated.
Thread 3264 is terminated.
Main thread finished.
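The same identifier is also available from outside the thread through the Thread object's ident attribute, which is None until the thread has been started. The sketch below compares the two values; idents and collect_idents are hypothetical names used only for this illustration.

```python
import threading

idents = {}  # thread name -> identifier seen from inside the thread

def task():
    idents[threading.current_thread().name] = threading.get_ident()

def collect_idents():
    thread = threading.Thread(target=task, name="worker")
    before_start = thread.ident  # None: the thread has not started yet
    thread.start()
    thread.join()
    # ident stays available after the thread has exited
    same_value = idents["worker"] == thread.ident
    return before_start, same_value

print(collect_idents())
# (None, True)
```

Identifiers may be reused after a thread exits, so they are best treated as labels for currently running threads rather than permanent IDs.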
Conclusion
A thread is a smaller unit of a program that you create using the threading module in Python. Threads run tasks or functions concurrently, which can save time and resources.
In this article, you've learned:
What threading is and how to create and start a thread
Why the join() method is used
What daemon threads are and how to create one
How to use a Lock to avoid race conditions
How a semaphore is used to limit the number of threads that can access shared resources at the same time
How to execute a group of tasks using the ThreadPoolExecutor without having to create threads manually
Some common functions provided by the threading module
That's all for now
Keep Coding