We will go through the basics of single-threading and multi-threading, along with the advantages and disadvantages of each.
A thread is the basic unit of CPU utilization, i.e. the unit through which an application uses the CPU. Let's understand this through an example.
Let's say we open a web page; we often see the text content being loaded before the images and videos. Here, loading the web page is a process, and the process contains two threads: one for loading the text content and the other for loading the images.
Let's take another example, a basic coding problem. Suppose we want to compute the sum of the numbers in an array of length N. We can make this simple summation multi-threaded by using two threads: one sums the first half of the array, the other sums the second half, and the two partial sums are then added together. Here, the threads summing the two halves are child threads, and the thread computing the final sum is called the parent thread.
A process can have as many threads as required (limited by the hardware and by efficiency overheads). The code, data, and files belonging to a process are shared by all the threads of a multi-threaded process.
But each thread has its own thread ID, program counter, register set, and stack, as illustrated in the following diagram:
Responsiveness - In the web page example above, suppose a large image is being loaded and taking its time. Because the whole process is multi-threaded, loading the image will not block loading the text content, making the page more responsive to the user.
Resource sharing - Threads share the memory and resources of their process by default, allowing an application to have several different threads within the same address space.
Economy - As threads share the memory and resources of their process, it is more economical to create and context-switch threads than processes.
- Identifying which tasks to multi-thread in order to make the application efficient.
- Maintaining data integrity, as there may be situations where the same data is manipulated by different threads.
- Balancing cost - It is important to share the application's workload equally among threads; otherwise some threads will do less work than others, creating overhead.
- It is much easier to test and debug a single-threaded application than a multi-threaded one.
We will come across two terms very frequently: parallelism and concurrency. Generally, the two go hand in hand, but what exactly do they mean?
A system is parallel if it can perform more than one task simultaneously.
Concurrency is when more than one task makes progress. Even on a single-core system, the CPU scheduler rapidly switches between processes, creating the illusion of parallelism and thus allowing different tasks to make progress.
Therefore, it is important to note that concurrency can occur without parallelism.
Types of parallelism
Data parallelism - The data is divided into subsets, and the same operation is performed on each subset on a different core.
Task parallelism - Different operations are performed on the same data on different cores.
Many user-level threads (threads created in the application using a thread library, explained later) are mapped onto a single kernel-level thread. This model has the following problems:-
- A blocking call on one user-level thread blocks all the threads.
- No true concurrency.
- Inefficient use of multi-core architectures.
A single user-level thread is mapped onto a single kernel-level thread. This has the following advantages over the many-to-one model:-
- A blocking call on one thread doesn't block any other thread.
- True concurrency.
- Efficient use of multi-core systems.
But it has the overhead of creating one kernel-level thread per user-level thread.
Many user-level threads are mapped onto an equal or smaller number of kernel-level threads. This solves the problem of the overhead of creating kernel-level threads.
This model has a variant, the two-level model, which combines the many-to-many and one-to-one models.
In it, certain threads of a process are bound to a specific kernel-level thread until they finish execution.
A thread library is an API for programmers to create and manage threads in their applications.
There can be two approaches for implementing thread library:-
The first approach is to provide a library entirely in user space with no kernel support. All code and data structures for the library exist in user space. This means that invoking a function in the library results in a local function call in user space and not a system call.
The second approach is to implement a kernel-level library supported directly by the operating system. In this case, code and data structures for the library exist in kernel space. Invoking a function in the API for the library typically results in a system call to the kernel.
There are 3 main thread libraries:-
- POSIX Pthreads - May be a user-level or a kernel-level library. Mostly used by Linux/Unix-based operating systems.
- Windows - Kernel-level.
- Java - Threads are created and managed directly in Java programs. Since the JVM itself runs on a host OS, Java threads are typically implemented using a thread library available on that OS.
There are 2 strategies for thread creation:-
- Asynchronous - The parent thread creates the child and then executes independently of it, so there is typically little data sharing between parent and child.
- Synchronous - The parent thread waits for the child thread to finish its execution; more data sharing occurs here.
- The POSIX standard for thread creation and synchronization.
- These are mere specifications of thread behavior, not an implementation.
- Mostly implemented by UNIX-type systems.
- Windows doesn't support it natively.
- Similar to Pthread creation and management in many ways.
- Differences in method names. For example, the counterpart of the pthread_join() function here is the Thread.join() method.
There are 2 techniques for implementing Java threads:-
- Derive a new class from the Thread class and override its run() method.
- Implement the Runnable interface.
The JVM hides the implementation details of the underlying OS and provides a consistent, abstract environment that allows Java programs to run on any platform.
All of the above user-level thread creation and management falls into the category of explicit threading, where the programmer creates and manages threads.
Another way to create and manage threads is to transfer their creation and management from application developers to compilers and run-time libraries. This strategy is known as implicit threading.
2 common strategies for implicit threading are:-
- Thread pools
- OpenMP
There were a few difficulties with explicit threading:-
- How many threads to create in order to use a multi-core architecture efficiently?
- The time taken to create each thread.
The general idea behind a thread pool is to create a number of threads at process startup and place them into a pool, where they sit and wait for work. When a server receives a request, it awakens a thread from this pool—if one is available—and passes it the request for service. Once the thread completes its service, it returns to the pool and awaits more work. If the pool contains no available thread, the server waits until one becomes free.
Benefits of a thread pool:-
- Servicing a request with an existing thread is faster than waiting to create a thread.
- It limits the number of threads, which benefits systems that cannot support a large number of threads.
- These are a set of compiler directives, as well as an API, that provide support for parallel programming.
- It identifies parallel regions in the program and executes them in parallel.
- We can also control the number of threads created and the data shared between the threads.
Finally, it's time to delve into the issues that arise with threading.
fork() - The question is whether all the threads of the process are duplicated, or whether the new process becomes single-threaded. Some UNIX systems provide both versions of fork().
exec() - The exec() statement still works almost the same way, i.e. the program specified as a parameter replaces the whole process, including all its threads.
Signals are used to mark the occurrence of an event during the execution of a process.
There are 2 types of handlers:-
- Default signal handler - The kernel runs this when handling the signal.
- User-defined signal handler - A user-defined handler overrides the default signal handler.
In single threaded program, all the signals are delivered to the process.
In multi threaded program, there are 4 options:-
- Deliver the signal to the thread to which the signal applies.
- Deliver the signal to every thread in the process.
- Deliver the signal to certain threads in the process.
- Assign a specific thread to receive all signals for the process.
In the case of synchronous signals, the signal needs to be delivered to the thread to which it applies.
In the case of asynchronous signals, if the signal affects all the threads, it is delivered to every thread; if it affects only a certain thread, it is delivered to that thread.
Windows doesn't explicitly provide signal handling but emulates it through asynchronous procedure calls (APCs). An APC is delivered to a particular thread rather than to the process.
The thread which is to be cancelled is known as target thread.
There are 2 strategies for thread cancellation:-
- Asynchronous cancellation - One thread immediately terminates the target thread, resulting in abrupt termination.
- Deferred cancellation - The target thread periodically checks whether it should terminate, allowing it to terminate in an orderly fashion.
pthread_cancel(tid) only requests cancellation of a thread. The actual cancellation depends on how the target thread is set up to handle the request, i.e. deferred or asynchronous.
Threads belonging to a process share the data of the process. However, in some circumstances, each thread might need its own copy of certain data. We will call such data thread-local storage(TLS). Most thread libraries—including Windows and Pthreads—provide some form of support for thread-local storage. Java provides support as well.
For a user-level thread to be executed, it has to communicate with a kernel-level thread. The scheme for this communication is known as scheduler activation.
- The kernel provides the application with a set of virtual processors known as lightweight processes (LWPs).
- The application can schedule user threads onto the LWPs.
- The kernel must inform the application about certain events; this is known as an upcall.
- Upcalls are handled by the thread library with an upcall handler, and upcall handlers must run on a virtual processor.
- When such an event occurs (say, a thread is about to block), the kernel makes an upcall to the application informing it that a thread is about to block and identifying the specific thread.
- The kernel then allocates a new virtual processor to the application.
- The application runs an upcall handler on this new virtual processor, which saves the state of the blocking thread and relinquishes the virtual processor on which the blocking thread is running.
- The upcall handler then schedules another thread that is eligible to run on the new virtual processor.
- When the event that the blocking thread was waiting for occurs, the kernel makes another upcall to the thread library informing it that the previously blocked thread is now eligible to run.
- The upcall handler for this event also requires a virtual processor, and the kernel may allocate a new virtual processor.
- After marking the unblocked thread as eligible to run, the application schedules an eligible thread to run on an available virtual processor.
That's it for the basics of threading. Hope you had a good read.