Quick Tutorial on Multi-threaded Programming in C

Hieu Nguyen. Originally published at devsurvival.com.

What is Multi-threaded Programming?

Multi-threaded programming is a design approach that splits work into smaller units and distributes them among a collection of workers, or threads, that can handle the individual units concurrently. Imagine that you and your friends are trying to make a pizza. It would not make sense for everyone to make the dough, then make the sauce, and then cook it. Instead, one person can make the dough while someone else makes the sauce and preheats the oven.

Thread vs Process

Process - When you run any program, you are in fact running a process. For instance, say you downloaded an application onto your computer. While it sits idle on disk (like a lot of things we download and never use), it is not a process. When you open the application and it is actively running, voilà! It becomes a process. Note: you can have more than one process of the same program.

Thread - A process always has at least one thread, commonly called the main or parent thread. However, a process can have more than one thread. A thread that is created by the main thread is called a child thread.

So what happens when you create a thread? Thread creation usually accepts a routine or function that is invoked when the thread begins executing. While the child thread is executing that function, the parent thread continues its own execution unless it explicitly waits for the child.

Memory Space

"Memory... is the diary that we all carry about with us." ~ Oscar Wilde

  • All programs consume memory (RAM)
  • A process has its own memory space
  • One process cannot access another process's memory space. If it could, it would cause corruption; in practice, touching memory outside your own address space typically ends in the common segmentation fault (sound familiar?)
  • The memory space of a process can grow and shrink based on the needs of the program
  • A section is a segment of a process's address/memory space

Section | Purpose
Data | Global and static variables
Code/Text | The program's compiled instructions
Heap | Dynamic memory allocation
Stack | Local variables, function invocations, and parameters

Multi-thread vs Multi-process

Maybe you have heard of multi-core or multi-processing. Multi-processing and multi-threading, despite their similarities, are different things. With multi-processing, you create multiple processes of the same program. With multi-threading, you create multiple child threads within one process, and those threads can run different routines or functions at the same time. There are pros and cons to each approach.

Multi-Process | Multi-thread
Each process has its own address space | Threads share the same data, heap, and code sections, but each has its own stack and registers
Overhead for creating a process | Thread creation is fast
Greater CPU utilization due to context switching (the CPU switching from one process to another) | Greater CPU utilization due to switching between threads
 | Failure to synchronize threads can result in corrupt data

Let's Learn by Example

I will try to solidify some of these concepts through a programming example. The program is very simple: it takes two sets of numbers and computes their combined sum. For A: [1,2,3] and B: [4,2,6], A+B = 1+2+3+4+2+6 = 18. Note: the order of addition does not matter.


#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>


int sum = 0; // Global int to store the sum of the two arrays
sem_t mutex; // Synchronization Bit

void *add(void *arg){
    int *ptr = (int *) arg;
    while(*ptr != -1){
        sem_wait(&mutex);
        sum += *ptr;
        printf("value: %d sum %d\n", *ptr,sum );
        sem_post(&mutex);
        ptr++;
    }
    return NULL;
}

int main(int argc, char *args[]){
    int A[4] = {1,2,3, -1}; // -1 marks the end of array
    int B[4] = {4,2,6, -1}; 

    pthread_t t_a, t_b;
    sem_init(&mutex, 0, 1);
    pthread_create(&t_a , NULL, add, A);
    pthread_create(&t_b, NULL, add, B);

    pthread_join(t_a, NULL);
    pthread_join(t_b, NULL);
    printf("Total: %d\n", sum);

    return 0;
}

Output (the exact interleaving of the two threads can differ from run to run)

value: 1 sum 1
value: 4 sum 5
value: 2 sum 7
value: 2 sum 9
value: 6 sum 15
value: 3 sum 18
Total: 18

Closer Look

Look at the main function

int main(int argc, char *args[]){
    int A[4] = {1,2,3, -1}; // -1 marks the end of array
    int B[4] = {4,2,6, -1}; 

    pthread_t t_a, t_b;
    sem_init(&mutex, 0, 1);
    pthread_create(&t_a , NULL, add, A);
    pthread_create(&t_b, NULL, add, B);

    pthread_join(t_a, NULL);
    pthread_join(t_b, NULL);
    printf("Total: %d\n", sum);

    return 0;
}

We initialize two arrays, A and B. Note that -1 marks the end of each array. We then declare two variables of type pthread_t: t_a and t_b. Next we call sem_init(&mutex, 0, 1), which initializes the mutex semaphore. For now you don't have to worry about sem_t mutex, sem_init(&mutex, 0, 1), sem_wait(&mutex), and sem_post(&mutex). In short, they are the key parts of synchronization: they protect the critical section.

We then call pthread_create(&t_a, NULL, add, A), which creates a new thread. The information pertaining to this thread is stored in t_a, and the thread runs the void *add(void *arg) function with the array int A[4] = {1,2,3,-1} as its argument. The same goes for pthread_t t_b and int B[4] = {4,2,6,-1}.

At this point all three threads are running concurrently. Wait. Why three? Remember, a process always has at least one thread: the parent thread. What if the parent thread finishes while the child threads are running? Welp, kids are no better than their parents: when main returns, the whole process exits, and sadly the child threads are terminated along with it.

To prevent this from happening, we use pthread_join(t_a, NULL), which causes the parent thread to wait for the child thread t_a to terminate before continuing.

So what do the child threads do?

The two child threads each run the function void *add(void *arg) to compute the sum. Note: sum is a global variable. Where do global variables live in the memory space? In the data section. Which sections are shared by threads? Data, heap, and code/text. My point is: both t_a and t_b can access int sum = 0.

int sum = 0; // Global int to store the sum of the two arrays
sem_t mutex; // Synchronization Bit

void *add(void *arg){
    int *ptr = (int *) arg;
    while(*ptr != -1){
        sem_wait(&mutex);
        sum += *ptr;
        printf("value: %d sum %d\n", *ptr,sum );
        sem_post(&mutex);
        ptr++;
    }
    return NULL;
}

First we have to convert the parameter void *arg into an int *ptr. We passed the array itself when creating the thread, and the name of an array, such as A in int A[4] = {1,2,3,-1}, is converted to a pointer to its first element when it is passed to a function.

We loop over the array. Ignore sem_wait(&mutex) and sem_post(&mutex); these functions are again part of synchronization and the critical section problem. For now, just pretend that they are not there.

We dereference ptr to read the current value, add it to sum, and move to the next element using ptr++.

Once both child threads finish looping through their arrays and terminate, the parent thread continues execution and prints the total:

    pthread_join(t_a, NULL);
    pthread_join(t_b, NULL);
    printf("Total: %d\n", sum);

    return 0;

I highly recommend copying the entire code section above and compiling it with gcc -o OUTPUT_FILE_NAME src.c -pthread. Here, src.c is the C source file you pasted the code into, OUTPUT_FILE_NAME is an arbitrary name for your executable, and -pthread tells gcc to compile and link with pthread support. Run the program using ./OUTPUT_FILE_NAME. Mess around with it and have fun.

originally posted at https://www.devsurvival.com/multi-threaded-programming/