DEV Community

Girish Talekar
Internals of goroutines and Channels

Why is Go popular? Largely because of its concurrency support. And how does Go achieve concurrency? Through goroutines.

Before we get into the details of goroutines, we need to understand threads and processes.

What is the difference between a process and a thread?

How a process works

The program has been loaded into the computer’s memory in binary form. Now what?

An executing program needs more than just the binary code that tells the computer what to do. The program needs memory and various operating system resources in order to run. A “process” is what we call a program that has been loaded into memory along with all the resources it needs to operate. The “operating system” is the brains behind allocating all these resources, and comes in different flavors such as macOS, iOS, Microsoft Windows, Linux, and Android. The OS handles the task of managing the resources needed to turn your program into a running process.


A process runs independently and in isolation from other processes; it cannot directly access shared data in other processes. Switching from one process to another takes (relatively) significant time, spent saving and loading registers, memory maps, and other resources. This isolation is why you have always been able to quit one program without affecting the others.

There are three main costs of a process switch:

  • First, the kernel needs to store the contents of all the CPU registers for that process, then restore the values for another process.
  • The kernel also needs to flush the CPU’s mappings from virtual memory to physical memory, as these are only valid for the current process.
  • Finally, there is the cost of the operating system context switch, and the overhead of the scheduler function choosing the next process to occupy the CPU.

There are a surprising number of registers in a modern processor. Because a process switch can occur at any point in a process’s execution, the operating system needs to store the contents of all of these registers, because it does not know which are currently in use.

How threads work

A process can have multiple threads, all executing within that process’s address space and sharing its resources.

(Figure: single-threaded vs. multi-threaded process)

Because threads share the same address space as the process and the other threads within the process, the operational cost of communication between the threads is low, which is an advantage. The disadvantage is that a problem with one thread in a process will certainly affect other threads and the viability of the process itself.

Concurrency vs parallelism

Concurrency is about dealing with many things at once (structuring a program as independently executing tasks); parallelism is about doing many things at once (actually executing them simultaneously on multiple cores). A concurrent program may or may not run in parallel.

How goroutines work

Goroutines take the idea of threads a step further.

Goroutines are cooperatively scheduled, rather than relying on the kernel to manage their time sharing.

The switch between goroutines only happens at well defined points, when an explicit call is made to the Go runtime scheduler.

The compiler knows which registers are in use and saves them automatically.

goroutine scheduling points

Because the heap and stack overwriting each other would be catastrophic, the operating system usually arranges to place an area of unwritable memory between the stack and the heap, to ensure that if they do collide, the program will abort.

This is called a guard page, and it effectively limits the stack size of a process, usually to the order of several megabytes.

Thread stacks and guard pages

Because it is hard to predict the stack requirements of a particular thread, a large amount of memory is reserved for each thread’s stack along with a guard page. The downside is that as the number of threads in your program increases, the amount of available address space is reduced.

goroutine stacks

Internals of Go Scheduling

Arguably, one of the more important aspects of the Go runtime is the goroutine scheduler. The runtime keeps track of each goroutine, and will schedule them to run in turn on a pool of threads belonging to the process. Goroutines are separate from threads but rely upon them to run, and scheduling goroutines onto threads effectively is crucial for the efficient performance of Go programs. The idea behind goroutines is that they are capable of running concurrently, like threads, but are also extremely lightweight in comparison. So, while there might be multiple threads created for a process running a Go program, the ratio of goroutines to threads should be much higher than 1-to-1. Multiple threads are often necessary to ensure that goroutines are not unnecessarily blocked. When one goroutine makes a blocking call, the thread running it must block. Therefore, at least one more thread should be created by the runtime to continue the execution of other goroutines that are not in blocking calls. Multiple threads are allowed to run in parallel up to a programmer-defined maximum, which is stored in the variable GOMAXPROCS[6].

It is important to keep in mind that all the OS sees is a single user-level process requesting and running multiple threads. The concept of scheduling goroutines onto these threads is merely a construct in the virtual environment of the runtime. When we refer to the Go runtime and scheduler in this paper we are referring to these higher-level entities, which are completely separate from the operating system.

In the Go runtime, there are three main C-structs that help keep track of everything and support the runtime and scheduler:

THE G STRUCT

A G struct represents a single goroutine[9]. It contains the fields necessary to keep track of its stack and current status. It also contains references to the code that it is responsible for running. See figure 2.

THE M STRUCT

The M struct is the Go runtime’s representation of an OS thread[9]. It has pointers to fields such as the global queue of G’s, the G that it is currently running, its own cache, and a handle to the scheduler. See figure 3.

THE SCHED STRUCT

The Sched struct is a single, global struct[9] that keeps track of the different queues of G’s and M’s and some other information the scheduler needs in order to run, such as the global Sched lock. There are two queues containing G structs: one is the runnable queue where M’s can find work, and the other is a free list of G’s. There is only one queue pertaining to M’s that the scheduler maintains; the M’s in this queue are idle and waiting for work. In order to modify these queues, the global Sched lock must be held. See figure 4.

The runtime starts out with several G’s. One is in charge of garbage collection, another is in charge of scheduling, and one represents the user’s Go code. Initially, one M is created to kick off the runtime. As the program progresses, more G’s may be created by the user’s Go program, and more M’s may become necessary to run all the G’s. As this happens, the runtime may provision additional threads up to GOMAXPROCS. Hence at any given time, there are at most GOMAXPROCS active M’s.

Since M’s represent threads, an M is required to run a goroutine. An M without a currently associated G will pick up a G from the global runnable queue and run the Go code belonging to that G. If the Go code requires the M to block, for instance by invoking a system call, then another M will be woken up from the global queue of idle M’s. This is done to ensure that goroutines, still capable of running, are not blocked from running by the lack of an available M.

System calls force the calling thread to trap to the kernel, causing it to block for the duration of the system call execution. If the code associated with a G makes a blocking system call, the M running it will be unable to run it or any other G until the system call returns. M’s do not exhibit the same blocking behavior for channel communication, even though goroutines block on channel communication. The operating system does not know about channel communication, and the intricacies of channels are handled purely by the runtime. If a goroutine makes a channel call, it may need to block, but there is no reason that the M running that G should be forced to block as well. In a case such as this, the G’s status is set to waiting and the M that was previously running it continues running other G’s until the channel communication is complete. At that point the G’s status is set back to runnable and will be run as soon as there is an M capable of running it.
