DEV Community

Satyajiijt Roy


Golang - Garbage Collection in General

As we all know, Go is a garbage-collected language, like Java, Python, C#, and others. It is also statically typed.

What is Garbage Collection and Why is it Needed

Many articles have been written on this subject, so I will keep this part short and try to dig a little deeper into the concept.

In most programming languages, it is possible to create garbage: allocated space that is no longer usable by the program.

When software runs on your computer, it uses two important regions of memory: the stack and the heap.
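Here is a minimal sketch of a program that uses both regions. It assumes a RandomBox struct and a GenerateRandomBox() function that returns a pointer; the field names are hypothetical and only there for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// RandomBox is a small struct; its fields are hypothetical.
type RandomBox struct {
	Width, Height int
}

// GenerateRandomBox returns a pointer to a new RandomBox. Because the
// pointer outlives this function call, escape analysis will normally
// decide that the struct has to live on the heap.
func GenerateRandomBox() *RandomBox {
	return &RandomBox{
		Width:  rand.Intn(100) + 1, // 1..100
		Height: rand.Intn(100) + 1, // 1..100
	}
}

func main() {
	// box itself (the pointer) is a local variable on main's stack;
	// the struct it points to is what the garbage collector manages.
	box := GenerateRandomBox()
	fmt.Println(box.Width, box.Height)
}
```

Whether a value really escapes depends on the compiler version and on inlining. Building with `go build -gcflags=-m` prints the compiler's escape analysis decisions, so you can check rather than guess.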

The code above has a RandomBox type and a GenerateRandomBox() function that returns a RandomBox. The reference is allocated on the stack and the struct data on the heap. Go uses escape analysis to determine that.

When the reference goes away, the struct data on the heap can no longer be reached, and we end up with garbage. This garbage needs to be cleaned up from time to time.

In the simplest strategy, when the program's memory footprint reaches a certain threshold, the whole application is suspended. The garbage collector scans all the objects that have been allocated, reclaims the ones that are no longer in use, and then lets the user program continue. Go also implemented garbage collection this way in its early days, but today's implementation is much more sophisticated.

Garbage collection can be initiated manually or automatically, depending on the programming language you are using. Each language's compiler, interpreter, or runtime uses a specific algorithm to perform it. The basic idea is the same in a compiled language as in an interpreted one.

Go uses a tracing garbage collector, even though its code is compiled to machine code ahead of time. More specifically, Go uses a concurrent mark-and-sweep algorithm.

What happens in Garbage Collection Process

Go’s garbage collector is called concurrent because it can safely run at the same time as the main program. When the runtime decides that it is time to run a collection, based on certain conditions (discussed below), it goes through the following phases.

Mark Setup

  • The collector first stops everything; this is literally called a stop-the-world (STW) step. Nothing gets done in your application during this time.
  • With all goroutines stopped, it enables the write barrier. The write barrier does not block writes to memory. It lets the collector keep track of the pointer writes that goroutines make while marking runs alongside them, so that no live data is lost. Every goroutine has to be stopped for the barrier to be turned on.

  • Once the STW is performed and the write barrier is on, the application goroutines are started again and the collector moves on to the next phase.

Marking

In this phase, the following happens:

  • Inspect the stacks to find the root pointers into the heap
  • Traverse the heap from those roots and mark the values that are still in use
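The two steps above can be sketched with a toy object graph. This is not how the Go runtime represents objects; the `object` type, the `mark` function, and the demo heap are all hypothetical, and the real collector does this work concurrently with the program.

```go
package main

import "fmt"

// object is a toy heap object: it only records which other objects
// it points to and whether the marker has reached it.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark walks the object graph starting from the roots and flags every
// object it can reach. Anything left unmarked is garbage.
func mark(roots []*object) {
	// Use an explicit work list instead of recursion.
	work := append([]*object(nil), roots...)
	for len(work) > 0 {
		obj := work[len(work)-1]
		work = work[:len(work)-1]
		if obj == nil || obj.marked {
			continue // already visited, or a nil pointer
		}
		obj.marked = true
		work = append(work, obj.refs...)
	}
}

// demoHeap builds four objects: a -> b is reachable from the root,
// while c and d only point at each other and are unreachable.
func demoHeap() (heap []*object, roots []*object) {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"}
	d := &object{name: "d"}
	a.refs = []*object{b}
	c.refs = []*object{d}
	d.refs = []*object{c}
	return []*object{a, b, c, d}, []*object{a}
}

func main() {
	heap, roots := demoHeap()
	mark(roots)
	for _, o := range heap {
		fmt.Printf("%s marked=%v\n", o.name, o.marked)
	}
}
```

Note that c and d reference each other but are still left unmarked. A tracing collector only cares about reachability from the roots, so cycles do not keep garbage alive.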

The collector does its work in goroutines, just like our code, and it takes 25% of the available CPU capacity for itself. For example, with 4 OS threads available, 1 thread will be dedicated to the collector.
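As a rough arithmetic sketch of that 25% figure, the snippet below computes the share from GOMAXPROCS. The `gcCPUBudget` helper is hypothetical and is not the runtime's actual worker-scheduling code, which also has to deal with processor counts that do not divide evenly.

```go
package main

import (
	"fmt"
	"runtime"
)

// gcCPUBudget returns how many processors' worth of CPU time a 25%
// share corresponds to. Hypothetical helper for illustration only.
func gcCPUBudget(procs int) float64 {
	return 0.25 * float64(procs)
}

func main() {
	// GOMAXPROCS(0) reports the current setting without changing it.
	procs := runtime.GOMAXPROCS(0)
	fmt.Printf("GOMAXPROCS=%d, collector share is about %.2f processors\n",
		procs, gcCPUBudget(procs))
}
```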

If the collector finds that it might run out of memory before it finishes, because some goroutine is allocating faster than the collector can mark, it picks that goroutine and asks it to help with marking. This process is called Mark Assist.

Mark Termination

Here the collector again performs an STW, turns the write barrier off, performs some clean-up, and calculates when the next garbage collection should happen.

Note: The goal is to keep each STW pause at or below roughly 100 microseconds on every collection the collector performs.
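Pause times can be observed from inside a program. The sketch below uses `debug.ReadGCStats` from the standard `runtime/debug` package; the `lastPause` helper is a hypothetical name, and the numbers you see will vary with the machine and the workload.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// lastPause forces one collection, then returns the most recent pause
// duration and the total number of collections so far.
func lastPause() (time.Duration, int64) {
	runtime.GC() // blocks until the collection is complete

	var stats debug.GCStats
	debug.ReadGCStats(&stats)
	if len(stats.Pause) == 0 {
		return 0, stats.NumGC
	}
	// Pause holds the pause history, most recent first.
	return stats.Pause[0], stats.NumGC
}

func main() {
	pause, n := lastPause()
	fmt.Printf("collections so far: %d, most recent pause: %v\n", n, pause)
}
```

Running a program with the environment variable `GODEBUG=gctrace=1` prints a similar summary for every collection without changing any code.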

Once the Mark Termination phase is complete (STW and write barrier are off), the application starts working again with all the OS threads available.

This is what happens on every collection. You may object that the collector marked the values that are still in use, and so identified the dangling ones, but it did not clean them up. That part is called Sweeping.

Myth Buster: Sweeping is not part of Garbage Collection, it happens outside of collection process.

Sweeping

This is the process in which we reclaim the memory allocations that were not marked. We need to get that memory back; that is the whole purpose of garbage collection.

  • Reclaiming the unused locations happens when a new allocation is made. So, technically, the latency of sweeping is not added to the garbage collection itself.
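The idea of sweeping at allocation time can be sketched with a toy heap of fixed slots. Everything here (`slot`, `toyHeap`, `alloc`) is hypothetical and far simpler than the runtime's span-based allocator; it only shows how the cost of reclaiming memory can be paid by the allocation that needs the space.

```go
package main

import "fmt"

// slot is one toy heap cell. inUse means it holds an object; marked
// means the last mark phase found that object reachable.
type slot struct {
	inUse  bool
	marked bool
}

// toyHeap is a fixed set of slots plus a cursor that remembers how far
// the lazy sweep has progressed.
type toyHeap struct {
	slots  []slot
	cursor int
}

// alloc returns the index of a slot for a new object, or -1 if the heap
// is full. The sweep happens here: a slot that is in use but unmarked
// is garbage, so it is reclaimed on the spot and handed out again.
func (h *toyHeap) alloc() int {
	for ; h.cursor < len(h.slots); h.cursor++ {
		s := &h.slots[h.cursor]
		if s.inUse && s.marked {
			s.marked = false // survivor: clear the mark for the next cycle
			continue
		}
		// Either free already, or unmarked garbage: reuse it.
		s.inUse = true
		idx := h.cursor
		h.cursor++
		return idx
	}
	return -1
}

// newDemoHeap returns a heap as it might look right after marking:
// slots 0 and 2 are live, slot 1 is garbage, slot 3 was never used.
func newDemoHeap() *toyHeap {
	return &toyHeap{slots: []slot{
		{inUse: true, marked: true},
		{inUse: true, marked: false},
		{inUse: true, marked: true},
		{inUse: false, marked: false},
	}}
}

func main() {
	h := newDemoHeap()
	fmt.Println(h.alloc()) // reuses the garbage slot
	fmt.Println(h.alloc()) // takes the free slot
	fmt.Println(h.alloc()) // nothing left to sweep
}
```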

Garbage Collection Triggers

  • The first metric the garbage collector watches is the growth of the heap. By default, a collection runs when the heap has doubled in size since the previous collection.
  • The second metric is the time elapsed between two collections. If no collection has been triggered for more than two minutes, one cycle is forced.
  • Memory allocation can also trigger a collection. The runtime.mallocgc function, which divides heap objects by size into micro, small, and large objects, can start a new garbage collection cycle when it creates any of these three kinds.

  • A collection can be triggered manually by calling runtime.GC().
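In Go, the manual trigger is `runtime.GC()`, and `runtime.ReadMemStats` exposes the counters behind the heap-growth trigger. The sketch below is a minimal illustration; `forceGC` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"runtime"
)

// forceGC triggers a collection manually and reports how many cycles
// completed while it ran (at least one).
func forceGC() uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	runtime.GC() // blocks until the collection is complete
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	fmt.Println("cycles completed:", forceGC())

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// NextGC is the heap size the runtime is aiming at for the next cycle.
	fmt.Printf("heap in use: %d bytes, next GC target: %d bytes\n",
		m.HeapAlloc, m.NextGC)
}
```

Calling `runtime.GC()` by hand is rarely needed in production code; it is mostly useful in tests and benchmarks where you want a known starting point.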

Garbage Collector knobs

Unlike Java, Go was designed so that developers should not have to retune the garbage collector whenever they move to different hardware. So only one tuning knob was provided: the GOGC environment variable, which can also be set at run time with debug.SetGCPercent. The default for GOGC is 100 (percent). Go 1.19 later added a second knob, a soft memory limit set through GOMEMLIMIT.
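The standard library function for this knob is `debug.SetGCPercent` in `runtime/debug`. The sketch below shows it next to a simplified version of the heap-growth arithmetic; `heapGoal` and `roundTripGCPercent` are hypothetical helpers, and the real runtime's calculation takes more inputs than the live heap alone.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// heapGoal is a simplified model of the heap-growth trigger: the next
// collection is due when the heap reaches live * (1 + GOGC/100).
// It assumes gogc >= 0.
func heapGoal(liveBytes uint64, gogc int) uint64 {
	return liveBytes + liveBytes*uint64(gogc)/100
}

// roundTripGCPercent sets the GC percent to pct, then restores the
// previous value. It returns the value that was in effect in between.
func roundTripGCPercent(pct int) int {
	old := debug.SetGCPercent(pct) // returns the previous setting
	return debug.SetGCPercent(old) // returns pct, restores old
}

func main() {
	// With the default GOGC=100, a 4 MiB live heap gives an 8 MiB goal.
	fmt.Println(heapGoal(4<<20, 100))
	// With GOGC=50 the collector runs sooner, at 6 MiB.
	fmt.Println(heapGoal(4<<20, 50))

	fmt.Println("setting in effect during the round trip:", roundTripGCPercent(50))
}
```

A lower value trades CPU time for a smaller heap, and a higher value does the opposite. Setting the environment variable `GOGC=off` disables the collector entirely.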

GC Pacer

A pacer algorithm tries to figure out when to start the next collection. As noted above, this calculation happens during the Mark Termination phase of a collection. If the pacer finds that there is an advantage in starting the collection before the trigger condition is met, it will do so. The takeaway is that a collection can start earlier than the calculated point if the pacer thinks that is beneficial.

I hope this post provided a little more in-depth insight into the garbage collection process and how it is implemented in Go.

Happy Learning!!
