In this article, I am going to present some examples showing why you should be cautious about how the garbage collector operates. The point of the article is to understand that the way you store pointers has a great impact on the performance of the garbage collector, especially when you are dealing with very large numbers of pointers.
The presented examples use pointers, slices and maps, which are all native Go data types.
What is Garbage Collection?
Garbage collection is the process of freeing up memory space that is not being used. In other words, the garbage collector finds the objects that can no longer be referenced and frees the memory they occupy. This process happens concurrently while a Go program is running, not before or after the execution of the program. According to the documentation of the Go garbage collector:
"The GC runs concurrently with mutator threads, is type accurate (also known as precise), allows multiple GC threads to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is non-generational and non-compacting. Allocation is done using size segregated per P allocation areas to minimize fragmentation while eliminating locks in the common case."
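To watch the collector from inside a program, the runtime exposes its counters through runtime.ReadMemStats. The following is a minimal sketch, not part of the original programs: the helper names numGC and cyclesAroundForcedGC are hypothetical, and it only shows that an explicit runtime.GC() call completes a collection cycle before it returns.

```go
package main

import (
	"fmt"
	"runtime"
)

// numGC reports how many collection cycles the runtime has completed so far.
func numGC() uint32 {
	var stats runtime.MemStats
	runtime.ReadMemStats(&stats)
	return stats.NumGC
}

// cyclesAroundForcedGC (a hypothetical helper) returns the cycle count
// before and after one explicit call to runtime.GC().
func cyclesAroundForcedGC() (before, after uint32) {
	before = numGC()
	runtime.GC() // blocks until the forced collection has finished
	after = numGC()
	return before, after
}

func main() {
	before, after := cyclesAroundForcedGC()
	fmt.Printf("completed GC cycles: %d before, %d after runtime.GC()\n", before, after)
}
```

The same MemStats structure also carries fields such as PauseTotalNs and HeapAlloc, which can be printed in the same way when comparing the programs in this article.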
Using a Slice
In this example, we will use a slice to store a large number of structures. Each structure stores two integer values. The Go code is as follows:
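package main

import "runtime"

// data holds two integers and no pointers.
type data struct {
	i, j int
}

func main() {
	var N = 40000000
	var str []data
	for i := 0; i < N; i++ {
		value := i
		str = append(str, data{value, value})
	}
	runtime.GC()
	// Use str after the collection so that it stays reachable.
	_ = str[0]
}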
The last statement (_ = str[0]) prevents the garbage collector from collecting the str variable too early, as it is otherwise not referenced after the for loop. The same technique is used in the three Go programs that follow. Apart from this important detail, a for loop puts values into structures that are appended to the slice.
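The standard library also offers runtime.KeepAlive, which states the same intent explicitly. The sketch below is an alternative to the _ = str[0] statement rather than what the original programs use. The fill helper is hypothetical, the element count is kept small so the example runs quickly, and the slice is preallocated with make, which the original program does not do.

```go
package main

import (
	"fmt"
	"runtime"
)

type data struct {
	i, j int
}

// fill (a hypothetical helper) builds a slice of n data values,
// mirroring the loop in the program above.
func fill(n int) []data {
	s := make([]data, 0, n) // preallocation is an addition, not in the original
	for i := 0; i < n; i++ {
		s = append(s, data{i, i})
	}
	return s
}

func main() {
	str := fill(1000) // small count for a quick run; the article uses 40000000
	runtime.GC()
	// KeepAlive marks str as reachable at least until this point.
	runtime.KeepAlive(str)
	fmt.Println("elements kept alive across the collection:", len(str))
}
```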
Using a Map with Pointers
In this example, we are going to use a map for storing pointers to integers. The program contains the following Go code:
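package main

import "runtime"

func main() {
	var N = 40000000
	myMap := make(map[int]*int)
	for i := 0; i < N; i++ {
		// value is a new variable in each iteration, so every pointer is distinct.
		value := i
		myMap[value] = &value
	}
	runtime.GC()
	// Use myMap after the collection so that it stays reachable.
	_ = myMap[0]
}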
The name of the map that stores the integer pointers is myMap. A for loop is used for putting the integer pointers into the map.
Using a Map Without Pointers
In this example, we are going to use a map that stores plain integer values instead of pointers. The Go code is as follows:
package main

import "runtime"

func main() {
	var N = 40000000
	myMap := make(map[int]int)
	for i := 0; i < N; i++ {
		value := i
		myMap[value] = value
	}
	runtime.GC()
	_ = myMap[0]
}
As before, a for loop is used for putting the integer values into the map.
Splitting the Map
The implementation of this section will split the map into a slice of maps, a technique that is also called sharding. The program of this section contains the following Go code:
package main

import "runtime"

func main() {
	var N = 40000000
	// split holds 200 smaller maps (shards) instead of one big map.
	split := make([]map[int]int, 200)
	for i := range split {
		split[i] = make(map[int]int)
	}
	for i := 0; i < N; i++ {
		value := i
		// The key modulo 200 selects the shard.
		split[i%200][value] = value
	}
	runtime.GC()
	_ = split[0][0]
}
This time, we are using two for loops: one for creating the slice of maps and another one for storing the desired data in it.
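The shard selection can also be wrapped in a small type so that callers do not repeat the modulo arithmetic. This is only a sketch: shardedMap, newShardedMap, set and get are hypothetical names, and it assumes non-negative keys, as in the program above.

```go
package main

import "fmt"

// shardedMap is a hypothetical wrapper around the slice of maps used above.
type shardedMap []map[int]int

// newShardedMap creates the requested number of empty shards.
func newShardedMap(shards int) shardedMap {
	s := make(shardedMap, shards)
	for i := range s {
		s[i] = make(map[int]int)
	}
	return s
}

// set stores value under key in shard key % len(s).
// Assumption: key is non-negative; a negative key would yield a negative index.
func (s shardedMap) set(key, value int) {
	s[key%len(s)][key] = value
}

// get looks up key in the shard that set would have chosen for it.
func (s shardedMap) get(key int) (int, bool) {
	value, ok := s[key%len(s)][key]
	return value, ok
}

func main() {
	split := newShardedMap(200)
	for i := 0; i < 1000; i++ {
		split.set(i, i)
	}
	value, ok := split.get(205)
	fmt.Println("key 205:", value, ok)
}
```

Keeping the shard count inside the value (len(s)) rather than as a literal 200 means set and get cannot disagree about which shard a key belongs to.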
Comparing the Performance of the Presented Techniques
As all four programs use huge data structures, they consume large amounts of memory. Programs that consume lots of memory trigger the Go garbage collector more often. So, in this section we are going to compare the performance of each of these four implementations using the time(1) command.
What matters in the output is not the exact numbers but the time difference between the four approaches. Here we go:
$ time go run 1_sliceGC.go
real 0m1.511s
user 0m0.000s
sys 0m0.015s
$ time go run 2_mapStar.go
real 0m10.395s
user 0m0.000s
sys 0m0.015s
$ time go run 3_mapNoStar.go
real 0m8.227s
user 0m0.000s
sys 0m0.015s
$ time go run 4_mapSplit.go
real 0m8.028s
user 0m0.000s
sys 0m0.015s
So, it turns out that maps slow down the Go garbage collector, whereas slices cooperate much better with it. It should be noted here that this is not a problem with maps but a result of the way the Go garbage collector works. However, unless you are dealing with maps that store huge amounts of data, this problem will not become evident in your programs.
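Note that time(1) combined with go run measures the whole run, including compilation and the loop that fills the data structure. To look at the collector alone, one option is to time the runtime.GC() call from inside the program. The following is a rough sketch under stated assumptions: measureGC is a hypothetical helper, and the element count is far smaller than in the article so that it finishes quickly. The numbers it prints will vary between machines and Go versions.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// measureGC (a hypothetical helper) builds a data structure, then times one
// forced collection while that structure is still reachable.
func measureGC(build func() interface{}) time.Duration {
	runtime.GC() // clear garbage left over from earlier measurements
	v := build()
	start := time.Now()
	runtime.GC()
	elapsed := time.Since(start)
	runtime.KeepAlive(v) // v must stay live during the timed collection
	return elapsed
}

func main() {
	const n = 1000000 // assumption: much smaller than the article's 40000000

	sliceTime := measureGC(func() interface{} {
		var s []int
		for i := 0; i < n; i++ {
			s = append(s, i)
		}
		return s
	})

	pointerMapTime := measureGC(func() interface{} {
		m := make(map[int]*int)
		for i := 0; i < n; i++ {
			value := i
			m[value] = &value
		}
		return m
	})

	valueMapTime := measureGC(func() interface{} {
		m := make(map[int]int)
		for i := 0; i < n; i++ {
			m[i] = i
		}
		return m
	})

	fmt.Println("slice of ints: ", sliceTime)
	fmt.Println("map[int]*int:  ", pointerMapTime)
	fmt.Println("map[int]int:   ", valueMapTime)
}
```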
Thanks for reading :)