Even though memory allocations are not always easy to spot, they are fairly expensive due to allocator overhead and the extra load they put on the garbage collector. Consider a seemingly innocent function, `print_int`:
```go
package main

import "fmt"

func print_int(x int) {
	fmt.Println(x)
}
```
compiler claims (-gcflags="-m"
)
x escapes to heap
so the `runtime.convT64` function is used to convert `x` into a pointer:
```
00007 (6) CALL runtime.convT64(SB)
```
which is implemented in `runtime/iface.go`:
```go
func convT64(val uint64) (x unsafe.Pointer) {
	if val < uint64(len(staticuint64s)) {
		x = unsafe.Pointer(&staticuint64s[val])
	} else {
		x = mallocgc(8, uint64Type, false)
		*(*uint64)(x) = val
	}
	return
}
```
The most interesting part is `x = unsafe.Pointer(&staticuint64s[val])`, which returns a pointer into `staticuint64s`, a preallocated pool of the integers from 0 to 255:
```go
// staticuint64s is used to avoid allocating in convTx for small integer values.
var staticuint64s = [...]uint64{
	0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
	0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
	...
	0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff,
}
```
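A quick way to observe the cache in action is to box a value into an `interface{}` and count allocations. The sketch below (mine, not from the runtime) uses `testing.AllocsPerRun` and derives the values from `os.Args` so the compiler cannot constant-fold the conversions into static data:

```go
package main

import (
	"fmt"
	"os"
	"testing"
)

var sink interface{}

func main() {
	// len(os.Args) is opaque to the compiler, so these conversions
	// cannot be replaced with pointers to static compile-time data.
	small := uint64(len(os.Args) % 256) // always within [0, 255]
	large := small + 256                // always outside the cache

	// Expected 0: the boxed value is served from staticuint64s.
	fmt.Println(testing.AllocsPerRun(1000, func() { sink = small }))
	// Expected 1: convT64 falls back to mallocgc on every conversion.
	fmt.Println(testing.AllocsPerRun(1000, func() { sink = large }))
}
```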
It's a fairly cheap way to trade a little static memory for fewer allocations, and the same idea shows up in other managed languages: Java, for example, keeps a cache of boxed small Integer values. To make it even more useful, the Go runtime reuses the same table for the `convT16` and `convT32` functions.
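One way to see that the table is shared is to compare the data words of two interfaces holding the same small value boxed at different widths. The `dataPtr` helper below is purely illustrative: it peeks at the gc runtime's `eface` layout, which is not a stable API, and the pointers only coincide on little-endian builds (on big-endian, `convT16` adds a byte offset):

```go
package main

import (
	"fmt"
	"os"
	"unsafe"
)

// dataPtr returns the data word of an empty interface by reinterpreting
// its in-memory layout. Illustration only; not portable or supported.
func dataPtr(i interface{}) unsafe.Pointer {
	return (*[2]unsafe.Pointer)(unsafe.Pointer(&i))[1]
}

func main() {
	n := len(os.Args) % 256 // opaque small value, avoids constant folding

	var a interface{} = uint64(n) // boxed via convT64
	var b interface{} = uint16(n) // boxed via convT16

	// On little-endian builds both point at the same staticuint64s entry.
	fmt.Println(dataPtr(a) == dataPtr(b))
}
```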
The technique is useful enough that it also appears in dynamic form as object pools. The extra flexibility of not being limited to a small, fixed range of values comes at the cost of synchronization, so it's important to measure that overhead when evaluating an object pool.
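For comparison, here is what a dynamic cache looks like with the standard library's `sync.Pool`. The buffer pool below is a common sketch of the pattern, not something from the runtime: it accepts values of any shape, but every `Get`/`Put` pair goes through per-P storage and synchronization machinery, which is exactly the overhead worth benchmarking:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool is a dynamic cache (object pool): unlike staticuint64s it can
// hold arbitrary values, at the price of Get/Put synchronization.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func main() {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // pooled buffers may hold stale contents
	buf.WriteString("hello")
	fmt.Println(buf.String())
	bufPool.Put(buf) // hand the buffer back for reuse
}
```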
In summary, consider using static preallocated caches to reduce the allocation count and improve performance.
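As a sketch of applying the idea in application code (my example, with illustrative names): the standard library's `strconv.Itoa` already returns precomputed strings for values below 100, and the same pattern extends naturally to any small, hot domain of keys:

```go
package main

import (
	"fmt"
	"strconv"
)

// smallStrings caches decimal strings for 0..255, trading a few kilobytes
// of static memory for zero-allocation lookups on the hot path.
var smallStrings = func() (s [256]string) {
	for i := range s {
		s[i] = strconv.Itoa(i)
	}
	return
}()

// itoa formats n, serving small non-negative values from the cache.
func itoa(n int) string {
	if 0 <= n && n < len(smallStrings) {
		return smallStrings[n]
	}
	return strconv.Itoa(n)
}

func main() {
	fmt.Println(itoa(42), itoa(1000)) // "42 1000"
}
```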