0 Preface
JSON is a data format that most developers deal with regularly, typically in configuration files or network data transfer. Because it is simple, easy to understand, and readable, JSON has become one of the most common formats in the IT world. Like many other languages, Golang supports it at the standard library level through encoding/json.
Just as JSON itself is easy to understand, the encoding/json library for manipulating it is also easy to use. Still, I believe many developers have run into strange problems or bugs with it, as I did when I first used the library. This article summarizes the problems and mistakes I personally encountered when working with JSON in Golang. I hope it helps readers handle JSON in Golang more smoothly and avoid unnecessary pitfalls.
BTW, the content of this article is based on Go 1.22. There may be slight differences between versions, so keep that in mind when reading and applying it. Also, all cases in this article use encoding/json and do not involve any third-party JSON library.
1 Basic use
Let's briefly go over the basic use of the encoding/json library first.
As a data format, JSON has just two core operations: serialization and deserialization. Serialization converts a Go object into a string (or byte sequence) in JSON format. Deserialization is the opposite: it converts JSON data into a Go object.
The object mentioned here is a broad concept. It refers not only to struct values but also to slice and map data, which support JSON serialization as well.
Example:
import (
	"encoding/json"
	"fmt"
)

type Person struct {
	ID   uint
	Name string
	Age  int
}

func MarshalPerson() {
	p := Person{
		ID:   1,
		Name: "Bruce",
		Age:  18,
	}
	output, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	println(string(output))
}

func UnmarshalPerson() {
	str := `{"ID":1,"Name":"Bruce","Age":18}`
	var p Person
	err := json.Unmarshal([]byte(str), &p)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", p)
}
The core is the two functions json.Marshal and json.Unmarshal, which handle serialization and deserialization respectively. Both functions return an error; here I simply panic.
Readers who have used encoding/json might know that there is another pair of tools that are often used: NewEncoder and NewDecoder. These work on an io.Writer and an io.Reader instead of byte slices. A brief look at the source code shows that their underlying core logic is the same as Marshal and Unmarshal, so I won't go through them in detail here.
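For readers who have not met the streaming pair yet, here is a minimal sketch of how NewEncoder and NewDecoder are typically used. The helper names encodePerson and decodePerson are my own for illustration, and an in-memory buffer stands in for a real file or network connection.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

type Person struct {
	ID   uint
	Name string
	Age  int
}

// encodePerson (hypothetical helper) writes p as JSON to an in-memory
// buffer through json.NewEncoder. Encode appends a trailing newline.
func encodePerson(p Person) (string, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// decodePerson (hypothetical helper) reads one JSON value from a reader
// through json.NewDecoder.
func decodePerson(s string) (Person, error) {
	var p Person
	err := json.NewDecoder(strings.NewReader(s)).Decode(&p)
	return p, err
}

func main() {
	out, err := encodePerson(Person{ID: 1, Name: "Bruce", Age: 18})
	if err != nil {
		panic(err)
	}
	fmt.Print(out) // {"ID":1,"Name":"Bruce","Age":18}

	p, err := decodePerson(out)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", p) // {ID:1 Name:Bruce Age:18}
}
```

The pitfalls below apply to this pair in the same way, since the field rules are shared with Marshal and Unmarshal.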
2 Pitfalls
2.1 Public struct fields
This is probably the most common mistake made by developers who are new to Go or encoding/json. If we're manipulating JSON with structs, the struct's member fields must be public (exported), i.e., start with a capital letter. Private members are neither serialized nor parsed.
Example:
type Person struct {
	ID   uint
	Name string
	age  int
}

func MarshalPerson() {
	p := Person{
		ID:   1,
		Name: "Bruce",
		age:  18,
	}
	output, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	println(string(output))
}

func UnmarshalPerson() {
	str := `{"ID":1,"Name":"Bruce","age":18}`
	var p Person
	err := json.Unmarshal([]byte(str), &p)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", p)
}

// Output Marshal:
{"ID":1,"Name":"Bruce"}
// Output Unmarshal:
{ID:1 Name:Bruce age:0}
Here, age is a private field, so there is no age field in the serialized JSON string. Similarly, when you deserialize a JSON string into Person, the value of age is not read.
The reason is simple. If we dig into Marshal's source code, we can see that it uses reflect underneath to inspect struct values dynamically:
// .../src/encoding/json/encode.go
func (e *encodeState) marshal(v any, opts encOpts) (err error) {
	// ...skip
	e.reflectValue(reflect.ValueOf(v), opts)
	return nil
}
At the language design level, Golang does not let reflection set the private members of a struct or hand their values out through Interface(), and encoding/json simply skips them. So private members never show up in the serialized output, and the same applies to deserialization.
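To see what reflection reports about the private field, here is a small sketch using the reflect package directly. It is only an illustration of the restriction, not the code path encoding/json itself runs, and the fieldReport helper is a name I made up.

```go
package main

import (
	"fmt"
	"reflect"
)

type Person struct {
	ID   uint
	Name string
	age  int
}

// fieldReport (hypothetical helper) lists, for each field of Person,
// whether it is exported and whether reflection may set it.
func fieldReport() []string {
	p := Person{ID: 1, Name: "Bruce", age: 18}
	v := reflect.ValueOf(&p).Elem() // addressable struct value
	t := v.Type()

	var report []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		report = append(report, fmt.Sprintf("%s exported=%t canSet=%t",
			f.Name, f.IsExported(), v.Field(i).CanSet()))
	}
	return report
}

func main() {
	for _, line := range fieldReport() {
		fmt.Println(line)
	}
	// ID exported=true canSet=true
	// Name exported=true canSet=true
	// age exported=false canSet=false
}
```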
2.2 Use map sparingly
As mentioned earlier, JSON serialization works not only with structs but also with slices, maps, and other types of data. A slice is a special case (it becomes a JSON array), but a map and a struct look the same in JSON format:
m := map[string]any{
	"ID":   1,
	"Name": "Bruce",
}
// ...skip the structure code
// JSON output:
{
  "ID": 1,
  "Name": "Bruce"
}
In this case, unless there is a special circumstance or need, use maps sparingly, because maps incur additional overhead, additional code, and additional maintenance costs.
Why?
First of all, as in the Person example above, ID and Name have different types, so if we want to deserialize this JSON data into a map, we can only declare it as map[string]any. Since any is just interface{}, using Name or ID on its own requires a type assertion to convert the value to its concrete type (and note that a JSON number such as ID arrives as float64, not int):
var m map[string]any
// ...deserialize JSON data into `m` (code omitted)...
// Fetch Name
name, ok := m["Name"].(string)
The type assertion itself is an extra step, and to prevent a panic we need to check the second return value ok, which definitely increases the development effort as well as the code burden.
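Here is a fuller sketch of what reading both values out of a map involves. The readFromMap helper and its error messages are my own illustration. The part that tends to surprise people is that ID cannot be asserted as int.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// readFromMap (hypothetical helper) decodes the JSON into a map and pulls
// the two values out with checked type assertions.
func readFromMap(data string) (id int, name string, err error) {
	var m map[string]any
	if err = json.Unmarshal([]byte(data), &m); err != nil {
		return 0, "", err
	}

	// JSON numbers decode into float64 when the target is `any`,
	// so the assertion m["ID"].(int) would fail.
	f, ok := m["ID"].(float64)
	if !ok {
		return 0, "", errors.New("ID is missing or not a number")
	}
	name, ok = m["Name"].(string)
	if !ok {
		return 0, "", errors.New("Name is missing or not a string")
	}
	return int(f), name, nil
}

func main() {
	id, name, err := readFromMap(`{"ID":1,"Name":"Bruce"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(id, name) // 1 Bruce
}
```

With a struct, all of this checking collapses into a single Unmarshal call.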
In addition, a map is inherently unconstrained with respect to data. We can predefine fields and types in structs, but not in maps. This means we can only learn what's in the map from documentation, comments, or the code itself. Moreover, while a struct pins down the keys and value types of the JSON data, a map can't: any change in the JSON can only be detected by business logic code. Just think about the amount of work and maintenance costs involved.
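As a rough illustration of that difference, the sketch below feeds the same malformed payload (ID as a string) to a struct and to a map. The helper names decodeIntoStruct and decodeIntoMap are made up for this example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

type Person struct {
	ID   uint
	Name string
}

// decodeIntoStruct (hypothetical helper) decodes into the struct and reports
// whether the failure, if any, was a type mismatch caught by encoding/json.
func decodeIntoStruct(data string) (p Person, typeMismatch bool, err error) {
	err = json.Unmarshal([]byte(data), &p)
	var typeErr *json.UnmarshalTypeError
	return p, errors.As(err, &typeErr), err
}

// decodeIntoMap (hypothetical helper) decodes the same data into a map,
// which accepts any value type for any key.
func decodeIntoMap(data string) (map[string]any, error) {
	var m map[string]any
	err := json.Unmarshal([]byte(data), &m)
	return m, err
}

func main() {
	bad := `{"ID":"one","Name":"Bruce"}` // ID has the wrong type

	_, mismatch, err := decodeIntoStruct(bad)
	fmt.Println("struct: type mismatch =", mismatch, "err =", err)

	m, err := decodeIntoMap(bad)
	fmt.Println("map:", m, "err =", err)
}
```

The struct version fails at the decoding boundary, while the map version accepts the data and leaves the problem for later code to discover.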
The reason I mention this pitfall is that before I developed in Go, my main language was Python, and Python, you know, has no structs; it just loads JSON data into a dict (map). When I first got into Go, I used maps to interact with JSON as well. But because Go is statically typed, you have to convert types explicitly (type assertion), unlike in Python. That trouble gave me a headache for a while.
In short, use maps sparingly, or as little as possible, to manipulate JSON.
2.3 Beware of struct combinations
Go supports an object-oriented style, but it doesn't have classes, only structs, and structs don't have inheritance. So Go uses a kind of combination (struct embedding) to reuse different structs. In many cases this combination is very convenient, as we can use the members of the embedded struct as if they were members of the outer struct itself, like this:
type Person struct {
	ID   uint
	Name string
	address
}

type address struct {
	Code   int
	Street string
}

func (a address) PrintAddr() {
	fmt.Println(a.Code, a.Street)
}

func Group() {
	p := Person{
		ID:   1,
		Name: "Bruce",
		address: address{
			Code:   100,
			Street: "Main St",
		},
	}
	// Access all address's fields and methods directly
	fmt.Println(p.Code, p.Street)
	p.PrintAddr()
}

// Output
100 Main St
100 Main St
Convenient, right? I think so too. But there's a small pitfall to be aware of when we incorporate combinations into our use of JSON. Take a look at the code below:
// The structs used here are the same as the previous ones,
// so I won't repeat them. Errors are also ignored to save space.
func MarshalPerson() {
	p := Person{
		ID:   1,
		Name: "Bruce",
		address: address{
			Code:   100,
			Street: "Main St",
		},
	}
	// MarshalIndent gives prettier output
	output, _ := json.MarshalIndent(p, "", " ")
	println(string(output))
}

func UnmarshalPerson() {
	str := `{"ID":1,"Name":"Bruce","address":{"Code":100,"Street":"Main St"}}`
	var p Person
	_ = json.Unmarshal([]byte(str), &p)
	fmt.Printf("%+v\n", p)
}

// Output MarshalPerson:
{
 "ID": 1,
 "Name": "Bruce",
 "Code": 100,
 "Street": "Main St"
}
// Output UnmarshalPerson:
{ID:1 Name:Bruce address:{Code:0 Street:}}
This code carries a bit more information, so let's run through it bit by bit.
Let's start with the MarshalPerson function. Here a Person object is declared, and the serialization result is prettified with MarshalIndent and printed. As we can see from the printout, the entire Person object is flattened. Looking at the Person struct definition, it still appears to have an address member field, despite the combination. So we sometimes take it for granted that the serialized JSON of Person looks like this:
// The JSON serialization result we might imagine
{
  "ID": 1,
  "Name": "Bruce",
  "address": {
    "Code": 100,
    "Street": "Main St"
  }
}
But it doesn't; it's flattened. This matches the feeling we had earlier when we accessed the address members directly through Person, i.e., the members of address seem to become members of Person directly. This is something to be aware of: the combination flattens the serialized JSON result.
Another slightly counter-intuitive point is that address is a private struct, so it seems that its members shouldn't be serialized, right? They are, though, and that's one of the things that's not so great about this form of combination: it exposes the public members of the private embedded struct. This exposure is sometimes unintentional, and it can cause unwanted data leaks.
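If the combination is wanted only for convenience in Go code and none of its fields should reach the JSON output, one option is to tag the embedded struct with `json:"-"`. This is a sketch under that assumption, and the marshalPerson helper is just a name for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type address struct {
	Code   int
	Street string
}

// Person still embeds address, but the `json:"-"` tag tells encoding/json
// to skip the embedded struct together with its promoted fields.
type Person struct {
	ID      uint
	Name    string
	address `json:"-"`
}

// marshalPerson (hypothetical helper) serializes a fully populated Person.
func marshalPerson() (string, error) {
	p := Person{
		ID:      1,
		Name:    "Bruce",
		address: address{Code: 100, Street: "Main St"},
	}
	out, err := json.Marshal(p)
	return string(out), err
}

func main() {
	out, err := marshalPerson()
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"ID":1,"Name":"Bruce"}
}
```

In Go code, p.Code and p.Street remain directly accessible; only the JSON side changes.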
Then there's the UnmarshalPerson function. With the previous function out of the way, this one is easy to understand: it is the same flattening at work. The nested "address" key in the input doesn't match any field, so it is silently ignored, and Code and Street keep their zero values. If we need to deserialize back into Person, we need flattened JSON data too.
In fact, in my personal use of Go over the years, when I encounter structs that need to be converted to JSON, I usually don't use combinations much, unless there are special circumstances. After all, it's too easy to cause the problems mentioned above. Also, since the JSON is flattened while the struct definition doesn't look flattened, the more complex the struct definition becomes, the harder it is to visually compare it with the flattened JSON data, which makes the code much less readable.
If you don't have special needs (e.g., the raw JSON data is flattened and there are multiple structs with duplicate fields that need to be reused), from my personal point of view, I would recommend writing it this way:
type Person struct {
	ID      int
	Name    string
	Address address
}
Of course, if the struct doesn't involve JSON serialization, I'd still prefer to use a combination. It's really convenient.
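For completeness, here is a short sketch of how the named-field version recommended in this section behaves in both directions. The roundTrip helper is hypothetical and exists only for the demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type address struct {
	Code   int
	Street string
}

type Person struct {
	ID      int
	Name    string
	Address address // a named field, not an embedded struct
}

// roundTrip (hypothetical helper) marshals a Person and decodes the
// result into a fresh value.
func roundTrip() (string, Person, error) {
	p := Person{ID: 1, Name: "Bruce", Address: address{Code: 100, Street: "Main St"}}
	out, err := json.Marshal(p)
	if err != nil {
		return "", Person{}, err
	}
	var back Person
	err = json.Unmarshal(out, &back)
	return string(out), back, err
}

func main() {
	out, back, err := roundTrip()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)           // {"ID":1,"Name":"Bruce","Address":{"Code":100,"Street":"Main St"}}
	fmt.Printf("%+v\n", back) // {ID:1 Name:Bruce Address:{Code:100 Street:Main St}}
}
```

The JSON now nests the address under its own key, so the struct definition and the data have the same shape.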
2.4 Take care when deserializing only some member fields (partial update data)
Let's go straight to the code:
type Person struct {
	ID   uint
	Name string
}

// PartUpdateIssue simulates parsing two different
// JSON strings with the same structure
func PartUpdateIssue() {
	var p Person
	// The first data has the ID field and is not 0
	str := `{"ID":1,"Name":"Bruce"}`
	_ = json.Unmarshal([]byte(str), &p)
	fmt.Printf("%+v\n", p)
	// The second data does not have an ID field,
	// deserializing it again with p preserves the last value
	str = `{"Name":"Jim"}`
	_ = json.Unmarshal([]byte(str), &p)
	// Notice the output ID is still 1
	fmt.Printf("%+v\n", p)
}

// Output
{ID:1 Name:Bruce}
{ID:1 Name:Jim}
The comments explain it clearly: when we reuse the same struct value to deserialize different JSON data repeatedly, and one piece of JSON contains only some of the member fields, the fields it doesn't cover keep the values from the previous deserialization. This is actually a dirty data pollution problem.
This problem is easy to run into and, when triggered, quite insidious. I previously wrote a post (Bugs in Golang caused by Zero Value and features of the gob library) about a similar situation encountered when using the gob library. Of course, the gob problem is related to zero values, which is not quite the same as the JSON problem we're talking about today, but both end up behaving similarly, with some member fields contaminated by dirty data.
The solution is also simple: each time you deserialize JSON, use a brand new struct value to load the data.
Anyway, be careful with this situation.
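A minimal sketch of that fix, assuming the payloads arrive as a slice of strings (the decodeAll helper is hypothetical): declaring the target inside the loop guarantees every decode starts from zero values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Person struct {
	ID   uint
	Name string
}

// decodeAll (hypothetical helper) decodes each payload into its own fresh
// Person, so a field missing from one payload cannot inherit a value from
// the previous one.
func decodeAll(payloads []string) ([]Person, error) {
	people := make([]Person, 0, len(payloads))
	for _, s := range payloads {
		var p Person // declared inside the loop: zero values every time
		if err := json.Unmarshal([]byte(s), &p); err != nil {
			return nil, err
		}
		people = append(people, p)
	}
	return people, nil
}

func main() {
	people, err := decodeAll([]string{`{"ID":1,"Name":"Bruce"}`, `{"Name":"Jim"}`})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", people) // [{ID:1 Name:Bruce} {ID:0 Name:Jim}]
}
```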
2.5 Handling pointer fields
Many developers get a big headache when they hear the word pointer, but you don't have to; it's not that complicated. Still, pointers do give developers one of the most common panics in Go programs: the nil pointer dereference. So what happens when pointers are combined with JSON?
Look at this code:
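// Address is assumed to have the same fields as
// the address struct in section 2.3.
type Address struct {
	Code   int
	Street string
}

type Person struct {
	ID      uint
	Name    string
	Address *Address
}

func UnmarshalPtr() {
	str := `{"ID":1,"Name":"Bruce"}`
	var p Person
	_ = json.Unmarshal([]byte(str), &p)
	fmt.Printf("%+v\n", p)
	// Uncommenting the next line would panic, because p.Address is nil
	// fmt.Printf("%+v\n", p.Address.Street)
}

// Output
{ID:1 Name:Bruce Address:<nil>}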
We define the Address member as a pointer, and when we deserialize JSON data that doesn't contain an Address, the pointer field stays nil because there is no corresponding data. encoding/json doesn't create an empty Address object and point the field to it. If we access p.Address.xxx directly, the program will panic because p.Address is nil.
So, if our struct has a pointer member, remember to check whether the pointer is nil before using it. This is a bit tedious, but it can't be helped. After all, writing a few extra lines of code is no big deal compared to the damage caused by a panic in a production environment.
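One way to keep that check in a single place is a small accessor. This is only a sketch: the Address fields are assumed to match the address struct from section 2.3, and streetOf and decodePerson are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Address fields are assumed for the example.
type Address struct {
	Code   int
	Street string
}

type Person struct {
	ID      uint
	Name    string
	Address *Address
}

// streetOf (hypothetical helper) returns the street and true only when
// the Address pointer is set, so callers never dereference nil.
func streetOf(p Person) (string, bool) {
	if p.Address == nil {
		return "", false
	}
	return p.Address.Street, true
}

// decodePerson (hypothetical helper) decodes into a fresh Person.
func decodePerson(s string) (Person, error) {
	var p Person
	err := json.Unmarshal([]byte(s), &p)
	return p, err
}

func main() {
	p, err := decodePerson(`{"ID":1,"Name":"Bruce"}`)
	if err != nil {
		panic(err)
	}
	if street, ok := streetOf(p); ok {
		fmt.Println("street:", street)
	} else {
		fmt.Println("no address on record") // this branch runs
	}
}
```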
Also, when creating a struct with a pointer field, assigning the pointer field can be relatively cumbersome:
type Person struct {
	ID   int
	Name string
	Age  *int
}

func Foo() {
	p := Person{
		ID:   1,
		Name: "Bruce",
		Age:  new(int),
	}
	*p.Age = 20
	// ...
}
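A common way to soften this, sketched below, is a tiny generic helper that returns a pointer to its argument. The ptr function is my own hypothetical helper, not something from the standard library, and it needs Go 1.18 or later for generics.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Person struct {
	ID   int
	Name string
	Age  *int
}

// ptr is a hypothetical helper (not part of the standard library): it
// returns a pointer to a copy of v, so pointer fields can be set inline.
func ptr[T any](v T) *T {
	return &v
}

// newPerson builds the same value as Foo above in a single literal.
func newPerson() Person {
	return Person{ID: 1, Name: "Bruce", Age: ptr(20)}
}

func main() {
	out, err := json.Marshal(newPerson())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"ID":1,"Name":"Bruce","Age":20}
}
```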
Some people say, "Well, then shouldn't we avoid pointers in the member fields of structs used for JSON?" I don't think so, because there are a couple of scenarios where pointers are better suited than non-pointer members. One is that a pointer to a large nested struct avoids copying it, which reduces some overhead; the other is zero-value related issues, which we'll talk about in the next section.
2.6 Confusion caused by zero values
Zero value is a feature of Golang variables that we can simply think of as a default value. That is, if we don't explicitly assign a value to a variable, Golang gives it the zero value of its type. For example, as we have seen in the previous examples, int has a zero value of 0, string has a zero value of the empty string, a pointer has a zero value of nil, and so on.
What are the pitfalls of zero values when handling JSON?
Look at the following example:
type Person struct {
	Name        string
	ChildrenCnt int
}

func ZeroValueConfusion() {
	str := `{"Name":"Bruce"}`
	var p Person
	_ = json.Unmarshal([]byte(str), &p)
	fmt.Printf("%+v\n", p)
	str2 := `{"Name":"Jim","ChildrenCnt":0}`
	var p2 Person
	_ = json.Unmarshal([]byte(str2), &p2)
	fmt.Printf("%+v\n", p2)
}

// Output
{Name:Bruce ChildrenCnt:0}
{Name:Jim ChildrenCnt:0}
We added a ChildrenCnt field to the Person struct to record how many children the person has. Because of the zero value, this field is 0 when the JSON data loaded into p does not have ChildrenCnt, which is misleading: we can't distinguish objects with missing data from objects that really have 0 children. In the case of Bruce and Jim, one shows 0 children because the data is missing, and the other actually has 0. Bruce's number of children should be "unknown", and if we treat it as 0, it may cause problems in the business logic.
This kind of confusion can be fatal in scenarios with strict data requirements. So is there any way to avoid this zero-value interference? There is, and it's the pointer scenario left over from the previous section.
Let's change the type of Person's ChildrenCnt to *int and see what happens:
type Person struct {
	Name        string
	ChildrenCnt *int
}

// Output
{Name:Bruce ChildrenCnt:<nil>}
{Name:Jim ChildrenCnt:0xc0000124c8}
The difference is that Bruce has no data, so his ChildrenCnt is nil, while Jim's is a non-nil pointer (the printed address will vary from run to run). At this point it is clear that the number of Bruce's children is unknown.
Essentially this approach still relies on zero values, namely the zero value of pointers. It's kind of fighting fire with fire (laughs).
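Printing the struct only shows an address, so here is a sketch of how the pointer might be read in practice. The describeChildren helper is hypothetical; it treats nil as "unknown" and dereferences the pointer otherwise.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Person struct {
	Name        string
	ChildrenCnt *int
}

// describeChildren (hypothetical helper) decodes the data and turns the
// pointer into a readable answer, treating nil as "unknown".
func describeChildren(data string) (string, error) {
	var p Person
	if err := json.Unmarshal([]byte(data), &p); err != nil {
		return "", err
	}
	if p.ChildrenCnt == nil {
		return p.Name + ": unknown", nil
	}
	return fmt.Sprintf("%s: %d", p.Name, *p.ChildrenCnt), nil
}

func main() {
	for _, s := range []string{`{"Name":"Bruce"}`, `{"Name":"Jim","ChildrenCnt":0}`} {
		line, err := describeChildren(s)
		if err != nil {
			panic(err)
		}
		fmt.Println(line)
	}
	// Bruce: unknown
	// Jim: 0
}
```

Note that an explicit JSON null for ChildrenCnt also leaves the pointer nil, so "missing" and "null" look the same with this approach.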
2.7 Pitfalls of tags
Finally, let's talk about tags. Tags are also a very important feature in Golang and often go hand in hand with JSON. If you have used Go tags, you know they are flexible and easy to use. So what are the pitfalls of such a great feature?
One is the naming problem: a tag can specify the name of a field in the JSON data, which is very flexible and useful, but it is also error-prone and, to some extent, demands more care from the programmer.
For example, if a programmer intentionally or unintentionally defines a struct like this:
type PersonWrong struct {
	FirstName string `json:"last_name"`
	LastName  string `json:"first_name"`
}
The tags of FirstName and LastName are swapped. Would you want to beat the programmer up if you encountered such code? I've actually run into something like this in production code. Of course, it was unintentional, a mistake during a code update. However, bugs like this are usually not easy to locate. Mostly because, well, who the **** would have thought?
Anyway, don't do it, and be careful when you write tags.
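One cheap safeguard, sketched here as a suggestion rather than something from my production code, is to compare the serialized form against a fixed expected string. A plain marshal-then-unmarshal round trip would not catch swapped tags, because both directions use the same wrong names. The checkTags function and the sample values are made up; in a real project this would likely live in a _test.go file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Person struct {
	FirstName string `json:"first_name"`
	LastName  string `json:"last_name"`
}

// checkTags (hypothetical helper) marshals a value whose fields hold
// distinct, recognizable sample values and compares the result with the
// JSON we expect on the wire.
func checkTags() error {
	got, err := json.Marshal(Person{FirstName: "Bruce", LastName: "Lee"})
	if err != nil {
		return err
	}
	const want = `{"first_name":"Bruce","last_name":"Lee"}`
	if string(got) != want {
		return fmt.Errorf("tag mismatch: got %s, want %s", got, want)
	}
	return nil
}

func main() {
	if err := checkTags(); err != nil {
		panic(err)
	}
	fmt.Println("tags match the expected JSON")
}
```

With the swapped tags from PersonWrong, the same comparison would fail, since the output would put "Lee" under first_name.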
Another problem is related to the combination of omitempty + zero values. See the code:
type Person struct {
	Name        string `json:"person_name"`
	ChildrenCnt int    `json:"cnt,omitempty"`
}

func TagMarshal() {
	p := Person{
		Name:        "Bruce",
		ChildrenCnt: 0,
	}
	output, _ := json.MarshalIndent(p, "", " ")
	println(string(output))
}

// Output
{
 "person_name": "Bruce"
}
See the problem? We assigned 0 to ChildrenCnt when we created the struct object p. The omitempty tag makes the encoder omit empty values during serialization, so the output JSON does not contain cnt at all and looks as if the field does not exist. What is an empty value? That's right, it's a zero value (false, 0, a nil pointer or interface, or an empty array, slice, map, or string; a zero-valued struct is not omitted).
So the familiar confusion arises: Bruce has 0 children, which is not unknown data, yet the output JSON says that Bruce's children count is unknown.
omitempty itself only affects serialization, but whoever deserializes this output runs into the zero-value problem from section 2.6 again, so I won't give another example.
What to do about this omitempty problem? Since it's still essentially a zero-value problem, use pointers.
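Here is a short sketch of that pointer fix. The marshalPerson helper is a name invented for the example; the point is that omitempty drops a nil pointer but keeps a pointer to 0.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Person struct {
	Name        string `json:"person_name"`
	ChildrenCnt *int   `json:"cnt,omitempty"`
}

// marshalPerson (hypothetical helper) returns the compact JSON for p.
func marshalPerson(p Person) (string, error) {
	out, err := json.Marshal(p)
	return string(out), err
}

func main() {
	zero := 0

	// A known count of 0: the pointer is non-nil, so cnt is kept.
	known, err := marshalPerson(Person{Name: "Bruce", ChildrenCnt: &zero})
	if err != nil {
		panic(err)
	}
	fmt.Println(known) // {"person_name":"Bruce","cnt":0}

	// An unknown count: the pointer is nil, so omitempty drops cnt.
	unknown, err := marshalPerson(Person{Name: "Jim"})
	if err != nil {
		panic(err)
	}
	fmt.Println(unknown) // {"person_name":"Jim"}
}
```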
3 Summary
In this article, I've listed 7 pitfalls of using the encoding/json library, most of which I've run into in my own work. If you haven't encountered them yet, congratulations, and take this as a reminder to be careful with JSON in the future. If you have encountered any of these problems and been confused by them, I hope this article has helped.
If anything in this post is wrong or unclear, please don't hesitate to point it out. Thank you!