TECH SCHOOL

Posted on Mar 15, 2020 • Edited on Jun 26, 2021

Generate and serialize protobuf message in Go

#grpc #go #tutorial #beginners

Welcome back to the gRPC course! In previous lectures, we learned how to write protobuf messages and generate Go and Java code from them. Today we will start using that code to generate random protobuf messages and serialise them to binary and JSON files.

Here's the link to the full gRPC course playlist on Youtube
Github repository: pcbook-go and pcbook-java
Gitlab repository: pcbook-go and pcbook-java

In this first half of the lecture, we will write Go code to:

Generate a random protobuf message (laptop object).
Write the protobuf message to a binary file.
Read back the content of that file into another message.
Write the protobuf message to a JSON file.
Compare its size with the binary file to see which one is smaller.

Alright, let's start!

Generate random protobuf messages

The first thing we need to do is to initialise our pcbook package:



go mod init gitlab.com/techschool/pcbook

As you can see, a go.mod file is generated for us:



module gitlab.com/techschool/pcbook

go 1.13

Now let's create a sample package to generate some random laptop data. I love using random data because it's very useful when writing unit tests. It will return different values for each call, and the data look very natural and close to reality.



pcbook
├── proto
├── pb
├── sample
│   ├── generator.go
│   └── random.go
├── go.mod
├── main.go
└── Makefile

Generate a random keyboard

First we need a keyboard, so I define a function NewKeyboard(), which returns a pointer to the pb.Keyboard object.



package sample

import "gitlab.com/techschool/pcbook/pb"

// NewKeyboard returns a new sample keyboard
func NewKeyboard() *pb.Keyboard {
    keyboard := &pb.Keyboard{
        Layout:  randomKeyboardLayout(),
        Backlit: randomBool(),
    }

    return keyboard
}

It will have a layout, so I will write a function to generate a random keyboard layout. And also a function to generate a random boolean for the backlit field. Let's write them in the new file random.go.



package sample

import (
    "math/rand"

    "gitlab.com/techschool/pcbook/pb"
)

func randomBool() bool {
    return rand.Intn(2) == 1
}

func randomKeyboardLayout() pb.Keyboard_Layout {
    switch rand.Intn(3) {
    case 1:
        return pb.Keyboard_QWERTY
    case 2:
        return pb.Keyboard_QWERTZ
    default:
        return pb.Keyboard_AZERTY
    }
}

The randomBool() function is easy. A bool only has 2 possible values: true or false, so I use the rand.Intn() function of the math/rand package, with n equal to 2. It will give us a random integer, which is either 0 or 1. So we just return true if the value is 1.

For the randomKeyboardLayout() function, there are 3 possible values, so let's use rand.Intn(3). If the value is 1 then return QWERTY. If the value is 2, then return QWERTZ. Otherwise, return AZERTY.
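
To see why a switch with two cases and a default is enough, here is a minimal sketch that counts what rand.Intn(n) actually returns. The countOutcomes() helper is a hypothetical name used only for this illustration; it is not part of the pcbook code.

```go
package main

import (
    "fmt"
    "math/rand"
)

// countOutcomes calls rand.Intn(n) the given number of times and
// returns how often each value in [0, n) was drawn.
func countOutcomes(n, draws int) []int {
    counts := make([]int, n)
    for i := 0; i < draws; i++ {
        counts[rand.Intn(n)]++
    }
    return counts
}

func main() {
    // With n = 3 every draw is 0, 1 or 2, so the slice has exactly
    // three buckets: one per keyboard layout.
    fmt.Println(countOutcomes(3, 9000))
}
```

Each bucket should end up with roughly a third of the draws, so the three layouts are picked about equally often.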

Generate a random CPU

Next, a function to generate a random CPU. There are many fields that we need to fill.

First we will need a function to return a random CPU brand. Let's go to the random.go file to define it. One easy way to do that is to select a random value from a predefined set of brands, such as "Intel" and "AMD".



func randomCPUBrand() string {
    return randomStringFromSet("Intel", "AMD")
}

func randomStringFromSet(a ...string) string {
    n := len(a)
    if n == 0 {
        return ""
    }
    return a[rand.Intn(n)]
}

Here I defined a randomStringFromSet() function, which takes a variable number of strings as input and returns one random string from that set.
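
As a small sketch of how a variadic function like this can be called, the example below uses a stand-in named pickOne() (a hypothetical name, written the same way as the function above). It can take its values inline, take an existing slice expanded with the ... syntax, or take nothing at all.

```go
package main

import (
    "fmt"
    "math/rand"
)

// pickOne returns one random element of a, or "" when a is empty.
func pickOne(a ...string) string {
    if len(a) == 0 {
        return ""
    }
    return a[rand.Intn(len(a))]
}

func main() {
    // Values passed inline.
    fmt.Println(pickOne("Intel", "AMD"))

    // An existing slice, expanded into the variadic parameter.
    brands := []string{"Apple", "Dell", "Lenovo"}
    fmt.Println(pickOne(brands...))

    // No arguments: the empty-set guard returns "".
    fmt.Printf("%q\n", pickOne())
}
```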

Next, we will generate a random CPU name based on the brand with this function. Since we know there are only 2 brands, a simple if here would be enough.



func randomCPUName(brand string) string {
    if brand == "Intel" {
        return randomStringFromSet(
            "Xeon E-2286M",
            "Core i9-9980HK",
            "Core i7-9750H",
            "Core i5-9400F",
            "Core i3-1005G1",
        )
    }

    return randomStringFromSet(
        "Ryzen 7 PRO 2700U",
        "Ryzen 5 PRO 3500U",
        "Ryzen 3 PRO 3200GE",
    )
}

The next CPU field we have to fill is the number of cores. Let's say we want it to be between 2 cores and 8 cores. So we will need a randomInt() function to generate a random integer between min and max.



func randomInt(min, max int) int {
    return min + rand.Intn(max-min+1)
}

In this formula, the rand.Intn() function will return an integer from 0 to max - min. So if we add min to it, we will get a value from min to max. This randomInt() function can be used to set the number of cores as well as the number of threads. The number of threads will be a random integer between the number of cores and 12.
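
To check the bounds, here is a minimal sketch with a stand-in named intInRange() (a hypothetical name) written with rand.Intn(). It draws many values and records the smallest and largest ones seen. Note that rand.Intn(n) panics when n <= 0, so this version assumes max >= min.

```go
package main

import (
    "fmt"
    "math/rand"
)

// intInRange returns a random integer in the closed range [min, max].
// It assumes max >= min, because rand.Intn panics for n <= 0.
func intInRange(min, max int) int {
    return min + rand.Intn(max-min+1)
}

func main() {
    // Start with the bounds swapped so that the first draw replaces both.
    smallest, largest := 8, 2
    for i := 0; i < 10000; i++ {
        v := intInRange(2, 8)
        if v < smallest {
            smallest = v
        }
        if v > largest {
            largest = v
        }
    }
    fmt.Println("smallest:", smallest, "largest:", largest)
}
```

Both ends of the range are reachable, which is why the +1 is needed inside the call.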

Alright, next field is minGhz, which is a float64. I want the CPU to have the minimum frequency between 2.0 and 3.5. So we need a function to generate a float64 in range from min to max.



func randomFloat64(min, max float64) float64 {
    return min + rand.Float64()*(max-min)
}

It's a bit different from the randomInt function because the rand.Float64() function will return a random float between 0 and 1. So we will multiply it with (max - min) to get a value between 0 and max - min. When we add min to this value, we will get a number from min to max.
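
Here is a minimal sketch of the same formula, using a stand-in named floatInRange() (a hypothetical name). It also shows how feeding the first result in as the lower bound of the second call keeps the maximum frequency from dropping below the minimum one.

```go
package main

import (
    "fmt"
    "math/rand"
)

// floatInRange scales rand.Float64(), which lies in [0, 1),
// to a value between min and max.
func floatInRange(min, max float64) float64 {
    return min + rand.Float64()*(max-min)
}

func main() {
    minGhz := floatInRange(2.0, 3.5)
    // Using minGhz as the lower bound guarantees maxGhz >= minGhz.
    maxGhz := floatInRange(minGhz, 5.0)
    fmt.Printf("min=%.2f GHz max=%.2f GHz\n", minGhz, maxGhz)
}
```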

Let's come back to our generator. We generate the max frequency to be a random float64 between the min frequency and 5.0 GHz.

Putting everything together, we get this NewCPU() function:



// NewCPU returns a new sample CPU
func NewCPU() *pb.CPU {
    brand := randomCPUBrand()
    name := randomCPUName(brand)

    numberCores := randomInt(2, 8)
    numberThreads := randomInt(numberCores, 12)

    minGhz := randomFloat64(2.0, 3.5)
    maxGhz := randomFloat64(minGhz, 5.0)

    cpu := &pb.CPU{
        Brand:         brand,
        Name:          name,
        NumberCores:   uint32(numberCores),
        NumberThreads: uint32(numberThreads),
        MinGhz:        minGhz,
        MaxGhz:        maxGhz,
    }

    return cpu
}

Generate a random GPU

The NewGPU function will be implemented the same way. We write a function to return a random GPU brand, which can be either "Nvidia" or "AMD". Then we generate a random GPU name based on the brand.



func randomGPUBrand() string {
    return randomStringFromSet("Nvidia", "AMD")
}

func randomGPUName(brand string) string {
    if brand == "Nvidia" {
        return randomStringFromSet(
            "RTX 2060",
            "RTX 2070",
            "GTX 1660-Ti",
            "GTX 1070",
        )
    }

    return randomStringFromSet(
        "RX 590",
        "RX 580",
        "RX 5700-XT",
        "RX Vega-56",
    )
}

The minGhz and maxGhz fields are generated using the randomFloat64 function that we have defined before.

Now there's one field left: the memory. Let's say we want it to be between 2 and 6 GB. So we will use the randomInt() function here with a type conversion to uint64. For the unit, just use the Memory_GIGABYTE enum generated by protoc. And we're done with the GPU.



// NewGPU returns a new sample GPU
func NewGPU() *pb.GPU {
    brand := randomGPUBrand()
    name := randomGPUName(brand)

    minGhz := randomFloat64(1.0, 1.5)
    maxGhz := randomFloat64(minGhz, 2.0)
    memGB := randomInt(2, 6)

    gpu := &pb.GPU{
        Brand:  brand,
        Name:   name,
        MinGhz: minGhz,
        MaxGhz: maxGhz,
        Memory: &pb.Memory{
            Value: uint64(memGB),
            Unit:  pb.Memory_GIGABYTE,
        },
    }

    return gpu
}

Generate a random RAM

The next thing is RAM. It's almost identical to the GPU memory, except that the value is between 4 and 64 GB.



// NewRAM returns a new sample RAM
func NewRAM() *pb.Memory {
    memGB := randomInt(4, 64)

    ram := &pb.Memory{
        Value: uint64(memGB),
        Unit:  pb.Memory_GIGABYTE,
    }

    return ram
}

Generate a random storage

Then comes the storage. We will define 2 different functions: 1 for SSD and 1 for HDD.

For the SSD, we will set the driver to be Storage_SSD, and the memory size will be from 128 to 1024 GB.

For the HDD, the driver must be Storage_HDD, and the memory size will be between 1 and 6 TB.



// NewSSD returns a new sample SSD
func NewSSD() *pb.Storage {
    memGB := randomInt(128, 1024)

    ssd := &pb.Storage{
        Driver: pb.Storage_SSD,
        Memory: &pb.Memory{
            Value: uint64(memGB),
            Unit:  pb.Memory_GIGABYTE,
        },
    }

    return ssd
}

// NewHDD returns a new sample HDD
func NewHDD() *pb.Storage {
    memTB := randomInt(1, 6)

    hdd := &pb.Storage{
        Driver: pb.Storage_HDD,
        Memory: &pb.Memory{
            Value: uint64(memTB),
            Unit:  pb.Memory_TERABYTE,
        },
    }

    return hdd
}

Generate a random screen

Now we will make a new screen. The size of the screen will be between 13 and 17 inches. It's a float32 number, so let's define a randomFloat32 function. It's the same as the randomFloat64 function, except that the types are float32.



func randomFloat32(min, max float32) float32 {
    return min + rand.Float32()*(max-min)
}

Next, the screen resolution. We will set the height to be a random integer between 1080 and 4320. And calculate the width from the height with the ratio of 16 by 9.



func randomScreenResolution() *pb.Screen_Resolution {
    height := randomInt(1080, 4320)
    width := height * 16 / 9

    resolution := &pb.Screen_Resolution{
        Width:  uint32(width),
        Height: uint32(height),
    }
    return resolution
}
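
Because height and width are integers, the 16:9 calculation uses integer division, and the order of the operations matters. The sketch below uses a hypothetical widthFor() helper with the same expression as above, and compares it with dividing first.

```go
package main

import "fmt"

// widthFor computes a 16:9 width from a height, multiplying before
// dividing so that only the final result is truncated.
func widthFor(height int) int {
    return height * 16 / 9
}

func main() {
    fmt.Println(widthFor(1080)) // 1920
    fmt.Println(widthFor(4320)) // 7680
    fmt.Println(widthFor(3466)) // 6161 (the exact value 6161.78 is truncated)

    // Dividing first truncates earlier and loses more precision.
    fmt.Println(3466 / 9 * 16) // 6160
}
```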

Then the screen panel. In our application, there are only 2 types of panel: either IPS or OLED. So we just use rand.Intn(2) here, and a simple if would do the job.



func randomScreenPanel() pb.Screen_Panel {
    if rand.Intn(2) == 1 {
        return pb.Screen_IPS
    }
    return pb.Screen_OLED
}

The last field we have to set is the multitouch, which is just a random boolean. Then we have this function to generate a new screen:



// NewScreen returns a new sample Screen
func NewScreen() *pb.Screen {
    screen := &pb.Screen{
        SizeInch:   randomFloat32(13, 17),
        Resolution: randomScreenResolution(),
        Panel:      randomScreenPanel(),
        Multitouch: randomBool(),
    }

    return screen
}

Generate a random laptop

Alright, all the components are ready, now we can generate a new laptop.

First, it needs a unique random identifier. So let’s create a randomID() function for that. I'm gonna use Google UUID. We can run this command in the terminal to install the package:



go get github.com/google/uuid

Now we can call uuid.New() to get a random ID and convert it to string.



func randomID() string {
    return uuid.New().String()
}

Next, we will generate the laptop brand and name similar to what we’ve done with the CPU and GPU.



func randomLaptopBrand() string {
    return randomStringFromSet("Apple", "Dell", "Lenovo")
}

func randomLaptopName(brand string) string {
    switch brand {
    case "Apple":
        return randomStringFromSet("Macbook Air", "Macbook Pro")
    case "Dell":
        return randomStringFromSet("Latitude", "Vostro", "XPS", "Alienware")
    default:
        return randomStringFromSet("Thinkpad X1", "Thinkpad P1", "Thinkpad P53")
    }
}

The brands we use are Apple, Dell, and Lenovo. We use a switch-case statement here to generate a laptop name that matches the brand.

Now let's use these functions to generate a new random laptop. We add the CPU and the RAM by calling their generator functions. The GPUs should be a list of values, so I define a slice here. Let's say we only have 1 GPU for now. Similar for the storages, but this time I will add 2 items: 1 for the SSD and the other for the HDD. The screen and keyboard fields are pretty straight-forward.



// NewLaptop returns a new sample Laptop
func NewLaptop() *pb.Laptop {
    brand := randomLaptopBrand()
    name := randomLaptopName(brand)

    laptop := &pb.Laptop{
        Id:       randomID(),
        Brand:    brand,
        Name:     name,
        Cpu:      NewCPU(),
        Ram:      NewRAM(),
        Gpus:     []*pb.GPU{NewGPU()},
        Storages: []*pb.Storage{NewSSD(), NewHDD()},
        Screen:   NewScreen(),
        Keyboard: NewKeyboard(),
        Weight: &pb.Laptop_WeightKg{
            WeightKg: randomFloat64(1.0, 3.0),
        },
        PriceUsd:    randomFloat64(1500, 3500),
        ReleaseYear: uint32(randomInt(2015, 2019)),
        UpdatedAt:   ptypes.TimestampNow(),
    }

    return laptop
}

Then comes the oneof field: the Weight. We can either specify the weight in kilograms or pounds. There are 2 structs that protoc has generated for us. Here I use the pb.Laptop_WeightKg to set it to a random value between 1 and 3 kilograms.
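
To get a feeling for how a oneof behaves in Go, here is a simplified, hand-written sketch of the pattern: one interface field, plus one wrapper struct per alternative. All type names below are hypothetical stand-ins and not the real generated code, which has more fields and methods.

```go
package main

import "fmt"

// isWeight marks the types that are allowed in the Weight field.
type isWeight interface{ isWeight() }

// One wrapper struct per oneof alternative.
type WeightKg struct{ WeightKg float64 }
type WeightLb struct{ WeightLb float64 }

func (*WeightKg) isWeight() {}
func (*WeightLb) isWeight() {}

// Laptop holds at most one of the two wrappers at a time.
type Laptop struct {
    Weight isWeight
}

// describe uses a type switch to find out which alternative is set.
func describe(l *Laptop) string {
    switch w := l.Weight.(type) {
    case *WeightKg:
        return fmt.Sprintf("%.1f kg", w.WeightKg)
    case *WeightLb:
        return fmt.Sprintf("%.1f lb", w.WeightLb)
    default:
        return "weight not set"
    }
}

func main() {
    fmt.Println(describe(&Laptop{Weight: &WeightKg{WeightKg: 1.5}})) // 1.5 kg
    fmt.Println(describe(&Laptop{Weight: &WeightLb{WeightLb: 3.3}})) // 3.3 lb
    fmt.Println(describe(&Laptop{}))                                 // weight not set
}
```

Assigning a new wrapper to the field replaces the previous one, which is how only one alternative can be set at any time.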

The price is a random float between 1500 and 3500. The release year is a random integer between 2015 and 2019.

And finally the UpdatedAt timestamp. We can use the TimestampNow() function provided by the golang/protobuf/ptypes package.

And we're done with the random laptop generator.

Serialize protobuf messages

Now we will create a new serializer package and write some functions to serialize the laptop objects to files. So let's create a file.go file here.



pcbook
├── proto
├── pb
├── sample
│   ├── generator.go
│   └── random.go
├── serializer
│   └── file.go
├── go.mod
├── main.go
└── Makefile

Write protobuf message to binary file

The first function will be used to write a protobuf message to a file in binary format. In our case, the message would be the laptop object. We can use the proto.Message interface to make it more general.



// WriteProtobufToBinaryFile writes protocol buffer message to binary file
func WriteProtobufToBinaryFile(message proto.Message, filename string) error {
    data, err := proto.Marshal(message)
    if err != nil {
        return fmt.Errorf("cannot marshal proto message to binary: %w", err)
    }

    err = ioutil.WriteFile(filename, data, 0644)
    if err != nil {
        return fmt.Errorf("cannot write binary data to file: %w", err)
    }

    return nil
}

In this function, we first call proto.Marshal to serialize the message to binary. If an error occurs, we just wrap it and return to the caller.

Otherwise, we use the ioutil.WriteFile() function to save the data to the specified file name. Again, we wrap and return any error that occurs during this process. If everything goes well, we just return nil, meaning no errors.

Now I'm gonna show you how to write a unit test for it. Let's create a file_test.go here. Note that having the _test suffix in the file name is a must, so Go can understand that it's a test file.



pcbook
├── proto
├── pb
├── sample
│   ├── generator.go
│   └── random.go
├── serializer
│   ├── file.go
│   └── file_test.go
├── go.mod
├── main.go
└── Makefile

We also have a convention for the unit test function name: it must start with the Test prefix and take a pointer to a testing.T object as input.

I usually call t.Parallel() in almost all of my unit tests so that they run in parallel, and any race condition can be detected more easily.



func TestFileSerializer(t *testing.T) {
    t.Parallel()
}

Alright, let's say we want to serialize the object to laptop.bin file inside the tmp folder. So we need to create the tmp folder first.

Then use the NewLaptop() function to make a new laptop1. And call the WriteProtobufToBinaryFile() function to save it to the laptop.bin file. Since this function returns an error, we must check that this error is nil, which means the file is successfully written.

To do that, I often use the testify package. Run this command in the terminal to get it:



go get github.com/stretchr/testify

Then we can simply call require.NoError(t, err).

In Visual Studio Code, we can click the "run test" link to run this test. It fails with an error saying import cycle not allowed.

It's because we're in the same serializer package, but also import it. To fix this, just add _test to our package name to make it a different package, and also tell Go that this is a test package. Now if we re-run the test, it passed, and the laptop.bin file is written to the tmp folder.



package serializer_test

import (
    "testing"

    "github.com/stretchr/testify/require"
    "gitlab.com/techschool/pcbook/sample"
    "gitlab.com/techschool/pcbook/serializer"
)

func TestFileSerializer(t *testing.T) {
    t.Parallel()

    binaryFile := "../tmp/laptop.bin"

    laptop1 := sample.NewLaptop()
    err := serializer.WriteProtobufToBinaryFile(laptop1, binaryFile)
    require.NoError(t, err)
}

Read protobuf message from binary file

Now we will write another function to read back that binary file into a protobuf message object. I will name it ReadProtobufFromBinaryFile().

First we need to use ioutil.ReadFile() to read the binary data from the file. Then we call proto.Unmarshal() to deserialize the binary data into a protobuf message.



// ReadProtobufFromBinaryFile reads protocol buffer message from binary file
func ReadProtobufFromBinaryFile(filename string, message proto.Message) error {
    data, err := ioutil.ReadFile(filename)
    if err != nil {
        return fmt.Errorf("cannot read binary data from file: %w", err)
    }

    err = proto.Unmarshal(data, message)
    if err != nil {
        return fmt.Errorf("cannot unmarshal binary to proto message: %w", err)
    }

    return nil
}

OK, let's test it. In our unit test, I will define a new laptop2 object, and call ReadProtobufFromBinaryFile() to read the file data into that object. We will check that there are no errors.



func TestFileSerializer(t *testing.T) {
    t.Parallel()

    binaryFile := "../tmp/laptop.bin"

    laptop1 := sample.NewLaptop()
    err := serializer.WriteProtobufToBinaryFile(laptop1, binaryFile)
    require.NoError(t, err)

    laptop2 := &pb.Laptop{}
    err = serializer.ReadProtobufFromBinaryFile(binaryFile, laptop2)
    require.NoError(t, err)

    require.True(t, proto.Equal(laptop1, laptop2))
}

We also want to check that laptop2 contains the same data as laptop1. To do that, we can use the proto.Equal function provided by the golang/protobuf package. This function must return true, so we use require.True() here.

Write protobuf message to JSON file

Now since the data is written in binary format, we cannot read it. Let's write another function to serialize it to JSON format.

In this function, we must convert the protobuf message into a JSON string first. To do that, I will create a new function named ProtobufToJSON(), and code it in a separate json.go file, under the same serializer package.

Now to convert a protobuf message to JSON, we can use the jsonpb.Marshaler struct. Basically, we just need to call its MarshalToString() function.



package serializer

import (
    "github.com/golang/protobuf/jsonpb"
    "github.com/golang/protobuf/proto"
)

// ProtobufToJSON converts protocol buffer message to JSON string
func ProtobufToJSON(message proto.Message) (string, error) {
    marshaler := jsonpb.Marshaler{
        EnumsAsInts:  false,
        EmitDefaults: true,
        Indent:       "  ",
        OrigName:     true,
    }

    return marshaler.MarshalToString(message)
}

There are a couple of things that we can configure, such as:

Write enums as integers or strings.
Write fields with default value or not.
What's the indentation we want to use.
Do we want to use the original field name as in the proto file.

Let's use these configs for now, and we will try other values later.

Now back to our function: after calling ProtobufToJSON(), we have the JSON string. All we need to do is to write that string to the file.



// WriteProtobufToJSONFile writes protocol buffer message to JSON file
func WriteProtobufToJSONFile(message proto.Message, filename string) error {
    data, err := ProtobufToJSON(message)
    if err != nil {
        return fmt.Errorf("cannot marshal proto message to JSON: %w", err)
    }

    err = ioutil.WriteFile(filename, []byte(data), 0644)
    if err != nil {
        return fmt.Errorf("cannot write JSON data to file: %w", err)
    }

    return nil
}

OK, now let's call this function in our unit test, check that no error is returned, and run the test.



func TestFileSerializer(t *testing.T) {
    t.Parallel()

    binaryFile := "../tmp/laptop.bin"
    jsonFile := "../tmp/laptop.json"

    laptop1 := sample.NewLaptop()

    err := serializer.WriteProtobufToBinaryFile(laptop1, binaryFile)
    require.NoError(t, err)

    err = serializer.WriteProtobufToJSONFile(laptop1, jsonFile)
    require.NoError(t, err)

    laptop2 := &pb.Laptop{}
    err = serializer.ReadProtobufFromBinaryFile(binaryFile, laptop2)
    require.NoError(t, err)

    require.True(t, proto.Equal(laptop1, laptop2))
}

Voilà, the laptop.json file is successfully created!

As you can see, the field names are exactly the same as we defined in our proto files, which is lower_snake_case.

Now if we change the config OrigName to false, and rerun the test, the field names will change to lowerCamelCase. Similarly, all the enum fields are now written in string format, such as the IPS panel. If we change the EnumsAsInts config to true, and rerun the test, the panel will become an integer instead:



{
  "id": "bcba7da0-75c1-4a00-b3e4-15ab0c8b0e56",
  "brand": "Lenovo",
  "name": "Thinkpad P53",
  "cpu": {
    "brand": "AMD",
    "name": "Ryzen 7 PRO 2700U",
    "numberCores": 8,
    "numberThreads": 10,
    "minGhz": 2.133907067057327,
    "maxGhz": 2.5030895265517077
  },
  "ram": {
    "value": "7",
    "unit": 5
  },
  "gpus": [
    {
      "brand": "Nvidia",
      "name": "GTX 1660-Ti",
      "minGhz": 1.455518372547095,
      "maxGhz": 1.684519055820871,
      "memory": {
        "value": "6",
        "unit": 5
      }
    }
  ],
  "storages": [
    {
      "driver": 2,
      "memory": {
        "value": "981",
        "unit": 5
      }
    },
    {
      "driver": 1,
      "memory": {
        "value": "1",
        "unit": 6
      }
    }
  ],
  "screen": {
    "sizeInch": 13.843882,
    "resolution": {
      "width": 6161,
      "height": 3466
    },
    "panel": 2,
    "multitouch": false
  },
  "keyboard": {
    "layout": 3,
    "backlit": false
  },
  "weightKg": 1.3791412402755228,
  "priceUsd": 2732.0172467953116,
  "releaseYear": 2015,
  "updatedAt": "2020-03-15T18:39:01.781405Z"
}

Now I want to make sure that the generated laptops are different every time we run the test. So let's run go test ./... multiple times to see what happens. The ... here means that we want to run unit tests in all sub-packages.

OK, looks like only the unique ID changes; the rest stays the same. That's because, in Go versions before 1.20 (this project uses 1.13), the rand package starts from a fixed seed value by default. We have to tell it to use a different seed for each run.

So let's create an init() function in the random.go file. This is a special function that is called automatically when the package is initialised, before any of the package's other functions are used. In this function, we will tell rand to use the current Unix time in nanoseconds as the seed value.



func init() {
    rand.Seed(time.Now().UnixNano())
}

Now if we run the test multiple times, the laptop data will be changed. Excellent!
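
To see the effect of the seed in isolation, here is a minimal sketch. It assumes a private generator created with rand.New(rand.NewSource(seed)) instead of the global one seeded above, and firstDraws() is a hypothetical helper name.

```go
package main

import (
    "fmt"
    "math/rand"
    "time"
)

// firstDraws returns the first n values in [0, 100) produced by a
// generator that starts from the given seed.
func firstDraws(seed int64, n int) []int {
    r := rand.New(rand.NewSource(seed))
    out := make([]int, n)
    for i := range out {
        out[i] = r.Intn(100)
    }
    return out
}

func main() {
    // Same seed, same sequence: this is why the laptops did not change.
    fmt.Println(firstDraws(1, 5))
    fmt.Println(firstDraws(1, 5))

    // A time-based seed should give a different sequence on each run.
    fmt.Println(firstDraws(time.Now().UnixNano(), 5))
}
```

A fixed seed can still be handy when we want to reproduce a failing test with exactly the same data.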

Alright, now let's define a make test command to run the unit tests. We can use the -cover flag to measure the code coverage of our tests, and the -race flag to detect any race condition in our code:



gen:
    protoc --proto_path=proto proto/*.proto --go_out=plugins=grpc:pb

clean:
    rm pb/*.go 

server:
    go run cmd/server/main.go -port 8080

client:
    go run cmd/client/main.go -address 0.0.0.0:8080

test:
    go test -cover -race ./...

Now run make test in the terminal. As you can see, 73.9% of the code in the serializer package is covered.

We can also go to the test file, and click "run package tests" at the top. It will run all tests in that package and report the code coverage.

Then we can open the files to see which part of our code is covered and which one is not. The covered code is in blue, and uncovered code is in red. This is very useful for us to write a strong test set to cover different branches.

Compare size of binary and JSON file

There's one more thing I want to show you before we switch to Java. Let's open the terminal, go to the tmp folder and run the ls command.

We can see that the size of the JSON file is about 5 times bigger than that of the binary file.

So we will save a lot of bandwidth when using gRPC instead of a normal JSON API. And since it's smaller, it's also faster to transport. That's the beautiful thing of a binary protocol.
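
To get an intuition for where the difference comes from, here is a minimal sketch that encodes one integer field by hand: a tag byte followed by a varint, as in the protobuf wire format, next to the same field written as JSON text. The field number 2 and the encodeUintField() helper are assumptions made for this illustration; the real field numbers come from the proto files.

```go
package main

import (
    "encoding/binary"
    "fmt"
)

// wireVarint is the protobuf wire type used for integer fields.
const wireVarint = 0

// encodeUintField encodes one unsigned integer field as a tag
// (field number and wire type) followed by the value, both as varints.
func encodeUintField(fieldNumber int, value uint64) []byte {
    tmp := make([]byte, binary.MaxVarintLen64)
    var buf []byte

    n := binary.PutUvarint(tmp, uint64(fieldNumber)<<3|wireVarint)
    buf = append(buf, tmp[:n]...)

    n = binary.PutUvarint(tmp, value)
    buf = append(buf, tmp[:n]...)
    return buf
}

func main() {
    bin := encodeUintField(2, 3466)
    text := `"height": 3466`

    // The binary form has no field name, only a small number.
    fmt.Println(len(bin), len(text)) // 3 14
}
```

The JSON text repeats every field name in every message, while the binary form only carries field numbers. The exact ratio depends on the message, so the 5 times seen here is specific to this laptop object.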

If you like the article, please subscribe to our Youtube channel and follow us on Twitter for more tutorials in the future.


Top comments (1)

Max Cian • Dec 6 '20

Thanks for your sharing, I have some update for you to check out.

I found a version mismatching issue when I followed your tutorial.

My versions:
protoc-gen-go v1.23.0
protoc v3.13.0

The error message is shown when I use jsonpb.Marshaler{}.MarshalToString:

message ProtoMessage
cannot use message (type protoreflect.ProtoMessage) as type protoiface.MessageV1 in argument to marshaler.MarshalToString:
    protoreflect.ProtoMessage does not implement protoiface.MessageV1 (missing ProtoMessage method)
multiple-value marshaler.MarshalToString() in single-value context

Following the solution discussed on github.com/golang/protobuf/issues/..., I replaced the ProtobufToJSON function with the following to solve this version mismatch:

func ProtobufToJSON(message proto.Message) (string, error) {
    marshaler := protojson.MarshalOptions{
        Indent:          "  ",
        UseProtoNames:   true,
        EmitUnpopulated: true,
    }
    b, err := marshaler.Marshal(message)
    return string(b), err
}
