
Tom Deneire ⚡

Posted on • Originally published at tomdeneire.Medium

Handling Common Data Formats in Go (TXT)

Photo by [Alina Grubnyak](https://unsplash.com/@alinnnaaaa?utm_source=medium&utm_medium=referral) on [Unsplash](https://unsplash.com?utm_source=medium&utm_medium=referral)

Reading and writing data is one of the basic elements of programming. In Go, a compiled language with strong typing, this might seem somewhat more complicated than in languages like Python. Here’s how to handle the most common data formats like a true Gopher…

Text

“Data” does not necessarily have to be structured data. A plain .txt file (or any file format really) can be a legitimate source of data. Although opening a file and reading its contents may seem like a trivial enough task, there are still several ways to do this in Go.

Reading a whole file

One way is to read a complete file into memory. This is only an option when you know the file is going to be limited in size (configuration files and the like).

The canonical way to do this is to use os.ReadFile, which has been available since Go 1.16. (Older code uses ioutil.ReadFile, which is now deprecated.)


Output:
Hello
world

Reading a file bit by bit

When dealing with larger files, it is better to read them bit by bit, either as a limited number of bytes, or line by line, or even (in the case of text) word by word.

All of these options involve opening the file with os.Open and making sure it also gets closed. (In the previous example, os.ReadFile took care of that on its own.) In Go, we do that using the defer keyword, which schedules a function call to run when the surrounding function returns, even if it panics.

Byte chunks

The first method declares a byte slice of 4 bytes to serve as a buffer. It then reads the file in chunks of up to 4 bytes until it reaches the end of the file, which the Read() method signals by returning a specific error, namely io.EOF (end of file).



Output:

Hell = 4 bytes
o
wo = 4 bytes
rldo = 3 bytes

As you can see, this is not really optimal for handling text, as the byte chunks split up words arbitrarily. (The stray “o” at the end of the last chunk is left over from the previous read: the whole 4-byte buffer was printed, although only 3 bytes were read.) To counter that, Go has several handy functions in the bufio package (which is dedicated to buffered I/O, as in the previous example) that allow more text-oriented buffering.

Line by line

The bufio package uses a special type for these operations, called a Scanner. This type comes with a Scan() method, which lets you step through the ‘tokens’ of the input. What counts as a token is defined by a split function. Scanning stops at the end of the input or at the first error; that error is only returned when the Err method is called.



Output:

Hello
world

Word by word

Scanning line by line is actually the default split for a scanner, so we could just leave out the line:

scanner.Split(bufio.ScanLines)

If we want to use another split function, for instance to go word by word, we can simply replace said line with:

scanner.Split(bufio.ScanWords)

Other input sources

In the above, we have been reading text from a file, but you can apply the same techniques to other input sources. In Go, reading is most often done through an io.Reader, an interface that wraps the basic Read method, meaning that any type that implements Read satisfies it. As bufio.NewScanner takes an io.Reader as input, we can use all kinds of input sources.

For instance, you can read from standard input, os.Stdin (which is also an io.Reader), like this:

scanner := bufio.NewScanner(os.Stdin)

or you can create an io.Reader from another variable, e.g. a string:

input := "Hello world"
scanner := bufio.NewScanner(strings.NewReader(input))

or bytes:

input := []byte("Hello world")
scanner := bufio.NewScanner(bytes.NewReader(input))

You will now understand that os.Open, which we discussed earlier, also gives us an io.Reader: it returns an *os.File, a type that offers a Read method. The same goes for other I/O devices in Go, like network connections or pipes.

In other words, these readers are everywhere in Go, so it is important to know how to handle them well.

Stay tuned for the rest of this mini-series which will discuss TXT, CSV, JSON, XML and SQLite!


Hi! 👋 I’m Tom. I’m a software engineer, a technical writer and IT burnout coach. If you want to get in touch, check out https://tomdeneire.github.io
