For the last three posts, we have handled only text. Now we will delve into binary files and how to read from and write to them. If you don't know what a binary file is, your first question is probably "What is it?" Good question, especially for something we are about to use.
A binary file is a file that isn't intended to be read as text. Instead, its contents are meant to be interpreted as raw numerical values.
When opening a binary file, we additionally set the mode to binary by appending a 'b' to the mode string.
Example:
FILE *binary_file = fopen(path, "rb");
Two functions you will become familiar with are fread() and fwrite(). They are usually reserved for working with binary files but are just as capable of operating on text files.
Both take the same number of arguments, and all but the first serve exactly the same purpose. For fwrite(), the first argument points to the data that will be written to the file, whereas for fread() it is the destination for any data read from the file. Because the first argument is a void pointer, we can read or write any type of data. The second argument is the size of each object in bytes. The third is how many objects we want to read or write. The fourth and last is the file stream. Both functions return the number of objects actually read or written.
I feel like you now deserve an example:
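#include <stdio.h>

int main(void)
{
    FILE *file = fopen("test", "wb");
    if (file == NULL)
        return 1; /* could not create the file */
    /* Each value is an ASCII code: 'A', 'B', 'C', '\n'. */
    int numbers[4] = { 65, 66, 67, 10 };
    fwrite(numbers, sizeof(int), 4, file);
    fclose(file);
    return 0;
}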
Output:
~/Desktop
➜ clang main.c
~/Desktop
➜ ./a.out
~/Desktop
➜ cat test
ABC
~/Desktop
➜
If you try to open the file with a text editor, it will most likely complain with an error such as "Unknown encoding". If you use a terminal command like cat, you will see that the contents are A, B, and C.
Even though we didn't write any text, we used the characters' corresponding ASCII values. And if you didn't know, the number 10 is the ASCII value for a newline.
Next
In the next post, we'll leverage fread() and fwrite() a bit more to read and write entire C structures!
Top comments (1)
I wondered why a text editor would think that the file is binary even though all the numbers we wrote can be interpreted as ASCII. But you're right, the text editor complains. It becomes clear when we use xxd to look into the file instead of cat.
We see that each number we wrote resulted in a 4-byte block, e.g. 65 (dec) == 0x41 (hex), and in the file we see the bytes 41 00 00 00 (the least significant byte comes first on little-endian machines such as x86-64). The reason is the type of the numbers we wrote: int. An int is 4 bytes on most current platforms, so each number takes 4 bytes in the output file, and the extra zero bytes are most likely what the text editor objects to.
There's also a type with a size of only 1 byte: char. So let's do the same thing again with char instead of int:
Now the file contains only four bytes: 41 42 43 0a.
And now the text editor won't complain anymore.