DEV Community

The difference between "binary" and "text" files

David Peter on December 30, 2018

This article explores the topic of "binary" and "text" files. What is the difference between the two (if any)? Is there a clear definition for what...
 
Stephen Jones • Edited

It's a good, informative article but most people have no clue what a "Unicode code point" is. Instead, I'd make the following distinction:

  • A text file consists of plain, unformatted words, letters and punctuation intended to be readable by humans. In a text file, every 8- or 16-bit "code" corresponds with exactly one letter, number or punctuation mark.

  • A binary file consists of complex structured data meant primarily to be read by applications that translate those structures into something useful to humans (pictures, audio, video, richly formatted text, etc.).

Still, I imagine that 99% of computer users these days (except developers) never deal with text files directly. Almost all content people care about lives in binary files. The exceptions to that rule are some office documents in XML or RTF format that, while they might technically be text documents, are so densely coded and packed with syntax that they might as well be considered binary.

David Peter

Thank you for the feedback. See answers on Reddit: reddit.com/r/programming/comments/...

Theofanis Despoudis

It's useful to analyse what similar libraries do to check whether a file is binary:

github.com/search?q=isBinary
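
As a rough illustration of the kind of check such libraries tend to perform, here is a minimal sketch of one common heuristic: treat the content as binary if a NUL byte shows up near the start. The names `looks_binary` and `file_looks_binary` and the 8000-byte sample size are assumptions made for this example, not taken from any particular library.

```python
def looks_binary(data: bytes, sample_size: int = 8000) -> bool:
    """Heuristic: report 'binary' if a NUL byte occurs in the first sample_size bytes."""
    # sample_size is an illustrative default, not a standard value.
    return b"\x00" in data[:sample_size]


def file_looks_binary(path: str, sample_size: int = 8000) -> bool:
    """Apply the same heuristic to the beginning of a file on disk."""
    with open(path, "rb") as handle:
        return looks_binary(handle.read(sample_size), sample_size)
```

Note that this is only a heuristic: UTF-8 text passes, but UTF-16 encoded text contains NUL bytes for ASCII characters and would be reported as binary.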

rhymes • Edited

Hi David, fantastic post and explanation!

BTW I absolutely love bat, I aliased it to cat months ago :-D

David Peter

Thank you, glad you liked it!

Eldin Egrlić

Good post. So basically

text file is: 00 7A CA 8S 0F DE SO

binary file is: 00 01 00 01 00 10

and if a file contains more 00 than usual, it's considered binary :P

David Peter

Good point. I have renamed the article. Thank you.