This article explores the topic of "binary" and "text" files. What is the difference between the two (if any)? Is there a clear definition for what...
It's a good, informative article but most people have no clue what a "Unicode code point" is. Instead, I'd make the following distinction:
A text file consists of plain, unformatted words, letters and punctuation intended to be readable by humans. In a text file, every 8- or 16-bit "code" corresponds with exactly one letter, number or punctuation mark.
A binary file consists of complex structured data meant primarily to be read by applications that translate those structures into something useful to humans (pictures, audio, video, richly formatted text, etc).
Still, I imagine that 99% of computer users these days (except developers) never deal with text files directly. Almost all content people care about lives in binary files. The exceptions to that rule are some office documents in XML or RTF format that, while they might technically be text documents, are so densely coded and packed with syntax that they might as well be considered binary.
Thank you for the feedback. See answers on Reddit: reddit.com/r/programming/comments/...
It's useful to analyse what similar libraries do to check whether a file is binary:
github.com/search?q=isBinary
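For illustration, one heuristic that shows up in many such checks is to look for a NUL byte (0x00) in the first few kilobytes of the file. Below is a minimal Python sketch of that idea. The function names and the 8000-byte sample size are my own assumptions for the example, not taken from any particular library.

```python
def looks_binary(data: bytes, sample_size: int = 8000) -> bool:
    """Heuristic: call the data binary if a NUL byte occurs in the first
    sample_size bytes. The sample size is an assumed default."""
    return b"\x00" in data[:sample_size]


def file_looks_binary(path: str, sample_size: int = 8000) -> bool:
    """Apply the same heuristic to the beginning of a file on disk."""
    with open(path, "rb") as f:
        return looks_binary(f.read(sample_size), sample_size)
```

Note that this is only a heuristic: UTF-16 encoded text contains NUL bytes for ASCII characters and would be flagged as binary, so a more careful check would probably look for a byte order mark first.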
Hi David, fantastic post and explanation!
BTW I absolutely love bat, I've aliased it to cat months ago :-D
Thank you, glad you liked it!
Good post. So basically
text file is: 00 7A CA 8S 0F DE SO
binary file is : 00 01 00 01 00 10
and if a file contains more 00 than usual, it's considered binary :P
Good point. I have renamed the article. Thank you.