DEV Community

Discussion on: Dev 101: Unicode

edA‑qa mort‑ora‑y

Emoji are kind of an accident. There were some Japanese character sets, and boards, that used a bunch of them. Unicode wanted to ensure it could encode all those existing sets, and thus added those funny symbols. Of course, it didn't take long before use of them exploded and they took on a life of their own.

Most of them aren't even part of the Basic Multilingual Plane in Unicode, since they weren't deemed essential enough. This is still a problem today, since languages like Java, whose strings are built from 16-bit units, don't handle them correctly.
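To make that concrete, here is a minimal Java sketch of what "not handled correctly" can look like. The class and method names are made up for illustration; the only library calls are `String.length()`, `String.codePointCount()` and `Character.isHighSurrogate()`. The sample character is U+1F60B, written as its surrogate pair so the source file's encoding doesn't matter.

```java
// Sketch only: EmojiLength and its helpers are hypothetical names.
final class EmojiLength {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F60B lies outside the Basic Multilingual Plane.
        String emoji = "\uD83D\uDE0B";
        System.out.println(codeUnits(emoji));   // 2
        System.out.println(codePoints(emoji));  // 1
        // charAt(0) returns only half of the character: a high surrogate.
        System.out.println(Character.isHighSurrogate(emoji.charAt(0))); // true
    }
}
```

Anything that indexes, truncates or reverses by `char` can split the pair, which is probably the most common way this goes wrong in practice.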

Ray

Thanks for the comment Edaqa!

I'd like to do a follow-up (Dev 102?) about some of the specific ways in which Unicode isn't handled well by various programming languages. In my research for the above I read a bit about languages (PHP, I believe, was one) which assume all Unicode characters are 16 bits wide and calculate string length based on this assumption. I'm sure there are others!
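As a rough illustration of why the 16-bit assumption miscounts, here is a small Java sketch of the UTF-16 surrogate arithmetic. The class and method names are hypothetical, and this is not meant to describe how any particular language implements its strings; it only shows that a code point above U+FFFF needs two 16-bit units.

```java
// Sketch only: SurrogateSketch and its helpers are hypothetical names.
final class SurrogateSketch {
    // How many 16-bit units UTF-16 needs for one code point.
    static int unitsNeeded(int codePoint) {
        return codePoint > 0xFFFF ? 2 : 1;
    }

    // Split a supplementary code point (above U+FFFF) into its surrogate pair.
    static char[] toSurrogates(int codePoint) {
        int offset = codePoint - 0x10000;               // 20 bits remain
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int emoji = 0x1F60B;
        char[] pair = toSurrogates(emoji);
        System.out.println(unitsNeeded('A'));    // 1
        System.out.println(unitsNeeded(emoji));  // 2
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]); // D83D DE0B
    }
}
```

A length function that simply counts 16-bit units would report 2 for that single emoji.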

From a psychology and linguistics point of view I find the history and adoption of emoji fascinating, and I'd love to know more about how people interpret the different emoji meanings, and how that impacts communication.

(PS: I checked out your cooking blog and it looks amazing!! Gonna give the soy seitan a try 😋)

edA‑qa mort‑ora‑y

Let me know if you write that article, as I have one I could update. "The string type is broken" goes into some of the common problems that exist with Unicode in languages. I've been curious whether it still applies, and it also needs a good edit. :)