Emoji Encoding, Unicode, and Internationalization with Naomi Meyer

Naomi is a Software Development Engineer at Adobe on the Globalization, Core Services team where she works on the internationalization and localization of Creative Cloud products. Before coding full time, Naomi worked as a teacher across Asia and West Africa. She enjoys weekends outside - hiking, camping, and riding bikes.

Talk: Emoji Encoding, Unicode οΏ½, and Internationalization

Abstract: Why does 'πŸ‘©πŸΏβ€πŸŽ€'.length = 7? Is JavaScript UTF-8 or UTF-16? What happens under the hood when you set ? Have you ever wondered how emoji and complex scripting languages are encoded to work correctly across browsers and devices - for billions of people around the world? Or how new emoji are introduced and approved? Have you ever seen one of these: β–‘ οΏ½ β€œspecial” glyph characters before and want more information on why they might appear and how to avoid them in the future? Let’s talk about Unicode encoding in JavaScript and across the world wide web! We’ll go over best practices, common pitfalls, and provide resources to learn more - even where to go if you want to submit a new emoji proposal! πŸ˜€
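
As a rough illustration of the first question in the abstract, here is a minimal sketch that measures the same emoji three ways. It assumes a runtime with `Intl.Segmenter` (recent browsers, Node 16+); the variable names are made up for this example, and the emoji is written with escapes so the snippet does not depend on the file's encoding.

```javascript
// woman + dark skin tone modifier + zero-width joiner + microphone
const singer = '\u{1F469}\u{1F3FF}\u200D\u{1F3A4}';

// .length counts UTF-16 code units: 2 + 2 + 1 + 2
const codeUnits = singer.length;

// Spreading a string iterates by Unicode code point
const codePoints = [...singer].length;

// Intl.Segmenter groups code points into user-perceived characters
const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
const graphemes = [...segmenter.segment(singer)].length;

console.log(codeUnits, codePoints, graphemes); // 7 4 1
```

So the "7" is a count of 16-bit code units, not of characters: four code points are stored, and a person sees one emoji.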

Here is a download link to the talk slides (PDF)

This talk will be presented as part of CodeLand:Distributed on July 23. After the talk is streamed as part of the conference, it will be added to this post as a recorded video.

whoa, as someone who minored in linguistics (to complement my STEM major) I find this blend of language and tech amazing!


Thanks Adriana! If you like this kinda thing, I highly recommend the book Because Internet by Gretchen McCulloch. I love blending linguistics and tech! 🥳


thanks for the reading suggestion, Naomi! I always enjoy a good read 😊 seeing people like you mixing it up is inspiring for a recent grad who is still exploring careers

So nice to see more and more linguists here 🤓


Thank you for this amazing talk!

Given that different platforms can have their own variation of each emoji, are they doing a good job of keeping them similar? Or are they straying further and further apart? Is anybody keeping them in check?


Thanks Mac, this is a great question! The gun/pistol emoji is a perfect example of this controversial issue; check out the many different articles that discuss it! The TL;DR is that there's not really anyone keeping it in check right now. Overall, most emoji platforms/vendors are pretty consistent, but there have definitely been notable controversial differences.


What do you think of using emojis in URLs/domain names?
I used to have one that redirected to my site. As a 2-"character" thing, it worked anywhere and could express what I wanted real quick.


Oh cool, yeah those are cute 😊! How did you do it?


It was a trend here on dev.to for a while. The gist was to get the Latin representation of the emoji and register that for the domain, and then link to it online with the emojis
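
The "Latin representation" here is presumably the Punycode (`xn--`) form that the DNS actually stores. A minimal sketch of that conversion, assuming a runtime with the standard `URL` class (browsers, Node); the hostname below is a made-up example on the reserved `.example` TLD, not a real registration:

```javascript
// Hypothetical emoji hostname: U+1F60A (smiling face) on the reserved .example TLD
const emojiHost = '\u{1F60A}.example';

// The URL parser converts internationalized hostnames to their ASCII form
const asciiHost = new URL(`https://${emojiHost}/`).hostname;

// Prints an ASCII-only hostname whose first label starts with "xn--"
console.log(asciiHost);
```

The ASCII form is what you would register; browsers can then show the emoji form when you link to it.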


Oooh I also just found this Wikipedia article on it: en.m.wikipedia.org/wiki/Emoji_domain

Awesome - good to know 🥳


Are emojis the most culturally-aware and culturally-impactful software concept in existence?

What is the biggest impact emoji proliferation has had on both coding and culture?


This could probably be an entire dissertation on its own... I'd read it!


Agreed! I think including emoji in Unicode has been great for internationalization of software because it forces developers to properly encode emoji and ALL multi-byte characters so they render on the page, which expands the supported character set to many additional languages! 👍

I didn't have time to discuss this in the talk, but another huge benefit of emoji is that they're great motivators for users to upgrade their software to the latest version: users get the newest emoji set from Unicode while ALSO getting enormous security benefits by upgrading! A few years ago WordPress used emoji as motivation for users to upgrade when the real reason they needed to upgrade was a serious security vulnerability.

I also like the idea that emoji are used as written gestures 🎉


I've done some digging on Unicode and emojis, but this talk goes in depth, which is great. It's wild how weird it gets, but it makes sense. Thanks for breaking it all down in a way that makes it clear what is happening.


Thanks Shannon! 🎉


Thanks for the great talk, @naeohmi! What is the process like to make an emoji "official"? Is it hard to do, and are there any big requirements to keep in mind? Also: I LOVE YOUR SLIDES!!!


Curious as well.


Thanks so much! 😊 The current process is pretty intensive and can take years to get through, so yes, I would say it's not really easy to do.

There's a great documentary on YouTube that helps to answer this question and that I highly recommend; it goes in depth on how the Trans flag emoji was added!

Also, here's the official Unicode documentation on the full process! And some more info from Emojipedia on emoji proposals 🎉


That talk was awesooooooome! The encoding for strings in JS blew my mind. I now know how to write 'café' 😄
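
For anyone curious about the 'café' case, here is a minimal sketch of the two ways that word can be stored, assuming the point is precomposed versus combining characters. The variable names are made up for this example.

```javascript
const composed = 'caf\u00E9';    // é as one precomposed code point (U+00E9)
const decomposed = 'cafe\u0301'; // e followed by U+0301 COMBINING ACUTE ACCENT

// Both render as "café", but they are different strings
console.log(composed.length, decomposed.length); // 4 5
console.log(composed === decomposed);            // false

// Normalizing to the same form makes them comparable
console.log(composed === decomposed.normalize('NFC')); // true
console.log(composed.normalize('NFD') === decomposed); // true
```

Normalizing both sides with `normalize()` before comparing or storing user input avoids surprises like a search for "café" missing a visually identical entry.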

Also that part of proposing an emoji was really cool. It reminded me of this emoji which is a very popular food in my country 😅


Thanks Juan! I'm happy you found it useful 😁




This was an amazing talk! The content, slides, and delivery were all excellent 👍

What is the future of Unicode? Are there any new standards on the horizon?


The Unicode Consortium is alive and strong and adding additional characters and emoji with each new version! Read more about Unicode today here 🎉


I love your enthusiasm and the topic is fascinating! I'm learning so much about how emoji work.


🥳 awesome, thanks Nadica!


@naeohmi Oh my god, that addition of Unicodes just blew my mind. Your energy is on another level. Thank you for your talk!


🤣 Thanks Adnan, I get excited about Emoji Encodings! 🥳


Wow, I'll never look at emojis the same way again. 😆

Great talk Naomi!


I wanna know more!


What a fun talk! Love your enthusiasm for this. :) I knew there was weirdness with emoji, but hadn't really dug into it yet, so this was informative and fun!


Thanks Linda! 😁


This was so fascinating! 🤯 Thank you so much for the incredible talk @naeohmi.


Thanks Julianna! 😊


Love this talk! U+1F60D


JS uses UTF-16? TIL


JavaScript is a mix between UCS-2 and UTF-16. Here's a great article with more in-depth information 👍
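
A quick sketch of what that mix looks like in practice, using U+1F60D from the comment above. The older index-based methods see 16-bit code units (UCS-2 style), while the newer code-point methods understand surrogate pairs (UTF-16 style). The variable names are made up for this example.

```javascript
const heartEyes = String.fromCodePoint(0x1F60D); // U+1F60D

// UCS-2 style view: two separate 16-bit code units
const high = heartEyes.charCodeAt(0); // 0xD83D, high surrogate
const low = heartEyes.charCodeAt(1);  // 0xDE0D, low surrogate
console.log(heartEyes.length, high.toString(16), low.toString(16)); // 2 d83d de0d

// UTF-16 style view: the pair decodes to a single code point
console.log(heartEyes.codePointAt(0).toString(16)); // 1f60d

// The same decoding done by hand
const decoded = (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
console.log(decoded === 0x1F60D); // true
```

Anything that indexes or slices by code unit (`charAt`, `substring`, `length`) can split a pair in half, so iterating with `for...of` or `codePointAt` is the safer default for emoji.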