Juice County Prodigy for andcomputers


Anatomy of The Emoji

This post originally appeared on &computers

Plus Kimoji just shut down the app store, ah!
And we made a million a minute, we made a million a minute
(We did)
Yeah, we made a million a minute, we made a million a minute, ah!
- Kanye West: Facts (Charlie Heat Version)

"Kimojis" are actually NOT an officially recognized set of emojis. They're literally just pictures with no universally accepted text representation.

This &!%$ Emojis Not Even Real Unicode!!!

What Is an Emoji Really Though

Every emoji is represented by one or more Unicode code points, each written as "U+" followed by a hexadecimal number. For instance, take the code point U+1F412:


U+1F412: also referred to as a monkey in some parts of the world.
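If you want to poke at this yourself, here is a minimal sketch in Python 3 that turns a "U+XXXX" label into the actual character and looks up its official name. The helper name `describe` is made up for this example; `chr` and `unicodedata.name` are from the standard library.

```python
import unicodedata

def describe(code_point):
    """Turn a 'U+XXXX' label into (character, official Unicode name)."""
    # Drop the leading "U+" and parse the rest as a hexadecimal number.
    value = int(code_point[2:], 16)
    char = chr(value)
    return char, unicodedata.name(char)

char, name = describe("U+1F412")
print(char, name)  # prints the monkey emoji followed by MONKEY
```

The name comes from the Unicode data bundled with your Python build, so very new emojis may be missing on older versions.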

These representations are decided upon by a very official non-profit organization called the Unicode Consortium, whose members include Apple, Google, IBM, Adobe Systems, and a number of other corporations with interests in text processing.

A special committee within the consortium decides which emojis get included as part of the official Unicode Standard. The Unicode Consortium is also responsible for a number of other really important things, such as figuring out how computers should represent text from languages with completely different character sets. [1]

Such considerations can be a huge deal for developers, especially when attempting to create global applications that will serve a multilingual user base. It's equal parts beautiful, disturbing, and inspiring exactly how much thought goes into these things. [2]

The whole point of Unicode, generally speaking, is to create a unified and universal representation of characters for use across computing. Luckily for everyone, this includes the emojis that provide a much appreciated level of richness to text conversations around the world.

Putting it All Together

The following equation shows how different emojis are combined to produce lots of different representations. It's almost like its own little language. Actually, I think that's exactly what it is.

A few interesting notes:

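  1. U+1F3FF is one of five modifier characters (U+1F3FB through U+1F3FF) used to represent skin tones. Emoji skin tones are based on the Fitzpatrick scale, which classifies skin into six types based on how it responds to ultraviolet (UV) light, not ethnicity. Unicode folds the two lightest types into a single modifier, which is why there are five modifiers rather than six.

  2. You may notice two extra code points: U+200D and U+FE0F. These are special invisible characters that help manage the display/form of different characters or combinations of characters. U+200D (ZERO WIDTH JOINER) glues neighboring emojis into a single glyph, and U+FE0F (VARIATION SELECTOR-16) asks for the colorful emoji presentation of the preceding character instead of the plain text one.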
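To see the two notes above in action, here is a rough Python 3 sketch that builds a couple of combined emojis by hand. The examples (a thumbs up with a skin tone, and the rainbow flag) are my own picks for illustration and are not necessarily the ones from the equation; the constant and function names are made up too.

```python
import unicodedata

THUMBS_UP = "\U0001F44D"   # U+1F44D THUMBS UP SIGN
DARK_SKIN = "\U0001F3FF"   # U+1F3FF, the darkest skin tone modifier
WHITE_FLAG = "\U0001F3F3"  # U+1F3F3 WAVING WHITE FLAG
RAINBOW = "\U0001F308"     # U+1F308 RAINBOW
ZWJ = "\u200D"             # U+200D ZERO WIDTH JOINER
VS16 = "\uFE0F"            # U+FE0F VARIATION SELECTOR-16

def code_points(text):
    """List the 'U+XXXX' label of every code point in a string."""
    return ["U+%04X" % ord(ch) for ch in text]

# A modifier simply follows the emoji it modifies.
dark_thumbs_up = THUMBS_UP + DARK_SKIN

# The rainbow flag is a white flag (forced to emoji style by VS16)
# joined to a rainbow with a zero width joiner.
rainbow_flag = WHITE_FLAG + VS16 + ZWJ + RAINBOW

print(dark_thumbs_up, code_points(dark_thumbs_up))
print(rainbow_flag, code_points(rainbow_flag))
```

Whether these show up as one glyph or several depends on the font and platform doing the rendering. Either way, Python counts each one as multiple characters, which matters if you are counting or slicing text.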

Bonus Section

If you happen to be a developer or are just interested in fooling around, here are two code snippets you might be interested in. I'd suggest them if you are considering doing any type of project with emojis and text processing. Also, check out this post for an interesting example of how I used these in a fun project.

Regular expression in Python 2 for finding emojis within text.
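For anyone reading without the embed, here is a stand-in sketch. It is not the original snippet: it assumes Python 3 (where strings are sequences of full code points, so no surrogate-pair tricks are needed) and only covers a handful of common emoji blocks, so treat the ranges as a starting point rather than a complete list. The names `EMOJI_PATTERN` and `find_emojis` are made up for this example.

```python
import re

# One character class covering a few well-known emoji blocks.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F5FF"  # Miscellaneous Symbols and Pictographs
    "\U0001F600-\U0001F64F"  # Emoticons
    "\U0001F680-\U0001F6FF"  # Transport and Map Symbols
    "\U0001F900-\U0001F9FF"  # Supplemental Symbols and Pictographs
    "\u2600-\u26FF"          # Miscellaneous Symbols
    "\u2700-\u27BF"          # Dingbats
    "]"
)

def find_emojis(text):
    """Return every matching emoji code point in text, in order."""
    return EMOJI_PATTERN.findall(text)

print(find_emojis("good morning \U0001F412 have a banana \U0001F34C"))
```

Note that this matches one code point at a time, so a combined emoji (skin tone modifier, joiner sequence) comes back as separate pieces.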

This second one isn't getting embedded because it'd be a lot of scrolling, but check it out here: a dictionary with emoji code points as keys and their descriptions as values.
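If you would rather generate a dictionary like that than copy one, a small slice of it can be sketched with Python 3's standard `unicodedata` module. This is an assumption about how you might build one, not how the linked dictionary was made, and the default range (U+1F400 to U+1F43F, a run of animal emojis) is just an example.

```python
import unicodedata

def build_emoji_names(start=0x1F400, end=0x1F43F):
    """Map 'U+XXXX' labels to lowercase Unicode names for a code point range."""
    names = {}
    for value in range(start, end + 1):
        # name() returns the default (None) for unassigned code points.
        name = unicodedata.name(chr(value), None)
        if name is not None:
            names["U+%04X" % value] = name.lower()
    return names

emoji_names = build_emoji_names()
print(emoji_names["U+1F412"])  # monkey
```

The official names are a bit stiff compared with hand-written descriptions, but they are free and always match the standard.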


[1] Unicode characters in Arabic
[2] Required reading for developers
