DEV Community

Mátyás Mustoha

Pretty <ruby> for CJK languages

Recently, I've been experimenting with East Asian typography and with creating print-quality output using HTML and CSS. It didn't take long before I noticed something: rubies are ugly! I haven't really found any articles about the topic in English, so here's my attempt at one.

Wait, what?

If you're not familiar with the term “ruby”: rubies are small characters placed above the text, usually to provide pronunciation hints. For example, they can show furigana for Japanese or bopomofo for Chinese, but they can also be written in Latin letters.

A ruby element consists of a ruby base and a ruby text, which most often sits on top of the base:

Ruby base and ruby text

In HTML, we can use the <ruby> tag to define a whole group, in which <rb>[1] defines the ruby base and <rt> the ruby text.[2] (Spaces added below for readability.)

<ruby lang="ja"> <rb>東京</rb>  <rt>とうきょう</rt> </ruby>
<ruby lang="zh"> <rb>北京</rb>  <rt>Běijīng</rt> </ruby>
<ruby lang="vi"> <rb>河內</rb>  <rt>Hà Nội</rt> </ruby>

The naive approach

Now what happens when the ruby text is wider than the ruby base? By default, <ruby> acts sort of like a single block of text:

Default ruby style

In Japanese typography, however, it often looks more pleasant to spread the text over the neighboring characters, without any spacing[3]:

Pretty ruby style

This could be solved with a little CSS:

  • take the ruby text out of the regular text flow with position: absolute, then
  • align it horizontally to the center of its parent, with something like left: 50%; transform: translateX(-50%), and
  • move it to the top with bottom: 100%.
ruby {
    position: relative;
}
ruby rt {
    position: absolute;
    left: 50%;
    transform: translateX(-50%);
    bottom: 100%;
}

And this works perfectly fine in Firefox, producing the earlier image.

Unfortunately, the implementation in Chrome and Safari lags behind at the moment, and the position property does not seem to have any effect on <rt> there.

An alternative

If we cannot use the built-in <rt> element, we could try to replace it with the CSS pseudo-element ::before. If, instead of

<ruby>東京<rt>とうきょう</rt></ruby>

we write

<ruby data-rt="とうきょう">東京</ruby>

this stores the ruby text in a custom data attribute, which we can read from CSS with attr():

ruby[data-rt]::before {
    content: attr(data-rt);
}

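Then we apply the positioning rules from the very first styling attempt to ruby[data-rt]::before instead of ruby rt, and add a few more declarations to make it look like the original <rt> element: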

ruby[data-rt]::before {
    font-size: .5em;
    line-height: 1;
}

The result looks visually the same as our first attempt!

Sidenotes

The above approach should work for most cases, including vertical writing. Corner cases might appear, however, if you try to build on top of it. As usual, most of these can be solved with a hint of JavaScript.

  • Shorter text: You might also want to spread out the characters if the ruby base is wider than the ruby text. An approach for that is to split the text into individual characters with JavaScript, then spread them with flexbox styling (justify-content: space-around, for example, happens to match the Japanese styling specification). However, pseudo-elements are not part of the DOM, so you cannot modify them from JS; you might need to manually construct a child element for your <ruby> elements instead.
  • Body overflow: If you want to be very precise, you might want to handle ruby texts flowing out of the body text area, i.e. make the text align to one of the sides.
  • Overlaps: The ruby texts might overlap or touch, though in practice the chance for that shouldn't be too high. If this becomes an issue, you can detect such cases using getBoundingClientRect(), and add some padding if necessary.
  • Compound words: If you want to use multiple ruby texts in one single ruby element (e.g. per-character pronunciation), you might need to split the ruby elements. If the ruby base can break, e.g. at line ends, the ruby texts should probably follow that too.
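
For the overlap case, here is a minimal sketch of how the detection could look. It assumes horizontal, left-to-right text and that each ruby text is a real element (an <rt> or a manually constructed child), since pseudo-elements cannot be measured with getBoundingClientRect(). The function names, the gap parameter and the default "rt" selector are made up for this illustration. It also uses a margin rather than padding, because padding would widen the ruby box and move the centered ruby text by only half the amount.

```javascript
// How much extra space (in px) `next` needs so that it clears `prev`
// by at least `gap` pixels. Both arguments are rect-like objects with
// left/right/top/bottom, e.g. results of getBoundingClientRect().
// Assumes horizontal, left-to-right text (an assumption of this sketch).
function extraSpaceNeeded(prev, next, gap = 2) {
  // Rects on different lines cannot collide horizontally.
  const sameLine = prev.top < next.bottom && next.top < prev.bottom;
  if (!sameLine) return 0;
  // Negative distance means the two rects overlap, zero means they touch.
  const distance = next.left - prev.right;
  return distance < gap ? gap - distance : 0;
}

// Browser-only part: walk the ruby texts in document order and push
// each <ruby> to the right when its text collides with the previous one.
// `selector` must match real elements, not ::before pseudo-elements.
function separateRubyTexts(root, selector = "rt", gap = 2) {
  const texts = Array.from(root.querySelectorAll(selector));
  for (let i = 1; i < texts.length; i++) {
    // Rects are read inside the loop, so earlier shifts are accounted for.
    const extra = extraSpaceNeeded(
      texts[i - 1].getBoundingClientRect(),
      texts[i].getBoundingClientRect(),
      gap
    );
    const ruby = texts[i].closest("ruby");
    if (ruby && extra > 0) {
      // A margin moves the whole ruby (base and text) by the full amount.
      ruby.style.marginLeft = `${extra}px`;
    }
  }
}
```

In a page, this could be called as separateRubyTexts(document.body) once fonts have loaded, and again after the layout changes (for example on resize).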

You might also need to do some preprocessing, based on your source text:

  • From HTML: If your text is in HTML and already uses <ruby> and <rt>, you can use JavaScript to query all ruby elements, and move the text content from the <rt> into the data-rt attribute of the <ruby>.
  • From Markdown: If your text is in Markdown or similar, a common ruby pattern is like this: {東京|とう|きょう}, that is, {base|text1|text2|...|textN}, where each text segment is the reading of a base character.
  • From plain text: If you have plain text where the reading is next to the word (e.g. 東京《とうきょう》), you can always just replace them with a regular expression, as long as the writing is consistent.
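
The three cases above could be sketched roughly as follows. All function names here are made up for this illustration. The plain-text pattern assumes that the ruby base is the run of Han characters directly before 《…》, and the Markdown-like pattern falls back to a single ruby when the number of readings doesn't match the number of base characters. Treat it as a starting point, not a complete converter.

```javascript
// Escape a string so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Build the data-rt markup used in this article. The base is inserted
// as-is, so it is expected to be plain text or already valid HTML.
function rubyTag(base, reading) {
  return `<ruby data-rt="${escapeAttr(reading)}">${base}</ruby>`;
}

// From HTML (browser only): move the <rt> text into data-rt.
function moveRtToDataAttribute(root) {
  for (const ruby of root.querySelectorAll("ruby")) {
    const rts = ruby.querySelectorAll("rt");
    if (rts.length === 0) continue;
    ruby.dataset.rt = Array.from(rts, (rt) => rt.textContent).join("");
    // Drop <rt> and any <rp> fallback parentheses; the base stays.
    ruby.querySelectorAll("rt, rp").forEach((el) => el.remove());
  }
}

// From Markdown-like text: {base|text1|text2|...|textN}
function markdownToRuby(text) {
  return text.replace(/\{([^|{}]+)((?:\|[^|{}]+)+)\}/g, (_, base, rest) => {
    const readings = rest.slice(1).split("|");
    const chars = Array.from(base); // split by code point
    if (chars.length === readings.length) {
      // One reading per base character: emit one ruby per character.
      return chars.map((ch, i) => rubyTag(ch, readings[i])).join("");
    }
    // Otherwise treat the readings as one text for the whole base.
    return rubyTag(base, readings.join(""));
  });
}

// From plain text: 東京《とうきょう》
function plainToRuby(text) {
  return text.replace(
    /(\p{Script=Han}+)《([^《》]+)》/gu,
    (_, base, reading) => rubyTag(base, reading)
  );
}
```

The two string-based functions can run at build time (for example in Node.js), while moveRtToDataAttribute(document) needs a DOM and is meant for the browser.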

A nicely typeset page pleases the eye, and often requires just a tiny bit of additional care. If you happen to work with East Asian text a lot, I hope this will help to make your content look even better.


  1. The <rb> tag is actually unnecessary now (you can write the base text directly inside <ruby>), but in this example it shows the element structure more clearly. 

  2. For a long time, <ruby> wasn't well supported, so people also used “creative” solutions, like tables for alignment. You might still run into those on some sites. 

  3. See https://www.w3.org/TR/jlreq/?lang=en for the whole specification. 
