Benjamin Black

Posted on Sep 13, 2018 • Edited on Aug 19, 2019

Save page weight with web font subsetting

#fonts #unicode

My posts are usually notes and reference materials for myself, which I publish here with the hope that others might find them useful.

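Web fonts are a wonderful thing for design, but the font files can be very heavy and slow down the user experience. Fortunately, a font can be subset to just the glyphs a site needs, and all modern browsers support the unicode-range descriptor that tells them which characters a subset file covers.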

A CDN like Google Web Fonts does a very good job of automatically subsetting fonts and delivering the optimized font files, but sometimes we want or need to host our own font files.

In that case, we can produce our own subset fonts. The technique scans the text of a web site to build a list of the glyphs ("characters") which the site actually uses, then constructs a subset font file which excludes all the unused glyphs. The result is a significantly (sometimes massively) smaller font file.
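To make the idea concrete, here is a minimal Python sketch of the "build a list of the glyphs" step. It is only an illustration, not glyphhanger's implementation: the function name `to_unicode_range` is made up for this example, and it works on a plain string rather than on a rendered web page.

```python
def to_unicode_range(text: str) -> str:
    """Collapse the characters used in `text` into a unicode-range style list."""
    code_points = sorted({ord(ch) for ch in text})
    if not code_points:
        return ""

    # Group consecutive code points into (first, last) runs.
    runs = []
    start = prev = code_points[0]
    for cp in code_points[1:]:
        if cp == prev + 1:
            prev = cp
            continue
        runs.append((start, prev))
        start = prev = cp
    runs.append((start, prev))

    # A single code point is written U+XX, a run is written U+XX-YY.
    return ",".join(
        f"U+{lo:X}" if lo == hi else f"U+{lo:X}-{hi:X}" for lo, hi in runs
    )


print(to_unicode_range("Hello, world"))
# prints: U+20,U+2C,U+48,U+64-65,U+6C,U+6F,U+72,U+77
```

Every character that never appears in the input is left out of the list, which is exactly why the subset font ends up so much smaller.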

We'll use the fonttools Python package (and its dependencies) and the glyphhanger npm package.

As an example, let's look at minimizing the Lato font as used by Cracked, as if Cracked were to host the font files themselves.

But first, let's install some necessary Python utilities -- fonttools, zopfli (for WOFF), and brotli (for WOFF2):

$ pip install fonttools zopfli brotli

The 'Regular' (normal) weight of Lato comes in at 117 KB in TTF format, and 32 KB in WOFF2 format.

First, we need to accumulate a list of all of the glyphs which our site actually uses.

Use the glyphhanger utility to output the Unicode range of the glyphs from a specific font family that a website actually uses. I chose this tool because its output format is the same one accepted as input by the pyftsubset utility and by the unicode-range CSS descriptor.

$ npx glyphhanger http://cracked.com --family='Lato' > glyphs.txt
...

$ cat glyphs.txt
U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019
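If you want to check which characters that list actually covers, a small Python sketch like the following can expand it. The helper name `expand_unicode_range` is hypothetical, and it only understands the two forms that appear in the output above (a single code point and a `first-last` interval), not the wildcard form that CSS also allows.

```python
from typing import List


def expand_unicode_range(range_list: str) -> List[int]:
    """Expand a list such as 'U+41-43,U+A9' into individual code points."""
    code_points: List[int] = []
    for part in range_list.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if not part.upper().startswith("U+"):
            raise ValueError(f"unexpected entry: {part!r}")
        first_hex, _, last_hex = part[2:].partition("-")
        first = int(first_hex, 16)
        last = int(last_hex, 16) if last_hex else first
        code_points.extend(range(first, last + 1))
    return code_points


GLYPHS = "U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019"

# Show the printable characters (skip the newline and the space).
print("".join(chr(cp) for cp in expand_unicode_range(GLYPHS) if cp > 0x20))
print(len(expand_unicode_range(GLYPHS)), "code points")
```

For the list above this comes to 72 code points. Note that Z, j and q are not among them, so any text added to the site later that uses those letters would fall back to another font until the subset is regenerated.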

Next, we need to extract the glyphs used by the site from the original font into a subset font.

Use the pyftsubset utility which comes with the fonttools package to create the subsets in WOFF/WOFF2 format:

$ pyftsubset Lato-Regular.ttf --unicodes-file=glyphs.txt --flavor=woff --with-zopfli

$ pyftsubset Lato-Regular.ttf --unicodes-file=glyphs.txt --flavor=woff2

These commands produce Lato-Regular.subset.woff and Lato-Regular.subset.woff2.

When I ran these commands, the resulting Lato-Regular.subset.woff2 was 11 KB, about one-third of the size of the full-set WOFF2 file (and less than 10% of the size of the TTF).
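The arithmetic behind those two comparisons, using only the file sizes quoted in this post:

```python
# File sizes in KB, as reported in this post.
ttf_kb = 117
woff2_kb = 32
subset_woff2_kb = 11

vs_woff2 = subset_woff2_kb / woff2_kb
vs_ttf = subset_woff2_kb / ttf_kb

print(f"subset vs full WOFF2: {vs_woff2:.0%}")  # prints: subset vs full WOFF2: 34%
print(f"subset vs TTF: {vs_ttf:.0%}")           # prints: subset vs TTF: 9%
```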

These subset font files can be used in a @font-face rule as usual, with the addition of a unicode-range descriptor indicating which glyphs are included in the font file. The range is the same one that the glyphhanger utility wrote to glyphs.txt.

@font-face {
  font-family: 'Lato';
  font-style: normal;
  font-weight: normal;
  src: url('Lato-Regular.subset.woff2') format('woff2'),
    url('Lato-Regular.subset.woff') format('woff');
  unicode-range: U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019;
}
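Because the unicode-range value has to stay in sync with glyphs.txt, it may be easier to generate the rule than to paste the range by hand. Here is a minimal Python sketch of that idea. The function name `font_face_rule` and its parameters are made up for this example, and it assumes the default `.subset.woff2` / `.subset.woff` file names produced above.

```python
def font_face_rule(family: str, basename: str, unicode_range: str) -> str:
    """Render a @font-face rule for a pair of subset WOFF2/WOFF files."""
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        "  font-style: normal;\n"
        "  font-weight: normal;\n"
        f"  src: url('{basename}.subset.woff2') format('woff2'),\n"
        f"    url('{basename}.subset.woff') format('woff');\n"
        # strip() drops the trailing newline that a text file usually ends with
        f"  unicode-range: {unicode_range.strip()};\n"
        "}\n"
    )


# In practice the range would be read from glyphs.txt, for example:
#   unicode_range = open("glyphs.txt").read()
# A short literal is used here so the sketch runs on its own.
rule = font_face_rule("Lato", "Lato-Regular", "U+A,U+20,U+26-29\n")
print(rule)
```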

ToDo: Integrate font subsetting into the build process.
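One possible starting point for that ToDo is a small Python helper that a build script could call. This is a sketch under assumptions, not a finished integration: the names `subset_commands` and `build_subsets` are hypothetical, the glyphhanger step is left out, and it assumes glyphs.txt already exists and pyftsubset is on the PATH. The flags are the same ones used in the commands earlier in this post.

```python
import subprocess
from typing import List


def subset_commands(font: str, unicodes_file: str) -> List[List[str]]:
    """Return the two pyftsubset invocations from this post as argv lists."""
    base = ["pyftsubset", font, f"--unicodes-file={unicodes_file}"]
    return [
        base + ["--flavor=woff", "--with-zopfli"],
        base + ["--flavor=woff2"],
    ]


def build_subsets(font: str, unicodes_file: str) -> None:
    """Run each command in turn; check=True stops the build on the first failure."""
    for argv in subset_commands(font, unicodes_file):
        subprocess.run(argv, check=True)


# Only print the commands here. To actually run them, call:
#   build_subsets("Lato-Regular.ttf", "glyphs.txt")
for argv in subset_commands("Lato-Regular.ttf", "glyphs.txt"):
    print(" ".join(argv))
```

Keeping command construction separate from execution makes the step easy to check without needing the font files at hand.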

Top comments (4)

Ahmed Musallam • Sep 14 '18

This is very cool! I remember wanting to do something like this, but did not know the tools you mentioned existed. I ended up using FontForge and deleting glyph ranges by hand through the GUI. If I recall correctly, it worked. I really wish I had documented that!!

Ben Halpern • Sep 13 '18

How does this work with a CDN? You're telling the host what data to send over?

stereobooster • Sep 13 '18

This data should be communicated through URL, like this:

@import url("http://fonts.googleapis.com/css?family=Lato:300,400,700&subset=latin");

Benjamin Black • Sep 13 '18

Google Web Fonts, for example, lets you choose a subset of the font you intend to use so that the font includes only, e.g., the so-called "Latin-1" characters, and not all of the extended characters.