Benjamin Black

Save page weight with web font subsetting

My posts are usually notes and reference materials for myself, which I publish here with the hope that others might find them useful.

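Web fonts are wonderful for design, but the font files can be heavy and slow down the user experience. Fortunately, a font can be subset to just the glyphs a site needs, and all modern browsers support the unicode-range descriptor that tells them which characters a subset file covers.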

A CDN like Google Web Fonts will do a very good job of automatically subsetting fonts, delivering the optimized font files, but sometimes we want or need to host our own font files.

We can also produce our own subset fonts. This technique scans the text of a web site to build a list of the glyphs ("characters") that the site actually uses, then constructs a subset font file that excludes all the unused glyphs, resulting in a significantly (sometimes massively) smaller font file.

We'll make use of the fonttools Python package (and its dependencies), and the glyphhanger npm package.

As an example, let's shrink the Lato font as used by Cracked, as if Cracked were hosting the font files themselves.

But first, let's install some necessary Python utilities -- fonttools, zopfli (for WOFF), and brotli (for WOFF2):

$ pip install fonttools zopfli brotli

The 'Regular' (normal) weight of Lato comes in at 117 KB in TTF format, and 32 KB in WOFF2 format.

First, we need to accumulate a list of all of the glyphs which our site actually uses.

Use the glyphhanger utility to output the Unicode range of the glyphs from a specific font family that a website actually uses. I chose this tool because its output format is the same one accepted as input by the pyftsubset utility and by the unicode-range CSS descriptor.

$ npx glyphhanger http://cracked.com --family='Lato' > glyphs.txt
...

$ cat glyphs.txt
U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019
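
To see exactly which characters that range covers, the list can be expanded with a few lines of Python. This is only a minimal sketch: `parse_unicode_range` is a hypothetical helper name, not part of glyphhanger or fonttools, and it assumes the simple `U+X` and `U+X-Y` forms shown above (no wildcards such as `U+4??`).

```python
def parse_unicode_range(text):
    """Expand a 'U+A,U+26-29' style list into a sorted list of code points."""
    codepoints = set()
    for item in text.replace("\n", ",").split(","):
        item = item.strip()
        if not item:
            continue
        # Drop the 'U+' prefix; what remains is hex, optionally 'start-end'.
        body = item[2:] if item.upper().startswith("U+") else item
        start, _, end = body.partition("-")
        first = int(start, 16)
        last = int(end, 16) if end else first
        codepoints.update(range(first, last + 1))
    return sorted(codepoints)


# The range from glyphs.txt in this post.
GLYPHS = "U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019"

codepoints = parse_unicode_range(GLYPHS)
print(len(codepoints), "code points")
# Print the visible characters only (skips the line feed U+A and the space U+20).
print("".join(chr(cp) for cp in codepoints if cp > 0x20))
```

Characters that never appear in the scanned text, such as `Z`, `j`, and `q` in this example, are absent from the range, so text that later uses them would fall back to another font.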

Next, we need to extract the glyphs used by the site from the original font into a subset font.

Use the pyftsubset utility which comes with the fonttools package to create the subsets in WOFF/WOFF2 format:

$ pyftsubset Lato-Regular.ttf --unicodes-file=glyphs.txt --flavor=woff --with-zopfli

$ pyftsubset Lato-Regular.ttf --unicodes-file=glyphs.txt --flavor=woff2

These commands produce Lato-Regular.subset.woff and Lato-Regular.subset.woff2.

When I ran these commands, the resulting Lato-Regular.subset.woff2 was 11 KB, about one-third of the size of the full-set WOFF2 file (and less than 10% of the size of the TTF).
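
As a quick check of those ratios, here is the arithmetic using the rounded sizes quoted in this post (117 KB TTF, 32 KB WOFF2, 11 KB subset WOFF2). The variable names are just for illustration.

```python
# Sizes in KB, as quoted in this post (rounded, so the ratios are approximate).
ttf_kb = 117
woff2_kb = 32
subset_woff2_kb = 11

vs_woff2 = subset_woff2_kb / woff2_kb
vs_ttf = subset_woff2_kb / ttf_kb

print(f"subset vs. full WOFF2: {vs_woff2:.0%}")  # 34%
print(f"subset vs. TTF:        {vs_ttf:.0%}")    # 9%
```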

These subset font files can be used in a @font-face rule as usual, with the addition of a unicode-range descriptor indicating which characters the font file covers. The range is the same one that glyphhanger wrote to glyphs.txt.

@font-face {
  font-family: 'Lato';
  font-style: normal;
  font-weight: normal;
  src: url('Lato-Regular.subset.woff2') format('woff2'),
    url('Lato-Regular.subset.woff') format('woff');
  unicode-range: U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019;
}
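
Since the unicode-range value is just the contents of glyphs.txt, the rule can be generated rather than copied by hand. The following is a minimal sketch assuming the file names used in this post; `font_face_rule` is a hypothetical helper, not part of either tool.

```python
def font_face_rule(family, basename, unicode_range, weight="normal", style="normal"):
    """Return an @font-face rule for the subset WOFF2/WOFF files of one font."""
    # Strip whitespace (such as a trailing newline) from each entry.
    unicode_range = ",".join(
        part.strip() for part in unicode_range.split(",") if part.strip()
    )
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        f"  font-style: {style};\n"
        f"  font-weight: {weight};\n"
        f"  src: url('{basename}.subset.woff2') format('woff2'),\n"
        f"    url('{basename}.subset.woff') format('woff');\n"
        f"  unicode-range: {unicode_range};\n"
        "}\n"
    )


# In a real run this string would be read from glyphs.txt.
glyphs = "U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019\n"
css = font_face_rule("Lato", "Lato-Regular", glyphs)
print(css)
```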

ToDo: Integrate font subsetting into the build process.
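
One possible starting point for that ToDo is a small Python script that a build step could call. This is only a sketch under assumptions: it reuses the glyphhanger and pyftsubset invocations shown above, the helper names are hypothetical, and it assumes `npx` and `pyftsubset` are on the PATH and that glyphhanger prints only the range on standard output, as the redirect to glyphs.txt implies.

```python
import subprocess


def glyphhanger_command(url, family):
    """Command that prints the Unicode range a site uses for one font family."""
    return ["npx", "glyphhanger", url, f"--family={family}"]


def pyftsubset_commands(font_path, unicodes_file):
    """Commands that produce the subset font in WOFF and WOFF2 format."""
    base = ["pyftsubset", font_path, f"--unicodes-file={unicodes_file}"]
    return [
        base + ["--flavor=woff", "--with-zopfli"],
        base + ["--flavor=woff2"],
    ]


def build_subsets(url, family, font_path, unicodes_file="glyphs.txt"):
    """Run glyphhanger, save its output, then subset the font in both formats."""
    result = subprocess.run(
        glyphhanger_command(url, family),
        check=True, capture_output=True, text=True,
    )
    with open(unicodes_file, "w") as f:
        f.write(result.stdout)
    for command in pyftsubset_commands(font_path, unicodes_file):
        subprocess.run(command, check=True)


# Dry run: show the subsetting commands without executing anything.
for command in pyftsubset_commands("Lato-Regular.ttf", "glyphs.txt"):
    print(" ".join(command))
```

Calling `build_subsets("http://cracked.com", "Lato", "Lato-Regular.ttf")` would then perform the same steps as the manual commands in this post.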

Top comments (4)

Ahmed Musallam

This is very cool! I remember wanting to do something like this, but did not know the tools you mentioned existed. I ended up using FontForge and deleting glyph ranges by hand through the GUI. If I recall correctly, it worked. I really wish I had documented that!!

Ben Halpern

How does this work with a CDN? You're telling the host what data to send over?

stereobooster

This data should be communicated through the URL, like this:

@import url("http://fonts.googleapis.com/css?family=Lato:300,400,700&subset=latin");

Benjamin Black

Google Web Fonts, for example, lets you choose a subset of the font you intend to use so that the font includes only, e.g., the so-called "Latin-1" characters, and not all of the extended characters.