My posts are usually notes and reference materials for myself, which I publish here with the hope that others might find them useful.
Web fonts are a wonderful thing for design, but the files can be heavy and slow down the user experience. Fortunately, fonts can be subset, and all modern browsers support the unicode-range descriptor that declares which characters a subset font file covers.
A CDN like Google Web Fonts will do a very good job of automatically subsetting fonts, delivering the optimized font files, but sometimes we want or need to host our own font files.
Fortunately, we can produce our own subset fonts. This technique looks at the text of a web site to build a list of the glyphs ("characters") which are actually used by that site, and constructs a subset font file which excludes all the unused glyphs, resulting in a significantly (sometimes massively) reduced font file size.
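To make the idea concrete, here is a minimal sketch of the core step: collecting the code points used in some text and collapsing them into a compact range list. It is only an illustration, not how glyphhanger works internally (that tool inspects rendered pages and can filter by font family), and the function name to_unicode_range is made up for this example.

```python
def to_unicode_range(text: str) -> str:
    """Collapse the code points used in `text` into a 'U+XX-YY' style list."""
    codepoints = sorted({ord(ch) for ch in text})
    if not codepoints:
        return ""

    ranges = []
    start = prev = codepoints[0]
    for cp in codepoints[1:]:
        if cp == prev + 1:
            # Still inside a run of consecutive code points.
            prev = cp
            continue
        ranges.append((start, prev))
        start = prev = cp
    ranges.append((start, prev))

    # Single code points are written as U+XX, runs as U+XX-YY.
    return ",".join(
        f"U+{a:X}" if a == b else f"U+{a:X}-{b:X}" for a, b in ranges
    )


print(to_unicode_range("abc xyz!"))  # prints U+20-21,U+61-63,U+78-7A
```

Any character that never appears in the text never makes it into the list, which is where the size savings come from.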
We'll make use of the fonttools Python package (and dependencies), and the glyphhanger NPM package.
As an example, let's take a look at minimizing the use of Lato by Cracked, if Cracked were to host the font files themselves.
But first, let's install some necessary Python utilities: fonttools, zopfli (for WOFF), and brotli (for WOFF2):
$ pip install fonttools zopfli brotli
The 'Regular' (normal) weight of Lato comes in at 117 KB in TTF format, and 32 KB in WOFF2 format.
First, we need to accumulate a list of all of the glyphs which our site actually uses.
Use the glyphhanger utility to output the Unicode range of the glyphs from a specific font family that a website actually uses. This tool is chosen because its output format is the same as that accepted as input by the pyftsubset utility and by the unicode-range CSS descriptor.
$ npx glyphhanger http://cracked.com --family='Lato' > glyphs.txt
...
$ cat glyphs.txt
U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019
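To sanity-check a range list like this one, it can help to expand it back into individual code points. The sketch below assumes the simple comma-separated U+XX and U+XX-YY forms shown above (it does not handle CSS wildcard ranges such as U+4??), and parse_unicode_range is a hypothetical helper, not part of glyphhanger or fonttools.

```python
from typing import Set


def parse_unicode_range(spec: str) -> Set[int]:
    """Expand a 'U+A,U+26-29' style list into a set of code points."""
    codepoints: Set[int] = set()
    for part in spec.strip().split(","):
        part = part.strip()
        if not part:
            continue
        # Drop the leading "U+" and split an optional "start-end" pair.
        body = part[2:] if part.upper().startswith("U+") else part
        first, _, last = body.partition("-")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        codepoints.update(range(lo, hi + 1))
    return codepoints


GLYPHS = "U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019"
covered = parse_unicode_range(GLYPHS)

print(len(covered), "code points")
# Show the printable characters that the subset will contain.
print("".join(chr(cp) for cp in sorted(covered) if cp > 0x20))
```

Expanding the list this way makes gaps easy to spot. For instance, this particular range skips Z, j, and q, so any later content using those letters would fall back to another font until the subset is regenerated.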
Next, we need to extract the glyphs used by the site from the original font into a subset font.
Use the pyftsubset utility, which comes with the fonttools package, to create the subsets in WOFF and WOFF2 formats:
$ pyftsubset Lato-Regular.ttf --unicodes-file=glyphs.txt --flavor=woff --with-zopfli
$ pyftsubset Lato-Regular.ttf --unicodes-file=glyphs.txt --flavor=woff2
These commands produce Lato-Regular.subset.woff and Lato-Regular.subset.woff2.
When I ran these commands, the resulting Lato-Regular.subset.woff2 was 11 KB, about one-third of the size of the full-set WOFF2 file (and less than 10% of the size of the TTF).
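As a quick check of those proportions, here is the arithmetic using the rounded sizes quoted in this post; the helper name size_ratio is just for illustration.

```python
def size_ratio(subset_kb: float, original_kb: float) -> float:
    """Return the subset size as a fraction of the original size."""
    return subset_kb / original_kb


# Rounded sizes quoted in this post, in KB.
TTF_KB, WOFF2_KB, SUBSET_WOFF2_KB = 117, 32, 11

print(f"vs full WOFF2: {size_ratio(SUBSET_WOFF2_KB, WOFF2_KB):.1%}")  # 34.4%
print(f"vs TTF:        {size_ratio(SUBSET_WOFF2_KB, TTF_KB):.1%}")    # 9.4%
```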
These subset font files can be used in a @font-face rule as usual, with the addition of a unicode-range descriptor indicating which glyphs are included in the font file. The range is the same as that written to glyphs.txt by the glyphhanger utility.
@font-face {
  font-family: 'Lato';
  font-style: normal;
  font-weight: normal;
  src: url('Lato-Regular.subset.woff2') format('woff2'),
       url('Lato-Regular.subset.woff') format('woff');
  unicode-range: U+A,U+20,U+26-29,U+2C-39,U+3F,U+41-59,U+61-69,U+6B-70,U+72-7A,U+A9,U+2019;
}
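Since the unicode-range value has to stay in sync with glyphs.txt, it may be worth generating the rule rather than pasting the range by hand. The following is a rough sketch under that assumption; font_face_rule and its parameters are hypothetical names, and it expects the subset files to share a base name, as they do above.

```python
def font_face_rule(
    family: str,
    basename: str,
    unicode_range: str,
    weight: str = "normal",
    style: str = "normal",
) -> str:
    """Format an @font-face rule for a WOFF2/WOFF subset pair."""
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        f"  font-style: {style};\n"
        f"  font-weight: {weight};\n"
        f"  src: url('{basename}.woff2') format('woff2'),\n"
        f"       url('{basename}.woff') format('woff');\n"
        f"  unicode-range: {unicode_range.strip()};\n"
        "}\n"
    )


# In a real run the range would be read from glyphs.txt, for example:
#   unicode_range = open("glyphs.txt").read()
print(font_face_rule("Lato", "Lato-Regular.subset", "U+20,U+41-5A\n"))
```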
ToDo: Integrate font subsetting into the build process.
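One possible starting point for that integration is a small script that runs the same two pyftsubset commands for each flavor. This is an untested sketch rather than a finished build step: it assumes pyftsubset is on the PATH and that glyphs.txt has already been generated, and the function names are made up for this example.

```python
import subprocess
from typing import List, Sequence


def pyftsubset_command(font: str, unicodes_file: str, flavor: str) -> List[str]:
    """Build the pyftsubset argument list used earlier in this post."""
    cmd = [
        "pyftsubset",
        font,
        f"--unicodes-file={unicodes_file}",
        f"--flavor={flavor}",
    ]
    if flavor == "woff":
        # Same as the manual command: use zopfli for better WOFF compression.
        cmd.append("--with-zopfli")
    return cmd


def build_subsets(
    font: str, unicodes_file: str, flavors: Sequence[str] = ("woff", "woff2")
) -> None:
    """Run pyftsubset once per flavor, stopping on the first failure."""
    for flavor in flavors:
        subprocess.run(pyftsubset_command(font, unicodes_file, flavor), check=True)


# Example call from a build script:
#   build_subsets("Lato-Regular.ttf", "glyphs.txt")
```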
Top comments (4)
This is very cool! I remember wanting to do something like this, but did not know the tools you mentioned existed. I ended up using FontForge and deleting glyph ranges by hand through the gui. If I recall correctly, it worked. I really wish I had documented that!!
How does this work with a CDN? You're telling the host what data to send over?
This data should be communicated through URL, like this:
Google Web Fonts, for example, lets you choose a subset of the font you intend to use so that the font includes only, e.g., the so-called "Latin-1" characters, and not all of the extended characters.