If you have ever built a web page or made content for the web, you probably already know how efficient images are, when used correctly, at conveying information quickly and driving engagement.
As computers and networks got faster, images on the web went from accessory to essential. Today, about 0.9 MB of a typical 1.7 MB web page is made up of images, according to the HTTP Archive. In other words, of all the bytes your browser downloads when accessing a web page, roughly half are probably for images.
In 2017, a study by Google showed that 53% of mobile users leave a website if it takes more than 3 seconds to load. While images are only one of many factors that impact web performance, saving bytes by using the right format and compression is crucial to ensure nobody needlessly waits for images to load.
So, how does image compression work and how do we use it appropriately for the web? That’s what we are going to tackle in this article, from the ground up.
- How Common Digital Images Actually Work
- How are Raster Images Compressed?
- How are Raster Images Loaded in a Browser?
- Main Formats Used on the Web and When to Use Them
Let’s start by making the distinction between raster images and vector images.
Raster images are files that describe a grid of pixels, which are individual dots of color. The grid has a defined number of rows and columns in which every pixel, by default, is black. By changing the color of these pixels, we can represent visual information, and by adding rows or columns, we can represent more information in the same file.
Linux’s original mascot and logo, Tux, next to an intimidatingly bigger version of itself, illustrating how raster images are simply a grid of pixels.
A term widely used for raster images is bitmap, and that is exactly what it is: a map of pixels that are made of bits. The number of bits it takes to describe a pixel depends on how many different colors an image should be able to represent.
If a pixel is 1 bit, it can only hold the value 0 or 1, black or white, which makes for a monochromatic image. If a pixel is 2 bits, it can be 0, 1, 2, or 3, making up to 4 different colors. If a pixel is 4 bits… you get the idea. Choosing the pixel size of an image defines its “color depth,” that is, how many different colors a pixel can represent.
One part of this image has a 24-bit color depth and the other has only 8 bits. A phenomenon called color banding can be observed where there are not enough colors available to represent nuances.
On the web, it is not uncommon to work with images that have a 24-bit color depth and others that are 8 bits (depending on the nature of the image and what the objective is).
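The arithmetic behind color depth can be sketched in a few lines. This is only an illustration: the function names and the 1920x1080 dimensions are assumptions made for the example, and the size computed ignores file headers and any compression.

```python
def colors_for_depth(bits_per_pixel: int) -> int:
    """Number of distinct values a pixel with this many bits can represent."""
    return 2 ** bits_per_pixel


def uncompressed_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the raw pixel grid, ignoring headers and compression."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


for depth in (1, 2, 4, 8, 24):
    print(f"{depth:>2}-bit: {colors_for_depth(depth):,} colors")

# Hypothetical 1920x1080 image at 24-bit versus 8-bit color depth.
print(uncompressed_size_bytes(1920, 1080, 24))  # 6220800
print(uncompressed_size_bytes(1920, 1080, 8))   # 2073600
```

At the same dimensions, the 24-bit grid is three times the size of the 8-bit one before any compression is applied.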
On the other hand, vector images are files that describe geometric shapes using mathematical functions. They are not made of pixels - they tell the computer what it should draw with pixels. That difference is what allows vector images to scale infinitely. When a raster image is stretched, the computer enlarges each pixel that composes the image, but when a vector image is stretched, the computer recalculates what should be drawn using the instructions provided by the vector file.
This vector version of Tux can be scaled indefinitely without becoming pixelated.
Vector images are used on the web in a format called SVG. SVG files are made of human-readable markup that describes shapes. The format is particularly useful for logos, diagrams, and anything that can be described using geometry and needs to be scalable. When used properly, a vector file is often much smaller than a raster image.
Faster computers and cheaper memory brought about high-fidelity raster images. As mentioned above, the more colors an image file represents, the bigger the file will be, as each and every pixel in that image has a set amount of memory ranging from, most commonly, 1 to 24 bits.
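To give an idea of what that markup looks like, here is a minimal sketch that builds a one-shape SVG document as a string. The helper name `circle_svg` and its parameters are hypothetical, chosen for the example; only the `svg` and `circle` elements and their attributes come from the SVG format itself.

```python
def circle_svg(size: int, radius: int, fill: str) -> str:
    """Return a minimal SVG document drawing one centered circle."""
    center = size / 2
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">'
        f'<circle cx="{center}" cy="{center}" r="{radius}" fill="{fill}"/>'
        "</svg>"
    )


markup = circle_svg(size=100, radius=40, fill="black")
print(markup)
print(len(markup.encode("utf-8")), "bytes")
```

The file describes the circle rather than its pixels, so its size stays the same whether the browser draws it at 16 pixels wide or 1,600.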
As it became easier to store and exchange image files, the need for clever ways of reducing their size became critical: “cheaper” memory still wasn’t cheap, and bandwidth was very limited.
There are two main types of compression, lossless and lossy. Most image formats invented around the 1990s rely on one strategy or the other.
Lossless compression, as its name suggests, allows for reducing the size of a raster image without losing detail. One of the many strategies for doing so, called run-length encoding, consists of identifying repeated patterns and expressing them in a more memory-efficient way. It is quicker to say “5 white pixels” than “white pixel, white pixel, white pixel, white pixel, white pixel.” Most lossless image compression formats use more advanced versions of that principle, based on dictionaries that describe and index repeated elements, that can then be referred to throughout the file by their “name.”
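Both ideas can be tried out with the Python standard library. The sketch below is illustrative only: the `rle_encode` and `rle_decode` helpers are hypothetical names, and `zlib` (DEFLATE) is used as a stand-in for dictionary-based schemes in general, not as a description of how any specific image format is laid out on disk.

```python
import zlib
from itertools import groupby


def rle_encode(pixels: list[str]) -> list[tuple[int, str]]:
    """Collapse consecutive identical pixels into (count, value) pairs."""
    return [(len(list(run)), value) for value, run in groupby(pixels)]


def rle_decode(runs: list[tuple[int, str]]) -> list[str]:
    """Expand (count, value) pairs back into the original pixel sequence."""
    return [value for count, value in runs for _ in range(count)]


row = ["white"] * 5 + ["black"] * 2 + ["white"] * 3
encoded = rle_encode(row)
print(encoded)  # [(5, 'white'), (2, 'black'), (3, 'white')]
assert rle_decode(encoded) == row  # lossless: the round trip is exact

# Dictionary-style compression, using DEFLATE from the standard library.
raw = bytes([255, 255, 255, 0, 0, 0]) * 1000  # a repeating pixel pattern
packed = zlib.compress(raw)
assert zlib.decompress(packed) == raw  # again, nothing is lost
print(len(raw), "->", len(packed), "bytes")
```

In both cases the decoded data is identical to the input, which is what makes the compression lossless.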
Lossy compression is more aggressive, as it removes details that the human eye doesn’t easily observe. There are different strategies for achieving this, and they are sometimes combined.
The most common approach consists of “merging” pixels that are close to one another, and are of only a slightly different shade, to build areas of the exact same color. Raising the threshold of what colors are considered “similar” makes the compression more aggressive and the image smaller, but the loss of detail typically becomes visible in the image.
JPEG lossy compression: quality index 90 versus 20.
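The merging idea described above can be sketched on a single row of grayscale values. This is a toy illustration under simplifying assumptions, not how JPEG or any other codec is actually implemented; the function name and the sample values are made up for the example.

```python
def merge_similar(row: list[int], threshold: int) -> list[int]:
    """Flatten runs of near-identical grayscale values to a single value.

    A pixel joins the current run when it differs from the run's first
    pixel by at most `threshold`; otherwise it starts a new run.
    """
    merged: list[int] = []
    reference = None
    for value in row:
        if reference is None or abs(value - reference) > threshold:
            reference = value  # start a new flat area
        merged.append(reference)
    return merged


row = [200, 201, 203, 202, 120, 122, 121, 119]
print(merge_similar(row, threshold=0))  # unchanged
print(merge_similar(row, threshold=4))  # [200, 200, 200, 200, 120, 120, 120, 120]
```

With a threshold of 4 the row becomes two flat areas, which a lossless step such as run-length encoding can then store very compactly. The subtle variations are gone for good, which is what makes the process lossy.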
Additionally, a process called chroma subsampling allows for applying a different compression threshold - using the logic described before - to different channels (sub-groups of pixels split by role) of the same image.
YCbCr channel decomposition of an image: the first channel is luma, the second is blue minus luma, and the third is red minus luma. This decomposition is used in chroma subsampling to apply different compression levels to brightness and colors.
The idea is that, since human vision is less sensitive to color (chroma) than to brightness (luma), it is possible to compress the color channels of an image more aggressively than the brightness channel, without losing visible details.
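As a rough sketch of that idea, the code below converts RGB pixels to YCbCr and then keeps only one chroma sample per 2x2 block while leaving luma untouched. The conversion weights are the full-range ones used by JFIF/JPEG; the helper names, the averaging strategy, and the tiny 2x2 image are assumptions made for the example.

```python
def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple[float, float, float]:
    """Convert one RGB pixel to YCbCr using the JFIF (full-range) weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(channel: list[list[float]]) -> list[list[float]]:
    """Average each 2x2 block into one sample (assumes even width and height)."""
    return [
        [
            (channel[y][x] + channel[y][x + 1]
             + channel[y + 1][x] + channel[y + 1][x + 1]) / 4
            for x in range(0, len(channel[0]), 2)
        ]
        for y in range(0, len(channel), 2)
    ]


# A hypothetical 2x2 image: four RGB pixels.
image = [[(255, 0, 0), (250, 10, 5)],
         [(0, 0, 255), (5, 10, 250)]]

ycbcr = [[rgb_to_ycbcr(*px) for px in row] for row in image]
luma = [[px[0] for px in row] for row in ycbcr]  # kept at full resolution
cb_small = subsample_2x2([[px[1] for px in row] for row in ycbcr])
cr_small = subsample_2x2([[px[2] for px in row] for row in ycbcr])

samples_before = 3 * 4
samples_after = 4 + len(cb_small) * len(cb_small[0]) + len(cr_small) * len(cr_small[0])
print(samples_before, "->", samples_after)  # 12 -> 6
```

Halving the chroma resolution in both directions cuts the number of stored samples in half before any other compression step runs, while every pixel keeps its own brightness value.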
Downgrading an image to a lower color depth can also be considered lossy compression, as details are lost when reducing the number of different colors each pixel can represent. Most image formats allow you to choose a lower color depth for a specific image, which works well for grayscale images.
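Here is a minimal sketch of what reducing the depth of a single 8-bit channel does to its values. The function name and the choice of 2 bits are assumptions for the example; real encoders usually pick a palette or apply dithering instead of rounding naively like this.

```python
def reduce_depth(value: int, bits: int) -> int:
    """Requantize an 8-bit channel value (0-255) to `bits` bits, then scale back."""
    levels = (1 << bits) - 1  # highest value at the reduced depth
    quantized = round(value * levels / 255)
    return round(quantized * 255 / levels)


gradient = list(range(0, 256, 16))  # a smooth 16-step ramp
print([reduce_depth(v, 2) for v in gradient])
print(len({reduce_depth(v, 2) for v in range(256)}))  # 4 distinct levels
```

The smooth ramp collapses into a handful of flat steps, which is the color banding effect shown earlier.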
Choosing between lossy and lossless compression is not always obvious and depends on what the expected use of the image is. This is especially true for the web, where visual impact and saving bytes are of equal importance.
In addition to compression, some image formats offer interlacing and progressivity, which are loading strategies that allow for displaying a low quality version of an image before it is entirely loaded. Without interlacing or progressivity, the browser displays the image as it loads, from top to bottom, with most of the image obscured from view. Using these strategies, it is possible to show ‘something’ to the user sooner.
Whether to use interlacing or progressivity should be considered carefully, for two main reasons. Firstly, depending on their implementations, these techniques tend to make files bigger and/or increase overall loading time. Secondly, with regard to content, it sometimes does not make sense to show a blurry version of an image to the user while it loads.
See example of images loading at ~5:00.
Now that we have reviewed the most common mechanisms available to store and compress images, let’s have a look at the image formats that are available for the web, how they work, and in which cases it makes sense to use them.
Besides being the primary provider of “lolcats”, GIF is also one of the earliest image formats (1987) designed for sending images over a network.
It is lossless but only supports 8 bits of color depth, for a total of 256 different colors. GIF supports custom palettes, which eases this limitation a bit: the palette can be filled with the 256 colors that are actually used in the image.
GIF also supports transparency, but since it can only say if a pixel is fully transparent or not, it is difficult to retain proper anti-aliasing (which is a way of smoothing edges by using semi-transparent pixels).
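The custom palette idea can be sketched as follows: collect the colors an image actually uses, then store each pixel as a small index into that list. This is an illustrative sketch only; `to_indexed` is a hypothetical helper, and it does not reproduce the GIF file layout.

```python
def to_indexed(pixels: list[tuple[int, int, int]]):
    """Build a palette of the colors actually used and map each pixel to an index."""
    palette: list[tuple[int, int, int]] = []
    lookup: dict[tuple[int, int, int], int] = {}
    indices: list[int] = []
    for color in pixels:
        if color not in lookup:
            if len(palette) == 256:
                raise ValueError("more than 256 distinct colors; quantize first")
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, indices


pixels = [(255, 255, 255), (0, 0, 0), (255, 255, 255), (230, 120, 20)]
palette, indices = to_indexed(pixels)
print(palette)  # [(255, 255, 255), (0, 0, 0), (230, 120, 20)]
print(indices)  # [0, 1, 0, 2]
```

Each pixel now costs one 8-bit index instead of three 8-bit channel values, at the price of being limited to 256 palette entries.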
GIF supports interlacing, although it tends to make files slightly bigger.
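To make interlacing more concrete, the sketch below lists the order in which rows are stored in an interlaced GIF, which uses four passes over the image. The function name is hypothetical and the 16-row height is just an example.

```python
def gif_interlaced_row_order(height: int) -> list[int]:
    """Row indices in the order a four-pass interlaced GIF stores them."""
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]  # (first row, step) per pass
    order: list[int] = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


print(gif_interlaced_row_order(16))
# [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
```

After the first pass, only one row in eight has arrived, but those rows are spread over the whole height of the image, so the browser can already draw a coarse preview.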
Finally, and obviously, GIF supports animation.
Nowadays, it rarely makes sense to use GIF for raster image animation: lossy video compression formats are much more efficient and lighter, support much higher color depths as well as sound, and have been really easy to use on the web since the HTML5 <video> tag was released.
However, the HTML5 <video> tag does not support transparency. In that scenario, using a GIF could “do the trick”, with all the limitations mentioned above.
I, too, have no idea how it should be pronounced.
JPEG is a popular way to compress photographs with no transparency. It relies on lossy compression to bring down the size, which can be adjusted using a “quality” index that goes from 0 to 100. JPEG supports chroma subsampling, which is often automatically activated by image manipulation programs when using a quality index below 50.
As a rule of thumb, a quality index of 80 to 90 works well when compressing for the web. In practice, fine-tuning the quality and playing with chroma subsampling can yield very interesting results.
Finally, JPEG supports progressivity. JPEG’s progressivity doesn’t make for bigger files but tends to make decoding a bit longer.
PNG is lossless and often used on the web for its support of alpha transparency (the extra information on each pixel that determines how transparent it is). Contrary to GIF and its 256-color palette, PNG supports both transparency and high color depths, which allows for much more efficient anti-aliasing.
PNG is commonly used for logos, icons, diagrams and cropped photographs, although its lossless nature can make for massive file sizes, especially with photographs.
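The difference alpha transparency makes for anti-aliasing can be sketched with the usual "source over" blending formula. The `blend` helper below is a hypothetical name, and it assumes an 8-bit alpha value composited over a fully opaque background.

```python
def blend(foreground: tuple[int, int, int], alpha: int,
          background: tuple[int, int, int]) -> tuple[int, ...]:
    """Composite a foreground color over an opaque background.

    `alpha` goes from 0 (fully transparent) to 255 (fully opaque).
    """
    a = alpha / 255
    return tuple(round(f * a + b * (1 - a)) for f, b in zip(foreground, background))


black, white = (0, 0, 0), (255, 255, 255)
print(blend(black, 255, white))  # (0, 0, 0)
print(blend(black, 0, white))    # (255, 255, 255)
print(blend(black, 128, white))  # an in-between gray for a smoothed edge
```

With only the first two cases available, as in GIF, an edge pixel is either the shape's color or the background's. The third case is what lets a PNG edge fade smoothly into whatever sits behind it.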
There is a case to be made for overuse of PNG for logos and icons, where a vector image format such as SVG would outperform it in terms of file size and scalability.
It is worth noting that PNG also supports animation. In 2008, Mozilla created the APNG format, and while support for this format is somewhat scant, it offers an alternative to GIF with support for high color depths and alpha transparency.
SVG is the only vector image format available for the web. It is commonly used for logos, icons, diagrams (where it outperforms PNG) and vector animation.
Not only have all browsers since Internet Explorer 9 supported SVG, but they are also able to interact with it: it is possible to use CSS to manipulate SVG elements, allowing for dynamic styling and animation.
Last but not least, let’s talk about WebP - an image format developed by Google. It has been around for almost 10 years, and has interesting properties, as it can be either lossy or lossless, and supports alpha transparency and even animation.
Being able to do many different things with a unique format is WebP’s main strength. For example, its ability to support lossy compression and alpha transparency at the same time differentiates WebP, giving it an edge over PNG for use on transparent photographs.
WebP looks perfect on paper (and Google Lighthouse will raise a warning regarding image compression if a webpage’s images are bigger than what they would be in WebP), but there are still a few issues with using it.
The first issue is that, while Google touts WebP as more performant than JPEG, this does not always appear to be the case in reality, as Mozilla Research demonstrated back in 2014. But even if it were true, there is still a difference in terms of image quality as, by default, WebP relies on chroma subsampling, which you might not want on your image in some cases.
The second issue is that WebP’s support is still lacking: both Safari desktop and Safari iOS don’t support it.
Both of these problems can be mitigated and WebP’s properties can be used when it makes sense, but as with every other image format available for the web, it cannot replace all the other ones (yet?).
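One way to mitigate the support issue is to decide on the server which format to send, based on the `Accept` header of the request. The sketch below is a simplified illustration: `pick_image_format` is a hypothetical helper, the header values are examples, and quality values (`q=`) are ignored for brevity. Another common option is the HTML `<picture>` element, which lets the browser pick among several sources.

```python
def pick_image_format(accept_header: str, fallback: str = "jpeg") -> str:
    """Return "webp" when the client advertises support for it, else the fallback."""
    media_types = [part.split(";")[0].strip().lower()
                   for part in accept_header.split(",")]
    return "webp" if "image/webp" in media_types else fallback


print(pick_image_format("image/webp,image/apng,image/*,*/*;q=0.8"))  # webp
print(pick_image_format("image/png,image/*;q=0.8,*/*;q=0.5"))        # jpeg
```

Browsers that do not list `image/webp` simply receive the fallback format, so nobody ends up with a broken image.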
Knowing how image formats and compression work informs the decision on which strategy to use for images on the web, but it is not always an easy decision to make.
There is no magic formula: having breathtaking, high-quality images on your web page won’t compensate for the fact that it takes too long to load, and vice versa.
Regardless of your role - developer, designer, project manager or even content editor - having this information in mind can help you make better decisions. It can also help during the design phase of a page - for example: does the impact of this full-blown photograph outweigh the fact that it will likely end up as a 2 or 3 megabyte PNG file?
Experience, dialogue and empathy help. It is all about actually caring, and finding the right compromise for the job.