"Data compression - the process of encoding information using fewer bits than the original representation." That's the definition from Wikipedia. But when we talk about textures (images that we use while rendering 3D graphics), it's not that simple. There are 4 different things we can mean by talking about texture compression, some of them you may not know. In this article, I'd like to give you some basic information about them.
1. Lossless data compression. That's the compression used to shrink binary data in size losing no single bit. We may talk about compression algorithms and libraries that implement them, like popular zlib or LZMA SDK. We may also mean file formats like ZIP or 7Z, which use these algorithms, but also define a way to pack multiple files with their whole directory structure into a single archive file.
Important thing to note here is that we can use this compression for any data. Some file types like text documents or binary executables have to be compressed in a lossless way so that no bits are lost or altered. You can also compress image files this way. Compression ratio depends on the data. The size of the compressed file will be smaller if there are many repeating patterns - the data look pretty boring, like many pixels with the same color. If the data is more varying, each next pixel has even slightly different value, then you may end up with a compressed file as large as original one or even larger. For example, following two images have size 480 x 480. When saved as uncompressed BMP R8G8B8 file, they both take 691,322 bytes. When compressed to a ZIP file, the first one is only 15,993, while the second one is 552,782 bytes.
We can talk about this compression in the context of textures because assets in games are often packed into archives in some custom format which protects the data from modification, speeds up loading, and may also use compression. For example, the new Call of Duty Warzone takes 162 GB of disk space after installation, but it has only 442 files because developers packed the largest data in some archives in files Data/data/data.000, 001 etc., 1 GB each.
2. Lossy compression. These are the algorithms that allow some data loss, but offer higher compression ratios than lossless ones. We use them for specific kinds of data, usually some media - images, sound, and video. For video it's virtually essential, because raw uncompressed data would take enormous space for each second of recording. Algorithms for lossy compression use the knowledge about the structure of the data to remove the information that will be unnoticeable or degrade quality to the lowest degree possible, from the perspective of human perception. We all know them - these are formats like JPEG for images and MP3 for music.
They have their pros and cons. JPEG compresses images in 8x8 blocks using Discrete Fourier Transform (DCT). You can find awesome, in-depth explanation of it on page: Unraveling the JPEG. It's good for natural images, but with text and diagrams it may fail to maintain desired quality. My first example saved as JPEG with Quality = 20% (this is very low, I usually use 90%) takes only 24,753 B, but it looks like this:
GIF is good for such synthetic images, but fails on natural images. I saved my second example as GIF with a color palette of 32 entries. The file is only 90,686 B, but it looks like this (look closer to see dithering used due to a limited number of colors):
Lossy compression is usually accompanied by lossless compression - file formats like JPEG, GIF, MP3, MP4 etc. compress the data losslessly on top of its core algorithm, so there is no point in compressing them again.
3. GPU texture compression. Here comes the interesting part. All formats described so far are designed to optimize data storage and transfer. We need to decompress all the textures packed in ZIP files or saved as JPEG before uploading them to video memory and using for rendering. But there are other types of texture compression formats that can be used by the GPU directly. They are lossy as well, but they work in a different way - they use a fixed number of bytes per block of NxN pixels. Thanks to this, a graphics card can easily pick right block from the memory and uncompress it on the fly, e.g. while sampling the texture. Some of such formats are BC1..7 (which stands for Block Compression) or ASTC (used on mobile platforms). For example, BC7 uses 1 byte per pixel, or 16 bytes per 4x4 block. You can find some overview of these formats here: Understanding BCn Texture Compression Formats.
The only file format I know which supports this compression is DDS, as it allows to store any texture that can be loaded straight to DirectX in various pixel formats, including not only block compressed but also cube, 3D, etc. Most game developers design their own file formats for this purpose anyway, to load them straight into GPU memory with no conversion.
4. Internal GPU texture compression. Pixels of a texture may not be stored in video memory the way you think - row-major order, one pixel after the other, R8G8B8A8 or whatever format you chose. When you create a texture with
VK_IMAGE_TILING_OPTIMAL (always do that, except for some very special cases), the GPU is free to use some optimized internal format. This may not be true "compression" by its definition, because it must be lossless, so the memory reserved for the texture will not be smaller. It may even be larger because of the requirement to store additional metadata. (That's why you have to take care of extra
VK_IMAGE_ASPECT_METADATA_BIT when working with sparse textures in Vulkan.) The goal of these formats is to speed up access to the texture.
Details of these formats are specific to GPU vendors and may or may not be public. Some ideas of how a GPU could optimize a texture in its memory include:
- Swizzle order of pixels in Morton order or some other way to improve locality of reference and cache hit rate when accessing spatially neighboring pixels.
- Store metadata telling that a block of pixels or entire texture is cleared to a specific color, so that the clear operation is fast because it need not write all the pixels.
- For depth texture: store minimum and/or maximum depth per block of MxN pixels so that whole group of rendered pixels can be tested and rejected early without testing each individual pixel. This is commonly known as Hi-Z.
- For MSAA texture: store bit mask per pixel telling how many different colors are in its samples, so that not all the samples need to be necessarily read or written to memory.
How to make best use of those internal GPU compression formats if they differ per graphics card vendor and we don't know their details? Just make sure you leave the driver as much optimization opportunities as possible by:
- always using
- not using flags
VK_SHARING_MODE_CONCURRENTfor any textures that don't need them,
- not using formats
VK_IMAGE_CREATE_MUTABLE_FORMAT_BITfor any textures that don't need them,
- issuing minimum necessary number of barriers, always to the state optimal for intended usage and never to
See also article Delta Color Compression Overview at GPUOpen.com.
Summary: As you can see, the term "texture compression" can mean different things, so when talking about anything like this, always make sure to be clear what do you mean unless it's obvious from the context.