Color Conversion Is Not A Linear Problem

Those who work with digital video are probably familiar with luminance/chrominance-based color models. One of the simplest models of this type is YCoCg. Conversion between YCoCg and RGB is performed as follows:

\[\begin{bmatrix} Y \\ C_o \\ C_g \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 0 & -\frac{1}{2} \\ -\frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix} \cdot \begin{bmatrix} R \\ G \\ B \end{bmatrix}\]

\[\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 0 & 1 \\ 1 & -1 & -1 \end{bmatrix} \cdot \begin{bmatrix} Y \\ C_o \\ C_g \end{bmatrix}\]

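The first equation tells us that if R, G, and B are in the [0,1] range, Y will also be in the [0,1] range, while Co and Cg will be in the [-0.5,0.5] range: the Y weights are non-negative and sum to 1, whereas the positive weights in the Co and Cg rows sum to 1/2 and the negative ones to -1/2.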
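As a quick check of the two matrices, here is a minimal Python sketch of the forward and inverse transforms. The function names `rgb_to_ycocg` and `ycocg_to_rgb` are illustrative, and the components are assumed to be floats with R, G, and B in [0,1].

```python
def rgb_to_ycocg(r, g, b):
    """Forward transform: rows of the first matrix applied to (R, G, B)."""
    y = 0.25 * r + 0.5 * g + 0.25 * b
    co = 0.5 * r - 0.5 * b
    cg = -0.25 * r + 0.5 * g - 0.25 * b
    return y, co, cg


def ycocg_to_rgb(y, co, cg):
    """Inverse transform: rows of the second matrix applied to (Y, Co, Cg)."""
    r = y + co - cg
    g = y + cg
    b = y - co - cg
    return r, g, b


# Round trip on one pixel.
ycc = rgb_to_ycocg(1.0, 0.5, 0.25)
print(ycc)                 # (0.5625, 0.375, -0.0625)
print(ycocg_to_rgb(*ycc))  # (1.0, 0.5, 0.25)

# Chroma extremes are reached by saturated colors.
print(rgb_to_ycocg(1.0, 0.0, 0.0)[1])  # Co of pure red: 0.5
print(rgb_to_ycocg(1.0, 0.0, 1.0)[2])  # Cg of magenta: -0.5
```

The sample values are exactly representable in binary floating point, so the round trip here is exact. In general a small rounding error should be expected.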

Usually, the chrominance components have lower perceptual importance than luminance, so they are downsampled (using a single value for each 2x2 block of pixels, for example) to reduce size. This downsampling works fine in most situations, but the loss can sometimes become quite visible and distracting.
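For illustration, one common way to obtain a single value per 2x2 block is a plain average (a box filter). The sketch below assumes that choice, a chroma plane stored as a list of rows, and even width and height. The name `downsample_2x2` is hypothetical.

```python
def downsample_2x2(plane):
    """Average each 2x2 block of a 2D list; assumes even width and height."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# A 2x4 Co plane: one reddish pixel among neutral ones, then a bluish block.
co_plane = [
    [0.5, 0.0, -0.5, -0.5],
    [0.0, 0.0, -0.5, -0.5],
]
print(downsample_2x2(co_plane))  # [[0.125, -0.5]]
```

The first block shows the loss: a single strongly colored pixel is diluted into a weak tint shared by all four pixels.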

One issue is that this downsampling is usually implemented without taking into account the final loss that occurs when converting the image back to RGB.

In fact, after upsampling the chrominance components and combining them with luminance, the resulting RGB values frequently fall outside [0,1] and need to be clamped back into the valid range, causing saturation: the color conversion thus becomes non-linear.
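A small worked example makes the clamping visible. The sketch below assumes a 2x2 block with one pure red pixel and three black ones, chroma averaged over the block, and nearest-neighbor upsampling (every pixel reuses the block's chroma). These are illustrative choices, not a description of any particular codec.

```python
def rgb_to_ycocg(r, g, b):
    return (0.25 * r + 0.5 * g + 0.25 * b,
            0.5 * r - 0.5 * b,
            -0.25 * r + 0.5 * g - 0.25 * b)


def ycocg_to_rgb(y, co, cg):
    return y + co - cg, y + cg, y - co - cg


# One red pixel followed by three black pixels.
block = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
ycc = [rgb_to_ycocg(*p) for p in block]

# Chroma is averaged over the block; luma stays per pixel.
co_avg = sum(c[1] for c in ycc) / len(ycc)
cg_avg = sum(c[2] for c in ycc) / len(ycc)

# Decode every pixel with its own luma and the shared chroma.
raw = [ycocg_to_rgb(y, co_avg, cg_avg) for y, _, _ in ycc]
clamped = [tuple(min(1.0, max(0.0, c)) for c in p) for p in raw]

print(raw[1])      # (0.1875, -0.0625, -0.0625): G and B are below 0
print(clamped[1])  # (0.1875, 0.0, 0.0): after clamping
print(clamped[0])  # (0.4375, 0.1875, 0.1875): the red pixel, washed out
```

The black pixels decode to negative G and B, which the clamp discards. This is the step where the pipeline stops being linear.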

Taking advantage of this, we found that we could downsample chrominance in a way that takes into account the entire color conversion process, leading to improvements in final image quality.

There are many approaches to doing this. One simple method is to identify when the YCoCg to RGB conversion of a block triggers saturation. When it does, there will be several YCoCg values that produce the same saturated RGB pixel. We can then choose the one that brings the other pixels in the block closest to their original RGB values (before performing chroma downsampling).
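The idea can be sketched with a deliberately naive stand-in: instead of detecting saturation explicitly, the code below brute-forces a grid of shared (Co, Cg) values for one block and keeps the pair with the smallest squared RGB error after decoding and clamping. This is not the method described above, only an illustration of optimizing through the clamp. It assumes luma is left untouched, nearest-neighbor chroma upsampling, and a squared-error metric. All function names and the `steps` parameter are hypothetical.

```python
def rgb_to_ycocg(r, g, b):
    return (0.25 * r + 0.5 * g + 0.25 * b,
            0.5 * r - 0.5 * b,
            -0.25 * r + 0.5 * g - 0.25 * b)


def ycocg_to_rgb(y, co, cg):
    return y + co - cg, y + cg, y - co - cg


def decode_block(lumas, co, cg):
    """Decode each pixel with its own luma and shared chroma, then clamp."""
    return [tuple(min(1.0, max(0.0, c)) for c in ycocg_to_rgb(y, co, cg))
            for y in lumas]


def block_error(decoded, original):
    """Sum of squared RGB differences over the block."""
    return sum((d - o) ** 2
               for dp, op in zip(decoded, original)
               for d, o in zip(dp, op))


def average_chroma(block):
    """Baseline: chroma averaged independently of the decode step."""
    ycc = [rgb_to_ycocg(*p) for p in block]
    return (sum(c[1] for c in ycc) / len(ycc),
            sum(c[2] for c in ycc) / len(ycc))


def optimized_chroma(block, steps=64):
    """Grid search over shared chroma, starting from the plain average."""
    lumas = [rgb_to_ycocg(*p)[0] for p in block]
    best = average_chroma(block)
    best_err = block_error(decode_block(lumas, *best), block)
    for i in range(steps + 1):
        co = -0.5 + i / steps
        for j in range(steps + 1):
            cg = -0.5 + j / steps
            err = block_error(decode_block(lumas, co, cg), block)
            if err < best_err:
                best, best_err = (co, cg), err
    return best, best_err


block = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
lumas = [rgb_to_ycocg(*p)[0] for p in block]
avg_err = block_error(decode_block(lumas, *average_chroma(block)), block)
best, best_err = optimized_chroma(block)
print(avg_err)   # 0.4921875
print(best_err)  # smaller than avg_err for this block
```

Because the search starts from the plain average, the result is never worse than the baseline under this metric. The gain comes from chroma values that push already-saturated channels further out of range, which costs nothing after clamping, while moving the unsaturated pixels closer to their originals.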

Our experiments showed a clearly visible difference in some extreme cases, with significant improvements to the decoded image quality without any changes to the decoder.


Figure panels, in order: original image before chroma downsampling; same image after independent downsampling and RGB conversion; same image after optimized downsampling and RGB conversion.