Optimising JPEG compression by identifying image characteristics

JPEG compression lets the user trade image file size against image quality. However, it can be hard to find a good balance between the two. For my graduation project at Voormedia I researched how to compress JPEG images based on their characteristics, aiming to find this balance.

We wanted to find a close to optimal compression rate for any given image. Imagine a website containing several images. If we do not compress the images the website will most likely be very slow. However, if we compress the images too much then the images will look bad.

We will present an approach to balancing the trade-off between size and quality: identify certain characteristics of an image (e.g., luminance, colors, frequencies) and use them as metrics to estimate how much to compress before the compression takes place.

Understanding JPEG

JPEG images use lossy compression. This means that when we compress the image we throw away some of its information, which usually results in a smaller image of lower overall quality. The problem is finding the right compression rate. If we simplify the JPEG algorithm a little, we can say that it uses four basic steps:

Color conversion.

This step changes the representation of colors in the image.

Color conversion from RGB to YCrCb
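
As a rough illustration of this step, the sketch below converts a single RGB pixel into one luminance value (Y) and two chrominance values (Cb and Cr), using the full-range constants from the JFIF specification. The function name is only chosen for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF constants)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component to the valid 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))
```

A neutral grey pixel ends up with both chrominance values at the midpoint 128, so all of its information sits in the Y channel.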

Subsampling.

This step exploits one of the human eye’s weaknesses, namely that it is less sensitive to the chrominance (color) information in an image than it is to the luminance (perceived brightness). Some of the chrominance information of the image can therefore be removed.
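
A minimal sketch of what removing chrominance information can look like, assuming the common 4:2:0 scheme in which every 2 x 2 group of chrominance samples is replaced by one value. Averaging is only one of several possible ways to pick that value, and the function name is hypothetical.

```python
def subsample_420(channel):
    """Average each 2x2 group of a chroma channel given as a list of rows.

    Assumes even width and height; a trailing odd row or column is ignored.
    """
    height, width = len(channel), len(channel[0])
    out = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            total = (channel[y][x] + channel[y][x + 1]
                     + channel[y + 1][x] + channel[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out
```

The result holds a quarter of the original chrominance samples, while the luminance channel is left at full resolution.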

Block-processing with Discrete Cosine Transformation.

During this step, the image is divided into blocks of 8 x 8 pixels. Each block is then transformed into frequency space by the Discrete Cosine Transformation. This means that instead of expressing the block as pixel values, we express it as a sum of waves of different frequencies. This allows us to separate low and high frequencies, so that we can then apply quantisation, which reduces or removes the high frequencies that the human eye is least sensitive to.

Image "A" undergoing block-processing with discrete cosine transformation
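
The transformation and quantisation described above can be sketched in plain Python as follows. This is a deliberately slow, direct version of the type-II DCT, and the single `step` value is a simplifying assumption: a real JPEG encoder uses a table with a separate step for each of the 64 frequencies.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Direct 2D type-II DCT of an 8x8 block (list of 8 rows of 8 numbers)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantise(coeffs, step):
    """Divide every coefficient by one uniform step and round (assumption)."""
    return [[round(value / step) for value in row] for row in coeffs]
```

For a completely flat block only the top-left coefficient (the average brightness) is non-zero; every other coefficient rounds to zero, which is what makes the block cheap to store.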


Variable length encoding.

The fourth and last step reorders the data that is left in the blocks of the image and encodes it compactly for storage.
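
As an illustration of the reordering, baseline JPEG reads the coefficients of each block in a zigzag pattern, from low to high frequencies, so that the zeros left by quantisation end up next to each other. The sketch below shows only that reordering and not the encoding that follows it; the function name is chosen for this example.

```python
def zigzag(block):
    """Return the values of a square block in zigzag order."""
    n = len(block)
    order = sorted(
        ((row, col) for row in range(n) for col in range(n)),
        # Walk the anti-diagonals one by one, alternating direction.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[row][col] for row, col in order]
```

On a small 3 x 3 block numbered 1 to 9 row by row, this yields 1, 2, 4, 7, 5, 3, 6, 8, 9.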

Problem

The problem is that there is no good way of knowing how much we should compress an image to retain good perceived quality. Using the same compression settings on two different images can yield different results. An example of this can be seen below. Both these images are compressed using the same compression rate.

Einstein original
Einstein compressed
Gradient original
Gradient compressed

Image characteristics

Since an image has a practically unlimited number of characteristics that could be measured, we discarded some to leave room for the ones that seemed important. The criteria for choosing a characteristic were the following:

The characteristic should be...

  • Able to be measured and turned into a metric.
  • Mentioned in other scientific papers, unless it is obviously related to compression and perceived image quality. A newly discovered characteristic that clearly affects the compression can be added as well.
  • Measurable on the original image.
  • Related to how the image compresses.

Based on these criteria we had a discussion to decide which characteristics to aim for. Ultimately, we decided to focus on the following characteristics:

Color

Colors are one of the criteria used by human eyes to identify objects around them.

Image containing many colors

Grey-scale Components

Grey-scale images are challenging when using JPEG compression. The human eye is a lot more sensitive to brightness variations than to hue variations. This makes it harder to compress brightness data without visible degradation.

Image containing grey-scale components

Luminance

Human eyes are very sensitive to luminance. What separates luminance from grey-scale components is that luminance can be measured in both color and grey-scale images.

Image containing a low amount of luminance
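
One way to turn luminance into a single number is to average the luma of all pixels. The sketch below assumes the same weights that JPEG uses for its Y channel; how the measurement was implemented in the research itself is not described here, so treat this as an illustration only.

```python
def mean_luminance(pixels):
    """Mean luma (0-255) of an iterable of (r, g, b) tuples."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        total += 0.299 * r + 0.587 * g + 0.114 * b
        count += 1
    # An empty image has no meaningful luminance; return 0.0 by convention.
    return total / count if count else 0.0
```

Because it works on RGB values, the same function applies to color and grey-scale images alike.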

Edges

One important characteristic to look at when analysing an image is the number of edges. JPEG has a hard time with sharp edges because it processes the image in 8 x 8 pixel blocks.

Image containing several sharp edges
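
A very simple, hypothetical edge metric is the fraction of pixels that differ strongly from their right or lower neighbour. The threshold of 64 and the function name are assumptions made for this sketch; a proper edge detector would be more robust.

```python
def edge_fraction(gray, threshold=64):
    """Fraction of pixels with a sharp jump to the right or lower neighbour.

    `gray` is a list of rows of 0-255 brightness values.
    """
    height, width = len(gray), len(gray[0])
    edges = 0
    for y in range(height):
        for x in range(width):
            right = x + 1 < width and abs(gray[y][x + 1] - gray[y][x]) > threshold
            below = y + 1 < height and abs(gray[y + 1][x] - gray[y][x]) > threshold
            if right or below:
                edges += 1
    return edges / (height * width)
```

An image that is half black and half white, split vertically, scores low because only the pixels along the boundary count as edges.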

Frequencies

The human eye is a lot more sensitive to medium frequencies than to low and high frequencies. This is something that we can exploit when compressing our images.

Image containing a high percentage of high frequencies
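
To make "percentage of high frequencies" concrete, the sketch below splits the coefficients of one transformed 8 x 8 block into low, middle and high bands and reports each band's share. The band boundaries (`low_max`, `mid_max`) and the use of absolute coefficient values are assumptions for illustration; the exact definition used in the research is not given in this post.

```python
def frequency_shares(coeffs, low_max=2, mid_max=6):
    """Return (low, mid, high) percentages for one block of DCT coefficients.

    A coefficient at position (u, v) is assigned to a band by u + v.
    """
    bands = [0.0, 0.0, 0.0]
    for u, row in enumerate(coeffs):
        for v, value in enumerate(row):
            if u == 0 and v == 0:
                continue  # skip the DC term, which is the block average
            index = u + v
            band = 0 if index <= low_max else 1 if index <= mid_max else 2
            bands[band] += abs(value)
    total = sum(bands)
    if total == 0:
        return (0.0, 0.0, 0.0)  # flat block: no frequency content at all
    return tuple(100 * b / total for b in bands)
```

Averaging these shares over all blocks would give one set of percentages per image.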

Experiments

Once the characteristics were decided, we conducted some experiments. The goal of the experiments was to find out how the different characteristics affect how far different images can be compressed.

Experiment 1: How much compression is needed for different image types

The goal of the first experiment was to find out how much actual users would compress an image. Several different categories of images were selected: photos, art, clip art, modern art, large images, screenshots, diagrams, images with text, etc.


This resulted in approximately 60 images covering these categories, which ensured that as many different types of images as possible were included in the experiment.

The experiment had to be somewhat compact to ensure that the test subjects did not get tired. Six test subjects took part in the experiment. More subjects were not necessary since we were looking for a range and not for exact numbers. All subjects were told to compress every image as much as possible without making it look bad. The subject would see the original image on the left and the compressed image on the right.

The results of the first experiment were more spread out than expected. All test subjects used the same monitor and the same lighting, sat at the same height and had the same instructions. Even so, the range of acceptable compression could sometimes vary between 40 and 80, which is a large difference when compressing JPEG images. The most likely reason is that the subjects focused on different elements in the image.

This was still very beneficial to the project since it was now possible to know which “range” to aim for when trying to create an algorithm for finding the almost optimal compression rate. Another important element gathered from the experiment was that it was now possible to sort images based on how much they could be compressed.

Compression rate for the first 10 images

File     Subject 1  Subject 2  Subject 3  Subject 4
1.jpg    80         75         45         65
2.jpg    75         60         65         45
3.png    85         50         80         80
4.jpg    60         30         50         45
5.jpg    50         25         35         45
6.jpg    55         40         35         55
7.jpg    35         20         15         30
8.jpg    65         35         40         50
9.jpg    55         55         70         40
10.jpg   55         30         80         50
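
Using only the values from the table above, a short sketch shows how the mean rate and the spread per image can be computed, and how the images can be sorted by how much they could be compressed. The helper name `summarise` is chosen for this example.

```python
# Values copied from the table above (four subjects per image).
ratings = {
    "1.jpg": [80, 75, 45, 65],
    "2.jpg": [75, 60, 65, 45],
    "3.png": [85, 50, 80, 80],
    "4.jpg": [60, 30, 50, 45],
    "5.jpg": [50, 25, 35, 45],
    "6.jpg": [55, 40, 35, 55],
    "7.jpg": [35, 20, 15, 30],
    "8.jpg": [65, 35, 40, 50],
    "9.jpg": [55, 55, 70, 40],
    "10.jpg": [55, 30, 80, 50],
}


def summarise(values):
    """Return (mean, spread), where spread is the maximum minus the minimum."""
    return sum(values) / len(values), max(values) - min(values)


summary = {name: summarise(values) for name, values in ratings.items()}

# Images ordered from the lowest to the highest mean rate.
ordered = sorted(summary, key=lambda name: summary[name][0])

for name in ordered:
    mean, spread = summary[name]
    print(f"{name:7s} mean={mean:6.2f} spread={spread}")
```

For these ten images, 7.jpg has the lowest mean rate and 3.png the highest, while 10.jpg shows the widest disagreement between subjects.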

Experiment 2: Measuring characteristics of images

The second experiment was created to measure the initial characteristics of every image in the set. The experiment was automated so we would not have to go through each image manually. All the selected characteristics were measured on the original image and the results were added to an Excel file for easy viewing. This makes it possible to quickly look up the initial characteristics of any image in the set and to cross-reference them with the results of the first experiment.

By combining the results from both experiments it becomes possible to see the correlation between each characteristic and the average compression rate over all images.

Correlation between compression rate and characteristics (1 being the highest and 0 being the lowest)

Characteristic                   Correlation
Percentage of high frequency     0.54
Percentage of middle frequency   0.47
Percentage of low frequency      0.13
Percentage of color red          0.14
Percentage of color green        0.14
Percentage of color blue         0.18
Percentage of color grey         0.01
Percentage of color white        0.004
Percentage of color black        0.26
Mean luminance                   0.18
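
The post does not state which correlation measure was used. As an assumption, the sketch below computes the Pearson correlation coefficient between one characteristic and the average compression rate; taking its absolute value gives a number between 0 and 1 like the ones in the table.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant series has no defined correlation
    return covariance / (spread_x * spread_y)
```

Here `xs` would hold one measured characteristic per image and `ys` the average compression rate chosen for the same images.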

Optimal compression algorithm

From the experiments we could conclude that there seems to be a decent correlation between the percentage of "high frequencies" and the compression rate. Of the 60 images used in the experiments, 40 were randomly selected to base the model on and the remaining 20 were set aside for verification. The correlation can be seen by taking the 40 selected images and creating a 2D scatter plot with the percentage of high frequencies on the X-axis and the compression rate on the Y-axis. With this in mind it was time to try to create an algorithm for estimating the close to optimal compression.

Correlation between average compression rate and the percentage of high frequencies

After visually analysing the 2D scatter plot we tried to create a model for estimating the desired compression rate by aiming for the average compression rate. However, the average is not necessarily a good target: it tells us little about how acceptable the resulting compression will be, because the subjects' choices for a single image were spread widely. Therefore, instead of a single average, we set a linear band that would be considered the "acceptable compression range".

Acceptable compression range

By using the middle line as a base, the formula then became (HF = percentage of high frequencies):

Y = -2/5 * HF + 80 (the middle line, written in the form Y = kx + m with k = -2/5 and m = 80).
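
Written as code, the formula is a one-line function. The function name is chosen for this example, and HF is assumed to be given as a percentage between 0 and 100.

```python
def estimate_compression_rate(hf_percent):
    """Estimated compression rate: Y = -2/5 * HF + 80.

    `hf_percent` is the percentage of high frequencies, assumed to be 0-100.
    """
    return -2 / 5 * hf_percent + 80
```

An image with no high frequencies maps to 80, one with 50% high frequencies to 60, and one with 100% to 40.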

This formula was then used to estimate the close to optimal compression rate for the remaining 20 images. The results were then compared against our current compression algorithm on TinyJPEG to measure how satisfied the users were. The results indicated that the new estimation formula left users slightly more satisfied.

Summary

Based on the experiments and the results we can say that it is possible to measure different characteristics of an image to estimate how much the image can be compressed. The characteristics play a big part in how well an image can be compressed. However, only one of the chosen characteristics had any real significance: the percentage of high frequencies.

The main problem with the current estimation formula is that there is no clear explanation of why some images could not be properly estimated. If we want to create an algorithm for finding the absolute best possible compression rate in the future, we would most likely have to conduct more experiments that include other characteristics.

I do believe that this research could lay the groundwork for a new way of finding the best possible compression rate. With the technique presented here, the current process would become not only faster but also more accurate, since each image is compressed based on its own characteristics.

Interested in reading more? Have a look at my thesis.