Friday, September 6, 2013

Activity 13 - Image Compression

For this activity, we want to try our hands on image compression. There are two main ways to compress an image - lossy and lossless compression - and there are many techniques underneath both. There is discrete cosine transform [1], color pallete indexing [2] and chroma subsampling [3] for lossy compression. On the other hand, there is run-length encoding [4] and DPCM [5] for lossless compression. For more details on these types of compression, you can read the links provided.

Here, we used principal component analysis (PCA) to compress an image. First, I chose a simple image, one from the Windows 7 sample images. I resized the original 1024x768 into a 800x600 image so that processing will be a little faster.

Figure 1. "Tulips" by David Nadalin, 2008. From Windows 7
default sample images.



Then I split the image into smaller 10x10 blocks, which becomes the components that will be used in PCA. So because I have a 800x600 image, I expect to have 4800 10x10 blocks, corresponding to 4800 components. Using the code provided by Ma'am Jing, I computed for the eigenimages of the image above. To visualize the components, here are the principal components of the image:

Figure 2. Left: Principal components of the Tulips image. Right: Two dimensional
discrete cosine transform, image taken from Wikimedia Commons.

As the number of components approaches infinity, the components will look like a 2D discrete cosine transform. Actually, one can derive the DFT from PCA [6], as they are closely related. So far, though, this is what we've got. Using the pca algorithm in Scilab, we are able to compute for the principal components and factors of a given matrix. We can select the first few components which contribute the most to the image. Below are the reconstructions using different number of principal components:

Figure 3. Reconstructed images using different number of components, as indicated by the digits in red. The image
labeled "grayscale" pertains to the original RGB image converted into grayscale by Scilab. 

We see that we obtain a very good image already while using only 5 components, because the sum of the ratios (or the "effect") of the first 5 components already reaches 98% of the actual image. Therefore, we can reduce this image to a smaller file size without losing much detail. In fact, the original grayscale image is 247 kb, while the reconstructed image using 5 components is 220 kb. However we should note that I saved the images in .png format, a known lossless compression technique, so there is another compression being done while saving the image into this particular file format. When I tried saving it in .jpeg, a very lossy compression technique, the file sizes shrunk considerably more. On the other hand, when I tried saving it in .tiff, a file type which uses no compression, the file sizes are all similar, 470 kb. So I conclude that the compressed file size still depends on the file type that we use to save our images, and the PCA method we used only distorts the image but does not visibly decrease the file size.

To investigate further, I used another image (one with more details) with the same dimensions. After doing the procedure above here is what I obtained:

Figure 4.  Reconstructed images using different number of components, as indicated by the digits in red. The image
in the lower right corner is the original RGB image, scaled to 800x600 before converting into grayscale and
processing using Scilab.

The result is the same, but this time we need at least 25 components already to be able to see a satisfactory image. The reconstructed image with 25 components is 287 kb, while the original grayscale image is 341 kb. All images above were saved in .png format.

Lastly, I tried to compress a colored image by applying PCA onto the three channels and then reconstructing them back into an RGB image. This is what I obtained:

Figure 5. Reconstructed color images using different number of components, as indicated by the red digits. The
image labeled "original" is the original color image, taken here and was already used before in Activity 4. 

As we can see here, there is a trade-off between image sharpness and color quality when we try to compress color image. This is evident between the images with 50 and 75 components used. If we only use 50, we preserve some of the color but lose some of the sharpness of the original, but if we use 75, we preserve more sharpness but lose some of the color information. For me, however, I prefer the 50% compression because it is closer to the original. I used GIMP to open the images as layers and composed them together as a single RGB image. It is interesting to note however that the file sizes of all these images are the same (saved as .png), which means that we only distorted the image but not compressed its file size.

Grade I give myself: 12/10. I was able to apply all the necessary steps and extended the activity to color image compression. Although I really doubt compression did happen...

Sources:
[1] Watson, A. B. (1994). Image compression using the discrete cosine transform. Mathematica journal, 4(1), 81. pdf
[2] Pei, S. C., Chuang, Y. T., & Chuang, W. H. (2006). Effective palette indexing for image compression using self-organization of Kohonen feature map. Image Processing, IEEE Transactions on, 15(9), 2493-2498. pdf
[3] Red.com. (2013). Chroma Subsampling Techniques. http://www.red.com/learn/red-101/video-chroma-subsampling
[4] Antoshenkov, G. (1995, March). Byte-aligned bitmap compression. In Data Compression Conference, 1995. DCC'95. Proceedings (p. 476). IEEE. pdf
[5] Goldstein, J. A., Freytag, L. K., & Keith, M. (2004). U.S. Patent No. 6,771,830. Washington, DC: U.S. Patent and Trademark Office. site
[6] Forsyth, D. (2011). Principal Component analysis. pdf

No comments:

Post a Comment