AP 186: Activity 4 - Image Types and Formats

Yesterday's activity was about the different image types and formats used in image processing. The first task was to obtain a true color image and manipulate it by converting the image into grayscale, binary and indexed forms. For this part, I used GNU Image Manipulation Program (GIMP) and obtained true color images of beautiful landscapes online.

This is the image that I chose, scaled down to 25% for web viewing, which I obtained here. The original size is 1280 x 1024 pixels at a resolution of 72 x 72 pixels per inch (ppi), and the image file size is 404 kB. After rescaling, the image size, resolution and file size are 320 x 256 pixels, 72 x 72 ppi and 58.0 kB, respectively. The next three images show the grayscale, binary and indexed forms of the original, true color image.

Clockwise from top left: (a) rescaled image in true color, 58.0 kB; (b) image converted into grayscale, 49.8 kB; (c) image converted into binary, 4.76 kB; and (d) indexed image with 32 values, 25.7 kB.

For the grayscale image, I clicked Image > Mode > Grayscale in GIMP. The file size shrank because the original 3-channel RGB image was flattened into a 1-channel gray image. Color data was lost, but the brightness level is preserved. According to this source, the formula that GIMP uses to convert RGB into grayscale is Y = 0.3R + 0.59G + 0.11B, which results in a weighted gray value known as luminance.
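The weighted sum quoted above can be tried out on a few pixels. This is only a minimal sketch of the formula itself, not of GIMP's internal code; the 2x2 test image is made up for illustration.

```scilab
// Hypothetical 2x2 RGB test image, one matrix per channel (values 0-255)
R = [255   0;   0 128];
G = [  0 255;   0 128];
B = [  0   0; 255 128];

// Weighted sum from the text: Y = 0.3R + 0.59G + 0.11B
Y = 0.3 * R + 0.59 * G + 0.11 * B;

// Pure green comes out brighter than pure red or pure blue,
// and a neutral gray pixel keeps its value because the weights sum to 1
disp(round(Y));
```

The three weights add up to 1, so a pixel with equal R, G and B values maps to the same gray value.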

For the binary image, I clicked Image > Mode > Indexed and selected the 1-bit palette to convert the image into black and white. Essentially, this maps each 24-bit RGB value to either 0 or 1 (black or white), hence the term binary. The resulting file is very small because nearly all of the color information is discarded.

Lastly, for the indexed image, I clicked Image > Mode > Indexed and selected the option to generate an optimum palette with only 32 colors. The image can only use colors from the palette, and when the exact color is not available, the program approximates it (see this source) with the nearest color in the palette. This results in a loss of color information.
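The nearest-color approximation can be sketched as follows. The 4-color palette and the pixel are hypothetical, and using the Euclidean distance in RGB space is my own assumption; GIMP's actual palette generation and matching may work differently.

```scilab
// Hypothetical 4-color palette, one RGB triplet per row (values 0-255)
pal = [  0   0   0;
       255 255 255;
       200  30  30;
        30  60 200];

// A pixel whose exact color is not in the palette
pixel = [180 50 40];

// Squared Euclidean distance from the pixel to every palette entry
diffs = pal - ones(size(pal, 1), 1) * pixel;
dist2 = sum(diffs .^ 2, "c");

// The indexed image stores only the index of the closest entry
[dmin, idx] = min(dist2);
mprintf("nearest palette entry: %d\n", idx);
disp(pal(idx, :));
```

Only the index is kept for each pixel, which is why the original color cannot be recovered afterwards.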

It is also worth mentioning that the original image is in .GIF format. To explore further, I obtained other images in other file formats: .JPEG, .PNG and .BMP. I converted all images into grayscale, binary and indexed using the same procedure and settings as stated above.

JPEG image obtained here

Original: 1600 x 1200 px, 460 kB, 180 x 180 ppi

Rescaled: 400 x 300 px, 30.0 kB, 180 x 180 ppi

I exported the converted images at a quality setting of 80%.

Clockwise from top left: (a) rescaled image in true color, 30.0 kB; (b) image converted into grayscale, 24.4 kB; (c) image converted into binary, 23.7 kB; and (d) indexed image with 32 values, 25.3 kB.

PNG image obtained here

Original: 1504 x 768 px, 1.68 MB, 72 x 72 ppi

Rescaled: 376 x 192 px, 212 kB, 72 x 72 ppi

I chose the compression level to be 0 upon exporting the converted images.

Clockwise from top left: (a) rescaled image in true color, 212 kB; (b) image converted into grayscale, 70.8 kB; (c) image converted into binary, 9.13 kB; and (d) indexed image with 32 values, 71.0 kB.

BMP image obtained here

Original: 1000 x 846 px, 2.41 MB, 72 x 72 ppi

Rescaled: 250 x 212 px, 155 kB, 72 x 72 ppi

Clockwise from top left: (a) rescaled image in true color, 155 kB; (b) image converted into grayscale, 53.2 kB; (c) image converted into binary, 6.75 kB; and (d) indexed image with 32 values, 52.4 kB.
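Since BMP files are usually stored uncompressed, the sizes above can be roughly estimated from the pixel dimensions alone. The sketch below rests on assumptions of mine that are not stated in the activity: a 54-byte header, 4 bytes per palette entry, rows padded to a multiple of 4 bytes, 8 bits per pixel for the 32-color indexed image, and 1 kB = 1024 bytes. The function name bmp_size is hypothetical.

```scilab
// Estimated size in bytes of an uncompressed BMP file
// (assumed layout: 54-byte header + palette + padded pixel rows)
function nbytes = bmp_size(width, height, bits_per_pixel, palette_entries)
    // each row is padded to a multiple of 4 bytes
    row_bytes = ceil(width * bits_per_pixel / 32) * 4;
    nbytes = 54 + 4 * palette_entries + row_bytes * height;
endfunction

w = 250; h = 212;   // rescaled BMP image

mprintf("true color (24 bpp): %.1f kB\n", bmp_size(w, h, 24, 0) / 1024);
mprintf("grayscale  (8 bpp) : %.1f kB\n", bmp_size(w, h, 8, 256) / 1024);
mprintf("indexed    (8 bpp) : %.1f kB\n", bmp_size(w, h, 8, 32) / 1024);
mprintf("binary     (1 bpp) : %.1f kB\n", bmp_size(w, h, 1, 2) / 1024);
```

Under these assumptions each estimate lands within about 1 kB of the corresponding reported file size, which suggests that the size reduction for BMP comes mostly from the smaller number of bits stored per pixel.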

Generally, I observed that converting the images into grayscale, binary and indexed forms decreases the file size regardless of the image file type. This indicates that some information, especially color, was lost.

Speaking of file types, the next task was to compile a brief write-up about the different image file types. There are two basic types: the raster type, which treats an image as a two-dimensional pixel grid, and the vector type, which constructs an image from geometric and other mathematical expressions.

Raster Formats

Graphics Interchange Format (GIF)

- Created in 1987 based on the Lempel-Ziv-Welch (LZW) linear compression routine, a lossless compression algorithm

- Offers an interlaced version so that a preliminary version of the image can be viewed before its transmission is complete

- Limited to 256 colors or gray values in an image, defines RGB values from 0-255, and so images lose some color data when converted into GIF

- Original version (GIF87a) does not allow transparency creation, but a later version (GIF89a) allows one layer of transparency

- LZW algorithm was patented and so GIF became less favored in the graphics and education communities

Joint Photographic Experts Group (JPEG)

- Created in the 1990s and became the leader in image file format compression techniques

- Referred to as the JPEG file interchange format or JFIF

- Displays each pixel in 24 bits of data, allowing for 16.7 million different colors to be displayed

- Uses the discrete cosine transform: the image is divided into 8x8-pixel blocks and each block is compressed separately

- Allows progressive display, which enables viewer to see a low-quality version before seeing the actual image

- Lossy compression technique which may result in degradation of image quality

- Possibility of distortion such as the Gibbs phenomenon because of the compression technique
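Two of the numbers in the JPEG list can be checked with simple arithmetic. This sketch only counts colors and blocks and does not perform any compression. The block count is per channel and ignores chroma subsampling, which is a simplification on my part.

```scilab
// Number of distinct colors with 24 bits per pixel (8 bits per RGB channel)
n_colors = 2^24;
mprintf("24-bit colors: %d\n", n_colors);

// Rescaled JPEG image from the activity
w = 400; h = 300;

// 8x8 blocks per channel; partial blocks at the edges count as whole blocks
blocks_x = ceil(w / 8);
blocks_y = ceil(h / 8);
mprintf("8x8 blocks: %d x %d = %d\n", blocks_x, blocks_y, blocks_x * blocks_y);
```

The color count of 16,777,216 is the "16.7 million" quoted in the list.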

Tagged Image File Format (TIFF)

- Developed by Microsoft and Aldus in 1986, controlled by Adobe now

- Created primarily to be compatible with different image processing devices

- Uses over 70 different “tags” to convey image information, but image processing programs need to be able to decode all possible tags

- Can support full range of image sizes, resolutions and color depth

- Lossless compression technique which does not compromise image quality

- Large file size which limits usability in Internet educational environments

Portable Network Graphics (PNG)

- Created in 1995 as a response to GIF being patented

- Uses compression based on LZ77, a precursor of LZW that is not patented

- Lossless

- Uses “chunk architecture” that allows storage of image attributes into metadata

- Supports three pixel types: palette-mapped, 16-bit grayscale, and 48-bit true-color (RGB)

- Uses a 2D interlacing scheme that displays high-res images faster, but increases file size

- Inherent gamma correction (brightness) independent of platform or device

- Not supported by earlier browser versions

Exchangeable Image File Format (EXIF)

- Design rule for Camera File System (DCF) standard created by the Japan Electronics and Information Technology Industries Association (JEITA) for compatibility in different imaging devices (e.g. digital cameras)

- Created in 1995

- Uses JPEG compression for compressed files and TIFF for uncompressed files

Raw Image Format (RAW)

- Records everything that the sensor captures without correction

- Enables user to manipulate the image completely from raw data

- No loss of data

- Dependent on the imaging device and manufacturer if available or not

Windows Bitmap File (BMP)

- Also known as device-independent bitmap (DIB), Windows BMP, Windows DIB, Compatible Bitmap

- Used as the standard bitmap storage format for Windows platforms independent of the machine

- Pixels can be defined by a varying number of bits: 1-, 2-, 4-, 8-, 16-, 24-, and 32-bit per pixel format

- 16-bit and 32-bit images are stored uncompressed, but indexed color images may be compressed with 4-bit or 8-bit Run-length encoding (RLE) or Huffman 1D algorithm, where both are lossless compression techniques

Netpbm file formats

- A family of image file formats such as PBM for bitmaps, PGM for grayscales, and PPM for pixel maps which represent full RGB color

- Sometimes referred to as PNM or Portable Anymap Utilities

- Mostly UNIX platform-supported, also some Intel-based PCs

- Developed in the 1980s to provide a common file format for bitmaps that is easily transmittable in an e-mail message

WebP

- A new image format developed by Google that provides both lossy and lossless file compression for images optimized for web viewing

- Lossy compression uses predictive coding, which predicts each pixel block from the values of neighboring pixels and encodes only the difference between the actual and predicted values; this is the same method that the VP8 video codec uses to compress keyframes

- Lossless compression uses VP8L compression which uses processed image fragments to exactly reconstruct new pixels, and shares some common features with the LZ77 compression technique

- Supported in various web browsers, particularly Google Chrome and Opera

- Open source, and has added support for different image editing tools

- Lossless files are 26% smaller than PNG and lossy files are 25-34% smaller than JPEG

- Supports lossless transparency (known as alpha channel)

High Dynamic Range (HDR) Raster Formats

Radiance HDR format (RGBE)

- An image format compatible with the Radiance rendering system

- Stores each pixel as 4 bytes: one for each RGB value and another one for a shared exponent

- Allows pixels to have the extended range and precision of floating point values

- Can handle very bright pixels without loss of value for darker pixels

- Lossless image format

- Another format uses the XYZ color scheme plus a shared exponent
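The shared-exponent idea can be sketched with one 8-bit value per channel plus an 8-bit exponent. This follows the commonly described RGBE scheme as I understand it and is not taken from the Radiance source code; the pixel value is hypothetical, and an all-zero pixel would need a special case that is left out here.

```scilab
// Hypothetical floating-point pixel, much brighter than 1.0
rgb = [1200.0 300.0 4.5];

// The shared exponent is taken from the brightest channel:
// max(rgb) = f * 2^e with 0.5 <= f < 1
[f, e] = frexp(max(rgb));

// Scale every channel by the same power of two and keep 8 bits each
mantissas = floor(rgb * 256 / 2^e);
rgbe = [mantissas, e + 128];   // R, G, B, exponent (stored with an offset of 128)
disp(rgbe);

// Decode back to floating point
decoded = (rgbe(1:3) + 0.5) * 2^(rgbe(4) - 128 - 8);
disp(decoded);
```

Because every pixel carries its own exponent, a very bright pixel and a very dark pixel in the same image can both be stored with 8 bits of relative precision.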

Vector Formats

Computer Graphics Metafile (CGM)

- First released in 1986 and registered in 1995 as an Internet Media type

- A standardized platform-independent format used for the interchange of bitmap and vector data developed by the International Organization for Standardization (ISO) and the American National Standards Institute (ANSI)

- Developed because of the need for a graphics file type able to display scalable or vector graphics, useful for showing technical drawings or maps and other schematic data

- Has support for raster data which can be included internally as cell arrays

- Support for user interaction, i.e. can be zoomed and panned

- Uses RLE compression, which is lossless, and also supports other compression techniques such as JPEG

Scalable Vector Graphics (SVG)

- Released in 2001, a platform for XML-based two-dimensional graphics

- Features include shapes, text and embedded raster graphics, and are used in many Web-based applications, user interfaces and the like

- Builds upon XML, JPEG and PNG file formats

- As XML files, they can be edited by any text editor, but can be completely manipulated by drawing programs

Other Types

3D Vector Formats (AMF, DWF) – 3D graphics

Compound Formats (EPS, PDF, SWF) – combine pixel and vector data

Stereo Formats (MPO, JPS, PNS) – multiple or side-by-side image format based on common raster types

The third and final task is to explore the Scilab Image and Video Processing (SIVP) toolbox to manipulate images. We were given a few commands to explore, and I wrote here a simple description of the commands from Scilab's help module.

imread - reads an image file
imshow - displays an image on the Scilab graphic window
gray_imread - only available in the SIP package of Scilab; an equivalent SIVP command can be rgb2gray which converts RGB images into grayscale
im2bw - converts an image into binary (black and white)
histplot - plots a histogram of a given data vector according to either the number of classes or a vector defining the classes
imwrite - writes an image to a file
imfinfo - shows information about the image file

In the Scilab console I used all commands to reproduce the results I had with the .PNG file I used above. First, I converted the image into grayscale:

Im = imread("3_rescale.png");
Im_gray = rgb2gray(Im);

Then, I took the histogram of the grayscale values using histplot. The first argument corresponds to the number of gray values, and the second argument is the grayscale matrix converted into double precision data type:

histplot(256, double(Im_gray));

To separate the background from the foreground, we have to identify the threshold gray value that differentiates the two. We see from the original image that the foreground (trees) is darker than the background (sky), and so we look for the dip in the histogram that separates the darker gray values from the brighter ones. By inspection, the threshold gray value is 108; normalizing it by the maximum gray value of 255 gives a threshold value of 0.4235. Using this threshold value we convert the truecolor image into binary:

Im_bw = im2bw(Im, 0.4235);

and here we are able to extract the foreground, which are the trees, from the image.
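The normalization of the threshold can be checked with a few lines. The sketch compares dividing by the number of gray levels (256) and by the maximum gray value (255), and then applies the threshold to a made-up grayscale patch. The comparison in the last step is only a stand-in for im2bw, under the assumption that pixels brighter than the threshold become white.

```scilab
// Gray level read off the histogram
gray_threshold = 108;

// Two possible normalizations to the [0, 1] range
t_256 = gray_threshold / 256;   // divide by the number of gray levels
t_255 = gray_threshold / 255;   // divide by the maximum gray value
mprintf("108/256 = %.4f\n", t_256);   // prints 0.4219
mprintf("108/255 = %.4f\n", t_255);   // prints 0.4235

// Hypothetical 2x3 grayscale patch (values 0-255)
gray_patch = [ 30 107 108;
              109 200 255];

// Stand-in for the thresholding step: brighter pixels become 1 (white)
bw = bool2s(gray_patch > gray_threshold);
disp(bw);
```

The two normalizations differ only in the third decimal place, so either one puts the cut at practically the same gray level.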

Grade I give myself: 12/10. I really worked hard for this activity and enjoyed it. I was able to do the tasks not just on one truecolor image but on four, and I was able to compile a detailed writeup on image file formats. Lastly, I was also able to separate the foreground from the background of an image using simple commands in Scilab.

Sources (for the writeup):

Image file formats: past, present and future (2001)

Richard H. Wiggins III, H. Christian Davidson, H. Ric Harnsberger, Jason R. Lauman, and Patricia A. Goede

http://radiographics.rsna.info/content/21/3/789.full

History of the Portable Network Graphics (PNG) Format (2009)

Greg Roelofs

http://www.libpng.org/pub/png/pnghist.html

The RAW image file format (2013)

Richard Martin

http://www.nyip.edu/photo-articles/archive/the-raw-image-format

Exchangeable image file format for digital still cameras: Exif Version 2.3 (2010)

Standardization Committee, Camera & Imaging Products Association