The fundamental difference(s) between GIF, JPEG and PNG image file formats
Note: I will expand the explanation of the JPEG file format soon, as it isn't on the same level with the descriptions of the other two formats.
GIF (Graphics Interchange Format)
The GIF image format uses the LZW (Lempel-Ziv-Welch) compression method to reduce the size of image files. This method works by taking advantage of repetition in the image file data. For example, let's assume we have the following 8x8 pixel b/w image:
========
===XX===
==X==X==
=X====X=
=X====X=
==X==X==
===XX===
========
where "=" symbolizes white and "X" symbolizes black. As I noted above, LZW looks for repetition in the data. In a heavily simplified form (strictly speaking, this simplification is closer to run-length encoding than to LZW proper), the compression works by reading the image data and writing each repeated character once, along with the number of times it repeats, into the compressed image file. For the example above, the result could be something like this (when read line by line):
1. 8=
2. 3= 2X 3=
3. 2= X 2= X 2=
4. = X 4= X =
5. = X 4= X =
6. 2= X 2= X 2=
7. 3= 2X 3=
8. 8=
which would make the compressed image data (not counting image headers and possible other information) only 44 bytes long, at one byte per character and ignoring the spaces, as opposed to the original 64 bytes of raw image data. Pretty good compression, even with such a primitive and crude algorithm.
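The line-by-line counting above can be reproduced with a short Python sketch. This is only the crude run-length scheme from the example, not real LZW or the actual GIF encoder; the names IMAGE and rle_row are made up for the illustration.

```python
# Run-length sketch of the 8x8 example; "=" is white, "X" is black.
from itertools import groupby

IMAGE = [
    "========",
    "===XX===",
    "==X==X==",
    "=X====X=",
    "=X====X=",
    "==X==X==",
    "===XX===",
    "========",
]

def rle_row(row):
    """Encode one row as run tokens: '8=' for a run of eight, bare 'X' for a run of one."""
    tokens = []
    for char, group in groupby(row):
        count = len(list(group))
        tokens.append(char if count == 1 else f"{count}{char}")
    return tokens

encoded_rows = [rle_row(row) for row in IMAGE]
for number, tokens in enumerate(encoded_rows, start=1):
    print(f"{number}. {' '.join(tokens)}")

# Count only the stored characters, not the spaces used for readability.
compressed_size = sum(len(token) for tokens in encoded_rows for token in tokens)
raw_size = sum(len(row) for row in IMAGE)
print(compressed_size, "characters instead of", raw_size)
```

The printed rows match the eight lines listed above, and the totals come out as 44 against 64.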
In real life, LZW compression is much more powerful than this very crude example, as it doesn't just read the image data line by line. In the example picture the first eleven characters are all the same color, so writing them as 8= followed by 3= takes more space than writing them as 11=. This does not present a problem, because the image dimensions have to be recorded in the file anyway: the program rendering the image will cut each line after the 8th character and then render the remaining three white pixels on the second line. Also, LZW compression is not limited to single-character repetition; it can handle repetition in longer strings as well. For example, lines 3 and 6 could be written more efficiently as 2==X 2= (two copies of "==X" followed by two "=") instead of 2= X 2= X 2=. This makes GIF a very powerful format for line drawings and pictures with large single-color areas, but (very) poor for multi-colored images with small areas of single color, such as paintings or photographs.
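To give a feel for how LZW handles repetition in longer strings, here is a minimal textbook-style sketch in Python. It is not the exact GIF variant (which, among other things, packs variable-width codes and uses special clear and end codes); the function names and the choice to return the alphabet alongside the codes are assumptions made for this illustration.

```python
def lzw_encode(data):
    """Return (alphabet, codes) for the string `data` using textbook LZW."""
    # Start with one dictionary entry per distinct symbol in the input.
    alphabet = sorted(set(data))
    table = {symbol: code for code, symbol in enumerate(alphabet)}
    codes = []
    current = ""
    for symbol in data:
        candidate = current + symbol
        if candidate in table:
            current = candidate             # keep extending a known string
        else:
            codes.append(table[current])    # emit the longest known string
            table[candidate] = len(table)   # remember the new, longer string
            current = symbol
    if current:
        codes.append(table[current])
    return alphabet, codes

def lzw_decode(alphabet, codes):
    """Rebuild the original string from the alphabet and the code list."""
    if not codes:
        return ""
    table = {code: symbol for code, symbol in enumerate(alphabet)}
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # The code refers to the entry that is being defined right now.
            entry = previous + previous[0]
        output.append(entry)
        table[len(table)] = previous + entry[0]
        previous = entry
    return "".join(output)

# The 8x8 example image read as one continuous stream of 64 pixels.
pixels = ("========" "===XX===" "==X==X==" "=X====X="
          "=X====X=" "==X==X==" "===XX===" "========")
alphabet, codes = lzw_encode(pixels)
print(len(pixels), "pixels became", len(codes), "codes")
```

Note that the encoder never stores counts at all: it simply learns longer and longer strings as it goes and emits one code per known string, which is why runs that cross line boundaries cost nothing extra.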
JPEG (Joint Photographic Experts Group (File Interchange Format))
The JPEG file format uses a completely different approach to compression. While the LZW compression utilized with GIFs is lossless, JPEG uses lossy compression and reduces the image file size by a rather complex process which can, in a simplified form, be described as: divide the image into small blocks, average the information in each block, process the averaged data (this is where the loss in picture information occurs), and write the processed data into the compressed image file. The actual process is far more complex (as was the case with LZW compression as well) and I won't go into the gory technical details here, but a couple of points are worth mentioning: the way JPEG images are processed includes the DCT (Discrete Cosine Transform, a relative of the Fourier transform), and the most widely used entropy coding is probably Huffman coding (baseline JPEG only allows Huffman coding; the alternative is arithmetic coding). Because of the way JPEG compression works, it is an ideal format for multi-colored pictures, paintings and photographs, but (very) poor for line drawings or pictures with large single-colored areas.
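The "transform a block, then throw away precision" idea can be sketched in plain Python. This is a rough illustration under stated assumptions, not the JPEG codec: it uses the direct 8x8 DCT formula, a single uniform quantizer step (real JPEG uses a table of 64 step sizes), and it leaves out color conversion, level shifting, zigzag ordering and the Huffman stage entirely. All function names are made up for the example.

```python
import math

N = 8  # JPEG works on 8x8 pixel blocks

def _scale(k):
    """Normalization factor of the DCT: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def _basis(position, frequency):
    return math.cos((2 * position + 1) * frequency * math.pi / (2 * N))

def dct_2d(block):
    """Forward 2-D DCT of an 8x8 block (direct, unoptimized formula)."""
    return [[0.25 * _scale(u) * _scale(v)
             * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct_2d(coeffs):
    """Inverse 2-D DCT, turning coefficients back into pixel values."""
    return [[0.25 * sum(_scale(u) * _scale(v) * coeffs[u][v]
                        * _basis(x, u) * _basis(y, v)
                        for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def quantize(coeffs, step):
    """The lossy step: keep only how many whole steps each coefficient spans."""
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    return [[level * step for level in row] for row in levels]

# A smooth horizontal gradient: every row is 8, 24, 40, ... 120.
block = [[16 * y + 8 for y in range(N)] for _ in range(N)]
step = 16  # hypothetical uniform quantizer step, chosen for the example
levels = quantize(dct_2d(block), step)
kept = sum(1 for row in levels for level in row if level != 0)
restored = idct_2d(dequantize(levels, step))
worst = max(abs(restored[x][y] - block[x][y])
            for x in range(N) for y in range(N))
print(kept, "of 64 coefficients are non-zero; worst pixel error:", round(worst, 2))
```

For smooth content like this gradient, most of the 64 coefficients quantize to zero, and long runs of zeros are exactly what the entropy coder compresses well. A sharp black-on-white edge spreads energy across many coefficients instead, which is where the visible errors in line drawings come from.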
PNG (Portable Network Graphics)
The PNG file format is the most advanced of the three image file formats but, because it is still a relatively new format (compared to GIF and JPEG), it is not yet as widely or thoroughly supported and thus has not (yet) achieved the popularity it would no doubt deserve.
PNG offers the lossless compression and transparency of GIF together with compression efficiency comparable to JPEG in the case of multi-colored images or images with color gradients. Transparency in the PNG format is full-range, as compared to on-off with GIFs. What this means is that you can have images with partial transparency or transparency gradients (much the same way as with color gradients, only the gradient is from opaque to transparent). In addition, PNG offers a lot of other features, such as gamma correction, two-dimensional interlacing and a pretty good attempt at forwards compatibility (so that a program designed for an older version of the PNG format may be able to open newer PNG files, and the program used to open the image can even evaluate whether or not it can do so correctly). Also, the file format is based on a public domain compression algorithm that has proven efficient over the years.
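To make the difference between full-range and on-off transparency concrete, here is a small Python sketch of how a viewer might combine a partially transparent pixel with whatever is behind it. The blend function and the 8-bit alpha range are assumptions for the illustration; PNG itself only stores the alpha values and leaves the compositing to the program displaying the image.

```python
def blend(foreground, background, alpha):
    """Composite one RGB pixel over another.

    alpha runs from 0 (fully transparent) to 255 (fully opaque).
    """
    return tuple(
        (f * alpha + b * (255 - alpha) + 127) // 255  # +127 rounds to nearest
        for f, b in zip(foreground, background)
    )

red = (255, 0, 0)
white = (255, 255, 255)

# A transparency gradient: red fading in over a white page.
for alpha in (0, 64, 128, 192, 255):
    print(alpha, blend(red, white, alpha))
```

With GIF-style on-off transparency only the first and last steps of that loop are available: a pixel either shows the page behind it or hides it completely.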
Ingredients of a PNG image file
The data inside a PNG file is divided into multiple "chunks", each of which can hold any kind of image information (such as pixel values, gamma correction, text, an alpha channel, etc.). Each chunk has a four-character name and a CRC (cyclic redundancy check) checksum to verify the integrity of the data in that chunk. The chunk names identify the chunks to a decoding program (such as a browser or a graphics program), and decoders also read the names to determine whether or not they can correctly decode and display the image. This works because the case of each of the four letters in the name carries control information (metadata, if you want) about the chunk, as follows:
First character:
uppercase - the chunk is critical to the display of the file's contents
("critical" chunk)
lowercase - the chunk is not strictly necessary in order to meaningfully
display the file's contents ("ancillary" chunk)
Second character:
uppercase - the chunk is part of the public specification ("public" chunk)
lowercase - the chunk is not a part of the formal PNG specification
("proprietary" chunk)
Third character:
uppercase - reserved for future use. Currently, all chunk names must have
an uppercase third character.
Fourth character:
uppercase - this chunk has been denoted as "unsafe to copy" under certain
circumstances ("unsafe-to-copy" chunk)
lowercase - this chunk is safe to copy ("safe-to-copy" chunk)
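The four rules above are simple enough to check in a few lines of Python. This is just a sketch of the naming convention as listed here; the function name and the dictionary keys are made up for the example.

```python
def describe_chunk_name(name):
    """Decode the four property flags carried by the letter case of a chunk name."""
    if len(name) != 4 or not (name.isascii() and name.isalpha()):
        raise ValueError("a chunk name is exactly four ASCII letters")
    first, second, third, fourth = name
    return {
        "critical": first.isupper(),       # uppercase: needed to display the image
        "public": second.isupper(),        # uppercase: part of the public specification
        "reserved_ok": third.isupper(),    # must currently be uppercase
        "safe_to_copy": fourth.islower(),  # lowercase: safe to copy
    }

for name in ("IHDR", "IDAT", "tEXt", "gAMA"):
    print(name, describe_chunk_name(name))
```

So a decoder that has never heard of, say, a chunk whose name starts with a lowercase letter knows it can skip it and still display the image, while an unknown chunk starting with an uppercase letter means the image cannot be shown correctly.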
Note the second character, which provides for future extensibility: companies can register new "public" chunk types (for information such as character set etc.) not currently supported in the specification. This means the format can be extended in any way necessary by just registering a new chunk type for either public or private use; if the second character is in lowercase, the chunks are only meant to be recognized and decoded by custom decoders designed to support the data inside those chunks, and other decoders simply ignore the information. Currently, text can be stored with the images using the ISO-8859-1 (Latin-1) character set only.
Storing alphanumeric information along with the image provides endless possibilities. For example, a stockbroker can receive an image file depicting the recent behaviour of a certain stock and have all the dates, times and stock values as text inside the same file for easy processing. Or a surgeon can receive all relevant patient information (such as name, age, address, relevant medical history etc.) in a single PNG-compressed X-ray image.
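As a rough sketch of how text travels inside the file, the following Python builds a tiny 1x1 pixel PNG with a tEXt chunk and then walks through the chunks again, checking each CRC. It assumes the usual on-disk chunk layout (a four-byte length, the four-byte name, the data, then the CRC computed over name and data) and the tEXt layout of keyword, a zero byte, then Latin-1 text. The helper names and the sample text are invented for the illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(chunk_type, data):
    """Serialize one chunk: length, four-byte name, data, CRC of name + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def iter_chunks(png_bytes):
    """Yield (name, data) for each chunk, raising if a CRC does not match."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset < len(png_bytes):
        (length,) = struct.unpack_from(">I", png_bytes, offset)
        chunk_type = png_bytes[offset + 4:offset + 8]
        data = png_bytes[offset + 8:offset + 8 + length]
        (stored_crc,) = struct.unpack_from(">I", png_bytes, offset + 8 + length)
        if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != stored_crc:
            raise ValueError(f"corrupt chunk {chunk_type!r}")
        yield chunk_type, data
        offset += 12 + length  # 4 length + 4 name + data + 4 CRC

# A 1x1 greyscale image: width, height, bit depth 8, colour type 0,
# compression method 0 (deflate), filter method 0, no interlacing.
header = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
pixels = zlib.compress(b"\x00\x00")  # filter-type byte + one black pixel
# Made-up sample text: keyword, zero separator, Latin-1 encoded value.
note = b"Comment" + b"\x00" + "Closing price: 42".encode("latin-1")

png = (PNG_SIGNATURE
       + make_chunk(b"IHDR", header)
       + make_chunk(b"tEXt", note)
       + make_chunk(b"IDAT", pixels)
       + make_chunk(b"IEND", b""))

for chunk_type, data in iter_chunks(png):
    if chunk_type == b"tEXt":
        keyword, text = data.split(b"\x00", 1)
        print(keyword.decode("latin-1"), "=", text.decode("latin-1"))
    else:
        print(chunk_type.decode("ascii"), len(data), "bytes")
```

A program that only cares about the text never has to decompress the picture at all; it can hop from chunk to chunk using the length fields and read just the tEXt chunks.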
The PNG format allows for multiple compression methods, but the only method currently defined is "deflate" compression (referred to as "PNG compression type 0" in the specification). This makes it possible to incorporate new, better compression engines (such as wavelet-based methods) into the format with relative ease - PNG has been designed for long-range supportability. Here's a brief overview of the deflate compression method:
Deflate compression is a patent-free LZ77 derivative (used in zip and gzip) which uses a combination of LZ77 and Huffman encoding and is independent of CPU type, operating system, file system and character set. It compresses data with an efficiency comparable to the best currently-available general-purpose compression methods and - due to its patent-free nature - can be implemented freely. The deflate algorithm does not attempt to compress specialized data, such as raster graphics, as well as algorithms optimized for those tasks do. However, as the format and compression methods can be extended and upgraded, such algorithms can be implemented later on, if necessary.
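Since deflate is the same method used in zip and gzip, it is easy to try out with Python's standard zlib module. The sketch below simply deflates the 64 bytes of the 8x8 example image from the GIF section and inflates them again; a real PNG encoder would additionally prepend a filter-type byte to every row before compressing, which is left out here.

```python
import zlib

# The 8x8 example image; "=" is white, "X" is black, one byte per pixel.
IMAGE = [
    "========",
    "===XX===",
    "==X==X==",
    "=X====X=",
    "=X====X=",
    "==X==X==",
    "===XX===",
    "========",
]

raw = "".join(IMAGE).encode("ascii")
compressed = zlib.compress(raw, 9)      # 9 = best compression level
restored = zlib.decompress(compressed)

print(len(raw), "bytes raw,", len(compressed), "bytes as a zlib/deflate stream")
print("lossless round trip:", restored == raw)
```

The compressed size includes a few bytes of zlib header and checksum, so on an input this small the overhead is noticeable; the savings grow with larger images.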
Summary
- PNG provides 10-30% better compression than GIF in addition to the capability of incorporating improved compression engines as they are developed
- PNG handles bitdepths including truecolor images up to 48 bits per pixel and greyscale images up to 16 bits per pixel (compared to 8 bits per pixel with GIF and 32 bits per pixel with JPEG)
- PNG provides for incorporation of textual data along with the image data
- PNG provides a full alpha channel (compared to on-off transparency with GIF and no-transparency with JPEG)
- PNG provides gamma correction to correct/account for display gamma
- PNG provides the capability to add new filters in the future
- PNG compression method is patent-free and readily available in the public domain
- Multi-image sequences (read: animations, supported in the GIF format) have been explicitly disallowed in the PNG specification. There may one day be a multimedia PNG variant for these
- PNG compression is lossless (as opposed to lossy compression with JPEG) but offers the compression efficiency of JPEG with photograph-like images
- PNG can handle high-bandwidth (read: sharp transition, such as line-art) images as well as GIF does (compared to either clearly visible errors or very poor compression with JPEG)
The biggest real problem with the PNG file format is the lack of decent support. What more can I say?
A humorous (and not entirely accurate) analogy to the differences in the compression methods (you perform the role of the compression algorithm) would be packing plates into a box:
- If you were the LZW algorithm, you would look for plates of similar shape in your kitchen shelves, group them together and then neatly pack each different set of plates in the box.
- If you were the JPEG algorithm, you would look for plates of similar color in your kitchen shelves, pack them in small boxes of similar size, regardless of the plate sizes and shapes (thus breaking some of them up a bit to make them fit) and then stuff all the small boxes in the large one.
- If you were the PNG algorithm, you would simply pack the plates in the box, include a note telling how to place them once unpacked and still have room for more plates that you might buy in the future.
Hopefully, this clarifies the fundamental differences between the formats. For all the gory details, go read the URLs below.
http://www.faqs.org/faqs/compression-faq/part2/section-6.html - Introduction to JPEG
http://www.dcs.ed.ac.uk/home/mxr/gfx/2d/GIF-comp.txt - LZW and GIF explained
http://www.libpng.org/pub/png/spec/ - PNG (Portable Network Graphics) Specification, Version 1.2
P.S. Thanks to spiregrain for constructive comments about the HTML-formatting :)