Popular File Formats

 

Some popular file formats for information exchange are described below. One of the most important is the 8-bit GIF format, because of its historical connection to the WWW and the HTML markup language as the first image type recognized by web browsers. However, currently the most important common file format is JPEG.

GIF

GIF is palette-based: the colors used in an image (a frame) in the file have their RGB values defined in a palette table that can hold up to 256 entries, and the data for the image refer to the colors by their indexes (0–255) in the palette table. The color definitions in the palette can be drawn from a color space of millions of shades (2^24 shades, 8 bits for each primary), but the maximum number of colors a frame can use is 256. This limitation seemed reasonable when GIF was developed because few people could afford the hardware to display more colors simultaneously. Simple graphics, line drawings, cartoons, and grey-scale photographs typically need fewer than 256 colors.
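The indexing scheme can be illustrated with a minimal Python sketch. The four-entry palette, the pixel values, and the helper name expand_to_rgb are all hypothetical; the sketch only shows how per-pixel indexes refer back to RGB entries in a table of at most 256 colors, and it is not a GIF decoder.

```python
# A hypothetical 4-entry palette: index -> (R, G, B), 8 bits per primary.
palette = [
    (0, 0, 0),        # 0: black
    (255, 255, 255),  # 1: white
    (255, 0, 0),      # 2: red
    (0, 0, 255),      # 3: blue
]

# The image data stores one palette index per pixel, not an RGB triple.
indexed_pixels = [
    [0, 1, 1, 0],
    [2, 2, 3, 3],
]


def expand_to_rgb(pixels, table):
    """Replace each palette index with the RGB triple it refers to."""
    if len(table) > 256:
        raise ValueError("a GIF palette holds at most 256 entries")
    return [[table[index] for index in row] for row in pixels]


rgb_rows = expand_to_rgb(indexed_pixels, palette)
```

Each pixel costs at most one byte in the indexed form, against three bytes as an RGB triple, which is where the palette approach saves space before any compression is applied.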

As a further refinement, each frame can designate one index as a "transparent background color": any pixel assigned this index takes on the color of the pixel in the same position from the background, which may have been determined by a previous frame of animation.

Many techniques, collectively called dithering, have been developed to approximate a wider range of colors with a small color palette by using pixels of two or more colors to approximate in-between colors. These techniques sacrifice spatial resolution to approximate deeper color resolution. While not part of the GIF specification, dithering can be applied to images that are subsequently encoded as GIF. This is often not an ideal solution for GIF images, both because the loss of spatial resolution typically makes an image look fuzzy on the screen, and because the dithering patterns often interfere with the compressibility of the image data, working against GIF's main purpose.
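One member of the dithering family, ordered dithering, can be sketched in a few lines of Python. The 2x2 threshold matrix and the function name ordered_dither are choices made for this illustration and do not come from the text above; the sketch reduces 8-bit gray values to two colors only.

```python
# 2x2 Bayer-style threshold matrix, tiled across the image.
BAYER_2X2 = [[0, 2],
             [3, 1]]


def ordered_dither(gray):
    """Map rows of 8-bit gray values to 0/1 pixels using tiled thresholds."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # Each position in the tile gets a different threshold (0..255).
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out


flat_gray = [[128] * 4 for _ in range(4)]   # a flat mid-gray patch
dithered = ordered_dither(flat_gray)
white_fraction = sum(map(sum, dithered)) / 16
```

For the mid-gray patch, half of the output pixels come out white, so the patch reads as gray from a distance even though only two colors are used. The fine alternating pattern is also the kind of data that tends to compress poorly.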

In the early days of graphical web browsers, graphics cards with 8-bit buffers (allowing only 256 colors) were common, and GIF images were often made using the websafe palette. This ensured predictable display, but severely limited the choice of colors. Now that 32-bit graphics cards, which support 24-bit color, are the norm, palettes can be populated with the optimum colors for individual images.

A small color table may suffice for small images, and keeping the color table small allows the file to be downloaded faster. Both the 87a and 89a specifications allow color tables of 2^n colors for any n from 1 through 8. Most graphics applications will read and display GIF images with any of these table sizes, but some do not support all sizes when creating images. Tables of 2, 16, and 256 colors are widely supported.
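The table-size rule can be worked through with a small Python helper. The function name smallest_color_table is hypothetical; it simply picks the smallest power of two, 2^n with n from 1 through 8, that holds a given number of colors.

```python
def smallest_color_table(num_colors):
    """Return (n, 2**n) for the smallest allowed table that fits num_colors."""
    if not 1 <= num_colors <= 256:
        raise ValueError("GIF color tables hold between 1 and 256 colors")
    n = 1
    while 2 ** n < num_colors:
        n += 1
    return n, 2 ** n


n, table_size = smallest_color_table(5)   # 5 colors need an 8-entry table
table_bytes = 3 * table_size              # one byte each for R, G and B
```

An image with 5 distinct colors therefore needs n = 3, an 8-entry table of 24 bytes, whereas a full 256-entry table takes 768 bytes.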

[Figures omitted: GIF file format; GIF screen descriptor; GIF color map; GIF image descriptor; GIF four-pass interlace display row order]
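The four-pass interlace row order can be sketched in Python. The pass parameters (every 8th row from row 0, every 8th from row 4, every 4th from row 2, every 2nd from row 1) follow the usual description of GIF interlacing and are not spelled out in the text here; the function name gif_interlace_order is hypothetical.

```python
def gif_interlace_order(height):
    """Return image row numbers in the order an interlaced GIF stores them."""
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]  # (first row, row step)
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


row_order = gif_interlace_order(10)
```

For a 10-row image this gives rows 0, 8, then 4, then 2, 6, then 1, 3, 5, 7, 9, so a coarse version of the whole picture is available after the first pass.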

JPEG

The JPEG compression algorithm is at its best on photographs and paintings of realistic scenes with smooth variations of tone and color. For web usage, where the amount of data used for an image is important, JPEG is very popular. JPEG / Exif is also the most common format saved by digital cameras.

On the other hand, JPEG may not be as well suited for line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels can cause noticeable artifacts. Such images may be better saved in a lossless graphics format such as TIFF, GIF, PNG, or a raw image format. The JPEG standard actually includes a lossless coding mode, but that mode is not supported in most products.

As the typical use of JPEG is a lossy compression method, which somewhat reduces the image fidelity, it should not be used in scenarios where the exact reproduction of the data is required (such as some scientific and medical imaging applications and certain technical image processing work).

JPEG is also not well suited to files that will undergo multiple edits, as some image quality will usually be lost each time the image is decompressed and recompressed, particularly if the image is cropped or shifted, or if encoding parameters are changed – see digital generation loss for details. To avoid this, an image that is being modified or may be modified in the future can be saved in a lossless format, with a copy exported as JPEG for distribution.

The compression method is usually lossy, meaning that some original image information is lost and cannot be restored, possibly affecting image quality. There is an optional lossless mode defined in the JPEG standard; however, this mode is not widely supported in products.

There is also an interlaced "Progressive JPEG" format, in which data is compressed in multiple passes of progressively higher detail. This is ideal for large images that will be displayed while downloading over a slow connection, allowing a reasonable preview after receiving only a portion of the data. However, progressive JPEGs are not as widely supported, and even some software which does support them (such as versions of Internet Explorer before Windows 7) only displays the image after it has been completely downloaded.

There are also many medical imaging and traffic systems that create and process 12-bit JPEG images, normally grayscale images. The 12-bit JPEG format has been part of the JPEG specification for some time, but again, this format is not as widely supported.

A number of alterations to a JPEG image can be performed losslessly (that is, without recompression and the associated quality loss) as long as the image size is a multiple of 1 MCU block (Minimum Coded Unit) (usually 16 pixels in both directions, for 4:2:0 chroma subsampling). Utilities that implement this include jpegtran, with user interface Jpegcrop, and the JPG_TRANSFORM plugin to IrfanView.

Blocks can be rotated in 90 degree increments, flipped in the horizontal, vertical and diagonal axes and moved about in the image. Not all blocks from the original image need to be used in the modified one.

The top and left edges of a JPEG image must lie on an 8 × 8 pixel block boundary, but the bottom and right edges need not do so. This limits the possible lossless crop operations, and also prevents flips and rotations of an image whose bottom or right edge does not lie on a block boundary for all channels (because the edge would end up on the top or left, where, as mentioned above, a block boundary is obligatory).
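The alignment constraints can be expressed as simple checks. This is a minimal Python sketch, assuming a 16 × 16 MCU (the 4:2:0 case mentioned earlier); the function names are hypothetical and are not part of jpegtran or any other utility.

```python
def can_crop_losslessly(left, top, mcu=16):
    """A crop can reuse existing blocks only if its top-left corner is aligned."""
    return left % mcu == 0 and top % mcu == 0


def can_flip_or_rotate_losslessly(width, height, mcu=16):
    """Any flip or rotation is safe only if no partial block exists on any edge."""
    return width % mcu == 0 and height % mcu == 0


def snap_to_mcu(left, top, mcu=16):
    """Move a crop origin up and left to the nearest MCU boundary."""
    return left - left % mcu, top - top % mcu


aligned_origin = snap_to_mcu(37, 50)
```

A crop starting at (37, 50) is not aligned; moving its origin to (32, 48) makes the crop lossless at the cost of keeping a few extra pixels.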

When using lossless cropping, if the bottom or right side of the crop region is not on a block boundary then the rest of the data from the partially used blocks will still be present in the cropped file and can be recovered.

It is also possible to transform between baseline and progressive formats without any loss of quality, since the only difference is the order in which the coefficients are placed in the file.

Furthermore, several JPEG images can be losslessly joined together, as long as the edges coincide with block boundaries.

PNG

One interesting development stemming from the popularity of the Internet is the effort toward more system-independent image formats. One such format is Portable Network Graphics (PNG). This standard is meant to supersede the GIF standard and extends it in important ways. The motivation for a new standard was in part the patent held by UNISYS and CompuServe on the LZW compression method. (Interestingly, the patent covers only compression, not decompression: this is why the UNIX gunzip utility can decompress LZW-compressed files.)

Special features of PNG files include support for up to 48 bits of color information, a large increase. Files may also contain gamma-correction information for correct display of color images and alpha-channel information for such uses as control of transparency. Instead of a progressive display based on widely separated rows, as in GIF images, PNG displays pixels progressively in a two-dimensional fashion, a few at a time, over seven passes through each 8 × 8 block of an image.
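The seven-pass scheme can be sketched in Python by labelling each pixel of an 8 × 8 block with the pass that delivers it. The start offsets and step sizes below are those commonly given for PNG's interlace method (known as Adam7); they are not stated in the text here, and the names ADAM7_PASSES and adam7_pass_map are hypothetical.

```python
# (x start, y start, x step, y step) for passes 1 through 7.
ADAM7_PASSES = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def adam7_pass_map(width=8, height=8):
    """Return a grid where each cell holds the pass number that sends it."""
    grid = [[0] * width for _ in range(height)]
    for number, (x0, y0, dx, dy) in enumerate(ADAM7_PASSES, start=1):
        for y in range(y0, height, dy):
            for x in range(x0, width, dx):
                grid[y][x] = number
    return grid


pass_map = adam7_pass_map()
for row in pass_map:
    print(" ".join(str(cell) for cell in row))
```

The first pass carries a single pixel per block and the last carries half of all pixels, so detail is refined in both directions at once rather than row by row.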

Comparison to Graphics Interchange Format (GIF)

1. On small images, GIF can achieve greater compression than PNG.

2. On most images, except for the above cases, GIF will be bigger than indexed PNG.

3. PNG gives a much wider range of transparency options than GIF, including alpha channel transparency.

4. Whereas GIF is limited to 8-bit indexed color, PNG gives a much wider range of color depths, including 24-bit (8 bits per channel) and 48-bit (16 bits per channel) truecolor, allowing for greater color precision, smoother fades, etc. When an alpha channel is added, up to 64 bits per pixel (before compression) are possible.

5. When converting an image from the PNG format to GIF, the image quality may suffer if the PNG image has more than 256 colors.

6. GIF intrinsically supports animated images. PNG supports animation only via unofficial extensions.
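The color-depth figures in the comparison follow from simple arithmetic, shown here as a short Python sketch. The function names are hypothetical and the 640 × 480 image size is only an example.

```python
def bits_per_pixel(bits_per_channel, color_channels=3, has_alpha=False):
    """Bits needed for one uncompressed pixel."""
    channels = color_channels + (1 if has_alpha else 0)
    return channels * bits_per_channel


def raw_size_bytes(width, height, bpp):
    """Uncompressed size of the pixel data, rounded up to whole bytes."""
    return (width * height * bpp + 7) // 8


truecolor_8 = bits_per_pixel(8)                       # 24 bits
truecolor_16 = bits_per_pixel(16)                     # 48 bits
truecolor_16_alpha = bits_per_pixel(16, has_alpha=True)  # 64 bits
example_size = raw_size_bytes(640, 480, truecolor_8)
```

A 640 × 480 image at 24 bits per pixel occupies 921,600 bytes before compression, three times the size of the same image stored as 8-bit indexes.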

PNG images are less widely supported by older browsers. In particular, IE6 has limited support for PNG. As users adopt newer browsers, this becomes less of an issue.

Comparison to JPEG

JPEG (Joint Photographic Experts Group) format can produce a smaller file than PNG for photographic (and photo-like) images, since JPEG uses a lossy encoding method specifically designed for photographic image data, which is typically dominated by soft, low-contrast transitions, and an amount of noise or similar irregular structures. Using PNG instead of a high-quality JPEG for such images would result in a large increase in file size with negligible gain in quality.

By contrast, when storing images that contain text, line art, or graphics (images with sharp transitions and large areas of solid color), the PNG format can compress image data more than JPEG can, and without the noticeable visual artifacts which JPEG produces around high-contrast areas. Where an image contains both sharp transitions and photographic parts, a choice must be made between the two effects. JPEG does not support transparency.

Because JPEG uses lossy compression, it suffers from generation loss, where repeatedly encoding and decoding an image progressively loses information and degrades the image. Because PNG is lossless, it is a suitable format for storing images to be edited. While PNG is reasonably efficient when compressing photographic images, there are lossless compression formats designed specifically for photographic images, lossless JPEG 2000 and Adobe DNG (Digital negative) for example. However these formats are either not widely supported or proprietary. An image can be saved into JPEG format for distribution so that the single pass of JPEG encoding will not noticeably degrade the image.

The PNG specification does not include a standard for embedded Exif image data from sources such as digital cameras. TIFF, JPEG 2000, and DNG support EXIF data.

Early web browsers did not support PNG images; JPEG and GIF were the main image formats. JPEG was commonly used when exporting images containing gradients for web pages, because of GIF's limited color depth. However, JPEG compression causes a gradient to blur slightly. A PNG file will reproduce a gradient as accurately as possible for a given bit depth, while keeping the file size small. PNG became the optimal choice for small gradient images as web browser support for the format improved.

Comparison to TIFF

Tagged Image File Format (TIFF) is a format that incorporates an extremely wide range of options. While this makes TIFF useful as a generic format for interchange between professional image editing applications, it makes adding support for it to applications a much bigger task and so it has little support in applications not concerned with image manipulation (such as web browsers). It also means that many applications can read only a subset of TIFF types, creating more potential user confusion.

The most common general-purpose, lossless compression algorithm used with TIFF is Lempel–Ziv–Welch (LZW). This compression technique, also used in GIF, was covered by patents until 2003. There is a TIFF variant that uses the same compression algorithm as PNG uses, but it is not supported by many proprietary programs. TIFF also offers special-purpose lossless compression algorithms like CCITT Group IV, which can compress bilevel images (e.g., faxes or black-and-white text) better than PNG's compression algorithm.

TIFF

TIFF is a flexible, adaptable file format for handling images and data within a single file, by including the header tags (size, definition, image-data arrangement, applied image compression) defining the image's geometry. For example, a TIFF file can be a container holding compressed (lossy) JPEG and (lossless) PackBits compressed images. A TIFF file also can include a vector-based clipping path (outlines, croppings, image frames). The ability to store image data in a lossless format makes a TIFF file a useful image archive, because, unlike standard JPEG files, a TIFF file using lossless compression (or none) may be edited and resaved without losing image quality. This is not the case when using the TIFF as a container holding compressed JPEG. Other TIFF options are layers and pages.

TIFF offers the option of using LZW compression, a lossless data compression technique for reducing a file's size. Until 2004, use of this option was limited because the LZW technique was under several patents. However, these patents have expired.
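The core idea of LZW can be illustrated with a simplified Python sketch. It emits a plain list of integer codes and leaves out details that real GIF and TIFF encoders add (variable-width codes, clear codes, and dictionary size limits), so it should be read as an outline of the technique rather than a compatible codec. The function names are hypothetical.

```python
def lzw_compress(data):
    """Encode bytes as a list of dictionary codes (simplified LZW)."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate              # keep extending the match
        else:
            codes.append(table[current])     # emit the longest known string
            table[candidate] = next_code     # learn the new, longer string
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes, regrowing the same dictionary."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the string being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


sample = b"ABABABABABAB"
encoded = lzw_compress(sample)
decoded = lzw_decompress(encoded)
```

Repetitive input such as the sample collapses to far fewer codes than bytes, and decoding needs no stored dictionary because both sides build it in the same order.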

EXIF

Exchange Image File (EXIF) is an image format for digital cameras. Initially developed in 1995, its current version was published in 2002 by the Japan Electronics and Information Technology Industries Association (JEITA). Compressed EXIF files use the baseline JPEG format. A variety of tags (many more than in TIFF) is available to facilitate higher quality printing, since information about the camera and picture-taking conditions (flash, exposure, light source, white balance, type of scene) can be stored and used by printers for possible color-correction algorithms. The EXIF standard also includes specification of file format for audio that accompanies digital images. It also supports tags for information needed for conversion to FlashPix (initially developed by Kodak).

Graphics Animation Files

A few dominant formats are aimed at storing graphics animations (i.e., series of drawings or graphic illustrations) as opposed to video (i.e., series of images). The difference is that animations are considerably less demanding of resources than video files. However, animation file formats can be used to store video information and indeed are sometimes used for such.

FLC is an important animation or moving picture file format; it was originally created by Animation Pro. Another format, FLI, is similar to FLC. GL produces somewhat better quality moving pictures. GL animations can also usually handle larger file sizes. Many older formats are used for animation, such as DL and Amiga IFF, as well as alternates such as Apple Quicktime. And, of course, there are also animated GIF89 files.

PS and PDF

PostScript is an important language for typesetting, and many high-end printers have a PostScript interpreter built into them. PostScript is a vector-based, rather than pixel-based, picture language: page elements are essentially defined in terms of vectors. With fonts defined this way, PostScript includes text as well as vector/structured graphics; bit-mapped images can also be included in output files. Encapsulated PostScript files add some information for including PostScript files in another document.

Several popular graphics programs, such as Illustrator and FreeHand, use PostScript. However, the PostScript page description language itself does not provide compression; in fact, PostScript files are just stored as ASCII. Therefore files are often large, and in academic settings, it is common for such files to be made available only after compression by some UNIX utility, such as compress or gzip.

Therefore, another text-plus-figures language has begun to supersede PostScript: Adobe Systems Inc. includes LZW (see Chapter) compression in its Portable Document Format (PDF) file format. As a consequence, PDF files that do not include images have about the same compression ratio, 2:1 or 3:1, as do files compressed with other LZW-based compression tools, such as UNIX compress or gzip, or PC-based winzip (a variety of pkzip).

For files containing images, PDF may achieve higher compression ratios by using separate JPEG compression for the image content (depending on the tools used to create original and compressed versions). The Adobe Acrobat PDF reader can also be configured to read documents structured as linked elements, with clickable content and handy summary tree-structured link diagrams provided.

Windows WMF

Windows MetaFile (WMF) is the native vector file format for the Microsoft Windows operating environment. WMF files actually consist of a collection of Graphics Device Interface (GDI) function calls, also native to the Windows environment. When a WMF file is "played" (typically using the Windows PlayMetaFile() function), the described graphic is rendered. WMF files are ostensibly device independent and unlimited in size.

Windows BMP

A bitmap or pixmap is a type of memory organization or image file format used to store digital images. The term bitmap comes from computer programming terminology, meaning just a map of bits, a spatially mapped array of bits. Now, along with pixmap, it commonly refers to the similar concept of a spatially mapped array of pixels. Raster images in general may be referred to as bitmaps or pixmaps, whether synthetic or photographic, in files or memory.

In some contexts, the term bitmap implies one bit per pixel, while pixmap is used for images with multiple bits per pixel. Many graphical user interfaces use bitmaps in their built-in graphics subsystems; for example, the Microsoft Windows and OS/2 platforms' GDI subsystem, where the specific format used is the Windows and OS/2 bitmap file format, usually named with the file extension of .BMP (or .DIB for device independent bitmap). Besides BMP, other file formats that store literal bitmaps include InterLeaved Bitmap (ILBM), Portable Bitmap (PBM), X Bitmap (XBM), and Wireless Application Protocol Bitmap (WBMP). Similarly, most other image file formats, such as JPEG, TIFF, PNG, and GIF, also store bitmap images (as opposed to vector graphics), but they are not usually referred to as bitmaps, since they use compressed formats internally.
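To make the idea of a literal, uncompressed bitmap concrete, here is a Python sketch that builds a 24-bit BMP file in memory. The layout used (a 14-byte file header, a 40-byte info header, rows stored bottom-up in blue-green-red order and padded to a multiple of 4 bytes) is the commonly documented uncompressed Windows variant and is an assumption not described in the text above; the function name build_bmp_24bit and the 2 × 2 test image are hypothetical.

```python
import struct


def build_bmp_24bit(rows):
    """Build BMP bytes from rows (top to bottom) of (R, G, B) tuples."""
    height = len(rows)
    width = len(rows[0])
    row_padding = (4 - (width * 3) % 4) % 4    # rows are 4-byte aligned
    pixel_data = bytearray()
    for row in reversed(rows):                 # bottom row is stored first
        for r, g, b in row:
            pixel_data += bytes((b, g, r))     # stored as blue, green, red
        pixel_data += b"\x00" * row_padding
    header_size = 14 + 40
    file_header = struct.pack(
        "<2sIHHI", b"BM", header_size + len(pixel_data), 0, 0, header_size
    )
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40, width, height,      # header size, image dimensions
        1, 24,                  # color planes, bits per pixel
        0, len(pixel_data),     # no compression, size of pixel data
        2835, 2835,             # about 72 dpi, in pixels per metre
        0, 0,                   # no palette
    )
    return file_header + info_header + bytes(pixel_data)


tiny_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
bmp_bytes = build_bmp_24bit(tiny_image)
```

Writing bmp_bytes to a file opened in binary mode should give a 70-byte image: 54 bytes of headers plus two padded 8-byte rows, with every pixel stored directly and no compression.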

Macintosh PAINT and PICT

PAINT was originally used in the MacPaint program, initially only for 1-bit monochrome images.

PICT is used in MacDraw (a vector-based drawing program) for storing structured graphics.

X Windows PPM

This is the graphics format for the X Windows System. Portable PixMap (PPM) supports 24-bit color bitmaps and can be manipulated using many public domain graphic editors, such as xv. It is used in the X Windows System for storing icons, pixmaps, backdrops, and so on.
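The PPM format is simple enough to write by hand. The Python sketch below produces the plain-text variant (magic number P3), assuming the usual layout of a header with width, height and maximum sample value followed by red, green and blue samples for each pixel; that layout is not given in the text above, and the function name build_ppm_ascii is hypothetical.

```python
def build_ppm_ascii(rows, max_value=255):
    """Return plain-text PPM (P3) content for rows of (R, G, B) tuples."""
    height = len(rows)
    width = len(rows[0])
    lines = ["P3", f"{width} {height}", str(max_value)]
    for row in rows:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"


ppm_text = build_ppm_ascii([
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
])
```

Saving ppm_text to a file with a .ppm extension should give a 2 × 2 image that PPM-aware viewers can open; with a maximum value of 255, each pixel carries 24 bits of color.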