3. Preview
• Image compression is the art and science of
reducing the amount of data required to represent
an image
• Example: data required for a two-hour standard
definition (SD) television movie using 720×480×24-bit
pixel arrays at 30 frames per second
• Answer is about 224 Gbytes (2.24 × 10^11 bytes)
• Compression is needed to save storage space and
reduce transmission time
• If you have an 8-megapixel camera, what is the
size of one uncompressed image?
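The storage figures above can be checked with a few lines of arithmetic. The sketch below assumes 30 frames per second for the SD movie, and 8,000,000 pixels at 24 bits per pixel for the camera image. The slide does not fix the camera's bit depth, so treat that value as an assumption.

```python
def uncompressed_bytes(width, height, bits_per_pixel, frames=1):
    """Raw size in bytes of `frames` images stored without compression."""
    return width * height * bits_per_pixel * frames // 8

# Two-hour SD movie: 720x480 pixels, 24 bits/pixel, 30 frames/s (assumed rate).
movie = uncompressed_bytes(720, 480, 24, frames=30 * 2 * 60 * 60)

# 8-megapixel still image, assuming 8,000,000 pixels at 24 bits/pixel.
photo = uncompressed_bytes(8_000_000, 1, 24)

print(f"movie: {movie / 1e9:.0f} GB, photo: {photo / 1e6:.0f} MB")
```

Under these assumptions the movie comes to roughly 224 GB and a single photo to 24 MB.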
4. Preview
• Image compression also has applications in many
other areas, such as televideo conferencing, remote
sensing, document and medical imaging, and
facsimile transmission (FAX)
5. Fundamentals
• Data compression refers to the process of reducing
the amount of data required to represent a given
quantity of information
• Data and Information are not the same thing
• Data are the means by which information is
conveyed
• Various amounts of data can be used to represent
the same amount of information
• Representations that contain irrelevant or repeated
information are said to contain redundant data
6. Fundamentals
• Relative data redundancy R
• R = 1 – 1/C, where C, commonly called the
compression ratio, is defined as
• C = b/b’, where b and b’ denote the number of bits
in two representations of the same information
• If C = 10, for instance, the larger representation
has 10 bits of data for every 1 bit of data in the
smaller representation
• The corresponding relative data redundancy of the
larger representation is R = 0.9, indicating that
90% of its data is redundant
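The two definitions above translate directly into code. This is a minimal sketch; the function names and the bit counts in the example are illustrative only.

```python
def compression_ratio(b, b_prime):
    """C = b / b', with b the larger and b' the smaller representation (bits)."""
    return b / b_prime

def relative_redundancy(c):
    """R = 1 - 1/C, the fraction of the larger representation that is redundant."""
    return 1 - 1 / c

# Hypothetical example: 80,000 bits reduced to 8,000 bits.
c = compression_ratio(80_000, 8_000)
r = relative_redundancy(c)
print(f"C = {c:.1f}, R = {r:.2f}")
```

A ratio of C = 1 gives R = 0 (no redundancy), and R approaches 1 as C grows.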
7. Principal types of data redundancies
• 1) Coding Redundancy 2) Spatial and temporal
redundancy 3) Irrelevant information
• 1) Coding Redundancy: A code is a system of
symbols (letters, numbers, bits) used to represent
a body of information or set of events
• Each piece of information or event is assigned a
sequence of code symbols, called a code word
• The number of symbols in each code word is its
length
• 2) Spatial and temporal redundancy: Pixels of most
2-D intensity arrays are correlated spatially (i.e.,
each pixel is similar to or dependent on
neighboring pixels)
8. Principal types of data redundancies
• Information is unnecessarily replicated in the
representations of the correlated pixels
• In a video sequence, temporally correlated pixels
also duplicate information
• 3) Irrelevant information: Most 2-D intensity arrays
contain information that is ignored by the human
visual system and/or is extraneous to the intended
use of the image. It is redundant in the sense that
it is not used
14. Measuring Image Information
• How few bits are actually needed to represent the
information in an image?
• Is there a minimum amount of data that is
sufficient to describe an image without losing
information?
• The answer is given by information theory
• A random event E with probability P(E) is said to
contain I(E) = log(1/P(E)) = -log P(E) units of
information
• If base 2 is selected, the unit of information is
the bit
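As a small illustration of the self-information formula with base 2, the sketch below computes I(E) in bits. The function name is illustrative.

```python
from math import log2

def self_information(p):
    """I(E) = log2(1 / P(E)) in bits, for an event with probability p."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return log2(1 / p)

# A certain event carries no information; rarer events carry more.
for p in (1.0, 0.5, 0.125):
    print(p, self_information(p))
```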
15. Measuring Image Information
• Given a source of statistically independent random
events from a discrete set of possible events
{a1, a2, …, aJ} with associated probabilities {P(a1),
P(a2), …, P(aJ)}, the average information per
source output, called the entropy of the source, is
H = - Σ_{j=1}^{J} P(aj) log P(aj)
• The aj are called the source symbols
• Because they are statistically independent, the
source itself is called a zero-memory source
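For an image, the probabilities P(aj) are usually estimated from the intensity histogram. The sketch below does this for a flat list of pixel values, treating the pixels as outputs of a zero-memory source. The function name and the tiny test inputs are illustrative, not taken from the slides.

```python
from collections import Counter
from math import log2

def entropy_bits(symbols):
    """First-order entropy estimate in bits/symbol from relative frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    # Each term is P(a) * log2(1 / P(a)), with P(a) = n / total.
    return sum((n / total) * log2(total / n) for n in counts.values())

print(entropy_bits([0, 0, 1, 1]))          # two equally likely intensities
print(entropy_bits(list(range(256))))      # 256 equally likely intensities
print(entropy_bits([7] * 100))             # a constant image
```

Equally likely intensities give the maximum entropy for a given number of levels, while a constant image gives zero.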
16. Measuring Image Information
• H for previous example (first image) is 1.6614
bits/pixel
• H for second image is 8 bits/pixel
• H for third image is 1.566 bits/pixel
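• Shannon’s first theorem (the noiseless coding
theorem): the average number of code symbols per
source symbol can be made arbitrarily close to H,
but not smaller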
20. Some Basic Compression Methods
• Arithmetic Coding
• Generates nonblock codes and is used to remove
coding redundancy
• A one-to-one correspondence between source
symbols and code words does not exist
• Instead, an entire sequence of source symbols is
assigned a single arithmetic code word
• As the number of symbols in the message increases,
the interval used to represent it becomes smaller
and the number of information units required to
represent the interval becomes larger
23. Arithmetic Coding
• In the example, three decimal digits are used to
represent the five-symbol message
• That is 0.6 decimal digits per source symbol
• The entropy of the source is 0.58 decimal digits
per source symbol
• As the length of the sequence being coded
increases, the resulting arithmetic code approaches
the bound established by Shannon’s first theorem
• Two disadvantages: 1) the addition of the
end-of-message indicator that is needed to separate
one message from another; 2) the use of
finite-precision arithmetic
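The interval narrowing behind arithmetic coding can be sketched as follows. The slides do not reproduce the worked example, so the four-symbol source with probabilities 0.2, 0.2, 0.4, 0.2 and the message a1 a2 a3 a3 a4 are assumptions, chosen because they are consistent with the 0.58 decimal digits per symbol quoted above. Exact fractions are used to sidestep the finite-precision issue; a practical coder would not do this.

```python
from fractions import Fraction

def arithmetic_interval(message, probabilities):
    """Return the final [low, high) interval that represents `message`."""
    # Give each symbol a sub-interval of [0, 1) proportional to its probability.
    ranges, start = {}, Fraction(0)
    for symbol, p in probabilities.items():
        ranges[symbol] = (start, start + p)
        start += p

    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        s_low, s_high = ranges[symbol]
        # Narrow the current interval to the sub-interval of this symbol.
        low, high = low + width * s_low, low + width * s_high
    return low, high

# Assumed source model and message (not given in the slides).
probs = {"a1": Fraction(1, 5), "a2": Fraction(1, 5),
         "a3": Fraction(2, 5), "a4": Fraction(1, 5)}
low, high = arithmetic_interval(["a1", "a2", "a3", "a3", "a4"], probs)
print(float(low), float(high))
```

Any number inside the final interval identifies the message. With these assumed inputs, the three-digit value 0.068 falls inside it, which matches the "three decimal digits for five symbols" figure.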
24. LZW Coding
• Addresses spatial redundancies in an image
• The technique, called Lempel-Ziv-Welch (LZW)
coding, assigns fixed length code words to variable
length sequences of source symbols
• Probabilities are not required
• LZW was protected under a United States patent.
It has nevertheless been integrated into a variety of
mainstream imaging file formats, including GIF,
TIFF, and PDF. The PNG format was created to get
around LZW licensing requirements
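A minimal sketch of the LZW encoding step for 8-bit pixels is shown below. It starts with a dictionary holding the 256 single intensities and adds each newly seen sequence as it goes, so no probabilities are needed. The function name and the 4×4 test image are illustrative, and the sketch leaves out details a real codec needs, such as limiting the dictionary size and packing codes into fixed-length bit fields.

```python
def lzw_encode(pixels):
    """Encode a sequence of 8-bit values as a list of LZW dictionary codes."""
    dictionary = {(i,): i for i in range(256)}  # codes 0-255: single intensities
    next_code = 256
    current = ()          # longest sequence recognised so far
    output = []
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate                  # keep extending the match
        else:
            output.append(dictionary[current])   # emit code for the known part
            dictionary[candidate] = next_code    # learn the new sequence
            next_code += 1
            current = (p,)
    if current:
        output.append(dictionary[current])       # flush the final match
    return output

# Hypothetical 4x4 image with a strong vertical pattern, scanned row by row.
image = [39, 39, 126, 126] * 4
codes = lzw_encode(image)
print(len(image), "pixels ->", len(codes), "codes:", codes)
```

Codes of 256 and above stand for multi-pixel sequences, which is where the reduction comes from.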
27. Run Length Coding
• Images with repeating intensities along their rows (or columns)
can often be compressed by representing runs of identical
intensities as run-length pairs, where each run-length pair
specifies the start of a new intensity and the number of
consecutive pixels that have that intensity
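The run-length pairs described above can be sketched for a single image row as follows. The (intensity, count) pair layout and the function names are illustrative choices.

```python
def run_length_encode(row):
    """Encode a row of intensities as (intensity, run length) pairs."""
    pairs = []
    for value in row:
        if pairs and pairs[-1][0] == value:
            pairs[-1][1] += 1          # extend the current run
        else:
            pairs.append([value, 1])   # start a new run
    return [tuple(pair) for pair in pairs]

def run_length_decode(pairs):
    """Expand (intensity, run length) pairs back into a row of intensities."""
    return [value for value, count in pairs for _ in range(count)]

row = [5, 5, 5, 5, 9, 9, 0]
pairs = run_length_encode(row)
print(pairs)
print(run_length_decode(pairs) == row)
```

A row with no repeated neighbors produces one pair per pixel, so the encoded form holds twice as many numbers as the original row.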
28. Run Length Coding
• Run-length encoding (RLE) was developed in the 1950s and is
used in FAX coding
• Compression is achieved by eliminating a simple form of spatial
redundancy: groups of identical intensities
• When there are few (or no) runs of identical pixels, run-length
encoding results in data expansion
• The BMP file format uses a form of run-length encoding in which
image data is represented in two different modes: encoded and
absolute
• Either mode can occur anywhere in the image
• In encoded mode, a two-byte RLE representation is used. The
first byte specifies the number of consecutive pixels that have
the color index contained in the second byte. The 8-bit color
index selects the run’s intensity from a table of 256 possible
intensities
29. Run Length Coding
• In absolute mode, the first byte is 0 and the second byte signals
one of four possible conditions, as shown in the table
• When the second byte is 0 or 1, the end of a line or the end of
the image has been reached
• If it is 2, the next two bytes contain unsigned horizontal and
vertical offsets to a new spatial position (and pixel) in the image
• If it is between 3 and 255, it specifies the number of
uncompressed pixels that follow, each given by its own color index
Second byte value    Condition
0                    End of line
1                    End of image
2                    Move to a new position
3-255                Specify pixels individually
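The two BMP modes can be put together in a small decoder sketch. Two details here go beyond the slides and should be read as assumptions: absolute-mode pixel runs are taken to be padded to an even number of bytes, and rows are returned in the order they are decoded, ignoring the bottom-up row order that BMP files normally use. The function name `decode_rle8` is hypothetical.

```python
def decode_rle8(data, width, height):
    """Decode BMP-style 8-bit RLE bytes into rows of color indices."""
    rows = [[0] * width for _ in range(height)]
    x = y = i = 0

    def put(index):
        nonlocal x
        if x < width and y < height:   # ignore writes outside the image
            rows[y][x] = index
        x += 1

    while i + 1 < len(data):
        first, second = data[i], data[i + 1]
        i += 2
        if first > 0:                  # encoded mode: `first` copies of `second`
            for _ in range(first):
                put(second)
        elif second == 0:              # end of line
            x, y = 0, y + 1
        elif second == 1:              # end of image
            break
        elif second == 2:              # move: next two bytes are x and y offsets
            x, y = x + data[i], y + data[i + 1]
            i += 2
        else:                          # 3-255: that many literal color indices
            for k in range(second):
                put(data[i + k])
            i += second + (second & 1)  # assumed padding to an even byte count
    return rows

# Hypothetical stream: 3x index 7, literals 1 2 3 (+pad), EOL, 4x index 9, end.
stream = bytes([3, 7, 0, 3, 1, 2, 3, 0, 0, 0, 4, 9, 0, 1])
decoded = decode_rle8(stream, width=6, height=2)
print(decoded)
```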
30. Run Length Coding
• Run-length coding is particularly effective when compressing
binary images
• Additional compression can be achieved by variable-length
coding the run lengths themselves
• The approximate run-length entropy of the image is then
H_RL = (H0 + H1) / (L0 + L1)
• where H0 and H1 are the entropies of the black and white run
lengths, and L0 and L1 denote the average values of the black
and white run lengths, respectively
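The run-length entropy estimate can be sketched for a small binary image as follows. The sketch assumes that 0 means black and 1 means white, that runs do not continue across row boundaries, and that the image contains at least one run of each color. The function names and test image are illustrative.

```python
from collections import Counter
from math import log2

def entropy_bits(values):
    """Entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return sum((n / total) * log2(total / n) for n in counts.values())

def run_length_entropy(rows):
    """Approximate H_RL = (H0 + H1) / (L0 + L1) in bits/pixel."""
    runs = {0: [], 1: []}              # run lengths: 0 = black, 1 = white
    for row in rows:
        start = 0
        for i in range(1, len(row) + 1):
            if i == len(row) or row[i] != row[start]:
                runs[row[start]].append(i - start)   # close the current run
                start = i
    h0, h1 = entropy_bits(runs[0]), entropy_bits(runs[1])
    l0 = sum(runs[0]) / len(runs[0])   # average black run length
    l1 = sum(runs[1]) / len(runs[1])   # average white run length
    return (h0 + h1) / (l0 + l1)

image = [[0, 0, 1, 1, 1, 1],
         [0, 0, 0, 0, 1, 1]]
print(run_length_entropy(image))
```

Here both colors have runs of length 2 and 4, so H0 = H1 = 1 bit and L0 = L1 = 3 pixels, giving one third of a bit per pixel.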
31. Types of File Formats
• The simplest way of storing image data is to use a 2D array
of pixel intensities. This is referred to as a bitmap.
• Another way of encoding images is to use vector
graphics, where the image is stored as a collection of
vectors.
• Popular file formats are listed below:
• 1) GIF (Graphics Interchange Format) 2) JPEG (Joint
Photographic Experts Group) 3) PNG (Portable Network
Graphics) 4) DICOM (Digital Imaging and Communications in
Medicine) 5) SVG (Scalable Vector Graphics) 6) TIFF (Tagged
Image File Format)
32. Types of File Formats
• 1) GIF: Uses the lossless LZW compression technique
• Supports up to 256 colors (8-bit palette); within that
palette, image quality is preserved
• Files are small, and the format is good at displaying flat
color areas
• Also supports animation
• Can store multiple images and, using timing information,
build animations in which the static images play
in sequence, creating the illusion of motion
• 2) JPEG: Used for storing continuous-tone images
• Provides lossy and lossless compression
33. Types of File Formats
• JPEG uses the DCT for compression; the related JPEG-2000
standard uses the DWT
• Common format for storing and transmitting photographic
images on the World Wide Web
• 3) PNG: Designed specifically for transmitting images on the
Web
• Supports grayscale and RGB images
• Supports transparency and interlacing
• One useful feature of PNG is its built-in text capabilities for
image indexing, allowing storage of text within the file itself
34. Types of File Formats
• 4) DICOM : Popular format in medical imaging
• Contains image data and also metadata such as patient
details, equipment, and acquisition details
• Also defines communication protocols for exchanging
images between devices
• 5) SVG : It is a vector graphics file format that enables 2D
images to be displayed on the web
• Images scale to the viewing window, adjusting in
size and resolution according to the window in which they
are displayed
• 6) TIFF: A flexible file format supporting a variety of image
compression standards, including JPEG, JPEG-LS,
JPEG-2000, and others