Source Coding
                  Wireless Ad Hoc Networks
           University of Tehran, Dept. of E&CE,
                                        Fall 2007,
                                   Farshad Lahouti




Media Basics

  Contents:
   Brief introduction to digital media
   (Audio/Video)
      Digitization
      Compression
      Representation
      Standards




Signal Digitization




     Pulse Code Modulation (PCM)




Sampling

 Sampling theory – Nyquist theorem
 the discrete time sequence of a sampled continuous
 function { V(tn) } contains enough information to
 reproduce the function V=V(t) exactly provided that the
 sampling rate is at least twice that of the highest
 frequency contained in the original signal V(t)


 Analog signal sampled at a constant rate
   telephone: 4 kHz signal BW, 8,000 samples/sec
   CD music: 22 kHz signal BW, 44,100 samples/sec




Quantization

 Discretization along the amplitude axis
   At every sampling instant the signal value is converted to a digital
   equivalent
   Using 2 bits (4 levels), the following signal can be digitized
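As a rough illustration of quantization, the sketch below maps sample values to 2-bit codes with a uniform quantizer. The input range [-1, 1], the midpoint reconstruction, and the function names are assumptions made for this example, not something the slides specify.

```python
def quantize(v, bits, vmin=-1.0, vmax=1.0):
    """Map v in [vmin, vmax] to one of 2**bits level indices."""
    levels = 2 ** bits
    step = (vmax - vmin) / levels
    index = int((v - vmin) / step)
    return min(max(index, 0), levels - 1)  # clamp so v == vmax stays in range

def reconstruct(index, bits, vmin=-1.0, vmax=1.0):
    """Return the midpoint of the interval that the index stands for."""
    step = (vmax - vmin) / (2 ** bits)
    return vmin + (index + 0.5) * step

samples = [-0.9, -0.2, 0.3, 0.95]          # made-up sample values
codes = [quantize(v, bits=2) for v in samples]
print(codes)
print([reconstruct(c, bits=2) for c in codes])
```

The gap between each sample and its reconstructed value is the quantization error; adding bits shrinks the step and therefore the error.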




Digitization Examples
 Each sample quantized, i.e., rounded
   e.g., 2^8 = 256 possible quantized values
 Each quantized value represented by bits
   8 bits for 256 values
 Example: 8,000 samples/sec, 256 quantized values --> 64,000 bps
 Receiver converts it back to analog signal:
   some quality reduction
 Example rates
   CD: 1.411 Mbps (16 bits/sample, stereo)
   Internet telephony: 5.3 - 13 kbps
   MP3: 96, 128, 160 kbps




Approximate size for 1 second audio

Channels   Resolution   Fs         Size (kbit)

Mono       8 bit        8 kHz      64
Stereo     8 bit        8 kHz      128
Mono       16 bit       8 kHz      128
Stereo     16 bit       16 kHz     512
Stereo     16 bit       44.1 kHz   1411*
Stereo     24 bit       44.1 kHz   2117

 * 1 CD: 700 MB, 70-80 minutes
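The uncompressed PCM rate is simply channels x bits per sample x sampling rate. The short sketch below recomputes a few rates that way; the function name is an assumption for illustration.

```python
def pcm_bits_per_second(channels, bits_per_sample, sample_rate_hz):
    """Raw PCM bit rate: every channel carries one sample per sampling period."""
    return channels * bits_per_sample * sample_rate_hz

print(pcm_bits_per_second(1, 8, 8_000))     # telephone-style PCM
print(pcm_bits_per_second(2, 16, 16_000))   # 16-bit stereo at 16 kHz
print(pcm_bits_per_second(2, 16, 44_100))   # CD audio
```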




 Lossy and lossless Compression

   Lossless compression (more later)
          Data Compression
          APE (MonkeyAudio)
          Image compression for biomedical applications
          …
   Lossy compression
           Hide errors where humans will not see or hear them
           Study the hearing and vision systems to understand how
         we see/hear
             Perceptual Coding




Requirements for Compression Algorithms

 Lossless compression
   Decoded signal is mathematically identical to the original one
   Drawback: achieves only a small or modest level of compression
 Lossy compression
   Decoded signal is of a lower quality than the original one
   Advantage: achieves a very high degree of compression
   Objective: maximize the degree of compression at a given quality
General compression requirements
   Ensure a good quality of decoded signal
   Achieve high compression ratios
   Minimize the complexity of the encoding and decoding process
   Support multiple channels
   Support various data rates
   Give small delay in processing




Compression Tools

 Transform Coding
 Variable Rate Coding
     Entropy Coding
     Huffman Coding
     Run-length Coding
 Predictive Coding
     DPCM
     ADPCM




Variable Length Coding
  Ignores semantics of input data and compresses media
  streams by regarding them as sequences of digits or
  symbols
     Examples: run-length encoding, Huffman encoding, ...
  Run-length encoding
     A compression technique that replaces consecutive
     occurrences of a symbol with the symbol and the
     number of times it is repeated
        a a a a a => 5a
        000000000000000000001111111 => 0x20 1x7
     Most useful where symbols appear in long runs, e.g., in
     images with large areas of identical pixel values, such
     as faxes and cartoons.
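A minimal sketch of run-length encoding, assuming the input is a string and each run is stored as a (symbol, count) pair; the function names are illustrative only.

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical symbols into a (symbol, run length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)

bits = "0" * 20 + "1" * 7
print(rle_encode("aaaaa"))
print(rle_encode(bits))
```

On data without long runs the pair list is longer than the input, which is why the technique suits faxes and cartoons rather than noisy signals.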




Entropy coding
A few words about Entropy

  Entropy
      A measure of information content
  Entropy of the English Language
      How much information does each
      character in “typical” English text
      contain?

  From a probability view
  If the probability of a binary event is 0.5 (like
  a coin), then, on average, you need one bit
  to represent the result of this event.
  As the probability of a binary event moves away
  from 0.5 in either direction, the number of bits
  you need, on average, to represent the result decreases
  [Figure] The figure is expressing that unless an
  event is totally random, you can convey the information
  of the event in fewer bits, on average, than it might
  first appear
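The behaviour described above can be checked numerically with the binary entropy function H(p) = -p log2 p - (1-p) log2 (1-p). This is a small sketch; the function name is an assumption.

```python
import math

def binary_entropy(p):
    """Average bits needed per outcome of a binary event with probability p."""
    if p in (0.0, 1.0):
        return 0.0                      # a certain outcome carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.5, 0.7, 0.9, 0.99):
    print(p, binary_entropy(p))
```

The value peaks at one bit for p = 0.5 and falls symmetrically toward zero as p approaches 0 or 1.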




Entropy (Shannon 1948)
  For a set of messages S with probability p(s), s ∈ S, the
  self information of s is:

      \[ i(s) = \log \frac{1}{p(s)} = -\log p(s) \]

  measured in bits if the log is base 2.
  The lower the probability, the higher the self-information
  Entropy is the weighted average of self information.

      \[ H(S) = \sum_{s \in S} p(s) \log \frac{1}{p(s)} \]




Entropy Example
     p(S) = {0.25, 0.25, 0.25, 0.125, 0.125}
     H(S) = 3 × 0.25 log2 4 + 2 × 0.125 log2 8 = 2.25 bits

     p(S) = {0.5, 0.125, 0.125, 0.125, 0.125}
     H(S) = 0.5 log2 2 + 4 × 0.125 log2 8 = 2 bits

     p(S) = {0.75, 0.0625, 0.0625, 0.0625, 0.0625}
     H(S) = 0.75 log2 (4/3) + 4 × 0.0625 log2 16 ≈ 1.3 bits
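The three results can be reproduced with a few lines of code. This is a sketch assuming base-2 logarithms; the function name is illustrative.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: the probability-weighted average self information."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

h1 = entropy([0.25, 0.25, 0.25, 0.125, 0.125])
h2 = entropy([0.5, 0.125, 0.125, 0.125, 0.125])
h3 = entropy([0.75, 0.0625, 0.0625, 0.0625, 0.0625])
print(h1, h2, h3)
```

The more skewed the distribution, the lower the entropy, even though all three sources have five symbols.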




Statistical (Entropy) Coding
   Entropy Coding
    • Lossless coding
    • Takes advantage of the probabilistic nature of information
    • Example: Huffman coding, arithmetic coding
Theorem (Shannon)
    (lower bound): For any probability distribution p(S) with
  associated uniquely decodable code C,

      \[ H(S) \le l_a(C) \]

  where l_a(C) is the average codeword length of C.

           Recall Huffman coding…




Huffman Coding
 A popular compression technique that assigns variable length
 codes to symbols, so that the most frequently occurring symbols
 have the shortest codes
 Huffman coding is particularly effective where the data are
 dominated by a small number of symbols
 Suppose we want to encode a source of N = 8 symbols: {a,b,c,d,e,f,g,h}
 The probabilities of these symbols are: P(a) = 0.01, P(b)=0.02,
 P(c)=0.05, P(d)=0.09, P(e)=0.18, P(f)=0.2, P(g)=0.2, P(h)=0.25
 If we assign 3 bits per symbol (N = 2^3 = 8), the average length of the
 symbols is:
      \[ L = \sum_{i=1}^{N} 3\,P(i) = 3 \text{ bits/symbol} \]
 The theoretical lowest average length is the entropy:
      \[ H(P) = -\sum_{i=1}^{N} P(i)\log_2 P(i) \approx 2.58 \text{ bits/symbol} \]
 If we use Huffman encoding, the average length = 2.63 bits/symbol




Huffman Coding (Cont’d)

 The Huffman code assignment procedure is based on a binary tree
 structure. This tree is developed by a sequence of pairing operations
 in which the two least probable symbols are joined at a node to form
 two branches of a tree. More precisely:
     1. The probabilities of the source symbols are associated
     with the leaves of a binary tree.
     2. Take the two smallest probabilities in the list and generate an
     intermediate node as their parent and label the branch from
     parent to one of the child nodes 1 and the branch from parent to
     the other child 0.
     3. Replace the probabilities and associated nodes in the list by the
     single new intermediate node with the sum of the two probabilities.
     If the list contains only one element, quit. Otherwise, go to step 2.
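The pairing procedure above can be sketched with a priority queue. The probabilities are those of the eight-symbol example; the function name, the tie-breaking counter, and the choice of which child gets 0 or 1 are assumptions, so individual codewords may differ from a hand-built tree while the average length stays the same.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Return {symbol: bitstring} by repeatedly merging the two least probable nodes."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)   # smallest probability
        p1, _, codes1 = heapq.heappop(heap)   # second smallest
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"a": 0.01, "b": 0.02, "c": 0.05, "d": 0.09,
         "e": 0.18, "f": 0.2, "g": 0.2, "h": 0.25}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(round(avg_len, 2))
```

Rare symbols such as a and b end up with the longest codewords, frequent ones such as h with the shortest.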




Huffman Coding (Cont’d)

 The new average length of the source is 2.63 bits/symbol
 The efficiency of this code is the ratio of the entropy to the
 average length, H(P)/L, about 98%
 How do we estimate the P(i)? By the relative frequency of the symbols
 How to decode the bit stream? Encoder and decoder share the same Huffman table
 How to decode the variable length codes? Prefix codes have the
 property that no codeword can be the prefix (i.e., an initial segment)
 of any other codeword. Huffman codes are prefix codes!
     11010000000010001 => ?
 Is even the best possible code guaranteed to always reduce the size of
 a source? No. Worst cases exist. Huffman coding is better on average.
 Huffman coding is particularly effective where the data are dominated
 by a small number of symbols
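Because no codeword is a prefix of another, a decoder can read bits one at a time and emit a symbol as soon as the bits read so far match a codeword. The sketch below uses a small hypothetical code table, not the table from the slides (which did not survive extraction), so it does not answer the exercise above.

```python
def decode_prefix(bits, table):
    """Decode a bit string with a prefix code given as {symbol: codeword}."""
    inverse = {codeword: symbol for symbol, codeword in table.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:          # a full codeword has been read
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(decoded)

table = {"a": "0", "b": "10", "c": "110", "d": "111"}  # hypothetical prefix code
print(decode_prefix("0101100111", table))
```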




Transform Coding
 Frequency analysis ?
    Time domain ? Not easy!
   Time domain -> Transform domain
        Sequence to be coded is converted into new sequence
        using a transformation rule.
        New sequence - transform coefficients.
        Process is reversible - get back to original sequence
        using inverse transformation.
    Example - Fourier transform (FT)
    Coefficients represent proportion of energy
    contributed by different frequencies.




Transform Coding (Cont…)
  In transform coding - choose transformation such that
  only subset of coefficients have significant values.
  Energy confined to subset of ‘important’ coefficients.
  Known as ‘energy compaction’.
  Example - FT of bandlimited signal:
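As a rough numerical illustration of energy compaction, the sketch below takes the discrete Fourier transform of a made-up signal built from two low-frequency cosines and measures how much of the energy sits in the few largest coefficients. The signal, its length, and the function names are assumptions for this example.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def energy_in_largest(coeffs, m):
    """Fraction of total energy held by the m largest-magnitude coefficients."""
    energies = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    return sum(energies[:m]) / sum(energies)

n = 64
signal = [math.cos(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 5 * t / n)
          for t in range(n)]
coeffs = dft(signal)
print(energy_in_largest(coeffs, 4))
```

All 64 time-domain samples are non-trivial, yet four transform coefficients carry essentially all of the energy, so only those need to be coded accurately.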




Differential Coding – DPCM & ADPCM

Based on the fact that neighboring samples … x(n-1), x(n),
x(n+1), … in a discrete time sequence change slowly in
many applications, e.g., voice, audio, …
A differential PCM coder (DPCM) quantizes and encodes the
difference d(n) = x(n) - x(n-1)
Advantage of using the difference d(n) instead of the actual
value x(n)
    Reduces the number of bits needed to represent a sample
General DPCM: d(n) = x(n) - a_1 x(n-1) - a_2 x(n-2) - … - a_k x(n-k)
                  a_1, a_2, …, a_k are fixed
Adaptive DPCM: a_1, a_2, …, a_k are changed dynamically with the
signal
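A minimal first-order DPCM sketch follows. It assumes a uniform quantizer with a configurable step and, as is common practice, predicts from the previously reconstructed sample so that encoder and decoder stay in step; with step 1 and integer samples this coincides with d(n) = x(n) - x(n-1). Names and sample values are illustrative.

```python
def dpcm_encode(samples, step=1):
    """Quantize the difference between each sample and the previous reconstruction."""
    codes, prev = [], 0
    for x in samples:
        q = round((x - prev) / step)   # quantized difference index
        codes.append(q)
        prev += q * step               # track what the decoder will reconstruct
    return codes

def dpcm_decode(codes, step=1):
    """Rebuild samples by accumulating the dequantized differences."""
    out, prev = [], 0
    for q in codes:
        prev += q * step
        out.append(prev)
    return out

samples = [100, 102, 105, 104, 101, 101, 103]   # made-up slowly varying signal
codes = dpcm_encode(samples)
print(codes)
print(dpcm_decode(codes))
```

After the first value, the codes are small numbers that need far fewer bits than the samples themselves. A larger step makes the scheme lossy, with reconstruction error bounded by half the step.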




Psychoacoustic
Human aural response




Psychoacoustic Model
   Basically: If you can’t hear the sound, don’t encode it
   Natural Bandlimiting
       Human hearing spans 20 Hz to 20 kHz, but most sounds are in
      the low frequencies (e.g., 2 kHz to 4 kHz)
   Human frequency response:
      Frequency masking: If a stronger sound and weaker
      sound compete, you can’t hear the weaker sound. Don’t
      encode it.
      Temporal masking: After a loud sound, there’s a while
      before we can hear a soft sound.
      Stereo redundancy: At low frequencies, we can’t detect
      where the sound is coming from. Encode it mono.




Perceptual Coding: Examples
 MP3 = MPEG 1/2 layer 3 audio; achieves CD quality
 in about 192 kbps (roughly 7.3:1 relative to the 1.411 Mbps
 stereo CD rate); higher compression possible
 Sony MiniDisc uses Adaptive Transform Acoustic Coding
 (ATRAC) to achieve a 5:1 compression ratio (about
 141 kbps per channel)


                     http://www.mpeg.org
             http://www.minidisc.org/aes_atrac.html




Artefacts of compression
 Some areas of the spectrum are lost in the
 encoding process
   MP3 encoded recordings rarely sound identical to
   original uncompressed audio files
   On small or PC speakers, however, MP3 compressed
   audio can be acceptable




Examples
 [Audio samples: original (1.12MB); 128kbps (105KB); 96kbps (78.9KB); 64kbps (52.6KB)]
 [Figures: WAV File (34Mb); Mp3 file (3Mb)]




    LPC and Parametric Coding

LPC (Linear Predictive Coding)
  Based on a model of the human vocal tract
          s(n) = a_1 s(n-1) + a_2 s(n-2) + … + a_k s(n-k) + e(n)
  Estimate a_1, a_2, …, a_k and e(n) for each piece (frame) of
  speech
  Encode and transmit/store a_1, a_2, …, a_k and the type of e(n)
  Decoder reproduces speech using a_1, a_2, …, a_k and e(n)
   - very low bit rate but relatively low speech quality
Parametric coding:
  Only the parameters of a sound generation model are coded
  LPC is an example where the parameters are a_1, a_2, …, a_k, e(n)
  Music instrument parameters: pitch, loudness, timbre, …
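The decoder side of the LPC model can be sketched directly from the recursion s(n) = a_1 s(n-1) + … + a_k s(n-k) + e(n). The coefficient and excitation values below are made up, samples before the start of the frame are assumed to be zero, and coefficient estimation is not shown.

```python
def lpc_synthesize(coeffs, excitation):
    """Run the LPC recursion: each output is a weighted sum of past outputs plus e(n)."""
    s = []
    for n, e in enumerate(excitation):
        past = sum(a * s[n - i]
                   for i, a in enumerate(coeffs, start=1) if n - i >= 0)
        s.append(past + e)
    return s

# One coefficient and an impulse excitation: the output decays geometrically.
print(lpc_synthesize([0.5], [1.0, 0.0, 0.0, 0.0]))
```

Only the few coefficients and a description of the excitation need to be transmitted per frame, which is where the very low bit rate comes from.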




Speech Compression

 Handling speech together with other media such as
 text, images, video, and data is an essential part of
 multimedia applications
 The ideal speech coder has a low bit-rate, high perceived
 quality, low signal delay, and low complexity.
 Delay
    Less than 150 ms one-way end-to-end delay for a
    conversation
    Processing (coding) delay, network delay
    Over Internet, ISDN, PSTN, ATM, …
 Complexity
    Computational complexity of speech coders depends on
    algorithms
    Contributes to achievable bit-rate and processing delay




G.72x Speech Coding Standards

   [Figure: quality versus bit-rate]
   Quality
      “intelligible” -> “natural” or “subjective” quality
      Depending on bit-rate
   Bit-rate




G.72x Audio Coding Standards

 Silence Compression - detect the "silence", similar to
 run-length coding
 Adaptive Differential Pulse Code Modulation (ADPCM)
 e.g., in CCITT G.721 -- 16 or 32 Kb/s.
     (a) Encodes the difference between two or more
     consecutive samples; the difference is then quantized,
     hence the loss
     (b) Adapts the quantization so that fewer bits are used when
     the value is smaller.
     It is necessary to predict where the waveform is headed,
     which is difficult
 Linear Predictive Coding (LPC) fits the signal to a speech
 model and then transmits the parameters of the model --> sounds
 like a computer talking, 2.4 Kb/s.




Video Digitization and Compression
   Video is a sequence of images (frames) displayed at a
   constant frame rate
      e.g. 24 images/sec
   Digital image is a 2-D array of pixels
   Each pixel represented by bits
      R:G:B
      Y:U:V
         Y = 0.299R + 0.587G + 0.114B (Luminance or Brightness)
         U = B - Y (Chrominance 1, color difference)
         V = R - Y (Chrominance 2, color difference)
   Redundancy
      spatial
      Temporal
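The colour conversion listed above can be written out directly. This sketch uses the unscaled differences U = B - Y and V = R - Y exactly as given; practical systems usually scale and offset them, which is not shown here.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to luminance Y and colour differences U, V."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y

print(rgb_to_yuv(255, 255, 255))   # white: full luminance, no colour difference
print(rgb_to_yuv(255, 0, 0))       # pure red
```

Grey pixels (R = G = B) give U = V = 0, so all of their information sits in the Y plane.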




Intra-frame coding

             Transform -> Quantize -> Encode

          JPEG (Joint Photographic Experts Group)
Original size
     640x480x3 bytes = 922 KB
JPEG Compression Ratios:
   30:1 to 50:1 compression is possible with small to moderate defects
   100:1 compression is quite feasible for low-quality purposes




    JPEG Steps
1 Block Preparation:
     From RGB to YUV (YIQ) planes
     8x8 blocks
2 Transform:
     2-D Discrete Cosine Transform (DCT) on blocks (lossy?)
3 Quantization:
     Quantize DCT Coefficients (lossy)
4 Encoding of Quantized Coefficients (lossless)
     Zigzag Scan
     Differential Pulse Code Modulation (DPCM) on DC component
     Run Length Encoding (RLE) on AC Components
     Entropy Coding: Huffman or Arithmetic




JPEG
 [Figure: Block Preparation -> Transform -> Quantize -> Encode]
 Decompression: reverse the order




 (1) Block Preparation




     RGB Input Data                       After Block Preparation

 Input image: 640 x 480 RGB (24 bits/pixel) transformed to three planes:
     Y: (640 x 480, 8-bit/pixel) Luminance (brightness) plane.
     U, V: (320 X 240 8-bits/pixel) Chrominance (color) planes.




(2) Discrete Cosine Transform (DCT)
A transformation from spatial domain to frequency domain (similar to FFT)
Definition of 8-point DCT:




F[0,0] is the DC component and other F[u,v] define AC components of DCT
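The slide's formula did not survive extraction. The sketch below assumes the standard JPEG form of the 8x8 DCT, F[u,v] = (1/4) C(u) C(v) Σ_x Σ_y f(x,y) cos((2x+1)uπ/16) cos((2y+1)vπ/16), with C(0) = 1/√2 and C(k) = 1 otherwise. It is a direct, unoptimized evaluation for illustration.

```python
import math

N = 8

def dct_2d(block):
    """Direct 8x8 forward DCT (JPEG normalization) of a list-of-lists block."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

flat = [[100.0] * N for _ in range(N)]   # a uniform block has only a DC component
F = dct_2d(flat)
print(round(F[0][0], 6))
```

For a flat block every AC coefficient is zero (up to rounding error), which is the extreme case of the energy compaction that makes the DCT useful here.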




 The 64 (8 x 8) DCT Basis Functions
 [Figure: grid of basis functions with axes u and v; DC component marked]

       Block-based 2-D DCT
       • Karhunen-Loeve (KL) transform ?




8x8 DCT Example
 [Figure: original values of an 8x8 block (in spatial domain) and the
 corresponding DCT coefficients (in frequency domain), axes u and v,
 DC component marked]




(3) Quantized DCT Coefficients
 Uniform quantization:
 Divide by constant N and round result.
 In JPEG, each DCT F[u,v] is divided by
 a constant q(u,v).
  - quantization table (filter ?)
 [Figure: F[u,v] and q(u,v) give the rounded F[u,v]/q(u,v)]
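Table-based quantization is an element-wise divide and round. The sketch below uses a tiny 2x2 excerpt with made-up coefficients and step sizes purely for illustration; a real JPEG table is 8x8.

```python
def quantize_block(F, q):
    """Divide each coefficient by its table entry and round to an integer.
    Note: Python's round() sends exact halves to the nearest even integer."""
    return [[round(f / step) for f, step in zip(f_row, q_row)]
            for f_row, q_row in zip(F, q)]

def dequantize_block(Fq, q):
    """Decoder side: multiply back by the table entry; the rounding loss remains."""
    return [[level * step for level, step in zip(f_row, q_row)]
            for f_row, q_row in zip(Fq, q)]

F = [[-415.0, 30.0], [4.0, -2.0]]   # made-up DCT coefficients
q = [[16, 11], [12, 14]]            # made-up quantization steps
Fq = quantize_block(F, q)
print(Fq)
print(dequantize_block(Fq, q))
```

Small coefficients collapse to zero, which is both where the loss occurs and what later makes run-length coding effective.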




(4) Zigzag Scan
 Maps an 8x8 block into a 1 x 64 vector
 The zigzag pattern groups the low-frequency coefficients at the top of the vector.
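One way to generate the zigzag order is to walk the anti-diagonals of the block and alternate direction on each. The sketch below does that and applies it to a sample 8x8 block of quantized coefficients; function and variable names are illustrative.

```python
def zigzag_indices(n=8):
    """Return (row, col) pairs of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):                       # s = row + col of a diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # even diagonals run bottom-left to top-right, odd ones the other way
        order.extend(diag[::-1] if s % 2 == 0 else diag)
    return order

block = [
    [12, 34, 0, 54, 0, 0, 0, 0],
    [87, 0, 0, 12, 3, 0, 0, 0],
    [16, 0, 0, 0, 0, 0, 0, 0],
] + [[0] * 8 for _ in range(5)]

scan = [block[r][c] for r, c in zigzag_indices()]
print(scan[:20])
```

The non-zero values cluster at the front of the vector and the tail is all zeros.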




(5) Encoding of Quantized
DCT Coefficients
   DC Components:
       DC component of a block is large and varied, but often
       close to the DC value of the previous block.
       Encode the difference of DC component from previous 8x8
       blocks using Differential Pulse Code Modulation (DPCM).

   AC components:
       The 1x64 vector has lots of zeros in it.
       Using RLE, encode as (skip, value) pairs, where skip is the
       number of zeros and value is the next non-zero component.
       Send (0,0) as end-of-block value.




(6) Runlength Coding
              A typical 8x8 block of quantized DCT coefficients.
       Most of the higher order coefficients have been quantized to 0.

                            12  34   0  54   0   0   0   0
                            87   0   0  12   3   0   0   0
                            16   0   0   0   0   0   0   0
                             0   0   0   0   0   0   0   0
                             0   0   0   0   0   0   0   0
                             0   0   0   0   0   0   0   0
                             0   0   0   0   0   0   0   0
                             0   0   0   0   0   0   0   0

       Zig-zag scan: the sequence of DCT coefficients to be transmitted:
               12 34 87 16 0 0 54 0 0 0 0 0 0 12 0 0 3 0 0 0 .....
            DC coefficient (12) is sent via a separate Huffman table.
                   Runlength coding remaining coefficients:
             34 | 87 | 16 | 0 0 54 | 0 0 0 0 0 0 12 | 0 0 3 | 0 0 0 .....
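The grouping above corresponds to (skip, value) pairs, where skip counts the zeros before each non-zero AC coefficient and (0,0) marks the end of the block. A minimal sketch, with an illustrative function name:

```python
def rle_ac(ac_coeffs):
    """Encode AC coefficients as (zeros skipped, non-zero value) pairs plus end-of-block."""
    pairs, skip = [], 0
    for value in ac_coeffs:
        if value == 0:
            skip += 1
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append((0, 0))   # end-of-block marker; trailing zeros are dropped
    return pairs

# The 63 AC coefficients of the example, in zigzag order (DC value 12 excluded).
ac = [34, 87, 16, 0, 0, 54, 0, 0, 0, 0, 0, 0, 12, 0, 0, 3] + [0] * 47
print(rle_ac(ac))
```

Sixty-three coefficients shrink to seven pairs, which are then passed on to the entropy coder.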

  Further compression: statistical (entropy) coding




  JPEG Example
  [Figure: original image, and compressed images with the quantization table
  used, at compression ratios 7.7, 12.3, 33.9, and 60.1; blocking artifact
  marked (JPEG 2000 ?)]




MPEG: Inter-Frame Coding
 [Figure: intra-coded I-frame and predicted P-frame]

Motion Estimation + Compensation




Video compression: A big picture
 [Figure not recovered]

Bi-Directional Prediction
 Intra-coded I-frame; bi-directionally predicted B-frame
        I B B P B B P B B P B B I
                Group of frames (GOF)

Q: 3D Transform Coding ?




VBR vs CBR: Rate Control
  Variable-Bit-Rate
      Fixed quantizer Qp
      “Constant” quality
      E.g. RMVB
  Constant-Bit-Rate
      Adaptive quantizer
      “Constant” rate: easier control
           Difference (compared to target
           rate) can be 0.5% or less
           E.g. RM, MPEG-1
      Rate-distortion optimization
  Recall that transport layer also has
  rate control …
  [Figure: raw video, video encoder (VBR output), smoothing buffer (CBR
  output), and rate controller setting Qp]




Standardization Organizations
 ITU-T VCEG (Video Coding Experts Group)
     standards for advanced moving image
     coding methods appropriate for
     conversational and non-conversational
     audio/visual applications.
 ISO/IEC MPEG (Moving Picture Experts Group)
    standards for compression and coding,
    decompression, processing, and
    coded representation of moving
    pictures, audio, and their combination
 Relation
     ITU-T H.262 ~ ISO/IEC 13818-2 (MPEG-2)
        Generic Coding of Moving Pictures and
        Associated Audio.
     ITU-T H.263 ~ ISO/IEC 14496-2 (MPEG-4)
 WG - work group
 SG - study group
     ISO/IEC JTC 1/SC 29/WG 1
        Coding of Still Pictures
     ISO/IEC JTC 1/SC 29/WG 11




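Coding Rate and Standards
 [Figure: bit-rate axis marked at 8, 16, 64, 384 kbit/s and 1.5, 5, 20 Mbit/s]
   Applications (low to high rate): mobile videophone, videophone over PSTN,
   ISDN videophone, Video CD, Digital TV, HDTV
   Ranges: very low bitrate, low bitrate, medium bitrate, high bitrate
   Standards (low to high rate): MPEG-4, H.263, H.261, MPEG-1, MPEG-2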




ISO MPEG-1 (Moving Pictures Experts Group)
   Progressively scanned video for
   multimedia applications, at a bit
   rate of 1.5 Mb/s, the access rate of
   CD-ROM players.
   Video format: near VHS quality




ISO MPEG-2

 Standard for Digital Television,
 DVD
 4 to 8 Mb/s or 10 to 15 Mb/s, much higher
 than MPEG-1
 Supports various modes of
 scalability (spatial, temporal,
 SNR)
 There are differences in
 quantization and better variable
 length code tables for
 progressive video sequences.




ISO MPEG-4
 A much broader standard.
 MPEG-4 was aimed primarily
 at low bit rate video
 communication, but is not limited
 to it
 Applications:
      1. Digital television
      2. Interactive graphics
         applications
      3. Interactive multimedia
         (World Wide Web)
 Two versions: DivX 3 and DivX
 4 (Internet world)
 Important concept
      Video object




MPEG-4 Object Video
 Instead of “frames”: Video Object Planes
 Shape Adaptive DCT
 [Figure: a video frame, its VOPs and background VOP, the alpha map, and SA DCT]




MPEG-4 Structure
 [Figure: Bitstream -> MUX -> one decoder per A/V object -> Compositor ->
 Audio/Video scene]



Example
 [Figure: scene with Object 1, Object 2, Object 3, and Object 4]

 Problems, comments?

Another Example
 [Figure not recovered]




Status
 Microsoft, RealVideo,
 QuickTime, ...
    But only rectangular frame
    based
 H.264 = MPEG-4 part 10
 (2003)
 Shape coding
 Synthetic scene




H.264
 H.26x (x=1,2,3)
     ITU-T Recommendations
     Real time video communication applications.
 MPEG Standards
     Video storage, broadcast video, video streaming applications
 H.26L = ITU-T + MPEG = JVT coding
 Latest project of the Joint Video Team formed by ITU-T SG16 Q6
 (VCEG) and ISO/IEC JTC 1/SC 29 WG 11 (MPEG)
 Basic configuration similar to H.263 and MPEG-4 Part 2




H.264 Design
 Goals
 Enhanced Compression performance
 Provision of network friendly packet based video
 representation addressing the conversational and non-
 conversational applications
 Conceptual Separation between Video Coding Layer ( VCL)
 and Network Adaptation Layer ( NAL)




H.264 Design (Contd.)
 [Figure: layer diagram with Video Coding Layer, Control Data, Macro-block,
 Data Partitioning, Slice/Partition, and Network Adaptation Layer]



H.264 Design ( Contd.)
 Video Coding Layer
 Core High compression representation
 Block based motion compensated transform video coder
 New features enabled to achieve significant improvement in coding
 efficiency.

 Network Adaptation Layer
 Provides the ability to customize the format of the VCL data over a
 variety of networks
 Unique packet based interface
 Packetisation and appropriate signaling are part of the NAL
 specification




Video Coding Evolution


                                                              H.264




         Y. Wang, J. Ostermann, Y.-Q. Zhang, Digital Video
         Processing and Communication. Prentice Hall, 2001.




                                                                       33

Contenu connexe

Tendances

Sharpening using frequency Domain Filter
Sharpening using frequency Domain FilterSharpening using frequency Domain Filter
Sharpening using frequency Domain Filterarulraj121
 
Image Restoration
Image RestorationImage Restoration
Image RestorationPoonam Seth
 
Block Truncation Coding
Block Truncation CodingBlock Truncation Coding
Block Truncation Codingriyagam
 
Image Representation & Descriptors
Image Representation & DescriptorsImage Representation & Descriptors
Image Representation & DescriptorsPundrikPatel
 
Chapter10 image segmentation
Chapter10 image segmentationChapter10 image segmentation
Chapter10 image segmentationasodariyabhavesh
 
Enhancement in frequency domain
Enhancement in frequency domainEnhancement in frequency domain
Enhancement in frequency domainAshish Kumar
 
Image processing fundamentals
Image processing fundamentalsImage processing fundamentals
Image processing fundamentalsA B Shinde
 
Fundamental steps in Digital Image Processing
Fundamental steps in Digital Image ProcessingFundamental steps in Digital Image Processing
Fundamental steps in Digital Image ProcessingShubham Jain
 
Thresholding.ppt
Thresholding.pptThresholding.ppt
Thresholding.pptshankar64
 
image basics and image compression
image basics and image compressionimage basics and image compression
image basics and image compressionmurugan hari
 
Image Enhancement in Spatial Domain
Image Enhancement in Spatial DomainImage Enhancement in Spatial Domain
Image Enhancement in Spatial DomainA B Shinde
 
Chapter 9 morphological image processing
Chapter 9   morphological image processingChapter 9   morphological image processing
Chapter 9 morphological image processingAhmed Daoud
 
Color image processing Presentation
Color image processing PresentationColor image processing Presentation
Color image processing PresentationRevanth Chimmani
 
Wavelet transform in image compression
Wavelet transform in image compressionWavelet transform in image compression
Wavelet transform in image compressionjeevithaelangovan
 

Tendances (20)

Sharpening using frequency Domain Filter
Sharpening using frequency Domain FilterSharpening using frequency Domain Filter
Sharpening using frequency Domain Filter
 
Image Restoration
Image RestorationImage Restoration
Image Restoration
 
Block Truncation Coding
Block Truncation CodingBlock Truncation Coding
Block Truncation Coding
 
NOISE FILTERS IN IMAGE PROCESSING
NOISE FILTERS IN IMAGE PROCESSINGNOISE FILTERS IN IMAGE PROCESSING
Arithmetic Coding

  • 1. Source Coding Wireless Ad Hoc Networks University of Tehran, Dept. of E&CE, Fall 2007, Farshad Lahouti Media Basics Contents: Brief introduction to digital media (Audio/Video) Digitization Compression Representation Standards 1
  • 2. Signal Digitization
       Pulse Code Modulation (PCM)
     Sampling
       Sampling theory (Nyquist theorem): the discrete-time sequence of a sampled continuous function {V(t_n)} contains enough information to reproduce the function V = V(t) exactly, provided that the sampling rate is at least twice the highest frequency contained in the original signal V(t).
       Analog signal sampled at a constant rate:
         telephone: 4 kHz signal BW, 8,000 samples/sec
         CD music: 22 kHz signal BW, 44,100 samples/sec
  • 3. Quantization
       Discretization along the energy (amplitude) axis.
       In every time interval the signal is converted to a digital equivalent.
       Using 2 bits (4 levels), the signal in the figure can be digitized.
     Digitization Examples
       Each sample is quantized, i.e., rounded, e.g., to one of 2^8 = 256 possible quantized values.
       Each quantized value is represented by bits: 8 bits for 256 values.
       Example: 8,000 samples/sec, 256 quantized values --> 64,000 bps.
       The receiver converts it back to an analog signal, with some quality reduction.
       Example rates:
         CD: 1.411 Mbps (16 bits/sample, stereo)
         Internet telephony: 5.3 - 13 kbps
         MP3: 96, 128, 160 kbps
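The example rates above follow from multiplying sampling rate, bits per sample, and channel count. A minimal sketch of that arithmetic (the function name `pcm_bit_rate` is illustrative, not from the slides):

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

# Telephone speech: 8,000 samples/sec, 256 levels -> 8 bits per sample.
telephone_bps = pcm_bit_rate(8_000, 8)
# CD audio: 44,100 samples/sec, 16 bits per sample, stereo.
cd_bps = pcm_bit_rate(44_100, 16, channels=2)

print(telephone_bps, cd_bps)
```

The CD figure comes out at 1,411,200 bps, which is the 1.411 Mbps quoted on the slide.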
  • 4. Approximate size for 1 second of audio
       Channels  Resolution  Fs         File size
       Mono      8 bit       8 kHz      64 kb
       Stereo    8 bit       8 kHz      128 kb
       Mono      16 bit      8 kHz      128 kb
       Stereo    16 bit      8 kHz      256 kb
       Stereo    16 bit      44.1 kHz   1411 kb*
       Stereo    24 bit      44.1 kHz   2116 kb
       * 1 CD: 700 MB, 70-80 minutes
     Lossy and Lossless Compression
       Lossless compression (more later)
         Data compression
         APE (Monkey's Audio)
         Image compression for biomedical applications
         …
       Lossy compression
         Hide errors where humans will not see or hear them
         Study the hearing and vision systems to understand how we see and hear
         Perceptual coding
  • 5. Requirements for Compression Algorithms Lossless compression Decoded signal is mathematically equivalent to the original one Drawback : achieves only a small or modest level of compression Lossy compression Decoded signal is of a lower quality than the original one Advantage: achieves very high degree of compression Objective: maximize the degree of compression with a certain quality General compression requirements Ensure a good quality of decoded signal Achieve high compression ratios Minimize the complexity of the encoding and decoding process Support multiple channels Support various data rates Give small delay in processing Compression Tools Transform Coding Variable Rate Coding Entropy Coding Huffman Coding Run-length Coding Predictive Coding DPCM ADPCM 5
  • 6. Variable Length Coding
       Ignores the semantics of the input data and compresses media streams by regarding them as sequences of digits or symbols.
       Examples: run-length encoding, Huffman encoding, ...
       Run-length encoding
         A compression technique that replaces consecutive occurrences of a symbol with the symbol followed by the number of times it is repeated.
           a a a a a => 5a
           000000000000000000001111111 => 0x20 1x7
         Most useful where symbols appear in long runs, e.g., images with areas where the pixels all have the same value, such as fax documents and cartoons.
     Entropy coding
       A few words about entropy
         Entropy: a measure of information content.
         Entropy of the English language: how much information does each character in "typical" English text contain?
       From a probability view
         If the probability of a binary event is 0.5 (like a coin), then, on average, you need one bit to represent the result of this event.
         As the probability of a binary event increases or decreases, the number of bits you need, on average, to represent the result decreases.
         The figure expresses that unless an event is totally random, you can convey the information of the event in fewer bits, on average, than it might first appear.
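The run-length examples on this slide can be reproduced with a short sketch. The pair representation `(symbol, count)` is an assumption made for illustration; the slide writes the same information as `5a` or `0x20 1x7`.

```python
from itertools import groupby

def run_length_encode(symbols):
    """Collapse each run of identical symbols into a (symbol, run length) pair."""
    return [(sym, len(list(run))) for sym, run in groupby(symbols)]

def run_length_decode(pairs):
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(sym * count for sym, count in pairs)

bits = "0" * 20 + "1" * 7
encoded = run_length_encode(bits)
print(encoded)
```

Input with few repeats gains nothing from this scheme, and can even grow, which is why the slide recommends it for data with long runs.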
  • 7. Entropy (Shannon 1948)
       For a set of messages S with probability p(s), s ∈ S, the self-information of s is
         $i(s) = \log \frac{1}{p(s)} = -\log p(s)$
       measured in bits if the log is base 2.
       The lower the probability, the higher the self-information.
       Entropy is the weighted average of self-information:
         $H(S) = \sum_{s \in S} p(s) \log \frac{1}{p(s)}$
     Entropy Example
       $p(S) = \{0.25, 0.25, 0.25, 0.125, 0.125\}$
         $H(S) = 3 \times 0.25 \log_2 4 + 2 \times 0.125 \log_2 8 = 2.25$
       $p(S) = \{0.5, 0.125, 0.125, 0.125, 0.125\}$
         $H(S) = 0.5 \log_2 2 + 4 \times 0.125 \log_2 8 = 2$
       $p(S) = \{0.75, 0.0625, 0.0625, 0.0625, 0.0625\}$
         $H(S) = 0.75 \log_2 (4/3) + 4 \times 0.0625 \log_2 16 \approx 1.3$
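The three worked examples can be checked numerically. The sketch below is a direct transcription of the entropy definition with a base-2 logarithm; the function name `entropy` is illustrative.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: sum of p * log2(1/p) over outcomes with p > 0."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

h1 = entropy([0.25, 0.25, 0.25, 0.125, 0.125])
h2 = entropy([0.5, 0.125, 0.125, 0.125, 0.125])
h3 = entropy([0.75, 0.0625, 0.0625, 0.0625, 0.0625])
print(h1, h2, round(h3, 2))
```

The more skewed the distribution, the lower the entropy, which is the margin an entropy coder can exploit.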
  • 8. Statistical (Entropy) Coding
       Entropy coding
         • Lossless coding
         • Takes advantage of the probabilistic nature of information
         • Examples: Huffman coding, arithmetic coding
       Theorem (Shannon, lower bound): for any probability distribution p(S) with an associated uniquely decodable code C,
         $H(S) \le l_a(C)$
       where $l_a(C)$ is the average codeword length of C.
       Recall Huffman coding…
     Huffman Coding
       A popular compression technique that assigns variable-length codes to symbols, so that the most frequently occurring symbols have the shortest codes.
       Huffman coding is particularly effective where the data are dominated by a small number of symbols.
       Suppose we want to encode a source of N = 8 symbols: {a, b, c, d, e, f, g, h}.
       The probabilities of these symbols are: P(a) = 0.01, P(b) = 0.02, P(c) = 0.05, P(d) = 0.09, P(e) = 0.18, P(f) = 0.2, P(g) = 0.2, P(h) = 0.25.
       If we assign 3 bits per symbol (N = 2^3 = 8), the average length of the symbols is 3 bits/symbol.
       The theoretical lowest average length is the entropy:
         $H(P) = -\sum_{i=1}^{N} P(i) \log_2 P(i) \approx 2.58$ bits/symbol
       If we use Huffman encoding, the average length is 2.63 bits/symbol.
  • 9. Huffman Coding (Cont’d)
       The Huffman code assignment procedure is based on a binary tree structure. The tree is built by a sequence of pairing operations in which the two least probable symbols are joined at a node to form two branches of the tree. More precisely:
         1. The probabilities of the source symbols are associated with the leaves of a binary tree.
         2. Take the two smallest probabilities in the list and generate an intermediate node as their parent. Label the branch from the parent to one child 1 and the branch to the other child 0.
         3. Replace the two probabilities and their nodes in the list by the single new intermediate node, whose probability is the sum of the two. If the list contains only one element, quit. Otherwise, go to step 2.
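The pairing procedure above can be sketched with a priority queue. This is one possible implementation, not the one used for the slides: the names `huffman_code`, `P`, and `average_length` are illustrative, and the 0/1 labels on each branch are an arbitrary choice, so individual codewords may differ from the slide's tree while the codeword lengths agree.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Return {symbol: bitstring} by repeatedly merging the two least probable nodes."""
    tiebreak = count()  # keeps heap entries comparable when probabilities are equal
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)   # smallest probability
        p1, _, codes1 = heapq.heappop(heap)   # second smallest
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Source from the Huffman Coding slide.
P = {"a": 0.01, "b": 0.02, "c": 0.05, "d": 0.09,
     "e": 0.18, "f": 0.2, "g": 0.2, "h": 0.25}
code = huffman_code(P)
average_length = sum(P[s] * len(code[s]) for s in P)
print(code, round(average_length, 2))
```

For this source the sketch gives an average length of 2.63 bits/symbol, matching the figure quoted for Huffman encoding.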
  • 10. Huffman Coding (Cont’d)
       The new average length of the source is 2.63 bits/symbol.
       The efficiency of this code, the ratio of the entropy to the average length, is about 0.98.
       How do we estimate the P(i)? Relative frequency of the symbols.
       How do we decode the bit stream? Encoder and decoder share the same Huffman table.
       How do we decode the variable-length codes? Prefix codes have the property that no codeword can be the prefix (i.e., an initial segment) of any other codeword. Huffman codes are prefix codes!
         11010000000010001 => ?
       Do even the best possible codes guarantee a size reduction for every source? No; worst cases exist. Huffman coding is better on average.
       Huffman coding is particularly effective where the data are dominated by a small number of symbols.
     Transform Coding
       Frequency analysis? Time domain? Not easy! Time domain -> transform domain.
       The sequence to be coded is converted into a new sequence using a transformation rule.
       The new sequence consists of transform coefficients.
       The process is reversible: the original sequence is recovered using the inverse transformation.
       Example: Fourier transform (FT). The coefficients represent the proportion of energy contributed by different frequencies.
  • 11. Transform Coding (Cont…)
       In transform coding, choose the transformation such that only a subset of the coefficients have significant values.
       Energy is confined to a subset of 'important' coefficients; this is known as 'energy compaction'.
       Example: FT of a bandlimited signal.
     Differential Coding – DPCM & ADPCM
       Based on the fact that neighboring samples … x(n-1), x(n), x(n+1), … in a discrete-time sequence change slowly in many applications, e.g., voice, audio, …
       A differential PCM coder (DPCM) quantizes and encodes the difference
         $d(n) = x(n) - x(n-1)$
       Advantage of using the difference d(n) instead of the actual value x(n): fewer bits are needed to represent a sample.
       General DPCM:
         $d(n) = x(n) - a_1 x(n-1) - a_2 x(n-2) - \dots - a_k x(n-k)$
       where $a_1, a_2, \dots, a_k$ are fixed.
       Adaptive DPCM: $a_1, a_2, \dots, a_k$ are changed dynamically with the signal.
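A minimal sketch of the first-order case d(n) = x(n) - x(n-1) is given below. It is deliberately simplified: it assumes x(-1) = 0 and skips the quantizer, so it is lossless, whereas a practical DPCM coder quantizes d(n) and predicts from the reconstructed samples. The function names and the sample values are illustrative.

```python
def dpcm_encode(samples):
    """First-order differences d(n) = x(n) - x(n-1), taking x(-1) = 0."""
    previous = 0
    differences = []
    for x in samples:
        differences.append(x - previous)
        previous = x
    return differences

def dpcm_decode(differences):
    """Rebuild x(n) by accumulating the differences."""
    previous = 0
    samples = []
    for d in differences:
        previous += d
        samples.append(previous)
    return samples

samples = [100, 102, 105, 104, 101]   # slowly varying signal (made-up values)
differences = dpcm_encode(samples)
print(differences)
```

After the first sample, the differences stay within a few units of zero, so they need fewer bits than the raw values.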
  • 12. Psychoacoustics
       Human aural response
     Psychoacoustic Model
       Basically: if you can't hear the sound, don't encode it.
       Natural bandlimiting: audio perception spans 20 Hz to 20 kHz, but most sounds are in the low frequencies (e.g., 2 kHz to 4 kHz).
       Human frequency response (figure).
       Frequency masking: if a stronger sound and a weaker sound compete, you can't hear the weaker sound. Don't encode it.
       Temporal masking: after a loud sound, it takes a while before we can hear a soft sound.
       Stereo redundancy: at low frequencies, we can't detect where the sound is coming from. Encode it in mono.
  • 13. Perceptual Coding: Examples MP3 = MPEG 1/2 layer 3 audio; achieves CD quality in about 192 kbps (a 3.7:1 compression ratio): higher compression possible Sony MiniDisc uses Adaptive Transform Coding (ATRAC) to achieve a 5:1 compression ratio (about 141 kbps) http://www.mpeg.org http://www.minidisc.org/aes_atrac.html Artefacts of compression Some areas of the spectrum are lost in the encoding process MP3 encoded recordings rarely sound identical to original uncompressed audio files On small or PC speakers, however, MP3 compressed audio can be acceptable 13
  • 14. Examples (1.12MB) 128kbps (105KB) 96Kbps(78.9KB) 64kbps (52.6KB) WAV File (34Mb) 14
  • 15. Examples (continued): MP3 file (3Mb)
     LPC and Parametric Coding
       LPC (Linear Predictive Coding)
         Based on a model of the human speech organs:
           $s(n) = a_1 s(n-1) + a_2 s(n-2) + \dots + a_k s(n-k) + e(n)$
         Estimate $a_1, a_2, \dots, a_k$ and e(n) for each piece (frame) of speech.
         Encode and transmit/store $a_1, a_2, \dots, a_k$ and the type of e(n).
         The decoder reproduces speech using $a_1, a_2, \dots, a_k$ and e(n): very low bit rate but relatively low speech quality.
       Parametric coding: only the parameters of a sound generation model are coded.
         LPC is an example where the parameters are $a_1, a_2, \dots, a_k$, e(n).
         Musical instrument parameters: pitch, loudness, timbre, …
  • 16. Speech Compression
       Handling speech together with other media such as text, images, video, and data is an essential part of multimedia applications.
       The ideal speech coder has a low bit rate, high perceived quality, low signal delay, and low complexity.
       Delay
         Less than 150 ms one-way end-to-end delay for a conversation
         Processing (coding) delay, network delay
         Over Internet, ISDN, PSTN, ATM, …
       Complexity
         Computational complexity of speech coders depends on the algorithms
         Contributes to achievable bit rate and processing delay
     G.72x Speech Coding Standards
       Quality
         "Intelligible" -> "natural" or "subjective" quality
         Depends on bit rate
       Bit rate
  • 17. G.72x Audio Coding Standards
       Silence compression: detect the "silence", similar to run-length coding.
       Adaptive Differential Pulse Code Modulation (ADPCM), e.g., in CCITT G.721: 16 or 32 kb/s.
         (a) Encodes the difference between two or more consecutive samples; the difference is then quantized --> hence the loss.
         (b) Adapts the quantization so that fewer bits are used when the value is smaller. It is necessary to predict where the waveform is headed --> difficult.
       Linear Predictive Coding (LPC) fits the signal to a speech model and then transmits the parameters of the model --> sounds like a computer talking, 2.4 kb/s.
     Video Digitization and Compression
       Video is a sequence of images (frames) displayed at a constant frame rate, e.g., 24 images/sec.
       A digital image is a 2-D array of pixels; each pixel is represented by bits.
       Color representations: R:G:B and Y:U:V
         Y = 0.299R + 0.587G + 0.114B  (luminance or brightness)
         U = B - Y  (chrominance 1, color difference)
         V = R - Y  (chrominance 2, color difference)
       Redundancy: spatial and temporal
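The color conversion at the end of this slide can be written out directly. The sketch uses the unscaled differences U = B - Y and V = R - Y exactly as given here; broadcast and JPEG practice apply additional scale factors and offsets, which are omitted. The function name `rgb_to_yuv` is illustrative.

```python
def rgb_to_yuv(r, g, b):
    """Luminance plus two unscaled color differences, as on the slide."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = b - y
    v = r - y
    return y, u, v

gray = rgb_to_yuv(128, 128, 128)   # equal R, G, B: no color information
red = rgb_to_yuv(255, 0, 0)
print(gray, red)
```

Any gray pixel maps to (almost exactly) zero chrominance, which is why the U and V planes of natural images carry little energy.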
  • 18. Intra-frame coding Transform Quantize Encode JPEG (Joint Photographic Experts Group) Original size 640x480x3=922KB JPEG Compression Ratios: 30:1 to 50:1 compression is possible with small to moderate defects 100:1 compression is quite feasible for low-quality purposes JPEG Steps 1 Block Preparation: From RGB to YUV (YIQ) planes 8x8 blocks 2 Transform: 2-D Discrete Cosine Transform (DCT) on blocks (lossy?) 3 Quantization: Quantize DCT Coefficients (lossy) 4 Encoding of Quantized Coefficients (lossless) Zigzag Scan Differential Pulse Code Modulation (DPCM) on DC component Run Length Encoding (RLE) on AC Components Entropy Coding: Huffman or Arithmetic 18
  • 19. JPEG
       Compression: Block Preparation -> Transform -> Quantize -> Encode
       Decompression: reverse the order
     (1) Block Preparation
       RGB input data, and the data after block preparation (figure).
       Input image: 640 x 480 RGB (24 bits/pixel), transformed to three planes:
         Y: 640 x 480, 8 bits/pixel. Luminance (brightness) plane.
         U, V: 320 x 240, 8 bits/pixel. Chrominance (color) planes.
  • 20. (2) Discrete Cosine Transform (DCT) A transformation from spatial domain to frequency domain (similar to FFT) Definition of 8-point DCT: F[0,0] is the DC component and other F[u,v] define AC components of DCT The 64 (8 x 8) DCT Basis Functions u DC Component v Block-based 2-D DCT •Karhunen-Loeve (KL) transform ? 20
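The formula for the 8-point DCT did not survive in this transcript. The sketch below assumes the normalization used in the JPEG standard, F[u,v] = (1/4) C(u) C(v) ΣΣ f[x,y] cos((2x+1)uπ/16) cos((2y+1)vπ/16) with C(0) = 1/√2 and C(w) = 1 otherwise, which may differ in scaling from the one on the slide. It is written for clarity, not speed; the names `dct_2d` and `flat_block` are illustrative.

```python
import math

N = 8

def _c(w):
    """Normalization factor: 1/sqrt(2) for the zero-frequency index, else 1."""
    return 1 / math.sqrt(2) if w == 0 else 1.0

def dct_2d(block):
    """Forward 8x8 DCT of block[x][y], returning coefficients F[u][v]."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * _c(u) * _c(v) * total
    return out

# A perfectly flat block has only a DC component.
flat_block = [[100] * N for _ in range(N)]
F = dct_2d(flat_block)
print(round(F[0][0], 6))
```

With this scaling the DC coefficient F[0][0] is 8 times the block mean, and every AC coefficient of a flat block is zero up to rounding error.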
  • 21. 8x8 DCT Example
       Figure: original values of an 8x8 block (in the spatial domain) and the corresponding DCT coefficients (in the frequency domain, axes u and v, with the DC component at the origin).
     (3) Quantized DCT Coefficients
       Uniform quantization: divide by a constant N and round the result.
       In JPEG, each DCT coefficient F[u,v] is divided by a constant q(u,v) taken from a quantization table (filter?).
       Figure: F[u,v], the table q(u,v), and the rounded F[u,v] / q(u,v).
  • 22. (4) Zigzag Scan
       Maps an 8x8 block into a 1 x 64 vector.
       The zigzag pattern groups the low-frequency coefficients at the top of the vector.
     (5) Encoding of Quantized DCT Coefficients
       DC components: the DC component of a block is large and varied, but often close to the DC value of the previous block. Encode the difference between the DC component and that of the previous 8x8 block using Differential Pulse Code Modulation (DPCM).
       AC components: the 1 x 64 vector has lots of zeros in it. Using RLE, encode it as (skip, value) pairs, where skip is the number of zeros and value is the next non-zero component. Send (0,0) as the end-of-block value.
  • 23. (6) Runlength Coding
       A typical 8x8 block of quantized DCT coefficients. Most of the higher-order coefficients have been quantized to 0.
         12 34  0 54  0  0  0  0
         87  0  0 12  3  0  0  0
         16  0  0  0  0  0  0  0
          0  0  0  0  0  0  0  0
          0  0  0  0  0  0  0  0
          0  0  0  0  0  0  0  0
          0  0  0  0  0  0  0  0
          0  0  0  0  0  0  0  0
       Zigzag scan: the sequence of DCT coefficients to be transmitted:
         12 34 87 16 0 0 54 0 0 0 0 0 0 12 0 0 3 0 0 0 .....
       The DC coefficient (12) is sent via a separate Huffman table.
       Runlength coding of the remaining coefficients:
         34 | 87 | 16 | 0 0 54 | 0 0 0 0 0 0 12 | 0 0 3 | 0 0 0 .....
       Further compression: statistical (entropy) coding.
     JPEG Example
       Figure: original image, quantization table used, and compressed images at compression ratios 7.7, 12.3, 33.9, and 60.1.
       Blocking artifact (JPEG 2000?)
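The zigzag scan and the (skip, value) run-length step can be checked against the block on this slide. The sketch follows the simplified description in these slides (a plain zero count and a (0, 0) end-of-block pair) and leaves out details of the actual JPEG bitstream such as the limit on run lengths. The function names are illustrative.

```python
def zigzag_order(n=8):
    """(row, col) index pairs of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):   # row + col is constant on each anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def rle_ac(coefficients):
    """Encode AC coefficients as (skip, value) pairs, terminated by (0, 0)."""
    pairs, skip = [], 0
    for value in coefficients:
        if value == 0:
            skip += 1
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append((0, 0))   # end of block; trailing zeros are implied
    return pairs

block = [[12, 34, 0, 54, 0, 0, 0, 0],
         [87, 0, 0, 12, 3, 0, 0, 0],
         [16, 0, 0, 0, 0, 0, 0, 0]] + [[0] * 8 for _ in range(5)]

scan = [block[r][c] for r, c in zigzag_order()]
dc, pairs = scan[0], rle_ac(scan[1:])
print(scan[:20])
print(dc, pairs)
```

The first 20 scanned values reproduce the sequence listed on the slide, and the six non-zero AC coefficients collapse to six pairs plus the end-of-block marker.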
  • 24. MPEG: Inter-Frame Coding
       Intra-coded I-frame, predicted P-frame
       Motion estimation + compensation
  • 25. Video compression: A big picture Bi-Directional Prediction Intra-Coded I-Frame Bi-directional I B B P B B P B B P B B I Predicted B-Frame Group of frames (GOF) Q: 3D Transform Coding ? 25
  • 26. VBR vs CBR: Rate Control
       Figure: raw video -> encoder -> VBR -> smoothing buffer -> CBR, with a rate controller adjusting the quantizer Qp.
       Variable bit rate (VBR)
         Fixed quantizer Qp
         "Constant" quality
         E.g., RMVB
       Constant bit rate (CBR)
         Adaptive quantizer
         "Constant" rate: easier control
         Difference compared to the target rate can be 0.5% or less
         E.g., RM, MPEG-1
       Rate-distortion optimization
       Recall that the transport layer also has rate control …
     Standardization Organizations
       ITU-T VCEG (Video Coding Experts Group): standards for advanced moving image coding methods appropriate for conversational and non-conversational audio/visual applications.
       ISO/IEC MPEG (Moving Picture Experts Group): standards for compression and coding, decompression, processing, and coded representation of moving pictures, audio, and their combination.
       WG: work group; SG: sub group
         ISO/IEC JTC 1/SC 29/WG 1: Coding of Still Pictures
         ISO/IEC JTC 1/SC 29/WG 11: Generic Coding of Moving Pictures and Associated Audio
       Relation
         ITU-T H.262 ~ ISO/IEC 13818-2 (MPEG-2)
         ITU-T H.263 ~ ISO/IEC 14496-2 (MPEG-4)
  • 27. Coding Rate and Standards Mobile Videophone ISDN Video CD Digital TV HDTV videophone over PSTN videophone 8 16 64 384 1.5 5 20 kbit/s Mbit/s Very low bitrate Low bitrate Medium bitrate High bitrate MPEG-4 H.263 H.261 MPEG-1 MPEG-2 ISO MPEG-1 (Moving Pictures Experts Group). MPEG-1 Progressively scanned video for multimedia applications, at a bit rate 1.5Mb/s access time for CD-ROM players. Video format: near VHS quality 27
  • 28. ISO MPEG-2 MPEG-2 Standard for Digital Television, DVD 4 to 8 Mb/s / 10 to 15 Mb/s >> MPEG -1 Supports various modes of scalability (Spatial, temporal, SNR) There are differences in quantization and better Variable length codes tables for progressive video sequences. ISO MPEG-4 A much broader standard. MPEG-4 was aimed primarily at low bit rate video communication, but not limited to Applications: 1. Digital television 2. Interactive graphics applications 3. Interactive multimedia (World Wide Web) Two version: Divx 3 and Divx 4 (Internet world) Important concept Video object 28
  • 29. MPEG-4 Object Video Instead of ”frames”: Video Object Planes Shape Adaptive DCT A video frame Alpha map VOP SA DCT Background VOP VOP MPEG-4 Structure A/V Decoder object Compositor A/V Decoder object Bitstream Audio/Video scene MUX A/V Decoder object 29
  • 30. Example Object 3 Object 1 Object 4 Object 2 Problems, comments? Another Example 30
  • 31. Status
       Microsoft, RealVideo, QuickTime, ...
       But only rectangular frame based
       H.264 = MPEG-4 Part 10 (2003)
       Shape coding
       Synthetic scene
     H.264
       H.26x (x = 1, 2, 3): ITU-T Recommendations; real-time video communication applications.
       MPEG standards: video storage, broadcast video, video streaming applications.
       H.26L = ITU-T + MPEG = JVT coding
       Latest project of the Joint Video Team formed by ITU-T SG16 Q6 (VCEG) and ISO/IEC JTC 1/SC 29 WG 11 (MPEG)
       Basic configuration similar to H.263 and MPEG-4 Part 2
  • 32. H.264 Design Goals Enhanced Compression performance Provision of network friendly packet based video representation addressing the conversational and non- conversational applications Conceptual Separation between Video Coding Layer ( VCL) and Network Adaptation Layer ( NAL) H.264 Design ( Contd. ) Video Coding Layer Control Data Macro-block Data Partitioning Slice/Partition Network Adaptation Layer 32
  • 33. H.264 Design ( Contd.) Video Coding Layer Core High compression representation Block based motion compensated transform video coder New features enabled to achieve significant improvement in coding efficiency. Network Adaptation Layer Provides the ability to customize the format of the VCL data over a variety of networks Unique packet based interface Packetisation and appropriate signaling is a part of NAL specification Video Coding Evolution H.264 Y. Wang, J. Ostermann, Y.-Q. Zhang, Digital Video Processing and Communication. Prentice Hall, 2001. 33