Audio Compression
Techniques
  Lecture 8


              Prepared by
              Razia Nisar Noorani

Introduction
   Digital Audio Compression
     Removal of redundant or otherwise irrelevant
      information from the audio signal
     Audio compression algorithms are often referred to as
      “audio encoders”
   Applications
     Reduces required storage space
     Reduces required transmission bandwidth




Audio Compression
   Audio signal – overview
     Sampling rate (number of samples per second)
     Bit rate (number of bits per second). Typically, an
      uncompressed stereo 16-bit 44.1 kHz signal has a bit
      rate of about 1.4 Mbps (44,100 × 16 × 2 = 1,411,200 bits/s)
     Number of channels (mono / stereo / multichannel)
   Reduction by lowering those values or by data
    compression / encoding
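
The uncompressed bit rate quoted on this slide follows directly from the three parameters listed. A minimal sketch of the arithmetic, where the function name pcm_bit_rate is made up for illustration:

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# CD-quality audio: 44.1 kHz sampling, 16 bits per sample, 2 channels.
cd_rate = pcm_bit_rate(44_100, 16, 2)
print(cd_rate)              # 1411200 (bits per second)
print(cd_rate / 1_000_000)  # 1.4112 (megabits per second)
```

Halving any one of the three parameters halves the bit rate, which is the "reduction by lowering those values" option; data compression aims to reduce the rate without doing that.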



Audio Data Compression
   Redundant information
     Implicit in the remaining information
     Ex. oversampled audio signal
          oversampling is the process of sampling a signal with a
           sampling frequency significantly higher than twice the
           bandwidth or highest frequency of the signal being sampled
   Irrelevant information
     Perceptually insignificant
     Cannot be recovered from remaining information



Audio Data Compression
   Lossless Audio Compression
     Removes redundant data
     Resulting signal is identical to the original – perfect
      reconstruction
   Lossy Audio Encoding
     Removes irrelevant data
     Resulting signal is perceptually similar to the original


Audio Data Compression
   Audio vs. Speech Compression
    Techniques
     Speech compression uses a model of the human
      vocal tract to compress signals
     Audio compression does not use this
      technique because general audio has a much
      larger variety of possible signals


Generic Audio Encoder
   Psychoacoustic Model
     Psychoacoustics – study of how sounds are
      perceived by humans
     Uses perceptual coding
         Eliminates information from the audio signal that is
          inaudible to the ear
     Detects conditions under which different audio
      signal components mask each other

Psychoacoustic Model
   Signal Masking
     Threshold cut-off
     Spectral (Frequency / Simultaneous) Masking
     Temporal Masking
   Threshold cut-off and spectral masking
    occur in frequency domain, temporal
    masking occurs in time domain

Signal Masking
   Threshold cut-off
     Hearing threshold level is a function of frequency
     Any frequency component below the threshold will not be
      perceived by the human ear
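
To make "a function of frequency" concrete, here is a minimal sketch that uses Terhardt's closed-form approximation of the threshold in quiet. The formula is not part of the original slides; it is a commonly quoted approximation, and the function names are made up for illustration.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate hearing threshold in dB SPL (Terhardt's formula, an assumption here)."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def is_audible(level_db_spl, freq_hz):
    """A component is kept only if it lies above the threshold at its frequency."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)


for freq in (100, 1000, 3300, 16000):
    print(freq, round(threshold_in_quiet_db(freq), 1))
```

With this approximation the ear is most sensitive around 3 to 4 kHz, and a tone of a given level can be audible at 1 kHz yet fall below the threshold at 100 Hz.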


Signal Masking
   Spectral Masking
     A frequency component can be partly or fully masked by
      another component that is close to it in frequency
     This shifts the hearing threshold


Signal Masking
   Temporal Masking
     A quieter sound can be masked by a louder sound if they
      are temporally close
     Sounds that occur shortly before or shortly after a
      volume increase can be masked


Spectral Analysis
   Spectral analysis uses a device or algorithm to derive a
    frequency-domain representation of a
    time-domain signal.
   Tasks of Spectral Analysis
     To derive masking thresholds to determine which
      signal components can be eliminated
     To generate a representation of the signal to which
      masking thresholds can be applied
   Spectral Analysis is done through transforms or
    filter banks
Spectral Analysis
   Transforms
     Fast Fourier Transform (FFT)
     Discrete Cosine Transform (DCT) - related to the
      Fourier transform but uses only real-valued cosine
      basis functions
     Modified Discrete Cosine Transform (MDCT)
      [used by MPEG-1 Layer III, MPEG-2 AAC,
      Dolby AC-3] – overlapped and windowed
      version of DCT
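
The "overlapped and windowed" property can be illustrated with a small pure-Python sketch. It assumes a sine window, 50% overlap, and the usual 2/N inverse scaling for a windowed MDCT; the function names and the tiny block size are chosen for illustration only, and real codecs use fast algorithms rather than these direct sums.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + N]^2 = 1 for length 2N."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Direct O(N^2) MDCT: 2N input samples -> N coefficients."""
    half = len(block) // 2
    return [sum(block[n] * math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))
                for n in range(2 * half))
            for k in range(half)]


def imdct(coeffs):
    """Direct inverse MDCT: N coefficients -> 2N time-aliased samples."""
    half = len(coeffs)
    return [(2.0 / half) * sum(coeffs[k] * math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))
                               for k in range(half))
            for n in range(2 * half)]


def analysis_synthesis(signal, half):
    """Window, transform, inverse-transform and overlap-add blocks with hop size `half`."""
    window = sine_window(2 * half)
    padded = [0.0] * half + list(signal) + [0.0] * half
    while len(padded) % half:
        padded.append(0.0)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * half + 1, half):
        block = [padded[start + n] * window[n] for n in range(2 * half)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * half):
            out[start + n] += rebuilt[n] * window[n]  # aliasing cancels in the overlap
    return out[half:half + len(signal)]


signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(40)]
restored = analysis_synthesis(signal, 8)
print(max(abs(a - b) for a, b in zip(signal, restored)))  # tiny rounding error only
```

Each block of 2N samples yields only N coefficients, yet the overlap-add of neighbouring blocks restores the signal, which is why the MDCT suits block-based coders.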


Spectral Analysis
   Filter Banks
   A filter bank is an array of band-pass filters that
    separates the input signal into multiple
    components, each one carrying a single
    frequency subband of the original signal
     Blocks of time samples are passed through a set of
      band-pass filters
     Masking thresholds are applied to the resulting frequency
      subband signals
     Polyphase and wavelet filter banks are the most popular
      structures

Filter Bank Structures
   Polyphase Filter Bank
    [used in all of the MPEG-1 encoders]
     Signal is separated into subbands, the widths
      of which are equal over the entire frequency
      range
     The resulting subband signals are
      downsampled to create shorter signals (which
      are later reconstructed during the decoding
      process)
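
A rough sketch of what equal-width subbands and downsampling mean in numbers. It assumes the 32 subbands of the MPEG-1 polyphase filter bank and a 384-sample Layer I frame; the function names are made up for illustration, and no actual filtering is performed.

```python
def uniform_subbands(sample_rate_hz, num_bands=32):
    """Band edges in Hz for equal-width subbands between 0 and the Nyquist frequency."""
    width = (sample_rate_hz / 2) / num_bands
    return [(i * width, (i + 1) * width) for i in range(num_bands)]


def samples_per_subband(block_length, num_bands=32):
    """After downsampling by num_bands, each subband keeps this many samples."""
    return block_length // num_bands


bands = uniform_subbands(44_100)
print(bands[0])                   # (0.0, 689.0625)
print(bands[-1])                  # (21360.9375, 22050.0)
print(samples_per_subband(384))   # 12
```

Because every subband is downsampled by the number of bands, the total number of samples across all subbands equals the number of input samples, so the filter bank itself adds no data.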

Filter Bank Structures
   Wavelet Filter Bank
    [used by Enhanced Perceptual Audio
    Coder (EPAC) by Lucent]
     Unlike the polyphase filter bank, the subbands do not
      have equal widths (narrower for lower frequencies,
      wider for higher frequencies)
     This allows for better time resolution at high
      frequencies (ex. short attacks), but at the expense of
      frequency resolution
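
A minimal sketch of non-uniform subband edges, assuming a dyadic (octave-band) wavelet tree in which each level splits the remaining low band in half. This is one common structure and not necessarily the exact one used in EPAC; the function name is made up for illustration.

```python
def dyadic_subbands(sample_rate_hz, levels):
    """Band edges in Hz for an octave-band split, listed from low to high frequency."""
    bands = []
    upper = sample_rate_hz / 2
    for _ in range(levels):
        lower = upper / 2
        bands.append((lower, upper))  # keep the upper half as one subband
        upper = lower                 # keep splitting the lower half
    bands.append((0.0, upper))
    return bands[::-1]


for low, high in dyadic_subbands(44_100, 3):
    print(low, high, high - low)
```

In such a tree the widest band sits at the top of the spectrum, where it gives short time windows for transients, while the narrow low-frequency bands give finer frequency resolution.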

Noise Allocation
   System Task: derive and apply shifted hearing
    threshold to the input signal
     Anything below the threshold doesn’t need to be
      transmitted
     Any noise below the threshold is irrelevant
   Frequency component quantization
     Tradeoff between space and noise
     Encoder saves on space by using just enough bits for
      each frequency component to keep noise under the
      threshold - this is known as noise allocation
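
A simplified sketch of "just enough bits", assuming the rule of thumb that each extra bit of a uniform quantizer lowers the quantization noise by about 6.02 dB. The function names and the example levels are hypothetical; real encoders iterate under a total bit budget.

```python
import math

DB_PER_BIT = 6.02  # approximate noise reduction per quantizer bit (rule of thumb)


def bits_needed(signal_db, mask_db):
    """Smallest bit count that pushes quantization noise below the masking threshold."""
    if signal_db <= mask_db:
        return 0  # the component itself is masked, so nothing is transmitted
    return math.ceil((signal_db - mask_db) / DB_PER_BIT)


def allocate_bits(signal_levels_db, mask_levels_db):
    """Per-component bit allocation from signal levels and masking thresholds."""
    return [bits_needed(s, m) for s, m in zip(signal_levels_db, mask_levels_db)]


print(allocate_bits([60, 20, 45], [30, 30, 44]))  # [5, 0, 1]
```

Components far above their threshold get many bits, components just above it get few, and masked components get none.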

Noise Allocation
   Pre-echo
     When a single audio block contains silence followed
      by a loud attack, a pre-echo error occurs: quantization
      noise is spread over the whole block, so there will be
      audible noise in the silent part of the block after
      decoding
     This is mitigated by monitoring the audio data at the
      encoding stage and splitting the audio into shorter
      blocks when a potential pre-echo case is detected
     This does not completely eliminate pre-echo, but can
      make it short enough to be masked by the attack
      (temporal masking)
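
The block-splitting decision can be sketched with a simple energy-based attack detector. The number of short blocks, the energy-ratio threshold, and the function names are all hypothetical; the block length is assumed to be divisible by the number of short blocks, and real encoders typically rely on more elaborate perceptual measures.

```python
def segment_energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def choose_block_lengths(block, num_short=8, ratio_threshold=10.0):
    """Return one long block length, or several short ones if an attack is detected."""
    short_len = len(block) // num_short
    parts = [block[i * short_len:(i + 1) * short_len] for i in range(num_short)]
    energies = [segment_energy(p) for p in parts]
    floor = 1e-12  # avoids comparing against exactly zero energy
    for previous, current in zip(energies, energies[1:]):
        if current > ratio_threshold * max(previous, floor):
            return [short_len] * num_short  # sudden energy jump: use short blocks
    return [len(block)]


attack = [0.0] * 512 + [1.0, -1.0] * 256   # silence followed by a loud onset
steady = [1.0, -1.0] * 512                 # constant-level signal
print(choose_block_lengths(attack))  # [128, 128, 128, 128, 128, 128, 128, 128]
print(choose_block_lengths(steady))  # [1024]
```

With short blocks, the quantization noise caused by the attack is confined to the short block that contains it instead of spreading across the whole long block.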

Additional Encoding Techniques
   Other encoding techniques are
    available (as alternatives or in combination)
     Predictive Coding
     Coupling / Delta Encoding
     Huffman Encoding




Additional Encoding Techniques
   Predictive Coding
     Often used in speech and image compression
     Estimates the expected value for each sample based
      on previous sample values
     Transmits/stores the difference between the predicted
      and actual value
     The decoder generates the same estimate for each
      sample and then adjusts it by the stored
      difference
     Used for additional compression in MPEG-2 AAC
      (Advanced Audio Coding)
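
A minimal sketch of the idea, assuming the simplest possible predictor (the previous sample) applied to integer time-domain samples. This is a generic illustration with made-up function names, not the prediction scheme used in MPEG-2 AAC.

```python
def predict(history):
    """First-order predictor: expect the next sample to equal the previous one."""
    return history[-1] if history else 0


def encode(samples):
    """Store only the difference between each sample and its prediction."""
    residuals, history = [], []
    for sample in samples:
        residuals.append(sample - predict(history))
        history.append(sample)
    return residuals


def decode(residuals):
    """Rebuild each sample from the same prediction plus the stored difference."""
    samples = []
    for residual in residuals:
        samples.append(predict(samples) + residual)
    return samples


original = [100, 102, 105, 107, 106]
residuals = encode(original)
print(residuals)          # [100, 2, 3, 2, -1]
print(decode(residuals))  # [100, 102, 105, 107, 106]
```

For a smooth signal the residuals are much smaller than the samples, so they need fewer bits; the decoder stays in step because it runs the same predictor on the samples it has already rebuilt.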
Additional Encoding Techniques
   Coupling / Delta encoding
     Used in cases where the audio signal consists of two or
      more channels (stereo or surround sound)
     Similarities between channels are used for
      compression
     The sum and difference of two channels are
      derived; the difference is usually close to zero
      and therefore requires less space to encode
     This is a lossless encoding process
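
The sum/difference step can be sketched on integer samples as follows. The function names and sample values are made up for illustration.

```python
def to_sum_diff(left, right):
    """Replace (left, right) with (left + right, left - right)."""
    total = [l + r for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]
    return total, diff


def from_sum_diff(total, diff):
    """Exact inverse: sum + diff = 2 * left and sum - diff = 2 * right."""
    left = [(t + d) // 2 for t, d in zip(total, diff)]
    right = [(t - d) // 2 for t, d in zip(total, diff)]
    return left, right


left = [1000, 1002, -500]
right = [998, 1003, -498]
total, diff = to_sum_diff(left, right)
print(diff)                        # [2, -1, -2]
print(from_sum_diff(total, diff))  # ([1000, 1002, -500], [998, 1003, -498])
```

When the two channels are similar, the difference channel stays near zero and can be coded with few bits, while the integer arithmetic keeps the round trip exact.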



Additional Encoding Techniques
   Huffman Coding
     Information-theory-based technique
     A signal element that occurs often is represented by a
      shorter code, and the mapping between codes and
      values is stored in a look-up table
     Implemented using look-up tables in the encoder and in
      the decoder
     Provides lossless compression; with precomputed
      look-up tables the computational cost of encoding
      and decoding is modest
     Used by MPEG-1 Layer III and MPEG-2 AAC
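
A minimal sketch of building and using a Huffman look-up table for a list of quantized values. It builds the table from the data itself, whereas MPEG codecs use predefined tables; the function names are made up for illustration.

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Map each symbol to a bit string; frequent symbols get shorter codes."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie-breaker, partial code table for that subtree).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        count_a, _, table_a = heapq.heappop(heap)
        count_b, _, table_b = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in table_a.items()}
        merged.update({sym: "1" + code for sym, code in table_b.items()})
        heapq.heappush(heap, (count_a + count_b, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, table):
    return "".join(table[sym] for sym in symbols)


def decode(bits, table):
    reverse = {code: sym for sym, code in table.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # codes are prefix-free, so the first match is correct
            symbols.append(reverse[current])
            current = ""
    return symbols


data = [0, 0, 0, 0, 0, 1, 1, -1, 2]
table = huffman_table(data)
bits = encode(data, table)
print(table[0], len(bits))  # the most frequent value gets a 1-bit code; 15 bits in total
print(decode(bits, table) == data)  # True
```

A fixed-length code for these four distinct values would need 2 bits per value, or 18 bits in total, so the variable-length code saves space without losing information.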

Encoding - Final Stages
 Audio data packed into frames
 Frames stored or transmitted




Questions





Editor's notes

  1. Hello. Today I will talk about the techniques commonly used for digital audio compression in various audio file formats.
  2. - I will discuss the difference between redundant and irrelevant information further in my presentation. - Depending on storage or transmission, there is an optimization in size