Director, eSILICON LABS, INDIA
VIDEO CODEC
Vinayagam M
Next Generation Broadcasting Technology
2
3
Agenda
HVS
Images / Video
Video / Image Compression
Image Coding
Video Coding
Video Coder Architecture
Video Codec Standards
HEVC
4
HVS
5
HVS
• HVS properties influence the
design/tradeoffs of imaging/video
systems
• Basic properties of HVS “front-end”
– 4 types of photo-receptors in the retina
– Rods, 3 types of cones
• Rods
– Achromatic (no concept of color)
– Used for scotopic vision (low light levels)
– Concentrated in periphery
• Cones
– 3 types: S - Short, M- Medium, L - Long
– Red, Green, and Blue peaks
– Used for Photopic Vision (daylight levels)
– Concentrated in fovea (center of the
retina)
6
HVS…
• Eyes, optic nerve, parts of the brain
• Transforms electromagnetic energy (light) into neural signals
• Image Formation
– Cornea, Sclera, Pupil, Iris, Lens, Retina, Fovea
• Transduction
– Retina, Rods, and Cones
• Processing
– Optic Nerve, Brain
• Retina and Fovea
– Retina has photosensitive receptors at back of eye
– Fovea is small, dense region of receptors
Only cones (no rods)
Gives visual acuity
– Outside Fovea
Fewer receptors overall
Larger proportion of rods
7
HVS…
• Transduction (Retina)
– Transform light to neural impulses
– Receptors signal bipolar cells
– Bipolar cells signal ganglion cells
– Axons of the ganglion cells form the optic
nerve
• Image Formation in the Human Eye
8
HVS…
• HVS Properties
– Tradeoff in resolution between space and time
Low resolution for high spatial AND high temporal frequencies
However, eye tracking can convert fast-moving object into low retinal frequency
– Achromatic versus chromatic channels
Achromatic channel has highest spatial resolution
Yellow/Blue has lower spatial resolution than Red/Green channel
– Color refers to how we perceive a narrow band of electromagnetic energy
Source, Object, Observer
9
HVS…
• Visual System
– Visual system transforms light energy into sensory experience of sight
10
HVS…
• Color Perception (Color Theory)
– Hue
Distinguishes named colors, e.g., RGB
Dominant wavelength of the light
– Saturation
Perceived intensity of a specific color
How far color is from a gray of equal intensity
– Brightness (lightness)
Perceived intensity
(Figure: hue scale, saturation, and lightness)
11
HVS…
• Visual Perception
– Resolution and Brightness
– Spatial Resolution depends on
Image Size
Viewing Distance
– Brightness
Sensitivity to brightness is higher than sensitivity to color
Different perception of the primary colors
Relative brightness: green : red : blue = 59% : 30% : 11%
– B/W vs. Color
12
HVS…
• Visual Perception
– Temporal Resolution
Effects caused by the persistence of vision (inertia) of the human eye
Perception of about 16 frames/second as a continuous sequence
Special Effect: Flicker
Flicker
Perceived if the frame rate or refresh rate of the screen is too low (< 50 Hz)
Especially in large bright areas
Higher refresh rate requires
Higher scanning frequency
Higher bandwidth
13
HVS…
• Visual Perception Influence
– Viewing distance
– Display ratio (width/height – 4/3 for conventional TV)
– Number of details still visible
– Intensity (luminance)
14
HVS…
• Imaging / Visual System designed
based on HVS principles
• Example
– Image Sensor
– Television
– Image / Video Display
• Image Sensor
– CCD (charge coupled device):
Arrays of photo diodes
Linearity
Less light needed
Electronic shuttering
– CMOS
Cheaper
Easy manufacturing
• Television
– NTSC (National Television System
Committee):
60 Hz, 30 fps, 525 scan lines
North America, Japan, Korea ….
– PAL (Phase Alternating Line):
50 Hz, 25 fps, 625 scan lines
Europe …
• Image / Video Display
– CRT Monitor
– LCD TV/Display Monitor
15
IMAGE / VIDEO
16
IMAGE / VIDEO
• Images
– A view observed by the HVS at a time instant
– A multidimensional array of numbers (such as intensity image) or vectors
(such as color image)
Each component of the image is called a pixel and is
associated with a pixel value (a single
number in the case of intensity images, or
a vector in the case of color images)
(Example pixel-value arrays: a single 4×4 array for an intensity image and three 4×4 arrays for the R, G, and B components of a color image)
17
IMAGE / VIDEO…
• Video
– Series of Frames (or Images)
18
IMAGE / VIDEO…
• Images / Video Frame
– A multidimensional function of spatial coordinates
– Spatial Coordinate
(x,y) for 2D case such as photograph,
(x,y,z) for 3D case such as CT scan images
(x,y,t) for movies
– The function f may represent intensity (for monochrome images) or color
(for color images) or other associated values
(Figure: image “After snow storm” as a function f(x,y), with origin and x, y axes)
19
IMAGE / VIDEO…
• Images / Video Frame
– An image that has been discretized both in Spatial coordinates and
associated value
Consist of 2 sets:(1) a point set and (2) a value set
Can be represented in the form
– I = {(x, a(x)) : x ∈ X, a(x) ∈ F}
where X and F are a point set and value set, respectively
An element of the image, (x,a(x)) is called a pixel
where
x is called the pixel location and
a(x) is the pixel value at the location x
– Conventional Coordinate for Image Representation
20
IMAGE / VIDEO…
• Images / Video Frame Representation
– Basic Unit : Pixel
– Dimensions
Height
Width
– Frame rate determines how long each pixel
value persists, i.e. how motion is rendered over time
– Color Depth of the pixel
How many bits are used to represent the color of
each pixel?
21
IMAGE / VIDEO…
• Image Type
– Binary Image
– Intensity Image
– Color Image
– Index image
22
IMAGE / VIDEO…
• Binary Image
– Binary image or black and white image
– Each pixel contains one bit
1 represents white
0 represents black
Binary data (4×4 example):
1 1 1 1
1 1 1 1
0 0 0 0
0 0 0 0
23
IMAGE / VIDEO…
• Intensity Image
– Intensity / Monochrome/ Gray Scale Image
– Each pixel corresponds to light intensity normally represented in gray
scale (gray level)
(Example: a 4×4 array of gray-scale values)
24
IMAGE / VIDEO…
• Color Image
– Each pixel contains a vector representing red, green and blue components
(Example: three 4×4 arrays holding the R, G, and B components of a color image)
25
IMAGE / VIDEO…
• Index Image
– Each pixel contains index number pointing to a color in a color table
(Example: a 4×4 array of index values)
Color Table:
Index No. | Red component | Green component | Blue component
1 | 0.1 | 0.5 | 0.3
2 | 1.0 | 0.0 | 0.0
3 | 0.0 | 1.0 | 0.0
4 | 0.5 | 0.5 | 0.5
5 | 0.2 | 0.8 | 0.9
… | … | … | …
26
IMAGE / VIDEO…
• Colourspace Representations
– RGB (Red, Green, Blue) – Basic analog components (from camera/to TV)
– YPbPr (Y, B−Y, R−Y) – ANALOG Colourspace (derived from RGB)
Y = Luminance, Pb = scaled B−Y, Pr = scaled R−Y
– YUV – Colour difference signals scaled to be modulated on a composite
carrier
– YIQ – Used in NTSC. I = In-phase, Q = Quadrature (the IQ plane is a 33° rotation
of the UV plane)
– YCbCr/YCC – DIGITAL representation of the YPbPr Colourspace (8-bit, 2's
complement)
27
IMAGE / VIDEO…
• RGB Color
– All colors can be composed by adding specific amounts of R, G, and B
– 8 bits (2^8 = 256 levels) specify the amount of each color
– This is the scheme used by most electronic displays to generate color;
e.g. we often call our computer monitors "RGB displays"
8-bits Red
8-bits Green
8-bits Blue
28
IMAGE / VIDEO…
• Color Reduction
– Human eye is not as sensitive to color as it is to Luminance
– To save costs, the various standards therefore decided to
Maintain luminance information in our images, but reduce color information
Using RGB, though, how do we easily reduce color information without
removing luminance?
For this, and other technical reasons, a separate color space was chosen by
most video standards …
29
IMAGE / VIDEO…
• Colour Image: RGB
• YCbCr
– Even though most displays actually
use RGB to create the image, YCbCr
is used most often in consumer
electronics for transmission of the
image
– Historically, B/W televisions
transmitted only luminance (Y)
– The color signals were added later
30
IMAGE / VIDEO…
• YCbCr Generated By Sub sampling
– YUV 4:4:4 = 8bits per Y,U,V channel (no downsampling the chroma
channels)
– YUV 4:2:2 = 4 Y pixels sampled for every 2 U and 2 V (2:1 horizontal
downsampling, no vertical downsampling)
– YUV 4:2:0 = 2:1 horizontal downsampling, 2:1 vertical downsampling
– YUV 4:1:1 = 4 Y pixels sampled for every 1 U and 1 V (4:1 horizontal
downsampling, no vertical downsampling)
• YUV 4:4:4
Y Y Y Y
Y Y Y Y
4:4:4 Format (3 bytes/pixel):
Cb Cr Cb Cr Cb Cr Cb Cr
Cb Cr Cb Cr Cb Cr Cb Cr
31
IMAGE / VIDEO…
• YUV 4:2:2 – 4:2:2 Format (2 bytes/pixel):
Y Y Y Y
Y Y Y Y
Cb Cr Cb Cr
Cb Cr Cb Cr
• YUV 4:2:0 – 4:2:0 Format (1.5 bytes/pixel):
Y Y Y Y
Y Y Y Y
Cb Cr Cb Cr
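A minimal Python sketch of 4:2:0 chroma subsampling (illustrative, not from the slides): the Y plane is kept at full resolution while each 2×2 block of the U and V planes is averaged; real codecs may use different filter taps and phases.

# Sketch of 4:2:0 chroma subsampling: keep full-resolution Y and average
# each 2x2 block of the U and V planes (one of several possible filters).
import numpy as np

def subsample_420(y, u, v):
    """Return Y unchanged and U, V downsampled 2:1 horizontally and vertically."""
    def down2x2(c):
        h, w = c.shape[0] // 2 * 2, c.shape[1] // 2 * 2   # crop to even size
        c = c[:h, :w]
        return (c[0::2, 0::2] + c[0::2, 1::2] +
                c[1::2, 0::2] + c[1::2, 1::2]) / 4.0
    return y, down2x2(u), down2x2(v)

y = np.random.rand(4, 4); u = np.random.rand(4, 4); v = np.random.rand(4, 4)
y, u, v = subsample_420(y, u, v)
print(u.shape)   # (2, 2): 1.5 bytes/pixel overall for 8-bit data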
32
IMAGE / VIDEO…
• Up sampling
• Downsampling
(Figure: upsampling inserts zeros between the input samples F(nT) and applies an interpolating low-pass filter to produce F(nT/2); downsampling applies a decimating low-pass filter, which prevents aliasing at the lower rate, before keeping every second sample to produce F(2nT))
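As an illustration of the filtering described above, the following Python sketch uses SciPy's polyphase resampler, which applies the interpolating / anti-aliasing low-pass filter internally (the specific filter is this sketch's assumption; the slides do not prescribe one).

# Sketch of the up/downsampling idea with SciPy's polyphase resampler.
import numpy as np
from scipy.signal import resample_poly

n = np.arange(32)
x = np.sin(2 * np.pi * 0.05 * n)        # input signal F(nT)

x_up = resample_poly(x, up=2, down=1)   # upsample by 2: F(nT/2)
x_dn = resample_poly(x, up=1, down=2)   # downsample by 2: F(2nT), alias-protected

print(len(x), len(x_up), len(x_dn))     # 32, 64, 16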
33
IMAGE / VIDEO…
• RGB to YCbCr
• RGB to YUV Conversion
– Y = 0.299R + 0.587G + 0.114B
– U = (B − Y) × 0.565
– V = (R − Y) × 0.713
(Figure: U–V plane at Y = 0.5)
Clamp the output: Y = [16, 235], U, V = [16, 240]
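A small Python/NumPy sketch of the conversion above (illustrative; the clamping step is omitted):

# RGB -> YUV conversion using the weights given above, with NumPy.
import numpy as np

def rgb_to_yuv(rgb):
    """Convert an HxWx3 uint8 RGB image to Y, U, V planes (float)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = (b - y) * 0.565                     # scaled blue difference
    v = (r - y) * 0.713                     # scaled red difference
    return y, u, v

# Example: a single orange pixel
y, u, v = rgb_to_yuv(np.array([[[255, 128, 0]]], dtype=np.uint8))
print(y[0, 0], u[0, 0], v[0, 0])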
34
VIDEO / IMAGE COMPRESSION
35
VIDEO/IMAGE COMPRESSION
• How can we use fewer bits?
• To understand how image/audio/video signals are compressed to
save storage and increase transmission efficiency
• Reduces signal size by taking advantage of correlation
– Spatial
– Temporal
– Spectral
36
VIDEO/IMAGE COMPRESSION…
• Compression Methods
• Need to take advantage of redundancy
– Images
Space
Frequency
– Video
Space
Frequency
Time
Compression Methods
– Lossless
Statistical: Huffman, Arithmetic
Universal: Lempel-Ziv
– Lossy
Model-Based: Linear Predictive, AutoRegressive, Polynomial Fitting
Waveform-Based
Spatial/Time-Domain
Frequency-Domain
Transform-Based: Fourier, DCT
Filter-Based: Subband, Wavelet
37
VIDEO/IMAGE COMPRESSION…
• Need to take advantage of redundancy
(Figure: coding pipeline — RGB → YCbCr → blocks / macroblocks; I, B, P frames with motion compensation remove temporal redundancy; transform, quantization and coding remove spatial redundancy and produce the bitstream, e.g. 01100010101…)
38
VIDEO/IMAGE COMPRESSION…
• Spatial Redundancy
– Take advantage of similarity among most neighboring
pixels
• RGB to YUV
– Less information required for YUV (humans less sensitive
to chrominance)
• Macro Blocks
– Take groups of pixels (16x16)
• Discrete Cosine Transformation (DCT)
– Based on Fourier analysis, representing the signal as a sum
of sines and cosines
– Concentrates the signal energy into a small fraction of (mostly low-frequency)
coefficients
– Represents pixels in blocks with fewer numbers
• Quantization
– Reduce data required for coefficients
• Entropy coding
– Compress
39
VIDEO/IMAGE COMPRESSION…
• Spatial Redundancy Reduction
(Figure: spatial redundancy reduction — quantization (major reduction, controls ‘quality’) followed by zig-zag scan and run-length coding produce the “intra-frame encoded” data)
40
VIDEO/IMAGE COMPRESSION…
• When may spatial redundancy elimination be ineffective?
– High-resolution images and displays
– May appear ‘coarse’
• What kinds of images/movies?
– A varied image or ‘busy’ scene
– Many colors, few adjacent
Original (63 kB), Low (7 kB), Very Low (4 kB) – artifacts due to loss of resolution
Solution? Temporal Redundancy Reduction
41
VIDEO/IMAGE COMPRESSION…
• Temporal Redundancy Reduction
– Take advantage of similarity between successive frames
(Example: consecutive frames 950, 951, 952)
42
VIDEO/IMAGE COMPRESSION…
• Temporal Redundancy Reduction
– Take advantage of similarity between successive frames
43
VIDEO/IMAGE COMPRESSION…
• Temporal Redundancy Reduction
– Take advantage of similarity between successive frames
44
VIDEO/IMAGE COMPRESSION…
When may temporal redundancy
reduction be ineffective?
45
VIDEO/IMAGE COMPRESSION…
• Many scene changes vs. few scene changes
• Sometimes high motion
46
VIDEO/IMAGE COMPRESSION…
• Many scene changes vs. few scene changes
• Sometimes high motion
47
IMAGE CODING
48
IMAGE CODING
• Lossless Compression
• Lossy Compression
• Transform Coding
49
IMAGE CODING…
• Image compression system is composed of three key building blocks
– Representation
Concentrates important information into a few parameters
– Quantization
Discretizes parameters
– Binary encoding
Exploits non-uniform statistics of quantized parameters
Creates bitstream for transmission
51
IMAGE CODING…
• Generally, the only operation that is lossy is the quantization stage
• The fact that all the loss (distortion) is localized to a single operation
greatly simplifies system design
• Can design loss to exploit human visual system (HVS) properties
• Source decoder performs the inverse of each of the three operations
52
IMAGE CODING…
• Representations - Transform and Subband Filtering Methods
– Goal
Transform signal into another domain where most of the information (energy) is
concentrated into only a small fraction of the coefficients
– Enables perceptual processing
Exploiting HVS response to different frequency components
53
IMAGE CODING…
• Representations - Transform and Subband Filtering Methods
– Examples of “traditional” transforms
KLT, DFT, DCT
– Examples of “traditional” Subband filtering methods
Perfect reconstruction filter banks, wavelets
– Transform and Subband interpretations
All of the above are linear representations and can be interpreted from either a
transform or a Subband filtering viewpoint
– Transform viewpoint
Express signal as a linear combination of basis vectors
Stresses linear expansion (linear algebra) perspective
– Subband filtering viewpoint
Pass signal through a set of filters and examine the frequencies passed by
each filter (Subband)
Stresses filtering (signal processing) perspective
54
IMAGE CODING…
• Representations – Transform Image Coding
– A good transform provides
Most of the image energy is concentrated into a small fraction of the
coefficients
Coding only this small fraction of the coefficients and discarding the rest can
often lead to excellent reconstructed quality
The more energy compaction the better
– Orthogonal transforms are particularly useful
Energy in discarded coefficients is equal to energy in reconstruction error
55
IMAGE CODING…
• Representations – Transform Image Coding
– Karhunen-Loeve Transform (KLT)
Optimal energy compaction
Requires knowledge of signal covariance
In general, no simple computational algorithm
– Discrete Fourier Transform (DFT)
Fast algorithms
Good energy compaction, but not as good as DCT
– Discrete Cosine Transform (DCT)
Fast algorithms
Good energy compaction
All real coefficients
Overall good performance and widely used for image and video coding
56
IMAGE CODING…
• Discrete Cosine Transform (DCT)
– 1-D Discrete Cosine Transform (N-point)
– 1-D DCT basis vectors
– 2-D DCT: Separable transform of 1-D DCT
– 2-D DCT basis vectors?
Basis pictures!
– 2-D basis vectors for 2-D DCT are basis pictures!
– 64 basis pictures for 8x8-pixel 2-D DCT
– Image coding with the 2-D DCT is equivalent to approximating the image
as a linear combination of these basis pictures!
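A short Python sketch of the 2-D DCT on an 8×8 block using SciPy's separable type-II DCT; it only illustrates that the block is expressed as a combination of the 64 basis pictures and that the transform itself is invertible.

# Illustrative 8x8 2-D DCT using SciPy (orthonormal scaling).
import numpy as np
from scipy.fft import dctn, idctn

block = np.arange(64, dtype=np.float64).reshape(8, 8)   # toy 8x8 block
coeffs = dctn(block, norm='ortho')                      # forward 2-D DCT
recon = idctn(coeffs, norm='ortho')                     # inverse 2-D DCT

print(np.allclose(block, recon))        # True: the transform alone is lossless
print(coeffs[0, 0])                     # DC coefficient carries most energy here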
57
IMAGE CODING…
• Representations – Coding Transform Coefficients
– Selecting the basis pictures to approximate an image is equivalent to
selecting the DCT coefficients to code
– General methods of coding/discarding coefficients
Zonal Coding
▫ Code all coefficients in a zone and discard others
▫ Example zone: Spatial low frequencies
▫ Only need to code coefficient amplitudes
Threshold Coding
▫ Keep coefficients with magnitude above a threshold
▫ Coefficient amplitudes and locations must be coded
▫ Provides best performance
58
IMAGE CODING…
• Video / Image Coding is Block-based
Coding
– Frames are divided into Sub-Block
and then coded
• Macroblock (MB) and Block Layer
– Process the data in blocks of 8x8
samples
– Convert Red-Green-Blue into
Luminance (greyscale) and
Chrominance (Blue color difference
and Red color difference)
– Use half resolution for Chrominance
(because eye is more sensitive to
greyscale than to color)
59
IMAGE CODING…
• Macroblock (MB) and Block Layer
– Macroblock
Consists of
16x16 luminance block
8x8 chrominance block
Basic unit for motion estimation
– Block
8 pixels by 8 lines
Basic unit for DCT
60
IMAGE CODING…
• Lossless Compression
– General-Purpose Compression: Entropy Encoding
– Remove statistical redundancy from data
– i.e., encode common values with short codes and uncommon values with
longer codes
• Lossless Compression
– Huffman Coding
– Example : ABCCDEAAB
After compression: 1011000000001010111011
– Compression ratio depends on the probability of each character appearing
in the uncompressed data
(Huffman tree example: symbol frequencies A:45, B:16, C:12, D:13, E:9, F:5; merged node weights 14, 25, 30, 55, 100; resulting codes C=000, D=001, F=0100, E=0101, B=011, A=1)
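For illustration, a minimal Huffman coder in Python built on heapq, applied to the frequency table above; tie-breaking may produce a different (but equally optimal, same code lengths) assignment than the tree shown.

# Minimal Huffman coder sketch using heapq.
import heapq

def huffman_codes(freqs):
    """Build a prefix code from a {symbol: frequency} table."""
    heap = [[w, i, {s: ''}] for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {next(iter(freqs)): '0'}
    while len(heap) > 1:
        lo = heapq.heappop(heap)                       # two least-frequent nodes
        hi = heapq.heappop(heap)
        codes = {s: '0' + c for s, c in lo[2].items()} # prefix with 0 / 1
        codes.update({s: '1' + c for s, c in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], lo[1], codes])
    return heap[0][2]

freqs = {'A': 45, 'B': 16, 'C': 12, 'D': 13, 'E': 9, 'F': 5}
codes = huffman_codes(freqs)
encoded = ''.join(codes[s] for s in "ABCCDEAAB")       # 22 bits, as in the slide
print(codes, encoded, sep='\n')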
61
IMAGE CODING…
• Lossless Compression
– Run-Length Coding
Reduce the number of samples to code
Implementation is simple
Input Sequence
0,0,-3,5,1,0,-2,0,0,0,0,2,-4,3,-2,0,0,0,1,0,0,-2,EOB
Run-Length Sequence
(2,-3)(0,5)(0,1)(1,-2)(4,2)(0,-4)(0,3)(0,-2)(3,1)(2,-2)EOB
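A small Python sketch of the (run, level) coding shown above, reproducing the example sequence:

# Each pair records the number of zeros preceding a non-zero value;
# trailing zeros are collapsed into EOB (end of block).
def run_length_encode(seq):
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append('EOB')
    return pairs

seq = [0, 0, -3, 5, 1, 0, -2, 0, 0, 0, 0, 2, -4, 3, -2, 0, 0, 0, 1, 0, 0, -2]
print(run_length_encode(seq))
# [(2, -3), (0, 5), (0, 1), (1, -2), (4, 2), (0, -4), (0, 3), (0, -2), (3, 1), (2, -2), 'EOB']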
62
IMAGE CODING…
• Lossless Compression
– Transform Coding
Basis vectors: { (1,0), (0,1) }
New basis vectors: { (1,1), (−1,1) }
(0.4, 1.4) = 0.4·(1,0) + 1.4·(0,1)
= 0.9·(1,1) + 0.5·(−1,1)
New coefficient vector in the new basis: (0.9, 0.5)
63
IMAGE CODING…
• Lossless Compression
– Transform Coding : DCT
Transform blocks of images to frequency domain, code only the significant
transform coefficients
2D DCT
– Transform Coding : DCT
8x8 DCT Basis Function
64
IMAGE CODING…
• Lossless Compression
– Transform Coding : DCT
2D DCT Coefficients
65
IMAGE CODING…
• Lossy Compression
– Lossy Predictive Coding
66
IMAGE CODING…
• Lossy Compression
– Quantization
Many to one mapping
Quantization is the most important means of irrelevancy reduction
– Implementation
Lookup Table
Divide by quantization step-size (round/truncate)
67
IMAGE CODING…
• Lossy Compression
– Divide by quantization step-size
Input signal: 0 1 2 3 4 5 6 7 (3 bits)
Step-size: 2
Quantization: 0 0 1 1 2 2 3 3 (2 bits)
Inverse quantization: 0 0 2 2 4 4 6 6
Quantization errors: 0 1 0 1 0 1 0 1
– Lookup Table
Divide each DCT coefficient by an integer, discard remainder
Result: loss of precision
Typically, a few non-zero coefficients are left
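A minimal Python sketch of the divide-by-step-size quantizer shown above:

# Uniform quantization by step-size (truncating division) and reconstruction.
import numpy as np

def quantize(x, step):
    return np.floor_divide(x, step)        # forward: many-to-one mapping

def dequantize(q, step):
    return q * step                        # inverse: back to the level grid

x = np.arange(8)                           # input: 0..7 (3 bits)
q = quantize(x, 2)                         # 0 0 1 1 2 2 3 3 (2 bits)
xr = dequantize(q, 2)                      # 0 0 2 2 4 4 6 6
print(q, xr, x - xr)                       # errors: 0 1 0 1 0 1 0 1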
68
IMAGE CODING…
• Lossy Compression
– Zigzag Scan
Efficient encoding of the position
of non-zero transform
coefficients
“Scan” quantized coefficients in a
zig-zag order
Non-zero coefficients tend to be
grouped together
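An illustrative Python sketch that generates the 8×8 zig-zag scan order by sorting positions along anti-diagonals and uses it to linearize a block of quantized coefficients:

# Generate the zig-zag visiting order, then scan a block with it.
import numpy as np

def zigzag_order(n=8):
    """Return the (row, col) visiting order of an n x n zig-zag scan."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag_scan(block):
    return [block[r, c] for r, c in zigzag_order(block.shape[0])]

block = np.zeros((8, 8), dtype=int)
block[0, 0], block[0, 1], block[1, 0] = 31, 5, -3   # a few non-zero coefficients
print(zigzag_scan(block)[:6])                       # non-zeros grouped at the start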
69
IMAGE CODING…
• DCT + Quantization + Run-Level-Coding
70
VIDEO CODING
71
VIDEO CODING
• Lossless Compression
• Lossy Compression
• Transform Coding
• Motion Coding
72
VIDEO CODING…
• Video
– Sequence of frames (images) that are related
• Moving images contain significant temporal redundancy
– Successive frames are very similar
– Related along the temporal dimension - Temporal redundancy exists
73
VIDEO CODING…
• Video Coding
– The objective of video coding is to compress moving images
– Main addition over image compression
Temporal redundancy
Video coder must exploit the temporal redundancy
– The MPEG (Moving Picture Experts Group) and H.26X are the major
standards for video coding
• Video coding algorithms usually contain two coding schemes :
– Intraframe coding
Intraframe coding does not exploit the correlation among adjacent
frames
Intraframe coding therefore is similar to the still image coding
– Interframe coding
The interframe coding should include motion estimation/compensation process
to remove temporal redundancy
• Basic Concept
– Use interframe correlation for attaining better rate distortion
74
VIDEO CODING…
• Usually high frame rate: Significant temporal redundancy
• Possible representations along temporal dimension
– Transform/Subband Methods
Good for textbook case of constant velocity uniform global motion
Inefficient for nonuniform motion, i.e., real-world motion
Requires large number of frame stores
Leads to delay (Memory cost may also be an issue)
– Predictive Methods
Good performance using only 2 frame stores
However, simple frame differencing is not enough
75
VIDEO CODING…
• Main addition over image compression
– Exploit the temporal redundancy
• Predict current frame based on previously coded frames
• Types of coded frames
– I-frame
Intra-coded frame, coded independently of all other frames
– P-frame
Predictively coded frame, coded based on previously coded frame
– B-frame
Bi-directionally predicted frame, coded based on both previous and future coded
frames
76
VIDEO CODING…
• Motion-Compensated Prediction
– Simple frame differencing fails when there is motion
– Must account for motion
Motion-compensated (MC) prediction
– MC-prediction generally provides significant improvements
– Questions
How can we estimate motion?
How can we form MC-prediction?
• Motion Estimation
– Ideal Situation
Partition video into moving objects
Describe object motion
Generally very difficult
– Practical approach: Block-Matching Motion Estimation
Partition each frame into blocks
Describe motion of each block
No object identification required
Good, robust performance
77
VIDEO CODING…
• Block-Matching Motion Estimation
– Assumptions
Translational motion within block
All pixels within each block have the same motion
– ME Algorithm
Divide current frame into non-overlapping N1xN2 blocks
For each block, find the best matching block in reference frame
– MC-Prediction Algorithm
Use best matching blocks of reference frame as prediction of blocks in current
frame
78
VIDEO CODING…
• Block-Matching - Determining the Best Matching Block
– For each block in the current frame search for best matching block in the
reference frame
Metrics for determining “best match”
Candidate blocks: All blocks in, e.g., (± 32,±32) pixel area
Strategies for searching candidate blocks for best match
Full search: Examine all candidate blocks
Partial (fast) search: Examine a carefully selected subset
– Estimate of motion for best matching block: “motion vector”
• Motion Vectors and Motion Vector Field
– Motion Vector
Expresses the relative horizontal and vertical offsets (mv1,mv2), or motion, of
a given block from one frame to another
Each block has its own motion vector
– Motion Vector Field
Collection of motion vectors for all the blocks in a frame
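A minimal Python sketch of full-search block matching using SAD (sum of absolute differences) as the matching metric; MSE or MAD could be substituted. The synthetic frames and the 16×16 block size are illustrative assumptions.

# Exhaustive block-matching over a +/- search window.
import numpy as np

def full_search(cur, ref, top, left, n=16, search=7):
    """Find the motion vector of the n x n block at (top, left) in `cur`."""
    block = cur[top:top + n, left:left + n].astype(np.int64)
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + n > ref.shape[0] or c + n > ref.shape[1]:
                continue                             # candidate outside the frame
            cand = ref[r:r + n, c:c + n].astype(np.int64)
            cost = np.abs(block - cand).sum()        # SAD
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost

ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))       # synthetic global motion
print(full_search(cur, ref, 16, 16))                 # expect MV (-2, 3), cost 0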
79
VIDEO CODING…
• Example of Fast Search: 3-Step
(Log) Search
– Goal: Reduce number of search
points
Example: (±7, ±7) search area
Dots represent search points
Search performed in 3 steps
(coarse-to-fine)
– Step 1: (±4 pixels)
– Step 2: (±2 pixels)
– Step 3: (±1 pixel)
– Best match is found at each step
– Next step: Search is centered
around the best match of prior step
– Speedup increases for larger
search areas
80
VIDEO CODING…
• Motion Vector Precision
– Motivation
Motion is not limited to integer-pixel offsets
However, video only known at discrete pixel locations
To estimate sub-pixel motion, frames must be spatially interpolated
– Fractional MVs are used to represent the sub-pixel motion
– Improved performance (extra complexity is worthwhile)
– Half-pixel ME used in most standards: MPEG-1/2/4
– Why are half-pixel motion vectors better?
Can capture half-pixel motion
Averaging effect (from spatial interpolation) reduces prediction error ->
Improved prediction
For noisy sequences, averaging effect reduces noise -> Improved
compression
81
VIDEO CODING…
• Practical Half-Pixel Motion Estimation Algorithm
– Half-Pixel ME (coarse-fine) Algorithm
Coarse Step: Perform integer motion estimation on blocks; find best integer-
pixel MV
Fine Step: Refine estimate to find best half-pixel MV
Spatially interpolate the selected region in reference frame
Compare current block to interpolated reference frame block
Choose the integer or half-pixel offset that provides best match
Typically, bilinear interpolation is used for spatial interpolation
• Example
– MC-Prediction for Two Consecutive Frames
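A small Python sketch of the bilinear interpolation used for half-pel refinement: the reference block is upsampled by 2 so that half-pixel offsets become integer offsets on the interpolated grid (exact codec filters and rounding are not reproduced).

# Bilinear 2x upsampling of a reference block onto a half-pel grid.
import numpy as np

def bilinear_upsample_2x(block):
    """Return a (2H-1) x (2W-1) half-pel grid interpolated from `block`."""
    b = block.astype(np.float64)
    h, w = b.shape
    out = np.zeros((2 * h - 1, 2 * w - 1))
    out[0::2, 0::2] = b                                   # integer positions
    out[0::2, 1::2] = (b[:, :-1] + b[:, 1:]) / 2          # horizontal half-pels
    out[1::2, 0::2] = (b[:-1, :] + b[1:, :]) / 2          # vertical half-pels
    out[1::2, 1::2] = (b[:-1, :-1] + b[:-1, 1:] +
                       b[1:, :-1] + b[1:, 1:]) / 4        # diagonal half-pels
    return out

print(bilinear_upsample_2x(np.array([[0, 4], [8, 12]])))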
82
VIDEO CODING…
• Bi-Directional MC-Prediction
– Bi-Directional MC-Prediction is used to estimate a block in the current
frame from a block in
Previous frame
Future frame
Average of a block from the previous frame and a block from the future frame
– Motion compensated prediction
Predict the current frame based on reference frame(s) while compensating for
the motion
– Examples of block-based motion-compensated prediction (P-frame)
and bi-directional prediction (B-frame)
83
VIDEO CODING…
• Motion Estimation and Compensation
– The amount of data to be coded can be reduced significantly if the
previous frame is subtracted from the current frame
84
VIDEO CODING…
• Motion Estimation and Compensation
– Uses Block-Matching
The MPEG and H.26X standards use block-matching technique for motion
estimation /compensation
In the block-matching technique, each current frame is divided into equal-size
blocks, called source blocks
Each source block is associated with a search region in the reference frame
The objective of block-matching is to find a candidate block in the search
region best matched to the source block
The relative distances between a source block and its candidate blocks are
called motion vectors
(Figure: video sequence showing the reconstructed reference frame and the current frame; X: source block for block-matching, Bx: search area associated with X, MV: motion vector)
85
VIDEO CODING…
• Motion Estimation and Compensation
– Uses Block-Matching
86
VIDEO CODING…
• Motion Estimation and Compensation
(Figure: the reconstructed previous frame, the current frame, the results of block-matching, and the predicted current frame)
87
VIDEO CODING…
• Motion Estimation and Compensation
– Search Range
– The size of the search range = (N1 + 2·dx,max) × (N2 + 2·dy,max)
– The number of candidate blocks = (2·dx,max + 1) × (2·dy,max + 1)
88
VIDEO CODING…
• Motion Estimation and Compensation
– Motion Vector and Search Area
Search area: (n + 2p) × (n + 2p)
Motion vector: (u, v)
89
VIDEO CODING…
• Motion Estimation and Compensation
– Matching Function
Mean square error(MSE)
Mean absolute difference(MAD)
Number of threshold difference(NTD)
Normalized cross-correlation function(NCF)
MSE(d1, d2) = (1 / (N1·N2)) · Σ_{n1=0..N1−1} Σ_{n2=0..N2−1} [ f(n1, n2, t) − f(n1 − d1, n2 − d2, t − 1) ]²
MAD(d1, d2) = (1 / (N1·N2)) · Σ_{n1=0..N1−1} Σ_{n2=0..N2−1} | f(n1, n2, t) − f(n1 − d1, n2 − d2, t − 1) |
90
VIDEO CODING…
• Motion Estimation and Compensation
– Algorithm
Full search block matching (FSB)
Fast algorithm
▫ 2D Logarithmic Search (TDL)
▫ Three Step Search (TSS)
▫ Cross-Search Algorithm (CSA)
▫ …
– Full Search Algorithm
If p=7, then there are
(2p+1)(2p+1)=225 candidate blocks.
(Figure: search area and candidate block, with block displacement (u, v))
91
VIDEO CODING…
• Motion Estimation and Compensation
– Full Search Algorithm
Intensive computation
Need for fast Motion Estimation !
92
VIDEO CODING…
• Motion Estimation and Compensation
– 2D Logarithmic Search
Diamond-shape search area
Matching function
▫ MSE
(Figure: 2D logarithmic search over a ±7 search area, converging on the motion vector MV)
93
VIDEO CODING…
• Motion Estimation and Compensation
– Three-Step Search
The first step performs block-matching at 4-pel resolution at nine
locations
The second step performs block-matching at 2-pel resolution around the
location determined by the first step
The third step repeats the process of the second step (but with 1-pel resolution)
(Figure: three-step search — step-1 points at 4-pel spacing, step-2 points at 2-pel spacing around the best step-1 match, step-3 points at 1-pel spacing around the best step-2 match)
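A minimal Python sketch of the three-step search, using SAD as the matching cost (MAD or MSE would work equally well); the smooth test pattern is an assumption made so that the example converges to the true shift.

# Three-step search: evaluate the centre and its eight neighbours at
# spacings 4, 2, 1 and recentre on the best match at each step.
import numpy as np

def sad(block, ref, r, c):
    n = block.shape[0]
    if r < 0 or c < 0 or r + n > ref.shape[0] or c + n > ref.shape[1]:
        return np.inf
    return np.abs(block - ref[r:r + n, c:c + n]).sum()

def three_step_search(cur, ref, top, left, n=16):
    block = cur[top:top + n, left:left + n]
    cy, cx = top, left
    for step in (4, 2, 1):
        candidates = [(cy + dy, cx + dx)
                      for dy in (-step, 0, step) for dx in (-step, 0, step)]
        cy, cx = min(candidates, key=lambda rc: sad(block, ref, rc[0], rc[1]))
    return cy - top, cx - left            # motion vector (dy, dx)

yy, xx = np.mgrid[0:64, 0:64]
ref = 128 + 60 * np.sin(xx / 6.0) + 60 * np.cos(yy / 7.0)   # smooth test frame
cur = np.roll(ref, shift=(3, -2), axis=(0, 1))
print(three_step_search(cur, ref, 24, 24))   # should recover the shift (-3, 2)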
94
VIDEO CODING…
• Motion Estimation and Compensation
– Motion Vector Prediction
predMVx = Median(MV1x, MV2x, MV3x)
predMVy = Median(MV1y, MV2y, MV3y)
MVx`=MVx - predMVx
MVy`=MVy - predMVy
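A tiny Python sketch of the median prediction and differential coding above; the neighbouring motion vectors used in the example are hypothetical values.

# Component-wise median prediction of a motion vector and its residual.
def median3(a, b, c):
    return sorted((a, b, c))[1]

def predict_mv(mv1, mv2, mv3):
    return (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))

def mv_residual(mv, pred):
    return (mv[0] - pred[0], mv[1] - pred[1])

pred = predict_mv((2, -1), (3, 0), (2, 4))   # neighbouring MVs (hypothetical)
print(pred, mv_residual((3, 1), pred))       # (2, 0) and residual (1, 1)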
95
VIDEO CODER ARCHITECTURE
96
VIDEO CODER ARCHITECTURE
• Image / Video Coding Based on Block-Matching
– Assume frame f-1 has been encoded and reconstructed, and frame f is the
current frame to be encoded
• Exploiting the redundancies
– Temporal
MC-Prediction (P and B frames)
– Spatial
Block DCT
– Color
Color Space Conversion
• Scalar quantization of DCT coefficients
• Zigzag scanning, runlength and Huffman coding of the nonzero
quantized DCT coefficients
97
VIDEO CODER ARCHITECTURE…
• Video Encoder
– Divide frame f into equal-size blocks
– For each source block,
Find its motion vector using the block-matching algorithm based on the
reconstructed frame f-1
Compute the DFD (displaced frame difference) of the block
– Transmit the motion vector of each block to the decoder
– Compress the DFD of each block
– Transmit the encoded DFDs to the decoder
98
VIDEO CODER ARCHITECTURE…
• Video Encoder
99
VIDEO CODER ARCHITECTURE…
• Video Decoder
– Receive motion vector of each block from encoder
– Based on the motion vector, find the best-matching block from the
reference frame
i.e., find the predicted current frame from the reference frame
– Receive the encoded DFD of each block from encoder
– Decode the DFD.
– Each reconstructed block in the current frame = Its decompressed DFD +
the best-matching block
100
VIDEO CODER ARCHITECTURE…
• Video Decoder
101
VIDEO CODEC STANDARDS
102
VIDEO CODEC STANDARDS
• Goal of Standards
– Ensuring Interoperability
Enabling communication between devices made by different manufacturers
– Promoting a technology or industry
– Reducing costs
What do the Standards Specify?
103
VIDEO CODEC STANDARDS…
What do the Standards Specify?
• Not the encoder
• Not the decoder
• Just the bitstream syntax and the decoding process (e.g., use the IDCT,
but not how to implement the IDCT)
– Enables improved encoding & decoding strategies to be employed in a
standard-compatible manner
104
VIDEO CODEC STANDARDS…
• The Scope of Picture and Video Coding Standardization
– Only the Syntax and Decoder are standardized:
Permits optimization beyond the obvious
Permits complexity reduction for implementability
Provides no guarantees of Quality
(Figure: Source → Pre-Processing → Encoding → Decoding → Post-Processing & Error Recovery → Destination; the scope of the standard covers only the bitstream and the decoding)
105
VIDEO CODEC STANDARDS…
106
VIDEO CODEC STANDARDS…
• Based on the same fundamental building blocks
– Motion-compensated prediction (I, P, and B frames)
– 2-D Discrete Cosine Transform (DCT)
– Color space conversion
– Scalar quantization, runlengths, Huffman coding
• Additional tools added for different applications:
– Progressive or interlaced video
– Improved compression, error resilience, scalability, etc.
• MPEG-1/2/4, H.261/3/4
– Frame-based coding
• MPEG-4
– Object-based coding and Synthetic video
107
VIDEO CODEC STANDARDS…
• The video standards use all three types of frames, as shown
below
Encoding order: I0, P3, B1, B2, P6, B4, B5, I9, B7, B8.
Playback order: I0, B1, B2, P3, B4, B5, P6, B7, B8, I9.
108
VIDEO CODEC STANDARDS…
• Video Structure
– Video standards code video sequences in hierarchy of layers
– There are usually 5 Layers
GOP (Group of Pictures)
Picture
Slice
Macroblock
Block
109
VIDEO CODEC STANDARDS…
• Video Structure
– A GOP usually starts with an I frame, followed by a sequence of P and B
frames
– A Picture is a frame in the video sequence
– A Slice is a portion of a picture
Some standards do not have slices
Some view a slice as a row
In H.264 a slice need not be a row
It can be any shape containing an integral number of macroblocks
– A Macroblock is a 16×16 block
Many standards use Macroblocks as the basic unit for block-matching
operations
– A Block is an 8×8 block
Many standards use Blocks as the basic unit for the DCT
110
VIDEO CODEC STANDARDS…
• Scalable Video Coding
– Three classes of scalable video coding techniques
Temporal Scalability
Spatial Scalability
SNR Scalability
– Uses B frames for attaining temporal scalability
B frames depend on other frames
No other frames depend on B frames
Discard B frames without affecting other frames
111
VIDEO CODEC STANDARDS…
• Scalable Video Coding – Spatial Scalability
– Basically Resolution Scalability
Here the base layer is the low resolution version of the video sequence
– The base layer uses a coarser quantizer for DFD coding
– The residuals of the base layer are refined in the enhancement layer
112
VIDEO CODEC STANDARDS…
113
HEVC
114
HEVC
• Video Coding Standards Overview
Next Generation Broadcasting
115
HEVC…
• MPEG-H
– High Efficiency Coding and Media Delivery in
Heterogeneous Environments: a new suite of
standards providing technical solutions for
emerging challenges in multimedia industries
– Part 1: System, MPEG Media Transport (MMT)
Integrated services with multiple components in a hybrid
delivery environment, providing support for seamless and
efficient use of heterogeneous network environments,
including broadcast, multicast, storage media and mobile
networks
– Part 2: Video, High Efficiency Video Coding
(HEVC)
Highly immersive visual experiences, with ultra high definition
displays that give no perceptible pixel structure even if
viewed from such a short distance that they subtend a large
viewing angle (up to 55 degrees horizontally for 4Kx2K
resolution displays, up to 100 degrees for 8Kx4K)
– Part 3: Audio, 3D-Audio
Highly immersive audio experiences in which the decoding
device renders a 3D audio scene. This may be using 10.2 or
22.2 channel configurations or much more limited speaker
configurations or headphones, such as found in a personal
tablet or smartphone.
116
HEVC…
• Transport/System Layer Integration
– Ongoing definitions (MPEG, IETF, …, DVB): benefit from H.264/AVC
– MPEG Media Transport (MMT) ?
117
HEVC…
• HEVC = High Efficiency Video Coding
• Joint project between ISO/IEC/MPEG and ITU-T/VCEG
– ISO/IEC: MPEG-H Part 2 (23008-2)
– ITU-T: H.265
• JCT-VC committee
– Joint Collaborative Team on Video Coding
– Co-chairs: Dr. Gary Sullivan (Microsoft, USA) and Dr. Jens-Rainer Ohm (RWTH
Aachen, Germany)
• Target
– Roughly half the bit-rate at the same subjective quality compared to H.264/AVC (50%
bit-rate reduction over H.264/AVC)
– At most 10× encoder complexity and 2–3× decoder complexity
• Requirements
– Progressive required for all profiles and levels
Interlaced support using field SEI message
– Video resolution: sub QVGA to 8Kx4K, with more focus on higher resolution video
content (1080p and up)
– Color space and chroma sampling: YUV420, YUV422, YUV444, RGB444
– Bit-depth: 8-14 bits
– Parallel Processing Architecture
118
HEVC…
• H.264 Vs H.265
119
HEVC…
• Potential applications
– Existing applications and usage scenarios
IPTV over DSL : Large shift in IPTV eligibility
Facilitated deployment of OTT and multi-screen services
More customers on the same infrastructure: most IP traffic is video
More archiving facilities
– Existing applications and usage scenarios
1080p60/50 with bitrates comparable to 1080i
Immersive viewing experience: Ultra-HD (4K, 8K)
Premium services (sports, live music, live events, …): home theater, bar
venues, mobile
HD 3DTV: full frame per view at today’s HD delivery rates
What becomes possible with 50% video rate reduction?
120
HEVC…
• Tentative Timeline
121
HEVC…
• History
122
HEVC…
• H.264 Vs H.265
123
HEVC…
• H.264 Vs H.265
124
HEVC…
• HEVC Encoder
125
HEVC…
• HEVC Decoder
126
HEVC…
• Video Coding Techniques : Block-based hybrid video coding
– Interpicture prediction
Temporal statistical dependences
– Intrapicture prediction
Spatial statistical dependences
– Transform coding
Spatial statistical dependences
• Uses YCbCr color space with 4:2:0 subsampling
– Y component
Luminance (luma)
Represents brightness (gray level)
– Cb and Cr components
Chrominance (chroma).
Color difference from gray toward blue and red
127
HEVC…
• Video Coding Techniques : Block-based hybrid video coding
– Motion compensation
Quarter-sample precision is used for the MVs
7-tap or 8-tap filters are used for interpolation of fractional-sample
positions
– Intrapicture prediction
33 directional modes, planar (surface fitting), DC (flat)
Modes are encoded by deriving most probable modes (MPMs) based
on those of previously decoded neighboring PBs
– Quantization control
Uniform reconstruction quantization (URQ)
– Entropy coding
Context adaptive binary arithmetic coding (CABAC)
– In-Loop deblocking filtering
Similar to the one in H.264 and More friendly to parallel processing
– Sample adaptive offset (SAO)
Nonlinear amplitude mapping
For better reconstruction of amplitude by histogram analysis
128
HEVC…
• Coding Tree Unit (CTU) - A picture is partitioned into CTUs
– The CTU is the basic processing unit instead of Macro Blocks (MB)
– Contains luma CTBs and chroma CTBs
A luma CTB covers L × L samples
Two chroma CTBs cover each L/2 × L/2 samples
– HEVC supports variable-size CTBs
The value of L may be equal to 16, 32, or 64.
Selected according to needs of encoders - In terms of memory and
computational requirements
Large CTB is beneficial when encoding high-resolution video content
– CTBs can be used as CBs or can be partitioned into multiple CBs using
quadtree structures
– The quadtree splitting process can be iterated until the size for a luma
CB reaches a minimum allowed luma CB size (8 × 8 or larger).
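A conceptual (non-conformant) Python sketch of recursive quadtree splitting of a 64×64 CTB into CBs; the split decision used here is a stand-in detail measure, whereas a real encoder makes this decision by rate-distortion optimization.

# Recursive quadtree split of a CTB into CBs down to a minimum size.
import numpy as np

def split_ctb(frame, top, left, size, min_size=8, thresh=20.0, leaves=None):
    if leaves is None:
        leaves = []
    detail = frame[top:top + size, left:left + size].std()   # stand-in RD decision
    if size > min_size and detail > thresh:
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                split_ctb(frame, top + dy, left + dx, half, min_size, thresh, leaves)
    else:
        leaves.append((top, left, size))     # this CB is coded as a whole
    return leaves

ctb = np.random.randint(0, 256, (64, 64)).astype(np.float64)
print(len(split_ctb(ctb, 0, 0, 64)))         # number of resulting CBs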
129
HEVC…
• Block Structure
– Coding Tree Units (CTU)
Corresponds to macroblocks in earlier coding standards (H.264, MPEG2, etc)
Luma and chroma Coding Tree Blocks (CTB)
Quadtree structure to split into Coding Units (CUs)
16x16, 32x32, or 64x64, signaled in SPS
130
HEVC…
• A new framework composed of three
new concepts
– Coding Units (CU)
– Prediction Units (PU)
– Transform Units (TU)
• The decision whether to code a
picture area using inter or intra
prediction is made at the CU level
Goal: To be as flexible as possible and to adapt the
compression-prediction to image peculiarities
131
HEVC…
• Block Structure
– Coding Units (CU)
Luma and chroma Coding Blocks (CB)
Rooted in CTU
Intra or inter coding mode
Split into Prediction Units (PUs) and Transform Units (TUs)
132
HEVC…
• Block Structure
– Prediction Units (PU)
Luma and chroma Prediction Blocks (PB)
Rooted in CU
Partition and motion info
133
HEVC…
• Block Structure
– Transform Units (TU)
Rooted in CU
4x4, 8x8, 16x16, 32x32 DCT, and 4x4 DST
134
HEVC…
• Relationship of CU, PU and TU
135
HEVC…
• Intra Prediction
– 35 intra modes: 33 directional modes +
DC + planar
– For chroma, 5 intra modes: DC, planar,
vertical, horizontal, and luma derived
– Planar prediction (Intra_Planar)
Amplitude surface with a horizontal and
vertical slope derived from boundaries
– DC prediction (Intra_DC)
Flat surface with a value matching the
mean value of the boundary samples
– Directional prediction (Intra_Angular)
33 different prediction directions are
defined for square TB sizes from 4×4 up
to 32×32
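A simplified Python sketch of Intra_DC and Intra_Planar prediction from a top reference row and a left reference column; HEVC's exact reference handling (top-right and bottom-left neighbours, rounding, boundary filtering) is approximated here by the last available samples.

# Simplified DC and planar intra prediction for an N x N block.
import numpy as np

def intra_dc(top, left):
    n = len(top)
    return np.full((n, n), (np.sum(top) + np.sum(left)) / (2 * n))

def intra_planar(top, left):
    n = len(top)
    pred = np.zeros((n, n))
    for y in range(n):
        for x in range(n):
            horiz = (n - 1 - x) * left[y] + (x + 1) * top[-1]   # top[-1] ~ top-right
            vert = (n - 1 - y) * top[x] + (y + 1) * left[-1]    # left[-1] ~ bottom-left
            pred[y, x] = (horiz + vert) / (2 * n)
    return pred

top = np.array([100, 110, 120, 130], dtype=np.float64)
left = np.array([100, 90, 80, 70], dtype=np.float64)
print(intra_dc(top, left)[0, 0], intra_planar(top, left)[0, 0])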
136
HEVC…
• Intra Prediction
– Adaptive reference sample filtering
3-tap filter: [1 2 1]/4
Not performed for 4x4 blocks
For larger than 4x4 blocks, adaptively performed for a subset of modes
Modes except vertical/near-vertical, horizontal/near-horizontal, and DC
– Mode dependent adaptive scanning
4x4 and 8x8 intra blocks only
All other blocks use only diagonal upright scan (left-most scan pattern)
137
HEVC…
• Intra Prediction
– Boundary smoothing
Applied to DC, vertical, and horizontal modes, luma only
Reduces boundary discontinuity
– For DC mode, 1st column and row of samples in predicted block are
filtered
– For Hor/Ver mode, first column/row of pixels in predicted block are filtered
138
HEVC…
• Inter Prediction
– Fractional sample interpolation
¼ pixel precision for luma
– DCT based interpolation filters
8-/7- tap for luma
4-tap for chroma
Supports 16-bit implementation
with non-normative shift
– High precision interpolation and
biprediction
– DCT-IF design
Forward DCT, followed by
inverse DCT
139
HEVC…
• Inter Prediction
– Asymmetric Motion Partition (AMP) for Inter PU
– Merge
Derive motion (MV and ref pic) from spatial and
temporal neighbors
Which spatial/temporal neighbor is identified by
merge_idx
Number of merge candidates (≤ 5) signaled in slice
header
Skip mode = merge mode + no residual
– Advanced Motion Vector Prediction (AMVP)
Use spatial/temporal PUs to predict current MV
140
HEVC…
• Transforms
– Core transforms: DCT based
4x4, 8x8, 16x16, and 32x32
Square transforms only
Support partial factorization
Near-orthogonal
Nested transforms
– Alternative 4x4 DST
4x4 intra blocks, luma only
– Transform skipping mode
By-pass the transform stage
Most effective on “screen content”
4x4 TBs only
141
HEVC…
• Scaling and Quantization
– HEVC uses a uniform reconstruction quantization (URQ)
scheme controlled by a quantization parameter (QP).
– The range of the QP values is defined from 0 to 51
142
HEVC…
• Entropy Coding
– One entropy coder, CABAC
Reuse H.264 CABAC core algorithm
More friendly to software and hardware
implementations
Easier to parallelize, reduced HW area, increased
throughput
– Context modeling
Reduced # of contexts
Increased use of by-pass bins
Reduced data dependency
– Coefficient coding
Adaptive coefficient scanning for intra 4x4 and 8x8
▫ Diagonal upright, horizontal, vertical
Processed in 4x4 blocks for all TU sizes
Sign data hiding:
▫ Sign of first non-zero coefficient conditionally hidden in
the parity of the sum of the non-zero coefficient
magnitudes
▫ Conditions: 2 or more non-zero coefficients, and
“distance” between first and last coefficient > 3
143
HEVC…
• Entropy Coding - CABAC
– Binarization: CABAC uses Binary Arithmetic Coding which means that only binary decisions (1 or
0) are encoded. A non-binary-valued symbol (e.g. a transform coefficient or motion vector) is
"binarized" or converted into a binary code prior to arithmetic coding. This process is similar to the
process of converting a data symbol into a variable length code but the binary code is further
encoded (by the arithmetic coder) prior to transmission.
– Stages are repeated for each bit (or "bin") of the binarized symbol.
– Context model selection: A "context model" is a probability model for one or more bins of the
binarized symbol. This model may be chosen from a selection of available models depending on
the statistics of recently coded data symbols. The context model stores the probability of each bin
being "1" or "0".
– Arithmetic encoding: An arithmetic coder encodes each bin according to the selected probability
model. Note that there are just two sub-ranges for each bin (corresponding to "0" and "1").
– Probability update: The selected context model is updated based on the actual coded value (e.g. if
the bin value was "1", the frequency count of "1"s is increased)
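A toy Python sketch of the context-model / probability-update idea described above (not the actual HEVC state machine): each context keeps an estimate of P(bin = 1) that adapts after every coded bin.

# Toy adaptive context model: tracks P(bin == 1) with an exponential update.
class ContextModel:
    def __init__(self, p_one=0.5, rate=0.05):
        self.p_one = p_one        # current estimate of P(bin == 1)
        self.rate = rate          # adaptation speed

    def update(self, bin_value):
        target = 1.0 if bin_value else 0.0
        self.p_one += self.rate * (target - self.p_one)

ctx = ContextModel()
for b in [1, 1, 1, 0, 1, 1]:      # recently coded bins for this context
    ctx.update(b)
print(round(ctx.p_one, 3))        # estimate has drifted towards 1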
144
HEVC…
• Parallel Processing Tools
– Slices
– Tiles
– Wavefront parallel processing (WPP)
– Dependent Slices
• Slices
– Slices are a sequence of CTUs that are processed in the order
of a raster scan. Slices are self-contained and independent
– Each slice is encapsulated in a separate packet
145
HEVC…
• Tile
– Self-contained and independently decodable rectangular regions
– Tiles provide parallelism at a coarse level of granularity
Tiles more than the cores → Not efficient → Breaks dependencies
146
HEVC…
• WPP
– A slice is divided into rows of CTUs. Parallel processing of rows
– The decoding of each row can begin as soon as a few decisions have
been made in the preceding row for the adaptation of the entropy coder.
– Better compression than tiles. Parallel processing at a fine level of
granularity.
No WPP with tiles !!
147
HEVC…
• Dependent Slices
– Separate NAL units but dependent (Can only be decoded after part of
the previous slice)
– Dependent slices are mainly useful for ultra low delay applications
Remote Surgery
– Error resiliency gets worse
– Low delay
– Good efficiency → goes well with WPP
148
HEVC…
• Slice Vs Tile
– Tiles are a kind of zero-overhead slice
Slice header is sent at every slice but tile information once for a sequence
Slices have packet headers too
Each tile can contain a number of slices and vice versa
– Slices are for :
Controlling packet sizes
Error resiliency
– Tiles are for:
Controlling parallelism (multiple core architecture)
Defining ROI regions
149
HEVC…
• Tile Vs WPP
– WPP
Better compression than tiles
Parallel processing at a fine level of granularity
But …
Needs frequent communication between processing units
If high number of cores → can’t get full utilization
– Good for when
Relatively small number of nodes
Good inter core communication
No need to match to MTU size
Big enough shared cache
150
HEVC…
• In-Loop Filters
– Two processing steps, a deblocking filter (DBF) followed by a
sample adaptive offset (SAO) filter, are applied to the
reconstructed samples
The DBF is intended to reduce the blocking artifacts due to block-
based coding
The DBF is only applied to the samples located at block
boundaries
The SAO filter is applied adaptively to all samples satisfying
certain conditions. e.g. based on gradient.
151
HEVC…
• Loop Filters: Deblocking
– Applied to all samples adjacent to a PU or TU boundary
Except the case when the boundary is also a picture boundary, or
when deblocking is disabled across slice or tile boundaries
– HEVC only applies the deblocking filter to the edge that are
aligned on an 8×8 sample grid
This restriction reduces the worst-case computational complexity
without noticeable degradation of the visual quality
It also improves parallel-processing operation
– The processing order of the deblocking filter is defined as
horizontal filtering for vertical edges for the entire picture first,
followed by vertical filtering for horizontal edges.
152
HEVC…
• Loop Filters: Deblocking
– Simpler deblocking filter in HEVC (vs H.264 )
– Deblocking filter boundary strength is set according to
Block coding mode
Existence of non zero coefficients
Motion vector difference
Reference picture difference
153
HEVC…
• Loop Filters: SAO
– A process that modifies the decoded
samples by conditionally adding an
offset value to each sample after the
application of the deblocking filter,
based on values in look-up tables
transmitted by the encoder.
– SAO: Sample Adaptive Offsets
New loop filter in HEVC
Non-linear filter
– For each CTB, signal SAO type and
parameters
– Encoder decides SAO type and
estimates SAO parameters (rate-
distortion opt.)
154
HEVC…
• Special Coding
– I_PCM mode
The prediction, transform, quantization and entropy coding are bypassed
The samples are directly represented by a pre-defined number of bits
Main purpose is to avoid excessive consumption of bits when the signal
characteristics are extremely unusual and cannot be properly handled by hybrid
coding
– Lossless mode
The transform, quantization, and other processing that affects the decoded picture
are bypassed
The residual signal from inter- or intrapicture prediction is directly fed into the
entropy coder
It allows mathematically lossless reconstruction
SAO and deblocking filtering are not applied to these regions
– Transform skipping mode
Only the transform is bypassed
Improves compression for certain types of video content such as computer-
generated images or graphics mixed with camera-view content
Can be applied to TBs of 4×4 size only
155
HEVC…
• High Level Parallelism
– Slices
Independently decodable packets
Sequence of CTUs in raster scan
Error resilience, parallelization
– Tiles
Independently decodable (re-entry)
Rectangular region of CTUs
Parallelization (esp. encoder)
1 slice = more tiles, or 1 tile = more slices
– WPP
Rows of CTUs
Decoding of each row can be parallelized
Shaded CTU can start when gray CTUs in
row above are finished
– Main profile does not allow tiles + WPP
combination
156
HEVC…
• Profiles, Levels and Tiers
– Historically, profile defines collection of coding
tools, whereas Level constrains decoder
processing load and memory requirements
– The first version of HEVC defined 3 profiles
Main Profile: 8-bit video in YUV4:2:0 format
Main 10 Profile: same as Main, up to 10-bit
Main Still Picture Profile: same as Main, one
picture only
– Levels and Tiers
Levels: max sample rate, max picture size,
max bit rate, DPB and CPB size, etc
Tiers: “main tier” and “high tier” within one
level
157
HEVC…
• Complexity Analysis
– Software-based HEVC decoder capabilities
(published by NTT Docomo)
Single-threaded: 1080p@30 on ARMv7
(1.3 GHz), 1080p@60 decoding on i5
(2.53 GHz)
Multi-threaded: 4Kx2K@60 on i7 (2.7GHz),
12Mbps, decoding speed up to 100fps
– Other independent software-based HEVC
real-time decoder implementations published
by Samsung and Qualcomm during HEVC
development
– Decoder complexity not substantially higher
More complex modules: MC, Transform, Intra
Pred, SAO
Simpler modules: CABAC and deblocking
158
HEVC…
• Quality Performance
159
THANK YOU
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Dernier (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

VIDEO CODECS

  • 14. 14 HVS… • Imaging / Visual System designed based on HVS principles • Example – Image Sensor – Television – Image / Video Display • Image Sensor – CCD (charge coupled device): Arrays of photo diodes Linearity Less light needed Electronic shuttering – CMOS Cheaper Easy manufacturing • Television – NTSC (National Television System Committee): 60 Hz, 30 fps, 525 scan lines North America, Japan, Korea …. – PAL (Phase Alteration by Line): 50 Hz, 25 fps, 625 scan lines Europe … • Image / Video Display – CRT Monitor – LCD TV/Display Monitor
  • 16. 16 IMAGE / VIDEO • Images – A view observed by the HVS at a time instant – A multidimensional array of numbers (such as an intensity image) or vectors (such as a color image) Each component of the image, called a pixel, is associated with a pixel value (a single number in the case of intensity images or a vector in the case of color images) [example 4x4 pixel-value matrices omitted]
  • 17. 17 IMAGE / VIDEO… • Video – Series of Frames (or Images)
  • 18. 18 IMAGE / VIDEO… • Images / Video Frame – A multidimensional function of spatial coordinates – Spatial coordinate: (x,y) for the 2D case such as a photograph, (x,y,z) for the 3D case such as CT scan images, (x,y,t) for movies – The function f may represent intensity (for monochrome images), color (for color images) or other associated values [figure: image "After snow storm" with f(x,y), x and y axes, and the origin marked]
  • 19. 19 IMAGE / VIDEO… • Images / Video Frame – An image that has been discretized both in spatial coordinates and in the associated value Consists of 2 sets: (1) a point set and (2) a value set Can be represented in the form – I = {(x, a(x)): x ∈ X, a(x) ∈ F} where X and F are a point set and a value set, respectively An element of the image, (x, a(x)), is called a pixel, where x is the pixel location and a(x) is the pixel value at location x – Conventional Coordinate for Image Representation
  • 20. 20 IMAGE / VIDEO… • Images / Video Frame Representation – Basic Unit : Pixel – Dimensions Height Width – Frame rate determines how long the pixel exists, i.e. how it moves – Color Depth of the pixel How many bits are used to represent the color of each pixel?
  • 21. 21 IMAGE / VIDEO… • Image Type – Binary Image – Intensity Image – Color Image – Index image
  • 22. 22 IMAGE / VIDEO… • Binary Image – Binary image or black and white image – Each pixel contains one bit 1 represents white 0 represents black Example 4x4 binary data: 1111 / 1111 / 0000 / 0000
  • 23. 23 IMAGE / VIDEO… • Intensity Image – Intensity / Monochrome / Gray Scale Image – Each pixel corresponds to light intensity, normally represented in gray scale (gray level) [example 4x4 gray-scale value matrix omitted]
  • 24. 24 IMAGE / VIDEO… • Color Image – Each pixel contains a vector representing red, green and blue components [example 4x4 R, G and B component matrices omitted]
  • 25. 25 IMAGE / VIDEO… • Index Image – Each pixel contains an index number pointing to a color in a color table [example index-value matrix omitted] Color table (Index No. → red, green, blue components): 1 → 0.1, 0.5, 0.3; 2 → 1.0, 0.0, 0.0; 3 → 0.0, 1.0, 0.0; 4 → 0.5, 0.5, 0.5; 5 → 0.2, 0.8, 0.9; …
  • 26. 26 IMAGE / VIDEO… • Colourspace Representations – RGB (Red, Green, Blue) – Basic analog components (from camera/to TV) – YPbPr (Y, B-Y, R-Y) – ANALOG colourspace (derived from RGB); Y = Luminance, B = Blue, R = Red – YUV – Colour difference signals scaled to be modulated on a composite carrier – YIQ – Used in NTSC. I = In-phase, Q = Quadrature (the IQ plane is a 33° rotation of the UV plane) – YCbCr/YCC – DIGITAL representation of the YPbPr colourspace (8-bit, 2's complement)
  • 27. 27 IMAGE / VIDEO… • RGB Color – All colors can be composed by adding specific amounts of R, G, & B – 8 bits (2^8 levels) specify the amount of each color – This is the scheme used by most electronic displays to generate color; e.g. we often call our computer monitors "RGB displays" 8-bits Red 8-bits Green 8-bits Blue
  • 28. 28 IMAGE / VIDEO… • Color Reduction – The human eye is not as sensitive to color as it is to luminance – To this end, to save costs, the various standards decided to Maintain luminance information in our images, but Reduce color information Using RGB, though, how do we easily reduce color information without removing luminance? For this, and other technical reasons, a separate color space was chosen by most video standards …
  • 29. 29 IMAGE / VIDEO… • Colour Image: RGB • YCbCr – Even though most displays actually use RGB to create the image, YCbCr is used most often in consumer electronics for transmission of the image – Historically, B/W televisions transmitted only luminance (Y) – The color signals were added later
  • 30. 30 IMAGE / VIDEO… • YCbCr Generated By Subsampling – YUV 4:4:4 = 8 bits per Y, U, V channel (no downsampling of the chroma channels) – YUV 4:2:2 = 4 Y pixels sampled for every 2 U and 2 V (2:1 horizontal downsampling, no vertical downsampling) – YUV 4:2:0 = 2:1 horizontal downsampling, 2:1 vertical downsampling – YUV 4:1:1 = 4 Y pixels sampled for every 1 U and 1 V (4:1 horizontal downsampling, no vertical downsampling) • YUV 4:4:4 Y Y Y Y Y Y Y Y 4:4:4 Format (3 bytes/pixel): Cb Cr Cb Cr Cb Cr Cb Cr Cb Cr Cb Cr Cb Cr Cb Cr
  • 31. 31 IMAGE / VIDEO… • YUV 4:2:2 • YUV 4:2:0 Y Y Y Y Y Y Y Y 4:2:2 Format (2 bytes/pixel): Cb Cr Cb Cr Cb Cr Cb Cr Y Y Y Y Y Y Y Y Cb Cr Cb Cr 4:2:0 Format (1.5 bytes/pixel):
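A minimal sketch of the 4:2:0 idea above, in Python: each 2x2 block of a chroma plane is averaged down to one sample (2:1 horizontally and vertically). The function and array names are illustrative only; real codecs additionally specify filter taps and chroma siting.

import numpy as np

def subsample_420(cb, cr):
    """4:2:0 chroma subsampling sketch: average each 2x2 chroma block (assumes even dimensions)."""
    def down2x2(plane):
        h, w = plane.shape
        # Group samples into 2x2 blocks and average them.
        return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return down2x2(cb), down2x2(cr)

# Example: a 4x4 chroma plane becomes 2x2, giving 1.5 bytes/pixel overall for 8-bit YCbCr.
cb = np.arange(16, dtype=float).reshape(4, 4)
cr = np.ones((4, 4))
cb_ss, cr_ss = subsample_420(cb, cr)
print(cb_ss.shape)  # (2, 2)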
  • 32. 32 IMAGE / VIDEO… • Upsampling – Insert samples between the input samples, then apply an interpolating low-pass filter (input F(nT) → output F(nT/2)) • Downsampling – Apply a decimating low-pass filter (prevents aliasing at the lower rate), then discard alternate samples (input F(nT) → output F(2nT)) [signal diagrams omitted]
  • 33. 33 IMAGE / VIDEO… • RGB to YCbCr • RGB to YUV Conversion – Y = 0.299R + 0.587G + 0.114B – U = (B − Y) * 0.565 – V = (R − Y) * 0.713 [figure: U-V plane at Y = 0.5] Clamp the output: Y = [16, 235], U, V = [16, 239]
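A small sketch of the conversion above in Python, using the slide's weights. The +128 chroma offset and the 8-bit input range are my assumptions; the clamping follows the ranges quoted on the slide.

import numpy as np

def rgb_to_yuv(r, g, b):
    """Convert 8-bit R, G, B samples to Y, U, V with the weights quoted above (illustrative)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = (b - y) * 0.565 + 128.0   # assumed offset so U fits an unsigned byte
    v = (r - y) * 0.713 + 128.0   # assumed offset so V fits an unsigned byte
    # Clamp to the nominal ranges from the slide.
    return np.clip(y, 16, 235), np.clip(u, 16, 239), np.clip(v, 16, 239)

print(rgb_to_yuv(255.0, 0.0, 0.0))  # pure red: modest Y, V pushed to the top of its range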
  • 34. 34 VIDEO / IMAGE COMPRESSION
  • 35. 35 VIDEO/IMAGE COMPRESSION • How can we use fewer bits? • To understand how image/audio/video signals are compressed to save storage and increase transmission efficiency • Reduces signal size by taking advantage of correlation – Spatial – Temporal – Spectral
  • 36. 36 VIDEO/IMAGE COMPRESSION… • Compression Methods • Need to take advantage of redundancy – Images Space Frequency – Video Space Frequency Time [taxonomy figure] Compression Methods → Lossless: Statistical (Huffman, Arithmetic), Universal (Lempel-Ziv); Lossy (Waveform-Based): Spatial/Time-Domain (Linear Predictive, AutoRegressive, Polynomial Fitting, Model-Based), Frequency-Domain: Filter-Based (Subband, Wavelet), Transform-Based (Fourier, DCT)
  • 37. 37 VIDEO/IMAGE COMPRESSION… • Need to take advantage of redundancy [pipeline figure] RGB → YCbCr → Blocks / Macro Blocks → Motion Compensation (remove temporal redundancy, I/B/P frames) → Transform → Quantization (remove spatial redundancy) → Coding → 01100010101,0
  • 38. 38 VIDEO/IMAGE COMPRESSION… • Spatial Redundancy – Take advantage of similarity among most neighboring pixels • RGB to YUV – Less information required for YUV (humans are less sensitive to chrominance) • Macro Blocks – Take groups of pixels (16x16) • Discrete Cosine Transformation (DCT) – Based on Fourier analysis, where a signal is represented as a sum of sines and cosines – Concentrates the signal energy into a few (mostly low-frequency) coefficients, so higher-frequency values can be discarded – Represent pixels in blocks with fewer numbers • Quantization – Reduce data required for coefficients • Entropy coding – Compress
  • 39. 39 VIDEO/IMAGE COMPRESSION… • Spatial Redundancy Reduction Zig-Zag Scan, Run-length coding Quantization • major reduction • controls ‘quality’ “Intra-Frame Encoded”
  • 40. 40 VIDEO/IMAGE COMPRESSION… • When may spatial redundancy elimination be ineffective? – High-resolution images and displays – May appear ‘coarse’ • What kinds of images/movies? – A varied image or ‘busy’ scene – Many colors, few adjacent Original (63 kB), Low (7 kB), Very Low (4 kB) – due to loss of resolution. Solution? Temporal Redundancy Reduction
  • 41. 41 VIDEO/IMAGE COMPRESSION… • Temporal Redundancy Reduction – Take advantage of similarity between successive frames 950 951 952
  • 42. 42 VIDEO/IMAGE COMPRESSION… • Temporal Redundancy Reduction – Take advantage of similarity between successive frames
  • 43. 43 VIDEO/IMAGE COMPRESSION… • Temporal Redundancy Reduction – Take advantage of similarity between successive frames
  • 44. 44 VIDEO/IMAGE COMPRESSION… When may temporal redundancy reduction be ineffective?
  • 45. 45 VIDEO/IMAGE COMPRESSION… • Many scene changes vs. few scene changes • Sometimes high motion
  • 46. 46 VIDEO/IMAGE COMPRESSION… • Many scene changes vs. few scene changes • Sometimes high motion
  • 48. 48 IMAGE CODING • Lossless Compression • Lossy Compression • Transform Coding
  • 49. 49 IMAGE CODING… • Image compression system is composed of three key building blocks – Representation Concentrates important information into a few parameters – Quantization Discretizes parameters – Binary encoding Exploits non-uniform statistics of quantized parameters Creates bitstream for transmission
  • 50. 50 IMAGE CODING… • Image compression system is composed of three key building blocks – Representation Concentrates important information into a few parameters – Quantization Discretizes parameters – Binary encoding Exploits non-uniform statistics of quantized parameters Creates bitstream for transmission
  • 51. 51 IMAGE CODING… • Generally, the only operation that is lossy is the quantization stage • The fact that all the loss (distortion) is localized to a single operation greatly simplifies system design • Can design loss to exploit human visual system (HVS) properties • Source decoder performs the inverse of each of the three operations
  • 52. 52 IMAGE CODING… • Representations - Transform and Subband Filtering Methods – Goal Transform signal into another domain where most of the information (energy) is concentrated into only a small fraction of the coefficients – Enables perceptual processing Exploiting HVS response to different frequency components
  • 53. 53 IMAGE CODING… • Representations - Transform and Subband Filtering Methods – Examples of “traditional” transforms KLT, DFT, DCT – Examples of “traditional” Subband filtering methods Perfect reconstruction filter banks, wavelets – Transform and Subband interpretations All of the above are linear representations and can be interpreted from either a transform or a Subband filtering viewpoint – Transform viewpoint Express signal as a linear combination of basis vectors Stresses linear expansion (linear algebra) perspective – Subband filtering viewpoint Pass signal through a set of filters and examine the frequencies passed by each filter (Subband) Stresses filtering (signal processing) perspective
  • 54. 54 IMAGE CODING… • Representations – Transform Image Coding – A good transform provides Most of the image energy is concentrated into a small fraction of the coefficients Coding only these small fraction of the coefficients and discarding the rest can often lead to excellent reconstructed quality The more energy compaction the better – Orthogonal transforms are particularly useful Energy in discarded coefficients is equal to energy in reconstruction error
  • 55. 55 IMAGE CODING… • Representations – Transform Image Coding – Karhunen-Loeve Transform (KLT) Optimal energy compaction Requires knowledge of signal covariance In general, no simple computational algorithm – Discrete Fourier Transform (DFT) Fast algorithms Good energy compaction, but not as good as DCT – Discrete Cosine Transform (DCT) Fast algorithms Good energy compaction All real coefficients Overall good performance and widely used for image and video coding
  • 56. 56 IMAGE CODING… • Discrete Cosine Transform (DCT) – 1-D Discrete Cosine Transform (N-point) – 1-D DCT basis vectors – 2-D DCT: Separable transform of 1-D DCT – 2-D DCT basis vectors? Basis pictures! – 2-D basis vectors for 2-D DCT are basis pictures! – 64 basis pictures for 8x8-pixel 2-D DCT – Image coding with the 2-D DCT is equivalent to approximating the image as a linear combination of these basis pictures!
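The transcript does not reproduce the slide's equation image; for reference, the commonly used orthonormal (N-point) DCT-II form is

X[k] = \alpha(k) \sum_{n=0}^{N-1} x[n] \cos\!\left( \frac{\pi (2n+1) k}{2N} \right),
\qquad \alpha(0) = \sqrt{1/N}, \;\; \alpha(k) = \sqrt{2/N} \text{ for } k = 1, \dots, N-1.

The 2-D DCT follows by applying this 1-D transform separably along rows and columns, which is what produces the 64 basis pictures for an 8x8 block.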
  • 57. 57 IMAGE CODING… • Representations – Coding Transform Coefficients – Selecting the basis pictures to approximate an image is equivalent to selecting the DCT coefficients to code – General methods of coding/discarding coefficients Zonal Coding ▫ Code all coefficients in a zone and discard others ▫ Example zone: Spatial low frequencies ▫ Only need to code coefficient amplitudes Threshold Coding ▫ Keep coefficients with magnitude above a threshold ▫ Coefficient amplitudes and locations must be coded ▫ Provides best performance
  • 58. 58 IMAGE CODING… • Video / Image Coding is Block-based Coding – Frames are divided into sub-blocks and then coded • Macroblock (MB) and Block Layer – Process the data in blocks of 8x8 samples – Convert Red-Green-Blue into Luminance (greyscale) and Chrominance (Blue color difference and Red color difference) – Use half resolution for Chrominance (because the eye is more sensitive to greyscale than to color)
  • 59. 59 IMAGE CODING… • Macroblock (MB) and Block Layer – Macroblock Consists of a 16x16 luminance block and 8x8 chrominance blocks Basic unit for motion estimation – Block 8 pixels by 8 lines Basic unit for DCT
  • 60. 60 IMAGE CODING… • Lossless Compression – General-Purpose Compression: Entropy Encoding – Remove statistical redundancy from data – i.e., encode common values with short codes, uncommon values with longer codes • Lossless Compression – Huffman Coding – Example: ABCCDEAAB After compression: 1011000000001010111011 – Compression ratio According to the probability of the characters appearing in the uncompressed data Symbol frequencies: A:45 B:16 C:12 D:13 E:9 F:5 [Huffman tree figure omitted]
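A minimal Huffman-construction sketch in Python for the symbol frequencies listed above. Tie-breaking in the heap may yield codewords that differ from the slide's tree, but the code lengths (and hence the compression) are equally optimal.

import heapq

def huffman_codes(freqs):
    """Build a Huffman code table from {symbol: frequency}; illustrative sketch only."""
    # Heap items: (subtree frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)   # least frequent subtree gets prefix "0"
        f1, _, c1 = heapq.heappop(heap)   # next least frequent gets prefix "1"
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (f0 + f1, count, merged))
        count += 1
    return heap[0][2]

codes = huffman_codes({"A": 45, "B": 16, "C": 12, "D": 13, "E": 9, "F": 5})
print(codes)                                    # A ends up with a 1-bit code, E and F with 4-bit codes
print("".join(codes[s] for s in "ABCCDEAAB"))   # encode the slide's example string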
  • 61. 61 IMAGE CODING… • Lossless Compression – Run-Length Coding Reduce the number of samples to code Implementation is simple Input Sequence 0,0,-3,5,1,0,-2,0,0,0,0,2,-4,3,-2,0,0,0,1,0,0,-2,EOB Run-Length Sequence (2,-3)(0,5)(0,1)(1,-2)(4,2)(0,-4)(0,3)(0,-2)(3,1)(2,-2)EOB
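The (zero-run, level) pairing above can be reproduced with a few lines of Python; names are illustrative.

def run_level_encode(coeffs):
    """Encode a coefficient sequence as (zero_run, value) pairs plus an end-of-block marker."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1                 # count zeros preceding the next non-zero value
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")              # any trailing zeros are absorbed by the EOB marker
    return pairs

seq = [0, 0, -3, 5, 1, 0, -2, 0, 0, 0, 0, 2, -4, 3, -2, 0, 0, 0, 1, 0, 0, -2]
print(run_level_encode(seq))
# [(2, -3), (0, 5), (0, 1), (1, -2), (4, 2), (0, -4), (0, 3), (0, -2), (3, 1), (2, -2), 'EOB']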
  • 62. 62 IMAGE CODING… • Lossless Compression – Transform Coding (-1,1) (1,1) (0.4,1.4) = 0.4•(1,0)+1.4•(0,1) = 0.9•(1,1)+0.5•(-1,1) Basis vector { (1,0), (0,1) } New basis vector { (1,1), (-1,1) } New vector (0.9, 0.5) (0,1) (1,0)
  • 63. 63 IMAGE CODING… • Lossless Compression – Transform Coding : DCT Transform blocks of images to frequency domain, code only the significant transform coefficients 2D DCT – Transform Coding : DCT 8x8 DCT Basis Function
  • 64. 64 IMAGE CODING… • Lossless Compression – Transform Coding : DCT 2D DCT Coefficients
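A self-contained sketch of the 8x8 2-D DCT in separable matrix form (C X C^T with an orthonormal DCT-II matrix). This is a generic illustration, not code from any particular codec.

import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix: entry (k, i) = alpha(k) * cos(pi * (2i + 1) * k / (2n))."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row uses the smaller scale factor
    return c

def dct2(block):
    """2-D DCT of a square block via the separable form C @ X @ C.T."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

flat_block = np.full((8, 8), 100.0)
coeffs = dct2(flat_block)
print(round(float(coeffs[0, 0]), 1))  # 800.0: all energy of a flat block lands in the DC coefficient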
  • 65. 65 IMAGE CODING… • Lossy Compression – Lossy Predictive Coding
  • 66. 66 IMAGE CODING… • Lossy Compression – Quantization Many to one mapping Quantization is the most import means of irrelevancy reduction – Implementation Lookup Table Divide by quantization step-size (round/truncate)
  • 67. 67 IMAGE CODING… • Lossy Compression – Divide by quantization step-size Input signal: 0 1 2 3 4 5 6 7 (3 bits) Step-size: 2 Quantization: 0 0 1 1 2 2 3 3 (2 bits) Inverse quantization: 0 0 2 2 4 4 6 6 Quantization errors: 0 1 0 1 0 1 0 1 – Lookup Table Divide each DCT coefficient by an integer, discard remainder Result: loss of precision Typically, a few non-zero coefficients are left
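The divide-by-step-size example above, reproduced in a few lines of Python:

step = 2
signal = [0, 1, 2, 3, 4, 5, 6, 7]               # 3-bit input
quantized = [s // step for s in signal]          # forward quantization (2-bit indices)
reconstructed = [q * step for q in quantized]    # inverse quantization
errors = [s - r for s, r in zip(signal, reconstructed)]
print(quantized)       # [0, 0, 1, 1, 2, 2, 3, 3]
print(reconstructed)   # [0, 0, 2, 2, 4, 4, 6, 6]
print(errors)          # [0, 1, 0, 1, 0, 1, 0, 1]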
  • 68. 68 IMAGE CODING… • Lossy Compression – Zigzag Scan Efficient encoding of the position of non-zero transform coefficients "Scan" quantized coefficients in a zig-zag order Non-zero coefficients tend to be grouped together
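One common way to generate a zig-zag scan order is to walk the anti-diagonals of the block, alternating direction; the sketch below is a generic illustration (it matches the familiar JPEG-style pattern, but standards define their scan tables explicitly).

def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for d in range(2 * n - 1):                            # d indexes the anti-diagonals
        diag = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        order.extend(diag if d % 2 else reversed(diag))
    return order

print(zigzag_order(4)[:8])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2)]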
  • 69. 69 IMAGE CODING… • DCT + Quantization + Run-Level-Coding
  • 71. 71 VIDEO CODING • Lossless Compression • Lossy Compression • Transform Coding • Motion Coding
  • 72. 72 VIDEO CODING… • Video – Sequence of frames (images) that are related • Moving images contain significant temporal redundancy – Successive frames are very similar – Related along the temporal dimension - Temporal redundancy exists
  • 73. 73 VIDEO CODING… • Video Coding – The objective of video coding is to compress moving images – Main addition over image compression Temporal redundancy Video coder must exploit the temporal redundancy – The MPEG (Moving Picture Experts Group) and H.26X are the major standards for video coding • Video coding algorithms usually contains two coding schemes : – Intraframe coding Intraframe coding does not exploit the correlation among adjacent frames Intraframe coding therefore is similar to the still image coding – Interframe coding The interframe coding should include motion estimation/compensation process to remove temporal redundancy • Basic Concept – Use interframe correlation for attaining better rate distortion
  • 74. 74 VIDEO CODING… • Usually high frame rate: Significant temporal redundancy • Possible representations along temporal dimension – Transform/Subband Methods Good for the textbook case of constant-velocity uniform global motion Inefficient for nonuniform motion, i.e. real-world motion Requires a large number of frame stores Leads to delay (Memory cost may also be an issue) – Predictive Methods Good performance using only 2 frame stores However, simple frame differencing is not enough
  • 75. 75 VIDEO CODING… • Main addition over image compression – Exploit the temporal redundancy • Predict current frame based on previously coded frames • Types of coded frames – I-frame Intra-coded frame, coded independently of all other frames – P-frame Predictively coded frame, coded based on previously coded frame – B-frame Bi-directionally predicted frame, coded based on both previous and future coded frames
  • 76. 76 VIDEO CODING… • Motion-Compensated Prediction – Simple frame differencing fails when there is motion – Must account for motion Motion-compensated (MC) prediction – MC-prediction generally provides significant improvements – Questions How can we estimate motion? How can we form MC-prediction? • Motion Estimation – Ideal Situation Partition video into moving objects Describe object motion Generally very difficult – Practical approach: Block-Matching Motion Estimation Partition each frame into blocks Describe motion of each block No object identification required Good, robust performance
  • 77. 77 VIDEO CODING… • Block-Matching Motion Estimation – Assumptions Translational motion within block All pixels within each block have the same motion – ME Algorithm Divide current frame into non-overlapping N1xN2 blocks For each block, find the best matching block in reference frame – MC-Prediction Algorithm Use best matching blocks of reference frame as prediction of blocks in current frame
  • 78. 78 VIDEO CODING… • Block-Matching - Determining the Best Matching Block – For each block in the current frame search for best matching block in the reference frame Metrics for determining “best match” Candidate blocks: All blocks in, e.g., (± 32,±32) pixel area Strategies for searching candidate blocks for best match Full search: Examine all candidate blocks Partial (fast) search: Examine a carefully selected subset – Estimate of motion for best matching block: “motion vector” • Motion Vectors and Motion Vector Field – Motion Vector Expresses the relative horizontal and vertical offsets (mv1,mv2), or motion, of a given block from one frame to another Each block has its own motion vector – Motion Vector Field Collection of motion vectors for all the blocks in a frame
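A minimal full-search block-matching sketch in Python, using SAD (sum of absolute differences, i.e. the un-normalized form of the MAD criterion listed later) as the matching cost. Block size, search range and names are illustrative.

import numpy as np

def full_search(cur_block, ref, top, left, search=7):
    """Exhaustively search a (+/- search) window in the reference frame for the best SAD match.
    cur_block: N x N block of the current frame located at (top, left); returns ((dy, dx), SAD)."""
    n = cur_block.shape[0]
    best_sad, best_mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + n > ref.shape[0] or c + n > ref.shape[1]:
                continue                                  # candidate falls outside the reference frame
            sad = np.abs(cur_block - ref[r:r + n, c:c + n]).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad

# Toy test: the current frame is the reference shifted down by 1 and right by 2.
ref = np.zeros((32, 32)); ref[10:18, 12:20] = 1.0
cur = np.zeros((32, 32)); cur[11:19, 14:22] = 1.0
print(full_search(cur[11:19, 14:22], ref, 11, 14))        # ((-1, -2), 0.0)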
  • 79. 79 VIDEO CODING… • Example of Fast Search: 3-Step (Log) Search – Goal: Reduce number of search points Example:(± 7,±7) search area Dots represent search points Search performed in 3 steps (coarse-to-fine) – Step 1: (± 4 pixels ) – Step 2: (± 2 pixels ) – Step 3: (± 1 pixels ) – Best match is found at each step – Next step: Search is centered around the best match of prior step – Speedup increases for larger search areas
  • 80. 80 VIDEO CODING… • Motion Vector Precision – Motivation Motion is not limited to integer-pixel offsets However, video only known at discrete pixel locations To estimate sub-pixel motion, frames must be spatially interpolated – Fractional MVs are used to represent the sub-pixel motion – Improved performance (extra complexity is worthwhile) – Half-pixel ME used in most standards: MPEG-1/2/4 – Why are half-pixel motion vectors better? Can capture half-pixel motion Averaging effect (from spatial interpolation) reduces prediction error -> Improved prediction For noisy sequences, averaging effect reduces noise -> Improved compression
  • 81. 81 VIDEO CODING… • Practical Half-Pixel Motion Estimation Algorithm – Half-Pixel ME (coarse-fine) Algorithm Coarse Step: Perform integer motion estimation on blocks; find best integer- pixel MV Fine Step: Refine estimate to find best half-pixel MV Spatially interpolate the selected region in reference frame Compare current block to interpolated reference frame block Choose the integer or half-pixel offset that provides best match Typically, bilinear interpolation is used for spatial interpolation • Example – MC-Prediction for Two Consecutive Frames
  • 82. 82 VIDEO CODING… • Bi-Directional MC-Prediction – Bi-Directional MC-Prediction is used to estimate a block in the current frame from a block in Previous frame Future frame Average of a block from the previous frame and a block from the future frame – Motion compensated prediction Predict the current frame based on reference frame(s) while compensating for the motion – Examples of block-based motion-compensated prediction (P-frame) and bi-directional prediction (B-frame)
  • 83. 83 VIDEO CODING… • Motion Estimation and Compensation – The amount of data to be coded can be reduced significantly if the previous frame is subtracted from the current frame
  • 84. 84 VIDEO CODING… • Motion Estimation and Compensation – Uses Block-Matching The MPEG and H.26X standards use the block-matching technique for motion estimation/compensation In the block-matching technique, each current frame is divided into equal-size blocks, called source blocks Each source block is associated with a search region in the reference frame The objective of block-matching is to find a candidate block in the search region best matched to the source block The relative distances between a source block and its candidate blocks are called motion vectors [figure: video sequence with the reconstructed reference frame and the current frame; X: source block for block-matching, Bx: search area associated with X, MV: motion vector]
  • 85. 85 VIDEO CODING… • Motion Estimation and Compensation – Uses Block-Matching
  • 86. 86 VIDEO CODING… • Motion Estimation and Compensation The Reconstructed Previous Frame The Current Frame Results of Block- Matching The Predicted Current Frame
  • 87. 87 VIDEO CODING… • Motion Estimation and Compensation – Search Range – The size of the search range = (N1 + 2*dx,max)(N2 + 2*dy,max) – The number of candidate blocks = (2*dx,max + 1)(2*dy,max + 1)
  • 88. 88 VIDEO CODING… • Motion Estimation and Compensation – Motion Vector and Search Area Search area: (n + 2p) × (n + 2p) for an n × n block and maximum displacement p Motion vector: (u, v)
  • 89. 89 VIDEO CODING… • Motion Estimation and Compensation – Matching Function Mean square error (MSE) Mean absolute difference (MAD) Number of threshold differences (NTD) Normalized cross-correlation function (NCF) MSE(d1, d2) = (1 / (N1*N2)) ΣΣ [f(n1, n2, t) − f(n1 + d1, n2 + d2, t − 1)]^2 MAD(d1, d2) = (1 / (N1*N2)) ΣΣ |f(n1, n2, t) − f(n1 + d1, n2 + d2, t − 1)| (sums over n1 = 0..N1−1, n2 = 0..N2−1)
  • 90. 90 VIDEO CODING… • Motion Estimation and Compensation – Algorithm Full search block matching (FSB) Fast algorithm ▫ 2D Logarithmic Search (TDL) ▫ Three Step Search (TSS) ▫ Cross-Search Algorithm (CSA) ▫ … – Full Search Algorithm If p=7, then there are (2p+1)(2p+1)=225 candidate blocks. u v Search Area Candidate Block
  • 91. 91 VIDEO CODING… • Motion Estimation and Compensation – Full Search Algorithm Intensive computation Need for fast Motion Estimation !
  • 92. 92 VIDEO CODING… • Motion Estimation and Compensation – 2D Logarithmic Search Diamond-shape search area Matching function ▫ MSE [search-grid figure over the (−7..+7, −7..+7) displacement range omitted]
  • 93. 93 VIDEO CODING… • Motion Estimation and Compensation – Three-Step Search The first step involves block-matching based on 4-pel resolution at nine locations The second step involves block-matching based on 2-pel resolution around the location determined by the first step The third step repeats the process of the second step (but with 1-pel resolution) [search-grid figure omitted]
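A sketch of the three-step search just described: nine candidates are evaluated on a coarse grid, the search re-centers on the best one, and the step is halved (4, 2, 1 pixels). The SAD cost is the same idea as in the full-search sketch earlier; names are illustrative.

import numpy as np

def sad(cur_block, ref, r, c):
    n = cur_block.shape[0]
    if r < 0 or c < 0 or r + n > ref.shape[0] or c + n > ref.shape[1]:
        return np.inf                                     # candidate outside the reference frame
    return np.abs(cur_block - ref[r:r + n, c:c + n]).sum()

def three_step_search(cur_block, ref, top, left):
    """Three-step search over steps 4, 2, 1; only 3 x 9 = 27 candidate evaluations instead of 225."""
    best_dy, best_dx = 0, 0
    for step in (4, 2, 1):
        candidates = [(best_dy + sy * step, best_dx + sx * step)
                      for sy in (-1, 0, 1) for sx in (-1, 0, 1)]
        costs = [sad(cur_block, ref, top + dy, left + dx) for dy, dx in candidates]
        best_dy, best_dx = candidates[int(np.argmin(costs))]
    return best_dy, best_dx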
  • 94. 94 VIDEO CODING… • Motion Estimation and Compensation – Motion Vector Prediction predMVx = Median(MV1x, MV2x, MV3x) predMVy = Median(MV1y, MV2y, MV3y) MVx' = MVx − predMVx MVy' = MVy − predMVy
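The median prediction above, written out as a tiny Python sketch:

def predict_mv(mv1, mv2, mv3):
    """Component-wise median prediction from three neighbouring motion vectors (x, y)."""
    median = lambda a, b, c: sorted((a, b, c))[1]
    return median(mv1[0], mv2[0], mv3[0]), median(mv1[1], mv2[1], mv3[1])

pred = predict_mv((4, -2), (3, 0), (8, -1))
print(pred)                                    # (4, -1)
mv = (5, -1)
print((mv[0] - pred[0], mv[1] - pred[1]))      # (1, 0): only this small residual is coded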
  • 96. 96 VIDEO CODER ARCHITECTURE • Image / Video Coding Based on Block-Matching – Assume frame f-1 has been encoded and reconstructed, and frame f is the current frame to be encoded • Exploiting the redundancies – Temporal MC-Prediction (P and B frames) – Spatial Block DCT – Color Color Space Conversion • Scalar quantization of DCT coefficients • Zigzag scanning, runlength and Huffman coding of the nonzero quantized DCT coefficients
  • 97. 97 VIDEO CODER ARCHITECTURE… • Video Encoder – Divide frame f into equal-size blocks – For each source block, Find its motion vector using the block-matching algorithm based on the reconstructed frame f-1 Compute the DFD of the block – Transmit the motion vector of each block to decoder – Compress DFD’s of each block – Transmit the encoded DFD’s to decoder
  • 99. 99 VIDEO CODER ARCHITECTURE… • Video Decoder – Receive the motion vector of each block from the encoder – Based on the motion vector, find the best-matching block from the reference frame, i.e., find the predicted current frame from the reference frame – Receive the encoded DFD of each block from the encoder – Decode the DFD – Each reconstructed block in the current frame = its decompressed DFD + the best-matching block
  • 102. 102 VIDEO CODEC STANDARDS • Goal of Standards – Ensuring Interoperability Enabling communication between devices made by different manufacturers – Promoting a technology or industry – Reducing costs What do the Standards Specify?
  • 103. 103 VIDEO CODEC STANDARDS… What do the Standards Specify? • Not the encoder • Not the decoder • Just the bitstream syntax and the decoding process(e.g. use IDCT, but not how to implement the IDCT) – Enables improved encoding & decoding strategies to be employed in a standard-compatible manner
  • 104. 104 VIDEO CODEC STANDARDS… • The Scope of Picture and Video Coding Standardization – Only the Syntax and Decoder are standardized: Permits optimization beyond the obvious Permits complexity reduction for implementability Provides no guarantees of Quality Pre-Processing Encoding Source Destination Post-Processing & Error Recovery Decoding Scope of Standard
  • 106. 106 VIDEO CODEC STANDARDS… • Based on the same fundamental building blocks – Motion-compensated prediction (I, P, and B frames) – 2-D Discrete Cosine Transform (DCT) – Color space conversion – Scalar quantization, runlengths, Huffman coding • Additional tools added for different applications: – Progressive or interlaced video – Improved compression, error resilience, scalability, etc. • MPEG-1/2/4, H.261/3/4 – Frame-based coding • MPEG-4 – Object-based coding and Synthetic video
  • 107. 107 VIDEO CODEC STANDARDS… • The Video Standards uses all the three types of frames as shown below Encoding order: I0, P3, B1, B2, P6, B4, B5, I9, B7, B8. Playback order: I0, B1, B2, P3, B4, B5, P6, B7, B8, I9.
  • 108. 108 VIDEO CODEC STANDARDS… • Video Structure – Video standards code video sequences in hierarchy of layers – There are usually 5 Layers GOP (Group of Pictures) Picture Slice Macroblock Block
  • 109. 109 VIDEO CODEC STANDARDS… • Video Structure – A GOP usually starts with an I frame, followed by a sequence of P and B frames – A Picture is indeed a frame in the video sequence – A Slice is a portion of a picture Some standards do not have slices Some view a slice as a row Each slice in H.264 does not need to be a row It can be any shape containing an integral number of macroblocks – A Macroblock is a 16×16 block Many standards use Macroblocks as the basic unit for block-matching operations – A Block is an 8×8 block Many standards use the Blocks as the basic unit for DCT
  • 110. 110 VIDEO CODEC STANDARDS… • Scalable Video Coding – Three classes of scalable video coding techniques Temporal Scalability Spatial Scalability SNR Scalability – Uses B frames for attaining temporal scalability B frames depend on other frames No other frames depend on B frames Discard B frames without affecting other frames
  • 111. 111 VIDEO CODEC STANDARDS… • Scalable Video Coding – Spatial Scalability – Basically Resolution Scalability Here the base layer is the low-resolution version of the video sequence – The base layer uses a coarser quantizer for DFD coding – The residuals in the base layer are refined in the enhancement layer
  • 114. 114 HEVC • Video Coding Standards Overview Next Generation Broadcasting
  • 115. 115 HEVC… • MPEG-H – High Efficiency Coding and Media Delivery in Heterogeneous Environments a new suite of standards providing technical solutions for emerging challenges in multimedia industries – Part 1: System, MPEG Media Transport (MMT) Integrated services with multiple components in a hybrid delivery environment, providing support for seamless and efficient use of heterogeneous network environments, including broadcast, multicast, storage media and mobile networks – Part 2: Video, High Efficiency Video Coding (HEVC) Highly immersive visual experiences, with ultra high definition displays that give no perceptible pixel structure even if viewed from such a short distance that they subtend a large viewing angle (up to 55 degrees horizontally for 4Kx2K resolution displays, up to 100 degrees for 8Kx4K) – Part 3: Audio, 3D-Audio Highly immersive audio experiences in which the decoding device renders a 3D audio scene. This may be using 10.2 or 22.2 channel configurations or much more limited speaker configurations or headphones, such as found in a personal tablet or smartphone.
  • 116. 116 HEVC… • Transport/System Layer Integration – Ongoing definitions (MPEG, IETF, …, DVB): benefit from H.264/AVC – MPEG Media Transport (MMT)?
  • 117. 117 HEVC… • HEVC = High Efficiency Video Coding • Joint project between ISO/IEC MPEG and ITU-T VCEG – ISO/IEC: MPEG-H Part 2 (23008-2) – ITU-T: H.265 • JCT-VC committee – Joint Collaborative Team on Video Coding – Co-chairs: Dr. Gary Sullivan (Microsoft, USA) and Dr. Jens-Rainer Ohm (RWTH Aachen, Germany) • Target – Roughly half the bit-rate at the same subjective quality compared to H.264/AVC (50% bit-rate reduction over H.264/AVC) – x10 complexity max for the encoder and x2/3 max for the decoder • Requirements – Progressive required for all profiles and levels Interlaced support using field SEI message – Video resolution: sub-QVGA to 8Kx4K, with more focus on higher-resolution video content (1080p and up) – Color space and chroma sampling: YUV420, YUV422, YUV444, RGB444 – Bit-depth: 8-14 bits – Parallel Processing Architecture
  • 119. 119 HEVC… • Potential applications – Existing applications and usage scenarios IPTV over DSL : Large shift in IPTV eligibility Facilitated deployment of OTT and multi-screen services More customers on the same infrastructure: most IP traffic is video More archiving facilities – Existing applications and usage scenarios 1080p60/50 with bitrates comparable to 1080i Immersive viewing experience: Ultra-HD (4K, 8K) Premium services (sports, live music, live events,…): home theater, Bars venue, mobile HD 3DTV Full frame per view at today’s HD delivery rates What becomes possible with 50% video rate reduction?
  • 126. 126 HEVC… • Video Coding Techniques : Block-based hybrid video coding – Interpicture prediction Temporal statistical dependences – Intrapicture prediction Spatial statistical dependences – Transform coding Spatial statistical dependences • Uses YCbCr color space with 4:2:0 subsampling – Y component Luminance (luma) Represents brightness (gray level) – Cb and Cr components Chrominance (chroma). Color difference from gray toward blue and red
  • 127. 127 HEVC… • Video Coding Techniques : Block-based hybrid video coding – Motion compensation Quarter-sample precision is used for the MVs 7-tap or 8-tap filters are used for interpolation of fractional-sample positions – Intrapicture prediction 33 directional modes, planar (surface fitting), DC (flat) Modes are encoded by deriving most probable modes (MPMs) based on those of previously decoded neighboring PBs – Quantization control Uniform reconstruction quantization (URQ) – Entropy coding Context adaptive binary arithmetic coding (CABAC) – In-Loop deblocking filtering Similar to the one in H.264 and More friendly to parallel processing – Sample adaptive offset (SAO) Nonlinear amplitude mapping For better reconstruction of amplitude by histogram analysis
  • 128. 128 HEVC… • Coding Tree Unit (CTU) - A picture is partitioned into CTUs – The CTU is the basic processing unit instead of Macro Blocks (MB) – Contains luma CTBs and chroma CTBs A luma CTB covers L × L samples Two chroma CTBs cover each L/2 × L/2 samples – HEVC supports variable-size CTBs The value of L may be equal to 16, 32, or 64. Selected according to needs of encoders - In terms of memory and computational requirements Large CTB is beneficial when encoding high-resolution video content – CTBs can be used as CBs or can be partitioned into multiple CBs using quadtree structures – The quadtree splitting process can be iterated until the size for a luma CB reaches a minimum allowed luma CB size (8 × 8 or larger).
  • 129. 129 HEVC… • Block Structure – Coding Tree Units (CTU) Corresponds to macroblocks in earlier coding standards (H.264, MPEG2, etc) Luma and chroma Coding Tree Blocks (CTB) Quadtree structure to split into Coding Units (CUs) 16x16, 32x32, or 64x64, signaled in SPS
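To make the quadtree idea concrete, here is a generic recursive-splitting sketch; the split decision is a caller-supplied callback (in a real encoder it would come from a rate-distortion search), and all names are illustrative.

def split_ctb(x, y, size, min_size, should_split):
    """Recursively split a CTB rooted at (x, y) into leaf coding blocks (x, y, size)."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):                 # quadtree split into four half-size blocks
                blocks += split_ctb(x + dx, y + dy, half, min_size, should_split)
        return blocks
    return [(x, y, size)]                        # leaf coding block

# Toy policy: split a 64x64 CTB down to 16x16 only inside its top-left quadrant.
leaves = split_ctb(0, 0, 64, 8, lambda x, y, s: x < 32 and y < 32 and s > 16)
print(len(leaves), leaves[:4])                   # 7 leaf CBs in total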
  • 130. 130 HEVC… • A new framework composed of three new concepts – Coding Units (CU) – Prediction Units (PU) – Transform Units (TU) • The decision whether to code a picture area using inter or intra prediction is made at the CU level Goal: To be as flexible as possible and to adapt the compression-prediction to image peculiarities
  • 131. 131 HEVC… • Block Structure – Coding Units (CU) Luma and chroma Coding Blocks (CB) Rooted in CTU Intra or inter coding mode Split into Prediction Units (PUs) and Transform Units (TUs)
  • 132. 132 HEVC… • Block Structure – Prediction Units (PU) Luma and chroma Prediction Blocks (PB) Rooted in CU Partition and motion info
  • 133. 133 HEVC… • Block Structure – Transform Units (TU) Rooted in CU 4x4, 8x8, 16x16, 32x32 DCT, and 4x4 DST
  • 135. 135 HEVC… • Intra Prediction – 35 intra modes: 33 directional modes + DC + planar – For chroma, 5 intra modes: DC, planar, vertical, horizontal, and luma derived – Planar prediction (Intra_Planar) Amplitude surface with a horizontal and vertical slope derived from boundaries – DC prediction (Intra_DC) Flat surface with a value matching the mean value of the boundary samples – Directional prediction (Intra_Angular) 33 different directional prediction is defined for square TB sizes from 4×4 up to 32×32
  • 136. 136 HEVC… • Intra Prediction – Adaptive reference sample filtering 3-tap filter: [1 2 1]/4 Not performed for 4x4 blocks For larger than 4x4 blocks, adaptively performed for a subset of modes Modes except vertical/near-vertical, horizontal/near-horizontal, and DC – Mode dependent adaptive scanning 4x4 and 8x8 intra blocks only All other blocks use only diagonal upright scan (left-most scan pattern)
  • 137. 137 HEVC… • Intra Prediction – Boundary smoothing Applied to DC, vertical, and horizontal modes, luma only Reduces boundary discontinuity – For DC mode, 1st column and row of samples in predicted block are filtered – For Hor/Ver mode, first column/row of pixels in predicted block are filtered
  • 138. 138 HEVC… • Inter Prediction – Fractional sample interpolation ¼ pixel precision for luma – DCT based interpolation filters 8-/7- tap for luma 4-tap for chroma Supports 16-bit implementation with non-normative shift – High precision interpolation and biprediction – DCT-IF design Forward DCT, followed by inverse DCT
  • 139. 139 HEVC… • Inter Prediction – Asymmetric Motion Partition (AMP) for Inter PU – Merge Derive motion (MV and ref pic) from spatial and temporal neighbors Which spatial/temporal neighbor is identified by merge_idx Number of merge candidates (≤ 5) signaled in slice header Skip mode = merge mode + no residual – Advanced Motion Vector Prediction (AMVP) Use spatial/temporal PUs to predict current MV
  • 140. 140 HEVC… • Transforms – Core transforms: DCT based 4x4, 8x8, 16x16, and 32x32 Square transforms only Support partial factorization Near-orthogonal Nested transforms – Alternative 4x4 DST 4x4 intra blocks, luma only – Transform skipping mode By-pass the transform stage Most effective on “screen content” 4x4 TBs only
  • 141. 141 HEVC… • Scaling and Quantization – HEVC uses a uniform reconstruction quantization (URQ) scheme controlled by a quantization parameter (QP). – The range of the QP values is defined from 0 to 51
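As in H.264/AVC, the HEVC quantization step size roughly doubles for every increase of 6 in QP; the commonly cited approximate relation (not stated on the slide) is

Q_{step}(QP) \approx 2^{(QP - 4)/6}

so QP = 4 corresponds to a step size of about 1, and the top of the 0-51 range corresponds to very coarse quantization.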
  • 142. 142 HEVC… • Entropy Coding – One entropy coder, CABAC Reuse H.264 CABAC core algorithm More friendly to software and hardware implementations Easier to parallelize, reduced HW area, increased throughput – Context modeling Reduced # of contexts Increased use of by-pass bins Reduced data dependency – Coefficient coding Adaptive coefficient scanning for intra 4x4 and 8x8 ▫ Diagonal upright, horizontal, vertical Processed in 4x4 blocks for all TU sizes Sign data hiding: ▫ Sign of first non-zero coefficient conditionally hidden in the parity of the sum of the non-zero coefficient magnitudes ▫ Conditions: 2 or more non-zero coefficients, and “distance” between first and last coefficient > 3
  • 143. 143 HEVC… • Entropy Coding - CABAC – Binarization: CABAC uses Binary Arithmetic Coding which means that only binary decisions (1 or 0) are encoded. A non-binary-valued symbol (e.g. a transform coefficient or motion vector) is "binarized" or converted into a binary code prior to arithmetic coding. This process is similar to the process of converting a data symbol into a variable length code but the binary code is further encoded (by the arithmetic coder) prior to transmission. – Stages are repeated for each bit (or "bin") of the binarized symbol. – Context model selection: A "context model" is a probability model for one or more bins of the binarized symbol. This model may be chosen from a selection of available models depending on the statistics of recently coded data symbols. The context model stores the probability of each bin being "1" or "0". – Arithmetic encoding: An arithmetic coder encodes each bin according to the selected probability model. Note that there are just two sub-ranges for each bin (corresponding to "0" and "1"). – Probability update: The selected context model is updated based on the actual coded value (e.g. if the bin value was "1", the frequency count of "1"s is increased)
  • 144. 144 HEVC… • Parallel Processing Tools – Slices – Tiles – Wavefront parallel processing (WPP) – Dependent Slices • Slices – Slices are a sequence of CTUs that are processed in the order of a raster scan. Slices are self-contained and independent – Each slice is encapsulated in a separate packet
  • 145. 145 HEVC… • Tile – Self-contained and independently decodable rectangular regions – Tiles provide parallelism at a coarse level of granularity More tiles than cores → not efficient, breaks dependencies
  • 146. 146 HEVC… • WPP – A slice is divided into rows of CTUs. Parallel processing of rows – The decoding of each row can begin as soon as a few decisions have been made in the preceding row for the adaptation of the entropy coder. – Better compression than tiles. Parallel processing at a fine level of granularity. No WPP with tiles !!
  • 147. 147 HEVC… • Dependent Slices – Separate NAL units but dependent (can only be decoded after part of the previous slice) – Dependent slices are mainly useful for ultra-low-delay applications Remote surgery – Error resiliency gets worse – Low delay – Good efficiency → goes well with WPP
  • 148. 148 HEVC… • Slice Vs Tile – Tiles are kind of zero overhead slices Slice header is sent at every slice but tile information once for a sequence Slices have packet headers too Each tile can contain a number of slices and vice versa – Slices are for : Controlling packet sizes Error resiliency – Tiles are for: Controlling parallelism (multiple core architecture) Defining ROI regions
  • 149. 149 HEVC… • Tile Vs WPP – WPP Better compression than tiles Parallel processing at a fine level of granularity But … Needs frequent communication between processing units If high number of cores Can’t get full utilization – Good for when Relatively small number of nodes Good inter core communication No need to match to MTU size Big enough shared cache
• 150. 150 HEVC…
• In-Loop Filters
  – Two processing steps, a deblocking filter (DBF) followed by a sample adaptive offset (SAO) filter, are applied to the reconstructed samples
    The DBF is intended to reduce the blocking artifacts caused by block-based coding
    The DBF is applied only to samples located at block boundaries
    The SAO filter is applied adaptively to all samples satisfying certain conditions, e.g. based on the local gradient
• 151. 151 HEVC…
• Loop Filters: Deblocking
  – Applied to all samples adjacent to a PU or TU boundary
    Except when the boundary is also a picture boundary, or when deblocking is disabled across slice or tile boundaries
  – HEVC applies the deblocking filter only to edges that are aligned on an 8×8 sample grid
    This restriction reduces the worst-case computational complexity without noticeable degradation of visual quality
    It also improves parallel-processing operation
  – The processing order is horizontal filtering of vertical edges for the entire picture first, followed by vertical filtering of horizontal edges
• 152. 152 HEVC…
• Loop Filters: Deblocking
  – Simpler deblocking filter in HEVC (vs. H.264)
  – The deblocking filter boundary strength is set according to (a simplified sketch follows):
    Block coding mode
    Existence of non-zero coefficients
    Motion vector difference
    Reference picture difference
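A simplified boundary-strength decision following the four criteria above; the structure and the one-integer-sample motion-vector threshold are a sketch, not the exact specification text, and the dictionary fields are invented for illustration.

    # Simplified boundary-strength (BS) decision for one luma block edge.
    def boundary_strength(p, q):
        """p, q: dicts describing the two blocks adjacent to the edge."""
        if p["intra"] or q["intra"]:
            return 2                                   # strongest filtering
        if p["nonzero_coeffs"] or q["nonzero_coeffs"]:
            return 1
        if p["ref_pic"] != q["ref_pic"]:
            return 1
        # MV difference of one integer sample or more (MVs in quarter-pel units).
        if any(abs(a - b) >= 4 for a, b in zip(p["mv"], q["mv"])):
            return 1
        return 0                                       # no filtering

    blk_p = {"intra": False, "nonzero_coeffs": False, "ref_pic": 0, "mv": (12, -3)}
    blk_q = {"intra": False, "nonzero_coeffs": False, "ref_pic": 0, "mv": (5, -3)}
    print(boundary_strength(blk_p, blk_q))   # MVs differ by >= 1 sample -> BS = 1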
• 153. 153 HEVC…
• Loop Filters: SAO
  – A process that modifies the decoded samples by conditionally adding an offset value to each sample after the deblocking filter has been applied, based on values in look-up tables transmitted by the encoder
  – SAO: Sample Adaptive Offset
    New loop filter in HEVC
    Non-linear filter
  – For each CTB, the SAO type and parameters are signalled
  – The encoder decides the SAO type and estimates the SAO parameters (rate-distortion optimization); a band-offset sketch follows
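A minimal sketch of SAO band offset for 8-bit samples, assuming 32 bands of width 8 and offsets signalled for 4 consecutive bands; edge-offset mode and the CTB-level signalling are omitted, and the names are illustrative.

    # Apply SAO band offsets to deblocked 8-bit samples (band-offset mode only).
    def sao_band_offset(samples, start_band, offsets):
        out = []
        for s in samples:
            band = s >> 3                              # 256 / 32 = 8 values per band
            if start_band <= band < start_band + len(offsets):
                s = min(255, max(0, s + offsets[band - start_band]))
            out.append(s)
        return out

    deblocked = [16, 40, 41, 66, 90, 200]
    print(sao_band_offset(deblocked, start_band=5, offsets=[2, -1, 3, 0]))
    # -> [16, 42, 43, 66, 90, 200]: only samples falling in bands 5..8 are adjusted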
• 154. 154 HEVC…
• Special Coding
  – I_PCM mode
    Prediction, transform, quantization and entropy coding are bypassed
    The samples are directly represented by a pre-defined number of bits
    The main purpose is to avoid excessive consumption of bits when the signal characteristics are extremely unusual and cannot be handled properly by hybrid coding
  – Lossless mode
    The transform, quantization, and other processing that affects the decoded picture are bypassed
    The residual signal from inter- or intra-picture prediction is fed directly into the entropy coder
    This allows mathematically lossless reconstruction
    SAO and deblocking filtering are not applied to these regions
  – Transform skipping mode
    Only the transform is bypassed
    Improves compression for certain types of video content such as computer-generated images or graphics mixed with camera-view content
    Can be applied to TBs of 4×4 size only
  (A residual-path sketch follows.)
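To make the residual-path differences concrete, here is a small dispatcher with placeholder transform and quantizer functions; I_PCM is omitted because it carries raw samples rather than a residual. All names are illustrative, not codec APIs.

    # How the special modes change the residual coding path (placeholders only).
    def forward_transform(block):        # stand-in for the real core transform
        return [2 * v for v in block]

    def quantize(block, qstep=4):        # stand-in scalar quantizer
        return [round(v / qstep) for v in block]

    def code_residual(residual, mode):
        if mode == "lossless":
            return residual                           # transform + quantization bypassed
        if mode == "transform_skip":
            return quantize(residual)                 # only the transform bypassed
        return quantize(forward_transform(residual))  # normal hybrid path

    res = [5, -3, 0, 1]
    for m in ("normal", "transform_skip", "lossless"):
        print(m, code_residual(res, m))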
• 155. 155 HEVC…
• High-Level Parallelism (summary)
  – Slices
    Independently decodable packets
    Sequence of CTUs in raster scan
    Error resilience and parallelization
  – Tiles
    Independently decodable (re-entry)
    Rectangular region of CTUs
    Parallelization (especially at the encoder)
    1 slice = more tiles, or 1 tile = more slices
  – WPP
    Rows of CTUs
    Decoding of each row can be parallelized
    A CTU can start once the required CTUs in the row above are finished
  – Main profile does not allow the tiles + WPP combination
• 156. 156 HEVC…
• Profiles, Levels and Tiers
  – Historically, a profile defines a collection of coding tools, whereas a level constrains decoder processing load and memory requirements
  – The first version of HEVC defined 3 profiles
    Main profile: 8-bit video in YUV 4:2:0 format
    Main 10 profile: same as Main, up to 10-bit
    Main Still Picture profile: same as Main, one picture only
  – Levels and Tiers
    Levels: max sample rate, max picture size, max bit rate, DPB and CPB size, etc.
    Tiers: “main tier” and “high tier” within one level
• 157. 157 HEVC…
• Complexity Analysis
  – Software-based HEVC decoder capabilities (published by NTT Docomo)
    Single-threaded: 1080p@30 on ARMv7 (1.3 GHz), 1080p@60 on i5 (2.53 GHz)
    Multi-threaded: 4Kx2K@60 at 12 Mbps on i7 (2.7 GHz), decoding speed up to 100 fps
  – Other independent software-based real-time HEVC decoder implementations were published by Samsung and Qualcomm during HEVC development
  – Decoder complexity is not substantially higher (than H.264/AVC)
    More complex modules: MC, transform, intra prediction, SAO
    Simpler modules: CABAC and deblocking