Video compression reduces the large volume of uncompressed video data by exploiting both spatial and temporal redundancy. Spatial redundancy refers to correlations between nearby pixels within a frame, while temporal redundancy refers to similarities between adjacent frames. Video compression reduces these redundancies to achieve high compression ratios through techniques such as predictive coding of frames, motion compensation to account for object movement, and encoding of residual blocks.
3. The volume of uncompressed video data can be extremely large.
The reason is that video contains much spatial and temporal redundancy. In a single frame, nearby pixels are often correlated with each other; this is called spatial redundancy, or intraframe correlation. The other kind is temporal redundancy, meaning that adjacent frames are highly correlated; this is also called interframe correlation. Therefore, our goal is to efficiently reduce spatial and temporal redundancy to achieve video compression.
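To make the two kinds of redundancy concrete, here is a minimal sketch (assuming two consecutive grayscale frames stored as NumPy arrays; the function names are illustrative, not taken from any codec) that measures each redundancy as a correlation coefficient:

import numpy as np

def spatial_correlation(frame):
    # Correlation between horizontally adjacent pixels within one frame.
    left = frame[:, :-1].ravel().astype(np.float64)
    right = frame[:, 1:].ravel().astype(np.float64)
    return np.corrcoef(left, right)[0, 1]

def temporal_correlation(frame_a, frame_b):
    # Correlation between co-located pixels of two adjacent frames.
    return np.corrcoef(frame_a.ravel().astype(np.float64),
                       frame_b.ravel().astype(np.float64))[0, 1]

For natural video both values are typically close to 1, which is exactly the redundancy the following slides set out to remove.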
4. Video consists of a time-ordered sequence of frames. An obvious approach to video compression is therefore predictive coding based on previous frames.
5. Video compression deals with the compression of visual video data. Video contains spatial and temporal redundancy, which we reduce using lossy compression: image compression techniques reduce spatial redundancy, and motion compensation techniques reduce temporal redundancy.
6. Motion-compensated prediction was invented in the 1960s. To reduce temporal redundancy in a video sequence, we use this motion-compensated prediction method. Temporal redundancy is exploited so that not every frame of the video needs to be coded independently as a new image. Instead, the difference between the current frame and other frame(s) in the sequence is coded; the difference values are small and have low entropy, which is good for compression.
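A minimal sketch of difference coding (assuming 8-bit grayscale frames as NumPy arrays; the zeroth-order entropy estimate here is a simplification of what a real entropy coder achieves):

import numpy as np

def frame_residual(current, previous):
    # Difference-code the current frame against the previous one; for
    # typical video the residual values cluster tightly around zero.
    return current.astype(np.int16) - previous.astype(np.int16)

def entropy_bits(values):
    # Zeroth-order entropy in bits per symbol, estimated from a histogram.
    _, counts = np.unique(values.ravel(), return_counts=True)
    p = counts / values.size
    return float(-(p * np.log2(p)).sum())

Comparing entropy_bits(frame_residual(cur, prev)) with entropy_bits(cur) on real footage shows the entropy drop that makes difference coding worthwhile.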
7. • A practical and widely used method of motion compensation is to compensate for the movement of blocks of the current frame, as shown in the figure.
• The following procedure is carried out for each M×N block in the current frame:
8. Search an area in the reference frame (a past or future frame, previously coded and transmitted) to find the best-matching M×N block. This is carried out by comparing the M×N block in the current frame with all of the possible M×N regions in the search area. This process of finding the best match is known as Motion Estimation (ME).
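A minimal full-search sketch of ME (a hypothetical helper, not code from any standard; it assumes grayscale NumPy frames and uses the mean absolute difference defined on a later slide as the matching criterion):

import numpy as np

def motion_estimate(current, reference, x, y, N=16, p=7):
    # Find the best-matching N x N block in the reference frame within
    # +/-p pixels of the block at (x, y) in the current frame.
    target = current[y:y+N, x:x+N].astype(np.int32)
    best_mad, best_mv = float("inf"), (0, 0)
    for j in range(-p, p + 1):        # vertical displacement
        for i in range(-p, p + 1):    # horizontal displacement
            ry, rx = y + j, x + i
            if (ry < 0 or rx < 0 or
                    ry + N > reference.shape[0] or
                    rx + N > reference.shape[1]):
                continue              # candidate lies outside the frame
            candidate = reference[ry:ry+N, rx:rx+N].astype(np.int32)
            mad = np.abs(target - candidate).mean()
            if mad < best_mad:
                best_mad, best_mv = mad, (i, j)
    return best_mv, best_mad          # motion vector and its MAD

Exhaustive search is the simplest and most accurate strategy; practical encoders usually replace it with faster patterns such as three-step or diamond search.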
9. When the best-matching block is found in the reference frame by motion estimation, we subtract it from the current macroblock to produce a residual macroblock. Within the encoder, the residual is encoded and decoded, then added to the matching region to form a reconstructed macroblock, which is stored as a reference for further motion-compensated prediction.
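A sketch of that encoder-side loop (the uniform quantizer here is a hypothetical stand-in for a real transform-plus-quantization stage):

import numpy as np

def encode_and_reconstruct(current_block, prediction, step=8):
    # Residual = current block minus its motion-compensated prediction.
    residual = current_block.astype(np.int32) - prediction.astype(np.int32)
    symbols = np.round(residual / step).astype(np.int32)  # "encoded" residual
    # Decode exactly as the decoder will, then add back the prediction,
    # so the encoder's and decoder's reference frames never drift apart.
    reconstructed = prediction.astype(np.int32) + symbols * step
    reconstructed = np.clip(reconstructed, 0, 255).astype(np.uint8)
    return symbols, reconstructed  # transmit symbols; store reconstructed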
10. The residual block is encoded and transmitted, and the offset between the current block and the position of the best-matching candidate region, the motion vector, is transmitted along with it. The motion vector is thus the displacement between the current block and the reference block.
11. – The current image frame is referred to as the Target Frame.
– A match is sought between the macroblock in the Target Frame and the most similar macroblock in previous and/or future frame(s) (referred to as Reference frame(s)).
– The displacement of the reference macroblock relative to the target macroblock is called a motion vector (MV).
– The figure shows the case of forward prediction, in which the Reference frame is taken to be a previous frame.
13. MV search is usually limited to a small immediate neighborhood: both horizontal and vertical displacements lie in the range [−p, p]. This gives a search window of size (2p+1) × (2p+1).
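For example, p = 7 gives a 15 × 15 window, i.e. 225 candidate positions to evaluate for each block.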
14. The difference between two macroblocks can then be measured by their Mean Absolute Difference (MAD):
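MAD(i, j) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \left| C(x+k,\ y+l) - R(x+i+k,\ y+j+l) \right|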
15. N – size of the macroblock,
k and l – indices for pixels in the macroblock,
i and j – horizontal and vertical displacements,
C(x+k, y+l) – pixels in the macroblock in the Target frame,
R(x+i+k, y+j+l) – pixels in the macroblock in the Reference frame.
17. A black-and-white TV picture is generated by exciting the phosphor on the television screen with an electron beam whose intensity is modulated to generate the image. The line created by a horizontal traversal of the electron beam is called a line of the image. To trace the second line, the electron beam has to be deflected back to the left of the screen. The gun is turned off during this retrace to prevent it from becoming visible.
18. To keep the bandwidth cost low, it was decided to send 525 lines 30 times per second. These 525 lines are said to constitute a frame. To avoid flicker, each frame is divided into two interlaced fields, so the screen is refreshed 60 times per second without increasing the bandwidth.
20. In colour TV, instead of a single electron gun, we have three electron guns, which excite red, green, and blue phosphor dots embedded in the screen. To control them we require three signals: red, green, and blue. If these were transmitted individually, they would require three times the bandwidth; there is also the problem of backward compatibility with existing black-and-white receivers.
21. The human visual system (HVS) is less sensitive to color than to luminance (brightness). Separating the two is therefore a popular way of efficiently representing color images: Y is the luminance (luma) component, and the color information is represented as color-difference (chrominance or chroma) components, where each chrominance component is the difference between R, G, or B and the luminance Y.
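A minimal sketch of this representation (the luma weights and difference scale factors below are the ITU-R BT.601 values, one common choice; the slide describes the general idea, of which this is one standard instance):

def rgb_to_ycbcr(r, g, b):
    # Luma: a weighted sum of R, G and B (BT.601 weights assumed here).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)  # scaled blue color difference
    cr = 0.713 * (r - y)  # scaled red color difference
    return y, cb, cr

Because the HVS tolerates coarser chroma, Cb and Cr can then be subsampled (for example 4:2:0) with little visible loss.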