Research and activity report

Activity & research report
Marco Cagnazzo
Paris, September 2013

Overview
• Activity report
– Teaching and PhD supervision
– Projects and other activities
– Bibliometrics
• Research themes
– Video coding optimization
• Motion representation
• 3D video coding
– Adaptive image compression
– Distributed video coding
• Multiview DVC
• SI effectiveness evaluation
– Robust video streaming
• Streaming protocols
• Network coding
• Conclusions

Timeline
• 2002-2005 : PhD @ University of Naples & University
of Nice-Sophia Antipolis (cotutelle)
• 2005-2006 : Post-doc @ National Multimedia Lab
(Naples) & Assistant professor @ University of Naples
• 2006-2008 : Post-doc @ I3S Lab, Sophia Antipolis
• Since February 2008: Maître de conférences in Digital
Video @ Telecom-ParisTech

Teaching
Name                           Institution                         Years      Role
Information Theory             “Parthenope” University of Naples   2004-2005  Responsible
Multimedia signal processing   “Federico II”                       2005-2007  Responsible
Compression techniques         Telecom-ParisTech                   2008- …    Co-responsible
Digital video and multimedia   Telecom-ParisTech                   2009- …    Responsible
Digital television             Telecom-ParisTech                   2009- …    Co-responsible, CE
Video over mobile              Telecom-ParisTech                   2009- …    Co-responsible, CE
3D Video                       Telecom-ParisTech                   2010- …    Co-responsible, CE

Teaching
• Collaborative Learning Thematic Project
• Tools and applications for signals, images and sound
• Image processing and analysis
• Advanced methods for image processing
• Computer vision
• Web Mining
• Introduction to image processing (ATHENS)
• Multimedia Indexing and Retrieval (ATHENS)
• Short and long student projects (“projet libres” and “stages”)
• Image and video compression
• Video over IP
• Signal and image processing
• Wavelet and signal processing
Total: ≈1100 hours (heures équivalentes TD)

PhD students
Name Years Subject
Marwa Meddeb 2013 - Video-conference with HEVC
Marco Calemme 2012 - 3D Video and Depth coding
Aniello Fiengo 2012 - Rate allocation for video
Giovanni Chierchia 2011 - Convex optimization
Elie Gabriel Mora 2011 - 3D Video Compression
Giovanni Petrazzuoli 2009 - 2013 DVC and IMVS
Abdel-Bassir Abou El Ailah 2009 - 2012 DVC and FRI signals
Claudio Greco 2008 - 2012 Robust video streaming
Thomas Maugey 2007 - 2010 Multiview DVC
In addition, supervision of about a dozen MSc students

Research projects
Name Period Subject
LABNET 2001-2002 Low-complexity video coding
CNRAED 2004-2005 Hyper-spectral image coding
CPRE 46 04 06 11 2006-2007 Region based motion vector coding
Secure Media SIM 2007-2008 Secure video coding over SIM card
AIBER 2008 Wavelet-based scalable video coding
DIVINE 2007-2009 Robust video coding
DITEMOI 2007-2010 Video streaming over wireless networks (*)
PERSEE 2009-2013 Perceptual 2D and 3D video coding (*)
SWAN 2011-2013 Network coding
SURICATE Approved Video protection
WOW Submitted Interactive 3D streaming (**)
(*) Responsible for Telecom-ParisTech
(**) Project coordinator
Moreover: smaller contributions to ACDC, Pingo, Sebastian 2, NeVEx

Other responsibilities
• 8 PhD Thesis committees (4 as examiner, 4 as co-supervisor)
• Area editor for 2 Elsevier journals (SPIC, SIGPRO)
• Reviewer for main journals and conferences in the field
• Participation in conference organization (organizing committees of
MMSP’10, EUVIP’11, EUSIPCO’12, ICIP’14)
• Special session co-organization (EUSIPCO’10, DSP’11, WIAMIS’13,
ASILOMAR’13)
• Correspondant académique between Telecom-ParisTech and the
• Yearly Erasmus lessons at University of Naples
• Invited lesson at the Winter Doctoral School, University of Naples
(2010)
• IEEE Senior Member (‘11), IEEE SPS Member, EURASIP Member

Bibliometrics
• 15 journal papers: 13 published, 2 to appear
– One paper selected as “High quality paper” by the IEEE MMTC-R Letter board, and
included in the January 2013 issue
• 4 submitted journal papers: 2 in first round; 2 in preparation for the
second round
• 3 journal papers in preparation
• 59 conference papers: 56 published and 3 to appear
– Two MMSP Top 10% awards
• One standardization contribution
• One co-edited book
– F. Dufaux, B. Pesquet-Popescu, M Cagnazzo (eds.): Emerging Technologies for 3D
Video. Wiley, 2013
• 9 book chapters: 3 published and 6 to appear
• According to the Google Scholar web site, my H-index is equal to 13
(update: August 31, 2013)

VIDEO CODING OPTIMIZATION
1 Standardization contribution
8 Conference papers
1 Submitted journal paper
4 Journal papers

Motion vector representation
• Quantization of motion vectors to reduce their coding
cost
• Motion vector refinement and dense motion vector
representation generated at the decoder
• Lossless coding of segmented motion fields
• Motion estimation for wavelet-based video coding

Motion vector quantization
[Figure: encoder and decoder block diagrams (ME, MC, DCT, Q, IDCT, frame buffers), with labels 𝜆, 𝒗∗, 𝜃, Q_p and B(Q_p)]
• M. Cagnazzo, M. Agostini, M. Antonini, G. Laroche, and J. Jung, “Motion vector quantization for efficient low-bitrate video coding,” in SPIE Visual
Communications and Image Processing Conference, vol. 7257, (San Jose, California), 2009.
• S. Corrado, M. Agostini, M. Cagnazzo, M. Antonini, G. Laroche, and J. Jung, “Improving H.264 performances by quantization of motion vectors,” in
Picture Coding Symposium, (Chicago, IL), 2009.
• M. Agostini, M. Cagnazzo, M. Antonini, G. Laroche, and J. Jung, “A new coding mode for hybrid video coders based on quantized motion vectors,” IEEE
Transactions on Circuits and Systems for Video Technology, vol. 21, pp. 946–956, July 2011.

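Motion vector quantization
[Figure: encoder and decoder block diagrams (ME, MC, DCT, Q, IDCT, frame buffers) with an additional quantizer (Q) applied to the motion vectors; labels 𝜆, 𝒗∗, 𝜃, Q_p, Q_v, 𝒗(Q_v) and B(Q_p, Q_v)]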

Quantization step for motion vectors
• Double-pass approach
– Estimation of the best step over a frame
– Actual encoding with the selected step
• Estimation:
– Sum of distortions
– Oracle (used as reference)
• Results: average rate reduction ≈ 4% with respect to H.264
and ≈ 8% with respect to H.264 1/8-pel
NB: All rate reductions for video are measured using the Bjontegaard
metric (approximated average rate reduction for the same PSNR over a
given interval)
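The Bjontegaard rate measure mentioned in the NB can be sketched as follows. This is a minimal illustration of the commonly used procedure (cubic fit of log-rate versus PSNR, averaged over the common PSNR interval), not the exact tool used for the reported figures; the function name `bd_rate` and the use of four RD points per codec are assumptions.

```python
import numpy as np

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Approximate average rate change (%) of the test codec at equal PSNR."""
    psnr_ref = np.asarray(psnr_ref, dtype=float)
    psnr_test = np.asarray(psnr_test, dtype=float)
    # Cubic fit of log10(rate) as a function of PSNR for each codec
    fit_ref = np.polyfit(psnr_ref, np.log10(rate_ref), 3)
    fit_test = np.polyfit(psnr_test, np.log10(rate_test), 3)
    # Integrate both fits over the common PSNR interval
    lo = max(psnr_ref.min(), psnr_test.min())
    hi = min(psnr_ref.max(), psnr_test.max())
    int_ref = np.polyval(np.polyint(fit_ref), [lo, hi])
    int_test = np.polyval(np.polyint(fit_test), [lo, hi])
    avg_log_diff = ((int_test[1] - int_test[0]) - (int_ref[1] - int_ref[0])) / (hi - lo)
    # Convert the average log-rate difference back to a percentage
    return (10.0 ** avg_log_diff - 1.0) * 100.0
```

A negative value means that the test codec needs less rate than the reference for the same PSNR.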

Differential techniques for ME
[Figure: block diagram of the layered coder: input video, BMA ME, hybrid coder producing the H.264 stream (residual, MVs), differential MV refinement (side info), and a second hybrid coder producing the enhancement layer (residual, MVs)]
M. Cagnazzo and B. Pesquet-Popescu, “Introducing differential motion estimation into hybrid video coders,” in SPIE Visual Communications and Image
Processing Conference, vol. 1, (Huang Shan, An Hui, China), pp. 1–4, 2010.

Differential ME in hybrid video coding
• Layered representation of video
• Base layer compatible with any hybrid technique
• Enhancement layer uses costless refined vectors
$$\delta\mathbf{v}(n,m) = \frac{-e(n,m)}{\lambda + \|\boldsymbol{\phi}(n,m)\|^2}\,\boldsymbol{\phi}(n,m)$$
• The refinement depends on the motion compensated
error image 𝑒 and on the motion compensated
reference image gradient 𝝓
• Proof of principle, small improvements (up to almost
1% rate reduction)
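The refinement formula of this slide can be evaluated per pixel as in the sketch below. The array layout (error image of shape H×W, gradient of shape H×W×2) and the function name `refine_vectors` are assumptions made for illustration.

```python
import numpy as np

def refine_vectors(err, grad, lam):
    """Per-pixel refinement: delta_v = -e * phi / (lam + ||phi||^2).

    err  : (H, W) motion compensated error image e
    grad : (H, W, 2) gradient phi of the motion compensated reference image
    lam  : positive regularization constant (lambda)
    """
    err = np.asarray(err, dtype=float)
    grad = np.asarray(grad, dtype=float)
    norm2 = np.sum(grad ** 2, axis=-1)  # squared gradient norm per pixel
    return -(err / (lam + norm2))[..., None] * grad
```

Since the refinement only uses the error image and the reference gradient, it can be recomputed wherever both are available, which is what makes the refined vectors costless in rate.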

Context quantization
• Target: exploit high-order statistical dependencies in
segmented motion fields to reduce the coding rate
(lossless coding)
• Tool: context-based lossless encoder
– Implemented with an arithmetic coder
• Problem: high-order dependencies → large context → context dilution
– I.e. too many contexts, difficult to estimate conditional
probabilities
• Solution: context quantization
• M. Cagnazzo, M. Antonini, and M. Barlaud, “Mutual information-based context quantization,” Signal Proc.: Image Comm. (Elsevier Science), pp. 64–74,
Jan. 2010.

• Contexts (i.e. sequences of already encoded symbols)
are grouped into classes
• Rate increase: the average information loss of including
a context into a class
$$\mathcal{L}(f) = \sum_{x \in \mathcal{X}} p(x)\, D\big(p(Y|x) \,\|\, p(Y|f(x))\big)$$
𝑥: generic context
𝑌: symbol to encode
𝑓: context quantization function, i.e. context label
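The rate increase ℒ(𝑓) can be computed directly from the context probabilities and the conditional symbol distributions. A minimal sketch, assuming that the class distribution p(Y|class) is the p(x)-weighted average of the conditionals of its members; the function names are hypothetical.

```python
import numpy as np

def kl_bits(p, q):
    """Relative entropy D(p || q) in bits (assumes q > 0 wherever p > 0)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def quantization_loss(p_x, p_y_given_x, labels):
    """L(f) = sum_x p(x) D(p(Y|x) || p(Y|f(x))), with labels[x] = f(x)."""
    p_x = np.asarray(p_x, dtype=float)
    p_y_given_x = np.asarray(p_y_given_x, dtype=float)
    labels = np.asarray(labels)
    loss = 0.0
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        weights = p_x[members]
        # p(Y | class): p(x)-weighted average of the member conditionals
        p_y_class = weights @ p_y_given_x[members] / weights.sum()
        for x in members:
            loss += p_x[x] * kl_bits(p_y_given_x[x], p_y_class)
    return loss
```

Merging two contexts with identical conditional distributions costs nothing, while merging two equiprobable contexts with opposite deterministic binary symbols costs one bit per symbol.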

• Problem: finding optimum 𝑓
• Classical approach
– Start with a set of classes
– Move a context 𝑥 from a class 𝑐𝑖 to a class 𝑐𝑗 whenever the
relative entropy 𝐷(𝑝(𝑌|𝑥) ∥ 𝑝(𝑌|𝑐𝑖)) is larger than
𝐷(𝑝(𝑌|𝑥) ∥ 𝑝(𝑌|𝑐𝑗))
– Stopping criterion on the relative improvement of the
objective function ℒ(𝑓) or on the number of iterations
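The classical approach can be sketched as an iterative reassignment. The version below is a simplified batch variant (all contexts are reassigned before the class distributions are recomputed) with a stopping criterion on the number of iterations only; it illustrates the principle and is not the exact procedure analysed in the paper.

```python
import numpy as np

def kl_bits(p, q):
    """D(p || q) in bits; infinite if q is zero where p is not."""
    mask = p > 0
    if np.any(q[mask] == 0):
        return np.inf
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def class_dists(p_x, p_y_x, labels, n_classes):
    """p(Y|c) for each class; an empty class keeps an all-zero row."""
    dists = np.zeros((n_classes, p_y_x.shape[1]))
    for c in range(n_classes):
        members = labels == c
        weight = p_x[members].sum()
        if weight > 0:
            dists[c] = (p_x[members][:, None] * p_y_x[members]).sum(axis=0) / weight
    return dists

def classical_quantization(p_x, p_y_x, labels, n_classes, max_iter=50):
    p_x = np.asarray(p_x, dtype=float)
    p_y_x = np.asarray(p_y_x, dtype=float)
    labels = np.array(labels)
    for _ in range(max_iter):
        dists = class_dists(p_x, p_y_x, labels, n_classes)
        # Send each context to the class with the closest distribution
        new = np.array([
            min(range(n_classes), key=lambda c: kl_bits(p_y_x[x], dists[c]))
            for x in range(len(p_x))
        ])
        if np.array_equal(new, labels):
            break
        labels = new
    return labels
```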

• Classical approach
– Intuitive, very popular, good results
– Some open questions:
• Does the basic step actually reduce the cost function at each
iteration?
• Is it the largest possible reduction?
• If not, what is the largest possible reduction, and can we achieve it?
• Contribution: answers to these questions

• We found the expression of the cost function variation Δℒ associated with
moving a context from one class to another
• We proved that with the classical approach, each iteration actually
reduces the cost function…
• … but not as much as actually possible
• We found the best step
• Rate reductions: up to 3.6% on motion data and to a further 5% on
synthetic data (global minimization based on dynamic programming)

ME criterion for WT-based video coding
• WT video coding is based on temporal transform rather than
classical temporal prediction
• Therefore MSE-based ME is not assured to be optimal
• The optimal criterion is the maximization of the coding gain:
$$\mathrm{CG} = \frac{\sum_{i=1}^{M} a_i w_i \sigma_i^2}{\prod_{i=1}^{M} \left(w_i \sigma_i^2\right)^{a_i}}$$
• where $i$ is the subband index, $\sigma_i^2$ the variance, $a_i$ the relative
number of coefficients, and $w_i$ the normalization factor of the $i$-th
subband
• M. Cagnazzo, F. Castaldo, T. André, M. Antonini, and M. Barlaud, “Optimal motion estimation for wavelet video coding,” IEEE Trans. Circuits Syst. Video
Technol., vol. 17, pp. 907–911, July 2007.
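The coding gain can be evaluated numerically from the subband statistics. A minimal sketch; the function name `coding_gain` and the example values are illustrative only.

```python
import numpy as np

def coding_gain(a, w, var):
    """CG = sum_i(a_i w_i var_i) / prod_i((w_i var_i) ** a_i).

    a   : relative number of coefficients per subband (sums to 1)
    w   : normalization factor per subband
    var : variance per subband
    """
    a, w, var = (np.asarray(v, dtype=float) for v in (a, w, var))
    weighted = w * var
    return float(np.sum(a * weighted) / np.prod(weighted ** a))
```

With equal weighted variances in all subbands the gain is 1 (no benefit from the transform); the more unbalanced the variances, the larger the gain.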

ME criterion for WT-based video coding
• We showed that we only have to minimize $\rho^2 = \prod_{i=1}^{M} \left(\sigma_i^2\right)^{a_i}$
• In general each MV influences all the subbands, the problem is still complex
• However, the CG can be analytically maximized for a particular class of MC-ed
lifting schemes, the (𝑁, 0) LS
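$$(\mathbf{v}_B^*, \mathbf{v}_F^*) = \arg\min_{\mathbf{v}_B, \mathbf{v}_F} \; \mathcal{E}(\epsilon_B) + \mathcal{E}(\epsilon_F) + 2\langle \epsilon_B, \epsilon_F \rangle$$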
• Average rate reduction: 8%
[Figure: (2,0) lifting scheme, showing the input frames x1 … x9, the high-frequency subband (h1 … h4) and the low-frequency subband (l1 … l4)]
NB: All rate reductions for video are measured using the Bjontegaard metric

3D video coding
• MVD format : multiple views plus depth
• Inter-view and inter-component redundancy
• Three contributions for the upcoming standard 3D-HEVC

Modification of the Merge candidate list
for 3D-VC
• In the Merge mode, a block is predicted using a vector from a
short list (Merge list)
• Coding the list index is much less costly than coding the vector
• The vector can be a motion vector (MV) or a disparity vector (DV)
• In 3D-HEVC, MVs are much more frequently selected than DVs
• We have proposed to insert a further DV in the Merge list
• Several positions in the primary and secondary list have been
tested
• Best results obtained with the first position of the secondary list
• We obtained both a rate reduction (0.6%) and a complexity
reduction (4%)
• Contribution accepted into the standard
• E. Mora, J. Jung, M. Cagnazzo, B. Pesquet-Popescu. "Modification of the merge candidate list for dependent views in 3D-HEVC". In IEEE International
Conference on Image Processing, September 2013. Melbourne, Australia.
• E. Mora, B. Pesquet, M. Cagnazzo and J. Jung. Modification of the Merge Candidate List for Dependent Views in 3DV-HTM. Document JCT3V-B0069 for
Shanghai meeting (MPEG number m26793). Shanghai (PRC), October 2012.

Intra mode inheritance for 3D-HEVC
• Observation: blocks with strong
contours and one dominant
direction tend to be encoded
with the same Intra directional
mode in Texture and Depth
• Idea: when coding Depth, add the
co-located Intra mode to the
Most Probable Mode list when a
dominant direction is detected
• Dominant direction is revealed by
the presence of a single peak in
the histogram of the gradient
angle for the current block
• E. Mora, J. Jung, M. Cagnazzo, and B. Pesquet-Popescu, “Codage de vidéos de profondeur basé sur l’héritage des modes intra de texture,” in
Compression et Représentation des Signaux Audiovisuels, vol. 1, (Lille, France), pp. 1–4, 2012.
• E. Mora, J. Jung, M. Cagnazzo, B. Pesquet-Popescu. “Depth Video Coding Based on Intra Mode Inheritance From Texture”. Submitted to APSIPA
Transactions on Signal and Information Processing (2013)
1. Find the reference texture block
2. Compute the gradient statistics
3. If a dominant direction is detected, add the texture Intra mode to the MPM list
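The dominant direction test of step 2 can be sketched with a magnitude-weighted histogram of gradient angles. The number of bins, the `peak_ratio` threshold and the single-peak criterion used here (the highest bin and its two neighbours must hold most of the gradient energy) are assumptions for illustration, not the settings of the actual codec tool.

```python
import numpy as np

def dominant_direction(block, n_bins=36, peak_ratio=0.5):
    """Return the dominant gradient orientation in [0, pi), or None."""
    gy, gx = np.gradient(np.asarray(block, dtype=float))
    mag = np.hypot(gx, gy)
    if mag.sum() == 0:
        return None  # flat block: no direction at all
    # Orientation is defined modulo pi (opposite gradients share a direction)
    angle = np.mod(np.arctan2(gy, gx), np.pi)
    hist, edges = np.histogram(angle, bins=n_bins, range=(0, np.pi), weights=mag)
    hist = hist / hist.sum()
    peak = int(np.argmax(hist))
    # Energy of the peak bin and its circular neighbours
    mass = hist[peak] + hist[(peak - 1) % n_bins] + hist[(peak + 1) % n_bins]
    if mass < peak_ratio:
        return None  # no single peak: do not inherit
    return 0.5 * (edges[peak] + edges[peak + 1])
```

When the function returns an angle, the Intra mode of the co-located texture block would be added to the MPM list of the depth block; when it returns None, the list is left unchanged.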

Intra mode inheritance for 3D-HEVC
• The dominant angle proved to be an effective feature for
detecting blocks where inheritance
is effective
• Inserting the inherited mode in
the MPM list allows an average
coding rate reduction of ≈1%
• Tests performed over MPEG
sequences under Common Test
Conditions

Enhanced quad-tree coding for 3D-HEVC
• The 3D-HEVC codec uses quad-trees for encoding
texture and depth
• These trees are quite correlated
• We propose an inter-component coding tool that reduces both
complexity and rate by exploiting the quad-tree redundancy
• Two variants, according to the component that is
encoded first (texture or depth)
• Contribution to 3D-HEVC working draft and
reference software
• E. Mora, J. Jung, M. Cagnazzo, B. Pesquet-Popescu. “Initialization, limitation and predictive coding of the depth and texture quad-tree in 3D-HEVC
Video Coding”. Accepted into IEEE Transactions on Circuits and Systems for Video Technology

Enhanced quad-tree coding for 3D-HEVC
• Observation: texture coding units are very often at least as finely
partitioned as depth coding units
• Therefore we can limit the depth map partitioning level if we know
texture…
• … or we can initialize the texture partitioning if we know depth
• Complexity reduction (fewer configurations to examine): up to 31%
encoder time saving
• Rate reduction (easier prediction of coding modes): up to 1.8%

Don’t Care Regions
A depth pixel only needs to be reconstructed such that the
resulting geometric error leads to an acceptable distortion in the
synthesized view
[Figure: error in the synthesized pixel value as a function of the disparity value, with the DCR marked]
G. Valenzise, G. Cheung, R. Galvao, M. Cagnazzo, B. Pesquet-Popescu, and A. Ortega, “Motion prediction of depth video for depth-image-based rendering
using Don’t Care Regions,” in Picture Coding Symposium, vol. 1, (Krakow, Poland), pp. 1–4, 2012.

DCR Example (Kendo, frame 10, t = 5)

We embedded DCR into an H.264/AVC encoder, changing
three basic aspects:
1. Motion estimation
2. Residual coding
3. Skip mode

• We compute and encode prediction residuals with respect to the
DCRs
• For SKIP mode, no prediction residuals are coded
– The reconstructed values could be far outside the DCR,
leading to an arbitrarily high distortion in the synthesized
view
– We adopt a conservative policy: prevent SKIP selection when
any reconstructed pixel is outside its DCR
• Results: average rate saving of 7%
• High preprocessing complexity
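The residual computation and the conservative SKIP policy can be sketched as follows, assuming that the DCR of each depth pixel is given as an interval [lo, hi] of acceptable values. This interval representation and the function names are assumptions for illustration.

```python
import numpy as np

def dcr_residual(pred, lo, hi):
    """Smallest residual that brings each predicted pixel inside its DCR."""
    pred = np.asarray(pred, dtype=float)
    # Pixels already inside [lo, hi] get a zero residual
    return np.clip(pred, lo, hi) - pred

def skip_allowed(recon, lo, hi):
    """Conservative policy: SKIP only if every pixel lies inside its DCR."""
    recon = np.asarray(recon, dtype=float)
    return bool(np.all((recon >= lo) & (recon <= hi)))
```

Residuals are zero for every prediction that already falls inside its DCR, which is where the rate saving comes from.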

Other work in 3D video coding
• Dense disparity field for MVV and MVD coding
• Depth coding using an elastic curve model
• I. Daribo, M. Kaaniche, W. Miled, M. Cagnazzo, and B. Pesquet-Popescu, “Dense disparity estimation in multiview video coding,” in IEEE Workshop on
Multimedia Signal Processing, (Rio de Janeiro, Brazil), 2009.
• M. Cagnazzo and B. Pesquet-Popescu, “Depth map coding by dense disparity estimation for MVD compression,” in IEEE Digital Signal Processing,
(Corfu, Greece), 2011.
• E. Mora, J. Jung, B. Pesquet-Popescu, M. Cagnazzo. "Modification of the disparity vector derivation process in 3D-HEVC". In IEEE Workshop on
Multimedia Signal Processing, vol. 1, September 2013. Cagliari, Italy.

OBJECT-BASED IMAGE CODING
6 Conference papers
2 Journal papers

Region-based hyperspectral image coding
[Figure: block diagram with blocks Multispectral / Hyperspectral Image, Segmentation (TS-VQ), Map, Map Coding and Region Coding]
• M. Cagnazzo, R. Gaetano, S. Parrilli, and L. Verdoliva, “Region based compression of multispectral images by classified KLT,” in EUSIPCO. 2006.
• M. Cagnazzo, R. Gaetano, S. Parrilli, and L. Verdoliva, “Adaptive region-based compression of multispectral images,” in Proceed. of IEEE Intern. Conf.
Image Proc., (Atlanta, GA), pp. 3249–3252, Oct. 2006
• M. Cagnazzo, S. Parrilli, G. Poggi, and L. Verdoliva, “Costs and advantages of object-based image coding with shape-adaptive wavelet transform,”
EURASIP J. Image Video Proc., 2007

• Spectral transform: WT, global KLT, class-based KLT,
region-based KLT
• Spatial transform: WT, SA-WT
• Encoder: SA-SPIHT with optimal rate allocation among
objects
• Results:
– 0.5 dB better than JP2K-Multicomponent
– Better post-processing (i.e. classification) results
• M. Cagnazzo, G. Poggi, and L. Verdoliva, “Region-based transform coding of multispectral images,” IEEE Trans. on Image Processing, vol. 16, pp. 2916–2926, Dec. 2007.

[Figure: AVIRIS image, 32 bands, coded at 0.3 bps (original at 16 bps); Landsat TM image, 6 bands, coded at 0.6 bps (original at 8 bps)]

Adaptive wavelet and rate allocation
• Adaptive wavelets (implemented via lifting schemes) allow the filters to be
changed according to the signal characteristics
• Further constraint: reconstruction without sending side
information
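[Figure: adaptive lifting scheme with Split, predict (P) and update (U) blocks; input x(k), outputs xa(k) = y00(k) and xd(k) = y01(k), detail signal d(k)]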
• S. Parrilli, M. Cagnazzo, and B. Pesquet-Popescu, “Distortion evaluation in transform domain for adaptive lifting schemes,” in IEEE Workshop on
Multimedia Signal Processing, (Cairns, Australia), pp. 200–205, 2008.
• S. Parrilli, M. Cagnazzo, and B. Pesquet-Popescu, “Estimation of quantization noise for adaptive-prediction lifting schemes,” in IEEE Workshop on
Multimedia Signal Processing, (Rio de Janeiro, Brazil), 2009.

Adaptive wavelet and rate allocation
• The resulting transform is highly non-orthogonal
• Problem: distortion evaluation in the transform domain
in order to perform rate allocation
• Solutions for uncorrelated noise
– Good error energy evaluation
– Performance improvement for ALS up to 3dB
– Improved SSIM (+3%)
• M. Cagnazzo and B. Pesquet-Popescu, “Perceptual impact of transform coefficients quantization for adaptive lifting schemes,” in International
Workshop on Video Processing and Quality Metrics for Consumer Electronics, (Scottsdale, AZ), 2010.
• M. Abid, M. Cagnazzo, and B. Pesquet-Popescu, “Image denoising by adaptive lifting schemes,” in European Workshop on Visual Information
Processing, vol. 1, (Paris, France), 2010

DISTRIBUTED VIDEO CODING
17 Conference papers
2 Submitted journal papers
3 Journal papers

Distributed video coding
• Coding of many correlated sources
• Encoders do not communicate one with another
• Same RD performance as centralized coding (in theory only!)
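[Figure: DVC codec block diagram. Encoder: key frames (KF) go to an Intra coder; Wyner-Ziv frames (WZ) go to a quantizer and a Slepian-Wolf coder (turbo encoder with buffer). Decoder: Intra decoder giving the decoded KFs, image interpolation giving the side information (SI), turbo decoder and minimum-distortion reconstruction giving the decoded WZFs]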

Image interpolation:
High-order trajectories for ME in DVC
• G. Petrazzuoli, M. Cagnazzo, B. Pesquet-Popescu. "High order motion interpolation for side information improvement in DVC". In International
Conference on Acoustics, Speech and Signal Processing, March 2010. Dallas, TX
• G. Petrazzuoli, M. Cagnazzo, and B. Pesquet-Popescu, “Fast and efficient side information generation in distributed video coding by using dense motion
representation,” in European Signal Processing Conference, (Aalborg, Denmark), 2010.
• G. Petrazzuoli, T. Maugey, M. Cagnazzo, and B. Pesquet-Popescu, “Side information refinement for long duration GOPs in DVC,” in IEEE Workshop on
Multimedia Signal Processing, vol. 1, (Saint-Malo, France), 2010.
Rate −3.3%

Image interpolation:
Pel-based motion estimation
• Block-based object trajectory used as initialization
• Within each block, pixel-by-pixel vectors are
obtained by refining the initialization (Cafforio-Rocca
algorithm)
• Refinement equations have been re-written and
solved, since in this case the reference image
does not exist
• Rate reductions: 3.5% to 6%
• M. Cagnazzo, T. Maugey, and B. Pesquet-Popescu, “A differential motion estimation method for image interpolation in distributed video coding,” in
International Conference on Acoustics, Speech and Signal Processing, vol. 1, (Taiwan), pp. 1861–1864, 2009.
• W. Miled, T. Maugey, M. Cagnazzo, and B. Pesquet-Popescu, “Image interpolation with dense disparity estimation in multiview distributed video
coding,” in International Conference on Distributed Smart Cameras, (Como, Italy), 2009.
• T. Maugey, W. Miled, M. Cagnazzo, and B. Pesquet-Popescu, “Méthodes denses d’interpolation de mouvement pour le codage vidéo distribué monovue
et multivue,” in Colloque GRETSI - Traitement du Signal et des Images, (Dijon (France)), 2009.
• M. Cagnazzo, W. Miled, T. Maugey, and B. Pesquet-Popescu, “Image interpolation with edge-preserving differential motion refinement,” in IEEE
International Conference on Image Processing, vol. 1, (Cairo, Egypt), pp. 361–364, 2009.

The Cafforio-Rocca algorithm: sample results

Local and global SI fusion
• Given the WZF, feature points on the reference frames are
extracted by SIFT
• Matching features make it possible to perform a global motion compensation
(first SI)
• Local motion compensation (traditional method) is also performed
(second SI)
• The two SIs are merged using partial channel decoding and
motion re-estimation
• Experiments show average rate reduction of ≈ 25% with respect to
literature references
• A. Abou-El Ailah, F. Dufaux, M. Cagnazzo, B. Pesquet-Popescu, and J. Farah, “Successive refinement of side information using adaptive search area for
long duration GOPs in distributed video coding,” in International Conference on Telecommunications, (Beirut), 2012.
• A. Abou-El Ailah, F. Dufaux, M. Cagnazzo, and J. Farah, “Fusion of global and local side information using support vector machine in transform-domain
DVC,” in EUSIPCO, vol. 1, (Bucharest, Romania), pp. 1–5, Aug. 2012.
• A. Abou-El Ailah, G. Petrazzuoli, J. Farah, M. Cagnazzo, B. Pesquet-Popescu, F. Dufaux. "Side Information Improvement in Transform-Domain Distributed
Video Coding". In SPIE - Applications of Digital Image Processing,. San Diego, CA (USA), Aug. 2012
• A. Abou-El Ailah, F. Dufaux, J. Farah, M. Cagnazzo, and B. Pesquet-Popescu, “Fusion of global and local motion estimation for distributed video coding,”
IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, n. 1, pp. 158-172, Jan. 2013.

Multiview DVC
• Motion models for temporal
image interpolations
– High order motion
interpolation
– Pixel-based motion vector
refinement
• Multi-hypothesis SI fusion
based on observed parity bits
and Bayesian classification
[Figure: frame arrangement over views and time, alternating key frames (KF) and Wyner-Ziv frames (WZ)]
• G. Petrazzuoli, M. Cagnazzo, B. Pesquet-Popescu. “Novel solutions for side information generation and fusion in multiview distributed video coding”.
Submitted to Eurasip Journal of Advances in Signal Processing

Multiview DVC
• Step 1: produce a temporal estimation with HOMI
• Step 2: produce an inter-view estimation with occlusion
reduction (use disparity to estimate foreground
objects)
• Step 3: produce a fusion of the two estimations using
Left-Right Consistency Check to remove residual
occlusions
• Step 4: Select one out of these three images as side
information

Multiview DVC
• For one image out of 𝑁 we ask for parity bits for
temporal and inter-view estimation
• We compare the number of bits needed for correcting
the two estimations:
– If they are close, we
choose the fusion image
– If not, we select the
image with the least
rate
• Equivalent to Bayesian decision
$$\hat{D} = \arg\max_d P(D = d \mid \delta_R) = \arg\max_d p(\delta_R \mid d)\, P(d) = \arg\max_d f_d(\delta_R)$$
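The selection rule described on this slide can be sketched as a simple comparison of the two decoding rates. The `threshold` that defines when the rates are "close" is a hypothetical parameter; in the Bayesian formulation it would follow from the densities 𝑓𝑑.

```python
def select_side_information(rate_temporal, rate_interview, threshold):
    """Choose the SI from the parity-bit rates of the two estimations."""
    delta = rate_temporal - rate_interview
    if abs(delta) <= threshold:
        return "fusion"  # rates are close: use the fused image
    # Otherwise keep the estimation that needed the fewest parity bits
    return "temporal" if delta < 0 else "inter-view"
```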

Multiview DVC
• Experiments show that the Bayesian classifier very often selects
the best SI
• It may only be wrong when the two decoding rates are very
close to each other, but in that case selecting a suboptimal
SI does not degrade performance
• Cumulative gain with respect to the state of the art: ≈ 9.1%
rate reduction

Side information effectiveness
• Side information is corrected with parity bits to produce the
decoded WZ frame
• Intuitively, the more similar the SI is to the original image, the fewer
parity bits are needed
• Traditionally, PSNR between SI and WZF has been used to evaluate
the SI quality
• However, it is easy to build toy examples where two iso-PSNR
images require very different numbers of correction bits:
– SI PSNR: 29.1 dB, parity bits: 137 kb, decoded quality: 39.3 dB
– SI PSNR: 29.1 dB, parity bits: 192 kb, decoded quality: 35.4 dB
• T. Maugey, J. Gauthier, M. Cagnazzo, B. Pesquet. “Evaluation of side information effectiveness in distributed video coding”. IEEE TCSVT, accepted

Side information effectiveness
• Questions: why is PSNR not always reliable? Can we find better metrics?
• Applications: Hash-based DVC systems, Witsenhausen-Wyner video coding
systems, …
• New framework for metric comparison based on end-to-end RD
performance
• Proposed metrics:
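$$\mathrm{SIQ}_a(I_0, I_1) = 10 \log_{10} \frac{255^2}{\sum_{\mathbf{p}} |I_0(\mathbf{p}) - I_1(\mathbf{p})|^a}$$
$$\mathrm{HSIQ}(I_0, I_1) = 10 \log_{10} \frac{N_{\mathrm{bits}}}{d_H(I_0, I_1)}$$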
• SIQ1 and HSIQ improve on PSNR for both theoretical and practical
effectiveness measures (hash-based system: 20% rate reduction)
• PSNR works well for homogeneous errors and starts failing for large but
spatially concentrated errors
• T. Maugey, C. Yaacoub, J. Farah, M. Cagnazzo, and B. Pesquet-Popescu, “Side information enhancement using an adaptive hash-based genetic algorithm
in a Wyner-Ziv context,” in IEEE Workshop on Multimedia Signal Processing, vol. 1, (Saint-Malo, France), 2010
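The two proposed metrics can be sketched as follows for 8-bit images. Two points are assumptions: the sum over pixels in SIQ is normalized by the number of pixels (so that a = 2 coincides with the PSNR), and the Hamming distance in HSIQ is taken between the plain binary representations of the pixel values.

```python
import numpy as np

def siq(i0, i1, a=1.0):
    """SIQ_a in dB, with the error sum normalized by the pixel count (assumption)."""
    diff = np.abs(np.asarray(i0, dtype=float) - np.asarray(i1, dtype=float)) ** a
    mean_err = diff.mean()
    return float("inf") if mean_err == 0 else float(10.0 * np.log10(255.0 ** 2 / mean_err))

def hsiq(i0, i1):
    """HSIQ in dB: total number of bits over the Hamming distance of the two images."""
    x = np.bitwise_xor(np.asarray(i0, dtype=np.uint8), np.asarray(i1, dtype=np.uint8))
    d_h = int(np.unpackbits(x).sum())  # number of differing bits
    n_bits = x.size * 8
    return float("inf") if d_h == 0 else float(10.0 * np.log10(n_bits / d_h))
```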

IMVS using DVC
[Figure: frame arrangement over views and time]
• All frames are Intra coded
• Each image is coded and stored only once
• Large bandwidth requested
• Relatively low server space requested

IMVS using DVC
[Figure: frame arrangement over views and time]
• P-frames are used: all possible frame dependencies are coded
• Each image is coded many times
• Smallest bandwidth requested
• Very large server space requested

IMVS using DVC
[Figure: frame arrangement over views and time]
• WZ-frames are used: only parity bits are coded
• Each image is coded and stored only once
• Trade-off between server space and bandwidth

IMVS using DVC
[Figure: bandwidth versus server space, showing Intra-only coding, predictive coding (each image coded many times), the ideal case (path known at encoding time) and the operation region of WZ coding]

IMVS for MVD using DVC
• We proposed several strategies for view-switching
• The best (adaptive) achieves a rate reduction of more
than 15% with respect to reference methods
G. Petrazzuoli, M. Cagnazzo, F. Dufaux, and B. Pesquet-Popescu, “Using distributed source coding and depth image based rendering to improve interactive
multiview video access,” in IEEE International Conference on Image Processing, vol. 1, (Bruxelles, Belgium), pp. 605–608, 2011.
G. Petrazzuoli, M. Cagnazzo, B. Pesquet-Popescu, F. Dufaux. "Enabling Immersive Visual Communications through Distributed Video Coding". IEEE MMTC
E-Letter (May 2013).

Other work on DVC
• Fusion schemes for multiview DVC
• Iterative methods for SI refinement
• DVC for multiple-view-plus-depth video
• DVC and interactive multiview streaming
• Local and global SI fusion
• Nine further conference papers
• A. Abou-El Ailah, F. Dufaux, J. Farah, M. Cagnazzo, and B. Pesquet-Popescu, “Fusion of global and local motion estimation for distributed video coding,”
IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, n. 1, pp. 158-172, Jan. 2013.
• G. Petrazzuoli, M. Cagnazzo, B. Pesquet-Popescu, F. Dufaux. "Enabling Immersive Visual Communications through Distributed Video Coding". IEEE
MMTC E-Letter (May 2013).

ROBUST VIDEO DISTRIBUTION
7 Conference papers
1 Submitted journal paper + 2 in preparation
2 Journal papers

ABCD protocol
• Problem: reliable diffusion of video on wireless network
• Construction of overlays to carry MDC video
• Minimization of the number of sent packets (both video
and management packets)
• First contribution: a reliable extension of the IEEE 802.11
broadcast communication, using a control peer
• Once a reliable broadcast channel is provided, the nodes
attach to the stream as soon as they hear about it
• C. Greco, M. Cagnazzo, and B. Pesquet-Popescu, “H.264-based multiple description coding using motion compensated temporal interpolation,” in IEEE
Workshop on Multimedia Signal Processing, vol. 1, (Saint-Malo, France), 2010
• C. Greco, G. Petrazzuoli, M. Cagnazzo, and B. Pesquet-Popescu, “An MDC-based video streaming architecture for mobile networks,” in IEEE Workshop
on Multimedia Signal Processing, vol. 1, (Hangzhou, China), pp. 1–4, 2011.
• C. Greco and M. Cagnazzo, “A cross-layer protocol for cooperative content discovery over mobile ad-hoc networks,” International Journal of
Communication Networks and Distributed Systems, vol. 6, July 2011.
• C. Greco, M. Cagnazzo, and B. Pesquet-Popescu, “ABCD : Un protocole cross-layer pour la diffusion vidéo dans des réseaux sans fil ad-hoc,” in Colloque
GRETSI - Traitement du Signal et des Images, (Bordeaux, France), 2011.

ABCD protocol
[Figure: source 𝑠 and peers 𝑝1, 𝑝2 exchanging Advertisement and Attachment messages]

ABCD protocol
[Figure: source 𝑠 and peers 𝑝1 … 𝑝4 exchanging Video Data and Attachment messages]

ABCD protocol: parent switch
$$p^* = \arg\min_p \; w_h h(p) + w_a a(p) + w_d d(p) - w_g g(p)$$
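The parent switch rule is a weighted-cost minimization over the candidate parents. A minimal sketch, where each candidate is a dictionary holding its four metrics under the keys "h", "a", "d" and "g" (this data layout is an assumption; the slide does not define the metrics further).

```python
def select_parent(candidates, w_h, w_a, w_d, w_g):
    """Return the candidate minimizing w_h*h + w_a*a + w_d*d - w_g*g."""
    def cost(p):
        return w_h * p["h"] + w_a * p["a"] + w_d * p["d"] - w_g * p["g"]
    return min(candidates, key=cost)
```

The metric 𝑔 enters with a negative sign, so a larger value of 𝑔 makes a candidate more attractive.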

ABCD: simulation results (ns2)

ABCD/CoDiO
• ABCD may suffer from high delay in large, crowded networks
• To reduce the delay, we introduced a Congestion-Distortion
Optimization (CoDiO) in the per-hop wireless broadcast
transmission
• We adjust the RTS/CTS retry limit k of each packet in a Co-Di
optimized fashion
• Small values of k reduce the congestion but the distortion
increases, as the probability of obtaining the channel is lower
• High values of k lower the distortion, but congestion increases
due to the channel occupation
Cost function: 𝐽 𝑘 = 𝐷 𝑘 + 𝜆𝐶(𝑘)
• C. Greco, M. Cagnazzo, and B. Pesquet-Popescu, “Low-latency video streaming with congestion control in mobile ad-hoc networks,” IEEE Transactions
on Multimedia, vol. 14, n. 4, pp. 1337-1350, Aug. 2012. Paper selected as “High quality paper” by the IEEE MMTC-R Letter board
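The choice of the retry limit k can be sketched as a search over a small set of candidate values of the cost 𝐽(𝑘) = 𝐷(𝑘) + 𝜆𝐶(𝑘). The distortion and congestion models used in the usage note are toy placeholders, not the models of the paper.

```python
def best_retry_limit(distortion, congestion, lam, k_values):
    """Return the k in k_values minimizing J(k) = D(k) + lam * C(k)."""
    return min(k_values, key=lambda k: distortion(k) + lam * congestion(k))
```

For instance, with the toy models D(k) = 0.5^k (distortion decreasing with the number of retries) and C(k) = k (congestion growing with it), a small 𝜆 favours more retries and a large 𝜆 favours fewer, which is the trade-off described in the bullets above.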

ABCD/CoDiO
Challenges:
• Model the effects of a single-node decision on the
entire network
• Even if a node switches off, alternative paths may be
formed
• Information about alternative paths is gathered at
leaves and conveyed upstream
• The information is refined where it actually matters,
i.e. near the root – where a single decision affects a lot
of nodes

ABCD/CoDiO: simulation results (ns2)

Network coding for video delivery
• Network coding (NC) can increase network throughput by letting
intermediate nodes process packets instead of simply relaying
them
• NC can easily be extended to wireless networks

Network coding
• Using ABCD as overlay to implement NC in wireless
network
• Optimized scheduling for MDC in Expanded Window
NC
• Optimized scheduling for multiview video over NC
• Blind source separation for reducing the NC overhead

• RDO-scheduling in NC-based delivery
• A generation is composed of the frames of a multiview GOP
or of an MDC GOP
• Each node must decide the schedule of frames
• I. Nemoianu, C. Greco, M. Cagnazzo, and B. Pesquet-Popescu, “A framework for joint multiple description coding and network coding over wireless ad-
hoc networks,” in International Conference on Acoustics, Speech and Signal Processing, (Kyoto, Japan), 2012
• I. Nemoianu, C. Greco, M. Cagnazzo, and B. Pesquet-Popescu, “A network coding scheduling for multiple description video streaming over wireless
networks,” in EUSIPCO, vol. 1, (Bucharest, Romania), pp. 1–5, Aug. 2012.
• I. Nemoianu, C. Greco, M. Cagnazzo, B. Pesquet-Popescu. "Multi-View Video Streaming over Wireless Networks with RD-Optimized Scheduling of
Network Coded Packets". In SPIE Visual Communications and Image Processing Conference, San Diego, CA (USA), Nov. 2012.

• RDO calls for a unique scheduling (send first the frame
that maximally reduces the RD cost function)
• NC calls for different scheduling at each node (pseudo-
random selection) in order to maximize the throughput
• Solution: collect frames into groups with “similar”
RD characteristics, and select randomly within a group
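The compromise between a unique RD-optimal order and per-node randomness can be sketched as follows. The grouping rule (a frame joins the current group when its RD gain is within `tolerance` of the first frame of the group) and the function name are assumptions for illustration.

```python
import random

def clustered_schedule(frames, rd_gain, tolerance, rng):
    """Order frames by decreasing RD gain, shuffling frames with similar gains."""
    ordered = sorted(frames, key=rd_gain, reverse=True)
    groups = []
    for frame in ordered:
        # Join the current group if the gain is close to that of its first frame
        if groups and rd_gain(groups[-1][0]) - rd_gain(frame) <= tolerance:
            groups[-1].append(frame)
        else:
            groups.append([frame])
    schedule = []
    for group in groups:
        rng.shuffle(group)  # each node draws its own order inside a group
        schedule.extend(group)
    return schedule
```

Each node would call the function with its own random generator, for example `random.Random(node_id)`, so that schedules differ across nodes only inside groups of frames with similar RD characteristics.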

BSS for NC
• In NC the intermediate nodes of a network send linear
combinations of the packets they have previously received, with
random coefficients taken from a finite field
• The random coefficients must be added to the packet as
headers, incurring an overhead
• In a blind source separation (BSS) based approach, it could be
possible to relieve the nodes from the need to include the
coefficients in the packets
• BSS consists in recovering a set of source signals 𝑆 from a set of
mixed signals 𝑋 = 𝑓(𝑆), also referred to as observations,
without knowing the sources themselves nor the mixing process
parameters; in NC we have linear mixing, 𝑋 = 𝐴𝑆
• I. Nemoianu, C. Greco, M. Castella, B. Pesquet-Popescu, M. Cagnazzo. "On a practical approach to source separation over finite fields for network coding
applications". In International Conference on Acoustics, Speech and Signal Processing, May 2013. Vancouver, Canada.
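The linear mixing 𝑋 = 𝐴𝑆 and the role of the coefficient headers can be illustrated with a small sketch over GF(2), the simplest finite field (an assumption; practical systems often use larger fields). Here the decoder knows 𝐴, which is exactly the information that a BSS approach would avoid transmitting.

```python
import numpy as np

def encode(packets, rng):
    """Return (coefficients, payload) of one coded packet over GF(2)."""
    coeffs = rng.integers(0, 2, size=len(packets), dtype=np.uint8)
    payload = np.zeros_like(packets[0])
    for c, p in zip(coeffs, packets):
        if c:
            payload ^= p  # addition in GF(2) is a bitwise XOR
    return coeffs, payload

def decode(coeffs, payloads):
    """Gauss-Jordan elimination over GF(2); None if the system is rank deficient."""
    a = np.array(coeffs, dtype=np.uint8)
    x = np.array(payloads, dtype=np.uint8)
    n = a.shape[1]
    for col in range(n):
        pivots = [r for r in range(col, a.shape[0]) if a[r, col]]
        if not pivots:
            return None
        r = pivots[0]
        a[[col, r]] = a[[r, col]]
        x[[col, r]] = x[[r, col]]
        for row in range(a.shape[0]):
            if row != col and a[row, col]:
                a[row] ^= a[col]
                x[row] ^= x[col]
    return x[:n]
```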

BSS for NC
• Literature BSS approach in finite fields:
– Iterative scan of packet combinations
– Minimization of a contrast function
• Our idea: add to packets a signature that is degraded by linear
combination
• Then, the contrast function can be
computed only on candidates
having a valid signature
• Problems: how to choose the
signature to reduce the probability
that a linear combination of
packets still carries a valid signature
• Simple solution: odd-parity bit
• Drastic reduction of the search space
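The odd-parity signature can be illustrated over GF(2), where a linear combination of packets is a bitwise XOR. The sketch below assumes binary packets and hypothetical function names.

```python
import numpy as np

def add_odd_parity(bits):
    """Append one bit so that the total number of ones is odd."""
    bits = np.asarray(bits, dtype=np.uint8)
    parity = (int(bits.sum()) + 1) % 2
    return np.append(bits, parity).astype(np.uint8)

def has_valid_signature(packet):
    """A candidate is kept only if its number of ones is odd."""
    return int(np.sum(packet)) % 2 == 1
```

The XOR of two odd-parity packets has even parity, so such a mixture is discarded before the contrast function is evaluated. The XOR of an odd number of source packets still passes the test, which is why the signature reduces the search space without removing the need for the contrast function.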

Perspectives
• “Classical” video coding: advanced models for rate control
• 3D VC:
– combined use of motion and disparity compensation to produce
improved reference frames;
– elastic deformation model for lossless coding of depth contours
• DVC:
– Improved SI generation using an elastic deformation model for
estimating object shapes;
– Geometry-based DVC system for MVD (no backward channel, no
channel coding)
• NC and streaming: use of “social” information to optimize
interactive multiview streaming with a NC approach

New themes
• Forensic, forgery detection
• Feature representation and compression
• Video protection
• Immersive communications: holoscopy / holography,
high dynamic range