Group members:
黃弘偉 M9915026 趙修鼎 M9915048
高培元 M9915044 林岱蒼 M9915902
楊逸翔 M9915016 彭宜亭 M9915081
褚慧倫 M9907513
Outline
 Introduction
 Visual Rhythm And User Attention Model
 ROI Determination Through User Attention
Model
 FMO-aware ROI Determination For H.264/AVC
Video coding
 Experimental Results
 Conclusion
Introduction
 ROI determination is required for video data
transmission.
 Moving objects attract users' attention and serve as ROIs in
consecutive frames, but detecting them is computationally
intensive.
 Visual rhythm can describe the characteristics of video
content.
 This work determines the ROI from attention models obtained
through visual rhythm analysis.
Visual Rhythm
 Visual rhythm can efficiently
capture the temporal information
of a video.
Visual Rhythm
• Sampling lines:
Diagonal (D), Anti-diagonal (A),
Vertical (V), Horizontal (H).
Figure: a video frame of width m and height n with its diagonal and anti-diagonal sampling lines.
m : width of a video frame
n : height of a video frame
rd : the pixel sampling ratio for the diagonal
ra : the pixel sampling ratio for the anti-diagonal
Visual Rhythm
$D_i$ represents the gray-scale values of the diagonal sampling pixels in the $i$-th frame.
$A_i$ represents the gray-scale values of the anti-diagonal sampling pixels in the $i$-th frame.
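The construction of $D_i$ and $A_i$ can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, frames are plain nested lists of gray-scale values, and taking min(m, n) evenly spaced samples per line is an assumption standing in for the sampling ratios rd and ra.

```python
from typing import List

Frame = List[List[int]]  # gray-scale frame, indexed as frame[row][col]


def sample_diagonal(frame: Frame, anti: bool = False) -> List[int]:
    """Sample pixels along the diagonal (or anti-diagonal) of one frame."""
    n = len(frame)          # frame height
    m = len(frame[0])       # frame width
    length = min(m, n)      # number of samples per line (assumption)
    steps = max(length - 1, 1)
    samples = []
    for k in range(length):
        # Map sample k onto the line running from corner to corner.
        row = k * (n - 1) // steps
        col = k * (m - 1) // steps
        if anti:
            col = (m - 1) - col  # mirror horizontally for the anti-diagonal
        samples.append(frame[row][col])
    return samples


def visual_rhythm(frames: List[Frame], anti: bool = False) -> List[List[int]]:
    """Stack the per-frame samples; entry i holds the samples of frame i."""
    return [sample_diagonal(frame, anti) for frame in frames]
```

Each entry of the returned list corresponds to one frame, so the result can be viewed as an image whose horizontal axis is time.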
User Attention Models
• Visual rhythm images can be categorized into
six attention models: Horizontal, Vertical, Expanding,
Absorbing, Diagonal, and Anti-diagonal.
User Attention Models (POSSIBLE EVENTS)
Table: possible events for each attention model (Horizontal, Vertical, Expanding, Absorbing, Diagonal, Anti-diagonal) under each sampling line (Diagonal, Anti-diagonal, Horizontal, Vertical).
DEMO
ROI Determination
 Using four sampling lines yields an effective attention
model that characterizes the events in a video and avoids
false alarms.
 The center-crossing diagonal and anti-diagonal
sampling lines are first used to analyze the attention
model of the current frame; the vertical and
horizontal sampling lines are then integrated to derive the
final user attention model and obtain the ROI.
ROI Determination
1) Visual rhythm creation
2) Difference calculation
3) Visual rhythm history
4) Binary thresholding
5) Morphological merging
ROI Determination
Fig. 4. Visual rhythms of diagonal and anti-diagonal sampling lines acquired
from Salesman QCIF sequence with 176 frames. (a) Diagonal and (b)
anti-diagonal.
Difference calculation
• The variation of the visual rhythms carries significant information
about object movement, as shown below:
Fig. 5. Visual rhythm difference images acquired from Fig. 4. (a) Diagonal and (b)
anti-diagonal.
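A minimal sketch of the difference step is given here. The exact formula is not reproduced in these slides, so the absolute difference between the samples of consecutive frames is an assumption, and the function name is hypothetical.

```python
from typing import List


def rhythm_difference(rhythm: List[List[int]]) -> List[List[int]]:
    """Absolute difference between the samples of consecutive frames.

    rhythm[i] holds the sampled pixels of frame i, so the result has one
    entry fewer than the input.
    """
    return [
        [abs(cur - prev) for cur, prev in zip(rhythm[i], rhythm[i - 1])]
        for i in range(1, len(rhythm))
    ]
```

Static background pixels produce values near zero, while pixels crossed by a moving object produce large values.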
Visual rhythm history
• Computed according to the variation of the visual rhythm:
Fig. 6. Visual rhythm historical images acquired from Fig. 5. (a) Diagonal
and (b) anti-diagonal.
Binary thresholding
The threshold is calculated by averaging the historical values, which stand
for the variation of the visual rhythm.
$b_i(z)$ represents the binary image according to the magnitudes of the variations.
Fig. 7. Binarized images derived from Fig. 6 by the thresholding process of
the historical statistics. (a) Diagonal and (b) anti-diagonal.
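The thresholding step can be sketched as below. The slides state only that the threshold is an average of the historical values; averaging per frame (rather than over the whole image) and using a strict comparison are assumptions, and the function name is hypothetical.

```python
from typing import List


def binarize(history: List[List[float]]) -> List[List[int]]:
    """Binarize each frame's historical values against their own mean."""
    binary = []
    for column in history:
        # Threshold = average of the historical values of this frame.
        threshold = sum(column) / len(column)
        binary.append([1 if value > threshold else 0 for value in column])
    return binary
```

With a strict comparison, a frame whose historical values are all equal (for example, a fully static frame) yields no attention pixels at all.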
Morphological merging
Illustrations of the proposed merging steps.
Morphological merging
Images of the scopes of user attention in the diagonal and anti-diagonal
visual rhythms. (a) Diagonal and (b) anti-diagonal.
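The merging steps themselves appear only as figures, so the following is a simple stand-in rather than the proposed procedure: it fills short gaps between marked pixels in one binarized column (similar in effect to a one-dimensional morphological closing) and then reports the longest marked run as the scope of attention. The function names and the `max_gap` parameter are hypothetical.

```python
from typing import List, Optional, Tuple


def merge_runs(bits: List[int], max_gap: int = 2) -> List[int]:
    """Fill gaps of at most max_gap zeros that lie between two ones."""
    merged = list(bits)
    last_one = None
    for idx, bit in enumerate(bits):
        if bit:
            if last_one is not None and 0 < idx - last_one - 1 <= max_gap:
                for j in range(last_one + 1, idx):
                    merged[j] = 1
            last_one = idx
    return merged


def longest_run(bits: List[int]) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the longest run of ones, or None if there is none."""
    best = None
    start = None
    for idx, bit in enumerate(list(bits) + [0]):  # sentinel closes a trailing run
        if bit and start is None:
            start = idx
        elif not bit and start is not None:
            if best is None or idx - start > best[1] - best[0] + 1:
                best = (start, idx - 1)
            start = None
    return best
```

Applying `longest_run(merge_runs(column))` to a binarized column gives one candidate interval along the sampling line for that frame.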
Center of Visual Rhythm
Vertical and Horizontal Visual Rhythms
Vertical Horizontal
FMO-AWARE ROI DETERMINATION
FOR H.264/AVC VIDEO CODING
 Flexible macroblock ordering (FMO) was introduced in
H.264/AVC as a new error resilience tool and can be
used for ROI video coding as well.
 In the H.264/AVC reference software JM 13.2, the FMO
functionality supports eight slice ordering numbers, from
0 to 7, with 0 having the highest priority. Thus, the ROI
determination, which is followed by the FMO technique in
H.264/AVC, classifies the MBs into three slices, from 0 to 2.
Skin Color Extraction and Visual
Rhythm ROI Determination
 Since human faces are usually the loci of attention in
conversations, human faces should be regarded as the ROI
regions in the implementation.
 Here, both skin color extraction and visual rhythm ROI
determination schemes can detect ROI areas.
 Fig. 16 shows the results of each step in the proposed
FMO-aware ROI determination.
FMO-AWARE ROI DETERMINATION
 In Fig. 16(b) and (d), the skin color pixels are
extracted and then categorized into a
macroblock-based image, respectively.
 Then Fig. 16(e) sketches the contour of
the user attention region from the result
of Fig. 16(c).
 Fig. 16(d) and (e) illustrate the
individual ROI results in terms of white
and black macroblocks, where white
macroblocks represent the ROI region.
Extended ROI Macroblocks
 In implementations, ROI regions do not always stay in the same
position throughout a sequence, and a macroblock may change its
ROI status between two consecutive frames.
 Therefore, the variation in generated bits increases when a
macroblock changes from a non-ROI region in the
previous frame to an ROI region in the current frame.
 Moreover, the visual quality suffers from obvious artifacts at the
boundary between ROI macroblocks and non-ROI ones. However, it is
observed in [24], [25] that an extended region around the ROI regions
helps reduce the artifacts while ensuring that regions with targets
are not missed.
 Therefore, the extended ROI macroblocks are centered on the ROI regions
obtained above in our implementation. Fig. 16(f) and (g)
illustrate the extended ROI regions, marked in gray.
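A minimal sketch of a spatial extension is shown here. The width of the extended region is not stated in these slides, so a one-macroblock border (8-connected neighbourhood) is an assumption; the temporal extension is not covered, and the function name and label strings are hypothetical.

```python
from typing import List


def label_macroblocks(roi: List[List[bool]]) -> List[List[str]]:
    """Label each macroblock as 'roi', 'extended', or 'background'.

    roi[r][c] is True when macroblock (r, c) belongs to the ROI. A non-ROI
    macroblock touching an ROI macroblock (including diagonally) is labelled
    'extended'.
    """
    rows, cols = len(roi), len(roi[0])
    labels = [["background"] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if roi[r][c]:
                labels[r][c] = "roi"
                continue
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and roi[rr][cc]:
                        labels[r][c] = "extended"
    return labels
```

The extended ring plays the role of the gray macroblocks surrounding the white ROI macroblocks.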
ROI Scoreboard for FMO
 To create a scoreboard of ROI macroblocks, points are assigned to classify
the category of each macroblock.
 A macroblock located in the background gets two points. A
macroblock that belongs to an extended region, in either the spatial or the
temporal domain, gets one point. A macroblock that belongs to the ROI
region gets zero points.
 As illustrated in Fig. 16(h), each macroblock takes its score from the
lookup table in Table IV and is then arranged into one of five distinct
ordered slices. Fig. 16(i) overlays the ROI scoreboard of Fig. 16(h) on
the original frame to show the location of the
corresponding slices in a frame.
 The higher the score, the less important a
macroblock is in a frame.
Corresponding score lookup table
ROI Scoreboard for FMO
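The scoring rule can be sketched as below. The point values (ROI 0, extended 1, background 2) come from the slide; Table IV is not reproduced here, so adding the points of the skin-color map and the visual rhythm map to obtain five possible scores (0 to 4) is only one plausible reading and is an assumption. The function name and label strings are hypothetical.

```python
from typing import List

# Points per category, as stated on the slide.
POINTS = {"roi": 0, "extended": 1, "background": 2}


def scoreboard(skin_labels: List[List[str]],
               rhythm_labels: List[List[str]]) -> List[List[int]]:
    """Combine two per-macroblock label maps into one score map.

    Assumption: the final score is the sum of the two detectors' points,
    which yields values from 0 (most important) to 4 (least important).
    """
    return [
        [POINTS[s] + POINTS[v] for s, v in zip(skin_row, rhythm_row)]
        for skin_row, rhythm_row in zip(skin_labels, rhythm_labels)
    ]
```

Under this reading, a macroblock marked as ROI by both detectors would fall into the first slice, and one marked as background by both into the last.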
Experimental Results
 A salesman introduces a product with his hands.
Panels (a) to (h).
Experimental Results
 The movement of the hands is the most
important region in this sequence.
Panels (a) to (h).
Experimental Results
 ROIs of the Foreman sequence.
Panels (a) to (h).
Experimental Results
 Two staff members walking in an office.
Panels (a) to (h).
Time Consumption Analysis of Visual
Rhythm ROI Determination
Evaluated on a 1.5 GHz Pentium-M laptop with 512 MB of DDR RAM
Implementation of H.264/AVC ROI
Video Coding
 Importance of each slice in FMO:

 $I_i$: the importance factor of slice $i$
 $N_i$: the number of macroblocks in slice $i$
 $n$: the number of slices in a frame
 Target bits $bpp_i$:

 $B$ is the target number of bits for the current frame, estimated
by the JM encoder
 $QP_i$ for the FMO:

 $a$ and $b$ are recommended as 14 and −0.32
Conclusion
 This paper has presented a robust ROI determination
method based on user attention models through visual
rhythm analysis.
 It has investigated the visual rhythm
concept for analyzing video content to facilitate
ROI determination.
 Through visual rhythm, the proposed algorithm can
determine the highest potential ROI area in a fast,
simple, and robust way.
Future Work
 An FMO-aware ROI determination has been proposed
for H.264/AVC video coding to enhance the quality of
ROI regions.
 Based on the concept proposed in this paper, integrated
applications can potentially be developed by combining
the proposed scheme with
chrominance information analysis.
Thanks for listening.

Robust region of interest determination based on user attention model through visual rhythm analysis


Editor's notes

  1. NOTE: no single sampling line can represent all events through the six attention