Robust region of interest determination based on user attention model through visual rhythm analysis

Team members:
黃弘偉 M9915026 趙修鼎 M9915048
高培元 M9915044 林岱蒼 M9915902
楊逸翔 M9915016 彭宜亭 M9915081
褚慧倫 M9907513

Outline
 Introduction
 Visual Rhythm And User Attention Model
 ROI Determination Through User Attention
Model
 FMO-aware ROI Determination For H.264/AVC
Video coding
 Experimental Results
 Conclusion

Introduction
 ROI determination is required for video data
transmission.
 Moving objects attract users' attention and form ROIs in
consecutive frames, but detecting them is computationally
intensive.
 Visual rhythm can describe the characteristics of video
content.
 This work determines ROIs from attention models obtained through
visual rhythm analysis.

Visual Rhythm
 Visual rhythm can efficiently
capture the temporal information
of a video.

Visual Rhythm
[Figure: a video frame of width m and height n with its diagonal and anti-diagonal sampling lines]
m : width of a video frame
n : height of a video frame
rd : the ratio of pixel sampling for the diagonal
ra : the ratio of pixel sampling for the anti-diagonal
• Sampling lines:
Diagonal (D), Anti-diagonal (A),
Vertical (V), Horizontal (H).

Visual Rhythm
Di represents the gray scale value of the diagonal sampling pixels in the ith frame.
Ai represents the gray scale value of the anti-diagonal sampling pixels in the ith frame.
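
The sampling described above can be sketched in code. This is a minimal Python sketch, assuming grayscale frames stored as NumPy arrays of shape (n, m) and one sample per frame row. The function names and the sample count are illustrative assumptions, not the authors' implementation, and the sampling ratios rd and ra are not modeled.

```python
import numpy as np

def sample_diagonals(frame):
    """Return (diagonal, anti_diagonal) gray-level samples of one frame.

    frame: 2-D array of shape (n, m) = (height, width).
    Taking one sample per row is an assumption of this sketch.
    """
    n, m = frame.shape
    rows = np.arange(n)
    # Map each row to a column so the line runs from corner to corner.
    cols = np.round(rows * (m - 1) / max(n - 1, 1)).astype(int)
    diagonal = frame[rows, cols]               # top-left to bottom-right
    anti_diagonal = frame[rows, m - 1 - cols]  # top-right to bottom-left
    return diagonal, anti_diagonal

def visual_rhythm(frames):
    """Stack per-frame samples as columns: column i holds D_i (or A_i)."""
    pairs = [sample_diagonals(f) for f in frames]
    vr_d = np.stack([d for d, _ in pairs], axis=1)
    vr_a = np.stack([a for _, a in pairs], axis=1)
    return vr_d, vr_a
```

Each returned image has one column per frame, so temporal changes along a sampling line show up as horizontal variation in the image.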

User Attention Models
• Visual rhythm images can be categorized into
six attention models.

(Horizontal)

(Vertical)

(Expanding)

(Absorbing)

(Diagonal)

(Anti-diagonal)

Table: possible events of each attention model under each sampling line
Attention models: Horizontal, Vertical, Expanding, Absorbing, Diagonal, Anti-diagonal
Sampling lines: Diagonal, Anti-diagonal, Horizontal, Vertical

ROI Determination
 Using four sampling lines yields an efficient attention
model that characterizes the events in a video and avoids
false alarms.
 The center-crossed diagonal and anti-diagonal
sampling lines are first used to analyze the attention
model of the current frame, and then the vertical and
horizontal sampling lines are integrated to derive the
final user attention model and obtain the ROI.

ROI Determination
1) Visual rhythm creation
2) Difference calculation
3) Visual rhythm history
4) Binary thresholding
5) Morphological merging

ROI Determination
Fig. 4. Visual rhythms of diagonal and anti-diagonal sampling lines acquired
from Salesman QCIF sequence with 176 frames. (a) Diagonal and (b)
anti-diagonal.

Fig. 5. Visual rhythm difference images acquired from Fig. 4. (a) Diagonal and (b)
anti-diagonal.
• Obviously, the variation of the visual rhythms embeds significant information
about object movement, as shown below:
Difference calculation
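
One plausible reading of this step, sketched in Python: the difference image is taken here as the absolute gray-level change between consecutive columns of a visual rhythm. This is an assumption for illustration, since the slide's exact formula is not reproduced in the text, and the function name is hypothetical.

```python
import numpy as np

def rhythm_difference(vr):
    """Absolute gray-level change between consecutive frames.

    vr: visual rhythm of shape (samples, frames); column i is frame i.
    Returns an array of shape (samples, frames - 1).
    """
    vr = vr.astype(np.int32)   # avoid uint8 wrap-around on subtraction
    return np.abs(np.diff(vr, axis=1))
```

Static background produces values near zero, while a moving object crossing the sampling line produces large values at the samples it covers.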

Fig. 6. Visual rhythm historical images acquired from Fig. 5. (a) Diagonal
and (b) anti-diagonal.
• The historical image is built according to the variation of the visual rhythm:
Visual rhythm history
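
A minimal Python sketch of a history image, assuming the history of a sample is the sum of its difference values over the most recent `window` frames. Both the windowed sum and the `window` parameter are assumptions made for illustration; the slide's own accumulation rule is not available in the text.

```python
import numpy as np

def rhythm_history(diff, window=3):
    """Accumulate the last `window` difference columns for every frame.

    diff: difference image of shape (samples, frames).
    Returns an array of the same shape.
    """
    samples, frames = diff.shape
    csum = np.cumsum(diff, axis=1, dtype=np.int64)
    # Prepend a zero column so that padded[:, k] is the sum of the first k columns.
    padded = np.concatenate(
        [np.zeros((samples, 1), dtype=np.int64), csum], axis=1)
    end = np.arange(frames) + 1
    start = np.maximum(end - window, 0)
    return padded[:, end] - padded[:, start]
```

Accumulating over several frames smooths out single-frame noise, so that only sustained motion along a sampling line stands out.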

The threshold is calculated by averaging the historical values, which stand
for the variation of the visual rhythm.
Fig. 7. Binarized images derived from Fig. 6 by the thresholding process of
the historical statistics. (a) Diagonal and (b) anti-diagonal.
Binary thresholding
$b_i(z)$ represents the binary image according to the magnitudes of the variations.
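
The thresholding step can be sketched as follows in Python. The source states only that the threshold is the average of the historical values; computing one average per frame (per column) is an assumption of this sketch, as is the function name.

```python
import numpy as np

def binarize_history(history):
    """Mark samples whose history exceeds the mean history of their frame.

    history: array of shape (samples, frames).
    Returns a uint8 array of the same shape holding 0 or 1.
    """
    # One threshold per frame (column): an assumption of this sketch.
    threshold = history.mean(axis=0, keepdims=True)
    return (history > threshold).astype(np.uint8)
```

Because the threshold adapts to each frame's average activity, no fixed gray-level constant has to be tuned per sequence.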

Illustrations of the proposed merging steps.
Morphological merging

Images of the scopes of user attention in the diagonal and anti-diagonal
visual rhythms. (a) Diagonal and (b) anti-diagonal.
Morphological merging
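
The merging steps themselves are shown only as a figure. As a rough illustration, the Python sketch below applies a generic one-dimensional gap-filling rule to the binarized samples of one frame and then reports the longest connected run as the scope of attention. The rule, the `max_gap` parameter, and both function names are assumptions, not the authors' procedure.

```python
import numpy as np

def merge_segments(binary_line, max_gap=2):
    """Fill runs of zeros no longer than max_gap that lie between ones."""
    line = np.asarray(binary_line, dtype=np.uint8).copy()
    ones = np.flatnonzero(line)
    for left, right in zip(ones[:-1], ones[1:]):
        # right - left - 1 is the number of zeros between two neighboring ones.
        if 1 < right - left <= max_gap + 1:
            line[left:right] = 1
    return line

def attention_scope(binary_line):
    """Return (start, end) of the longest run of ones, or None if empty."""
    best, start = None, None
    for i, v in enumerate(list(binary_line) + [0]):   # sentinel closes last run
        if v and start is None:
            start = i
        elif not v and start is not None:
            if best is None or i - start > best[1] - best[0] + 1:
                best = (start, i - 1)
            start = None
    return best
```

For example, `merge_segments([1, 0, 1, 1, 0, 0, 0, 1, 0], max_gap=1)` closes the single-sample gap at the start but leaves the three-sample gap open.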

Vertical and Horizontal Visual Rhythms
Vertical Horizontal

FMO-AWARE ROI DETERMINATION
FOR H.264/AVC VIDEO CODING
 Flexible macroblock ordering (FMO) was introduced in
H.264/AVC as a new error resilience tool and can be
used for ROI video coding as well.
 In the H.264/AVC reference software JM 13.2, the FMO
functionality supports eight slice ordering numbers, from
0 to 7, with 0 having the highest priority. Thus, the ROI
determination, which is followed by the FMO technique in
H.264/AVC, classifies the MBs into three slices from 0 to 2.

Skin Color Extraction and Visual
Rhythm ROI Determination
 Since human faces are usually the loci of attention in
conversations, human faces should be regarded as the ROI
regions in the implementation.
 Here, both skin color extraction and visual rhythm ROI
determination schemes can detect ROI areas.
 Fig. 16 shows the results of each step in the proposed FMO-
aware ROI determination.

 In Fig. 16(b) and (d), the skin color pixels are
extracted and then categorized into a
macroblock-based image, respectively.
 Then Fig. 16(e) sketches the contour of
the user attention region from the result
of Fig. 16(c).
 Fig. 16(d) and (e) illustrate the
individual ROI results in terms of white
and black macroblocks, where white
macroblocks represent the ROI region.
FMO-AWARE ROI DETERMINATION
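
Converting a pixel-level mask into a macroblock-based image can be sketched as below. This Python sketch assumes 16x16 macroblocks and marks a macroblock as ROI when at least `min_fraction` of its pixels are set. The fraction rule and the names are illustrative assumptions, and the skin color extraction itself is not modeled.

```python
import numpy as np

MB = 16  # H.264/AVC macroblock size in luma pixels

def mask_to_macroblocks(pixel_mask, min_fraction=0.5):
    """Mark a macroblock as ROI when enough of its pixels are set.

    pixel_mask: 2-D array of 0/1 or bool values, shape (height, width).
    Returns a bool array of shape (height // 16, width // 16).
    """
    h, w = pixel_mask.shape
    rows, cols = h // MB, w // MB
    # Group pixels into (macroblock row, y in MB, macroblock column, x in MB).
    blocks = pixel_mask[:rows * MB, :cols * MB].reshape(rows, MB, cols, MB)
    fraction = blocks.mean(axis=(1, 3))
    return fraction >= min_fraction
```

A QCIF frame (176x144 pixels) maps to an 11x9 grid of macroblocks under this scheme.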

Extended ROI Macroblocks
 In implementations, ROI regions do not always stay in the same
position in a consecutive sequence, and a macroblock may change its
ROI status between two consecutive frames.
 Therefore, the variation in generated bits increases when a
macroblock changes its status from a non-ROI region in the
previous frame to an ROI region in the current frame.
 Moreover, the visual quality suffers from obvious artifacts at the
boundary between ROI macroblocks and non-ROI ones. However, it is
observed in [24], [25] that an extended region around the ROI regions
helps reduce the artifacts while ensuring that regions with targets
are not missed.
 Therefore, the extended ROI macroblocks have the ROI regions
obtained above as their center in our implementation. Fig. 16(f) and (g)
illustrate the extended ROI regions marked in gray.
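
A minimal Python sketch of the extended region, assuming the spatial extension is a one-macroblock border around the ROI and the temporal extension covers macroblocks that were ROI in the previous frame. The border width and the function name are assumptions made for illustration.

```python
import numpy as np

def extended_region(roi, prev_roi=None):
    """Return macroblocks that surround the ROI spatially or temporally.

    roi, prev_roi: bool arrays on the macroblock grid.
    The result excludes macroblocks that are ROI in the current frame.
    """
    roi = np.asarray(roi, dtype=bool)
    rows, cols = roi.shape
    padded = np.pad(roi, 1, mode="constant", constant_values=False)
    grown = np.zeros_like(roi)
    # OR together the nine shifted copies: a 3x3 dilation of the ROI.
    for dy in range(3):
        for dx in range(3):
            grown |= padded[dy:dy + rows, dx:dx + cols]
    if prev_roi is not None:
        grown |= np.asarray(prev_roi, dtype=bool)   # temporal extension
    return grown & ~roi
```

Including the previous frame's ROI softens the jump in generated bits when a macroblock switches between non-ROI and ROI status.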

ROI Scoreboard for FMO
 To create a scoreboard of ROI macroblocks, points are given to classify
the category of each macroblock.
 A macroblock located in the background gets two points. If a
macroblock belongs to an extended region in either the spatial or temporal
domain, it gets one point. A macroblock that belongs to the ROI region
gets zero points.
 As illustrated in Fig. 16(h), each macroblock has its score from the
lookup table in Table IV, and then it is arranged into five distinct
ordered slices. Fig. 16(i) shows the original frame with the result of ROI
scoreboard in Fig. 16(h) to demonstrate the location of the
corresponding slices in a frame.

 The higher the score, the less important a
macroblock is in a frame.
Corresponding score lookup table
ROI Scoreboard for FMO
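
The scoring rule can be sketched in Python as follows. The per-detector scores (ROI 0, extended 1, background 2) come from the text above. Combining the skin color and visual rhythm scores by simple addition, which gives five values from 0 to 4, is an assumption standing in for the lookup table in Table IV, whose contents are not reproduced here.

```python
import numpy as np

def detector_score(roi, extended):
    """Score one detector's result: ROI = 0, extended = 1, background = 2."""
    score = np.full(roi.shape, 2, dtype=np.uint8)   # background by default
    score[extended] = 1
    score[roi] = 0                                  # ROI overrides extended
    return score

def slice_map(skin_score, rhythm_score):
    """Combine two detector scores into a slice index per macroblock.

    Summation is an assumed stand-in for the Table IV lookup;
    a lower index means a more important macroblock.
    """
    return skin_score + rhythm_score
```

With this assumed combination, a macroblock that both detectors mark as ROI lands in slice 0 and a macroblock that both treat as background lands in slice 4.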

Experimental Results
 Salesman introduces a product with his hands
(a) (b) (c) (d)
(e) (f) (g) (h)

 The movement of the hands is the most
important region in this sequence
(a) (b) (c) (d)
(e) (f) (g) (h)

 ROIs of Foreman sequence.
(a) (b) (c) (d)
(e) (f) (g) (h)

 Two staff members walking in an office room.
(a) (b) (c) (d)
(e) (f) (g) (h)

Time Consumption Analysis of Visual
Rhythm ROI Determination
Evaluated on a 1.5 GHz Pentium-M laptop with 512 MB of DDR RAM

Implementation of H.264/AVC ROI
Video Coding
 Indicate the importance of each slice in FMO

 Ii : the importance factor
 Ni : the number of macroblocks of the slice i
 n stands for the number of slices in a frame
 target bits bppi

 B is the target bits used for the current frame and is estimated
by the JM encoder
 QPi for the FMO

 a and b are recommended as 14 and −0.32

Implementation of H.264/AVC ROI
Video Coding

Conclusion
 This paper has presented a robust ROI determination
method based on user attention models through visual
rhythm analysis.
 It has investigated the visual rhythm
concept for analyzing video content to facilitate
ROI determination.
 Through visual rhythm, the proposed algorithm can
determine the highest potential ROI area in a fast,
simple, and robust way.

Future Work
 An FMO-aware ROI determination has been proposed
for H.264/AVC video coding to enhance the quality of
ROI regions.
 Based on the concept proposed in this paper, potential
developments of integrated applications are found
when the proposed scheme is combined with
chrominance information analysis.
