Leveraging an image folksonomy and the signature quadratic form distance for semantic based detection of near-duplicate video clips

Leveraging an Image Folksonomy and the Signature Quadratic Form
Distance for Semantic-Based Detection of Near-Duplicate Video Clips
Hyun-seok Min, Jae Young Choi, Wesley De Neve, and Yong Man Ro
Image and Video Systems Lab
Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, South Korea
e-mail: hsmin@kaist.ac.kr website: http://ivylab.kaist.ac.kr

I. INTRODUCTION
- Observations
  - an increasing number of near-duplicate video clips (NDVCs) can be found on websites for video sharing
  - content transformations tend to preserve semantic information
- Novel idea
  - NDVC detection using semantic concept detection
- Research challenges
  - semantic coverage: use of model-free semantic concept detection
  - semantic similarity: use of adaptive semantic distance measurement

IV. EXPERIMENTS
1. Experimental setup
- Use of TRECVID 2009 for creating NDVCs and reference video clips
- Use of MIRFLICKR-25000 as a source of collective knowledge
- Use of VIREO-374 for model-based semantic concept detection
2. Experimental results
2.1. Influence of semantic concept popularity
- The effectiveness of model-based semantic concept detection highly depends on the popularity of the semantic concept models used
  - non-popular semantic concept models hardly contribute to improving the effectiveness of NDVC detection
[Figure: NDCR (y-axis, 0 to 1.2) plotted against the number of semantic concepts used (x-axis, 10 to 370), with two curves labeled "Descending order of popularity" and "Ascending order of popularity".]
Fig. 2. Influence of semantic concept popularity on NDVC detection.
2.2. Influence of different types of video content
- To facilitate effective NDVC detection, video signatures need to be robust against the use of different types of video content
  - category 1 (documentaries), category 2 (news), category 3 (drama and movies), category 4 (miscellaneous)
- The effectiveness of the proposed NDVC detection technique is stable and high for all types of video content investigated
Fig. 3. Effectiveness of NDVC detection for different types of video content.

II. SEMANTIC VIDEO SIGNATURE CREATION USING AN IMAGE FOLKSONOMY
[Figure: low-level visual features are extracted from an input shot $S_i$ and used for content-based image retrieval over the image folksonomy F, which consists of user-contributed images with user-supplied tags. The output is the k nearest visual neighbors of $S_i$ together with their tags. Example tag lists shown in the figure: "night, sky, stars, mountains, milkyway, aquila, sagittarius, scorpius, ..."; "milkyway, sky, space, astrophotography, night, telescope, jupiter, clouds, ..."; "milky way, galaxy, stars, sky".]
Fig. 1. Retrieval of the k nearest visual neighbor images and their associated tags from an image folksonomy F for a video shot $S_i$.
- Metric for measuring the relevance of a tag t w.r.t. the shot $S_i$:
  $$R(t) = \frac{c}{k} - \frac{L_t}{|F|}$$
  - $c$: the frequency of t in the set of k neighbors
  - $L_t$: the number of images labeled with t in F
  - $|F|$: the number of images in F
- Layout of the semantic feature signature $A_i$ of a shot $S_i$:
  $$A_i = \left[ (t_{i,j}, w_{i,j}),\; j = 1, \ldots, |A_i| \right]$$
  - $w_{i,j}$: a weight value for tag $t_{i,j}$
- Computation of the weight value for tag $t_{i,j}$:
  $$w_{i,j} = \frac{R(t_{i,j})}{\sum_{k=1}^{|A_i|} R(t_{i,k})}$$
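The relevance metric R(t) and the weight normalization above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy neighbor tags, and the per-tag image counts are hypothetical, and keeping only tags with positive relevance in the signature is an assumption (the poster does not state which tags enter $A_i$).

```python
from collections import Counter


def tag_relevance(neighbor_tags, tag_image_counts, folksonomy_size):
    """Return R(t) = c/k - L_t/|F| for every tag seen among the k neighbors.

    neighbor_tags: list of k tag lists, one per retrieved neighbor image.
    tag_image_counts: dict tag -> L_t, number of images in F labeled with t.
    folksonomy_size: |F|, number of images in F.
    """
    k = len(neighbor_tags)
    # c: number of neighbor images carrying tag t (each image counted once)
    counts = Counter(t for tags in neighbor_tags for t in set(tags))
    return {
        t: c / k - tag_image_counts[t] / folksonomy_size
        for t, c in counts.items()
    }


def build_signature(relevance):
    """Normalize relevance values into weights that sum to 1.

    Assumption: tags with non-positive relevance are dropped.
    """
    kept = {t: r for t, r in relevance.items() if r > 0}
    total = sum(kept.values())
    return {t: r / total for t, r in kept.items()} if total > 0 else {}


# Toy data (hypothetical): three neighbors and made-up values of L_t.
neighbors = [
    ["night", "sky", "stars"],
    ["milkyway", "sky", "night"],
    ["galaxy", "stars", "sky"],
]
image_counts = {"sky": 5000, "night": 2500, "stars": 1250,
                "milkyway": 50, "galaxy": 100}
relevance = tag_relevance(neighbors, image_counts, folksonomy_size=25000)
signature = build_signature(relevance)
for tag, weight in sorted(signature.items(), key=lambda kv: -kv[1]):
    print(f"{tag}: {weight:.3f}")
```

Subtracting $L_t / |F|$ penalizes tags that are frequent everywhere in the folksonomy, so a tag only scores well when it is over-represented among the visual neighbors of the shot.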
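[Figure: table of example key frames with columns "Key frame", "Model-based approach", and "Model-free approach". Concept labels appearing in the table include Cloud, Sky, Water, Stars, Night, Geotagged, Moonlight, Rainbow, Constellation, Puppy, Dog, Blue, Civilian, Person, Grass, Group, Clouds, Zoo, Summer, Safari, and N/A; the assignment of labels to key frames and the underlining are not recoverable from the extracted text.]
Fig. 4. Example key frames with detected semantic concepts (underlined semantic concepts are considered to be correct).

III. SEMANTIC DISTANCE MEASUREMENT USING THE SIGNATURE QUADRATIC FORM DISTANCE (SQFD)
- Adaptive semantic distance measurement between shots $S^q$ and $S^r$:
  $$D_{shot}(S^q, S^r) = SQFD(A^q, A^r) = \sqrt{(w^q \mid -w^r)\, G\, (w^q \mid -w^r)^T}$$
  - $w^q = (w^q_1, \ldots, w^q_{|A^q|})$
  - $w^r = (w^r_1, \ldots, w^r_{|A^r|})$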
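The distance above can be sketched in Python as follows, under stated assumptions. The signatures are plain dicts from tag to weight, and the ground similarity is filled from hypothetical tag-to-image sets using the co-occurrence ratio $|I_{t_i} \cap I_{t_j}| / |I_{t_i}|$. All names and data here are illustrative and are not taken from the poster. Because this ratio is not symmetric, the quadratic form is clamped at zero before the square root as a precaution.

```python
import math


def cooccurrence_similarity(tag_images):
    """Build g(ti, tj) = |I_ti ∩ I_tj| / |I_ti| from tag -> set of image ids."""
    def g(ti, tj):
        images_i = tag_images.get(ti, set())
        if not images_i:
            # Assumption: an unseen tag is similar only to itself.
            return 1.0 if ti == tj else 0.0
        return len(images_i & tag_images.get(tj, set())) / len(images_i)
    return g


def sqfd(sig_q, sig_r, similarity):
    """Signature quadratic form distance between two tag signatures.

    sig_q, sig_r: dicts mapping tag -> weight.
    similarity: callable (tag_a, tag_b) -> ground similarity value.
    """
    tags = list(sig_q) + list(sig_r)
    # Concatenated weight vector (w^q | -w^r)
    w = list(sig_q.values()) + [-v for v in sig_r.values()]
    n = len(w)
    quad = sum(
        w[i] * similarity(tags[i], tags[j]) * w[j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(max(quad, 0.0))


# Toy data (hypothetical): image ids annotated with each tag.
tag_images = {"sky": {1, 2, 3, 4}, "night": {3, 4, 5}, "dog": {6, 7}}
g = cooccurrence_similarity(tag_images)

sig_night_sky = {"sky": 0.6, "night": 0.4}
sig_dog = {"dog": 1.0}
d_same = sqfd(sig_night_sky, sig_night_sky, g)
d_diff = sqfd(sig_night_sky, sig_dog, g)
print(f"identical signatures: {d_same:.4f}")
print(f"unrelated signatures: {d_diff:.4f}")
```

The two signatures may have different lengths and different tags, which is what makes the measure adaptive: the similarity function is evaluated only for the tags that actually occur in the two shots being compared.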
- The elements of the ground similarity matrix G:
  $$g_{ij} = \frac{|I_{t_i \cap t_j}|}{|I_{t_i}|}$$
  - $I_{t_i \cap t_j}$: the set of images annotated with both tag $t_i$ and tag $t_j$
  - $I_{t_i}$: the set of images annotated with tag $t_i$

V. CONCLUSIONS
- This paper discussed a novel technique for NDVC detection
  - takes advantage of the collective knowledge in an image folksonomy
  - allows using an unrestricted and dynamic concept vocabulary
  - takes advantage of the flexible SQFD metric
  - allows taking into account that the nature, the relevance, and the number of semantic concepts may strongly vary from shot to shot

IEEE International Conference on Multimedia and Expo (ICME), July 2011, Barcelona (Spain)
