Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking:
Semantic Similarity using Named Entities

MediaEval: Search and Hyperlinking
4-5 October 2012, Pisa, Italy

Tom De Nies
Pedro Debevere, Davy Van Deursen, Wesley De Neve, Erik Mannens and Rik Van de Walle
Ghent University – IBBT – Multimedia Lab (ELIS)

                Our approach in a nutshell




1. Create enriched representation
   of videos and queries
2. Apply multiple similarity metrics
3. Merge results by late fusion




             Enriched Data Representation


Advantages
• Queries and videos become directly comparable
• Extra metadata containing disambiguated concepts
• Easy conversion from a video to a query object
  → the same approach can be used for Search and Linking!

Disadvantages
• The enrichment step when ingesting data can take a while
• Only English NER tools are available → an automatic translation step is needed for other languages
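The slides do not spell out the underlying data model, but the shared, enriched representation that makes queries and videos comparable could be sketched as follows. This is a minimal illustration only: the EnrichedObject class, its field names and the helper function are assumptions, not the authors' code.

```python
from dataclasses import dataclass, field

@dataclass
class EnrichedObject:
    """Shared representation for a video (segment) and a query.

    Because both sides are stored the same way, the same similarity
    metrics can be applied to Search (query vs. video) and to
    Linking (video vs. video). Field names are illustrative.
    """
    identifier: str
    words: list[str] = field(default_factory=list)   # transcript / query terms (stop words removed)
    entities: set[str] = field(default_factory=set)  # disambiguated concepts, e.g. DBpedia URIs
    tags: set[str] = field(default_factory=set)      # user-generated or extracted tags

def video_to_query(video: EnrichedObject) -> EnrichedObject:
    """'Easy conversion from a video to a query object': reuse the same fields."""
    return EnrichedObject(identifier=video.identifier + "#query",
                          words=list(video.words),
                          entities=set(video.entities),
                          tags=set(video.tags))
```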


                     Similarity metrics




1. “Bag of words” similarity
2. Named Entity-based similarity
3. Tag-based similarity





          Bag of Words similarity


[Diagram] Preprocessing: TEXT → STOP WORD REMOVAL → TEXT WITHOUT STOP WORDS

For each word:
• calculate the term frequency, TF(t, D) = number of occurrences of term t in document D
• calculate the inverse document frequency (IDF)
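As an illustration of this metric, a minimal TF-IDF cosine similarity between a query and a document could look as follows. This is a sketch of the standard scheme, not the implementation used in the submitted runs; the function names and the smoothed IDF variant are assumptions.

```python
import math
from collections import Counter

def bow_similarity(query_terms, doc_terms, corpus):
    """Cosine similarity between TF-IDF vectors of a query and a document.

    `corpus` is the full list of documents (each a list of terms); it is
    needed to initialize the IDF weights -- the "expensive training step"
    mentioned on the next slide.
    """
    def idf(term):
        df = sum(1 for doc in corpus if term in doc)      # document frequency
        return math.log(1.0 + len(corpus) / (1.0 + df))   # smoothed IDF

    def tfidf(terms):
        tf = Counter(terms)                               # TF(t, D) = # occurrences of t in D
        return {t: count * idf(t) for t, count in tf.items()}

    q, d = tfidf(query_terms), tfidf(doc_terms)
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm = math.sqrt(sum(w * w for w in q.values())) * math.sqrt(sum(w * w for w in d.values()))
    return dot / norm if norm else 0.0
```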


        Bag of Words similarity


+ Both the corpus and the documents are taken into account
+ Common words get a lower weight, so unique features are exploited
− Expensive training step (IDF initialization)
− No semantics → ambiguity



   Named Entity-based Similarity
Named Entities are extracted from content

 Similar content will have similar entities!




      Named Entity-based Similarity



+ Fewer entities than terms → fewer calculations than BoW
+ IDF replaced by IS: no indexing of the corpus is required
+ Named Entities are unambiguous
− Lower precision / coarser granularity than BoW
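A minimal sketch of an entity-based similarity over the disambiguated entity sets is given below. The slide's exact IS weighting is not reproduced here; a Jaccard-style set overlap is used purely to illustrate the idea, and the function name is an assumption.

```python
def entity_similarity(query_entities, doc_entities):
    """Overlap between the disambiguated entity sets of a query and a document.

    Entities are assumed to be disambiguated identifiers (e.g. DBpedia URIs)
    produced by the enrichment step, so plain set intersection is meaningful
    and no corpus index is needed.
    """
    q, d = set(query_entities), set(doc_entities)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)
```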



     Tag-based similarity




+ Uses user-generated metadata
+ Synonyms are included for higher recall
− Very coarse granularity / low precision
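A minimal sketch of a tag-based similarity with synonym expansion follows. The synonym source and the overlap measure are illustrative assumptions, not the exact metric of the submitted runs.

```python
def tag_similarity(query_tags, doc_tags, synonyms=None):
    """Overlap between tag sets, with optional synonym expansion for recall.

    `synonyms` maps a tag to a collection of equivalent tags (e.g. taken
    from a thesaurus); expanding both sides before comparing trades
    precision for recall, matching the trade-off on this slide.
    """
    synonyms = synonyms or {}

    def expand(tags):
        expanded = set(tags)
        for tag in tags:
            expanded.update(synonyms.get(tag, ()))
        return expanded

    q, d = expand(query_tags), expand(doc_tags)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)
```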





            Late Fusion
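The fusion weights and normalization used in the runs are not given on the slides; a generic weighted-sum late fusion over the per-metric result lists could be sketched as follows. All names and the normalization scheme are assumptions.

```python
def late_fusion(results_per_metric, weights=None):
    """Merge the ranked results of the individual similarity metrics.

    `results_per_metric` maps a metric name (e.g. "bow", "entities", "tags")
    to a dict {segment_id: score}. Scores are normalized per metric and
    combined as a weighted sum, then re-ranked.
    """
    weights = weights or {}
    fused = {}
    for metric, scores in results_per_metric.items():
        top = max(scores.values(), default=0.0)
        for segment, score in scores.items():
            normalized = score / top if top else 0.0
            fused[segment] = fused.get(segment, 0.0) + weights.get(metric, 1.0) * normalized
    # Highest fused score first
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```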





Evaluation: Search

Run                       MRR @60/30/10           mGAP @60/30/10          MASP @60/30/10
1 (LIMSI: BoW+NE)         0.188 / 0.150 / 0.117   0.120 / 0.089 / 0.033   0.066 / 0.066 / 0.061
2 (LIUM: BoW+NE)          0.254 / 0.187 / 0.054   0.140 / 0.069 / 0.033   0.046 / 0.046 / 0.028
3 (LIMSI: BoW+NE+Tags)    0.165 / 0.128 / 0.094   0.099 / 0.069 / 0.017   0.061 / 0.061 / 0.057
4 (LIUM: BoW+NE+Tags)     0.221 / 0.154 / 0.038   0.115 / 0.053 / 0.017   0.040 / 0.041 / 0.023

                     Evaluation: Search



Unexpected:
• LIUM > LIMSI, even though LIMSI had better language detection
  → due to the automatic translation step?
• NE + BoW > NE + BoW + Tags
  → tags rank false positives higher and return more results, so MRR decreases
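For reference, MRR averages the reciprocal of the rank at which the first relevant segment is returned per query, so additional, lower-ranked false positives pull the average down. A minimal sketch (illustrative, not the official evaluation script):

```python
def mean_reciprocal_rank(first_relevant_ranks):
    """MRR over the 1-based rank of the first relevant segment per query.

    Queries for which nothing relevant was returned contribute 0 (use None
    for those). Extra results that push the relevant segment down in the
    ranking directly lower the score -- which is why adding tags can find
    more results and still decrease MRR.
    """
    if not first_relevant_ranks:
        return 0.0
    return sum(1.0 / rank for rank in first_relevant_ranks if rank) / len(first_relevant_ranks)
```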




Evaluation: Search

Run                       Precision @60   Recall @60
1 (LIMSI: BoW+NE)         0.056           0.400
2 (LIUM: BoW+NE)          0.061           0.467
3 (LIMSI: BoW+NE+Tags)    0.054           0.433
4 (LIUM: BoW+NE+Tags)     0.059           0.500

Evaluation: Linking

Run                        MAP (Ground Truth)   MAP (Search results)
LIMSI (BoW + NE)           0.157                0.014
LIUM (BoW + NE)            0.171                0.040
LIMSI (BoW + NE + Tags)    0.157                0.003
LIUM (BoW + NE + Tags)     0.171                0.037



Possible explanations:
• Thresholds optimized for Search task, not for Linking
• User-generated tags vs. extracted tags

… to be investigated!


               Improvements / Future Work




• Better ranking criteria / late fusion
• Improve the tag-based similarity
• Optimize parameters for the Linking task





                                    Discussion




These research activities were funded by Ghent University, IBBT, the IWT
Flanders, the FWO-Flanders, and the European Union, in the context of the
IBBT project Smarter Media in Flanders (SMIF).

