ECDL 2010
6-10 September 2010




          Measuring Effectiveness of
           Geographic IR Systems
              in Digital Libraries:
          Evaluation and Case Study

         Damien Palacio, Guillaume Cabanac,
            Christian Sallaberry, Gilles Hubert

              Damien Palacio - damien.palacio@univ-pau.fr   1
Outline

1. Motivation        Topical IR → Geographic IR
                     Hypothesis: GIRS > IRS

2. Context           IRS evaluation
   Issue             Current evaluation frameworks
                     = partial

3. Contribution      GIRS evaluation framework

4. Experiments       Case study with PIV GIRS
                     Hypothesis validated

5. Conclusion and Future Works

                                                     2
1. Motivation – Why Geographic IR?


Geographic Information Retrieval
➔   Query = ''trip around Glasgow in summer 2010''

➔   Search Engines
      ➔   Topical          term     ∈ {trip, Glasgow, summer, 2010}

      ➔   Geographic       spatial  ∈ {citiesNearGlasgow ...}
                           temporal ∈ {21june .. 22sept 2010}
                           term     ∈ {trip, Glasgow, summer, 2010}


➔   ≈ 1/6 Queries = Geographic Queries
      ➔   Excite       (Sanderson et al., 2004)
      ➔   AOL          (Gan et al., 2008)
      ➔   Yahoo!       (Jones et al., 2008)


➔   A current and realistic issue
                                                                       4
1. Motivation – Why Geographic IR?


A Geographic IRS: How Does It Work?
➔   3 Dimensions to Process:
      ➔   Spatial, temporal and topical

➔   1 Index per Dimension
      ➔   Topical      bag of words, vector space model, ...
      ➔   Spatial      named entity recognition, ...
      ➔   Temporal     named entity recognition, ...




                                                               5
1. Motivation – Why Geographic IR?


A Geographic IRS: How Does It Work?
➔   Spatial Processing




                                      6
1. Motivation – Why Geographic IR?


A Geographic IRS: How Does It Work?
➔   Retrieval
      ➔   Usually by filtering (STEWARD, SPIRIT, CITER, …)


➔   Issue: Performance of GIRS vs. topical IRS
➔   Hypothesis: Geographic IRS better than topical IRS
                                                               7
2. Context and Issue: IRS Partial Evaluation


Evaluating an IR System
➔   System = efficiency + effectiveness
      ➔   Efficiency: computation time, storage needed (Geo IR literature)
      ➔   Effectiveness: quality (Topical IR literature)

➔   Effectiveness Evaluation
      ➔   Topical: TREC, CLEF, ...
      ➔   Temporal: TempEval
      ➔   Topical + Spatial: Bucher et al. (2005), GeoClef
      ➔   Temporal + Topical + Spatial: evaluation framework proposed

                                                                         16
3. Proposition – GIRS Evaluation Framework


Evaluation Framework for the 3 Dimensions (1/2)
 ➔   Goal: measuring GIRS quality

 ➔   Means: building on TREC framework (1992-)

 ➔   ''Cranfield'' methodology
       ➔   Test collection
              ➔   Corpus
              ➔   ≥ 25 Topics
              ➔   Qrels


       ➔   Measures: P@X, MAP,
             NDCG, ...
                                             [Voorhees, 2007]
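
The binary-relevance measures named above can be sketched in a few lines. This is a minimal illustration, not the trec_eval implementation: the function names and the toy ranking are assumptions made for the example, and relevance is reduced to a set of relevant document ids.

```python
def precision_at_k(ranking, relevant, k):
    """P@k: fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def average_precision(ranking, relevant):
    """AP: mean of the precision values at each rank holding a relevant document."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs, qrels):
    """MAP: AP averaged over topics (both arguments are keyed by topic id)."""
    return sum(average_precision(runs.get(t, []), qrels[t]) for t in qrels) / len(qrels)


# Hypothetical toy data: one topic, four retrieved documents, two relevant ones.
ranking = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
p2 = precision_at_k(ranking, relevant, 2)
ap = average_precision(ranking, relevant)
```

These measures only see relevant / not relevant, which is why a graded measure such as NDCG is needed once qrels become gradual.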

                                                                18
3. Proposition – GIRS Evaluation Framework


Evaluation Framework for the 3 Dimensions (2/2)
 ➔   TREC Framework Extension
       ➔   Test collection
              ➔   ≥ 25 Topics
              ➔   Corpus covering the 3 dimensions
              ➔   Gradual qrels
              ➔   + geographic resources

       ➔   About qrels …
              ➔   Relevance (doc, topic) ∈ {0;1;2;3;4}
                     ➔   0 = no dimension
                     ➔   3 = 3 dimensions
                           Topic: ''trip around Glasgow''
                           Doc: trip + Bob born in Dumbarton
                     ➔   4 = 3 dimensions + global = satisfied topic
              ➔   Principle: ''the more satisfied dimensions there are, the
                    better it is''

       ➔   Gradual qrels aware measure:
             Normalized Discounted Cumulative Gain            [Järvelin & Kekäläinen, 2002]

              ➔   By topic: NDCG for each topic
              ➔   Global:   meanNDCG for the system
                                                                                          21
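
A minimal sketch of NDCG over gradual qrels, by topic and averaged over the system. It assumes the common log2(rank + 1) discount and takes the ideal ranking from all judged documents of the topic; the exact variant used in the study is not stated on the slides, and the function names and toy judgments are hypothetical.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranking, judgments):
    """NDCG for one topic.

    ranking:   list of document ids, best first
    judgments: dict doc id -> grade in {0, 1, 2, 3, 4}; unjudged documents count as 0
    """
    gains = [judgments.get(doc, 0) for doc in ranking]
    ideal_dcg = dcg(sorted(judgments.values(), reverse=True))
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(runs, qrels):
    """Global score: NDCG averaged over all judged topics."""
    return sum(ndcg(runs.get(t, []), qrels[t]) for t in qrels) / len(qrels)


# Hypothetical judgments for one topic: 4 = satisfied topic, 0 = no dimension.
judgments = {"a": 4, "b": 2, "c": 0}
perfect = ndcg(["a", "b", "c"], judgments)
swapped = ndcg(["b", "a"], judgments)
```

Ranking the grade-2 document above the grade-4 one lowers the score, which is the behaviour the ''more satisfied dimensions, the better'' principle calls for.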
4. Experiments – Case Study with PIV GIRS


Case Study: PIV System
➔   Indexing: 1 index per dimension
      ➔   Topical = Terrier IRS   [Ounis et al., 2005]
      ➔   Spatial = map segmentation into tiles
      ➔   Temporal = timeline segmentation into tiles
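
To make the idea of tile-based indexing concrete, here is a rough sketch. It is not the PIV implementation: the regular degree grid, the fixed-length time slices, the frequency-based score and all names (`spatial_tile`, `temporal_tile`, `TileIndex`) are assumptions chosen for illustration, and the coordinates are approximate.

```python
from collections import defaultdict
from datetime import date


def spatial_tile(lat, lon, tile_deg=1.0):
    """Map a point to the (row, col) cell of a regular grid (assumed tiling scheme)."""
    return (int((lat + 90) // tile_deg), int((lon + 180) // tile_deg))


def temporal_tile(day, tile_days=30):
    """Map a date to a fixed-length slice of the timeline (assumed tiling scheme)."""
    return day.toordinal() // tile_days


class TileIndex:
    """Inverted index: tile -> {document id -> number of occurrences in that tile}."""

    def __init__(self):
        self._postings = defaultdict(lambda: defaultdict(int))

    def add(self, tile, doc_id):
        self._postings[tile][doc_id] += 1

    def search(self, tiles):
        """Score documents by how often they fall into the query tiles."""
        scores = defaultdict(int)
        for tile in tiles:
            for doc_id, freq in self._postings.get(tile, {}).items():
                scores[doc_id] += freq
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = TileIndex()
index.add(spatial_tile(55.86, -4.25), "d1")   # roughly Glasgow
index.add(spatial_tile(55.94, -4.57), "d1")   # roughly Dumbarton
index.add(spatial_tile(48.85, 2.35), "d2")    # roughly Paris
around_glasgow = index.search([spatial_tile(55.9, -4.3)])
```

The same class can hold the temporal index by feeding it `temporal_tile` values instead of grid cells.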




                                                            CombMNZ


➔   Retrieval
      ➔   Result document list for each index
      ➔   Results combination with CombMNZ [Fox & Shaw, 1993; Lee, 1997]
                                                                      23
4. Experiments – Case Study with PIV GIRS


CombMNZ Principle [Fox & Shaw, 1993; Lee, 1997]




                                                 24
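
Since the slide's figure is not available in text form, here is a minimal sketch of the CombMNZ principle: each document's normalized scores are summed across the result lists, then multiplied by the number of lists that retrieved it. Min-max normalization and counting "retrieved by a list" as a non-zero similarity are assumptions of this sketch, as are the function names and the toy scores.

```python
def min_max_normalize(scores):
    """Rescale the scores of one result list to [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_lists):
    """Fuse result lists (dicts doc id -> score) with CombMNZ.

    fused(doc) = sum of normalized scores * number of lists that retrieved doc.
    A document retrieved by a list counts for that list even when min-max
    scaling maps its score to 0 (assumption of this sketch).
    """
    summed, hits = {}, {}
    for scores in result_lists:
        for doc, s in min_max_normalize(scores).items():
            summed[doc] = summed.get(doc, 0.0) + s
            hits[doc] = hits.get(doc, 0) + 1
    fused = {doc: summed[doc] * hits[doc] for doc in summed}
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from the three indexes for one query.
topical = {"d1": 2.0, "d2": 1.0, "d3": 0.5}
spatial = {"d1": 0.4, "d3": 0.8}
temporal = {"d3": 1.0}
fused = comb_mnz([topical, spatial, temporal])
```

In this toy case d3 is only the weakest topical match, yet it ranks first because all three indexes retrieve it; a filtering approach would not produce such a graded combination.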
4. Experiments – Case Study with PIV GIRS


Case Study: MIDR_2010 collection
➔   Building Qrels: 12 volunteers (thanks!!!)
      ➔   31 topics
      ➔   5645 documents (= paragraphs)
      ➔   Qrels: relevance judgments ∈ {0;1;2;3;4}
      ➔   Map for tracking spatial information


                                                              27
4. Experiments – Hypothesis Validated


Analysis of Collected Data
➔   IRS Evaluation
      ➔   ResultsList × Qrels → trec_eval → NDCG
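
As an illustration of this step, the sketch below serializes qrels and a result list into the plain-text formats trec_eval reads (qrels: `topic iteration docno relevance`; run: `topic Q0 docno rank score tag`). The helper names, the run tag and the file names in the note are hypothetical; how the authors produced their files is not described on the slides.

```python
def qrels_lines(qrels):
    """qrels: topic id -> {doc id -> grade}. Yields 'topic 0 docno grade' lines."""
    for topic in sorted(qrels):
        for doc, grade in sorted(qrels[topic].items()):
            yield f"{topic} 0 {doc} {grade}"


def run_lines(runs, tag="girs"):
    """runs: topic id -> list of (doc id, score), best first.

    Yields 'topic Q0 docno rank score tag' lines.
    """
    for topic in sorted(runs):
        for rank, (doc, score) in enumerate(runs[topic], start=1):
            yield f"{topic} Q0 {doc} {rank} {score:.4f} {tag}"


# Hypothetical one-topic example.
qrels_text = list(qrels_lines({"1": {"d2": 4, "d1": 0}}))
run_text = list(run_lines({"1": [("d2", 1.5), ("d1", 0.25)]}, tag="demo"))
```

With both files written to disk, a recent trec_eval build can report the measure with something like `trec_eval -m ndcg qrels.txt run.txt`.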


➔   Results: geographic IRS most effective




Hypothesis validated


                                                      28
Evaluation Framework for Geographic IR Systems


Conclusions and Future Works (1/2)
➔   Evaluation Framework for Geographic IR Systems
      ➔   Reusable
      ➔   Generalizable for more dimensions: confidence,
            freshness, ... [Costa Pereira et al., 2009]
      ➔   Relevance is not gradual per dimension


➔   Case Study with PIV System
      ➔   Creation of a specific test collection (≥ 25 topics)
      ➔   French test collection
      ➔   Limited collection (number of documents)




                                                                 31
Evaluation Framework for Geographic IR Systems


Conclusions and Future Works (2/2)
➔   Hypothesis Validated
      ➔   The 3 dimensions improve IR (+66.5%)


➔   Future Works
      ➔   More precise analysis: by query
      ➔   Quantify PIV improvements: various index combinations
      ➔   Organize a GIRS evaluation campaign: anyone interested?




                                                                    32
ECDL 2010
6-10 September 2010




                           Thank you!




              Damien Palacio - damien.palacio@univ-pau.fr   33
Spatial Interface




                    34
Temporal Interface




                     36
Spatial Tiling




                 38

Contenu connexe

Similaire à ECDL 2010 - Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study

Team management presentation3
Team management presentation3Team management presentation3
Team management presentation3
John Martin
 
Mining Large-Scale Temporal Dynamics with Hadoop
Mining Large-Scale Temporal Dynamics with HadoopMining Large-Scale Temporal Dynamics with Hadoop
Mining Large-Scale Temporal Dynamics with Hadoop
DataWorks Summit
 
Intro to CCSS - East China 2-
Intro to CCSS - East China 2-Intro to CCSS - East China 2-
Intro to CCSS - East China 2-
Laura Chambless
 
Real World Application Performance with MongoDB
Real World Application Performance with MongoDBReal World Application Performance with MongoDB
Real World Application Performance with MongoDB
MongoDB
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Grant Ingersoll
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Grant Ingersoll
 

Similaire à ECDL 2010 - Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study (17)

Pr 005 qa_workshop
Pr 005 qa_workshopPr 005 qa_workshop
Pr 005 qa_workshop
 
Team management presentation3
Team management presentation3Team management presentation3
Team management presentation3
 
QUERY AS REGION PARTITION IN MANAGING MOVING OBJECTS FOR CONCURRENT CONTINUOU...
QUERY AS REGION PARTITION IN MANAGING MOVING OBJECTS FOR CONCURRENT CONTINUOU...QUERY AS REGION PARTITION IN MANAGING MOVING OBJECTS FOR CONCURRENT CONTINUOU...
QUERY AS REGION PARTITION IN MANAGING MOVING OBJECTS FOR CONCURRENT CONTINUOU...
 
PMICOS Webinar: Building a Sound Schedule in an Enterprise Environment
PMICOS Webinar: Building a Sound Schedule in an Enterprise EnvironmentPMICOS Webinar: Building a Sound Schedule in an Enterprise Environment
PMICOS Webinar: Building a Sound Schedule in an Enterprise Environment
 
Mining Large-Scale Temporal Dynamics with Hadoop
Mining Large-Scale Temporal Dynamics with HadoopMining Large-Scale Temporal Dynamics with Hadoop
Mining Large-Scale Temporal Dynamics with Hadoop
 
Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]Resource Aware Scheduling for Hadoop [Final Presentation]
Resource Aware Scheduling for Hadoop [Final Presentation]
 
Intro to CCSS - East China 2-
Intro to CCSS - East China 2-Intro to CCSS - East China 2-
Intro to CCSS - East China 2-
 
Real World Application Performance with MongoDB
Real World Application Performance with MongoDBReal World Application Performance with MongoDB
Real World Application Performance with MongoDB
 
Pinpoint Ceph Bottleneck Out of Cluster Behavior Mists - Yingxin Cheng
Pinpoint Ceph Bottleneck Out of Cluster Behavior Mists - Yingxin ChengPinpoint Ceph Bottleneck Out of Cluster Behavior Mists - Yingxin Cheng
Pinpoint Ceph Bottleneck Out of Cluster Behavior Mists - Yingxin Cheng
 
Real World Cognition Loop for IoT
Real World Cognition Loop for IoTReal World Cognition Loop for IoT
Real World Cognition Loop for IoT
 
[SOCRS2013]Differential Context Modeling in Collaborative Filtering
[SOCRS2013]Differential Context Modeling in Collaborative Filtering[SOCRS2013]Differential Context Modeling in Collaborative Filtering
[SOCRS2013]Differential Context Modeling in Collaborative Filtering
 
Show observe and tell giang nguyen
Show observe and tell   giang nguyenShow observe and tell   giang nguyen
Show observe and tell giang nguyen
 
Cognitive Ability Effects on Effort in Web Search & Navigation by Gwizdka
Cognitive Ability Effects on Effort in Web Search & Navigation by GwizdkaCognitive Ability Effects on Effort in Web Search & Navigation by Gwizdka
Cognitive Ability Effects on Effort in Web Search & Navigation by Gwizdka
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
 
Webinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database ReplacementWebinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database Replacement
 
Research and collection of data
Research and collection of dataResearch and collection of data
Research and collection of data
 

Plus de Guillaume Cabanac

Adoption de l’identifiant ORCID : le cas des universités toulousaines
Adoption de l’identifiant ORCID : le cas des universités toulousainesAdoption de l’identifiant ORCID : le cas des universités toulousaines
Adoption de l’identifiant ORCID : le cas des universités toulousaines
Guillaume Cabanac
 
Interroger la science
Interroger la scienceInterroger la science
Interroger la science
Guillaume Cabanac
 
« T'as pensé à retweeter mon article ? » Enjeux, limites et critique de la bi...
« T'as pensé à retweeter mon article ? » Enjeux, limites et critique de la bi...« T'as pensé à retweeter mon article ? » Enjeux, limites et critique de la bi...
« T'as pensé à retweeter mon article ? » Enjeux, limites et critique de la bi...
Guillaume Cabanac
 

Plus de Guillaume Cabanac (20)

Adoption de l’identifiant ORCID : le cas des universités toulousaines
Adoption de l’identifiant ORCID : le cas des universités toulousainesAdoption de l’identifiant ORCID : le cas des universités toulousaines
Adoption de l’identifiant ORCID : le cas des universités toulousaines
 
Dépollution de la littérature scientifique : traque d’expression torturées ...
Dépollution de la littérature scientifique : traque d’expression torturées ...Dépollution de la littérature scientifique : traque d’expression torturées ...
Dépollution de la littérature scientifique : traque d’expression torturées ...
 
Interroger la science
Interroger la scienceInterroger la science
Interroger la science
 
Valoriser le capital documentaire (en sommeil) d’une organisation : exploitat...
Valoriser le capital documentaire (en sommeil) d’une organisation : exploitat...Valoriser le capital documentaire (en sommeil) d’une organisation : exploitat...
Valoriser le capital documentaire (en sommeil) d’une organisation : exploitat...
 
Comment analyser une mobilisation collective dans les réseaux socionumériques...
Comment analyser une mobilisation collective dans les réseaux socionumériques...Comment analyser une mobilisation collective dans les réseaux socionumériques...
Comment analyser une mobilisation collective dans les réseaux socionumériques...
 
Gender as a Variable to Study Academic Writing
Gender as a Variable to Study Academic WritingGender as a Variable to Study Academic Writing
Gender as a Variable to Study Academic Writing
 
Prospection de textes scientifiques : vision prospective
Prospection de textes scientifiques : vision prospectiveProspection de textes scientifiques : vision prospective
Prospection de textes scientifiques : vision prospective
 
Questionner le texte scientifique pour caractériser la science et l'innovation
Questionner le texte scientifique pour caractériser la science et l'innovationQuestionner le texte scientifique pour caractériser la science et l'innovation
Questionner le texte scientifique pour caractériser la science et l'innovation
 
Le carnet de l'avent de la sociologie francophone sur Twitter : réseaux et al...
Le carnet de l'avent de la sociologie francophone sur Twitter : réseaux et al...Le carnet de l'avent de la sociologie francophone sur Twitter : réseaux et al...
Le carnet de l'avent de la sociologie francophone sur Twitter : réseaux et al...
 
Interroger le texte scientifique
Interroger le texte scientifiqueInterroger le texte scientifique
Interroger le texte scientifique
 
The promises of web scrapping: Mining the web for relational data about artists
The promises of web scrapping: Mining the web for relational data about artistsThe promises of web scrapping: Mining the web for relational data about artists
The promises of web scrapping: Mining the web for relational data about artists
 
Émergence de l’open access « gris » : LibGen et Sci-Hub comme filières clande...
Émergence de l’open access « gris » : LibGen et Sci-Hub comme filières clande...Émergence de l’open access « gris » : LibGen et Sci-Hub comme filières clande...
Émergence de l’open access « gris » : LibGen et Sci-Hub comme filières clande...
 
Confrontation à la perception humaine de mesures de similarité entre membres
Confrontation à la perception humaine de mesures de similarité entre membres Confrontation à la perception humaine de mesures de similarité entre membres
Confrontation à la perception humaine de mesures de similarité entre membres
 
« T'as pensé à retweeter mon article ? » Enjeux, limites et critique de la bi...
« T'as pensé à retweeter mon article ? » Enjeux, limites et critique de la bi...« T'as pensé à retweeter mon article ? » Enjeux, limites et critique de la bi...
« T'as pensé à retweeter mon article ? » Enjeux, limites et critique de la bi...
 
Émergence de l’open access « gris » : LibGen et Sci-Hub
Émergence de l’open access « gris » : LibGen et Sci-HubÉmergence de l’open access « gris » : LibGen et Sci-Hub
Émergence de l’open access « gris » : LibGen et Sci-Hub
 
Sur les étagères des bibliothèques numériques clandestines:
Sur les étagères des bibliothèques numériques clandestines: Sur les étagères des bibliothèques numériques clandestines:
Sur les étagères des bibliothèques numériques clandestines:
 
Les altmetrics : estimer l'engouement pour la recherche sur les médias sociaux
Les altmetrics : estimer l'engouement pour la recherche sur les médias sociauxLes altmetrics : estimer l'engouement pour la recherche sur les médias sociaux
Les altmetrics : estimer l'engouement pour la recherche sur les médias sociaux
 
A Journey in Scientometrics: quantitative studies of science at the crossroad...
A Journey in Scientometrics: quantitative studies of science at the crossroad...A Journey in Scientometrics: quantitative studies of science at the crossroad...
A Journey in Scientometrics: quantitative studies of science at the crossroad...
 
Bibliogifts ? Les bibliothèques clandestines de l'édition scientifique
Bibliogifts ? Les bibliothèques clandestines de l'édition scientifiqueBibliogifts ? Les bibliothèques clandestines de l'édition scientifique
Bibliogifts ? Les bibliothèques clandestines de l'édition scientifique
 
Le renfort des liens forts - dynamique relationnelle du coauthorship
Le renfort des liens forts - dynamique relationnelle du coauthorshipLe renfort des liens forts - dynamique relationnelle du coauthorship
Le renfort des liens forts - dynamique relationnelle du coauthorship
 

Dernier

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Dernier (20)

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil

ECDL 2010 - Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study

  • 1. ECDL 2010, 6-10 September 2010. Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study. Damien Palacio, Guillaume Cabanac, Christian Sallaberry, Gilles Hubert. Damien Palacio - damien.palacio@univ-pau.fr
  • 2. Outline
      1. Motivation: Topical IR → Geographic IR. Hypothesis: GIRS > IRS
      2. Context and Issue: IRS evaluation. Current evaluation frameworks = partial
      3. Contribution: GIRS evaluation framework
      4. Experiments: Case study with PIV GIRS. Hypothesis validated
      5. Conclusion and Future Works
  • 4. 1. Motivation – Why Geographic IR? Geographic Information Retrieval
      ➔ Query = ''trip around Glasgow in summer 2010''
      ➔ Search Engines
          ➔ Topical: term ∈ {trip, Glasgow, summer, 2010}
          ➔ Geographic: spatial ∈ {citiesNearGlasgow ...}, temporal ∈ {21june .. 22sept 2010}, term ∈ {trip, Glasgow, summer, 2010}
      ➔ ≈ 1/6 of Queries = Geographic Queries
          ➔ Excite (Sanderson et al., 2004)
          ➔ AOL (Gan et al., 2008)
          ➔ Yahoo! (Jones et al., 2008)
      ➔ A current and realistic issue
  • 5-7. 1. Motivation – Why Geographic IR? A Geographic IRS: How Does It Work?
      ➔ 3 Dimensions to Process: spatial, temporal and topical
      ➔ 1 Index per Dimension
          ➔ Topical: bag of words, vector space model, ...
          ➔ Spatial: named entity recognition, ...
          ➔ Temporal: named entity recognition, ...
      ➔ Spatial Processing (figure)
      ➔ Retrieval
          ➔ Usually by filtering (STEWARD, SPIRIT, CITER, …)
      ➔ Issue: Performance of GIRS vs. topical IRS
      ➔ Hypothesis: Geographic IRS perform better than topical IRS
  • 9-16. 2. Context and Issue: IRS Partial Evaluation. Evaluating an IR System
      ➔ System = efficiency + effectiveness
          ➔ Efficiency: computation time, storage needed
          ➔ Effectiveness: quality
          ➔ Diagram labels: Geo IR literature, Topical IR literature
      ➔ Effectiveness Evaluation (diagram over the Temporal, Topical and Spatial dimensions)
          ➔ TREC, CLEF, ... (topical)
          ➔ TempEval (temporal)
          ➔ Bucher et al. (2005), GeoCLEF (geographic)
          ➔ Evaluation framework proposed (the 3 dimensions)
  • 18. 3. Proposition – GIRS Evaluation Framework. Evaluation Framework for the 3 Dimensions (1/2)
      ➔ Goal: measuring GIRS quality
      ➔ Means: building on the TREC framework (1992-)
          ➔ ''Cranfield'' methodology
          ➔ Test collection
              ➔ Corpus
              ➔ ≥ 25 Topics
              ➔ Qrels
          ➔ Measures: P@X, MAP, NDCG, ... [Voorhees, 2007]
  • 19-21. 3. Proposition – GIRS Evaluation Framework. Evaluation Framework for the 3 Dimensions (2/2)
      ➔ TREC Framework Extension
          ➔ Test collection covering the 3 dimensions
              ➔ ≥ 25 Topics
              ➔ Corpus
              ➔ Gradual qrels
          ➔ + geographic resources
      ➔ About qrels
          ➔ Relevance (doc, topic) ∈ {0;1;2;3;4}, from ''no dimension'' up to ''3 dimensions + global'' (= satisfied topic)
          ➔ Example (3 dimensions): Topic: ''trip around Glasgow''; Doc: trip + Bob born in Dumbarton
          ➔ Principle: ''the more satisfied dimensions there are, the better it is''
      ➔ Gradual qrels aware measure: Normalized Discounted Cumulative Gain [Järvelin & Kekäläinen, 2002]
          ➔ By topic: NDCG for each topic
          ➔ Global: meanNDCG for the system
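The graded qrels and the NDCG measure named on these slides can be sketched in a few lines. This is only an illustration under stated assumptions: the `relevance_grade` mapping (grades 1 to 3 equal to the number of satisfied dimensions) is a hypothetical reading of the {0;1;2;3;4} scale, the discount is the common `log2(rank + 1)` variant, and the cutoff `k` and all function names are invented for the example.

```python
import math
from typing import Dict, List


def relevance_grade(satisfied_dimensions: int, topic_satisfied: bool) -> int:
    """Hypothetical mapping onto the {0;1;2;3;4} scale (an assumption, not the authors' rule).

    0 = no dimension satisfied, 4 = the 3 dimensions plus the topic as a whole.
    """
    if not 0 <= satisfied_dimensions <= 3:
        raise ValueError("satisfied_dimensions must be between 0 and 3")
    if satisfied_dimensions == 3 and topic_satisfied:
        return 4
    return satisfied_dimensions


def dcg(gains: List[int]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_docs: List[str], qrels: Dict[str, int], k: int) -> float:
    """NDCG for one topic; unjudged documents count as grade 0."""
    gains = [qrels.get(doc, 0) for doc in ranked_docs[:k]]
    ideal = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(runs: Dict[str, List[str]], qrels: Dict[str, Dict[str, int]], k: int) -> float:
    """Average of the per-topic NDCG values, i.e. one score for the whole system."""
    scores = [ndcg_at_k(docs, qrels.get(topic, {}), k) for topic, docs in runs.items()]
    return sum(scores) / len(scores) if scores else 0.0
```

For instance, `ndcg_at_k(["d1", "d3", "d2"], {"d1": 4, "d2": 0, "d3": 2}, 3)` evaluates a ranking that is already in ideal order, so it yields 1.0; swapping `d2` and `d3` lowers the score.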
  • 23. 4. Experiments – Case Study with PIV GIRS. Case Study: PIV System
      ➔ Indexing: 1 index per dimension
          ➔ Topical = Terrier IRS [Ounis et al., 2005]
          ➔ Spatial = map segmentation into tiles
          ➔ Temporal = timeline segmentation into tiles
      ➔ Retrieval
          ➔ Result document list for each index
          ➔ Results combination with CombMNZ [Fox & Shaw, 1993; Lee, 1997]
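The slide only says that the map and the timeline are segmented into tiles; it does not describe the tiling itself. The sketch below is therefore purely illustrative: the regular one-degree grid, the one-month timeline tiles, and the names `spatial_tiles`, `temporal_tiles` and `TILE_DEG` are all assumptions made for the example, not a description of PIV.

```python
import math
from datetime import date
from typing import Set, Tuple

TILE_DEG = 1.0  # hypothetical tile size in degrees


def spatial_tiles(min_lon: float, min_lat: float, max_lon: float, max_lat: float,
                  tile_deg: float = TILE_DEG) -> Set[Tuple[int, int]]:
    """Return the (column, row) grid tiles intersected by a bounding box."""
    cols = range(math.floor(min_lon / tile_deg), math.floor(max_lon / tile_deg) + 1)
    rows = range(math.floor(min_lat / tile_deg), math.floor(max_lat / tile_deg) + 1)
    return {(c, r) for c in cols for r in rows}


def temporal_tiles(start: date, end: date) -> Set[Tuple[int, int]]:
    """Return the (year, month) timeline tiles intersected by a date interval."""
    tiles = set()
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        tiles.add((year, month))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return tiles


# A query and a document match on a dimension when their tile sets intersect.
query_period = temporal_tiles(date(2010, 6, 21), date(2010, 9, 22))
doc_period = temporal_tiles(date(2010, 8, 1), date(2010, 8, 15))
overlap = query_period & doc_period
```

Under these assumptions, indexing a document amounts to storing its tile sets, and matching reduces to a set intersection per dimension.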
  • 24-26. 4. Experiments – Case Study with PIV GIRS. CombMNZ Principle [Fox & Shaw, 1993; Lee, 1997] (figure)
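The figure for these slides did not survive extraction, so here is a minimal sketch of the CombMNZ principle as it is usually stated: a document's fused score is the sum of its scores across the result lists, multiplied by the number of lists in which it has a non-zero score. The sketch assumes the per-index scores are already normalized to a comparable range; the function names and the sample scores are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def comb_mnz(result_lists: List[Dict[str, float]]) -> Dict[str, float]:
    """Fuse several {doc_id: score} result lists with CombMNZ.

    Assumes the scores of each list are already normalized and comparable.
    """
    total = defaultdict(float)  # sum of scores over all lists
    hits = defaultdict(int)     # number of lists with a non-zero score
    for scores in result_lists:
        for doc, score in scores.items():
            total[doc] += score
            if score > 0:
                hits[doc] += 1
    return {doc: total[doc] * hits[doc] for doc in total}


def ranked(fused: Dict[str, float]) -> List[str]:
    """Document ids sorted by decreasing fused score."""
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical result lists from the topical, spatial and temporal indexes.
topical = {"d1": 0.9, "d2": 0.4}
spatial = {"d1": 0.5, "d3": 0.8}
temporal = {"d1": 0.2, "d2": 0.6}
fused = comb_mnz([topical, spatial, temporal])
```

The multiplication by the hit count is what rewards documents retrieved on several dimensions: here `d1` appears in all three lists and ends up well ahead of `d3`, even though `d3` has the highest single spatial score.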
  • 27. 4. Experiments – Case Study with PIV GIRS. Case Study: MIDR_2010 collection
      ➔ Building Qrels: 12 volunteers (thanks!!!)
          ➔ 31 topics
          ➔ 5645 documents (paragraphs)
          ➔ Qrels: relevance judgments ∈ {0;1;2;3;4}
          ➔ Map for tracking spatial information
  • 28-29. 4. Experiments – Hypothesis Validated. Analysis of Collected Data
      ➔ IRS Evaluation: trec_eval (ResultsList × Qrels → NDCG)
      ➔ Results: geographic IRS most effective
      ➔ Hypothesis validated
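trec_eval takes two plain-text inputs: a qrels file (`topic iteration docno relevance`) and a results file (`topic Q0 docno rank score run_tag`). The helper below sketches how graded judgments and a result list could be serialized into those line formats; the function names, topic and document ids, and the `piv` run tag are hypothetical.

```python
from typing import Dict, List, Tuple


def qrels_lines(qrels: Dict[str, Dict[str, int]]) -> List[str]:
    """One 'topic iteration docno relevance' line per judgment (iteration set to 0)."""
    return [
        f"{topic} 0 {doc} {rel}"
        for topic in sorted(qrels)
        for doc, rel in sorted(qrels[topic].items())
    ]


def run_lines(run: Dict[str, List[Tuple[str, float]]], tag: str) -> List[str]:
    """One 'topic Q0 docno rank score tag' line per retrieved document."""
    lines = []
    for topic in sorted(run):
        by_score = sorted(run[topic], key=lambda pair: pair[1], reverse=True)
        for rank, (doc, score) in enumerate(by_score, start=1):
            lines.append(f"{topic} Q0 {doc} {rank} {score:.4f} {tag}")
    return lines


# Hypothetical judgments on the {0;1;2;3;4} scale and a hypothetical result list.
example_qrels = qrels_lines({"1": {"d1": 4, "d2": 0}})
example_run = run_lines({"1": [("d2", 0.2), ("d1", 0.8)]}, "piv")
```

With the two files written to disk, recent trec_eval releases should report NDCG with something like `trec_eval -m ndcg qrels.txt run.txt`; the exact options depend on the installed version, so check its help output.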
  • 31. Evaluation Framework for Geographic IR Systems. Conclusions and Future Works (1/2)
      ➔ Evaluation Framework for Geographic IR Systems
          ➔ Reusable
          ➔ Generalizable to more dimensions: confidence, freshness, ... [Costa Pereira et al., 2009]
          ➔ No gradual relevance per dimension
      ➔ Case Study with PIV System
          ➔ Creation of a specific test collection (≥ 25 topics)
          ➔ French test collection
          ➔ Limited collection (number of documents)
  • 32. Evaluation Framework for Geographic IR Systems. Conclusions and Future Works (2/2)
      ➔ Hypothesis Validated
          ➔ The 3 dimensions improve IR (+66.5%)
      ➔ Future Works
          ➔ More precise analysis: by query
          ➔ Quantify PIV improvements: various index combinations
          ➔ Organize a GIRS evaluation campaign: anyone interested?
  • 33. Thank you! Damien Palacio - damien.palacio@univ-pau.fr