Institute for Web Science and Technologies
                        University of Koblenz ▪ Landau, Germany




SPLENDID: SPARQL Endpoint Federation
     Exploiting VOID Descriptions


          Olaf Görlitz, Steffen Staab
Motivation



    How to access a large number of linked data sources?




WeST Institute                  Olaf Görlitz
People and Knowledge Networks   COLD 2011, Bonn, Germany   Slide 2
Data Integration Approaches

   Data Warehouse                                    Link Traversal

    • Efficient query execution                       • Live data access
    • Complete results                                • Flexible / on demand
    • Data copies                                     • Incomplete results
    • Inflexible                                      • Biased by starting point

Our Approach

                                Data Federation

                                                        Live data access
                                                        Flexible source integration
                                                        Effective query planning
                                                        Complete results


Hypothesis:
Efficient query federation is possible using core Semantic
Web technology (i.e., SPARQL endpoints and VoID descriptions)


VoID: "Vocabulary of Interlinked Datasets"

Sections of the example VoID description:

    General information
    Basic statistics         triples = 732744
    Type statistics          chebi:Compound = 50477
    Predicate statistics     bio:formula = 39555
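
The per-dataset counts above are the raw material for source selection. As a minimal sketch (not the SPLENDID implementation), the statistics could be inverted into lookup tables keyed by predicate and by type. The dictionary layout, the source name `chebi`, and the function `build_indexes` are assumptions made for illustration; only the three counts come from the slide.

```python
from collections import defaultdict

# Hypothetical in-memory form of one VoID description.
# Only the three numbers are taken from the slide; the layout is assumed.
void_stats = {
    "chebi": {
        "triples": 732744,
        "types": {"chebi:Compound": 50477},
        "predicates": {"bio:formula": 39555},
    },
}


def build_indexes(stats):
    """Invert per-source statistics into predicate->sources and type->sources maps."""
    predicate_index = defaultdict(dict)
    type_index = defaultdict(dict)
    for source, description in stats.items():
        for predicate, count in description["predicates"].items():
            predicate_index[predicate][source] = count
        for rdf_type, count in description["types"].items():
            type_index[rdf_type][source] = count
    return dict(predicate_index), dict(type_index)


predicate_index, type_index = build_indexes(void_stats)
```

Keeping the counts next to the source names means the same tables can later feed a cost estimate, not only the yes/no question of whether a source is relevant.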




Distributed Query Processing




Contribution:
Apply Best Practices of RDBMS for RDF Federation

                                                           http://code.google.com/p/rdffederator/
Query Example



        Which drugs are categorized as micronutrients?




       SELECT ?drug ?title WHERE {
         ?drug drugbank:drugCategory category:micronutrient .
         ?drug drugbank:casRegistryNumber ?id .
         ?keggDrug rdf:type kegg:Drug .
         ?keggDrug bio2rdf:xRef ?id .
         ?keggDrug purl:title ?title .
       }




Query Processing


          Source Selection             Join Optimization   Query Execution




       SELECT ?drug ?title WHERE {
         ?drug drugbank:drugCategory category:micronutrient .
         ?drug drugbank:casRegistryNumber ?id .
         ?keggDrug rdf:type kegg:Drug .
         ?keggDrug bio2rdf:xRef ?id .
         ?keggDrug purl:title ?title .
       }




Query Processing


          Source Selection             Join Optimization       Query Execution



       Step 1: Index-based source mapping

       SELECT ?drug ?title WHERE {
         ?drug drugbank:drugCategory category:micronutrient .              → drugbank
         ?drug drugbank:casRegistryNumber ?id .                            → drugbank
         ?keggDrug rdf:type kegg:Drug .                                    → kegg
         ?keggDrug bio2rdf:xRef ?id .                                      → kegg
         ?keggDrug purl:title ?title .                                     → kegg, dbpedia, Chebi
       }

         predicate-index                                   type-index
         drugbank:drugCategory → drugbank                  kegg:Drug → kegg
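
The mapping step can be sketched as a lookup per triple pattern: the type index is consulted for `rdf:type` patterns with a bound object, and the predicate index otherwise. This is an illustrative reading of the slide, not SPLENDID's code. Only the two index entries shown on the slide come from the source; the other entries, the tuple representation of patterns, and the name `select_sources` are assumptions chosen to reproduce the arrows in the annotated query.

```python
# Index entries: the first predicate entry and the type entry are from the slide,
# the rest are assumed so that the example reproduces the slide's mapping.
PREDICATE_INDEX = {
    "drugbank:drugCategory": {"drugbank"},
    "drugbank:casRegistryNumber": {"drugbank"},
    "bio2rdf:xRef": {"kegg"},
    "purl:title": {"kegg", "dbpedia", "Chebi"},
}
TYPE_INDEX = {"kegg:Drug": {"kegg"}}
ALL_SOURCES = {"drugbank", "kegg", "dbpedia", "Chebi"}


def is_var(term):
    """SPARQL variables start with a question mark."""
    return term.startswith("?")


def select_sources(pattern):
    """Return the candidate sources for one (subject, predicate, object) pattern."""
    _, predicate, obj = pattern
    if is_var(predicate):
        # Nothing to look up: every source stays a candidate.
        return set(ALL_SOURCES)
    if predicate == "rdf:type" and not is_var(obj):
        return set(TYPE_INDEX.get(obj, set()))
    return set(PREDICATE_INDEX.get(predicate, set()))


QUERY = [
    ("?drug", "drugbank:drugCategory", "category:micronutrient"),
    ("?drug", "drugbank:casRegistryNumber", "?id"),
    ("?keggDrug", "rdf:type", "kegg:Drug"),
    ("?keggDrug", "bio2rdf:xRef", "?id"),
    ("?keggDrug", "purl:title", "?title"),
]

mapping = {pattern: select_sources(pattern) for pattern in QUERY}
```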




Query Processing


          Source Selection             Join Optimization   Query Execution



       Step 2: Refinement with ASK queries

       SELECT ?drug ?title WHERE {
         ?drug drugbank:drugCategory category:micronutrient .
         ?drug drugbank:casRegistryNumber ?id .
         ?keggDrug rdf:type kegg:Drug .
         ?keggDrug bio2rdf:xRef ?id .
         ?keggDrug purl:title ?title .
       }


        No index for subject / object values
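
Because the index says nothing about subject or object values, a pattern with a bound subject or object can be checked against each candidate endpoint with an ASK query. The sketch below shows one way to do that, under stated assumptions: `ask` is a caller-supplied function standing in for a real endpoint request, and `fake_ask` is a made-up stub used only to keep the example runnable offline.

```python
def is_var(term):
    """SPARQL variables start with a question mark."""
    return term.startswith("?")


def needs_ask(pattern):
    """Only patterns with a bound subject or object can be pruned by ASK."""
    subject, _, obj = pattern
    return not is_var(subject) or not is_var(obj)


def ask_query(pattern):
    """Build the ASK query text for a single triple pattern."""
    return "ASK { %s %s %s }" % pattern


def refine(pattern, candidates, ask):
    """Drop candidate sources whose endpoint answers the ASK query with false.

    `ask(source, query)` is hypothetical: it must return True or False.
    """
    if not needs_ask(pattern):
        return set(candidates)
    query = ask_query(pattern)
    return {source for source in candidates if ask(source, query)}


def fake_ask(source, query):
    """Stub endpoint: pretends that only drugbank has matching triples."""
    return source == "drugbank"


pattern = ("?drug", "drugbank:drugCategory", "category:micronutrient")
refined = refine(pattern, {"drugbank", "dbpedia"}, fake_ask)
```

Patterns with only variables in subject and object position are passed through unchanged, since an ASK query would add a round trip without narrowing anything the index has not already decided.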



Query Processing


          Source Selection             Join Optimization   Query Execution



       Step 3: Grouping triple patterns

       SELECT ?drug ?title WHERE {
         ?drug drugbank:drugCategory category:micronutrient .
         ?drug drugbank:casRegistryNumber ?id .                        } drugbank
         ?keggDrug rdf:type kegg:Drug .
         ?keggDrug bio2rdf:xRef ?id .                                  } kegg
         ?keggDrug purl:title ?title .                                 } kegg, dbpedia, Chebi
       }


        + grouping sameAs patterns
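
Grouping can be sketched as follows: patterns that map to the same single source are collected into one sub-query for that endpoint, while patterns with several candidate sources stay on their own. The function name `group_patterns` and the data layout are assumptions for illustration, and the additional grouping of sameAs patterns mentioned on the slide is not covered here.

```python
from collections import defaultdict


def group_patterns(mapping):
    """Split a pattern->sources mapping into per-source groups and leftovers.

    Returns (groups, remaining): `groups` maps a source to the patterns that
    only that source can answer; `remaining` lists multi-source patterns.
    """
    groups = defaultdict(list)
    remaining = []
    for pattern, sources in mapping.items():
        if len(sources) == 1:
            groups[next(iter(sources))].append(pattern)
        else:
            remaining.append((pattern, sources))
    return dict(groups), remaining


# Source mapping as annotated on the slide.
mapping = {
    ("?drug", "drugbank:drugCategory", "category:micronutrient"): {"drugbank"},
    ("?drug", "drugbank:casRegistryNumber", "?id"): {"drugbank"},
    ("?keggDrug", "rdf:type", "kegg:Drug"): {"kegg"},
    ("?keggDrug", "bio2rdf:xRef", "?id"): {"kegg"},
    ("?keggDrug", "purl:title", "?title"): {"kegg", "dbpedia", "Chebi"},
}

groups, remaining = group_patterns(mapping)
```

Each group can then be sent as a single request, so the join between its patterns is evaluated at the endpoint instead of in the federator.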



Join Order Optimization


          Source Selection             Join Optimization   Query Execution



    Dynamic Programming with statistics-based cost estimation
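
To make the idea concrete, here is a toy dynamic-programming join enumerator. It is a sketch under strong simplifying assumptions, not SPLENDID's optimizer: the cost of a plan is the sum of its estimated intermediate result sizes, every join uses one fixed selectivity, and the cardinalities in the example are invented. The name `best_plan` is hypothetical.

```python
from itertools import combinations


def best_plan(cardinalities, selectivity=0.01):
    """Return (cost, estimated_rows, plan) for joining all inputs.

    best[S] holds the cheapest plan found for the subset S; larger subsets
    are built by combining the best plans of two disjoint smaller subsets.
    """
    names = sorted(cardinalities)
    best = {frozenset([n]): (0.0, float(cardinalities[n]), n) for n in names}
    for size in range(2, len(names) + 1):
        for subset in map(frozenset, combinations(names, size)):
            members = sorted(subset)
            for k in range(1, size):
                for left in map(frozenset, combinations(members, k)):
                    right = subset - left
                    left_cost, left_rows, left_plan = best[left]
                    right_cost, right_rows, right_plan = best[right]
                    rows = left_rows * right_rows * selectivity
                    cost = left_cost + right_cost + rows
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, rows, (left_plan, right_plan))
    return best[frozenset(names)]


# Invented cardinalities for three sub-queries A, B and C.
cost, rows, plan = best_plan({"A": 1000, "B": 10, "C": 100})
```

With these numbers the enumerator joins the two small inputs first and adds the large one last, which is the behavior statistics-based cost estimation is meant to produce.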

                                     bind join /
                                     hash join
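
The two physical join operators differ in what they request from the remote side. The sketch below contrasts them on in-memory lists of variable bindings: a hash join retrieves both inputs in full and joins them locally, while a bind join uses each left-hand binding to request only matching rows. All binding values are invented placeholders, and `fetch_right` stands in for a SPARQL request with the join variable bound.

```python
def hash_join(left, right, var):
    """Join two fully retrieved binding lists on one shared variable."""
    table = {}
    for row in left:
        table.setdefault(row[var], []).append(row)
    return [{**l, **r} for r in right for l in table.get(r[var], [])]


def bind_join(left, fetch_right, var):
    """For each left binding, fetch only the right-hand rows that match it.

    `fetch_right(value)` is hypothetical: it returns the bindings of the
    remote sub-query with `var` bound to `value`.
    """
    results = []
    for l in left:
        for r in fetch_right(l[var]):
            results.append({**l, **r})
    return results


# Invented placeholder bindings for the two sub-queries of the example.
drugbank_rows = [
    {"?drug": "drug-1", "?id": "id-1"},
    {"?drug": "drug-2", "?id": "id-2"},
]
kegg_rows = [
    {"?keggDrug": "kegg-1", "?id": "id-1", "?title": "title-1"},
    {"?keggDrug": "kegg-9", "?id": "id-9", "?title": "title-9"},
]


def fetch_kegg(value):
    """Stub for a remote request: filters the local list instead."""
    return [row for row in kegg_rows if row["?id"] == value]


via_hash = hash_join(drugbank_rows, kegg_rows, "?id")
via_bind = bind_join(drugbank_rows, fetch_kegg, "?id")
```

Both operators return the same bindings; which one is cheaper depends on how many left-hand bindings there are compared with the size of the full right-hand result, which is what the cost estimation has to decide.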




Evaluation


   FedBench Evaluation Suite                                  Measuring
    • Life Science + Cross Domain Data                        • #data sources selected
    • different query characteristics                         • query execution time


Orthogonal state-of-the-art approaches:

                      DARQ                      AliBaba        FedX                            SPLENDID
 Statistics           ServiceDesc               –              –                               VoID
 Source selection     Statistics (predicates)   All sources    ASK queries                     Statistics + ASK queries
 Query optimization   DynProg                   Heuristics     Heuristics                      DynProg
 Query execution      Bind join                 Bind join      Bound join + parallelization    Bind join + hash join


Evaluation: Source Selection


          Source Selection                Join Optimization      Query Execution




                                owl:sameAs                    rdf:type


Evaluation: Query Optimization


          Source Selection             Join Optimization   Query Execution




Conclusion



                           Publish more VoID descriptions!



                   VoID-based query federation is efficient



What next?
 • Combination with FedX
 • Improving estimation and cost model
 • Integrating SPARQL 1.1 features


Editor's notes

  1. Pre-selected linked datasets Transparent query federation