Engineering optimisations in
query rewriting for OBDA
José Mora and Óscar Corcho
{jmora, ocorcho}@fi.upm.es
Facultad de Informática
Universidad Politécnica de Madrid
Campus de Montegancedo s/n
28660 Boadilla del Monte, Madrid, Spain
Index
• Background
• Query rewriting (QR) for OBDA
• Approaches
• Clause generation
• Selection function
• Optimisations in the rewriting
• Ontology preprocessing
• Constrain the searches
• Subsumption checks
• Prioritize some inferences
• Questions, comments and other feedback (you do this part)
Query rewriting for OBDA
Query Q → Rewriting → Q’
We use the ontology to produce a new query (Q’).
We can run this new query against the database and obtain all the answers, with the inference already done.
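The idea can be illustrated with a minimal sketch for the simplest possible case, assuming the ontology contains only atomic subclass axioms and the query asks for the instances of a single class. The class names and the `rewrite` function are hypothetical and are not taken from any of the systems discussed in these slides.

```python
from collections import deque

def rewrite(query_class, subclass_axioms):
    """Return every class whose instances answer a query for `query_class`.

    `subclass_axioms` is a list of (sub, sup) pairs meaning sub ⊑ sup.
    The result is read as a union of conjunctive queries (UCQ), one
    atom per disjunct: Q'(x) <- C1(x), Q'(x) <- C2(x), ...
    """
    seen = {query_class}
    todo = deque([query_class])
    while todo:
        current = todo.popleft()
        for sub, sup in subclass_axioms:
            if sup == current and sub not in seen:
                seen.add(sub)
                todo.append(sub)
    return sorted(seen)

# Hypothetical ontology: PhDStudent ⊑ Student, Student ⊑ Person.
axioms = [("PhDStudent", "Student"), ("Student", "Person")]
ucq = rewrite("Person", axioms)
for cls in ucq:
    print(f"Q'(x) <- {cls}(x)")
```

Evaluating the three disjuncts directly on the database returns the same answers as evaluating the original query over the data plus the ontology, which is the point of the rewriting.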
Main approaches in the state of the art
Expressiveness                | Author              | System          | Output           | Date
ELHIO¬                        | Pérez-Urbina et al. | REQUIEM         | [R] Datalog, UCQ | 2009
Sticky-join [linear] Datalog± | Gottlob et al.      | Nyaya           | UCQ              | 2011
DL-Lite_R, DL-Lite_F          | Calvanese et al.    | QuOnto          | UCQ              | 2007
DL-Lite_R                     | Chortaras et al.    | Rapid           | UCQ              | 2011
DL-Lite_R [+EBox]             | Rosati et al.       | Presto & Prexto | NR-Datalog & UCQ | 2010 & 2012
Clause generation [Pérez-Urbina2010]
Selection function [Pérez-Urbina2010]
Optimisations in the rewriting
• The rewriting can be optimised in several ways
• Ontology preprocessing
• Constrain the searches
• Subsumption checks
• Prioritize inferences
A running example
Consider the following ontology:
With the following query:
A running example (in Datalog)
We can convert the ontology to a set of clauses.
Note the auxiliary predicate.
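The ontology and clauses from the running example are not reproduced here, so the following is only a sketch of what such a conversion can look like. It assumes two axiom shapes, A ⊑ B and A ⊑ ∃R.B, and uses the usual Skolemisation of the existential. The axiom encoding, the class names and the `clausify` function are hypothetical, and the sketch does not cover the auxiliary predicates mentioned above.

```python
from itertools import count

def clausify(axioms):
    """Turn toy axioms into Horn clauses written as 'head :- body' strings.

    Supported shapes (an assumption of this sketch):
      ("sub", A, B)        for A ⊑ B
      ("exists", A, R, B)  for A ⊑ ∃R.B
    """
    fresh = count(1)  # one fresh Skolem function per existential axiom
    clauses = []
    for axiom in axioms:
        if axiom[0] == "sub":
            _, a, b = axiom
            clauses.append(f"{b}(x) :- {a}(x)")
        elif axiom[0] == "exists":
            _, a, r, b = axiom
            skolem = f"f{next(fresh)}"
            # The unknown R-successor of x is named by the term skolem(x).
            clauses.append(f"{r}(x, {skolem}(x)) :- {a}(x)")
            clauses.append(f"{b}({skolem}(x)) :- {a}(x)")
        else:
            raise ValueError(f"unsupported axiom: {axiom!r}")
    return clauses

# Hypothetical axioms: Student ⊑ Person, Student ⊑ ∃attends.Course
example = clausify([
    ("sub", "Student", "Person"),
    ("exists", "Student", "attends", "Course"),
])
for clause in example:
    print(clause)
```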
Ontology preprocessing
• Some of the resolution steps do not need a query in order to be performed
• They can be done only once, for all queries
• The results can be stored; the increase in size is reasonable
• Evaluation performed with the usual ontologies, and a few more
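A minimal sketch of the "once for all queries" idea, assuming an ontology made only of unary rules of the form head(x) :- body(x). The rule encoding and the names `saturate` and `answer_classes` are hypothetical; real preprocessing works over a richer clause language.

```python
def saturate(rules):
    """Close a set of (body, head) unary rules under composition.

    From head2(x) :- head1(x) and head1(x) :- body(x) we infer
    head2(x) :- body(x). None of these steps mentions a query, so the
    result can be computed once and stored.
    """
    closed = set(rules)
    changed = True
    while changed:
        changed = False
        for body1, head1 in list(closed):
            for body2, head2 in list(closed):
                if head1 == body2 and body1 != head2 and (body1, head2) not in closed:
                    closed.add((body1, head2))
                    changed = True
    return closed

# Hypothetical ontology, preprocessed once (for example at load time).
ontology = {("PhDStudent", "Student"), ("Student", "Person")}
preprocessed = saturate(ontology)

def answer_classes(query_class, closed_rules):
    """Per-query work is now a single lookup over the stored rules."""
    return sorted({query_class} | {b for b, h in closed_rules if h == query_class})

print(answer_classes("Person", preprocessed))
```

In this toy case the stored result grows from two rules to three, and each query afterwards needs no further inference.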
In our example:
We can do some inferences (this is only a small sample) to obtain a preprocessed ontology.
Preprocessed ontology
We obtain the preprocessed ontology.
Note the lack of auxiliary predicates, and the presence of clauses beyond the forms mentioned as possible for the initial Datalog.
Constraining the searches
• Several of the searches for clauses can be constrained, which improves efficiency
• As we have seen, there are main premises and side premises, and a clause cannot work as both
• By keeping the clauses that may work as main premises separate from those that may work as side premises, we reduce the combinatorial search for valid resolutions
• This separation also lets us use selection functions that are simpler and handle more types of clauses
• By allowing more types of clauses we can remove some auxiliary predicates
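One way to picture the separation is a clause store with two indexes, sketched below. It assumes that a main premise is looked up by the predicate of its selected body atom and a side premise by the predicate of its head. The `ClauseStore` class and the example clauses are hypothetical, and clauses are kept as plain strings for brevity.

```python
from collections import defaultdict

class ClauseStore:
    """Keeps main and side premises apart, each indexed by predicate."""

    def __init__(self):
        self.main_by_pred = defaultdict(list)  # selected body predicate -> clauses
        self.side_by_pred = defaultdict(list)  # head predicate -> clauses

    def add_main(self, clause, selected_body_pred):
        self.main_by_pred[selected_body_pred].append(clause)

    def add_side(self, clause, head_pred):
        self.side_by_pred[head_pred].append(clause)

    def candidate_pairs(self):
        """Yield only (main, side) pairs whose predicates can resolve."""
        for pred, mains in self.main_by_pred.items():
            for main in mains:
                for side in self.side_by_pred.get(pred, []):
                    yield main, side

# Hypothetical clauses.
store = ClauseStore()
store.add_main("Q(x) :- Person(x)", "Person")
store.add_main("Q2(x) :- Course(x)", "Course")
store.add_side("Person(x) :- Student(x)", "Person")
store.add_side("Person(x) :- Teacher(x)", "Person")
store.add_side("Room(x) :- Lab(x)", "Room")

pairs = list(store.candidate_pairs())
print(len(pairs), "candidate pairs instead of", 2 * 3)
```

Only the pairs that share a predicate are ever considered, instead of every combination of a main premise with a side premise.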
Subsumption checks
Due to preprocessing, only a few inferences are needed to obtain the Datalog program.
Note that the last inferred clause subsumes the two other inferred clauses.
This means a reduction of 90% in this specific example.
Consider the following example:
The inferred clause subsumes the main premise.
If we keep the main premise, it will participate in many inferences.
Let’s look at the Datalog program:
Datalog program
There are some clauses that could participate in inferences
Subsumption checks
Since the inferred clause subsumes the main premise, we can delete the first clause: it is redundant, and any clause obtained from that premise will be subsumed too (by some other clause).
We should delete it ASAP.
This way we avoid a set of inferences that only produce useless subsumed clauses (73% in this example).
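The check itself can be sketched as follows, assuming function-free clauses and the standard notion of θ-subsumption: a clause subsumes another if some substitution maps its head onto the other head and its body into the other body. The encoding (atoms as tuples, variables as strings starting with "?") and the function names are hypothetical.

```python
def match_atom(pattern, target, subst):
    """Extend `subst` so that pattern under subst equals target, else None."""
    if pattern[0] != target[0] or len(pattern) != len(target):
        return None
    subst = dict(subst)
    for p, t in zip(pattern[1:], target[1:]):
        if p.startswith("?"):
            if subst.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return subst

def subsumes(general, specific):
    """True if `general` θ-subsumes `specific`.

    A clause is (head, body): head is an atom, body a list of atoms.
    Variables of `specific` are treated as constants while matching.
    """
    g_head, g_body = general
    s_head, s_body = specific
    start = match_atom(g_head, s_head, {})
    if start is None:
        return False

    def extend(i, subst):
        if i == len(g_body):
            return True
        for atom in s_body:
            nxt = match_atom(g_body[i], atom, subst)
            if nxt is not None and extend(i + 1, nxt):
                return True
        return False

    return extend(0, start)

def remove_subsumed(clauses):
    """Drop new clauses that are subsumed and old ones a new clause subsumes."""
    kept = []
    for c in clauses:
        if any(subsumes(k, c) for k in kept):
            continue
        kept = [k for k in kept if not subsumes(c, k)]
        kept.append(c)
    return kept

# Hypothetical clauses: the shorter one subsumes the longer one.
general = (("Q", "?x"), [("Person", "?x")])
specific = (("Q", "?x"), [("Person", "?x"), ("attends", "?x", "?y")])
print(remove_subsumed([specific, general]) == [general])
```

The second list comprehension in `remove_subsumed` is the case described above: a newly inferred clause removes an older clause it subsumes, so the older clause takes part in no further inferences.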
Rewritten query
In the end, the rewritten query is only this:
Prioritizing some inferences
Deleting subsumed clauses reduces the number of clauses handled: it’s more efficient, but we need a subsuming clause.
Producing subsuming clauses first allows us to delete subsumed clauses: some inferences lead there, some don’t; it’s about choosing the right ones.
We can use heuristics to try to produce subsuming clauses ASAP.
Shorter clauses are more likely to produce subsuming clauses.
Processing recently generated clauses first (similar to depth-first search) helps too.
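The two heuristics can be combined in a priority queue, sketched below under the assumption that a clause is a (head, body) pair and that "shorter" means fewer body atoms. The `ClauseQueue` class is hypothetical and is not taken from the authors' implementation.

```python
import heapq
from itertools import count

class ClauseQueue:
    """Pops shorter clauses first; among equal lengths, the newest first."""

    def __init__(self):
        self._heap = []
        self._clock = count()  # insertion counter, used as a timestamp

    def push(self, clause):
        head, body = clause
        # Smaller tuples pop first: fewer body atoms, then the most
        # recent insertion (the counter is negated).
        heapq.heappush(self._heap, (len(body), -next(self._clock), clause))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)

# Hypothetical clauses, pushed in this order.
a = ("Q", ["A", "B"])
b = ("Q", ["A"])
c = ("Q", ["C"])
d = ("Q", ["A", "B", "C"])

queue = ClauseQueue()
for clause in (a, b, c, d):
    queue.push(clause)

order = [queue.pop() for _ in range(len(queue))]
print(order)
```

Because the insertion counter is unique, ties are always broken before the clauses themselves would be compared.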
Evaluation (1/2)
Evaluation (2/2)
Full results at: http://www.oeg-upm.net/files/jmora/
Questions?
Questions, comments, suggestions: everything is welcome
