Dependencies
Making Ontology Based Data Access Work in Practice
Mariano Rodriguez-Muro and Diego Calvanese
{rodriguez,calvanese}@inf.unibz.it
KRDB Research Centre
Free University of Bozen-Bolzano
July, 2011
Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 1 / 33
The context
DL Ontologies
Description Logics:
• Formalisms for knowledge representation.
• Decidable fragments of FOL
• Base of OWL
• World is described by means of Concepts and Roles
Ontologies
• Intensional knowledge: TBox T.
• Extensional knowledge: ABox A.
OBDA with DL-Lite
A family of light-weight ontology languages
• DL-Lite_F concepts
B := A | ∃R
• DL-Lite_F roles
R := P | P−
• DL-Lite_F TBoxes
B_1 ⊑ B_2 | B_1 ⊑ ¬B_2 | (funct R)
• DL-Lite_F ABoxes
A(a) | R(a, b)
Query Answering
TBox:
Man ⊑ Person, Woman ⊑ Person, Person ⊑ ∃hasFather,
∃hasFather− ⊑ Person
ABox:
Man(mariano)
Queries:
q(x) ← Person(x), hasFather(x, y), Person(y)
Problem: Compute the certain answers of Q, denoted cert(Q, O).
The promise
We can do this as efficiently as answering DB queries, also in the virtual
setting.
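To make "certain answers" concrete, here is a minimal Python sketch that approximates cert(Q, O) for the example above with a bounded chase. It assumes the TBox is read as Man ⊑ Person, Woman ⊑ Person, Person ⊑ ∃hasFather, ∃hasFather− ⊑ Person. The tuple encoding of basic concepts, the function names, and the round bound are illustrative choices, not part of the slides, and this is not the rewriting-based technique the talk discusses.

```python
from itertools import count

# A basic concept is an atomic name ("Man") or ("exists", role, inverse).
TBOX = [
    ("Man", "Person"),
    ("Woman", "Person"),
    ("Person", ("exists", "hasFather", False)),
    (("exists", "hasFather", True), "Person"),
]


def members(concept, concepts, roles):
    """Individuals that belong to a basic concept in the given facts."""
    if isinstance(concept, str):
        return {a for (c, a) in concepts if c == concept}
    _, role, inverse = concept
    return {(o if inverse else s) for (r, s, o) in roles if r == role}


def chase(concepts, roles, tbox, rounds):
    """Apply the inclusions for a bounded number of rounds.

    Existential inclusions introduce fresh labelled nulls ("_:n1", ...).
    The full chase may be infinite, hence the bound.
    """
    concepts, roles = set(concepts), set(roles)
    fresh = count(1)
    for _ in range(rounds):
        new_concepts, new_roles = set(), set()
        for lhs, rhs in tbox:
            for a in members(lhs, concepts, roles):
                if a in members(rhs, concepts | new_concepts, roles | new_roles):
                    continue  # inclusion already satisfied for this individual
                if isinstance(rhs, str):
                    new_concepts.add((rhs, a))
                else:
                    _, role, inverse = rhs
                    null = f"_:n{next(fresh)}"
                    new_roles.add((role, null, a) if inverse else (role, a, null))
        if not (new_concepts or new_roles):
            break
        concepts |= new_concepts
        roles |= new_roles
    return concepts, roles


def answers(concepts, roles):
    """Evaluate q(x) <- Person(x), hasFather(x, y), Person(y); keep constants only."""
    persons = members("Person", concepts, roles)
    return {
        x
        for (r, x, y) in roles
        if r == "hasFather" and x in persons and y in persons
        and not x.startswith("_:")
    }


facts, role_facts = chase({("Man", "mariano")}, set(), TBOX, rounds=4)
certain = answers(facts, role_facts)
print(certain)
```

The ABox contains only Man(mariano), yet mariano is returned because the TBox forces a father who is a Person to exist. Materialising such facts is what the rewriting approaches on the following slides try to avoid.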
Query Answering with PerfectRef (2005)
Query:
q(x) ← Person(x), hasFather(x, y), Person(y)
Reformulation:
q(x) ← Person(x), hasFather(x, y), Person(y)
q(x) ← Person(x), hasFather(x, y), hasFather(z, y)
q(x) ← Person(x), hasFather(x, y)
q(x) ← Person(x), Person(x)
q(x) ← Person(x)
q(x) ← Person(x), hasFather(x, y), Man(y)
q(x) ← Person(x), hasFather(x, y), Woman(y)
q(x) ← hasFather(x, m), hasFather(x, y), Person(y)
q(x) ← hasFather(x, m), hasFather(x, y), hasFather(z, y)
q(x) ← hasFather(x, m), hasFather(x, y)
q(x) ← hasFather(x, m), Person(x)
q(x) ← hasFather(x, m), hasFather(x, t)
q(x) ← hasFather(x, m)
q(x) ← hasFather(x, m), hasFather(x, y), Man(y)
Alternatives
• Improved version of PerfectRef (2007-2011)
• RQR (Urbina et al., 2007)
Too many unions; cannot execute!
• PRESTO (Rosati et al., 2010)
Better, but eventually it breaks.
• Combined Approach (Kontchakov et al., 2010)
Fast, but too much data and too much time.
What can we do?
Query Answering
It is not only about existential constants
Query:
q(x, y) ← Person(x), hasFather(x, y), Person(y)
Reformulation:
q(x, y) ← Person(x), hasFather(x, y), Person(y)
q(x, y) ← Person(x), hasFather(x, y), hasFather(z, y)
q(x, y) ← Person(x), hasFather(x, y), Man(y)
q(x, y) ← Person(x), hasFather(x, y), Woman(y)
q(x, y) ← hasFather(x, m), hasFather(x, y), Person(y)
q(x, y) ← hasFather(x, m), hasFather(x, y), hasFather(z, y)
q(x, y) ← hasFather(x, m), hasFather(x, y), Man(y)
q(x, y) ← hasFather(x, m), hasFather(x, y), Woman(y)
q(x, y) ← Man(x), hasFather(x, y), Person(y)
q(x, y) ← Man(x), hasFather(x, y), hasFather(z, y)
q(x, y) ← Man(x), hasFather(x, y), Man(y)
q(x, y) ← Man(x), hasFather(x, y), Woman(y)
q(x, y) ← Woman(x), hasFather(x, y), Person(y)
The full picture: Ontology Based Data Access
[Figure: OBDA architecture with the labels User, Queries, Ontology, Mappings, and Source]
To deal with OBDA we need to consider:
• If in the backend we have RDBMSs, we cannot go beyond their
capabilities.
• All systems are composed of T, D = ⟨R, I⟩, and M.
First Observation
Is my data complete?
Completeness of A
The TBox says: Manager ⊑ Employee
In the ABox: all Managers are already employees.
In any realistic scenario:
• We don’t use arbitrary sources;
• Intersection of semantics is reflected in completeness (e.g., no need to
chase, expand or rewrite)
• This happens a lot!
Keyword
Redundancy
Second Observation
There are no ABoxes
THERE ARE NO ABOXES!
Any ontology-based query answering system today:
• Uses relational DBs to store the ABox data;
• In such a D, both R and I can be manipulated;
• Implementors may choose any M for their system.
Opportunity
To complete an ABox we can do more than expansion.
How to approach the problem
Two level approach
How to approach OBDA in practice?
• Efficient ways to deal with redundancy due to completeness.
• Efficient ways to complete (virtual) ABoxes.
Contributions
Dealing with redundancy
Characterizing completeness
ABox Dependencies
Definition
An assertion B_1 ⊑_A B_2 that restricts valid ABoxes.
Syntax: B_1 ⊑_A B_2
Semantics: A ⊨ Manager ⊑_A Employee if Manager(x) ∈ A implies
Employee(x) ∈ A.
ABox dependencies are fundamentally different from TBox assertions.
Think open world
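The semantics of an ABox dependency between atomic concepts can be checked directly on the explicit assertions. The following is a minimal sketch; the set-of-pairs encoding of the ABox, the function name `satisfies`, and the sample individuals are illustrative assumptions.

```python
def satisfies(abox, lhs, rhs):
    """True if every individual asserted in `lhs` is also asserted in `rhs`.

    `abox` is a set of (concept_name, individual) pairs. Only explicit
    assertions count; nothing is inferred from a TBox.
    """
    lhs_members = {a for (c, a) in abox if c == lhs}
    rhs_members = {a for (c, a) in abox if c == rhs}
    return lhs_members <= rhs_members


complete_abox = {("Manager", "john"), ("Employee", "john"), ("Employee", "mary")}
incomplete_abox = {("Manager", "john"), ("Employee", "mary")}

print(satisfies(complete_abox, "Manager", "Employee"))    # True
print(satisfies(incomplete_abox, "Manager", "Employee"))  # False
```

Under the open-world reading, the TBox assertion Manager ⊑ Employee is compatible with both ABoxes. The dependency, by contrast, holds only for the first one, because it talks about which facts are physically present.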
Where to deal with redundancy?
Given a TBox T , an ABox A, a set of dependencies Σ and a query Q,
what do we do?
Available Options:
• Optimize the query reformulation algorithm to deal with Σ.
• Optimize the TBox T with respect to Σ.
When is an assertion redundant?
Direct Redundancy: Case 1
Let T imply the following hierarchy:
[Figure: hierarchy over ∃hasFather, Person, and Human]
Redundant if Σ is:
[Figure: dependencies over ∃hasFather, Person, and Human]
Σ says: hasFather(mariano, ramon) ∈ A → Human(mariano) ∈ A.
When is an assertion redundant?
Direct Redundancy: Case 2
Let T be the following TBox:
[Figure: hierarchy over Person, ∃hasFather−, ∃hasFather, and Man]
Redundant if Σ is:
[Figure: dependencies over Person, ∃hasFather−, ∃hasFather, and Man]
Σ says: Man(ramon) ∈ A → there is some a′ with hasFather(ramon, a′) ∈ A and Person(a′) ∈ A.
When is an assertion redundant?
Indirect Redundancy
Let T be the following TBox:
[Figure: hierarchy over Animal, Man, and Human]
Redundant if Σ is:
[Figure: dependencies over Animal, Man, and Human]
Σ says: if Man(mariano) ∈ A then Animal(mariano) ∈ A.
Formalization: Redundancy
Given a TBox T and a set of dependencies Σ over T, the optimized version
of T w.r.t. Σ, denoted optim(T, Σ), is the set of inclusion assertions
{α ∈ sat(T) | α is not redundant in sat(T) w.r.t. sat(Σ)}
We can compute optim(T, Σ) in linear time.
Contributions
Completing ABoxes
General considerations
OBDA systems have no ABoxes; instead they have virtual ABoxes V = ⟨D, M⟩ with
D = ⟨R, I⟩.
If we want V ⊨ A ⊑_A B, we make sure that the mappings for B include
all the data coming from the mappings of A.
Trade-off:
• Degree of completeness (# of dependencies),
• Cost of the procedure,
• Performance of query answering.
We can complete virtual ABoxes up to B ⊑ ∃R without the need for new
data.
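One way to picture "the mappings for B include all the data coming from the mappings of A" is to extend the SQL of the superconcept's mapping with a UNION over the subconcept's mapping. The sketch below uses Python's `sqlite3` with hypothetical tables `managers` and `staff` and hypothetical helper names. It illustrates the idea only and is not the procedure used in the authors' system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE managers (id TEXT);
    CREATE TABLE staff (id TEXT);
    INSERT INTO managers VALUES ('john');
    INSERT INTO staff VALUES ('mary');
    """
)

# Mappings: concept name -> SQL query producing its instances.
mappings = {
    "Manager": "SELECT id FROM managers",
    "Employee": "SELECT id FROM staff",
}


def complete(mappings, sub, sup):
    """Return new mappings in which `sup` also receives the data of `sub`."""
    completed = dict(mappings)
    completed[sup] = f"{mappings[sup]} UNION {mappings[sub]}"
    return completed


def virtual_extension(conn, mappings, concept):
    """Instances of `concept` in the virtual ABox defined by the mappings."""
    return {row[0] for row in conn.execute(mappings[concept])}


completed = complete(mappings, "Manager", "Employee")
before = virtual_extension(conn, mappings, "Employee")
after = virtual_extension(conn, completed, "Employee")
print(before, after)
```

No row is written to the database: only the mapping changes, and the virtual ABox then satisfies Manager ⊑_A Employee for every instance of the source.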
Semantic Index for OBDA
General Idea
• To encode the semantics of T in numeric indexes and ranges for
concept names and roles.
• Store the ABox in the database using those indexes and ranges.
• Make mappings for the system that take the ranges into account.
We can do this by using the implied hierarchy of T to generate the index
and ranges!
Semantic Index Example
T = {B ⊑ A, C ⊑ A, C ⊑ D}
Index and ranges assigned to each concept of the hierarchy:
A: index 1, ranges {(1, 3)}
B: index 2, ranges {(2, 2)}
C: index 3, ranges {(3, 3)}
D: index 4, ranges {(3, 4)}
We create a table TC with constant and idx columns. To insert the data
we use the indexes, e.g., if B(mariano) ∈ A then we put (mariano, 2) in TC.
We create the mappings using the ranges, e.g.,
SELECT constant FROM TC WHERE idx >= 1 AND idx <= 3 ⇝ A(constant)
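The example can be reproduced with a short Python sketch. It numbers the concepts by a depth-first traversal of the hierarchy, derives each concept's ranges from the indexes of its descendants, and answers a concept query with a range condition in SQLite. The slides do not spell out the traversal, so the numbering strategy, the function names, and the individuals `diego` and `anna` are assumptions. The hierarchy is assumed to be acyclic.

```python
import sqlite3


def build_semantic_index(inclusions):
    """inclusions: list of (sub, sup) pairs between atomic concepts (a DAG)."""
    children, concepts = {}, []
    for sub, sup in inclusions:
        for c in (sup, sub):
            if c not in concepts:
                concepts.append(c)
        children.setdefault(sup, []).append(sub)
    subs = {sub for sub, _ in inclusions}
    index = {}

    def visit(c):
        if c in index:
            return
        index[c] = len(index) + 1  # next free number, in visiting order
        for child in children.get(c, []):
            visit(child)

    for root in (c for c in concepts if c not in subs):
        visit(root)

    def descendants(c):
        seen = {c}
        for child in children.get(c, []):
            seen |= descendants(child)
        return seen

    ranges = {}
    for c in concepts:
        merged = []
        for i in sorted(index[d] for d in descendants(c)):
            if merged and i == merged[-1][1] + 1:
                merged[-1] = (merged[-1][0], i)  # extend the current interval
            else:
                merged.append((i, i))
        ranges[c] = merged
    return index, ranges


index, ranges = build_semantic_index([("B", "A"), ("C", "A"), ("C", "D")])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TC (constant TEXT, idx INTEGER)")
abox = [("B", "mariano"), ("C", "diego"), ("D", "anna")]
conn.executemany(
    "INSERT INTO TC VALUES (?, ?)", [(a, index[c]) for c, a in abox]
)


def instances(conn, concept):
    """All constants whose index falls in one of the concept's ranges."""
    where = " OR ".join("(idx >= ? AND idx <= ?)" for _ in ranges[concept])
    params = [bound for r in ranges[concept] for bound in r]
    rows = conn.execute(f"SELECT DISTINCT constant FROM TC WHERE {where}", params)
    return {row[0] for row in rows}


print(index, ranges, instances(conn, "A"))
```

Each ABox assertion is stored once, under the index of its own concept. Membership in a superconcept is recovered at query time by the range condition, which is why no expansion of the data is needed.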
Experimentation I
The Resource Index features:
• Search over 22 document collections
• Semantics given by the hierarchies of 200 ontologies (SNOMED, GO)
Implementation in a nutshell:
(i) Understand documents with natural language processing and
annotate them, e.g., Cervical Cancer(doc224)
(ii) Expand the ABox
(iii) Pose queries that retrieve documents as
q(x) ← A1(x) ∧ · · · ∧ An(x)
Experimentation II
The challenge:
• ≈ 3 million concepts and ≈ 2.5 million is-a assertions
• Split second responses
• 150 GB of data
• Expansion data: 1.5 TB
The experimentation data:
• ClinicalTrials.gov (CT)
• 181 million assertions (≈ 14 GB of data, ≈ 140 GB when expanded)
Results
The query:
q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)
Results:
• Traditional reformulation: Union of 467874 SQL SPJ queries;
• Semantic Index: 1 SQL query; execution 3.582 s (0.082 s if warm); time
to compute the semantic index: 1 min; size of data: +≈ 4 GB.
• ABox expansion: 1 SQL query; execution 3 s (0.6 s if warm); expansion
time ≈ 7 days; size of data: +≈ 126 GB.
The Query
The query:
q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)
SELECT DISTINCT r0.element_id as element_id
FROM
RESOURCE_INDEX.CT_ANN r0 JOIN RESOURCE_INDEX.CT_ANN r1
ON r0.element_id = r1.element_id
JOIN RESOURCE_INDEX.CT_ANN r2
ON r1.element_id = r2.element_id
WHERE
((r0.idx >= 1783559 AND r0.idx <= 1783657)) AND
((r1.idx >= 1782996 AND r1.idx <= 1783029)) AND
((r2.idx >= 1783115 AND r2.idx <= 1783253));
Standard SQL query efficient in ANY DBMS.
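A query of this shape can be generated mechanically from the ranges of the concepts in q(x) ← A1(x) ∧ · · · ∧ An(x). The following Python sketch shows one possible generator. The function name and its parameters are hypothetical, and the ranges are assumed to be integer pairs produced by the index, so they are interpolated directly into the SQL text.

```python
def semantic_index_sql(table, ranges_per_atom):
    """Build one SQL query for q(x) <- A1(x) AND ... AND An(x).

    ranges_per_atom[i] lists the (low, high) index ranges of concept Ai.
    One copy of the annotation table is joined per atom of the query.
    """
    aliases = [f"r{i}" for i in range(len(ranges_per_atom))]
    lines = [
        f"SELECT DISTINCT {aliases[0]}.element_id AS element_id",
        f"FROM {table} {aliases[0]}",
    ]
    for prev, cur in zip(aliases, aliases[1:]):
        lines.append(
            f"JOIN {table} {cur} ON {prev}.element_id = {cur}.element_id"
        )
    conditions = []
    for alias, ranges in zip(aliases, ranges_per_atom):
        disjuncts = [
            f"({alias}.idx >= {int(lo)} AND {alias}.idx <= {int(hi)})"
            for lo, hi in ranges
        ]
        conditions.append("(" + " OR ".join(disjuncts) + ")")
    lines.append("WHERE " + " AND ".join(conditions))
    return "\n".join(lines) + ";"


sql = semantic_index_sql(
    "RESOURCE_INDEX.CT_ANN",
    [[(1783559, 1783657)], [(1782996, 1783029)], [(1783115, 1783253)]],
)
print(sql)
```

The size of the generated query grows with the number of atoms and ranges, not with the number of subconcepts, which is where the contrast with a union of hundreds of thousands of SPJ queries comes from.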
Conclusions
Contributions
• We indicated that efficient OBDA requires taking into account more
than only T, A and Q.
• We provided means to deal with redundancy at the level of the TBox.
• We showed that expansion is not necessary to complete
ABoxes.
• We presented two efficient ways to complete ABoxes, one for the
general OBDA setting and one for the virtual setting.
Future work
• Exploring more expressive languages.
• Exploring the RDFS/SPARQL setting.
• Handling updates of T and A.
Extra examples
First Observation (cont.)
Mappings will introduce dependencies over ABoxes
Let R be a DB schema with the relation schema employee with attributes
id, dept, and salary. Let M be the following mappings:
SELECT id, dept FROM employee
⇝ q(id, dept) ← Employee(id) ∧ WORKS-FOR(id, dept)
SELECT id, dept FROM employee WHERE salary > 1000
⇝ q(id, dept) ← Manager(id) ∧ MANAGES(id, dept)
Then for any instance I, if Manager(John) ∈ A we have that
Employee(John) ∈ A.
This is an indicator of completeness of all ABoxes A for M and R, e.g., A
is complete w.r.t. Manager ⊑_A Employee.
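The effect of these two mappings can be observed on a small instance with Python's `sqlite3`. The rows inserted below are invented for illustration, and the check only inspects one instance. The general claim rests on the second SQL query being a restriction of the first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('John', 'sales', 2500);
    INSERT INTO employee VALUES ('Mary', 'sales', 900);
    """
)

# Source queries of the two mappings, projected on the concept argument.
MAPPINGS = {
    "Employee": "SELECT id FROM employee",
    "Manager": "SELECT id FROM employee WHERE salary > 1000",
}


def extension(concept):
    """Individuals the mapping puts into `concept` in the virtual ABox."""
    return {row[0] for row in conn.execute(MAPPINGS[concept])}


managers = extension("Manager")
employees = extension("Employee")
dependency_holds = managers <= employees
print(managers, employees, dependency_holds)
```

Because every row selected for Manager is also selected for Employee, a TBox assertion Manager ⊑ Employee adds nothing at query time over such a source, and rewriting with it only produces redundant unions.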
Formalization: Chains
Let T be a TBox, B, C basic concepts, and Σ a set of dependencies over
T. A T-chain from B to C in T (resp., a Σ-chain from B to C in Σ) is a
sequence of concept inclusion assertions (B_i ⊑ B′_i), i = 0, …, n, in T
(resp., a sequence of inclusion dependencies (B_i ⊑_A B′_i), i = 0, …, n, in Σ),
for some n ≥ 0, such that:
1. B_0 = B, B′_n = C, and
2. for 1 ≤ i ≤ n, we have that B′_{i−1} and B_i are basic concepts s.t. either
(i) B′_{i−1} = B_i, or
(ii) B′_{i−1} = ∃R and B_i = ∃R−, for some basic role R.
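The chain conditions can be checked mechanically. The sketch below assumes the reading in which each step is an inclusion with a left-hand side B_i and a right-hand side B′_i, and consecutive steps are linked either by identity or by passing from ∃R to ∃R−. The tuple encoding of basic concepts and all function names are illustrative.

```python
def exists(role, inverse=False):
    """Encode the basic concept ∃role (or ∃role− when inverse is True)."""
    return ("exists", role, inverse)


def links(prev_rhs, next_lhs):
    """Condition 2: same concept, or ∃R followed by ∃R− for a basic role R."""
    if prev_rhs == next_lhs:
        return True
    both_exist = isinstance(prev_rhs, tuple) and isinstance(next_lhs, tuple)
    return (
        both_exist
        and prev_rhs[1] == next_lhs[1]   # same role name
        and prev_rhs[2] != next_lhs[2]   # opposite direction
    )


def is_chain(sequence, assertions, start, end):
    """True if `sequence` is a chain from `start` to `end` within `assertions`."""
    if not sequence or any(step not in assertions for step in sequence):
        return False
    if sequence[0][0] != start or sequence[-1][1] != end:
        return False
    return all(links(a[1], b[0]) for a, b in zip(sequence, sequence[1:]))


tbox = {
    ("Man", "Person"),
    ("Person", exists("hasFather")),
    (exists("hasFather", inverse=True), "Person"),
}

good = [("Person", exists("hasFather")), (exists("hasFather", inverse=True), "Person")]
bad = [("Man", "Person"), (exists("hasFather", inverse=True), "Person")]

print(is_chain(good, tbox, "Person", "Person"))  # True
print(is_chain(bad, tbox, "Man", "Person"))      # False
```

The same function serves for Σ-chains by passing the set of dependencies as `assertions`, since the linking conditions are identical.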
Formalization: Redundancy
Let T be a TBox, B, C basic concepts, and Σ a set of dependencies. The
concept inclusion assertion B ⊑ C is directly redundant in T w.r.t. Σ if
(i) Σ ⊨ B ⊑_A C and
(ii) for every T-chain (B_i ⊑ B′_i), i = 0, …, n, with B′_n = B in T, there is a Σ-chain
(B_i ⊑_A B′_i), i = 0, …, n.
Then, B ⊑ C is redundant in T w.r.t. Σ if
(a) it is directly redundant, or
(b) there exists B′ ≠ B s.t.
(i) T ⊨ B′ ⊑ C,
(ii) B′ ⊑ C is not redundant in T w.r.t. Σ, and
(iii) B ⊑ B′ is directly redundant in T w.r.t. Σ.
Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 33 / 33

Contenu connexe

En vedette

веселая математика
веселая математикавеселая математика
веселая математикаalexredhill
 
101.10.20 顧客服務及客訴處理 -詹翔霖教授
101.10.20 顧客服務及客訴處理 -詹翔霖教授101.10.20 顧客服務及客訴處理 -詹翔霖教授
101.10.20 顧客服務及客訴處理 -詹翔霖教授文化大學
 
Syarifudin, kumpulan outline semester genap, 2013
Syarifudin, kumpulan outline semester genap, 2013Syarifudin, kumpulan outline semester genap, 2013
Syarifudin, kumpulan outline semester genap, 2013Syarifudin Amq
 
1. data infrastructure keynote october 2010 alain
1. data infrastructure keynote october 2010 alain1. data infrastructure keynote october 2010 alain
1. data infrastructure keynote october 2010 alainDoina Draganescu
 
UrbnApparel presentation
UrbnApparel presentationUrbnApparel presentation
UrbnApparel presentationUrbnDesignz
 
3 อนุกรมเลขคณิต
3 อนุกรมเลขคณิต3 อนุกรมเลขคณิต
3 อนุกรมเลขคณิตToongneung SP
 
Testing can be fun! Intercomputer GS
Testing can be fun! Intercomputer GSTesting can be fun! Intercomputer GS
Testing can be fun! Intercomputer GSNataly Veremeeva
 
Kontit pomppimaan3
Kontit pomppimaan3Kontit pomppimaan3
Kontit pomppimaan3Arto Santala
 
A Bill for an Act to enact the Nigerian Budget Reform Bill,2016
A Bill for an Act to enact the Nigerian Budget Reform Bill,2016A Bill for an Act to enact the Nigerian Budget Reform Bill,2016
A Bill for an Act to enact the Nigerian Budget Reform Bill,2016Dr. TONYE CLINTON JAJA
 
IDS Credential 2016
IDS Credential 2016IDS Credential 2016
IDS Credential 2016Manas Mishra
 
Treatment of familial mediterranean fever: colchicine and beyond
Treatment of familial mediterranean fever: colchicine and beyondTreatment of familial mediterranean fever: colchicine and beyond
Treatment of familial mediterranean fever: colchicine and beyondJosé Luis Moreno Garvayo
 

En vedette (12)

веселая математика
веселая математикавеселая математика
веселая математика
 
101.10.20 顧客服務及客訴處理 -詹翔霖教授
101.10.20 顧客服務及客訴處理 -詹翔霖教授101.10.20 顧客服務及客訴處理 -詹翔霖教授
101.10.20 顧客服務及客訴處理 -詹翔霖教授
 
The game of the inteligents
The game of the inteligentsThe game of the inteligents
The game of the inteligents
 
Syarifudin, kumpulan outline semester genap, 2013
Syarifudin, kumpulan outline semester genap, 2013Syarifudin, kumpulan outline semester genap, 2013
Syarifudin, kumpulan outline semester genap, 2013
 
1. data infrastructure keynote october 2010 alain
1. data infrastructure keynote october 2010 alain1. data infrastructure keynote october 2010 alain
1. data infrastructure keynote october 2010 alain
 
UrbnApparel presentation
UrbnApparel presentationUrbnApparel presentation
UrbnApparel presentation
 
3 อนุกรมเลขคณิต
3 อนุกรมเลขคณิต3 อนุกรมเลขคณิต
3 อนุกรมเลขคณิต
 
Testing can be fun! Intercomputer GS
Testing can be fun! Intercomputer GSTesting can be fun! Intercomputer GS
Testing can be fun! Intercomputer GS
 
Kontit pomppimaan3
Kontit pomppimaan3Kontit pomppimaan3
Kontit pomppimaan3
 
A Bill for an Act to enact the Nigerian Budget Reform Bill,2016
A Bill for an Act to enact the Nigerian Budget Reform Bill,2016A Bill for an Act to enact the Nigerian Budget Reform Bill,2016
A Bill for an Act to enact the Nigerian Budget Reform Bill,2016
 
IDS Credential 2016
IDS Credential 2016IDS Credential 2016
IDS Credential 2016
 
Treatment of familial mediterranean fever: colchicine and beyond
Treatment of familial mediterranean fever: colchicine and beyondTreatment of familial mediterranean fever: colchicine and beyond
Treatment of familial mediterranean fever: colchicine and beyond
 

Similaire à Introduction to query rewriting optimisation with dependencies

A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...
A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...
A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...Marko Rodriguez
 
Centralities_PaoloBoldi
Centralities_PaoloBoldiCentralities_PaoloBoldi
Centralities_PaoloBoldiYandex
 
Franz 2017 uiuc cirss non unitary syntheses of systematic knowledge
Franz 2017 uiuc cirss non unitary syntheses of systematic knowledgeFranz 2017 uiuc cirss non unitary syntheses of systematic knowledge
Franz 2017 uiuc cirss non unitary syntheses of systematic knowledgetaxonbytes
 
Towards Linked Ontologies and Data on the Semantic Web
Towards Linked Ontologies and Data on the Semantic WebTowards Linked Ontologies and Data on the Semantic Web
Towards Linked Ontologies and Data on the Semantic WebJie Bao
 
BACK TO THE DRAWING BOARD - The Myth of Data-Driven NLU and How to go Forward...
BACK TO THE DRAWING BOARD - The Myth of Data-Driven NLU and How to go Forward...BACK TO THE DRAWING BOARD - The Myth of Data-Driven NLU and How to go Forward...
BACK TO THE DRAWING BOARD - The Myth of Data-Driven NLU and How to go Forward...Walid Saba
 
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...Seth Grimes
 
The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...
The Ins and Outs of Preposition Semantics:
Introduction to query rewriting optimisation with dependencies

  • 1. Dependencies Making Ontology Based Data Access Work in Practice Mariano Rodriguez-Muro and Diego Calvanese {rodriguez,calvanese}@inf.unibz.it KRDB Research Centre Free University of Bozen Bolzano July, 2011 Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 1 / 33
  • 2. The context
  • 4. DL Ontologies
      Description Logics:
        • Formalisms for knowledge representation
        • Decidable fragments of FOL
        • Basis of OWL
        • The world is described by means of concepts and roles
      Ontologies:
        • Intensional knowledge: TBox T
        • Extensional knowledge: ABox A
  • 9. OBDA with DL-Lite
      A family of lightweight ontology languages:
        • DL-Lite_F concepts: B := A | ∃R
        • DL-Lite_F roles: R := P | P⁻
        • DL-Lite_F TBoxes: B ⊑ B | B ⊑ ¬B | (funct R)
        • DL-Lite_F ABoxes: A(a) | R(a, b)
  • 13. Query Answering
      TBox: Man ⊑ Person, Woman ⊑ Person, Person ⊑ ∃hasFather, ∃hasFather⁻ ⊑ Person
      ABox: Man(mariano)
      Query: q(x) ← Person(x), hasFather(x, y), Person(y)
      Problem: compute the certain answers of Q, denoted cert(Q, O).
      The promise: we can do this as efficiently as answering DB queries, also in the virtual setting.
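
For reference, the usual definition of certain answers can be written as below. This is a sketch under assumptions the slide does not spell out: that O denotes the ontology made of the TBox T and the ABox A, and that answers are tuples of constants occurring in A.

```latex
\[
  \mathit{cert}(Q, \mathcal{O}) \;=\;
  \{\, \vec{a} \;\mid\; \vec{a} \in Q^{\mathcal{I}}
       \text{ for every model } \mathcal{I} \text{ of } \mathcal{O} \,\}
\]
```

In the example, mariano is a certain answer: Man ⊑ Person makes him a Person, Person ⊑ ∃hasFather gives him a father in every model, and ∃hasFather⁻ ⊑ Person makes that father a Person, even though no father is named in the ABox.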
  • 16. Query Answering with PerfectRef (2005)
      Query: q(x) ← Person(x), hasFather(x, y), Person(y)
      Reformulation:
        q(x) ← Person(x), hasFather(x, y), Person(y)
        q(x) ← Person(x), hasFather(x, y), hasFather(z, y)
        q(x) ← Person(x), hasFather(x, y)
        q(x) ← Person(x), Person(x)
        q(x) ← Person(x)
        q(x) ← Person(x), hasFather(x, y), Man(y)
        q(x) ← Person(x), hasFather(x, y), Woman(y)
        q(x) ← hasFather(x, m), hasFather(x, y), Person(y)
        q(x) ← hasFather(x, m), hasFather(x, y), hasFather(z, y)
        q(x) ← hasFather(x, m), hasFather(x, y)
        q(x) ← hasFather(x, m), Person(x)
        q(x) ← hasFather(x, m), hasFather(x, t)
        q(x) ← hasFather(x, m)
        q(x) ← hasFather(x, m), hasFather(x, y), Man(y)
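
To make the role of the reformulation concrete, here is a minimal sketch of how a union of conjunctive queries is evaluated directly over the ABox. It is not the PerfectRef algorithm itself: the disjuncts are written by hand, and the `Man(x)` and `Woman(x)` disjuncts are assumptions (they are not among the lines shown on the slide, but follow from Man ⊑ Person and Woman ⊑ Person). The helper names `match`, `eval_cq` and `eval_ucq`, and the `?`-prefix convention for variables, are illustrative only.

```python
def match(atom, fact, binding):
    """Try to extend `binding` so that `atom` equals `fact`; return None on failure."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    new = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):              # variable: bind it or check consistency
            if new.setdefault(term, value) != value:
                return None
        elif term != value:                   # constant: must be identical
            return None
    return new


def eval_cq(head_vars, body, facts):
    """Evaluate one conjunctive query over a set of ground facts."""
    bindings = [{}]
    for atom in body:
        extended = []
        for binding in bindings:
            for fact in facts:
                new = match(atom, fact, binding)
                if new is not None:
                    extended.append(new)
        bindings = extended
    return {tuple(b[v] for v in head_vars) for b in bindings}


def eval_ucq(head_vars, bodies, facts):
    """Evaluate a union of conjunctive queries: the union of the disjuncts' answers."""
    answers = set()
    for body in bodies:
        answers |= eval_cq(head_vars, body, facts)
    return answers


abox = {("Man", "mariano")}
original = [("Person", "?x"), ("hasFather", "?x", "?y"), ("Person", "?y")]
reformulation = [
    original,
    [("Person", "?x")],   # shown on the slide
    [("Man", "?x")],      # assumed, from Man ⊑ Person
    [("Woman", "?x")],    # assumed, from Woman ⊑ Person
]

direct = eval_cq(["?x"], original, abox)            # the ABox alone gives no answer
rewritten = eval_ucq(["?x"], reformulation, abox)   # the rewriting finds mariano
```

Evaluating the original query alone ignores the TBox and returns nothing; the union returns mariano, which is the certain answer. Each disjunct corresponds to one SELECT in a SQL UNION, which is why the number of disjuncts matters in practice.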
  • 23. Alternatives
        • Improved version of PerfectRef (2007-2011)
        • RQR (Urbina et al., 2007)
          Too many unions, cannot execute!
        • PRESTO (Rosati et al., 2010)
          Better, but eventually it breaks.
        • Combined Approach (Kontchakov et al., 2010)
          Fast, but too much data and too much time.
  • 24. What can we do?
  • 26. Query Answering: It is not only about existential constants
      Query: q(x, y) ← Person(x), hasFather(x, y), Person(y)
      Reformulation:
        q(x, y) ← Person(x), hasFather(x, y), Person(y)
        q(x, y) ← Person(x), hasFather(x, y), hasFather(z, y)
        q(x, y) ← Person(x), hasFather(x, y), Man(y)
        q(x, y) ← Person(x), hasFather(x, y), Woman(y)
        q(x, y) ← hasFather(x, m), hasFather(x, y), Person(y)
        q(x, y) ← hasFather(x, m), hasFather(x, y), hasFather(z, y)
        q(x, y) ← hasFather(x, m), hasFather(x, y), Man(y)
        q(x, y) ← hasFather(x, m), hasFather(x, y), Woman(y)
        q(x, y) ← Man(x), hasFather(x, y), Person(y)
        q(x, y) ← Man(x), hasFather(x, y), hasFather(z, y)
        q(x, y) ← Man(x), hasFather(x, y), Man(y)
        q(x, y) ← Man(x), hasFather(x, y), Woman(y)
        q(x, y) ← Woman(x), hasFather(x, y), Person(y)
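
The size of this reformulation can be illustrated with a short sketch. It simply takes, for each atom of the query, the alternatives that appear in the listing above and enumerates every combination; it is not a rewriting algorithm, and the name `expand` and the string encoding of atoms are illustrative assumptions.

```python
from itertools import product

# Alternatives for each atom of q(x, y), copied from the reformulation listing.
ALTERNATIVES = [
    ["Person(x)", "hasFather(x, m)", "Man(x)", "Woman(x)"],  # first atom
    ["hasFather(x, y)"],                                     # second atom, kept as is
    ["Person(y)", "hasFather(z, y)", "Man(y)", "Woman(y)"],  # third atom
]


def expand(alternatives):
    """Return one conjunctive query per combination of atom alternatives."""
    return ["q(x, y) ← " + ", ".join(combo) for combo in product(*alternatives)]


ucq = expand(ALTERNATIVES)
print(len(ucq))  # 4 * 1 * 4 = 16 conjunctive queries
```

The number of disjuncts is the product of the number of alternatives per atom, so it grows exponentially with the number of atoms that have several alternatives, even when every variable is an answer variable.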
  • 27. The full picture: Ontology Based Data Access
      [Diagram: User, Queries, Ontology, Mappings, Sources]
      To deal with OBDA we need to consider:
        • If we have RDBMSs in the backend, we cannot go beyond their capabilities.
        • All systems are composed of T, D = ⟨R, I⟩, and M.
  • 35. First Observation: Is my data complete?
      Completeness of A
        The TBox says: Manager ⊑ Employee
        In the ABox: all Managers are already Employees.
      In any realistic scenario:
        • We don't use arbitrary sources;
        • Intersection of semantics is reflected in completeness (e.g., no need to chase, expand or rewrite);
        • This happens a lot!
      Keyword: Redundancy
  • 37. Second Observation: There are no ABoxes
      THERE ARE NO ABOXES!
      Any ontology-based query answering system today:
        • Uses relational DBs to store the ABox data;
        • In such a D, both R and I can be manipulated;
        • Implementors may choose any M for their system.
      Opportunity: to complete an ABox we can do more than expansion.
  • 41. How to approach the problem: Two-level approach
      How to approach OBDA in practice?
        • Efficient ways to deal with redundancy due to completeness.
        • Efficient ways to complete (virtual) ABoxes.
  • 42. Contributions: Dealing with redundancy
  • 44. Characterizing completeness: ABox Dependencies
      Definition: an assertion B ⊑_A B that restricts valid ABoxes.
      Syntax: B₁ ⊑_A B₂
      Semantics: A ⊨ Manager ⊑_A Employee if Manager(x) ∈ A implies Employee(x) ∈ A.
      ABox dependencies are fundamentally different from TBox assertions. Think open world.
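
The semantics above is a check on the ABox itself, which a few lines can illustrate. This is a minimal sketch for atomic concepts only; the function name `satisfies`, the encoding of assertions as `(concept, individual)` pairs, and the individuals `ann` and `bob` are assumptions made for the example.

```python
def satisfies(abox, sub, sup):
    """True if every individual asserted in `sub` is also asserted in `sup`."""
    def members(concept):
        return {ind for (name, ind) in abox if name == concept}
    return members(sub) <= members(sup)


complete = {("Manager", "ann"), ("Employee", "ann"), ("Employee", "bob")}
incomplete = {("Manager", "ann"), ("Employee", "bob")}

print(satisfies(complete, "Manager", "Employee"))    # True
print(satisfies(incomplete, "Manager", "Employee"))  # False
```

The second ABox is still consistent with the TBox assertion Manager ⊑ Employee, which only entails that ann is an Employee under the open-world reading. It violates the dependency because that fact is not explicitly present in the ABox.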
  • 48. Where to deal with redundancy?
      Given a TBox T, an ABox A, a set of dependencies Σ and a query Q, what do we do?
      Available options:
        • Optimize the query reformulation algorithm to deal with Σ.
        • Optimize the TBox T with respect to Σ.
  • 49. When is an assertion redundant? Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 17 / 33
  • 50. When is an assertion redundant? Direct Redundancy: Case 1 Let T be implied the following hierarchy: ∃hasFather Person Human Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 17 / 33
  • 51. When is an assertion redundant? Direct Redundancy: Case 1 Let T be implied the following hierarchy: ∃hasFather Person Human Redundant if Σ is: ∃hasFather Person Human Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 17 / 33
  • 52. When is an assertion redundant? Direct Redundancy: Case 1 Let T be implied the following hierarchy: ∃hasFather Person Human Redundant if Σ is: ∃hasFather Person Human Σ sais hasFather(mariano, ramon) ∈ A → Human(mariano) ∈ A. Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 17 / 33
  • 53. When is an assertion redundant? Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 18 / 33
  • 54. When is an assertion redundant? Direct Redundancy: Case 2 Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 18 / 33
  • 55. When is an assertion redundant? Direct Redundancy: Case 2 Let T be the following TBox: Person ∃hasFather− ∃hasFather Man Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 18 / 33
  • 56. When is an assertion redundant? Direct Redundancy: Case 2 Let T be the following TBox: Person ∃hasFather− ∃hasFather Man Redundant if Σ is: Person ∃hasFather− ∃hasFather Man Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 18 / 33
  • 57. When is an assertion redundant? Direct Redundancy: Case 2 Let T be the following TBox: Person ∃hasFather− ∃hasFather Man Redundant if Σ is: Person ∃hasFather− ∃hasFather Man Σ sais Man(ramon) ∈ A → ∃a | hasFather(ramon, a ) ∧ Person(a ) ∈ A. Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 18 / 33
  • 58. When is an assertion redundant? Indirect Redundancy Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 19 / 33
  • 59. When is an assertion redundant? Indirect Redundancy Let T be the following TBox: Animal Man Human Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 19 / 33
  • 60. When is an assertion redundant? Indirect Redundancy Let T be the following TBox: Animal Man Human Redundant if Σ is: Animal Man Human Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 19 / 33
  • 61. When is an assertion redundant? Indirect Redundancy. Let T be the following TBox (diagram over the nodes Animal, Man, Human). An assertion is redundant if Σ is the same hierarchy (diagram over Animal, Man, Human). Σ says: if Man(mariano) ∈ A then Animal(mariano) ∈ A.
  • 62. Formalization: Redundancy. Given a TBox T and a set of dependencies Σ over T, the optimized version of T w.r.t. Σ, denoted optim(T, Σ), is the set of inclusion assertions {α ∈ sat(T) | α is not redundant in sat(T) w.r.t. sat(Σ)}. We can compute optim(T, Σ) in linear time.
  • 63. Contributions: Completing ABoxes
  • 67. General considerations. OBDA systems have no ABoxes; instead they have virtual ABoxes V = ⟨D, M⟩ with D = ⟨R, I⟩. If we want V |= A ⊑_A B, we must make sure that the mappings for B include all the data coming from the mappings of A. Trade-off:
      • Degree of completeness (# of dependencies)
      • Cost of the procedure
      • Performance of query answering
    We can complete virtual ABoxes up to B ⊑ ∃R without the need for new data.
  • 72. Semantic Index for OBDA: General Idea
      • Encode the semantics of T in numeric indexes and ranges for concept names and roles.
      • Store the ABox in the database using those indexes and ranges.
      • Make mappings for the system that take the ranges into account.
    We can do this by using the implied hierarchy of T to generate the indexes and ranges.
  • 76. Semantic Index: Example. T = {B ⊑ A, C ⊑ A, C ⊑ D}. Indexes assigned to the nodes of the hierarchy: A = 1, B = 2, C = 3, D = 4. We create a table TC with columns constant and idx, and use the indexes to insert the data; e.g., if B(mariano) ∈ A, then we put (mariano, 2) in TC.
  • 78. Semantic Index: Example. T = {B ⊑ A, C ⊑ A, C ⊑ D}. Indexes and ranges: A = 1, {(1, 3)}; B = 2, {(2, 2)}; C = 3, {(3, 3)}; D = 4, {(3, 4)}. We create the mappings using the ranges, e.g., SELECT constant FROM TC WHERE idx >= 1 AND idx <= 3 ⇝ A(constant)
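
The example on slides 76 to 78 can be reproduced with a small Python sketch. It is only an illustration under stated assumptions: the index assignment A = 1, B = 2, C = 3, D = 4 is copied from the slide rather than computed, the helper names (`subsumees`, `ranges`, `mapping_sql`, `instances`) are hypothetical, and the individual `diego` is invented test data. The sketch derives each concept's ranges from the hierarchy and uses them as the source query of that concept's mapping.

```python
import sqlite3

# Example TBox T = {B ⊑ A, C ⊑ A, C ⊑ D} as (subconcept, superconcept) pairs.
TBOX = [("B", "A"), ("C", "A"), ("C", "D")]
# Index assignment copied from the slide (assumed given, not computed here).
INDEX = {"A": 1, "B": 2, "C": 3, "D": 4}


def subsumees(concept, tbox):
    """Return the concept together with every concept below it in the hierarchy."""
    seen = {concept}
    stack = [concept]
    while stack:
        current = stack.pop()
        for sub, sup in tbox:
            if sup == current and sub not in seen:
                seen.add(sub)
                stack.append(sub)
    return seen


def ranges(concept, tbox, index):
    """Merge the indexes of all subsumees into contiguous (lo, hi) intervals."""
    merged = []
    for i in sorted(index[c] for c in subsumees(concept, tbox)):
        if merged and i == merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], i)  # extend the last interval
        else:
            merged.append((i, i))  # start a new interval
    return merged


def mapping_sql(concept, tbox, index):
    """Build the source query of the mapping for one concept from its ranges."""
    cond = " OR ".join(
        f"(idx >= {lo} AND idx <= {hi})" for lo, hi in ranges(concept, tbox, index)
    )
    return f"SELECT constant FROM TC WHERE {cond}"


def instances(conn, concept):
    """Evaluate the mapping query and return the retrieved constants, sorted."""
    rows = conn.execute(mapping_sql(concept, TBOX, INDEX)).fetchall()
    return sorted(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TC (constant TEXT, idx INTEGER)")
# B(mariano) is stored as (mariano, 2); C(diego) is hypothetical extra data.
conn.executemany(
    "INSERT INTO TC VALUES (?, ?)",
    [("mariano", INDEX["B"]), ("diego", INDEX["C"])],
)

for name in "ABCD":
    print(name, INDEX[name], ranges(name, TBOX, INDEX), instances(conn, name))
```

With this data, the range query for A retrieves both individuals without any ABox expansion, because the indexes of B and C fall inside the range of A.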
  • 79. Experimentation I. The Resource Index features:
      • Search over 22 document collections
      • Semantics given by the hierarchies of 200 ontologies (SNOMED, GO)
    Implementation in a nutshell: (i) understand documents with natural language processing and annotate them, e.g., Cervical Cancer(doc224); (ii) expand the ABox; (iii) pose queries that retrieve documents, of the form q(x) ← A1(x) ∧ · · · ∧ An(x).
  • 80. Experimentation II. The challenge:
      • ≈ 3 million concepts and ≈ 2.5 million is-a assertions
      • Split-second responses
      • 150 GB of data
      • Expansion data: 1.5 TB
    The experimentation data:
      • ClinicalTrials.gov (CT)
      • 181 million assertions (≈ 14 GB of data, ≈ 140 GB when expanded)
  • 82. Results. The query: q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x). Results:
      • Traditional reformulation: union of 467874 SQL SPJ queries.
      • Semantic Index: 1 SQL query; execution 3.582 s (0.082 s if warm); time to compute the semantic index: 1 min; size of data: + ≈ 4 GB.
      • ABox expansion: 1 SQL query; execution 3 s (0.6 s if warm); expansion time ≈ 7 days; size of data: + ≈ 126 GB.
  • 83. The Query. The query: q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)
        SELECT DISTINCT r0.element_id AS element_id
        FROM RESOURCE_INDEX.CT_ANN r0
        JOIN RESOURCE_INDEX.CT_ANN r1 ON r0.element_id = r1.element_id
        JOIN RESOURCE_INDEX.CT_ANN r2 ON r1.element_id = r2.element_id
        WHERE ((r0.idx >= 1783559 AND r0.idx <= 1783657))
          AND ((r1.idx >= 1782996 AND r1.idx <= 1783029))
          AND ((r2.idx >= 1783115 AND r2.idx <= 1783253));
    A standard SQL query, efficient in ANY DBMS.
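
A query of this shape can be generated mechanically from the ranges of the query atoms. The following Python sketch shows one way to do it; the function name `conjunctive_query_sql` and its parameters are hypothetical, and the small in-memory table at the end is invented data used only to check that the generated SQL runs. It is not the authors' implementation.

```python
import sqlite3


def conjunctive_query_sql(table, key, ranges_per_atom):
    """Build one SQL query for q(x) <- A1(x) AND ... AND An(x).

    ranges_per_atom holds, for each atom, the list of (lo, hi) index ranges
    of its concept. Each atom becomes one alias of the annotation table,
    aliases are joined on the key column, and each alias is filtered by the
    ranges of its concept. At least one atom is required.
    """
    aliases = [f"r{i}" for i in range(len(ranges_per_atom))]
    lines = [f"SELECT DISTINCT r0.{key} AS {key}", f"FROM {table} r0"]
    for prev, cur in zip(aliases, aliases[1:]):
        lines.append(f"JOIN {table} {cur} ON {prev}.{key} = {cur}.{key}")
    filters = []
    for alias, intervals in zip(aliases, ranges_per_atom):
        ors = " OR ".join(
            f"({alias}.idx >= {lo} AND {alias}.idx <= {hi})" for lo, hi in intervals
        )
        filters.append(f"({ors})")
    lines.append("WHERE " + " AND ".join(filters))
    return "\n".join(lines) + ";"


# The three ranges shown on the slide, one per query atom.
slide_sql = conjunctive_query_sql(
    "RESOURCE_INDEX.CT_ANN",
    "element_id",
    [[(1783559, 1783657)], [(1782996, 1783029)], [(1783115, 1783253)]],
)
print(slide_sql)

# Tiny invented table to check that a generated query executes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CT_ANN (element_id TEXT, idx INTEGER)")
conn.executemany(
    "INSERT INTO CT_ANN VALUES (?, ?)", [("d1", 2), ("d1", 5), ("d2", 2)]
)
demo_sql = conjunctive_query_sql("CT_ANN", "element_id", [[(1, 3)], [(5, 5)]])
demo_rows = conn.execute(demo_sql).fetchall()
print(demo_rows)
```

The number of joins grows with the number of atoms in the query, not with the size of the concept hierarchy, which is why a single SQL query suffices where the traditional reformulation needs a large union.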
  • 85. Conclusions. Contributions:
      • We indicated that efficient OBDA requires taking into account more than just T, A and Q.
      • We provided means to deal with redundancy at the level of the TBox.
      • We showed that expansion is not necessary and that ABoxes can be completed instead.
      • We presented two efficient ways to complete ABoxes, one for the general OBDA setting and one for the virtual setting.
    Future work:
      • Exploring more expressive languages.
      • Exploring the RDFS/SPARQL setting.
      • Handling updates of T and A.
  • 86. Extra examples
  • 89. First Observation (cont.). Mappings will introduce dependencies over ABoxes. Let R be a DB schema with the relation schema employee with attributes id, dept, and salary. Let M be the following mappings:
        SELECT id, dept FROM employee ⇝ q(id, dept) ← Employee(id) ∧ WORKS-FOR(id, dept)
        SELECT id, dept FROM employee WHERE salary > 1000 ⇝ q(id, dept) ← Manager(id) ∧ MANAGES(id, dept)
    Then for any instance I, if Manager(John) ∈ A we have that Employee(John) ∈ A. This is an indicator of completeness of all ABoxes A for M and R; e.g., A is complete w.r.t. Manager ⊑_A Employee.
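
The effect of the two mappings can be observed on a concrete instance with the Python sketch below. The rows in `employee`, the helper names `virtual_abox` and `satisfies`, and the tuple encoding of ABox assertions are all assumptions made for illustration. Note that the sketch only checks one instance, whereas the slide's claim holds for every instance because the second source query selects a subset of the rows of the first.

```python
import sqlite3

# Source queries of the two mappings from the slide.
EMPLOYEE_SQL = "SELECT id, dept FROM employee"
MANAGER_SQL = "SELECT id, dept FROM employee WHERE salary > 1000"


def virtual_abox(conn):
    """Materialize the ABox that the two mappings induce over one instance."""
    abox = set()
    for emp_id, dept in conn.execute(EMPLOYEE_SQL):
        abox.add(("Employee", emp_id))
        abox.add(("WORKS-FOR", emp_id, dept))
    for emp_id, dept in conn.execute(MANAGER_SQL):
        abox.add(("Manager", emp_id))
        abox.add(("MANAGES", emp_id, dept))
    return abox


def satisfies(abox, sub, sup):
    """Check the concept dependency sub ⊑_A sup on one concrete ABox."""
    return all((sup, fact[1]) in abox for fact in abox if fact[0] == sub)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id TEXT, dept TEXT, salary INTEGER)")
# Hypothetical instance: John earns more than 1000, Mary does not.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("John", "sales", 2000), ("Mary", "it", 800)],
)

abox = virtual_abox(conn)
print(satisfies(abox, "Manager", "Employee"))
print(satisfies(abox, "Employee", "Manager"))
```

On this instance the dependency Manager ⊑_A Employee holds, while the converse fails because Mary is an Employee but not a Manager.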
  • 90. Formalization: Chains. Let T be a TBox, B, C basic concepts, and Σ a set of dependencies over T. A T-chain from B to C in T (resp., a Σ-chain from B to C in Σ) is a sequence of concept inclusion assertions (B_i ⊑ B′_i), i = 0, …, n, in T (resp., a sequence of inclusion dependencies (B_i ⊑_A B′_i), i = 0, …, n, in Σ), for some n ≥ 0, such that:
      1. B_0 = B, B′_n = C, and
      2. for 1 ≤ i ≤ n, we have that B′_{i−1} and B_i are basic concepts s.t. either (i) B′_{i−1} = B_i, or (ii) B′_{i−1} = ∃R and B_i = ∃R⁻, for some basic role R.
  • 91. Formalization: Redundancy. Let T be a TBox, B, C basic concepts, and Σ a set of dependencies. The concept inclusion assertion B ⊑ C is directly redundant in T w.r.t. Σ if (i) Σ |= B ⊑_A C and (ii) for every T-chain (B_i ⊑ B′_i), i = 0, …, n, with B′_n = B in T, there is a Σ-chain (B_i ⊑_A B′_i), i = 0, …, n. Then, B ⊑ C is redundant in T w.r.t. Σ if (a) it is directly redundant, or (b) there exists B′ ≠ B s.t. (i) T |= B′ ⊑ C, (ii) B′ ⊑ C is not redundant in T w.r.t. Σ, and (iii) B ⊑ B′ is directly redundant in T w.r.t. Σ.