Ontology-Based Data Access Mapping Generation
via Data, Schema, Query, and Mapping Knowledge
Pieter Heyvaert
pheyvaer.heyvaert@ugent.be
Semantic Web technologies rely on Linked Data
querying
visualizations
publishing
But not all data is accessible as Linked Data
databases
XML files
JSON files
Solutions to provide access exist
manual: completely done by the user
semi-automatic: users provide feedback
automatic: no user interaction required
But they have limitations
limited to specific use cases
limited support for complex use cases
PhD’s goal: improve access to Linked Data
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
How do we provide access?
non-Linked Data → ? → Linked Data
table: authors
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
Apply mappings on non-Linked Data
non-Linked Data → mapping → Linked Data
mapping: rules to generate RDF terms and triples using data and ontologies
Apply mappings on non-Linked Data
non-Linked Data → mapping → Linked Data
table: authors
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
rule: create url from id
rule: name is value for ex:fullname
rule: if genre is ‘fiction’
class is ex:FictionAuthor
else
class is ex:NonFictionAuthor
Apply mappings on non-Linked Data
non-Linked Data → mapping → Linked Data
table: authors
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
ex:0 a ex:FictionAuthor .
ex:0 ex:fullname 'J.K. Rowling' .
ex:1 a ex:NonFictionAuthor .
ex:1 ex:fullname 'George Orwell' .
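The three rules above can be sketched in a few lines of Python. This is only an illustration of what "applying a mapping" means for the authors table, not the tooling used in the PhD. The namespace `http://example.com/` (with trailing slash) and the function name `apply_mapping` are assumptions made for this example.

```python
# Hypothetical namespace; the slides only show the prefix "ex:".
EX = "http://example.com/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

# The "authors" table from the slide, one dict per row.
authors = [
    {"id": 0, "name": "J.K. Rowling", "genre": "fiction"},
    {"id": 1, "name": "George Orwell", "genre": "non-fiction"},
]


def apply_mapping(row):
    """Apply the three slide rules to one row and return triples as strings."""
    # Rule: create the URL from the id.
    subject = f"<{EX}{row['id']}>"
    # Rule: the class depends on the value of genre, not on the table.
    cls = "FictionAuthor" if row["genre"] == "fiction" else "NonFictionAuthor"
    # Rule: name is the value for ex:fullname.
    return [
        f"{subject} {RDF_TYPE} <{EX}{cls}> .",
        f'{subject} <{EX}fullname> "{row["name"]}" .',
    ]


triples = [t for row in authors for t in apply_mapping(row)]
for t in triples:
    print(t)
```

The printed triples correspond to the four triples on the slide, written with full IRIs instead of the `ex:` prefix.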
Mappings need to be created
from scratch (single-scenario use case)
→ mapping A
by reusing previous mappings (multi-scenario use case)
mapping B + mapping C → mapping
(Semi-)automatic methods are preferred
mapping
manual
(semi-)automatic
Still a number of challenges left
dealing with complex data (schemas)
not all techniques work on single-scenario use cases
Dealing with complex data (schemas)
e.g., when the class of an entity does not depend on the table, but on a value
rule: if genre is ‘fiction’,
class is ex:FictionAuthor
else
class is ex:NonFictionAuthor
table: authors
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
Not all techniques work on single-scenario use cases
because they rely on readily available previous mappings
multi: scenario A → mapping → (results in reuse) → scenario B
single: ? → (results in reuse?) → scenario B
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Current solutions
What knowledge is used?
How is this knowledge used?
What knowledge is not used?
What do current solutions use?
knowledge from the mapping process
existing knowledge outside the mapping process
Knowledge from mapping process is used
data
data schema
ontologies
not all elements are required
Existing knowledge is used
data
data schemas
mappings
ontologies
Linked Data
not all elements are required
How is all this knowledge used?
data schema + existing ontology
data + existing mapping
Data schema + existing ontology
1. data schema → new ontology
2. new ontology ↔ existing ontology (match)
3. matches → mapping
Data + existing mapping
1. data → classes, properties
2. existing mapping → classes, properties, model
3. classes, properties, model → mapping
These methods are not combined
only a single method is used
combining multiple methods has not been explored
What knowledge do current solutions not use?
not all knowledge from previous mappings
neglect query workload
Not all knowledge from previous mappings is used
data transformations
to lowercase
substring
conditions: if-else rules
Query workload is neglected
queries to be executed on the not-yet-existing Linked Dataset
queries contain knowledge
model
used ontologies
annotations
select * where {
?s a ex:FictionAuthor .
?s ex:fullname ?n .
}
table: authors
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
ontology to use: http://example.com
model + annotations: ex:FictionAuthor, ex:fullname
How can we use queries?
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Research questions
discover existing knowledge
use discovered knowledge
Question 1: how can we discover
existing knowledge that is relevant?
?
mappings
ontologies
(Linked) Data
query workload
data schema
existing
mapping
Question 2: how can we use the discovered knowledge
to generate a new mapping?
mapping
mappings
ontologies
(Linked) Data
query workload
data
data schema
ontologies
query workload
data schema
existing mapping process
Overview
problem statement
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Hypotheses
improve quality
decrease task complexity
Hypothesis 1: using existing knowledge improves
the quality of a new single-scenario mapping.
quality → fitness for use
Hypothesis 2: using existing knowledge
decreases the task complexity of the mapping process.
Liu and Li developed a model to measure task complexity.
5 characteristics that influence the task’s performance
Task complexity has 5 characteristics
input: e.g., data, ontologies, user feedback
output: Linked Data, mapping
process: steps, user actions
duration: time to complete task
presentation: user interface
Overview
problem statement
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Two aspects need to be tackled
discover existing knowledge
use knowledge
both can be tackled separately
Discover existing knowledge
infer knowledge from mapping process where possible
find relevant other existing knowledge via similarity metrics
Infer knowledge from mapping process
e.g., infer data schema from data
e.g., infer ontology from queries
Infer data schema from data
table: authors
id  name           genre
0   J.K. Rowling   fiction
1   George Orwell  non-fiction
table: authors
columns: id, name, genre
id: index, integer
name: string
genre: string (‘fiction’ or ‘non-fiction’)
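A minimal sketch of how such a schema could be inferred from the rows alone, assuming every value arrives as a string (as it would from a CSV file). The helper names `infer_column` and `infer_schema` are hypothetical, and real data would need more careful type detection than this.

```python
# The "authors" table, with all values as strings (e.g. read from CSV).
rows = [
    {"id": "0", "name": "J.K. Rowling", "genre": "fiction"},
    {"id": "1", "name": "George Orwell", "genre": "non-fiction"},
]


def infer_column(values):
    """Guess the type, uniqueness, and distinct values of one column."""
    is_int = all(v.lstrip("-").isdigit() for v in values)
    return {
        "type": "integer" if is_int else "string",
        "unique": len(set(values)) == len(values),
        "distinct": sorted(set(values)),
    }


def infer_schema(rows):
    """Infer a per-column description from a non-empty list of row dicts."""
    columns = list(rows[0].keys())
    return {c: infer_column([r[c] for r in rows]) for c in columns}


schema = infer_schema(rows)
for column, info in schema.items():
    print(column, info)
```

With only two rows every column looks unique, so a uniqueness guess like this is only meaningful on larger samples. The list of distinct values is what would reveal that genre is an enumeration of 'fiction' and 'non-fiction'.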
Infer ontology from queries
select * where {
?s a ex:FictionAuthor .
?s ex:fullname ?n .
}
http://example.com
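As a rough illustration, the classes and properties used in the query above can be pulled out with a regular expression. This sketch only handles simple `subject predicate object .` patterns like the ones on the slide. It is not a SPARQL parser, and the function name `terms_from_query` is an assumption.

```python
import re

query = """
select * where {
  ?s a ex:FictionAuthor .
  ?s ex:fullname ?n .
}
"""


def terms_from_query(query):
    """Collect class and property names from simple triple patterns."""
    classes, properties = set(), set()
    # Only look at the text between the outermost braces.
    body = query[query.index("{") + 1 : query.rindex("}")]
    # Matches "subject predicate object ." with whitespace-free terms.
    pattern = re.compile(r"(\S+)\s+(\S+)\s+(\S+)\s*\.")
    for _subject, predicate, obj in pattern.findall(body):
        if predicate == "a":
            # "a" is shorthand for rdf:type, so the object is a class.
            classes.add(obj)
        elif not predicate.startswith("?"):
            properties.add(predicate)
    return classes, properties


classes, properties = terms_from_query(query)
print(classes, properties)
```

Resolving the `ex:` prefix to `http://example.com` would then identify which ontology the query workload expects.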
Find relevant existing knowledge via similarity metrics
mapping process
mapping
1. determine similarity
2. consider in mapping process
existing
table: authors
columns: id, name, genre
id: index, integer, unique
name: string
genre: string (‘fiction’ or ‘non-fiction’)
table: author
columns: id, fullname, genres
id: index, integer
fullname: string
genres: string
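One possible similarity metric for the two schemas above compares column names as strings. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. Choosing this metric and averaging the best match per column are assumptions made for illustration only, since which metrics to use and how to combine them is exactly what the PhD leaves open.

```python
from difflib import SequenceMatcher

new_columns = ["id", "name", "genre"]             # table: authors
existing_columns = ["id", "fullname", "genres"]   # table: author


def name_similarity(a, b):
    """String similarity of two column names, between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def schema_similarity(cols_a, cols_b):
    """Average, over cols_a, of the best name match found in cols_b."""
    best = [max(name_similarity(a, b) for b in cols_b) for a in cols_a]
    return sum(best) / len(best)


score = schema_similarity(new_columns, existing_columns)
unrelated = schema_similarity(new_columns, ["price", "weight"])
print(round(score, 2), round(unrelated, 2))
```

A high score for the `author` table and a low score for an unrelated table would suggest that the mapping attached to `author` is worth considering in the new mapping process.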
Similarity metrics on different elements or combinations of elements
metrics on data schema, ontologies, data, and query workload
PhD:
Which metrics do we use?
How do we combine the different metrics?
Two aspects need to be tackled
discover existing knowledge
use knowledge
Use knowledge
work with existing methods, e.g.:
data schema + existing ontology
data + existing mappings
PhD:
how do we include new knowledge?
how do we combine these methods?
Overview
problem statement
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Preliminary Results
RMLEditor
RMLWorkbench
mapping generation approaches
hierarchical data analysis
RMLEditor eases the creation of mappings
GUI so domain experts can create mappings
users can view the data, mappings, and RDF triples
usable by both non-SW and SW experts
PhD: present mappings to get feedback during mapping process
RMLWorkbench eases generation and publication
graphical user interface so domain experts can administer
Linked Data generation
publication workflow
PhD: manage elements of the mapping generation process
Identified mapping generation approaches
data-driven
schema-driven
model-driven
result-driven
PhD:
provides insights on how users work
this can be applied when developing a (semi-)automatic approach
Developed tool for data analysis on hierarchical data
efficient discovery of unique identifiers in hierarchical data
PhD: to infer knowledge within the mapping process
Overview
problem
current solutions
research questions
hypotheses
research methodology & approach
preliminary results
evaluation plan
Evaluation Plan
mapping quality
task complexity
Evaluate mapping quality
existing benchmark RODI
great for tabular data
no support for other formats, such as hierarchical data formats
Evaluate task complexity via 5 characteristics
input: e.g., data, ontologies, user feedback
output: Linked Data, mapping
process: steps, user actions
duration: time to complete task
presentation: user interface
Current evaluations are limited to a single aspect
only duration
only number of user actions
only precision and recall
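For reference, precision and recall of generated triples against a reference set can be computed as below. This is a generic sketch and not the RODI benchmark's procedure. The triples reuse the authors example, and the single wrong class in `generated` is made up for illustration.

```python
def precision_recall(generated, reference):
    """Precision and recall of generated triples against a reference set."""
    generated, reference = set(generated), set(reference)
    correct = generated & reference
    precision = len(correct) / len(generated) if generated else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


reference = {
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", "J.K. Rowling"),
    ("ex:1", "a", "ex:NonFictionAuthor"),
    ("ex:1", "ex:fullname", "George Orwell"),
}
# Hypothetical output of a mapping that ignores the genre condition.
generated = {
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", "J.K. Rowling"),
    ("ex:1", "a", "ex:FictionAuthor"),
    ("ex:1", "ex:fullname", "George Orwell"),
}

precision, recall = precision_recall(generated, reference)
print(precision, recall)
```

Such a score captures only the output characteristic of the task. It says nothing about input, process, duration, or presentation.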
Roundup
improve single-scenario mappings by discovering and using existing knowledge
Which similarity metrics do we use for discovery?
How do we use and combine
the different methods and knowledge?

 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 

Ontology-Based Data Access Mapping Generation using Data, Schema, Query, and Mapping Knowledge

  • 1. Ontology-Based Data Access Mapping Generation via Data, Schema, Query, and Mapping Knowledge Pieter Heyvaert pheyvaer.heyvaert@ugent.be
  • 2. Semantic Web technologies rely on Linked Data querying visualizations publishing
  • 3. But not all data is accessible as Linked Data databases XML files JSON files
  • 4. Solutions to provide access exist manual: completely done by the user semi-automatic: users provide feedback automatic: no user interaction required
  • 5. But they have limitations limited to specific use cases limited support for complex use cases
  • 6. PhD’s goal: improve access to Linked Data
  • 7. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 8. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 9. How do we provide access? non-Linked Data Linked Data ?
  • 10. How do we provide access? non-Linked Data Linked Data ? id name genre 0 J.K. Rowling fiction 1 George Orwell non-fiction table: authors
  • 11. Apply mappings on non-Linked Data non-Linked Data Linked Data mapping mapping: rules to generate RDF terms and triples using data and ontologies
  • 12. Apply mappings on non-Linked Data non-Linked Data Linked Data mapping id name genre 0 J.K. Rowling fiction 1 George Orwell non-fiction table: authors rule: create url from id rule: name is value for ex:fullname rule: if genre is ‘fiction’ class is ex:FictionAuthor else class is ex:NonFictionAuthor
  • 13. Apply mappings on non-Linked Data non-Linked Data Linked Data mapping id name genre 0 J.K. Rowling fiction 1 George Orwell non-fiction table: authors ex:0 a ex:FictionAuthor . ex:0 ex:fullname ‘J.K. Rowling’ . ex:1 a ex:NonFictionAuthor . ex:1 ex:fullname ‘George Orwell’ .
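The three rules on slide 12 and the triples on slide 13 can be reproduced with a few lines of Python. This is only a minimal sketch for illustration: the function name `apply_mapping` and the tuple representation of triples are assumptions, not part of the presented work, which expresses mappings declaratively rather than in code.

```python
# Rows of the "authors" table from the slides.
AUTHORS = [
    {"id": 0, "name": "J.K. Rowling", "genre": "fiction"},
    {"id": 1, "name": "George Orwell", "genre": "non-fiction"},
]


def apply_mapping(rows, prefix="ex:"):
    """Apply the three slide rules to each row (hypothetical helper)."""
    triples = []
    for row in rows:
        # rule: create url from id
        subject = f"{prefix}{row['id']}"
        # rule: the class depends on the value of genre, not on the table
        cls = "FictionAuthor" if row["genre"] == "fiction" else "NonFictionAuthor"
        triples.append((subject, "a", f"{prefix}{cls}"))
        # rule: name is value for ex:fullname
        triples.append((subject, f"{prefix}fullname", f'"{row["name"]}"'))
    return triples


for s, p, o in apply_mapping(AUTHORS):
    print(s, p, o, ".")
```

The conditional on `genre` is the part that slide 17 later calls out as hard for existing approaches, because the class is decided by a cell value.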
  • 14. Mappings need to be created from scratch (single-scenario use case) mapping A by reusing previous mappings (multi-scenario use case) mapping B mapping C mapping
  • 15. (Semi-)automatic methods are preferred mapping manual (semi-)automatic
  • 16. Still a number of challenges left dealing with complex data (schemas) not all techniques work on single-scenario use cases
  • 17. Dealing with complex data (schemas) e.g., when the class of an entity does not depend on the table, but on a value rule: if genre is ‘fiction’, class is ex:FictionAuthor else class is ex:NonFictionAuthor id name genre 0 J.K. Rowling fiction 1 George Orwell non-fiction table: authors
  • 18. Not all techniques work on single-scenario use cases because they rely on readily available previous mappings (diagram: multi-scenario: the mapping of scenario A is reused for scenario B; single-scenario: no previous mapping is available to reuse for scenario B)
  • 19. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 20. Current solutions What knowledge is used? How is this knowledge used? What knowledge is not used?
  • 21. What do current solutions use? knowledge from the mapping process existing knowledge outside the mapping process
  • 22. Knowledge from mapping process is used data data schema ontologies not all elements are required
  • 23. Existing knowledge is used data data schemas mappings ontologies Linked Data not all elements are required
  • 24. How is all this knowledge used? data schema + existing ontology data + existing mapping
  • 25. Data schema + existing ontology data schema new ontology 1
  • 26. Data schema + existing ontology data schema existing ontology new ontology match 1 2 2
  • 27. Data schema + existing ontology data schema existing ontology new ontology match mapping 1 2 2 3
  • 28. Data + existing mapping data classes properties 1
  • 29. Data + existing mapping data existing mapping classes properties classes properties model 1 2 2 2
  • 30. Data + existing mapping data existing mapping classes mapping properties classes properties model 1 2 2 2 3 3 3
  • 31. These methods are not combined only a single method is used combining multiple methods has not been explored
  • 32. What knowledge do current solutions not use? not all knowledge from previous mappings neglect query workload
  • 33. Not all knowledge from previous mappings is used data transformations to lowercase substring conditions: if-else rules
  • 34. Query workload is neglected: queries to be executed on the not yet existing Linked Data set; queries contain knowledge (model, used ontologies, annotations)
  • 35. select * where { ?s a ex:FictionAuthor . ?s ex:fullname ?n . } id name genre 0 J.K. Rowling fiction 1 George Orwell non-fiction table: authors ontology to use: http://example.com model + annotations: ex:FictionAuthor ex:fullname How can we use queries?
  • 36. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 37. Research questions discover existing knowledge use discovered knowledge
  • 38. Question 1: how can we discover existing knowledge that is relevant? ?mappings ontologies (Linked) Data query workload data schema existing mapping
  • 39. Question 2: how can we use the discovered knowledge to generate a new mapping? mapping mappings ontologies (Linked) Data query workload data data schema ontologies query workload data schema existing mapping process
  • 40. Overview problem statement research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 42. Hypothesis 1: using existing knowledge improves the quality of a new single-scenario mapping. quality → fitness for use
  • 43. Hypothesis 2: using existing knowledge decreases the task complexity of the mapping process. Liu and Li developed a model to measure task complexity: 5 characteristics that influence the task’s performance
  • 44. Task complexity has 5 characteristics input: e.g., data, ontologies, user feedback; output: Linked Data, mapping; process: steps, user actions; duration: time to complete task; presentation: user interface
  • 45. Overview problem statement research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 46. Two aspects need to be tackled discover existing knowledge use knowledge both can be tackled separately
  • 47. Discover existing knowledge infer knowledge from mapping process where possible find relevant other existing knowledge via similarity metrics
  • 48. Infer knowledge from mapping process e.g., infer data schema from data e.g., infer ontology from queries
  • 49. Infer data schema from data id name genre 0 J.K. Rowling fiction 1 George Orwell non-fiction table: authors table: authors columns: id, name, genre id: index, integer name: string genre: string (‘fiction’ or ‘non-fiction’)
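Inferring a schema like the one on slide 49 can be sketched as a single pass over the rows. The function `infer_schema` and its output layout are assumptions made for illustration. The sketch infers only a simple type and uniqueness per column; it does not try to recover the value set of `genre`, which two rows cannot reliably establish.

```python
# Rows of the "authors" table from the slides.
AUTHORS = [
    {"id": 0, "name": "J.K. Rowling", "genre": "fiction"},
    {"id": 1, "name": "George Orwell", "genre": "non-fiction"},
]


def infer_schema(table_name, rows):
    """Infer column names, a simple type, and uniqueness (hypothetical helper)."""
    columns = list(rows[0].keys()) if rows else []
    schema = {"table": table_name, "columns": {}}
    for col in columns:
        values = [row[col] for row in rows]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = all(isinstance(v, int) and not isinstance(v, bool) for v in values)
        schema["columns"][col] = {
            "type": "integer" if is_int else "string",
            "unique": len(set(values)) == len(values),
        }
    return schema


schema = infer_schema("authors", AUTHORS)
for name, info in schema["columns"].items():
    print(name, info)
```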
  • 50. Infer ontology from queries select * where { ?s a ex:FictionAuthor . ?s ex:fullname ?n . } http://example.com
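One way to pull the classes and properties out of the query on slide 50 is a small pattern match over its triple patterns. This is a rough sketch, not a SPARQL parser: the regular expression only handles simple `subject predicate object .` patterns, and the `PREFIXES` table is an assumption, since the slide query does not declare the `ex` prefix itself.

```python
import re

QUERY = """
select * where {
  ?s a ex:FictionAuthor .
  ?s ex:fullname ?n .
}
"""

# Assumption: the prefix binding is supplied by hand, using the
# namespace shown on the slide.
PREFIXES = {"ex": "http://example.com"}

# Matches simple "subject predicate object ." triple patterns only.
TRIPLE = re.compile(r"(\S+)\s+(\S+)\s+(\S+)\s*\.")


def terms_from_query(query, prefixes):
    """Return (classes, properties, ontologies) referenced by the query."""
    body = query[query.index("{") + 1 : query.rindex("}")]
    classes, properties = set(), set()
    for _subject, predicate, obj in TRIPLE.findall(body):
        if predicate in ("a", "rdf:type"):
            classes.add(obj)
        elif not predicate.startswith("?"):
            properties.add(predicate)
    ontologies = {
        prefixes[term.split(":", 1)[0]]
        for term in classes | properties
        if term.split(":", 1)[0] in prefixes
    }
    return classes, properties, ontologies


classes, properties, ontologies = terms_from_query(QUERY, PREFIXES)
print(classes, properties, ontologies)
```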
  • 51. Find relevant existing knowledge via similarity metrics mapping process mapping 1. determine similarity 2. consider in mapping process existing table: authors columns: id, name, genre id: index, integer, unique name: string genre: string (‘fiction’ or ‘non-fiction’) table: author columns: id, fullname, genres id: index, integer fullname: string genres: string
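As an illustration of step 1 on slide 51, the two schemas can be compared on column names alone. The metric below (average best string match per column, using `difflib.SequenceMatcher` from the Python standard library) is an assumption chosen for brevity; which metrics to use and how to combine them is exactly the open question stated on the next slide.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """String similarity in [0, 1] between two column names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def schema_similarity(cols_a, cols_b):
    """Average, over the columns of A, of the best name match in B."""
    if not cols_a or not cols_b:
        return 0.0
    best = [max(name_similarity(a, b) for b in cols_b) for a in cols_a]
    return sum(best) / len(best)


new_schema = ["id", "name", "genre"]           # table: authors
existing_schema = ["id", "fullname", "genres"]  # table: author

print(round(schema_similarity(new_schema, existing_schema), 2))
```

Note that this measure is asymmetric and ignores column types, uniqueness, and the data itself, all of which the slide lists as available evidence.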
  • 52. Similarity metrics on different/combination of elements metrics on data schema, ontologies, data, and query workload PhD: Which metrics do we use? How do we combine the different metrics?
  • 53. Two aspects need to be tackled discover existing knowledge use knowledge
  • 54. Use knowledge work with existing methods, e.g.: data schema + existing ontology data + existing mappings PhD: how do we include new knowledge? how do we combine these methods?
  • 55. Overview problem statement research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 56. Preliminary Results RMLEditor RMLWorkbench mapping generation approaches hierarchical data analysis
  • 57. RMLEditor eases the creation of mappings GUI so domain experts can create mappings users can view the data, mappings, and RDF triples usable by both non-SW and SW experts PhD: present mappings to get feedback during mapping process
  • 58. RMLWorkbench eases generation and publication graphical user interface so domain experts can administer Linked Data generation publication workflow PhD: manage elements of the mapping generation process
  • 59. Identified mapping generation approaches data-driven schema-driven model-driven result-driven PhD: provides insights on how users work this can be applied when developing a (semi-)automatic approach
  • 60. Developed tool for data analysis on hierarchical data efficient discovery of unique identifiers in hierarchical data PhD: to infer knowledge within the mapping process
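To make the idea of identifier discovery in hierarchical data concrete, the following naive sketch flattens nested records into paths and keeps the paths whose values occur once per record and never repeat across records. It is not the tool mentioned on slide 60 and makes no claim to its efficiency; the helper names and the nested `books` entries are hypothetical.

```python
def leaf_paths(node, path=""):
    """Yield (path, value) pairs for every leaf of a nested dict/list."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from leaf_paths(child, f"{path}/{key}")
    elif isinstance(node, list):
        for child in node:
            yield from leaf_paths(child, path)
    else:
        yield path, node


def unique_identifier_paths(records):
    """Return paths with one value per record and no value shared between records."""
    per_path = {}
    for record in records:
        seen = {}
        for path, value in leaf_paths(record):
            seen.setdefault(path, []).append(value)
        for path, values in seen.items():
            per_path.setdefault(path, []).append(values)
    result = []
    for path, per_record in per_path.items():
        present_everywhere = len(per_record) == len(records)
        single_valued = all(len(values) == 1 for values in per_record)
        flat = [values[0] for values in per_record]
        if present_everywhere and single_valued and len(set(flat)) == len(flat):
            result.append(path)
    return sorted(result)


# Hypothetical hierarchical version of the authors data; book titles are placeholders.
RECORDS = [
    {"id": 0, "name": "J.K. Rowling", "books": [{"title": "A"}, {"title": "B"}]},
    {"id": 1, "name": "George Orwell", "books": [{"title": "C"}]},
]

print(unique_identifier_paths(RECORDS))
```

With only a handful of records, non-identifying fields can look unique by coincidence, so the result is best read as a list of candidates.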
  • 61. Overview problem current solutions research questions hypotheses research methodology & approach preliminary results evaluation plan
  • 63. Evaluate mapping quality existing benchmark RODI great for tabular data no support for other formats, such as hierarchical data formats
  • 64. Evaluate task complexity via 5 characteristics input: e.g., data, ontologies, user feedback; output: Linked Data, mapping; process: steps, user actions; duration: time to complete task; presentation: user interface
  • 65. Current evaluations are limited to a single aspect: only duration, only number of user actions, only precision and recall
  • 66. Roundup improve single-scenario mappings by discovering and using existing knowledge What similarity metrics do we use for discovery? How do we use and combine the different methods and knowledge?
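For reference, precision and recall of a generated mapping are commonly computed by comparing the triples it produces against a reference set. The sketch below assumes triples are plain tuples and uses the four triples from slide 13 as the reference; the "generated" set with one wrong class is made up for the example.

```python
def precision_recall(generated, reference):
    """Precision and recall of generated triples against a reference set."""
    generated, reference = set(generated), set(reference)
    correct = generated & reference
    precision = len(correct) / len(generated) if generated else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


REFERENCE = {
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", "J.K. Rowling"),
    ("ex:1", "a", "ex:NonFictionAuthor"),
    ("ex:1", "ex:fullname", "George Orwell"),
}

# Hypothetical output of a mapping that ignores the genre condition.
GENERATED = {
    ("ex:0", "a", "ex:FictionAuthor"),
    ("ex:0", "ex:fullname", "J.K. Rowling"),
    ("ex:1", "a", "ex:FictionAuthor"),
    ("ex:1", "ex:fullname", "George Orwell"),
}

print(precision_recall(GENERATED, REFERENCE))
```

These two numbers say nothing about how long the mapping took to create or how many user actions it required, which is the limitation the slide points at.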