A Role for Provenance in Quality
Assessment


Chris Baillie, Pete Edwards, and Edoardo Pignotti
c.baillie@abdn.ac.uk
Overview

- Motivation
- Evaluating Data Quality
- A Role for Provenance
- Future work




Motivation

- “we don’t know whether the information we find [on the Web] is accurate or not. So we have to teach people how to assess what they’ve found”
  Vint Cerf, 2010

- The Web of documents has become the Web of documents, services, data, and people.

- Anyone can publish anything, so we need a way to evaluate quality.

- We are investigating these issues within the Internet of Things.
  - Sensors are now at the centre of many applications.


Example Scenario




Evaluating Data Quality
Entity (and context)
- To evaluate quality, we must examine the context around data

Data Requirements
- Fürber and Hepp (2011) use rules to identify quality problems

WIQA Framework
- Examines data content, context, and external ratings (Bizer et al. 2009)

Quality Scores
- Quality is a multi-dimensional construct
  - Accuracy
  - Timeliness
  - Relevance

F(E, R) = Q
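
One way to read F(E, R) = Q is as a function that applies a set of data requirements R to an entity E (with its context) and returns one score per quality dimension. The Python sketch below illustrates that reading only; it is not the framework's implementation. The `Entity` fields, the `assess` helper, the clamping to [0, 1], and the 600-second timeliness window are all hypothetical. Only the relevance expression follows the rule shown later in the talk.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Entity:
    """An observation plus the context a rule may need (hypothetical fields)."""
    distance_from_route: float  # same quantity the relevance rule uses
    age_seconds: float          # assumed context value, not from the slides


# A requirement maps an entity to a raw score for one quality dimension.
Requirement = Callable[[Entity], float]


def clamp(x: float) -> float:
    """Keep scores in [0, 1]; this bound is an assumption of the sketch."""
    return max(0.0, min(1.0, x))


def assess(entity: Entity, requirements: Dict[str, Requirement]) -> Dict[str, float]:
    """F(E, R) = Q: apply every requirement in R to E, one score per dimension."""
    return {dim: clamp(rule(entity)) for dim, rule in requirements.items()}


requirements: Dict[str, Requirement] = {
    "relevance": lambda e: 1 - e.distance_from_route / 100,
    "timeliness": lambda e: 1 - e.age_seconds / 600,  # assumed 10-minute window
}

scores = assess(Entity(distance_from_route=25.0, age_seconds=150.0), requirements)
print(scores)
```

Keeping the requirements in a dictionary keyed by dimension means new dimensions (accuracy, availability) can be added without touching `assess`.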




Representing Sensor Observations


- Linked Data: “recommended best practice for exposing, sharing, and connecting pieces of data using URIs and RDF”




Performing Quality Assessment




R_relevance = 1 - (E distanceFromRoute X) / 100

CONSTRUCT {
  _:b0 a QualityScore .
  _:b0 score ?qs .
  _:b0 dqm:ruleViolation _:b1 .
  _:b1 a DataRequirementViolation .
  _:b1 dqm:affectedInstance ?instance .
} WHERE {
  ?instance a Observation .
  ?instance distanceFromRoute ?distance .
  LET (?qs := (1 - (?distance / 100))) .
}
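
The arithmetic of the relevance rule can be checked by hand with a small Python sketch. It mirrors the shape of the query, producing one result per observation that has a distanceFromRoute value, but it is only an illustration: the dictionary layout, the `relevance_results` name, and the sample observations are hypothetical, and the slides do not state the unit of distance.

```python
from typing import Any, Dict, List


def relevance_results(observations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build one quality-score record per observation that has a distance."""
    results = []
    for obs in observations:
        distance = obs.get("distanceFromRoute")
        if distance is None:
            continue  # nothing matches the WHERE pattern, so nothing is built
        results.append({
            "type": "QualityScore",
            "score": 1 - distance / 100,  # same arithmetic as the LET expression
            "affectedInstance": obs["id"],
        })
    return results


# Hypothetical sample data.
observations = [
    {"id": "obs1", "distanceFromRoute": 0.0},    # on the route
    {"id": "obs2", "distanceFromRoute": 50.0},
    {"id": "obs3"},                              # no distance recorded
    {"id": "obs4", "distanceFromRoute": 150.0},  # far from the route
]

results = relevance_results(observations)
for r in results:
    print(r["affectedInstance"], r["score"])
```

Note that the expression has no lower bound: a distance above 100 gives a negative score, so a consumer of the scores may want to clamp or filter such values.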



Quality Assessment Results




Observation Provenance
- Provenance is a critical part of observation context
- Describes the entities, agents, and activities involved in data creation:
  - How was the observation value measured?
  - Who controlled the sensing process?
  - How has the observation been transformed since it was created?
- The W3C PROV-O model provides a linked data representation of provenance
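
A provenance record of this kind can be held as subject-predicate-object statements and walked to answer the questions above. The sketch below uses plain Python tuples rather than an RDF library, so it is an illustration and not a PROV-O implementation. The node and relation names follow the example graph in this talk; attaching wasAssociatedWith to "Sensing Process" is an assumption read from the diagram layout, and the `lineage` and `agents_behind` helpers are hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Statements taken from the talk's example graph (edge placement partly assumed).
PROVENANCE: List[Triple] = [
    ("Observation 2", "wasGeneratedBy", "Map matching"),
    ("Map matching", "used", "Observation 1"),
    ("Observation 1", "wasGeneratedBy", "Sensing Process"),
    ("Sensing Process", "wasAssociatedWith", "Chris"),
    ("Sensing Process", "used", "iPhoneSensor"),
]


def lineage(start: str, triples: List[Triple]) -> List[Triple]:
    """Return every statement reachable from `start`, breadth-first."""
    seen: Set[str] = {start}
    frontier = [start]
    found: List[Triple] = []
    while frontier:
        node = frontier.pop(0)
        for s, p, o in triples:
            if s == node:
                found.append((s, p, o))
                if o not in seen:
                    seen.add(o)
                    frontier.append(o)
    return found


def agents_behind(start: str, triples: List[Triple]) -> List[str]:
    """Who controlled any activity in the history of `start`?"""
    return sorted({o for _, p, o in lineage(start, triples) if p == "wasAssociatedWith"})


print(agents_behind("Observation 2", PROVENANCE))
for statement in lineage("Observation 1", PROVENANCE):
    print(statement)
```

A quality rule that examines provenance could start from a traversal like this, for example by scoring an observation according to the agents or sensors found in its history.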
Observation Provenance
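Entity "Observation 2" wasGeneratedBy Activity "Map matching"
Activity "Map matching" used Entity "Observation 1"
Entity "Observation 1" wasGeneratedBy Activity "Sensing Process"
Activity "Sensing Process" wasAssociatedWith Agent "Chris"
Activity "Sensing Process" used Entity "iPhoneSensor"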
Quality Score Provenance
Work To Date
- Developed a Quality Assessment Framework that enables:
  - Linked data representation of sensor observations
  - Definition of quality requirements using SPARQL rules
  - Generation of quality scores via reasoning

Future Work
- Implementation of quality rules that examine provenance
- Investigation of quality score re-use
Any questions?




Come and see the IRP demo (D9) to see quality assessment in action.
Implementation
- Observation Triple Store
- Reasoner (SPIN)
- Quality Rules: Relevance Rule, Timeliness Rule, Accuracy Rule, Availability Rule
- Apache Tomcat
- Observation Service
- Quality Service

Editor's notes

  1. In this talk I will outline why the need for quality assessment exists, describe how quality is perceived, outline our approach to quality assessment, provide an example scenario, and outline our future work.
  2. Don’t know whether information is accurate: need to assess! Web has evolved. Web = open platform. Web is big, so need a smaller platform for evaluation.
  3. Consider mobile phones providing passenger information regarding the location of buses. Sometimes we get lucky and observations land right on the bus route. However, there are many different sources of low-quality data: inaccurate GPS readings; malicious users, such as someone playing with the app while at home; and people who make mistakes, such as someone on the wrong bus.
  4. Animate this ObservationValue ->[Motivate SSN here] Observation + foi -> disruption report
  5. DataRequirement1 -> wasAttributedTo -> Agent