Evolution of Workflow Provenance Information in the Presence of Custom Inference Rules
C. Strubulis, Y. Tzitzikas, M. Doerr and G. Flouris
{strubul, tzitzik, martin, fgeo}@ics.forth.gr

Institute of Computer Science
Foundation for Research and Technology – Hellas
ICS-FORTH

Computer Science Department
University of Crete

SWPM Workshop
28/05/2012
Crete, Herakleion
SWPM 2012                                          2



Outline
 • Provenance-based inference rules
 • Knowledge evolution of provenance information



What is Provenance?
 • Etymology
   ▫ French verb “provenir”
   ▫ Meaning: to come forth, originate

 • The Merriam-Webster Online Dictionary
   ▫ the origin, source
   ▫ the history of ownership of a valued object or work
     of art or literature.



Why is Provenance Important?
 [Figure: two example images, anon4877_base_20060331.jpg and anon4877_lesion_20060401.jpg]

 • Reproducibility
 • Data quality
 • Attribution
 • Informational

 • How were these images created?
 • Was any pre-processing applied to the raw data?
 • Who created them?
 • What’s the difference?
 • Are they really from the same patient?

 Provenance is as (or more!) important as the experimental results



Motivation
 • Motivation
   ▫ Reduce the amount of provenance information
     that has to be stored (produced by scientific
     workflow systems)
   ▫ Reduce the time and human effort in the case of
     manual ingestion of provenance metadata in
     repositories
   ▫ Elimination of errors starting from the input
     data
       Reduction of error search space -> Easier error
        correction



  Storage Space Challenge
[Figure: storage space (y-axis) over time (x-axis), showing data production plus the extra data for provenance; annotation: Adoption of Inference mechanisms]



User Input Challenge
[Figure: Record Provenance, Digital Metadata, Ingestion, RDF/S Repository]



User Input Challenge
[Figure: timeline starting at "Day 1" (axis: Time); annotation: Adoption of Inference mechanisms]



   Error Correction Challenge
[Figure: pipeline stages Scientific workflow, Provenance record, Data ingestion, Metadata; questions raised: "Which one to correct?" and "From where to start searching?"; annotation: Adoption of Inference mechanisms]



   Approach-Results
• Our Approach
  ▫ Dynamic completion of the stored knowledge by
    logical assumptions – inferences

  ▫ Identify and specify some basic provenance-based
    inference rules
    [Figure: Query results + Inferences]
  ▫ In addition, we tackle the knowledge evolution
    requirements
     The question is how we can satisfy update requests
      while still supporting the aforementioned
      provenance-based inference rules.
     Operations: Disassociation, Contraction
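
The idea of completing the stored knowledge at query time can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: facts are assumed to be stored as (subject, property, object) tuples, the inference used is the actor rule (R1, stated later in the deck), and the function name `actors_of` is hypothetical.

```python
# Hypothetical triple encoding; the labels reuse the property names shown on the slides.
P9 = "P9 forms part of"
P14 = "P14 carried out by"

def actors_of(explicit, activity):
    """Return every actor explicitly or implicitly linked to `activity`.

    Only explicit facts are stored. Implicit ones are derived on demand by
    walking up the "forms part of" hierarchy and collecting the actors found.
    """
    actors, seen, todo = set(), set(), [activity]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        for s, p, o in explicit:
            if s != current:
                continue
            if p == P14:
                actors.add(o)      # actor attached at this level
            elif p == P9:
                todo.append(o)     # climb to the enclosing activity
    return actors

kb = {
    ("Laser scanning acquisition", P14, "Starc Institute"),
    ("Detailed sequence of shots", P9, "Laser scanning acquisition"),
    ("Capture 1", P9, "Detailed sequence of shots"),
}
print(actors_of(kb, "Capture 1"))  # prints {'Starc Institute'}
```

Only one explicit "carried out by" fact is stored here; the answer for Capture 1 is the query result plus the inference.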



 The Assumed Provenance Model
 • Most provenance models have similar concepts!
 • Small part of CIDOC CRM
 [Figure: classes E5 Event, E7 Activity (IsA E5 Event), E39 Actor, E22 Man-made Object (Device) (IsA E24 Physical Man-Made Thing) and E73 Information Object; properties P9 forms part of, P12 was present at, P14 carried out by, P16 was used for, P46 forms part of and P128 carries]



    The three inference rules
R1:
If an actor has carried out one activity,
then (s)he has carried out all of its subactivities.


R2:
If an object (device) was used for an event,
then all parts of that object were also used for
that event.

R3:
If a physical object that carries an information object
was present at an event, then that information object
was also present at the event.
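
The three rules can be read as forward-chaining rules over a set of facts. The sketch below is one possible encoding, not the authors' implementation. It assumes facts are (subject, property, object) tuples whose direction follows the slide labels (for example, an activity is "carried out by" an actor, and a part "forms part of" a whole); the function names `apply_rules` and `closure` are hypothetical.

```python
# Hypothetical triple encoding; the labels reuse the property names shown on the slides.
P9 = "P9 forms part of"       # subactivity -> activity
P14 = "P14 carried out by"    # activity -> actor
P16 = "P16 was used for"      # object -> event
P46 = "P46 forms part of"     # part -> object
P12 = "P12 was present at"    # thing -> event
P128 = "P128 carries"         # physical object -> information object

def apply_rules(kb):
    """Return the facts that R1, R2 and R3 derive in one step and that are not in kb yet."""
    new = set()
    for s1, p1, o1 in kb:
        for s2, p2, o2 in kb:
            if p1 == P14 and p2 == P9 and o2 == s1:
                new.add((s2, P14, o1))   # R1: actor carries out the subactivity
            if p1 == P16 and p2 == P46 and o2 == s1:
                new.add((s2, P16, o1))   # R2: part of the object was used too
            if p1 == P12 and p2 == P128 and s2 == s1:
                new.add((o2, P12, o1))   # R3: carried information was present too
    return new - kb

def closure(explicit):
    """Apply the rules repeatedly until no new fact appears."""
    kb = set(explicit)
    while True:
        new = apply_rules(kb)
        if not new:
            return kb
        kb |= new

explicit = {
    ("Laser scanning acquisition", P14, "Starc Institute"),
    ("Detailed sequence of shots", P9, "Laser scanning acquisition"),
    ("Capture 1", P9, "Detailed sequence of shots"),
    ("Multiviewdome device", P16, "Detailed sequence of shots"),
    ("Nikon D90", P46, "Multiviewdome device"),
    ("Part of column of Ramesses II", P12, "3D reconstruction"),
    ("Part of column of Ramesses II", P128, "Information in hieroglyphics"),
}
inferred = closure(explicit) - explicit
for fact in sorted(inferred):
    print(fact)
```

With these seven explicit facts, the closure adds four implicit ones: Starc Institute for the two subactivities (R1), the Nikon D90 for the sequence of shots (R2), and the hieroglyphic information for the 3D reconstruction (R3).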



3D Reconstruction process



Actors - Activities
• If an actor has carried out one activity, then (s)he has carried out all of
  its subactivities.

[Figure: nodes Starc Institute, John, Laser scanning acquisition, Detailed sequence of shots, Capture 1 … Capture 10; edges labelled P14 carried out by and P9 forms part of]



Devices - Activities
 If an object was used for an event, then all parts of the object were
 used for that event too.

[Figure: nodes Detailed sequence of shots, Multiviewdome device, Nikon D90, AF-5_NIKKOR 18-105mm, ........, Nikon D300; edges labelled P16 was used for and P46 forms part of]



Information Objects - Events
 If a physical object that carries an information object was present
 at an event, then that information object was present at the event
 too.

[Figure: nodes 3D reconstruction, Detailed sequence of shots, ………, Capture 1_10, Part of column of Ramesses II, Information in hieroglyphics; edges labelled P12 was present at and P128 carries]



Outline
 • Provenance-based inference rules
 • Knowledge evolution of provenance information



Knowledge Evolution
[Figure labels: Evolution, Inference rules]



 • Updating our knowledge is essential!!

 • Requests for adding/removing information

 • The use of inference rules introduces difficulties
   with respect to the evolution of knowledge



    Example
• Consider a KB containing the activities of Laser Scan Acquisition
• Starc is propagated to all the subactivities of Laser Scan
  Acquisition by rule R1
• Update request:
   Starc was not responsible for Capture 1_1
• There are two ways to handle the request

[Figure: nodes Starc Institute, Laser scanning acquisition, Detailed sequence of shots, Capture 1, Capture 10; edges labelled P14 carried out by and P9 forms part of]



Foundational vs Coherence
 • Foundational Viewpoint
   ▫ Each piece of our knowledge serves as a
     justification for other beliefs
   ▫ Implicit facts are supported by the explicit ones
   ▫ Explicit knowledge is more important than
     implicit knowledge

 • Coherence Viewpoint
   ▫ Every piece of knowledge is self-justified
   ▫ Implicit knowledge needs no support from explicit knowledge
   ▫ Explicit and implicit knowledge have the same value



Deletion of a fact
 • Foundational:
   ▫ All implicit data that is no longer supported must also
     be deleted




  • Coherence:
     ▫ Delete implicit data only if it is necessary due to the
       deletion request
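
The difference between the two viewpoints can be illustrated with a small Python sketch restricted to rule R1. It is one possible reading of the two deletion semantics, not the authors' algorithm: facts are assumed to be (subject, property, object) tuples, and the function names `delete_foundational` and `delete_coherence` are hypothetical.

```python
# Hypothetical triple encoding; the labels reuse the property names shown on the slides.
P9 = "P9 forms part of"       # subactivity -> activity
P14 = "P14 carried out by"    # activity -> actor

def closure(explicit):
    """Explicit facts plus everything rule R1 derives from them."""
    kb = set(explicit)
    while True:
        new = {
            (sub, P14, actor)
            for (act, p, actor) in kb if p == P14
            for (sub, q, parent) in kb if q == P9 and parent == act
        } - kb
        if not new:
            return kb
        kb |= new

def self_and_superactivities(explicit, activity):
    """The activity itself and every activity it (transitively) forms part of."""
    found, todo = set(), [activity]
    while todo:
        current = todo.pop()
        if current not in found:
            found.add(current)
            todo.extend(o for (s, p, o) in explicit if s == current and p == P9)
    return found

def delete_foundational(explicit, activity, actor):
    """Drop the explicit facts that support the unwanted one; unsupported implicit facts vanish."""
    blockers = self_and_superactivities(explicit, activity)
    return {t for t in explicit
            if not (t[1] == P14 and t[2] == actor and t[0] in blockers)}

def delete_coherence(explicit, activity, actor):
    """Same deletion, but first make explicit every implicit fact that need not be lost."""
    blockers = self_and_superactivities(explicit, activity)
    survivors = {t for t in closure(explicit)
                 if t[1] == P14 and t[2] == actor and t[0] not in blockers}
    return delete_foundational(explicit, activity, actor) | survivors

explicit = {
    ("Laser scanning acquisition", P14, "Starc Institute"),
    ("Detailed sequence of shots", P9, "Laser scanning acquisition"),
    ("Capture 1", P9, "Detailed sequence of shots"),
    ("Capture 10", P9, "Detailed sequence of shots"),
}
after_f = closure(delete_foundational(explicit, "Capture 1", "Starc Institute"))
after_c = closure(delete_coherence(explicit, "Capture 1", "Starc Institute"))
print(("Capture 10", P14, "Starc Institute") in after_f)  # prints False
print(("Capture 10", P14, "Starc Institute") in after_c)  # prints True
```

In both cases the link between Starc Institute and Capture 1 is gone, and so are the links to its superactivities, since keeping them would re-derive the deleted fact through R1. Only the coherence variant keeps the link to Capture 10, because nothing in the request forces its removal.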



    Example
• Consider a KB containing the activities of Laser Scan Acquisition
• Starc is propagated to all the subactivities of Laser Scan
  Acquisition by rule R1
• Update request:
   Starc was not responsible for Capture 1_1
• Two cases:
   • Actor disassociation (foundational)
   • Actor contraction (coherence)

[Figure: nodes Starc Institute, Laser scanning acquisition, Detailed sequence of shots, Capture 1, Capture 10; edges labelled P14 carried out by and P9 forms part of]

• Update request:
   Starc was not responsible for Capture 1


[Figure: two side-by-side panels, Foundational (left) and Coherence (right), each showing Starc Institute, Laser scanning acquisition, Detailed sequence of shots, Capture 1 and Capture 10 with edges labelled P14 carried out by and P9 forms part of]



Complexity analysis
   • Similar operations can also be defined for
     rules R2 and R3



Conclusion
 • Provenance-based Inference Rules
   ▫   We motivated the need for provenance-based inference rules to
       reduce the storage space requirements and to ease the ingestion
       of metadata and the error correction
   ▫   We identified three basic rules accompanied by real-world examples.

 • Provenance-based Inference Rules and Knowledge
   Evolution
   ▫   The use of inference rules introduces difficulties with respect to the
       evolution of knowledge
   ▫   We identified two ways to deal with deletions in this context
   ▫   Even though we confined ourselves to CIDOC CRM, and to three specific
       inference rules, the general ideas behind our work (including the
       distinction between foundational and coherence semantics of
       deletion) can be applied to other models and/or sets of inference rules.




Thank you for your attention.



On repository policies
[Figure: diagram of seven repository policies, numbered 1 (bottom) to 7 (top); annotations: "Needs to be computed after every change" and "Increases space"]
 1   R: rdfs + rules
 2   R: rdfs
 3   R: rules
 4   K
 5   C: rdfs
 6   C: rules
 7   C: rdfs + rules



General statistics



Space evaluation

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Dernier (20)

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Evolution of Workflow Provenance Information in the Presence of Custom Inference Rules

  • 1. Evolution of Workflow Provenance Information in the Presence of Custom Inference Rules
      C. Strubulis, Y. Tzitzikas, M. Doerr and G. Flouris, {strubul, tzitzik, martin, fgeo}@ics.forth.gr
      Institute of Computer Science, Foundation for Research and Technology – Hellas (ICS-FORTH); Computer Science Department, University of Crete
      SWPM Workshop, 28/05/2012, Crete, Herakleion
  • 2. Outline
      • Provenance-based inference rules
      • Knowledge evolution of provenance information
  • 3. What is Provenance?
      • Etymology
        ▫ French verb “provenir”
        ▫ Meaning: to come forth, originate
      • The Merriam-Webster Online Dictionary
        ▫ the origin, source
        ▫ the history of ownership of a valued object or work of art or literature
  • 4. Why is Provenance Important?
      [Figure: two images, anon4877_base_20060331.jpg and anon4877_lesion_20060401.jpg]
      • Reproducibility
      • Data quality
      • Attribution
      • Informational
      How were these images created? Was any pre-processing applied to the raw data? Who created them? What’s the difference? Are they really from the same patient?
      Provenance is as (or more!) important as the experimental results.
  • 5. Motivation
      ▫ Reduce the amount of provenance information that has to be stored (produced by scientific workflow systems)
      ▫ Reduce the time and human effort in the case of manual ingestion of provenance metadata in repositories
      ▫ Elimination of errors starting from the input data
        ▪ Reduction of error search space -> Easier error correction
  • 6. Storage Space Challenge
      [Figure; extracted labels: Storage Space, Data Production, Time, Extra data for provenance, Adoption of inference mechanisms]
  • 7. User Input Challenge
      [Figure; extracted labels: Record, Provenance, Digital, Metadata, Ingestion, RDF/S Repository]
  • 8. User Input Challenge
      [Figure; extracted labels: Day 1, Day 11111, Time, Adoption of inference mechanisms]
  • 9. Error Correction Challenge
      [Figure; extracted labels: Scientific workflow, Provenance, Data, Metadata, record, ingestion, Adoption of inference mechanisms]
      Which one to correct? From where to start searching?
  • 10. Approach-Results
      • Our Approach
        ▫ Dynamic completion of the stored knowledge by logical assumptions – inferences
        ▫ Identify and specify some basic provenance-based inference rules
        ▫ In addition, we tackle the knowledge evolution requirements
          ▪ The question is how we can satisfy update requests while still supporting the aforementioned provenance-based inference rules.
          ▪ Operations: Disassociation, Contraction
      [Figure label: Query results + Inferences]
  • 11. The Assumed Provenance Model
      Most provenance models have similar concepts!
      [Figure: class diagram, labelled "Small part of CIDOC CRM". Classes: E5 Event, E7 Activity, E39 Actor, E73 Information Object, E24 Physical Man-Made Thing, E22 Man-made Object (Device). Properties: P9 forms part of, P12 was present at, P14 carried out by, P16 was used for, P46 forms part of, P128 carries. IsA links: E7 Activity IsA E5 Event; E22 Man-made Object IsA E24 Physical Man-Made Thing.]
  • 12. The three inference rules
      • R1: If an actor has carried out one activity, then (s)he has carried out all of its subactivities.
      • R2: If an object (device) was used for an event, then all parts of that object were also used for that event.
      • R3: If a physical object that carries an information object was present at an event, then that information object was also present at the event.
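The three rules can be read as forward-chaining rules over subject/property/object triples. The following is a minimal sketch of that reading, not the authors' implementation: the triple layout, the function names `apply_rules_once` and `closure`, and the sample facts (in particular which event the column was present at) are assumptions made for illustration, with entity names borrowed from the example slides.

```python
# Property labels as they appear on the slides (CIDOC CRM).
P9 = "P9 forms part of"      # subactivity -> activity
P14 = "P14 carried out by"   # activity -> actor
P16 = "P16 was used for"     # object -> event
P46 = "P46 forms part of"    # part -> whole object
P128 = "P128 carries"        # physical object -> information object
P12 = "P12 was present at"   # thing -> event


def apply_rules_once(kb):
    """Return the triples derivable from kb by one application of R1-R3."""
    derived = set()
    for s1, p1, o1 in kb:
        for s2, p2, o2 in kb:
            # R1: s1 forms part of activity o1, and o1 was carried out by o2.
            if p1 == P9 and p2 == P14 and o1 == s2:
                derived.add((s1, P14, o2))
            # R2: s1 forms part of object o1, and o1 was used for event o2.
            if p1 == P46 and p2 == P16 and o1 == s2:
                derived.add((s1, P16, o2))
            # R3: s1 carries information o1, and s1 was present at event o2.
            if p1 == P128 and p2 == P12 and s1 == s2:
                derived.add((o1, P12, o2))
    return derived


def closure(kb):
    """Repeat the rules until no new triple appears (a fixed point)."""
    closed = set(kb)
    while True:
        new = apply_rules_once(closed) - closed
        if not new:
            return closed
        closed |= new


# Illustrative explicit facts (hypothetical selection from the example slides).
explicit = {
    ("Laser scanning acquisition", P14, "Starc Institute"),
    ("Detailed sequence of shots", P9, "Laser scanning acquisition"),
    ("Capture 1", P9, "Detailed sequence of shots"),
    ("Multiviewdome device", P16, "Detailed sequence of shots"),
    ("Nikon D90", P46, "Multiviewdome device"),
    ("Part of column of Ramesses II", P128, "Information in hieroglyphics"),
    ("Part of column of Ramesses II", P12, "3D reconstruction"),
}

inferred = closure(explicit) - explicit
for triple in sorted(inferred):
    print(triple)
```

Only the facts in `explicit` would need to be stored; everything in `inferred` can be recomputed on demand, which is the storage and ingestion saving the motivation slides argue for.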
  • 13. 3D reconstruction process
  • 14. Actors - Activities
      • If an actor has carried out one activity, then (s)he has carried out all of its subactivities.
      [Figure: actors Starc Institute and John; activities Laser scanning acquisition, Detailed sequence of shots, Capture 1 … Capture 10; linked by P14 carried out by and P9 forms part of]
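Rule R1 does not have to be materialised; it can also be evaluated when a query arrives, by walking up the "forms part of" chain. The sketch below shows one way to do that, under stated assumptions: the function name `actors_of` and the triple layout are hypothetical, and the assignment of John to "Detailed sequence of shots" is a guess at what the figure shows.

```python
P9 = "P9 forms part of"     # subactivity -> activity
P14 = "P14 carried out by"  # activity -> actor


def actors_of(kb, activity):
    """Return every actor who carried out `activity`, explicitly or via R1."""
    actors, seen, stack = set(), set(), [activity]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for s, p, o in kb:
            if s != current:
                continue
            if p == P14:
                actors.add(o)    # actor recorded on this activity
            elif p == P9:
                stack.append(o)  # climb to the enclosing activity
    return actors


# Hypothetical explicit facts; only two P14 links are stored.
kb = {
    ("Laser scanning acquisition", P14, "Starc Institute"),
    ("Detailed sequence of shots", P9, "Laser scanning acquisition"),
    ("Detailed sequence of shots", P14, "John"),
    ("Capture 1", P9, "Detailed sequence of shots"),
    ("Capture 10", P9, "Detailed sequence of shots"),
}

print(sorted(actors_of(kb, "Capture 1")))  # ['John', 'Starc Institute']
```

The trade-off is the usual one: nothing extra is stored, but every query pays for the upward traversal.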
  • 15. Devices - Activities
      • If an object was used for an event, then all parts of the object were used for that event too.
      [Figure: Multiviewdome device, P16 was used for Detailed sequence of shots; its parts Nikon D90, AF-5_NIKKOR 18-105mm, …, Nikon D300, linked by P46 forms part of and, by inference, P16 was used for]
  • 16. Information Objects - Events
      • If a physical object that carries an information object was present at an event, then that information object was present at the event too.
      [Figure; extracted labels: Part of column of Ramesses II, Information in hieroglyphics, 3D reconstruction, Detailed sequence of shots, …, Capture 1_10, P128 carries, P12 was present at]
  • 17. Outline
      • Provenance-based inference rules
      • Knowledge evolution of provenance information
  • 18. Knowledge Evolution
      [Figure labels: Inference rules, Evolution]
      • Updating our knowledge is essential!
      • Requests for adding/removing information
      • The use of inference rules introduces difficulties with respect to the evolution of knowledge
  • 19. Example
      • Consider a KB containing the activities of Laser Scan Acquisition
      • Starc is propagated to all the subactivities of Laser Scan Acquisition by rule R1
      • Update request: Starc was not responsible for writing the Capture 1_1
      • There are two ways to handle the request
      [Figure: Starc Institute, P14 carried out by, Laser scanning acquisition; Detailed sequence of shots, Capture 1 and Capture 10 linked by P9 forms part of]
  • 20. Foundational vs Coherence
      • Foundational Viewpoint
        ▫ Each piece of our knowledge serves as a justification for other beliefs
        ▫ Implicit facts are supported by the explicit ones
        ▫ Explicit knowledge is more important than implicit knowledge
      • Coherence Viewpoint
        ▫ Every piece of knowledge is self-justified
        ▫ Implicit knowledge needs no support from explicit knowledge
        ▫ Explicit and implicit knowledge have the same value
  • 21. Deletion of a fact
      • Foundational:
        ▫ All implicit data that is no longer supported must also be deleted
      • Coherence:
        ▫ Delete implicit data only if it is necessary due to the deletion request
  • 22. Example
      • Consider a KB containing the activities of Laser Scan Acquisition
      • Starc is propagated to all the subactivities of Laser Scan Acquisition by rule R1
      • Update request: Starc was not responsible for Capture 1_1
      • Two cases:
        ▫ Actor disassociation (foundational)
        ▫ Actor contraction (coherence)
      [Figure: Starc Institute, P14 carried out by, Laser scanning acquisition; Detailed sequence of shots, Capture 1 and Capture 10 linked by P9 forms part of]
  • 23. Update request: Starc was not responsible for Capture 1
      [Figure: two copies of the example graph side by side, labelled "Foundational" and "Coherence", each showing Starc Institute, Laser scanning acquisition, Detailed sequence of shots, Capture 1 and Capture 10 with their P14 carried out by and P9 forms part of links]
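The difference between the two deletion semantics can be sketched for rule R1 as follows. This is one reading of the slides rather than the authors' definitions: `disassociate` and `contract` are hypothetical names, and the sketch assumes that both operations drop the actor from the target activity and all of its superactivities (otherwise R1 would re-derive the deleted fact), while contraction first turns the unaffected inferred facts into explicit ones.

```python
P9 = "P9 forms part of"     # subactivity -> activity
P14 = "P14 carried out by"  # activity -> actor


def closure_r1(kb):
    """Close kb under R1: actors propagate from an activity to its parts."""
    closed = set(kb)
    changed = True
    while changed:
        changed = False
        for sub, p, sup in list(closed):
            if p != P9:
                continue
            for act, q, actor in list(closed):
                if q == P14 and act == sup and (sub, P14, actor) not in closed:
                    closed.add((sub, P14, actor))
                    changed = True
    return closed


def ancestors_or_self(kb, activity):
    """The activity together with every activity it transitively forms part of."""
    seen, stack = {activity}, [activity]
    while stack:
        current = stack.pop()
        for s, p, o in kb:
            if p == P9 and s == current and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def disassociate(kb, actor, activity):
    """Foundational: remove the supporting explicit facts; inferences go with them."""
    targets = ancestors_or_self(kb, activity)
    return {t for t in kb
            if not (t[1] == P14 and t[2] == actor and t[0] in targets)}


def contract(kb, actor, activity):
    """Coherence: same removal, but keep unaffected inferences as explicit facts."""
    targets = ancestors_or_self(kb, activity)
    keep = {t for t in closure_r1(kb)
            if t[1] == P14 and t[2] == actor and t[0] not in targets}
    return disassociate(kb, actor, activity) | keep


kb = {
    ("Laser scanning acquisition", P14, "Starc Institute"),
    ("Detailed sequence of shots", P9, "Laser scanning acquisition"),
    ("Capture 1", P9, "Detailed sequence of shots"),
    ("Capture 10", P9, "Detailed sequence of shots"),
}

foundational = closure_r1(disassociate(kb, "Starc Institute", "Capture 1"))
coherence = closure_r1(contract(kb, "Starc Institute", "Capture 1"))
print(("Capture 10", P14, "Starc Institute") in foundational)  # False
print(("Capture 10", P14, "Starc Institute") in coherence)     # True
```

In both results the requested fact (Starc Institute carried out Capture 1) is no longer derivable; the two semantics differ only in whether the sibling activity Capture 10 keeps its actor.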
  • 24. Complexity analysis
      • Similar operations can also be defined for rules R2 and R3
  • 25. Conclusion
      • Provenance-based Inference Rules
        ▫ We motivated the need for provenance-based inference rules to
          ▪ reduce the storage space requirements
          ▪ ease the ingestion of metadata and the error correction
        ▫ We identified three basic rules accompanied by real-world examples.
      • Provenance-based Inference Rules and Knowledge Evolution
        ▫ The use of inference rules introduces difficulties with respect to the evolution of knowledge
        ▫ We identified two ways to deal with deletions in this context
        ▫ Even though we confined ourselves to CIDOC, and to three specific inference rules, the general ideas behind our work (including the discrimination between foundational and coherence semantics of deletion) can be applied to other models and/or sets of inference rules.
  • 26. Thank you for your attention.
  • 28. On repository policies
      [Figure: diagram of seven numbered repository policies, annotated "Needs to be computed after every change" and "Increases space". Legend: 1 R: rdfs + rules; 2 R: rdfs; 3 R: rules; 4 K; 5 C: rdfs; 6 C: rules; 7 C: rdfs + rules]
  • 29. General statistics
  • 30. Space evaluation