PhD Defense

  1. Generation of Linked Data Platforms in Highly Decentralized Information Ecosystem
      PhD Thesis Defense, December 20, 2018
      Mohammad Noorani BAKERALLY
      Institut Mines-Télécom; Institut Henri Fayol, EMSE, Connected intelligence, Laboratoire Hubert Curien, UMR CNRS 5516
  2. Highly Decentralized Information Ecosystem
      [Diagram labels: Developers, Web Services, Data Providers, Data Sources, Data Consumers]
  3. Highly Decentralized Information Ecosystem
      [Diagram labels: Developers, Web Services, Data Sources, Data Consumers, Data Publisher <<owns>>, Data Providers, Data Portals]
  4. Highly Decentralized Information Ecosystem
      ■ An information ecosystem consisting of information systems managed by self-governed actors with little to no coordination between them, e.g. the open data context, the Web, organizational information ecosystems
      [Diagram labels: Developers, Web Services, Data Sources, Data Consumers, Data Publishers <<owns>>, Data Providers, Data Portals]
  5. Problems
      ■ Data heterogeneity, at three levels:
        • Syntax
        • Semantics
        • Access
      ■ Hosting constraints preventing the hosting of data in third-party software environments
        • Examples:
          ─ Data sources bound by license restrictions
          ─ Real-time data sources
      [Diagram: Highly Decentralized Information Ecosystem]
  6. Aim
      ■ Facilitate data exploitation for data consumers in highly decentralized information ecosystems
      [Diagram: Highly Decentralized Information Ecosystem]
  7. Aim
      ■ Facilitate data exploitation for data consumers in highly decentralized information ecosystems
      ■ Publication of interoperable data and semantics by data publishers
      [Diagram: Highly Decentralized Information Ecosystem]
  8. Requirements for data interoperability
      ■ Syntax
        • Uniform identification mechanism to refer to resources
        • Flexibility in describing resources with varying structures
      ■ Semantics
        • Ontology languages to make semantics explicit
        • Semantics in syntax to make data self-described and portable
      ■ Access
        • High-level protocols to hide the heterogeneity of platforms
        • Uniform data access to facilitate data exploitation
      [Diagram labels: Highly Decentralized Information Ecosystem, Open standards]
  9. Outline
      ■ Semantic Web
      ■ Linked Data Platform Generation Model
      ■ Linked Data Platform Generation Toolkit
      ■ Evaluation
      ■ Conclusion & Perspectives
  10. Outline
      ■ Semantic Web
      ■ Linked Data Platform Generation Model
      ■ LDP Generation Toolkit
      ■ Evaluation
      ■ Conclusion & Perspectives
  11. Semantic Web for Data Syntax & Semantics
      ■ Data syntax: RDF [CWL14]
        • 😃 Uniform identification mechanism
          ─ Uniform Resource Identifier (URI)
        • 😃 Flexibility
          ─ Schema-less
      ■ Data semantics: RDFS [BG14] and OWL [W3C12]
        • 😃 Ontology languages
          ─ RDFS and OWL are ontology languages
        • 😃 Semantics in syntax
          ─ RDFS and OWL can be serialized in RDF
  12. Semantic Web for Data Access
      ■ SPARQL [Gro13]: standard query language for RDF
        • 😃 High-level protocol
          ─ SPARQL 1.1 Protocol
        • 😃 Uniform data access
          ─ Formal syntax and semantics
      ■ SPARQL serves querying (data consumers) rather than publishing data (data publishers)
      [Diagram labels: Model, View, Controller; XQUERY, SQL, SPARQL]
  13. Semantic Web for Data Access
      ■ Linked Data principles [BL06]: provide RESTful access to data in RDF
        • High-level protocol
          ─ Operates on HTTP
        • Uniform data access
          ─ Provides descriptions using a set of standards (RDF, Turtle, etc.)
          ─ Leaves choices open (e.g. default RDF serialization)
      ■ Linked Data Platform 1.0 [SAM15c]: standardizes RESTful access to data in RDF
        • 😃 High-level protocol
          ─ Standardizes interaction on top of HTTP
        • 😃 Uniform data access
          ─ Provides a domain model and an interaction model
  14. Linked Data Platform 1.0
      ■ Domain model
        • Defines different types of LDP resources
        • Used to describe resources on LDPs
      ■ Interaction model
        • Well-defined HTTP methods for CRUD operations on LDP resources
      ■ Terminology
        • LDP Standard: Linked Data Platform 1.0
        • LDPs: data platforms implementing the LDP Standard
      [Diagram of LDP resource types: LDP Resource, LDP RDF Source, LDP Non-RDF Source, LDP Container, LDP Basic Container, LDP Direct Container, LDP Indirect Container]
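The domain and interaction models on the Linked Data Platform 1.0 slide can be illustrated with a small sketch. The type hierarchy and the `Link`/`Slug` headers follow the LDP 1.0 Recommendation; the function names, the server URL and the Turtle body are assumptions made for illustration, and the request is only built, never sent.

```python
from urllib.request import Request

LDP = "http://www.w3.org/ns/ldp#"

# Parent of each LDP resource type in the LDP 1.0 domain model.
PARENT = {
    "RDFSource": "Resource",
    "NonRDFSource": "Resource",
    "Container": "RDFSource",
    "BasicContainer": "Container",
    "DirectContainer": "Container",
    "IndirectContainer": "Container",
}


def ancestors(ldp_type: str) -> list:
    """Return the supertypes of an LDP type, nearest first."""
    chain = []
    while ldp_type in PARENT:
        ldp_type = PARENT[ldp_type]
        chain.append(ldp_type)
    return chain


def create_basic_container_request(parent_url: str, slug: str, turtle: str) -> Request:
    """Build (without sending) a POST asking an LDP server to create a Basic Container."""
    return Request(
        parent_url,
        data=turtle.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/turtle",
            "Slug": slug,  # hint for the IRI of the new resource
            "Link": f'<{LDP}BasicContainer>; rel="type"',  # requested interaction model
        },
    )


# Hypothetical server URL and body, for illustration only.
req = create_basic_container_request(
    "http://localhost:8080/",
    "paris-catalog",
    "<> a <http://www.w3.org/ns/dcat#Catalog> .",
)
print(ancestors("BasicContainer"), req.get_method(), req.get_header("Link"))
```

Reading, replacing and deleting the created resource would use GET, PUT and DELETE on its IRI in the same way.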
  15. Satisfaction of Requirements for data interoperability
      ■ RDF for data syntax
        • Uniform identification mechanism
        • Flexibility
      ■ RDFS/OWL for data semantics
        • Ontology languages
        • Semantics in syntax
      ■ LDP Standard for data access
        • High-level protocols
        • Uniform data access
      [Diagram labels: Semantic Web, Open standards]
  16. LDP Related Work
      ■ Usage of LDP
        • Linked Data Platform as a novel approach for Enterprise Application Integration [MGG13]
        • Music SOFA: An architecture for semantically informed recomposition of Digital Music Objects [DDR18]
        • ECA2LD: Generating Linked Data from Entity-Component-Attribute runtimes [TRM18]
        • Linking the Web of Things: LDP-CoAP Mapping [LIG+16]
      ■ Custom generation of LDP
        • Morph-LDP: An R2RML-based Linked Data Platform implementation [MPC+14]
        • A Linked Data Platform adapter for the Bugzilla issue tracker [MGG14]
      ■ LDP implementations
        • LDP Resource Management Systems: generic LDP servers
        • LDP Frameworks: tools for developing LDP servers
  17. LDP Implementations
      ■ LDP Resource Management Systems
        • Generic LDP servers for storing, retrieving and manipulating LDP resources through HTTP methods
        • e.g. OpenLink Virtuoso Server, Apache Marmotta, Fedora Commons
      ■ LDP Frameworks
        • APIs facilitating the manual development of LDPs
        • e.g. LDP4j [EGMGC14], Eclipse Lyo
      [Diagram: RDF Data Sources, LDP Resource Generator, LDP Resources]
  18. Generation of LDPs
      ■ Design: define the data design, i.e. how data is organized according to the domain model
      ■ Implementation: encode the data design in the LDP Resource Generator
      ■ Deployment: deploy the LDP server and data
      ■ Problems
        • Heterogeneity: no support for non-RDF data sources
        • Hosting constraints
        • Tight coupling between design and implementation, hindering:
          ─ Maintainability of the design
          ─ Reusability of the design
        • Definition is manual
  19. State of the art: Synthesis
      ■ The problems with respect to data exploitation in highly decentralized information ecosystems are data heterogeneity and hosting constraints
      ■ Semantic Web standards (RDF, RDFS/OWL, LDP) satisfy the requirements for data interoperability
      ■ But generating LDPs from existing RDF data sources is a complex task:
        • No support for non-RDF data sources
        • No support for hosting constraints
        • Manual development producing tight coupling between data design and implementation
          ─ Reusability and maintainability of LDP designs are strongly limited
  20. Objective
      ■ Automate the generation of LDPs in highly decentralized information ecosystems by using Semantic Web technologies and considering the following constraints:
        • Data heterogeneity
        • Hosting constraints
        • LDP design reusability
  21. Outline
      ■ Semantic Web
      ■ LDP Generation Model
        • LDP Generation Workflow
        • LDP Design Language (LDP-DL)
      ■ LDP Generation Toolkit
      ■ Evaluation
      ■ Conclusion & Perspectives
  22. Model Driven Engineering
      ■ Models as first-class entities to generate [FR07]:
        • Models
        • Platforms
      ■ Higher reusability of systems' models [SVB+06]
      [Diagram relations: <<defined using>>, <<uses>>]
  23. LDP Generation Workflow
      [Diagram: Data sources, LDP Resource Generation, LDP Server]
  24. LDP Generation Workflow
      [Diagram: Data sources, LDP Resource Generation, LDP Dataset, LDP Server]
  25. LDP Generation Workflow
      [Diagram: LDP design document, Data sources, LDP Resource Generation, LDP Dataset, LDP Server]
  26. LDP Generation Workflow
      [Diagram: LDP design document, Data sources, LDP Resource Generation, Model-to-Model Transformation, LDP Dataset, Model-to-Platform Transformation, LDP Server]
  27. LDP Generation Workflow
      [Diagram: LDP design document, Data sources, LDP Resource Generation, LDPizer, LDP Dataset, LDP Dataset Deployer, Deployment Parameters, LDP Server]
  28. LDP Dataset
      ■ An LDP dataset consists of:
        • A set of container structures (n, g, M):
          ─ n is the IRI of the container
          ─ g is its RDF graph
          ─ M is a set of IRIs representing the members of container n
        • A set of named graphs (n, g):
          ─ n is the IRI of the non-container
          ─ g is its RDF graph
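The LDP dataset definition maps directly onto a small data structure. The sketch below is one possible encoding, not the toolkit's implementation: the class names are assumptions, triples are plain string tuples, and treating `dex:parking` and `dex:busStation` as non-containers is an assumption made only to complete the example.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass
class NamedGraph:
    """A non-container (n, g): IRI n and its RDF graph g."""
    n: str
    g: Set[Triple] = field(default_factory=set)


@dataclass
class ContainerStructure:
    """A container (n, g, M): IRI n, RDF graph g, member IRIs M."""
    n: str
    g: Set[Triple] = field(default_factory=set)
    M: Set[str] = field(default_factory=set)


@dataclass
class LDPDataset:
    containers: List[ContainerStructure] = field(default_factory=list)
    non_containers: List[NamedGraph] = field(default_factory=list)

    def resource_iris(self) -> Set[str]:
        """IRIs of every LDP resource described by the dataset."""
        return {c.n for c in self.containers} | {r.n for r in self.non_containers}


catalog = ContainerStructure(
    n="dex:paris-catalog",
    g={("dex:paris-catalog", "foaf:primaryTopic", "ex:paris-catalog")},
    M={"dex:parking", "dex:busStation"},
)
dataset = LDPDataset(
    containers=[catalog],
    # Assumption: the two members are modelled as non-containers here.
    non_containers=[NamedGraph("dex:parking"), NamedGraph("dex:busStation")],
)
print(sorted(dataset.resource_iris()))
```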
  29. LDP Design Language (LDP-DL)
      ■ Overview
      ■ Syntax
      ■ Semantics
  30. LDP-DL: Overview
      [Diagram: Data Source]
  31. LDP-DL: Overview
      Data design questions:
      ■ What are the LDP resources with respect to the resources from the data source?
      ■ What is the structure of containers/non-containers?
      ■ What is the content of containers/non-containers?
      [Diagram: Data Source]
  32. LDP-DL: Overview
      Data design questions:
      ■ What are the LDP resources with respect to the resources from the data source?
      ■ What is the structure of containers/non-containers?
      ■ What is the content of containers/non-containers?
      [Diagram: Data Source, LDP Dataset]
  33. LDP-DL: Overview
      Data design questions:
      ■ What are the LDP resources with respect to the resources from the data source?
      ■ What is the structure of containers/non-containers?
      ■ What is the content of containers/non-containers?
      Example:
        dex:paris-catalog a ldp:BasicContainer;
          foaf:primaryTopic ex:paris-catalog;
        ex:paris-catalog a dcat:catalog;
          dcat:keyword "paris","dataset";
          …….
        ldp:contains dex:parking, dex:busStation;
      [Diagram: Data Source, LDP Dataset]
  34. LDP-DL: Overview
      Data design questions:
      ■ What are the LDP resources with respect to the resources from the data source?
      ■ What is the structure of containers/non-containers?
      ■ What is the content of containers/non-containers?
      The LDP design language describes LDP resources:
      ■ IRIs
      ■ Organization in containers
      ■ Content (graph)
      ■ Members of containers
      [Diagram: Data Source, LDP Dataset]
  35. LDP-DL: Overview
      [Diagram labels: Related resource]
  36. LDP-DL: Overview
      Example:
        dex:paris-catalog a ldp:BasicContainer;
          foaf:primaryTopic ex:paris-catalog;
        ex:paris-catalog a dcat:catalog;
          dcat:keyword "paris","dataset";
          …….
        ldp:contains dex:parking, dex:busStation;
      [Diagram labels: Related resource]
  37. LDP-DL: Overview
      Example:
        dex:paris-catalog a ldp:BasicContainer;
          foaf:primaryTopic ex:paris-catalog;
        ex:paris-catalog a dcat:catalog;
          dcat:keyword "paris","dataset";
          …….
        ldp:contains dex:parking, dex:busStation;
      [Diagram labels: Related resource, RDF Graph of the LDP Resource]
  38. LDP-DL: Syntax
      ■ ResourceMap:
        • Related resources identified by a Query Pattern
        • RDF graph of LDP resources described by a Construct Query
  39. LDP-DL: Syntax
      ■ ResourceMap:
        • Related resources identified by a Query Pattern
        • RDF graph of LDP resources described by a Construct Query
      ■ NonContainerMap: describes non-containers
  40. LDP-DL: Syntax
      ■ ResourceMap:
        • Related resources identified by a Query Pattern
        • RDF graph of LDP resources described by a Construct Query
      ■ NonContainerMap: describes non-containers
      ■ ContainerMap: describes containers and their members (containers or non-containers)
  41. LDP-DL: Syntax
      ■ ResourceMap:
        • Related resources identified by a Query Pattern
        • RDF graph of LDP resources described by a Construct Query
      ■ NonContainerMap: describes non-containers
      ■ ContainerMap: describes containers and their members (containers or non-containers)
      ■ DataSource describes:
        • RDF sources using their IRIs
        • Non-RDF sources using:
          ─ IRIs of data sources
          ─ IRIs of lifting rules
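The four LDP-DL constructs can be pictured as nested records. The sketch below is an assumption-laden reading of the slide: the field names, the way maps reference one another, and every IRI and query string are hypothetical, since the slides do not give the concrete LDP-DL vocabulary.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    iri: str
    lifting_rule_iri: Optional[str] = None  # only needed for non-RDF sources

    @property
    def is_rdf(self) -> bool:
        return self.lifting_rule_iri is None


@dataclass
class ResourceMap:
    query_pattern: str    # identifies the related resources
    construct_query: str  # describes the RDF graph of each LDP resource
    data_sources: List[DataSource] = field(default_factory=list)


@dataclass
class NonContainerMap:
    resource_maps: List[ResourceMap] = field(default_factory=list)


@dataclass
class ContainerMap:
    resource_maps: List[ResourceMap] = field(default_factory=list)
    # Members of a container may be containers or non-containers.
    container_maps: List["ContainerMap"] = field(default_factory=list)
    non_container_maps: List[NonContainerMap] = field(default_factory=list)


# Hypothetical sources: one RDF file, one JSON file with a lifting rule.
rdf_source = DataSource("http://example.org/catalog.ttl")
json_source = DataSource("http://example.org/stations.json",
                         lifting_rule_iri="http://example.org/stations.rqg")

dataset_map = NonContainerMap([ResourceMap("?res a dcat:Dataset .",
                                           "?res ?p ?o .", [json_source])])
catalog_map = ContainerMap(
    resource_maps=[ResourceMap("?res a dcat:Catalog .", "?res ?p ?o .", [rdf_source])],
    non_container_maps=[dataset_map],
)
print(rdf_source.is_rdf, json_source.is_rdf)
```

Keeping the design as data, separate from any server code, is what allows the same design document to be reused over different data sources.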
  42. LDP-DL: Formal Semantics
      [Diagram labels: elt, dd, interpretation of LDP-DL syntactic constructs, notion of satisfaction, <<instanceOf>>]
  43. LDP-DL: Formal Semantics
      [Diagram labels: dd, interpretation of LDP-DL syntactic constructs, notion of satisfaction, <<instanceOf>>]
  44. LDP-DL Formal Semantics
      ■ Given an interpretation and a design document, we define an LDP dataset that we call the evaluation of the design document with respect to the interpretation
      ■ An LDP dataset D is valid with respect to a design document iff there exists an interpretation such that the interpretation satisfies (⊧) the design document and D is the evaluation of the design document with respect to that interpretation
      ■ We provide an algorithm that generates LDP datasets that are provably valid with respect to the input design documents
  45. Handling Hosting Constraints
      ■ A dynamic LDP dataset stores instructions to generate the graphs of LDP resources
      ■ Using a dynamic LDP dataset:
        • Generate the LDP dataset at deployment
        • Generate the graph of an LDP resource at query time
      ■ Deals with the dynamicity of data sources and with hosting constraints
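The idea of storing instructions instead of graphs can be sketched with callables that are evaluated only when a resource is requested. This is a minimal illustration under stated assumptions: `DynamicLDPDataset`, `live_source` and the `ex:freeSpaces` property are hypothetical names, and the in-memory dictionary stands in for a real-time data source.

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]
GraphInstruction = Callable[[], Set[Triple]]


class DynamicLDPDataset:
    """Keeps, per resource IRI, an instruction that builds its graph on demand."""

    def __init__(self) -> None:
        self._instructions: Dict[str, GraphInstruction] = {}

    def add_resource(self, iri: str, instruction: GraphInstruction) -> None:
        # Done once, at deployment: no source data is copied here.
        self._instructions[iri] = instruction

    def graph_of(self, iri: str) -> Set[Triple]:
        # Done at query time: the source is read when the resource is requested.
        return self._instructions[iri]()


# Stand-in for a real-time source that stays with its provider.
live_source = {"free_spaces": 12}


def parking_graph() -> Set[Triple]:
    return {("dex:parking", "ex:freeSpaces", str(live_source["free_spaces"]))}


dataset = DynamicLDPDataset()
dataset.add_resource("dex:parking", parking_graph)

first = dataset.graph_of("dex:parking")
live_source["free_spaces"] = 7  # the source changes after deployment
second = dataset.graph_of("dex:parking")
print(first, second)
```

Because nothing but the instruction is stored, the data never has to be hosted by the LDP, and each request reflects the current state of the source.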
  46. Outline
      ■ Semantic Web
      ■ LDP Generation Model
        • LDP Generation Workflow
        • LDP Design Language (LDP-DL)
      ■ LDP Generation Toolkit
      ■ Evaluation
      ■ Conclusion & Perspectives
  47. LDP Generation Toolkit
  48. LDP Generation Toolkit
      * Lefrançois, Maxime, Antoine Zimmermann, and Noorani Bakerally. "A SPARQL extension for generating RDF from heterogeneous formats." European Semantic Web Conference. Springer, Cham, 2017.
  49. LDP Generation Toolkit
  50. LDP Generation Toolkit
  51. LDP Generation Toolkit
  52. Outline
      ■ Semantic Web
      ■ LDP Generation Model
      ■ LDP Generation Toolkit
      ■ Evaluation
      ■ Conclusion & Perspectives
  53. Evaluation
      ■ Objective: automate the generation of LDPs in highly decentralized information ecosystems by using Semantic Web technologies and considering the following constraints:
        • Data heterogeneity
        • Hosting constraints
        • LDP design reusability
      ■ Evaluation criteria are derived from the objective
  54. Evaluation: Experiment Settings
      ■ 8 design documents
      ■ 28 data sources
        • RDF data sources:
          ─ Open data catalogs from 21 data portals
          ─ BBC wildlife dataset
          ─ LodPaddle
        • Heterogeneous data sources (JSON, CSV)
        • Real-time data sources (JSON, CSV)
      ■ GitHub: https://github.com/noorbakerally/LDPDatasetExamples
      ■ Performance test done using a simple design document and different data sources having a maximum of 1 million triples
        • Performance is approximately linear
  55. Evaluation
      ■ Homogeneous LDP Access Experiment: LDP generation from heterogeneous data sources
  56. Evaluation
      ■ Dynamic LDP Experiment: LDP generation from a real-time data source
  57. Evaluation: LDP Design Reusability
      ■ Domain Design Reusability Experiment: same design document and varying data sources structured with the same ontology
  58. Evaluation: LDP Design Reusability
      ■ Generic Design Reusability Experiment: same design document and varying data sources structured with different ontologies
  59. Evaluation: LDP Design Reusability
      ■ Modular Design Reusability Experiment: modular design documents
  60. Summary of evaluation
      ■ Evaluation criteria (table columns): Data Heterogeneity, Hosting Constraints, LDP Design Reusability, Automatization
      ■ Experiments (table rows), each marked ✔ against two criteria:
        • Homogeneous LDP Access ✔ ✔
        • Dynamic LDP ✔ ✔
        • Domain Design Reusability ✔ ✔
        • Generic Design Reusability ✔ ✔
        • Modular Design Reusability ✔ ✔
  61. Outline
      ■ Semantic Web
      ■ LDP Generation Model
        • LDP Generation Workflow
        • LDP Design Language
      ■ LDP Generation Toolkit
      ■ Evaluation
      ■ Conclusion & Perspectives
  62. Conclusion: Context
      ■ Definition of highly decentralized information ecosystems
        • Identification of problems with respect to data exploitation
        • Identification of requirements for data interoperability
      ■ Semantic Web standards as foundations to facilitate data publication
      ■ Data exploitation may be facilitated by providing tools to data publishers rather than only to data consumers
  63. Conclusion: Summary of Contributions
      ■ LDP Generation Workflow
        • LDP Design Language with:
          ─ Formal syntax to write LDP design documents
          ─ Formal semantics to properly interpret LDP design documents
        • LDP Dataset
      ■ LDP Generation Toolkit: implementation of the LDP Generation Workflow
      ■ Evaluation of the LDP Generation Toolkit with respect to data heterogeneity, hosting constraints and LDP design reusability
  64. Conclusion: Limitations
      ■ Partial coverage of the LDP standard (e.g. Direct and Indirect Containers are not considered)
      ■ Limited handling of hosting constraints
      ■ Manual generation of LDP design documents
      ■ Manual generation of lifting rules
  65. Perspectives
      ■ Enrich design aspects in the LDP-DL model
        • Consider Direct and Indirect Containers
        • Provide deployment constructs to describe aspects such as:
          ─ Access rights
          ─ Paging
      ■ Generate Linked Data based on best practices from Data on the Web Best Practices [LBC17]
      ■ Provide an LDP generation methodology
      ■ Evaluate with real users of LDP
  66. References
      [BG14] Dan Brickley and Ramanathan V. Guha. RDF Schema 1.1. W3C Recommendation, World Wide Web Consortium (W3C), February 25 2014.
      [BL06] Tim Berners-Lee. Linked Data - Design Issues, 2006.
      [CWL14] R. Cyganiak, D. Wood, and M. Lanthaler. RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation, World Wide Web Consortium (W3C), February 25 2014.
      [DDR18] De Roure, David, et al. "Music SOFA: An architecture for semantically informed recomposition of digital music objects." Proceedings of the 1st International Workshop on Semantic Applications for Audio and Music. ACM, 2018.
      [FR07] R. B. France and B. Rumpe. Model-driven development of complex software: A research roadmap. In FOSE, 2007.
      [Gro13] W3C SPARQL Working Group. SPARQL 1.1 Overview. W3C Recommendation, World Wide Web Consortium (W3C), March 21 2013.
  67. References
      [LIG+16] Loseto, Giuseppe, et al. "Linking the web of things: LDP-CoAP mapping." Procedia Computer Science 83 (2016): 1182-1187.
      [MGG13] Mihindukulasooriya, Nandana, Raúl García-Castro, and Miguel Esteban Gutiérrez. "Linked Data Platform as a novel approach for Enterprise Application Integration." COLD. 2013.
      [MGG14] Mihindukulasooriya, Nandana Sampath, Miguel Esteban Gutiérrez, and Raúl García-Castro. "A Linked Data Platform adapter for the Bugzilla issue tracker." (2014): 89-92.
      [MPC+14] Mihindukulasooriya, Nandana, et al. "morph-LDP: an R2RML-based linked data platform implementation." European Semantic Web Conference. Springer, Cham, 2014.
      [SAM15c] Steve Speicher, John Arwe, and Ashok Malhotra. Linked Data Platform 1.0. Technical report, World Wide Web Consortium (W3C), February 26 2015.
  68. References
      [SVB+06] T. Stahl, M. Volter, J. Bettin, A. Haase, and S. Helsen. Model-driven software development: technology, engineering, management. Pitman, 2006.
      [TRM18] Spieldenner, T., Schubotz, R., and Guldner, M. "ECA2LD: Generating Linked Data from Entity-Component-Attribute runtimes." 2018 Global Internet of Things Summit (GIoTS), pp. 1-4. IEEE, June 2018.
      [W3C12] W3C OWL Working Group. OWL 2 Web Ontology Language Document Overview (Second Edition). W3C Recommendation, World Wide Web Consortium (W3C), December 11 2012.
  69. Annexes
  70. Model Theoretic Semantics: LDP-DL Interpretation
  71. Model Theoretic Semantics: DataSource Satisfaction
  72. Model Theoretic Semantics: Ancestor List and Mapping
  73. Model Theoretic Semantics: ResourceMap Satisfaction
  74. Model Theoretic Semantics: NonContainerMap Satisfaction
  75. Model Theoretic Semantics: ContainerMap Satisfaction
  76. Map Evaluation
  77. Design Document Evaluation
  78. Design Document Evaluation
  79. Flexible LDP Design
  80. LDP-DL Semantics
  81. LDP-DL Semantics
      (ρ: related resource, ν: new LDP resource)
      1. Evaluation of qp returns {ρ ← ex:paris-catalog} and {ρ ← ex:toulouse-catalog}
      2. For each of them, a new resource is created
      3. Consider {ρ ← ex:paris-catalog}
      4. The new resource (ν) is dex:paris-catalog
      5. To generate the graph of dex:paris-catalog, cq is evaluated on the source with the bindings {ρ ← ex:paris-catalog}, {ν ← dex:paris-catalog}
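The five evaluation steps above can be traced with a toy evaluator. This is a sketch under several assumptions: the source is a set of string triples rather than a real RDF store, `qp` and `cq` are plain Python functions standing in for the query pattern and the construct query, and `mint_iri` (swapping the `ex:` prefix for `dex:`) is a hypothetical naming rule chosen only to reproduce the IRIs shown on the slide.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

SOURCE: Set[Triple] = {
    ("ex:paris-catalog", "rdf:type", "dcat:Catalog"),
    ("ex:paris-catalog", "dcat:keyword", "paris"),
    ("ex:toulouse-catalog", "rdf:type", "dcat:Catalog"),
}


def qp(source: Set[Triple]) -> List[str]:
    """Query pattern: the related resources (rho), here every catalog."""
    return sorted(s for (s, p, o) in source
                  if p == "rdf:type" and o == "dcat:Catalog")


def mint_iri(rho: str) -> str:
    """Hypothetical rule for naming the new LDP resource (nu)."""
    return "dex:" + rho.split(":", 1)[1]


def cq(source: Set[Triple], rho: str, nu: str) -> Set[Triple]:
    """Construct query: graph of nu, built with rho and nu bound."""
    graph = {(nu, "rdf:type", "ldp:BasicContainer"),
             (nu, "foaf:primaryTopic", rho)}
    graph |= {t for t in source if t[0] == rho}  # description of rho
    return graph


def evaluate_resource_map(source: Set[Triple]) -> Dict[str, Set[Triple]]:
    dataset = {}
    for rho in qp(source):                  # steps 1 and 3: one binding per related resource
        nu = mint_iri(rho)                  # steps 2 and 4: create the new resource
        dataset[nu] = cq(source, rho, nu)   # step 5: build its graph
    return dataset


result = evaluate_resource_map(SOURCE)
for iri in sorted(result):
    print(iri, len(result[iri]))
```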
  82. LDP-DL Semantics
      ■ The :dataset ContainerMap describes the members of dex:paris-catalog and dex:toulouse-catalog
  83. LDP-DL Semantics
      ■ Consider the evaluation of :dataset to generate the members of dex:paris-catalog
      ■ The members of dex:paris-catalog describe the dcat:datasets of ex:paris-catalog (the related resource)
      ■ The evaluation of qp is done with the bindings {π1 ← ex:paris-catalog}
