Pal gov.tutorial2.session14.lab rdf-dataintegration
- 1. أكاديمية الحكومة الإلكترونية الفلسطينية
The Palestinian eGovernment Academy
www.egovacademy.ps
Tutorial II: Data Integration and Open Information Systems
Session 14 (Practical):
Data Integration and Fusion using RDF
Dr. Mustafa Jarrar
University of Birzeit
mjarrar@birzeit.edu
www.jarrar.info
PalGov © 2011 1
- 2. About
This tutorial is part of the PalGov project, funded by the TEMPUS IV program of the
Commission of the European Communities, grant agreement
511159-TEMPUS-1-2010-1-PS-TEMPUS-JPHES. The project website: www.egovacademy.ps
Project Consortium:
Birzeit University, Palestine (Coordinator)
Palestine Polytechnic University, Palestine
Palestine Technical University, Palestine
Ministry of Telecom and IT, Palestine
Ministry of Interior, Palestine
Ministry of Local Government, Palestine
University of Trento, Italy
Vrije Universiteit Brussel, Belgium
Université de Savoie, France
University of Namur, Belgium
TrueTrust, UK
Coordinator:
Dr. Mustafa Jarrar
Birzeit University, P.O.Box 14- Birzeit, Palestine
Telfax: +972 2 2982935 mjarrar@birzeit.edu
- 3. © Copyright Notes
Everyone is encouraged to use this material, or part of it, but should
properly cite the project (logo and website), and the author of that part.
No part of this tutorial may be reproduced or modified in any form or by
any means without prior written permission from the project, which holds
the full copyright on the material.
Attribution-NonCommercial-ShareAlike
CC-BY-NC-SA
This license lets others remix, tweak, and build upon your work
non-commercially, as long as they credit you and license their new creations
under the identical terms.
- 4. Tutorial Map
Intended Learning Objectives
A: Knowledge and Understanding
2a1: Describe tree and graph data models.
2a2: Understand the notation of XML, RDF, RDFS, and OWL.
2a3: Demonstrate knowledge about querying techniques for data models, such as SPARQL and XPath.
2a4: Explain the concepts of identity management and Linked Data.
2a5: Demonstrate knowledge about integration and fusion of heterogeneous data.
B: Intellectual Skills
2b1: Represent data using tree and graph data models (XML & RDF).
2b2: Describe data semantics using RDFS and OWL.
2b3: Manage and query data represented in RDF, XML, OWL.
2b4: Integrate and fuse heterogeneous data.
C: Professional and Practical Skills
2c1: Use Oracle Semantic Technology and/or Virtuoso to store and query RDF stores.
D: General and Transferable Skills
2d1: Working in a team.
2d2: Presenting and defending ideas.
2d3: Use of creativity and innovation in problem solving.
2d4: Develop communication skills and logical reasoning abilities.
Topic h
Session 1: XML Basics and Namespaces 3
Session 2: XML DTDs 3
Session 3: XML Schemas 3
Session 4: Lab-XML Schemas 3
Session 5: RDF and RDFS 3
Session 6: Lab-RDF and RDFS 3
Session 7: OWL (Web Ontology Language) 3
Session 8: Lab-OWL 3
Session 9: Lab-RDF Stores -Challenges and Solutions 3
Session 10: Lab-SPARQL 3
Session 11: Lab-Oracle Semantic Technology 3
Session 12_1: The problem of Data Integration 1.5
Session 12_2: Architectural Solutions for the Integration Issues 1.5
Session 13_1: Data Schema Integration 1
Session 13_2: GAV and LAV Integration 1
Session 13_3: Data Integration and Fusion using RDF 1
Session 14: Lab-Data Integration and Fusion using RDF 3
Session 15_1: Data Web and Linked Data 1.5
Session 15_2: RDFa 1.5
Session 16: Lab-RDFa 3
- 5. Module ILOs
After completing this module students will be able to:
- Explain the concepts of identity management and linked data.
- Integrate and fuse heterogeneous data.
- Represent data using the graph data model (RDF).
- Manage and query data represented in RDF.
- 6. Practical Session
Description:
From previous practical sessions: “The central management of students’ profiles by
the ministry of education has become an urgent need in recent years. Many students in
Palestine move from one university to another, and they need to transfer their academic
records. Also, the ministry of higher education needs to certify the diplomas and mark
sheets of students. Moreover, there is a need to centrally manage and monitor students’
financial aid. Therefore, the ministry of higher education has decided to build a national
student registry, such that each semester every university has to send the academic record
of every student to the ministry of education. The ministry will then update and integrate
the academic records according to the data combined from all universities into the national
student registry.”
The ministry wants to use RDF to integrate this data. Thus, each
university must map its relational data (or data in any other model)
into RDF, and at the ministry this data is integrated and fused. Map
the universities’ relational data into RDF and integrate and fuse it.
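As a minimal sketch of the mapping step described above, a relational row can be turned into subject-predicate-object (SPO) triples: the primary key becomes the subject URI, each remaining column becomes a predicate, and each cell value becomes an object. The table layout, the column names, the sample values, and the `http://example.org/` namespaces are hypothetical and chosen only for illustration; they are not prescribed by the assignment.

```python
# Hypothetical namespaces, chosen for illustration only.
STUDENT_BASE = "http://example.org/uni1/student/"
VOCAB = "http://example.org/vocab#"


def row_to_triples(row, key_column, base_uri=STUDENT_BASE, vocab=VOCAB):
    """Map one relational row (a dict) to a list of SPO triples.

    The primary-key value becomes the subject URI; every other
    non-NULL column becomes one (subject, predicate, object) triple.
    """
    subject = base_uri + str(row[key_column])
    triples = []
    for column, value in row.items():
        if column == key_column or value is None:
            continue  # the key lives in the subject; NULLs yield no triple
        triples.append((subject, vocab + column, value))
    return triples


# One sample row from a hypothetical STUDENT table (made-up values).
student_row = {
    "student_id": 1071234,
    "name": "Sample Student",
    "major": "Computer Science",
    "gpa": None,
}

spo_table = row_to_triples(student_row, key_column="student_id")
for s, p, o in spo_table:
    print(s, p, o)
```

Each university would apply the same kind of mapping to its own schema, ideally reusing a shared vocabulary so that equivalent columns end up under the same predicate.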
- 7. Practical Session
• Every two students form a group. Each group must be composed of students from
different universities (in their first-level degrees).
• Students are expected to use three different mark sheets from different universities to
construct three different hypothetical relational data schemas of student records.
• Students must populate the three databases (pertaining to the three different data
schemas) with sample data.
• Students must integrate and fuse all data using RDF.
• Students are highly recommended to use the ontologies developed in previous practical
sessions when mapping and integrating RDF data.
• Students must write at least three SPARQL queries on the integrated RDF data that
involve data from all three sources.
• Students must carry out this practical session using Oracle Semantic Technologies.
• After finalizing their work, each group will be asked to present it to all students,
in order to collect comments and feedback.
• The final deliverable includes: (i) Snapshots of the three hypothetical databases and
schemas taken from Oracle DB. (ii) The RDF mapping of each database (SPO tables).
(iii) The integrated final RDF showing how entities were disambiguated. (iv) The
executed SPARQL queries and their results. Note that this final deliverable should take the
form of a report in which the various steps are clearly discussed.
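The integration and fusion step in the list above can be sketched as follows, assuming each university's data has already been mapped to SPO triples. The sketch disambiguates entities by a shared national ID (an assumption; the assignment does not prescribe the matching key), rewrites matching subjects to one canonical registry URI, and then answers a question that needs data from all three sources. All URIs, predicate names, and sample values are hypothetical. In the lab itself the query would be written in SPARQL against Oracle Semantic Technologies; plain Python is used here only to show the logic.

```python
from collections import defaultdict

V = "http://example.org/vocab#"  # hypothetical shared vocabulary

# SPO triples from three hypothetical university sources (made-up values).
source_a = [("http://example.org/a/s1", V + "nationalId", "900111"),
            ("http://example.org/a/s1", V + "enrolledAt", "University A")]
source_b = [("http://example.org/b/77", V + "nationalId", "900111"),
            ("http://example.org/b/77", V + "enrolledAt", "University B")]
source_c = [("http://example.org/c/x9", V + "nationalId", "900222"),
            ("http://example.org/c/x9", V + "enrolledAt", "University C")]


def fuse(sources, id_predicate=V + "nationalId",
         registry="http://example.org/registry/student/"):
    """Merge all triples, rewriting subjects that carry an ID to one canonical URI."""
    canonical = {}
    for triples in sources:
        for s, p, o in triples:
            if p == id_predicate:
                canonical[s] = registry + str(o)
    # An RDF graph is a set of triples, so duplicates collapse after rewriting.
    fused = set()
    for triples in sources:
        for s, p, o in triples:
            fused.add((canonical.get(s, s), p, o))
    return fused


def universities_per_student(fused):
    """For each fused student, list every university found in any source."""
    result = defaultdict(list)
    for s, p, o in fused:
        if p == V + "enrolledAt":
            result[s].append(o)
    return {student: sorted(unis) for student, unis in result.items()}


fused_graph = fuse([source_a, source_b, source_c])
universities = universities_per_student(fused_graph)
for student, unis in sorted(universities.items()):
    print(student, unis)
```

Rewriting subjects to a canonical URI is only one way to record that two records denote the same student; keeping the original URIs and linking them with an identity statement is an equally valid design, and the report should state which approach was taken.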