Towards Linked Open
Government Data
Vlad Posea
vlad.posea@cs.pub.ro
about me
• bachelor and PhD from Politehnica University of Bucharest
• Master in Data Mining from University Lumiere of Lyon
• research on competence management, semantic web, e-learning
• business on career management and recruiting
• now a fellow of the Romanian American Foundation at the University of
Rochester (Fulbright scholarship starting in 2017), working on developing
entrepreneurship in Romania
Linked Data and Open Data
• linked data = a way to connect data on the web using URIs and RDF,
the most successful result of the Semantic Web initiative
• open data = data that can be freely used, re-used and
redistributed by anyone, subject only, at most, to the requirement to
attribute and share alike
• open government data = data regarding public institutions, published
on governmental sites
Very Short Intro on RDF
• data represented as
statements
• statements contain
• subject
• predicate
• object
• subject, predicate and
sometimes objects are URIs
• URIs are used to uniquely
identify entities or properties
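A minimal sketch of the statement model above, in plain Python rather than a real RDF library. Each statement is a (subject, predicate, object) tuple; the resource URI and the match helper are illustrative assumptions, not part of the talk.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# The resource URI below is a hypothetical example.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
CITY = "http://opendata.cs.pub.ro/resource/Brasov_judet_Brasov"

triples = [
    (CITY, RDF_TYPE, "http://dbpedia.org/ontology/PopulatedPlace"),  # object is a URI
    (CITY, RDFS_LABEL, "Brașov"),                                   # object is a literal
]

def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]
```

Calling match(triples, s=CITY) returns every statement about the city, which is the basic lookup that query languages such as SPARQL generalise.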
Linked Data
http://lod-cloud.net/
Why Do We Need Open Data?
• transparency
• how does the government spend money
• fuel innovation and entrepreneurship
• https://www.youtube.com/watch?v=sUqY5ySylXg (Todd Park discussing
benefits of Open Government Data)
• opening weather data and GPS data allowed people to build businesses
• “last year alone civilian and commercial access to GPS created 90 billion $
worth of value” (2013)
• participatory governance
• citizens enabled in decision making
• “making a full read/write society” (http://opengovernmentdata.org/why/)
Open Data Quality
• five stars of open data proposed by Tim Berners-Lee
• (1) be available on the Web under an open licence,
• (2) be in the form of structured data,
• (3) be in a non-proprietary file format,
• (4) use URIs as its identifiers (see also RDF),
• (5) include links to other data sources (see linked data).
http://opendatahandbook.org/glossary/en/terms/five-stars-of-open-data/
Open Data in the World
• The Global Open Data Index measures how governments implement Open
Data
• evaluates if a country posted data on
• national statistics
• government budget
• government spending
• legislation
• election results
• national map
• pollution
• also evaluates the quality of the posted data
• companies
• location datasets
• government procurement
• water quality
• weather forecast
• land ownership
• transport timetables
• health performance
Global Open Data Index
• significant progress has been made in opening data
• scores would be much lower if 5-star data carried more weight
http://index.okfn.org/place/
http://index.okfn.org/methodology/
Open Data in the US
• data.gov – 190k datasets
• mostly html (70k)
• RDF below 5% of the total number of datasets
• more than a quarter are pdf, jpg or tiff
• relevant steps
• data.gov launched in 2009
• Open Government Partnership 2011 (http://www.opengovpartnership.org/)
• Digital Accountability and Transparency Act (2014)
• creating publishing standards for public spending data
https://max.gov/maxportal/assets/public/offm/DataStandardsFinal.htm
Open Data in Saint Louis
• https://www.stlouis-mo.gov/data/ - list of data sets
• most of them html or pdf
• some confuse open data with reports
Open Data in Romania
• Data.gov.ro
• National portal where public institutions put all the data
• Types of resources published: CSV (***), XLS (**), PDF (*)
• There are no links between files (no files with 4 or 5 stars)
• September 2016:
• 72 public institutions
• 8185 files
• Each file can have its own structure
• uses CKAN (http://ckan.org/)
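As a rough illustration of the star ratings quoted above, the sketch below maps file extensions to the ratings listed on the slide (CSV three stars, XLS two, PDF one). The function names are hypothetical, and a real assessment would also have to check the licence and the content of each file.

```python
# Star ratings per file format, as listed on the slide (CSV ***, XLS **, PDF *).
STARS_BY_FORMAT = {"csv": 3, "xls": 2, "pdf": 1}

def star_rating(filename):
    """Return the star rating implied by a file's extension, or None if unknown."""
    extension = filename.rsplit(".", 1)[-1].lower()
    return STARS_BY_FORMAT.get(extension)

def best_rating(filenames):
    """Highest rating found among the files of a dataset (0 if none is recognised)."""
    ratings = [star_rating(name) for name in filenames]
    return max((r for r in ratings if r is not None), default=0)
```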
Why do we need Linked Open Data
• classic workflow when working with open data:
• analyze CSV files
• define own data model
• import data from CSV files into data model
• solve import problems (naming differences, character encoding issues)
• identify entities and link them to other entities existing in the model
• link data from different CSV files in a common model
• extract relevant information
• write the program logic to exploit the data
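The steps above can be sketched as follows. The two CSV snippets, their column names and all numbers are made up for illustration; the point is that the delimiter, the headers and the spelling of the same locality differ, so the link between the files has to be coded by hand.

```python
import csv
import io
import re
import unicodedata

# Two hypothetical CSV files describing the same localities differently.
SCHOOLS_CSV = "Localitate,Nr_scoli\nCluj-Napoca,120\nIasi,95\n"
POPULATION_CSV = "city;population\nCLUJ NAPOCA;300000\nIași;250000\n"

def key(name):
    """Normalise a locality name so that spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", "_", plain.lower()).strip("_")

def read_rows(text, delimiter):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

schools = {key(r["Localitate"]): int(r["Nr_scoli"]) for r in read_rows(SCHOOLS_CSV, ",")}
population = {key(r["city"]): int(r["population"]) for r in read_rows(POPULATION_CSV, ";")}

# The "join" must be written again for every new pair of files.
joined = {k: (schools[k], population[k]) for k in schools if k in population}
```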
Why Do We Need Linked Open Data
• classic workflow when working with linked open data:
• analyze models
• write query to extract relevant information
• write the program logic to exploit the data
• more than one dataset can be used directly by performing “joins” in the
queries
• much faster to develop an application
• much easier to reuse data
Linked Open Data in Romania
• Our goal is to transform open data from Romania into Linked Open
Data.
• Transform data into RDF triples (Subject, Predicate, Object)
• Link entities with existing online resources, especially from dbpedia.org and
Geonames
• Create a platform where each published file is transformed into RDF
• Create rich applications using SPARQL queries
Vision
• create tools and workflows to allow non-technical users to add Linked
Data to the government website
• offer an API for developers who want to create apps based on open
government data
• integrate the software into CKAN (the open data portal used by most
governments) to allow every government to create linked data
Stages
1. modeling data
2. massively transforming data
3. linking data to external data sets
4. embedding into CKAN
First Stage – Modeling Data
• Identify the most common ontologies used
• Create naming rules so that the same resource always gets the same
URI
• Identify the most common properties of the open data and the
ontological properties associated to them
• Identify the most common naming problems
• different encodings
• different spelling
• different lexicalization of the same concepts
and write hacks to solve them
Open Data Types
• Numerical data:
• Different budgets or revenues
• Different statistical data: number of cars/type/year, number of beds/hospital
• Etc.
• “Plain” data:
• Information about entities:
• Lawyers, Schools, Pharmacies, Museums, Archeological sites
• Etc.
• Found in tabular files, such as CSV or XLS
Most Common Vocabularies for Open Data
https://lov.okfn.org/dataset/lov/
Most Common Vocabularies for Open Data
• Dublin Core (DCTerms, DCE) – describes metadata terms
(http://dublincore.org/schemas/)
• SKOS – Simple Knowledge Organisation Systems – representing
taxonomies
• FOAF – Friend of a Friend – representing people and the relations
between them
• CC – Creative Commons – copyright information
• GEO – Geonames – data about locations
• VANN – data about vocabularies
• DBPedia –
Ontologies used
• We mainly used OWL classes defined by DBpedia, such as:
• http://dbpedia.org/ontology/Location
• http://dbpedia.org/ontology/Place
• http://dbpedia.org/ontology/PopulatedPlace
• http://dbpedia.org/ontology/Museum
• http://dbpedia.org/ontology/Hospital
• http://dbpedia.org/ontology/EducationalInstitution
• http://dbpedia.org/class/yago/CitiesInRomania
• Other OWL classes used:
• http://umbel.org/umbel/rc/Village
• https://schema.org/PostalAddress
Naming rules for creating URIs
• Each URI has as prefix: http://opendata.cs.pub.ro/resource
• Our goal is to make URIs for each resource as easy to understand as
possible for humans
• Our statement is: Once you read the URI, you know what it is about
• For example:
• Locality: http://opendata.cs.pub.ro/resource/<localityName>_judet_<localityCounty>
• Hospital: http://opendata.cs.pub.ro/resource/<hospitalName>_hospital_<localityCounty>
Most common properties
• Most used properties were taken from well-known vocabularies such
as:
• VCARD: vcard:region, vcard:locality
• FOAF: foaf:mbox, foaf:fax
• GEO: geo:lat, geo:long
• Other properties were taken from those defined by dbpedia.org:
• http://dbpedia.org/property/postcode
• http://dbpedia.org/property/phonenumber
• We also defined properties in our own namespace:
• http://opendata.cs.pub.ro/property
Most common naming problems and how we
solve them
• The resource’s name was written with diacritics
• Replace diacritics with the corresponding plain letters
• The resource’s name was written with non-alphanumeric characters, such as
space, hyphen, comma
• Replace them with underscores
• After choosing the initial naming convention, we saw that conflicts can
occur
• For example: we initially built the URI for a museum from the museum’s name only,
but museums in different towns can share the same name, so we also added the
museum’s town to the URI
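A possible sketch of these naming rules in Python, using only the standard library. The function names are hypothetical, collapsing runs of special characters into a single underscore is a choice made here, and the exact museum pattern is an assumption (the slides say only that the town was added to the URI).

```python
import re
import unicodedata

BASE = "http://opendata.cs.pub.ro/resource"  # prefix stated in the naming rules

def slug(name):
    """Strip diacritics and replace runs of non-alphanumeric characters with '_'."""
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^0-9A-Za-z]+", "_", plain).strip("_")

def locality_uri(locality, county):
    """Follows the pattern <localityName>_judet_<localityCounty> from the slides."""
    return f"{BASE}/{slug(locality)}_judet_{slug(county)}"

def museum_uri(museum, town):
    """Assumed pattern: museum name followed by its town, to avoid name clashes."""
    return f"{BASE}/{slug(museum)}_{slug(town)}"
```

With this scheme, two museums that share a name but sit in different towns receive different URIs, which is the conflict described above.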
Stage 2: massively transforming data
• experiment with 20 students from a master’s class
• groups of 2 asked to choose 2 datasets and transform them
• + greatly increased the amount of data transformed
• - large amount of work to correct the errors introduced by students
• have to involve a larger number of volunteers
• students will be asked to offer expert support
• 2016 result: 10 new datasets, more than 500000 triples added
Biggest problem: PDF files
• Unfortunately, some tabular
data is hidden in scanned PDF files
• We created an algorithm to extract
only the tables from these scanned
files
• This way, we transform the
unmanageable scanned files into
tabular ones
• We want to improve existing tools
using contextual information
regarding the type of document
Stage 3 Linking to external data sets
• most important datasets:
• dbpedia
• people
• events
• places
• geonames
• all the places
How to link?
• query and disambiguate
• sometimes really difficult
• disambiguation
• by type
• by context
• not always possible automatically
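The "query and disambiguate" step could look roughly like the sketch below. It assumes the candidates have already been fetched from an external source and reduced to dictionaries; the field names, the example.org URIs and the sample records are all hypothetical.

```python
def disambiguate(name, expected_type, county, candidates):
    """Return the URI of the single matching candidate, or None if still ambiguous."""
    same_name = [c for c in candidates if c["name"].casefold() == name.casefold()]
    # Disambiguation by type: keep only candidates of the expected class.
    by_type = [c for c in same_name if c["type"] == expected_type]
    if len(by_type) == 1:
        return by_type[0]["uri"]
    # Disambiguation by context: here, the county the entity belongs to.
    by_context = [c for c in by_type if c["county"] == county]
    if len(by_context) == 1:
        return by_context[0]["uri"]
    return None  # not decidable automatically; leave for manual review

# Hypothetical candidates sharing the same name.
CANDIDATES = [
    {"uri": "http://example.org/place/1", "name": "Albești", "type": "PopulatedPlace", "county": "Mureș"},
    {"uri": "http://example.org/place/2", "name": "Albești", "type": "PopulatedPlace", "county": "Botoșani"},
    {"uri": "http://example.org/museum/3", "name": "Albești", "type": "Museum", "county": "Mureș"},
]
```

Returning None instead of guessing keeps wrong links out of the dataset in the cases where automatic disambiguation is not possible.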
Relevant tool: SILK
• http://silkframework.org/
• Generating links between related data items within different Linked Data
sources.
• Linked Data publishers can use Silk to set RDF links from their data sources to
other data sources on the Web.
• Applying data transformations to structured data sources.
or write some code
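import rdflib
from urllib.parse import quote
from rdflib.namespace import OWL, RDF
gn = rdflib.Namespace("http://www.geonames.org/ontology#")
# loc (one line of text), locURI, fullGraph and strip_accents() are defined elsewhere in the script
geoloc = strip_accents(loc.rstrip("\n"))  # drop the trailing newline, remove diacritics
query = "http://api.geonames.org/search?q=%s&maxRows=1&type=rdf&username=vladposea" % quote(geoloc)
g = rdflib.Graph()
g.parse(query, format="xml")  # the GeoNames search returns RDF/XML
for s, p, o in g.triples((None, RDF.type, gn.Feature)):
    fullGraph.add((locURI, OWL.sameAs, s))  # link our locality to the GeoNames feature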
<rdf:RDF >
<gn:Feature rdf:about="http://sws.geonames.org/686254/">
...
Stage 4 – Embed into CKAN
• CKAN - http://ckan.org/
• tool for publishing data
• aimed at governments and other public organizations
• specially designed for open data
• used internationally
• not built for linked data
• we envision developing a plugin to semi-automatically construct
linked data from the open data published
What do we have so far?
• Our focus was on “plain” data:
• Cities dataset published in RDF and linked with geonames.org
• Each created resource has a <owl:sameAs> property that links to geonames.org
• Schools dataset published in RDF
• Pharmacies dataset published in RDF
• Museums dataset published in RDF and linked with dbpedia.org
• Churches dataset published in RDF and linked with dbpedia.org
• 207382 URIs with overall 2683968 RDF triples
How did we transform the data?
• For each dataset:
• We identified what vocabulary should be used for each property
• We identified what additional properties should be created for each resource
• Each physical entity has an address; using the Google Geocode service we obtained the
geographical coordinates for that address
• We created one unique URI for each resource
• We generated the URIs by putting a lot of information inside them; for example, the URI
for one school is: http://opendata.cs.pub.ro/resource/<school_name>_<city>
• We opted for this encoding scheme to create more verbose URIs, not just hashes
• We linked each resource, where possible, to online semantic repositories such as
dbpedia.org and geonames.org
• The linking is done by searching for entities with the same type and name
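The per-dataset steps above can be illustrated with a small sketch that turns one tabular row into N-Triples lines. This is plain Python for illustration only (the project itself used Java with Apache Jena); the row keys and the choice of properties for a museum are assumptions based on the vocabularies listed earlier.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MUSEUM = "http://dbpedia.org/ontology/Museum"
VCARD_LOCALITY = "http://www.w3.org/2006/vcard/ns#locality"
GEO_LAT = "http://www.w3.org/2003/01/geo/wgs84_pos#lat"
GEO_LONG = "http://www.w3.org/2003/01/geo/wgs84_pos#long"

def literal(value):
    """Format a plain N-Triples literal, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'

def museum_to_ntriples(row):
    """Turn one tabular row (a dict with hypothetical keys) into N-Triples lines."""
    subject = "<" + row["uri"] + ">"
    return [
        f"{subject} <{RDF_TYPE}> <{MUSEUM}> .",
        f"{subject} <{VCARD_LOCALITY}> {literal(row['locality'])} .",
        f"{subject} <{GEO_LAT}> {literal(row['lat'])} .",
        f"{subject} <{GEO_LONG}> {literal(row['long'])} .",
    ]
```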
How can someone access the resources?
• We have published all RDF triples in a semantic repository:
• http://opendata.cs.pub.ro/repo
• It supports SPARQL queries
• http://opendata.cs.pub.ro/repo/sparql
• We document all published datasets in:
• Blog: http://opendata.cs.pub.ro/blog
• Wiki: http://opendata.cs.pub.ro/wiki
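A minimal sketch of how a client might talk to the SPARQL endpoint, assuming it accepts standard SPARQL Protocol GET requests and can return JSON results (not confirmed by the slides). The helper names are hypothetical; the request is only built here, and sending it would be done with urllib.request.urlopen.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://opendata.cs.pub.ro/repo/sparql"  # endpoint given on the slide

def build_sparql_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

def bindings(result_json, variable):
    """Extract the values bound to one variable from a parsed SPARQL JSON result."""
    rows = result_json["results"]["bindings"]
    return [row[variable]["value"] for row in rows if variable in row]
```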
SPARQL queries
Towns where there are no schools
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?loc
WHERE {
  ?loc rdf:type <http://dbpedia.org/ontology/Settlement> .
  FILTER NOT EXISTS {
    ?x <http://opendata.cs.pub.ro/property/institutie_in_localitate> ?loc .
  }
}
Find the museums linked with dbpedia.org
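PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?MusRO ?MusDB
WHERE {
  ?MusRO rdf:type <http://dbpedia.org/ontology/Museum> .
  ?MusRO owl:sameAs ?MusDB .
}
ORDER BY ?MusRO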
Example application
• All physical entities have an address, and we obtained the
geographical coordinates of each address.
• We put all these entities on a map, so users can see the museums,
hospitals or pharmacies nearest to their location
• The app is online:
• http://opendata.cs.pub.ro:3000
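The "nearest entities" lookup can be sketched with the haversine great-circle distance. This is an illustration in Python, not the application's actual code (which runs on node.js); the entity dictionaries with lat and long keys are an assumed shape.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest(entities, lat, lon, limit=3):
    """Return the entities closest to (lat, lon), nearest first."""
    by_distance = sorted(entities, key=lambda e: haversine_km(lat, lon, e["lat"], e["long"]))
    return by_distance[:limit]
```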
Technologies used in this project
• Storage layer:
• Apache Marmotta HEAD version
• Processing layer:
• Java using Apache POI for reading tabular data and Apache Jena for
converting data to RDF
• C with OpenCV and Tesseract for extracting tabular data from scanned PDF
files
• Visualization layer:
• Backend: node.js using sparql-client module for SPARQL queries
• Frontend: angular.js
Alternative Technologies
• Open Refine
• http://openrefine.org/
• formerly Google Refine
• allows users to
• explore data in various formats
• clean and transform data (clustering, easy or scripted transformations)
• reconcile and match data
• supports external web services
Karma
• semantic mapping tool http://labs.europeana.eu/apps/karma
• imports data in various formats
• transforms it to semantic data
• links it to DBPedia or GeoNames
• no features for statistical data integration
• no features for parsing pdf files
Named Entity Recognition
• Named Entity Recognition – identify entities in texts, apply tags, link
to permanent entities
• Open Calais – up to 5k free requests/day
• http://www.opencalais.com/
• Alchemy – made by IBM
• http://www.alchemyapi.com/
• 1k/day free
Apache Marmotta
• http://marmotta.apache.org/
• read-write linked data server
• open implementation of W3C’s
Linked Data Platform
Recommendation
https://www.w3.org/TR/ldp/
• repository
• SPARQL 1.1 engine
RDF Data Cube Vocabulary
• statistical data can’t be expressed using just one subject, predicate and
object
• RDF – graph
• statistical data – hypergraph
• RDF Data Cube https://www.w3.org/TR/vocab-data-cube -
recommendation for a vocabulary to describe multi-dimensional data
• compatible with Statistical Data and Metadata eXchange - SDMX
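To make the idea concrete: in RDF Data Cube, one statistical value becomes an observation node, and each dimension (for example hospital and year) is attached to that node with its own triple. The sketch below builds such triples as plain tuples; the example.org namespace, the dimension and measure URIs and the numbers are all hypothetical, and only the qb: terms come from the vocabulary.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/stat/"  # hypothetical namespace for this sketch

def observation(obs_id, dataset, dimensions, measure, value):
    """Build the triples for one qb:Observation with any number of dimensions."""
    subject = EX + "obs/" + obs_id
    triples = [
        (subject, RDF_TYPE, QB + "Observation"),
        (subject, QB + "dataSet", dataset),
    ]
    triples.extend((subject, prop, val) for prop, val in dimensions.items())
    triples.append((subject, measure, value))
    return triples

# Hypothetical example: number of beds in one hospital in one year.
beds_2015 = observation(
    "beds-h1-2015",
    EX + "dataset/hospital-beds",
    {EX + "dimension/hospital": EX + "hospital/h1", EX + "dimension/year": 2015},
    EX + "measure/beds",
    120,
)
```

The multi-dimensional value is thus split into several ordinary triples that share the observation as subject.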
Plan for the future
• Develop an automated way to choose the vocabulary for one dataset
• Focus on statistical data and publish them using RDF Data Cube
vocabulary
• Develop a more accurate method of linking resources
• Create more applications that use the published data
Papers
• LODRo: Using cultural Romanian open data to build new learning
applications
Octavian Rinciog, Vlad Posea, The International Scientific Conference eLearning and
Software for Education, Bucharest, 2016
• Publishing Romanian public health data as Linked Open Data
Octavian Rinciog, Vlad Posea, E-Health and Bioengineering Conference (EHB), Iasi,
2015
• The Semantic Representation of Open Data Regarding the Romanian
Companies
Marian Spoiala, Octavian Rinciog, Vlad Posea, RoEDU Conference, Bucharest, 2016
• GovLOD: Towards a Linked Open Data Portal
Octavian Rinciog, Vlad Posea, Poster in ISWC Conference, Tokyo, 2016
instead of references
http://www.ted.com/talks/tim_berners_lee_on_the_next_web

Top Rated Pune Call Girls Hadapsar ⟟ 6297143586 ⟟ Call Me For Genuine Sex Se...
 
PPT Item # 4 - 231 Encino Ave (Significance Only)
PPT Item # 4 - 231 Encino Ave (Significance Only)PPT Item # 4 - 231 Encino Ave (Significance Only)
PPT Item # 4 - 231 Encino Ave (Significance Only)
 
Financing strategies for adaptation. Presentation for CANCC
Financing strategies for adaptation. Presentation for CANCCFinancing strategies for adaptation. Presentation for CANCC
Financing strategies for adaptation. Presentation for CANCC
 
VIP Model Call Girls Shikrapur ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Shikrapur ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Shikrapur ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Shikrapur ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Human-AI Collaboration for Virtual Capacity in Emergency Operation Centers (E...
Human-AI Collaborationfor Virtual Capacity in Emergency Operation Centers (E...Human-AI Collaborationfor Virtual Capacity in Emergency Operation Centers (E...
Human-AI Collaboration for Virtual Capacity in Emergency Operation Centers (E...
 
CBO’s Recent Appeals for New Research on Health-Related Topics
CBO’s Recent Appeals for New Research on Health-Related TopicsCBO’s Recent Appeals for New Research on Health-Related Topics
CBO’s Recent Appeals for New Research on Health-Related Topics
 
The U.S. Budget and Economic Outlook (Presentation)
The U.S. Budget and Economic Outlook (Presentation)The U.S. Budget and Economic Outlook (Presentation)
The U.S. Budget and Economic Outlook (Presentation)
 
The Most Attractive Pune Call Girls Handewadi Road 8250192130 Will You Miss T...
The Most Attractive Pune Call Girls Handewadi Road 8250192130 Will You Miss T...The Most Attractive Pune Call Girls Handewadi Road 8250192130 Will You Miss T...
The Most Attractive Pune Call Girls Handewadi Road 8250192130 Will You Miss T...
 
Government e Marketplace GeM Presentation
Government e Marketplace GeM PresentationGovernment e Marketplace GeM Presentation
Government e Marketplace GeM Presentation
 
The Economic and Organised Crime Office (EOCO) has been advised by the Office...
The Economic and Organised Crime Office (EOCO) has been advised by the Office...The Economic and Organised Crime Office (EOCO) has been advised by the Office...
The Economic and Organised Crime Office (EOCO) has been advised by the Office...
 
Incident Command System xxxxxxxxxxxxxxxxxxxxxxxxx
Incident Command System xxxxxxxxxxxxxxxxxxxxxxxxxIncident Command System xxxxxxxxxxxxxxxxxxxxxxxxx
Incident Command System xxxxxxxxxxxxxxxxxxxxxxxxx
 

Linked Open Data in Romania

  • 1. Towards Linked Open Government Data Vlad Posea vlad.posea@cs.pub.ro
  • 2. about me • bachelor and PhD from Politehnica University of Bucharest • Master in Data Mining from University Lumiere of Lyon • research on competence management, semantic web, e-learning • business on career management and recruiting • now fellow of the Romanian American Foundation at the University of Rochester (Fulbright scholarship starting with 2017) for developing entrepreneurship in Romania
  • 3. Linked Data and Open Data • linked data = a way to connect data on the web using URIs and RDF, the most successful result of the Semantic Web initiative • open data = Open data is data that can be freely used, re-used and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike. • open government data = data regarding public institutions, published on governmental sites
  • 4. Very Short Intro on RDF • data represented as statements • statements contain • subject • predicate • object • subject, predicate and sometimes objects are URIs • URIs are used to uniquely identify entities or properties
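The statement model on this slide can be sketched in a few lines of Python. This is only an illustration: the museum URI is a made-up example under the project's prefix, and the output is simplified N-Triples rather than what a full RDF library would produce.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str             # always a URI
    predicate: str           # always a URI
    obj: str                 # a URI or a plain literal
    obj_is_uri: bool = True  # objects are only sometimes URIs

def to_ntriples(t: Triple) -> str:
    """Serialise one statement as a (simplified) N-Triples line."""
    if t.obj_is_uri:
        obj = f"<{t.obj}>"
    else:
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."

# Hypothetical resource URI, for illustration only.
museum = "http://opendata.cs.pub.ro/resource/Example_Museum_Bucuresti"
statements = [
    Triple(museum,
           "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
           "http://dbpedia.org/ontology/Museum"),
    Triple(museum,
           "http://www.w3.org/2000/01/rdf-schema#label",
           "Example Museum",
           obj_is_uri=False),
]

for statement in statements:
    print(to_ntriples(statement))
```

The first statement has a URI as its object, the second a literal, which is the "sometimes objects are URIs" case from the slide.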
  • 6. Why Do We Need Open Data? • transparency • how does the government spend money • fuel innovation and entrepreneurship • https://www.youtube.com/watch?v=sUqY5ySylXg (Todd Park discussing benefits of Open Government Data) • opening weather data and GPS data allowed people to build businesses • “last year alone civilian and commercial access to GPS created 90 billion $ worth of value” (2013) • participatory governance • citizens enabled in decision making • “making a full read/write society” (http://opengovernmentdata.org/why/)
  • 7. Open Data Quality • five stars of open data proposed by Tim Berners-Lee • (1) be available on the Web under an open licence, • (2) be in the form of structured data, • (3) be in a non-proprietary file format, • (4) use URIs as its identifiers (see also RDF), • (5) include links to other data sources (see linked data). http://opendatahandbook.org/glossary/en/terms/five-stars-of-open-data/
  • 8. Open Data in the World • Global Open Data Quality measures how governments implement Open Data • evaluates if a country posted data on • national statistics • government budget • government spending • legislation • election results • national map • pollution • also evaluates the quality of the posted data • companies • location datasets • government procurement • water quality • weather forecast • land ownership • transport timetables • health performance
  • 9. Global Open Data Quality • relevant progress has been made in terms of opening data • scores would be much lower if 5-star data had a bigger weight • http://index.okfn.org/place/ • http://index.okfn.org/methodology/
  • 10. Open Data in the US • data.gov – 190k datasets • mostly html (70k) • RDF below 5% of the total number of datasets • more than a quarter are either pdf, jpg, tiff • relevant steps • data.gov launched in 2009 • Open Government Partnership 2011 (http://www.opengovpartnership.org/) • Digital Accountability and Transparency Act (2014) • creating publishing standards for public spending data https://max.gov/maxportal/assets/public/offm/DataStandardsFinal.htm
  • 11. Open Data in Saint Louis • https://www.stlouis-mo.gov/data/ - list of data sets • most of them html or pdf • some confuse open data with reports
  • 12. Open Data in Romania • Data.gov.ro • National portal where public institutions put all the data • Types of resources published: CSV (***), PDF(*), XLS (**) • There is no connection between files (zero files with 4 or 5 *) • September 2016: • 72 public institutions • 8185 files • Each file can have its own structure • uses CKAN (http://ckan.org/)
  • 13. Why do we need Linked Open Data • classic workflow when working with open data: • analyze CSV files • define own data model • import data from CSV files into data model • solve import problems (naming differences, character encoding issues) • identify entities and link them to other entities existing in the model • link data from different CSV files in a common model • extract relevant information • write the program logic to exploit the data
  • 14. Why Do We Need Linked Open Data • classic workflow when working with linked open data: • analyze models • write query to extract relevant information • write the program logic to exploit the data • can use directly more than one dataset by performing “joins” in the queries • much faster to develop an application • much easier to reuse data
  • 15. Linked Open Data in Romania • Our goal is to transform open data from Romania into Linked Open Data. • Transform data into RDF triples (Subject, Predicate, Object) • Link entities with existing online resources, especially from dbpedia.org and Geonames • Create a platform where each published file is transformed into RDF • Create rich applications using SPARQL queries
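The first step of this goal, turning a tabular file into triples, might look roughly like the sketch below. It is not the project's pipeline (the deck says that one is written in Java with Apache POI and Jena). The CSV columns, the sample rows and the column-to-predicate mapping are assumptions made for illustration; only the resource prefix and the two predicate URIs come from the slides and from the vCard vocabulary.

```python
import csv
import io

BASE = "http://opendata.cs.pub.ro/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed mapping from CSV column names to predicate URIs.
COLUMN_TO_PREDICATE = {
    "county": "http://www.w3.org/2006/vcard/ns#region",
    "postcode": "http://dbpedia.org/property/postcode",
}

def rows_to_triples(csv_text, rdf_class):
    """Yield (subject, predicate, object) tuples for every row of a CSV file."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Simplified URI: the real naming rules also handle diacritics and punctuation.
        subject = BASE + row["name"].strip().replace(" ", "_")
        yield (subject, RDF_TYPE, rdf_class)
        for column, predicate in COLUMN_TO_PREDICATE.items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty cells instead of emitting empty literals
                yield (subject, predicate, value)

# Made-up sample data; the second row has no postcode.
sample = "name,county,postcode\nSpitalul Exemplu,Cluj,400000\nSpitalul Test,Iasi,\n"
triples = list(rows_to_triples(sample, "http://dbpedia.org/ontology/Hospital"))
for triple in triples:
    print(triple)
```

Keeping the column mapping in a dictionary means a new file with a different structure only needs a new mapping, not new code.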
  • 16. Vision • create tools and workflows to allow non-technical users to add Linked Data to the government website • offer an API for developers who want to create apps based on open government data • integrate the software into CKAN (the open data portal used by most governments) to allow every government to create linked data
  • 17. Stages 1. modeling data 2. massively transforming data 3. linking data to external data sets 4. embed into CKAN
  • 18. First Stage – Modeling Data • Identify the most common ontologies used • Create naming rules so that the same resource always gets the same URI • Identify the most common properties of the open data and the ontological properties associated with them • Identify the most common naming problems and write hacks to solve them • different encodings • different spelling • different lexicalization of the same concepts
  • 19. Open Data Types • Numerical data: • Different budgets or revenues • Different statistical data: number of cars by type and year, number of beds per hospital • Etc. • “Plain” data: • Information about entities: • Lawyers, Schools, Pharmacies, Museums, Archeological sites • Etc. • Found in tabular files, such as CSV or XLS
  • 20. Most Common Vocabularies for Open Data https://lov.okfn.org/dataset/lov/
  • 21. Most Common Vocabularies for Open Data • Dublin Core (DCTerms, DCE) – describes metadata terms (http://dublincore.org/schemas/) • SKOS – Simple Knowledge Organization System – representing taxonomies • FOAF – Friend of a Friend – representing people and the relations between them • CC – Creative Commons – copyright information • GEO – Geonames – data about locations • VANN – data about vocabularies • DBPedia –
  • 22. Ontologies used • We used especially OWL classes defined by dbpedia such as: • http://dbpedia.org/ontology/Location • http://dbpedia.org/ontology/Place • http://dbpedia.org/ontology/PopulatedPlace • http://dbpedia.org/ontology/Museum • http://dbpedia.org/ontology/Hospital • http://dbpedia.org/ontology/EducationalInstitution • http://dbpedia.org/class/yago/CitiesInRomania • Other used OWL classes: • http://umbel.org/umbel/rc/Village • https://schema.org/PostalAddress
  • 23. Naming rules for creating URIs • Each URI has as prefix: http://opendata.cs.pub.ro/resource • Our goal is to make URIs for each resource as easy to understand as possible for humans • Our statement is: Once you read the URI, you know what it is about • For example: • Locality: http://opendata.cs.pub.ro/resource/<localityName>_judet_<localityCounty> • Hospital: http://opendata.cs.pub.ro/resource/<hospitalName>_hospital_<localityCounty>
  • 24. Most common properties • The most used properties were taken from well-known vocabularies such as: • VCARD: vcard:region, vcard:locality • FOAF: foaf:mbox, foaf:fax • GEO: geo:lat, geo:long • Other properties were taken from those defined by dbpedia.org: • http://dbpedia.org/property/postcode • http://dbpedia.org/property/phonenumber • We also defined properties in our own namespace: • http://opendata.cs.pub.ro/property
  • 25. Most common naming problems and how we solve them • The resource’s name was written with diacritics • Replace each diacritic with the corresponding plain letter • The resource’s name contained non-alphanumeric characters, such as space, hyphen or comma • Replace them with underscores • After choosing the initial naming convention, we found that it could produce conflicts • For example: we initially built the URI for a museum from its name alone, but museums with the same name can exist in several towns, so we added the museum’s town to the URI
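The naming rules and fixes from the last slides could be sketched as follows. This is a minimal sketch, not the project's code: the function names are invented here, collapsing a run of several non-alphanumeric characters into a single underscore is a choice made for this example, and the exact museum pattern (name followed by town) is an assumption based on the description above.

```python
import re
import unicodedata

BASE = "http://opendata.cs.pub.ro/resource/"

def strip_diacritics(text):
    """Replace accented letters with their base letter (for example 'ș' becomes 's')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def slug(text):
    """Strip diacritics, then turn each run of non-alphanumerics into one underscore."""
    return re.sub(r"[^A-Za-z0-9]+", "_", strip_diacritics(text)).strip("_")

def locality_uri(name, county):
    # Pattern from slide 23: <localityName>_judet_<localityCounty>
    return f"{BASE}{slug(name)}_judet_{slug(county)}"

def museum_uri(name, town):
    # The town is part of the URI so that same-named museums
    # in different towns do not collide (assumed pattern).
    return f"{BASE}{slug(name)}_{slug(town)}"

print(locality_uri("Târgu Mureș", "Mureș"))
print(museum_uri("Muzeul de Artă", "Cluj-Napoca"))
```

Because the URI is derived only from the cleaned name and its context, two files that mention the same locality produce the same URI, which is what later makes the datasets joinable.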
  • 26.
  • 27.
  • 28. Stage 2: massively transforming data • experiment with 20 students from a master’s class • groups of 2 asked to choose 2 datasets and transform them • + greatly increased the amount of data transformed • - large amount of work needed to correct the errors introduced by students • have to involve a larger number of volunteers • students will be asked to offer expert support • 2016 result: 10 new datasets, more than 500000 triples added
  • 29. Biggest problem: PDF files • Unfortunately, some tabular data is hidden in scanned PDF files • We created an algorithm that extracts only the tables from these scanned files • This way, we transform unmanageable scanned files into tabular ones • We want to improve existing tools using contextual information about the type of document
  • 30. Stage 3 Linking to external data sets • most important datasets: • dbpedia • people • events • places • geonames • all the places
  • 31. How to link? • query and disambiguate • sometimes really difficult • disambiguation • by type • by context • not always possible automatically
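A rough sketch of the "query and disambiguate" idea: given the candidates returned by a lookup, narrow them first by type and then by context, and give up when more than one candidate remains. The `Candidate` record, the use of the county as context, and the example.org URIs are all assumptions for illustration; they do not describe the project's actual matcher.

```python
from dataclasses import dataclass
from typing import List, Optional

SETTLEMENT = "http://dbpedia.org/ontology/Settlement"
MUSEUM = "http://dbpedia.org/ontology/Museum"

@dataclass
class Candidate:
    uri: str
    label: str
    rdf_type: str
    county: Optional[str] = None  # context used as a tie-breaker

def disambiguate(name: str, rdf_type: str, county: Optional[str],
                 candidates: List[Candidate]) -> Optional[str]:
    """Return the URI of the single matching candidate, or None if still ambiguous."""
    same_name = [c for c in candidates if c.label.casefold() == name.casefold()]
    # Step 1: disambiguate by type.
    by_type = [c for c in same_name if c.rdf_type == rdf_type]
    if len(by_type) == 1:
        return by_type[0].uri
    # Step 2: disambiguate by context (here, the county).
    by_context = [c for c in by_type if c.county == county]
    if len(by_context) == 1:
        return by_context[0].uri
    return None  # not always possible automatically: leave for manual review

# Made-up candidates with placeholder URIs.
candidates = [
    Candidate("http://example.org/place/1", "Valeni", SETTLEMENT, "Vaslui"),
    Candidate("http://example.org/place/2", "Valeni", SETTLEMENT, "Olt"),
    Candidate("http://example.org/museum/3", "Valeni", MUSEUM),
]

print(disambiguate("Valeni", SETTLEMENT, "Olt", candidates))
print(disambiguate("Valeni", SETTLEMENT, "Cluj", candidates))
```

Returning `None` instead of guessing keeps wrong `owl:sameAs` links out of the data, at the cost of some manual work.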
  • 32. Relevant tool: SILK • http://silkframework.org/ • Generating links between related data items within different Linked Data sources. • Linked Data publishers can use Silk to set RDF links from their data sources to other data sources on the Web. • Applying data transformations to structured data sources.
  • 33. or write some code
      # loc, locURI, fullGraph, gn, OWL and strip_accents are not defined on the slide
      geoloc = loc.decode("utf8")[:-1]
      query = strip_accents("http://api.geonames.org/search?q=%s&maxRows=1&type=rdf&username=vladposea" % geoloc)
      g = rdflib.Graph()
      g.parse(query.encode("ascii", "ignore"), format="xml", encoding="utf-8")
      # link the local resource to every GeoNames feature found
      for s, p, o in g.triples((None, rdflib.RDF.type, gn.Feature)):
          fullGraph.add((locURI, OWL.sameAs, s))
    Fragment of the RDF shown on the slide:
      <rdf:RDF > <gn:Feature rdf:about="http://sws.geonames.org/686254/"> ...
  • 34. Stage 4 – Embed into CKAN • CKAN - http://ckan.org/ • tool for publishing data • aimed at governments and other public organizations • specially designed for open data • used internationally • not built for linked data • we envision developing a plugin to semi-automatically construct linked data from the open data published
  • 35. What do we have so far? • Our focus was on “plain” data: • Cities dataset published in RDF and linked with geonames.org • Each created resource has an <owl:sameAs> property that links to geonames.org • Schools dataset published in RDF • Pharmacies dataset published in RDF • Museums dataset published in RDF and linked with dbpedia.org • Churches dataset published in RDF and linked with dbpedia.org • 207382 URIs with overall 2683968 RDF triples
  • 36. How did we transform the data? • For each dataset: • We identified which vocabulary should be used for each property • We identified which additional properties should be created for each resource • Each physical entity has an address, and we used the Google Geocode service to obtain the geographical coordinates for that address • We created one unique URI for each resource • We generated the URIs by putting a lot of information inside them; for example, the URI for a school is: http://opendata.cs.pub.ro/resource/<school_name>_<city> • We opted for this encoding scheme to create more verbose URIs, not just hashes • We linked each resource, where possible, to online semantic repositories such as dbpedia.org and geonames.org • The linking is done by searching for entities with the same type and name
  • 37. How can someone access the resources? • We have published all RDF triples in a semantic repository: • http://opendata.cs.pub.ro/repo • It supports SPARQL queries • http://opendata.cs.pub.ro/repo/sparql • We document all published datasets on: • Blog: http://opendata.cs.pub.ro/blog • Wiki: http://opendata.cs.pub.ro/wiki
  • 38. SPARQL queries
    Towns where there are no schools:
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      SELECT ?loc WHERE {
        ?loc rdf:type <http://dbpedia.org/ontology/Settlement> .
        FILTER NOT EXISTS { ?x <http://opendata.cs.pub.ro/property/institutie_in_localitate> ?loc . }
      }
    Find the museums linked with dbpedia.org:
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX owl: <http://www.w3.org/2002/07/owl#>
      SELECT ?MusRO ?MusDB WHERE {
        ?MusRO rdf:type <http://dbpedia.org/ontology/Museum> .
        ?MusRO owl:sameAs ?MusDB .
      }
      ORDER BY ?MusRO
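Queries like these can be sent to the repository over HTTP. The sketch below only builds the request and does not send it. It assumes the endpoint from slide 37 accepts the standard SPARQL protocol (a GET request with a `query` parameter) and can return JSON results; neither is confirmed by the slides, and the function name is invented for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint URL as given on slide 37; protocol support is an assumption.
ENDPOINT = "http://opendata.cs.pub.ro/repo/sparql"

def build_sparql_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL protocol GET request for a SELECT query."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

museum_query = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?MusRO ?MusDB WHERE {
  ?MusRO rdf:type <http://dbpedia.org/ontology/Museum> .
  ?MusRO owl:sameAs ?MusDB .
}
ORDER BY ?MusRO
"""

request = build_sparql_request(museum_query)
print(request.full_url)

# To execute the query (requires network access and a live endpoint):
#   import json, urllib.request
#   with urllib.request.urlopen(request) as response:
#       rows = json.load(response)["results"]["bindings"]
```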
  • 39. Example application • All physical entities have an address, and we obtained the geographical coordinates of each address. • We put all these entities on a map, so users can see the museums, hospitals or pharmacies nearest to their location • The app is online: • http://opendata.cs.pub.ro:3000
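The "nearest entities" feature can be illustrated with a great-circle distance and a sort. This is a sketch under assumptions, not the application's code (which the deck says runs on node.js and angular.js): the place names are made up, the coordinates are approximate city centres, and the haversine formula is swapped in here as one common way to compute the distance.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest(user_lat, user_lon, places, k=3):
    """Return the k places closest to the user; each place is (name, lat, lon)."""
    return sorted(places, key=lambda p: haversine_km(user_lat, user_lon, p[1], p[2]))[:k]

# Made-up entities with approximate coordinates.
places = [
    ("Museum A (Cluj-Napoca)", 46.7712, 23.6236),
    ("Museum B (Bucharest)", 44.4268, 26.1025),
    ("Museum C (Iasi)", 47.1585, 27.6014),
]

for name, lat, lon in nearest(44.43, 26.10, places, k=2):
    print(name, round(haversine_km(44.43, 26.10, lat, lon), 1), "km")
```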
  • 40.
  • 41. Technologies used in this project • Storage layer: • Apache Marmotta HEAD version • Processing layer: • Java using Apache POI for reading tabular data and Apache Jena for converting data to RDF • C with OpenCV and Tesseract for extracting tabular data from scanned PDF files • Visualization layer: • Backend: node.js using sparql-client module for SPARQL queries • Frontend: angular.js
  • 42. Alternative Technologies • Open Refine • http://openrefine.org/ • formerly Google Refine • allows users to • explore data in various formats • clean and transform data (clustering, easy or scripted transformations) • reconcile and match data • supports external web services
  • 43. Karma • semantic mapping tool http://labs.europeana.eu/apps/karma • imports data in various formats • transforms it to semantic data • links it to DBPedia or GeoNames • no features for statistical data integration • no features for parsing pdf files
  • 44. Named Entity Recognition • Named Entity Recognition – identify entities in texts, apply tags, link to permanent entities • Open Calais – up to 5k free requests/day • http://www.opencalais.com/ • Alchemy – made by IBM • http://www.alchemyapi.com/ • 1k/day free
  • 45. Apache Marmotta • http://marmotta.apache.org/ • read-write Linked Data server • open implementation of W3C’s Linked Data Platform Recommendation https://www.w3.org/TR/ldp/ • repository • SPARQL 1.1 engine
  • 46. RDF Data Cube Vocabulary • statistical data can’t be expressed using just subject, predicate and object • RDF – graph • statistical data – hypergraph • RDF Data Cube https://www.w3.org/TR/vocab-data-cube - recommendation for a vocabulary to describe multi-dimensional data • compatible with Statistical Data and Metadata eXchange (SDMX)
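A single statistical value depends on several dimensions at once (for example vehicle type and year), so it cannot be carried by one triple. The Data Cube approach introduces an observation node and attaches every dimension and the measured value to it. The sketch below shows that shape with plain tuples. `qb:Observation` and `qb:dataSet` are terms of the vocabulary; the dataset URI, the dimension and measure property names, and the number itself are hypothetical, and a complete cube would also need a data structure definition, which is omitted here.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Project namespaces from the slides; the property names used below are hypothetical.
PROP = "http://opendata.cs.pub.ro/property/"
RES = "http://opendata.cs.pub.ro/resource/"

def observation_triples(dataset_uri, obs_id, dimensions, measure_name, value):
    """Describe one statistical value as a qb:Observation node with its dimensions."""
    obs = f"{dataset_uri}/obs/{obs_id}"
    triples = [
        (obs, RDF_TYPE, QB + "Observation"),
        (obs, QB + "dataSet", dataset_uri),
    ]
    for name, dim_value in dimensions.items():
        triples.append((obs, PROP + name, dim_value))  # one triple per dimension
    triples.append((obs, PROP + measure_name, value))  # the measured value
    return triples

# Hypothetical example: number of registered cars by type and year (made-up figure).
cube = observation_triples(
    dataset_uri=RES + "registered_vehicles",
    obs_id="car_2015",
    dimensions={"vehicle_type": "car", "year": "2015"},
    measure_name="vehicle_count",
    value="12345",
)
for triple in cube:
    print(triple)
```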
  • 47. Plan for the future • Develop an automated way to choose the vocabulary for one dataset • Focus on statistical data and publish them using RDF Data Cube vocabulary • Develop a more accurate method of linking resources • Create more applications that use the published data
  • 48. Papers • LODRo: Using cultural Romanian open data to build new learning applications. Octavian Rinciog, Vlad Posea. The International Scientific Conference eLearning and Software for Education, Bucharest, 2016 • Publishing Romanian public health data as Linked Open Data. Octavian Rinciog, Vlad Posea. E-Health and Bioengineering Conference (EHB), Iasi, 2015 • The Semantic Representation of Open Data Regarding the Romanian Companies. Marian Spoiala, Octavian Rinciog, Vlad Posea. RoEDU Conference, Bucharest, 2016 • GovLOD: Towards a Linked Open Data Portal. Octavian Rinciog, Vlad Posea. Poster in ISWC Conference, Tokyo, 2016