1. Creative Commons CC BY 3.0:
allowed to share & remix
(also commercial)
but must attribute
Frank van Harmelen
Vrije Universiteit
From the dream of
the Semantic Web
to the reality of
Linked Open Data
2. “There is lots of data we all use every day,
and it’s not connected.
I can see my bank statements on the web, and my photos,
and I can see my appointments in a calendar.
But can I see my photos in a calendar
to see what I was doing when I took them?
Can I see bank statement lines in a calendar?
No... Why not?
Because we don’t have a web of data.
Because data is controlled by applications
and each application keeps it to itself.”
The Semantic Web Dream
3. Two problems to solve
Distributed information
Semantic Web
Heterogeneous information
5. SW/Linked Data in 4 principles
1. Give all things a name
2. Make a graph of relations between the things
at this point we have (only) a Giant Graph
3. Make sure all names are URIs
at this point we have (only) a Giant Global Graph
4. Add semantics (= predictable inference)
Now we have a Giant Global Knowledge Graph
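The first three principles can be sketched in a few lines of Python. This is only an illustration, not how any particular triple store works: the `http://example.org/` namespace, the `uri` and `neighbours` helpers, and the example relations are all assumptions made up for this sketch. Principle 4 (semantics) is not covered here.

```python
# Principles 1 and 3: every thing gets a name, and every name is a URI.
# The example.org namespace and all names below are hypothetical.
EX = "http://example.org/"


def uri(local_name: str) -> str:
    """Turn a local name into a globally unique, URI-style name."""
    return EX + local_name


# Principle 2: a graph is a set of (subject, predicate, object) edges.
graph = {
    (uri("Frank"), uri("has-birth-place"), uri("Bussum")),
    (uri("Frank"), uri("works-at"), uri("VrijeUniversiteit")),
    (uri("Bussum"), uri("located-in"), uri("Netherlands")),
}


def neighbours(graph, node):
    """Return the (predicate, object) pairs on edges leaving a node."""
    return {(p, o) for (s, p, o) in graph if s == node}


for predicate, obj in sorted(neighbours(graph, uri("Frank"))):
    print(predicate, "->", obj)
```

Because names for relations are URIs too, the edges themselves can be described by further triples in the same graph.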
8. P3. The names are addresses on the Web
Allows integration of data from different owners
at different locations
[Diagram: a triple [<x> rdf:type <T>] linking names from different owners & locations, e.g. geonames:.. and Dbpedia/Village]
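A minimal sketch of why shared URIs allow integration: if two owners use the same URI for the same thing, merging their data is just a set union, and the statements meet at the shared node. The `BUSSUM` URI and the `example.org` vocabulary are hypothetical, not real geonames or DBpedia identifiers; only the `rdf:type` URI is the standard one.

```python
# Standard URI for rdf:type; everything under example.org is made up.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BUSSUM = "http://example.org/place/Bussum"  # assumed shared name

# Owner A publishes where the place is located.
owner_a = {
    (BUSSUM, "http://example.org/vocab/located-in", "http://example.org/place/Netherlands"),
}

# Owner B, on another server, publishes what kind of thing it is.
owner_b = {
    (BUSSUM, RDF_TYPE, "http://example.org/vocab/Village"),
}

# Integration is a plain set union: no schema mapping is needed
# because both owners used the same name for the same thing.
merged = owner_a | owner_b

about_bussum = {(p, o) for (s, p, o) in merged if s == BUSSUM}
print(len(about_bussum), "statements about Bussum after merging")
```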
11. P4. explicit & formal semantics
• assign types to things
• assign types to relations
• organise types in a hierarchy
• impose constraints on
possible interpretations
OWL
12. Why is semantics hard for computers?
Or: What’s it like to be a computer?
13. P4. explicit & formal semantics
[Diagram: Frank has-birth-place Bussum; Frank has-birth-place Meren]
• has-birth-place relates person to location
Frank is person
• has-birth-place relates 1 person to 1 location
Bussum = Meren
lower bound, upper bound
14. SW/Linked Data in 4 principles
Give all things a name
Make a graph of relations between the things
at this point we have (only) a Giant Graph
Make sure all names are URIs
at this point we have (only) a Giant Global Graph
Add semantics (= predictable inference)
Now we have a Giant Global Knowledge Graph
18. Is anybody using this for real?
Schema.org:
Vocabulary to describe “things on the web”
Agreed upon by all major search engines
600+ types, 1000 properties
used by 10M+ sites
shows up in 36% of all Google results
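As an illustration of what such markup can look like, here is a minimal sketch that builds a Schema.org description as JSON-LD and wraps it in the script tag used to embed it in a web page. `Person`, `Place`, `name` and `birthPlace` are Schema.org terms; the example values are assumptions chosen to match the Frank/Bussum example in this talk.

```python
import json

# A Schema.org description of a person and their birth place.
# The values are illustrative only.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Frank",
    "birthPlace": {"@type": "Place", "name": "Bussum"},
}

json_ld = json.dumps(person, indent=2)

# JSON-LD is typically embedded in a page inside a script element.
snippet = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(snippet)
```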
22. data.gov.uk
started in early 2010 with 3000 datasets
include Ordnance Survey data
• map safety of bicycle routes
• inform home buyers about their new neighborhood.
• school finder
• nursery finder
• pollution alert
• fix my neighbourhood
• regional expenditure map
• www.wheredidmytaxgo.co.uk/
24. Data.gov
6 Other nations establishing open data
Canada, Ireland, Norway, Australia, New Zealand
8 States now offering data sites
California, Utah, Michigan, Massachusetts, Washington, ...
8 Cities in America with open data
San Francisco, New York City, Austin, ...
Fact or Fiction?
Followers from data.gov
Linked Open Data for Open Government
World Wide
28. • TOOI: Overheid.nl | KOOP Waardelijsten (value lists)
• New metadata standard for the
entire government | Archive-IT
• Registration of educational institutions RIO - DUO
(onderwijsregistratie.nl)
34. • NXP is a semiconductor (microchip) manufacturer
• Established: 2006 (formerly a division of Philips) with 50+
years of experience in semiconductors
• Headquarters: Eindhoven, The Netherlands
• Customers include Apple, Bosch, Continental, Delphi,
Gemalto, Giesecke/Devrient, Huawei, NSN, Panasonic and
Samsung
• Portfolio of 26,000+ products
37. Uber
“When an eater enters a query, we try to understand their
intent based on our knowledge of food organized as a graph”
Uber Engineering blog, June 6 2018
39. PILOD platform
Platform Linked Data Nederland (pldn.nl):
Great source of practical info on Linked Open Data:
• learning resources,
• good use-cases (LinkedDataParels2019.pdf),
• steps to take
Editor's notes
The good news: a distributed knowledge-base that describes hundreds of millions of items through tens of billions of relations between them, classifying them into hundreds of thousands of different classes, hosted on a web of thousands of different servers across the world, with fully distributed access and open to contributions from anybody. A knowledge-base on this scale, of this size and of such broad coverage would have been unthinkable 15 years ago, but it has now become reality under a variety of names such as the Semantic Web, the Linked Open Data cloud, or the Web of Data.
The bad news: despite this success, we actually understand very little of the structure of the Web of Data. Its formal meaning is specified in logic, but with its scale, context dependency and dynamics, the Web of Data has outgrown its traditional model-theoretic semantics. Is the meaning of a logical statement (an edge in the graph) dependent on the cluster ("context") in which it appears? Does a more densely connected concept (node) contain more information? Is the path length between two nodes related to their semantic distance? Properties such as clustering, connectivity and path length are not described, much less explained, by model-theoretic semantics. Do such properties contribute to the meaning of a knowledge graph?
To properly understand the structure and meaning of knowledge graphs, we should no longer treat knowledge graphs as (only) a set of logical statements, but treat them properly as a graph. But how to do this is far from clear. In this talk, we'll report on some of our early results on some of these questions, but we'll ask many more questions for which we don't have answers yet.
We’re going to explain all of these.
Give all things a name (including non-physical things like a date, a year, a location, a movie, the color red, a disease, etc). That’s lots of names.
Make a graph of those names: nodes are the (names of) things, edges are the relations between them. Notice names for non-physical things like “1999”.
This creates a giant graph.
This slide is sloppy: all of these names should be URLs, which is the next principle.
So now, anybody can assign any property to any object published by anybody else. Together this creates a giant GLOBAL graph.
To make that giant global graph a knowledge graph, we need to assign formal meaning. That meaning will have a very simple structure: more or less the modelling primitives you find in any widely accepted modelling language.
First, let’s remind ourselves how hard it is for computers to find “meaning” in anything.
Mind-reading game to explain semantics.
If I show the audience the top triple, and we share a little bit of background knowledge in the square box (the “ontology”), I can predict what the audience will infer from the top triple. The shared background knowledge forces us to believe certain things (such as that the right blobs must be locations), and forbids us to believe certain things (such as that the two right blobs are different). By increasing the background knowledge, the enforced conclusions (lower bound on agreement) and the forbidden conclusions (upper bound on agreement) get closer and closer, and the remaining space for ambiguity and misunderstanding shrinks. Not only misunderstanding between people, but also between machines.
Slogan: semantics is when I can predict what you will infer when I send you something.
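The mind-reading game can be sketched as a tiny rule engine. This is a simplified illustration, not an OWL reasoner: the `infer` function, the `"is"` and `"="` markers, and the reading of the slide as two has-birth-place triples (to Bussum and to Meren) are assumptions made for this sketch.

```python
def infer(triples, domains, ranges, functional):
    """Apply three simple rules once and return the enlarged set of facts."""
    facts = set(triples)
    inferred = set()
    for (s, p, o) in facts:
        if p in domains:   # the subject must have the relation's domain type
            inferred.add((s, "is", domains[p]))
        if p in ranges:    # the object must have the relation's range type
            inferred.add((o, "is", ranges[p]))
    for (s1, p1, o1) in facts:
        for (s2, p2, o2) in facts:
            # A functional relation allows one object per subject, so two
            # different names for that object must denote the same thing.
            if p1 in functional and (s1, p1) == (s2, p2) and o1 != o2:
                inferred.add((o1, "=", o2))
    return facts | inferred


# What I send you (the triples) ...
triples = {
    ("Frank", "has-birth-place", "Bussum"),
    ("Frank", "has-birth-place", "Meren"),
}
# ... and the background knowledge we share (the "ontology").
facts = infer(
    triples,
    domains={"has-birth-place": "person"},
    ranges={"has-birth-place": "location"},
    functional={"has-birth-place"},
)
for fact in sorted(facts - triples):
    print(fact)
```

Every receiver that applies the same rules to the same triples derives the same extra facts, which is the sense in which the sender can predict what will be inferred.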
We’re going to explain all of these.
From ivo@velitchkov.eu
1. For Morgan Stanley etc see case studies of Top Quadrant
2. For Volkswagen, Nokia, Daimler, Bosch, I couldn't quickly find an online resource but they are all clients of eccenca
3. I can't remember seeing Schneider Electric, which is a heavy RDF user. You can find them along with many others on Stardog's customer page
4. Philips, Credit Suisse etc. at PoolParty customer page.
5. Taxonic is now implementing an asset management system based on RDF at Schiphol Airport, but you should ask Jan if they are fine with associating their logo with that
6. I saw the logo of the European Commission, but not of European Council (SPARQL: http://data.consilium.europa.eu/sparql ) and Publications Office (SPARQL: http://publications.europa.eu/webapi/rdf/sparql)
We’ve seen this example
That works because all three major search engines are sharing a single very lightweight ontology.
The US government is publishing a great many datasets in Semantic Web format, so that citizens and companies can re-use these data for their own purposes (commercial, lobbying, education, science, etc.).
Lots of Governments around the world do this.
In Europe too
Journalists re-use bits of information, text and images from other journalists all the time. SemWeb technology made that process more efficient. The BBC website, powered by SemWeb technology, was the busiest website in the world during the London Olympic Games.
And yes, just as NXP made an ontology about their electronic products, the BBC made an ontology about Olympic sports.
This company had so many variations on their products that their own engineers couldn’t find the specs of each other's designs anymore.
After it was such a success for their own engineers, they also made portions of it open to their customers.