Selecting with SPARQL
Searching Linked Data with SPARQL using the British National Bibliography data

Owen Stephens, November 2013
Monday, 25 November 13

These slides introduce SPARQL, the ‘SELECT’ query in SPARQL, and show how you can use relatively
straightforward SELECT queries on the British Library’s BNB SPARQL endpoint
SPARQL
• Pronounced “Sparkle”
• Recursive acronym for “SPARQL Protocol and RDF Query Language”

• A W3C Standard
• A way of querying RDF in a triple store

One of those recursive acronyms beloved of some of the IT community (e.g. see also GNU = GNU’s Not Unix).
Basically a standard way of querying RDF stored in a triple store. A triple store is a database designed specifically
to store RDF triples - the equivalent of the relational database management systems (RDBMS) that are widely used
for library systems.
Just as there are different RDBMS products (proprietary and open source) such as Oracle, MySQL and SQL
Server, there are different triple stores. Popular ones include:
• OWLIM
• Jena
• 4Store and 5Store (currently - November 2013 - the British Library BNB runs on 5Store hosted by TSO http://openup.tso.co.uk/openup-platform#RDF%20Store)
• Virtuoso
Just as with RDBMS products, each triple store has its own strengths and weaknesses and may implement standards in
different ways. Most triple stores support SPARQL in some way, but a given implementation may not
support the full range of the latest SPARQL spec (1.1).
Subject

Predicate

Object


A reminder - Triples are made of Subject, Predicate, Object
Pride and Prejudice

has author

Jane Austen


A reminder - an English form of a triple (sort of)
Basic form of a SPARQL Query
SELECT *
WHERE {
  ?subject ?predicate ?object
}
LIMIT 10
https://gist.github.com/ostephens/7633222

The simplest SPARQL query you can write. It essentially means:
“Give me the first 10 triples (in the queried store)” - since no order is specified, it is up to the store which 10 it returns.
The terms ‘?subject’, ‘?predicate’, and ‘?object’ are just placeholders (known as variables) - you can use
whatever names you like for these as long as they start with a ‘?’. So this general query will more often be written:
SELECT *
WHERE {
  ?s ?p ?o
}
LIMIT 10
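To make the idea of variables concrete, here is a minimal Python sketch of how a single triple pattern could be matched against an in-memory list of triples. It is a toy model rather than how any real triple store works; the `ex:` identifiers, the sample data and the `select` function are all invented for illustration.

```python
# Toy in-memory "triple store": each triple is (subject, predicate, object).
# The data is made up for illustration; it is not BNB data.
TRIPLES = [
    ("ex:book1", "ex:hasAuthor", "ex:JaneAusten"),
    ("ex:book1", "ex:title", "Pride and Prejudice"),
    ("ex:book2", "ex:hasAuthor", "ex:JaneAusten"),
    ("ex:book2", "ex:title", "Emma"),
]


def is_variable(term):
    """A term is a variable if it starts with '?', as in SPARQL."""
    return term.startswith("?")


def select(pattern, triples, limit=None):
    """Return one {variable: value} binding per triple that matches the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for pattern_term, value in zip(pattern, triple):
            if is_variable(pattern_term):
                # The same variable used twice must bind to the same value.
                if binding.setdefault(pattern_term, value) != value:
                    break
            elif pattern_term != value:
                break
        else:
            # No term failed, so this triple matches the pattern.
            results.append(binding)
            if limit is not None and len(results) == limit:
                break
    return results


# The variable names are arbitrary: both calls return the same values.
all_triples = select(("?s", "?p", "?o"), TRIPLES, limit=10)
renamed = select(("?subject", "?predicate", "?object"), TRIPLES, limit=10)
for binding in all_triples:
    print(binding)
```

Because every term in the pattern is a variable, every triple matches, and only the limit restricts what comes back.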
Basic form of a SPARQL Query II
SELECT *
WHERE {
  ?s ?p ?o
}
LIMIT 10
https://gist.github.com/ostephens/7633423

As previous slide, but with single letter variables to show how this works
Pride and Prejudice

has author

Jane Austen


A reminder - an English form of a triple
Basic SPARQL Query using a literal I
SELECT *
WHERE {
  ?subject ?predicate "Austen, Jane"
}
LIMIT 10

https://gist.github.com/ostephens/7639241

SPARQL SELECT queries are essentially about giving a pattern to the triple store and getting back the triples that match
it.
So this query will find all triples that have the literal “Austen, Jane” as an object.
However, it returns no results against the BNB, because a literal in a pattern has to match the stored value exactly and the BNB does not hold the name in precisely this form.
Basic SPARQL Query using a literal II
SELECT *
WHERE {
  ?subject ?predicate "Austen, Jane, 1775-1817"
}
LIMIT 10
https://gist.github.com/ostephens/7639249

SPARQL SELECT queries are essentially about giving a pattern to the triple store and getting back the triples that match
it.
So this query will find all triples that have the literal “Austen, Jane, 1775-1817” as an object.
This finds results, but note that it does not find books authored by Jane Austen. This is due to the data structure used by
the BNB, which assigns a URI to the person and then attaches the authorised form of the name as an RDFS label on that URI.
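The exact-match behaviour of literals can be sketched in a few lines of Python. This is a toy model under stated assumptions: the single triple below is hand-written to mirror the structure described in these notes (a person URI carrying the authorised name as an RDFS label), and the `match` helper is invented for illustration, with `None` standing in for a variable.

```python
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
AUSTEN = "http://bnb.data.bl.uk/id/person/AustenJane1775-1817"

# One hand-written triple shaped like the data described above:
# the person URI carries the authorised form of the name as a label.
TRIPLES = [(AUSTEN, LABEL, "Austen, Jane, 1775-1817")]


def match(pattern, triples):
    """Return the triples that fit the pattern; None stands for a variable."""
    return [
        triple
        for triple in triples
        if all(term is None or term == value for term, value in zip(pattern, triple))
    ]


# A literal must be identical to the stored value, character for character.
short_form = match((None, None, "Austen, Jane"), TRIPLES)
full_form = match((None, None, "Austen, Jane, 1775-1817"), TRIPLES)

print(len(short_form))  # no match: the stored literal also includes the dates
print(full_form[0][0])  # the subject is the person URI, not a book
```

In this sketch the full literal does match, but the subject that comes back is the URI for the person, which is why the query on the slide does not return books.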
Subject

Predicate

http://bnb.data.bl.uk/id/person/AustenJane1775-1817

Austen, Jane, 1775-1817


Generally, statements in the BNB about Jane Austen will use the URI for Jane Austen rather than a string. The
BNB URI for Jane Austen, the person who wrote Pride and Prejudice, is <http://bnb.data.bl.uk/id/person/AustenJane1775-1817>,
so we need to use this in our query.
Basic SPARQL Query using a URI I
SELECT *
WHERE {
  ?subject ?predicate <http://bnb.data.bl.uk/id/person/AustenJane1775-1817>
}
LIMIT 10
https://gist.github.com/ostephens/7633236

SPARQL select queries are essentially about giving patterns to the triple store and getting back triples that match
this pattern.
So this query will find all triples that have the URI “http://bnb.data.bl.uk/id/person/AustenJane1775-1817” as an
object. This URI represents Jane Austen, so this query will find any statements where Jane Austen is the object.
A typical statement of this type might be (in sort-of English rather than RDF):
Pride and Prejudice - has the author - Jane Austen
So - this would find all the books authored by Jane Austen. However, since we haven’t specified the type of
relationship to Jane Austen it would also find any other triples where the URI for Jane Austen was the object
Book

dc:creator

http://bnb.data.bl.uk/id/person/AustenJane1775-1817

Austen, Jane, 1775-1817


If we want to be more specific and retrieve books (or strictly, URIs which identify books) where Jane Austen is
the dc:creator we need to specify dc:creator as the predicate in the query pattern
Basic SPARQL Query using a URI II
SELECT *
WHERE {
  ?book <http://purl.org/dc/terms/creator> <http://bnb.data.bl.uk/id/person/AustenJane1775-1817>
}
LIMIT 10
https://gist.github.com/ostephens/7633258

This is a stricter pattern - it insists not only that <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> is the
object, but also that the predicate is <http://purl.org/dc/terms/creator>.
This finds all books where Jane Austen is the ‘dc:creator’, which I think in terms of the BNB data essentially
means that “Austen, Jane” was the entry in the MARC 100 field.
It does not include books where Jane Austen is ‘dc:contributor’ (the MARC 700 field).
Perhaps more accurately: this finds the URIs for books where Jane Austen is the dc:creator - don’t confuse the
URIs with the books themselves. What you get in response to this query is a list of URIs, not a list of titles.
Also note that it uses the slightly more readable ‘?book’ as the variable instead of the generic ‘?subject’.
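The difference between fixing only the object and fixing both predicate and object can be sketched in Python as follows. This is an illustrative toy, not BNB data: the `example.org` URIs, the `mentions` predicate and the `match` helper are all invented, and `None` stands in for a variable.

```python
CREATOR = "http://purl.org/dc/terms/creator"
AUSTEN = "http://bnb.data.bl.uk/id/person/AustenJane1775-1817"

# Illustrative triples only: the example.org URIs and the 'mentions'
# predicate are invented for this sketch.
TRIPLES = [
    ("http://example.org/book/1", CREATOR, AUSTEN),
    ("http://example.org/book/2", CREATOR, AUSTEN),
    ("http://example.org/essay/9", "http://example.org/mentions", AUSTEN),
]


def match(pattern, triples):
    """Return the triples that fit the pattern; None stands for a variable."""
    return [
        triple
        for triple in triples
        if all(term is None or term == value for term, value in zip(pattern, triple))
    ]


# Object fixed, predicate left open: every kind of relationship to the URI.
any_relationship = match((None, None, AUSTEN), TRIPLES)

# Predicate and object both fixed: only the 'creator' statements remain.
creator_only = match((None, CREATOR, AUSTEN), TRIPLES)

# What comes back are identifiers for the books, not their titles.
book_uris = [subject for subject, _, _ in creator_only]
print(book_uris)
```

The looser pattern also picks up the invented essay, while the stricter one returns only the two book URIs.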
PREFIX
PREFIX dct: <http://purl.org/dc/terms/>

SELECT *
WHERE {
  ?subject dct:creator <http://bnb.data.bl.uk/id/person/AustenJane1775-1817>
}
LIMIT 10
https://gist.github.com/ostephens/7633289

The PREFIX command is just a timesaver - it essentially lets you substitute part of a URI with an abbreviation.
A PREFIX is defined only within the query. Some prefixes are used by convention, but you can use whatever you want as long as
you are consistent:
PREFIX banana: <http://purl.org/dc/terms/>
works just as effectively as
PREFIX dct: <http://purl.org/dc/terms/>
but the latter is much more common! (See https://gist.github.com/ostephens/7633297 for use of ‘banana’ as a
PREFIX.)
Common PREFIX statements are given at https://gist.github.com/ostephens/7633203
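One way to picture what a PREFIX does is as plain text substitution before the query is evaluated. The Python sketch below assumes that simplified model; the `expand` function is invented for illustration and ignores details of the real SPARQL grammar.

```python
DCT = "http://purl.org/dc/terms/"


def expand(term, prefixes):
    """Expand 'prefix:local' to a full URI; leave any other term unchanged."""
    prefix, separator, local = term.partition(":")
    if separator and prefix in prefixes:
        return prefixes[prefix] + local
    return term


# The abbreviation itself carries no meaning: only the URI it stands for does.
with_dct = expand("dct:creator", {"dct": DCT})
with_banana = expand("banana:creator", {"banana": DCT})

print(with_dct)
print(with_dct == with_banana)
```

Both abbreviations expand to the same full URI, which is why the ‘banana’ version of the query behaves identically.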
Basic SPARQL Query III
PREFIX dct: <http://purl.org/dc/terms/>

SELECT *
WHERE {
  <http://bnb.data.bl.uk/id/resource/009724890> ?predicate ?object
}
LIMIT 10
https://gist.github.com/ostephens/7633258

Pattern for finding statements about a particular book (in this case an edition of Pride and Prejudice)
Query with 2 patterns
PREFIX dct: <http://purl.org/dc/terms/>

SELECT ?title
WHERE {
  ?book dct:creator <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> .
  ?book dct:title ?title
}
LIMIT 10
https://gist.github.com/ostephens/7633608

This query joins together the two patterns we’ve seen in the previous slides.
First it finds URIs where Jane Austen is dct:creator.
Then for each of those URIs it finds what the dct:title is.
That is - the second pattern applies to the URIs found by the first pattern.
This will get us a title list of books authored by Jane Austen.
Note that the two patterns are joined with a ‘period’ - ‘.’
Note also that here the SELECT clause asks only for ?title rather than ‘*’ - what do you think you’d get if you used ‘*’ here?
Try amending this to find titles of books where Jane Austen is a ‘contributor’ - a much more interesting title list!
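The way a shared variable joins two patterns can be sketched in Python. This is a simplified model, not how a real triple store evaluates queries: the `example.org` book and person URIs and both helper functions are invented for illustration, and only the Jane Austen URI and the two predicates come from the slides.

```python
CREATOR = "http://purl.org/dc/terms/creator"
TITLE = "http://purl.org/dc/terms/title"
AUSTEN = "http://bnb.data.bl.uk/id/person/AustenJane1775-1817"

# Illustrative data: the example.org URIs are invented for this sketch.
TRIPLES = [
    ("http://example.org/book/1", CREATOR, AUSTEN),
    ("http://example.org/book/1", TITLE, "Pride and Prejudice"),
    ("http://example.org/book/2", CREATOR, AUSTEN),
    ("http://example.org/book/2", TITLE, "Emma"),
    ("http://example.org/book/3", CREATOR, "http://example.org/person/other"),
    ("http://example.org/book/3", TITLE, "Jane Eyre"),
]


def match_pattern(pattern, triple, binding):
    """Try to extend a binding so the pattern fits the triple; None on failure."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable already bound by an earlier pattern must keep its value.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def select(patterns, triples):
    """Evaluate patterns left to right, carrying bindings from one to the next."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


results = select(
    [("?book", CREATOR, AUSTEN), ("?book", TITLE, "?title")],
    TRIPLES,
)
# Picking out one variable plays the role of 'SELECT ?title'.
titles = [binding["?title"] for binding in results]
print(titles)
```

Because `?book` appears in both patterns, the second pattern is only allowed to match triples about the books found by the first, so the third book's title is left out.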
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?title
WHERE {
  ?person owl:sameAs <http://viaf.org/viaf/102333412> .
  ?book dct:creator ?person .
  ?book dct:title ?title
}
LIMIT 10
https://gist.github.com/ostephens/7633907

A 3 pattern query
Note use of ‘Distinct’ to avoid duplicates in results
This finds titles of books authored by the person identified by the VIAF URI http://viaf.org/viaf/102333412 - which
happens to be ... Jane Austen. Starts to show how sharing identifiers can be very powerful
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX bio: <http://purl.org/vocab/bio/0.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?name
WHERE {
  ?person owl:sameAs <http://viaf.org/viaf/102333412> .
  ?person bio:event ?event .
  ?event rdf:type bio:Birth .
  ?event bio:date ?dob .
  ?e2 bio:date ?dob .
  ?e2 rdf:type bio:Birth .
  ?p2 bio:event ?e2 .
  ?p2 foaf:name ?name
}
LIMIT 1000

https://gist.github.com/ostephens/7633750

Complex select - finds the names of all the people in BNB with the same birth year as the person identified by
http://viaf.org/viaf/102333412 (Jane Austen)
Example application using SPARQL


Built using the SPARQL query on the previous slide.
Possible because Wikipedia uses VIAF URIs to identify people...
See http://www.meanboyfriend.com/overdue_ideas/2013/04/contemporaneous-part-one/
Write your own SPARQL
• BNB has a SPARQL endpoint and online SPARQL editor + sample queries
• http://bnb.data.bl.uk/flint


Many SPARQL endpoints provide a web based SPARQL editor - here are a few:
BNB = http://bnb.data.bl.uk/flint
British Museum = http://collection.britishmuseum.org/sparql
Cambridge University Library (beta) = http://data.lib.cam.ac.uk/endpoint.php
Europeana = http://europeana.ontotext.com/sparql
...
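Outside a web-based editor, a query can also be sent to an endpoint from a script. The Python sketch below shows the general shape under the W3C SPARQL Protocol, where the query travels in a `query` parameter, and the W3C SPARQL JSON results format. The endpoint URL is a placeholder, not a real service, and the sample response is hand-written rather than real BNB output; nothing is sent over the network here.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; check each service's own
# documentation for its real SPARQL endpoint URL.
ENDPOINT = "http://example.org/sparql"

QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"


def build_request_url(endpoint, query):
    """Build a SPARQL Protocol GET URL: the query goes in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def rows_from_json(text):
    """Turn a SPARQL JSON results document into a list of {variable: value} dicts."""
    document = json.loads(text)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in document["results"]["bindings"]
    ]


# A hand-written response in the SPARQL JSON results format (not real output).
SAMPLE_RESPONSE = """
{"head": {"vars": ["book"]},
 "results": {"bindings": [
   {"book": {"type": "uri",
             "value": "http://bnb.data.bl.uk/id/resource/009724890"}}
 ]}}
"""

url = build_request_url(ENDPOINT, QUERY)
rows = rows_from_json(SAMPLE_RESPONSE)
print(url)
print(rows)
```

To actually run a query, the URL would be fetched with an `Accept: application/sparql-results+json` header, for example via `urllib.request`; whether a given endpoint supports JSON results is something to confirm against its documentation.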
Important things I haven’t mentioned
• Using ‘filters’ in queries
• Data types
• Language tags
• much more...

See next slide for some further reading!
Further reading
• Learning SPARQL by Bob DuCharme (http://www.learningsparql.com/bio.html)
• http://www.meanboyfriend.com/overdue_ideas/2011/12/experimenting-with-british-museum-data/
• http://www.meanboyfriend.com/overdue_ideas/2013/04/contemporaneous-part-one/
• Tutorial from Cambridge University Library Linked Data project: http://data.lib.cam.ac.uk/sparql.php

Monday, 25 November 13

More Related Content

What's hot

الوصول الحر
الوصول الحرالوصول الحر
الوصول الحر
Maha Ramadan
 
How To Become A Cloud Engineer | Cloud Engineer Salary | Cloud Computing Engi...
How To Become A Cloud Engineer | Cloud Engineer Salary | Cloud Computing Engi...How To Become A Cloud Engineer | Cloud Engineer Salary | Cloud Computing Engi...
How To Become A Cloud Engineer | Cloud Engineer Salary | Cloud Computing Engi...
Simplilearn
 

What's hot (20)

Cloud computing الحوسبة السحابية
Cloud computing الحوسبة السحابية Cloud computing الحوسبة السحابية
Cloud computing الحوسبة السحابية
 
Recap: Windows Server 2019 Failover Clustering
Recap: Windows Server 2019 Failover ClusteringRecap: Windows Server 2019 Failover Clustering
Recap: Windows Server 2019 Failover Clustering
 
الميتاداتا و المصادر الرقمية
الميتاداتا و المصادر الرقميةالميتاداتا و المصادر الرقمية
الميتاداتا و المصادر الرقمية
 
الحوسبة في المكتبات
الحوسبة في المكتباتالحوسبة في المكتبات
الحوسبة في المكتبات
 
الوصول الحر الى المعلومات او النفاذ الحر من خلال التعرف على المراجع والمصادر ...
الوصول الحر الى المعلومات او النفاذ الحر من خلال التعرف على المراجع والمصادر ...الوصول الحر الى المعلومات او النفاذ الحر من خلال التعرف على المراجع والمصادر ...
الوصول الحر الى المعلومات او النفاذ الحر من خلال التعرف على المراجع والمصادر ...
 
微服務對IT人員的衝擊
微服務對IT人員的衝擊微服務對IT人員的衝擊
微服務對IT人員的衝擊
 
Azure Infrastructure as Code 体験入隊
Azure Infrastructure as Code 体験入隊Azure Infrastructure as Code 体験入隊
Azure Infrastructure as Code 体験入隊
 
الفهرسة المقروءة آلياً Marc
الفهرسة المقروءة آلياً Marcالفهرسة المقروءة آلياً Marc
الفهرسة المقروءة آلياً Marc
 
خطوات بناء المكانز
خطوات بناء المكانزخطوات بناء المكانز
خطوات بناء المكانز
 
マイクロサービスのセキュリティ概説
マイクロサービスのセキュリティ概説マイクロサービスのセキュリティ概説
マイクロサービスのセキュリティ概説
 
الوصول الحر
الوصول الحرالوصول الحر
الوصول الحر
 
Azure Key Vault
Azure Key VaultAzure Key Vault
Azure Key Vault
 
آليات التكشيف على الويب وأدواته
آليات التكشيف على الويب وأدواتهآليات التكشيف على الويب وأدواته
آليات التكشيف على الويب وأدواته
 
How To Become A Cloud Engineer | Cloud Engineer Salary | Cloud Computing Engi...
How To Become A Cloud Engineer | Cloud Engineer Salary | Cloud Computing Engi...How To Become A Cloud Engineer | Cloud Engineer Salary | Cloud Computing Engi...
How To Become A Cloud Engineer | Cloud Engineer Salary | Cloud Computing Engi...
 
IT エンジニアのための 流し読み Windows 10 - Windows 10 サブスクリプションのライセンス認証
IT エンジニアのための 流し読み Windows 10 - Windows 10 サブスクリプションのライセンス認証IT エンジニアのための 流し読み Windows 10 - Windows 10 サブスクリプションのライセンス認証
IT エンジニアのための 流し読み Windows 10 - Windows 10 サブスクリプションのライセンス認証
 
Microsoft 365 で両立するセキュリティと働き方改革
Microsoft 365 で両立するセキュリティと働き方改革Microsoft 365 で両立するセキュリティと働き方改革
Microsoft 365 で両立するセキュリティと働き方改革
 
OWASP WordPressセキュリティ実装ガイドライン (セキュアなWordPressの構築)
OWASP WordPressセキュリティ実装ガイドライン (セキュアなWordPressの構築)OWASP WordPressセキュリティ実装ガイドライン (セキュアなWordPressの構築)
OWASP WordPressセキュリティ実装ガイドライン (セキュアなWordPressの構築)
 
Microsoft Azure Storage Overview | Microsoft Azure Training | Microsoft Azure...
Microsoft Azure Storage Overview | Microsoft Azure Training | Microsoft Azure...Microsoft Azure Storage Overview | Microsoft Azure Training | Microsoft Azure...
Microsoft Azure Storage Overview | Microsoft Azure Training | Microsoft Azure...
 
Semantic web
Semantic web Semantic web
Semantic web
 
مشروعات رقمنة مصادر المعلومات دراسة لتجارب المكتبات الوطنية الفرانكوفونية
مشروعات رقمنة مصادر المعلومات دراسة لتجارب المكتبات الوطنية الفرانكوفونيةمشروعات رقمنة مصادر المعلومات دراسة لتجارب المكتبات الوطنية الفرانكوفونية
مشروعات رقمنة مصادر المعلومات دراسة لتجارب المكتبات الوطنية الفرانكوفونية
 

Viewers also liked

Rdf And Rdf Schema For Ontology Specification
Rdf And Rdf Schema For Ontology SpecificationRdf And Rdf Schema For Ontology Specification
Rdf And Rdf Schema For Ontology Specification
chenjennan
 
Understanding Mobile Phone Requirements
Understanding Mobile Phone RequirementsUnderstanding Mobile Phone Requirements
Understanding Mobile Phone Requirements
chenjennan
 
Jena – A Semantic Web Framework for Java
Jena – A Semantic Web Framework for JavaJena – A Semantic Web Framework for Java
Jena – A Semantic Web Framework for Java
Aleksander Pohl
 

Viewers also liked (9)

SPARQL Cheat Sheet
SPARQL Cheat SheetSPARQL Cheat Sheet
SPARQL Cheat Sheet
 
Rdf And Rdf Schema For Ontology Specification
Rdf And Rdf Schema For Ontology SpecificationRdf And Rdf Schema For Ontology Specification
Rdf And Rdf Schema For Ontology Specification
 
Understanding Mobile Phone Requirements
Understanding Mobile Phone RequirementsUnderstanding Mobile Phone Requirements
Understanding Mobile Phone Requirements
 
Java and SPARQL
Java and SPARQLJava and SPARQL
Java and SPARQL
 
Jena – A Semantic Web Framework for Java
Jena – A Semantic Web Framework for JavaJena – A Semantic Web Framework for Java
Jena – A Semantic Web Framework for Java
 
Java and OWL
Java and OWLJava and OWL
Java and OWL
 
Jena Programming
Jena ProgrammingJena Programming
Jena Programming
 
SPARQL Tutorial
SPARQL TutorialSPARQL Tutorial
SPARQL Tutorial
 
Linked Open Data Principles, Technologies and Examples
Linked Open Data Principles, Technologies and ExamplesLinked Open Data Principles, Technologies and Examples
Linked Open Data Principles, Technologies and Examples
 

Similar to Selecting with SPARQL

It's not rocket surgery - Linked In: ALA 2011
It's not rocket surgery - Linked In: ALA 2011It's not rocket surgery - Linked In: ALA 2011
It's not rocket surgery - Linked In: ALA 2011
Ross Singer
 
Karen Coyle: New Models of Matadata
Karen Coyle: New Models of MatadataKaren Coyle: New Models of Matadata
Karen Coyle: New Models of Matadata
ALATechSource
 
Mon norton tut_publishing01
Mon norton tut_publishing01Mon norton tut_publishing01
Mon norton tut_publishing01
eswcsummerschool
 
¿ARCHIVO?
¿ARCHIVO?¿ARCHIVO?
¿ARCHIVO?
ESPOL
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataAn introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked Data
Gabriela Agustini
 
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
Fabien Gandon
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic  Web and Linked DataAn introduction to Semantic  Web and Linked Data
An introduction to Semantic Web and Linked Data
Gabriela Agustini
 

Similar to Selecting with SPARQL (20)

It's not rocket surgery - Linked In: ALA 2011
It's not rocket surgery - Linked In: ALA 2011It's not rocket surgery - Linked In: ALA 2011
It's not rocket surgery - Linked In: ALA 2011
 
NoSQL and Triple Stores
NoSQL and Triple StoresNoSQL and Triple Stores
NoSQL and Triple Stores
 
Linked Open Data for Libraries
Linked Open Data for LibrariesLinked Open Data for Libraries
Linked Open Data for Libraries
 
Publishing and Using Linked Data
Publishing and Using Linked DataPublishing and Using Linked Data
Publishing and Using Linked Data
 
Karen Coyle: New Models of Matadata
Karen Coyle: New Models of MatadataKaren Coyle: New Models of Matadata
Karen Coyle: New Models of Matadata
 
Mon norton tut_publishing01
Mon norton tut_publishing01Mon norton tut_publishing01
Mon norton tut_publishing01
 
TMS for Researchers
TMS for ResearchersTMS for Researchers
TMS for Researchers
 
Notes from the Library Juice Academy courses on “SPARQL Fundamentals”: Univer...
Notes from the Library Juice Academy courses on “SPARQL Fundamentals”: Univer...Notes from the Library Juice Academy courses on “SPARQL Fundamentals”: Univer...
Notes from the Library Juice Academy courses on “SPARQL Fundamentals”: Univer...
 
Uplift – Generating RDF datasets from non-RDF data with R2RML
Uplift – Generating RDF datasets from non-RDF data with R2RMLUplift – Generating RDF datasets from non-RDF data with R2RML
Uplift – Generating RDF datasets from non-RDF data with R2RML
 
RDF Data Model
RDF Data ModelRDF Data Model
RDF Data Model
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorial
 
Transformational Tricks for RDF.pptx
Transformational Tricks for RDF.pptxTransformational Tricks for RDF.pptx
Transformational Tricks for RDF.pptx
 
SPARQL
SPARQLSPARQL
SPARQL
 
Archives & the Semantic Web
Archives & the Semantic WebArchives & the Semantic Web
Archives & the Semantic Web
 
¿ARCHIVO?
¿ARCHIVO?¿ARCHIVO?
¿ARCHIVO?
 
que hisciste el verano pasado
que hisciste el verano pasadoque hisciste el verano pasado
que hisciste el verano pasado
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataAn introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked Data
 
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic  Web and Linked DataAn introduction to Semantic  Web and Linked Data
An introduction to Semantic Web and Linked Data
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearch
 

More from ostephens

References on the web
References on the webReferences on the web
References on the web
ostephens
 
Project Management Tools
Project Management ToolsProject Management Tools
Project Management Tools
ostephens
 

More from ostephens (15)

A Chrismash Carol
A Chrismash CarolA Chrismash Carol
A Chrismash Carol
 
Open, Linked, Hacked
Open, Linked, HackedOpen, Linked, Hacked
Open, Linked, Hacked
 
Read 'em and weep
Read 'em and weepRead 'em and weep
Read 'em and weep
 
Open for Reuse: Library data and mashups
Open for Reuse: Library data and mashupsOpen for Reuse: Library data and mashups
Open for Reuse: Library data and mashups
 
Where are you from? and other stupid questions
Where are you from? and other stupid questionsWhere are you from? and other stupid questions
Where are you from? and other stupid questions
 
Linking lcsh and other stuff
Linking lcsh and other stuffLinking lcsh and other stuff
Linking lcsh and other stuff
 
Mashing libraries to build communities - CILIPS 2011
Mashing libraries to build communities - CILIPS 2011Mashing libraries to build communities - CILIPS 2011
Mashing libraries to build communities - CILIPS 2011
 
Lucero Library Update 03/11/10
Lucero Library Update 03/11/10Lucero Library Update 03/11/10
Lucero Library Update 03/11/10
 
Mashing libraries to build communities
Mashing libraries to build communitiesMashing libraries to build communities
Mashing libraries to build communities
 
References on the web
References on the webReferences on the web
References on the web
 
Project Management Tools
Project Management ToolsProject Management Tools
Project Management Tools
 
TELSTAR
TELSTARTELSTAR
TELSTAR
 
The Semantic Web
The Semantic WebThe Semantic Web
The Semantic Web
 
Resource Discovery Infrastructure - what if we were starting from scratch?
Resource Discovery Infrastructure - what if we were starting from scratch?Resource Discovery Infrastructure - what if we were starting from scratch?
Resource Discovery Infrastructure - what if we were starting from scratch?
 
Digital Future
Digital FutureDigital Future
Digital Future
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Recently uploaded (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Selecting with SPARQL

  • 1. Selecting with SPARQL Searching Linked Data with Sparql using the British National Bibliography data Owen Stephens, November 2013 Monday, 25 November 13 These slides introduce SPARQL, the ‘SELECT’ query in SPARQL, and show how you can use relatively straightforward SELECT queries on the British Library’s BNB SPARQL endpoint
  • 2. SPARQL • Pronounced “Sparkle” • Recursive acronym for “SPARQL Protocol and RDF Query Language” • A W3C Standard • A way of querying RDF in a triple store Monday, 25 November 13 One of those recursive acronyms beloved of some of the IT community (e.g. see also GNU = Gnu’s Not Linux) Basically a standard way of querying RDF stored in a triple store. A triple store is a database designed specifically to store RDF triples - the equivalent of a relational database management system (RDBMS) that are widely used for library systems. Just as RDBMS where there are different products (proprietary and open source) such as Oracle, MySQL, SQL Server etc. there are different triple stores. Popular ones include: OWLIM Jena 4Store and 5Store (currently - November 2013 - the British Library BNB runs on 5Store hosted by TSO http:// openup.tso.co.uk/openup-platform#RDF%20Store) Virtuoso Just as with RDBMS systems, each product has it’s own strengths/weaknesses and may implement standards in different ways etc. Most triple stores will support SPARQL in some way, but different implementations may not support the full range of the latest SPARQL spec (1.1)
  • 3. Subject Predicate Object Monday, 25 November 13 A reminder - Triples are made of Subject, Predicate, Object
  • 4. Pride and Prejudice has author Jane Austen Monday, 25 November 13 A reminder - an English form of a triple (sort of)
  • 5. Basic form of a SPARQL Query SELECT * WHERE { ?subject ?predicate ?object } LIMIT 10 https://gist.github.com/ostephens/7633222 Monday, 25 November 13 The simplest SPARQL query you can do: Essentially means: “Give me the first 10 triples (in the queried store)” - since no order is specified it is up to the store how it does this The terms ‘?subject’, ‘?predicate’, and ‘?object’ are just placeholders (known as variables) - you can use whatever names you like for these as long as they start with a ‘?’. So this general query will more often be written: SELECT * WHERE { ! ?s ?p ?o } LIMIT 10
  • 6. Basic form of a SPARQL Query II SELECT * WHERE { ?s ?p ?o } LIMIT 10 https://gist.github.com/ostephens/7633423 Monday, 25 November 13 As previous slide, but with single letter variables to show how this works
  • 7. Pride and Prejudice has author Jane Austen Monday, 25 November 13 A reminder - an English form of a triple
  • 8. Basic SPARQL Query using a literal I SELECT * WHERE { ?subject ?predicate "Austen, Jane" } LIMIT 10 https://gist.github.com/ostephens/7639241 Monday, 25 November 13 SPARQL select queries are essentially about giving patterns to the triple store and getting back triples that match this pattern. So this query will find all triples that have the literal “Austen, Jane” as an object. However, returns no results against the BNB due to the literal not being a precise match
  • 9. Basic SPARQL Query using a literal II SELECT * WHERE { ?subject ?predicate "Austen, Jane, 1775-1817" } LIMIT 10 https://gist.github.com/ostephens/7639249 Monday, 25 November 13 SPARQL select queries are essentially about giving patterns to the triple store and getting back triples that match this pattern. So this query will find all triples that have the literal “Austen, Jane, 1775-1817” as an object. This finds results, but note this does not find books authored by Jane Austen - due to the data structure used by BNB - which assigns a URI, and then uses the authorised form of the name as an RDFS label on a URI representing
  • 10. Subject Predicate http://bnb.data.bl.uk/id/person/ AustenJane1775-1817 Austen, Jane, 1775-1817 Monday, 25 November 13 Generally statements in the BNB about Jane Austen will use the URI for Jane Austen, rather than a string. The BNB for Jane Austen the person who wrote Pride and Prejudice is <http://bnb.data.bl.uk/id/person/ AustenJane1775-1817> so we would need to use this in our query
  • 11. Basic SPARQL Query using a URI I: SELECT * WHERE { ?subject ?predicate <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> } LIMIT 10 (https://gist.github.com/ostephens/7633236)
    SPARQL SELECT queries are essentially about giving patterns to the triple store and getting back triples that match the pattern. So this query asks for triples that have the URI <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> as the object. This URI represents Jane Austen, so the query finds statements where Jane Austen is the object. A typical statement of this type might be (in sort-of English rather than RDF): Pride and Prejudice - has the author - Jane Austen. So this would find the books authored by Jane Austen. However, since we haven’t specified the type of relationship to Jane Austen, it would also find any other triples where the URI for Jane Austen is the object.
  • 12. Book - dc:creator - <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> (Austen, Jane, 1775-1817)
    If we want to be more specific and retrieve books (or strictly, URIs which identify books) where Jane Austen is the dc:creator, we need to specify dc:creator as the predicate in the query pattern.
  • 13. Basic SPARQL Query using a URI II: SELECT * WHERE { ?book <http://purl.org/dc/terms/creator> <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> } LIMIT 10 (https://gist.github.com/ostephens/7633258)
    This is a stricter pattern: it insists not only that <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> is the object, but also that the predicate is <http://purl.org/dc/terms/creator>. This finds all books where Jane Austen is the ‘dc:creator’, which I think in terms of the BNB data essentially means that Austen, Jane was the entry in the MARC 100 field. It does not include books where Jane Austen is ‘dc:contributor’ (in the MARC 700 field). Perhaps more accurately: this finds the URIs for books where Jane Austen is the dc:creator. Don’t confuse the URIs with the books themselves: what you get in response to this query is a list of URIs, not a list of titles. Also note that this uses the slightly more readable ‘?book’ as the variable instead of the generic ‘?subject’.
  • 14. PREFIX: PREFIX dct: <http://purl.org/dc/terms/> SELECT * WHERE { ?subject dct:creator <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> } LIMIT 10 (https://gist.github.com/ostephens/7633289)
    The PREFIX declaration is just a timesaver: it lets you substitute an abbreviation for the leading part of a URI. A PREFIX is defined only within the query. There are some common prefixes that are widely used, but as long as you are consistent you can use whatever you want: PREFIX banana: <http://purl.org/dc/terms/> works just as effectively as PREFIX dct: <http://purl.org/dc/terms/>, but the latter is much more common! (See https://gist.github.com/ostephens/7633297 for use of ‘banana’ as a PREFIX.) Common PREFIX statements are given at https://gist.github.com/ostephens/7633203
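The substitution that PREFIX performs can be sketched in a few lines of Python. This is only an illustration of the idea, not of any SPARQL library: the function name `expand` is invented here, and it handles nothing beyond a simple ‘prefix:local’ name.

```python
def expand(prefixed_name, prefixes):
    """Expand 'prefix:local' into a full URI using the declared prefixes."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local


# The conventional abbreviation and an arbitrary one give the same URI.
dct = expand("dct:creator", {"dct": "http://purl.org/dc/terms/"})
banana = expand("banana:creator", {"banana": "http://purl.org/dc/terms/"})
```

Both values are the full URI http://purl.org/dc/terms/creator, which is why the choice of prefix name makes no difference to what the query matches.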
  • 15. Basic SPARQL Query III: PREFIX dct: <http://purl.org/dc/terms/> SELECT * WHERE { <http://bnb.data.bl.uk/id/resource/009724890> ?predicate ?object } LIMIT 10 (https://gist.github.com/ostephens/7633258)
    A pattern for finding statements about a particular book (in this case an edition of Pride and Prejudice). Here the subject is fixed and the predicate and object are variables.
  • 16. Query with 2 patterns: PREFIX dct: <http://purl.org/dc/terms/> SELECT ?title WHERE { ?book dct:creator <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> . ?book dct:title ?title } LIMIT 10 (https://gist.github.com/ostephens/7633608)
    This query joins together the two patterns we’ve seen in the previous slides. The first finds URIs where Jane Austen is the dct:creator. Then, for each of those URIs, the second finds what the dct:title is. That is, the second pattern applies to the URIs found by the first pattern, because both use the same variable ‘?book’. This gets us a title list of books authored by Jane Austen. Note that the two patterns are joined with a ‘period’ (‘.’). Note also that here the SELECT clause asks only for ?title rather than ‘*’. What do you think you’d get if you used ‘*’ here? Try amending this to find titles of books where Jane Austen is a ‘contributor’: a much more interesting title list!
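The way a shared variable joins two patterns can be sketched in Python as follows. This is a toy model only: the book URIs under example.org, the second person URI, and the function names `match` and `query` are invented for illustration, and a real triple store evaluates joins far more efficiently.

```python
DCT = "http://purl.org/dc/terms/"
AUSTEN = "http://bnb.data.bl.uk/id/person/AustenJane1775-1817"

# Illustrative data: every URI under example.org is made up.
TRIPLES = [
    ("http://example.org/book/1", DCT + "creator", AUSTEN),
    ("http://example.org/book/1", DCT + "title", "Pride and Prejudice"),
    ("http://example.org/book/2", DCT + "creator", "http://example.org/person/other"),
    ("http://example.org/book/2", DCT + "title", "Some other book"),
]


def match(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple, or return None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None  # variable is already bound to something else
        elif term != value:
            return None  # fixed term does not match
    return new


def query(patterns, triples):
    """Join the patterns: each one is applied to the solutions found so far."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


solutions = query(
    [("?book", DCT + "creator", AUSTEN), ("?book", DCT + "title", "?title")],
    TRIPLES,
)
titles = [s["?title"] for s in solutions]
```

Because ‘?book’ is already bound when the second pattern is tried, only the title of the book whose creator is the Austen URI survives. Each solution holds both ‘?book’ and ‘?title’, which hints at the answer to the ‘*’ question above.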
  • 17. PREFIX dct: <http://purl.org/dc/terms/> PREFIX owl: <http://www.w3.org/2002/07/owl#> SELECT DISTINCT ?title WHERE { ?person owl:sameAs <http://viaf.org/viaf/102333412> . ?book dct:creator ?person . ?book dct:title ?title } LIMIT 10 (https://gist.github.com/ostephens/7633907)
    A 3 pattern query. Note the use of ‘DISTINCT’ to avoid duplicates in the results. This finds titles of books authored by the person identified by the VIAF URI http://viaf.org/viaf/102333412, which happens to be ... Jane Austen. It starts to show how sharing identifiers can be very powerful.
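As a rough illustration of what DISTINCT does to a result set, the Python sketch below removes repeated rows. The rows are invented sample data (two editions sharing one title), and `distinct` is a hypothetical helper. SPARQL itself makes no promise about the order of results unless the query asks for one; keeping first occurrences in order here is just a convenience of the sketch.

```python
# Illustrative result rows: two editions sharing one title, plus another book.
rows = [
    {"?title": "Pride and prejudice"},
    {"?title": "Emma"},
    {"?title": "Pride and prejudice"},
]


def distinct(rows):
    """Drop repeated rows, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable form of the whole row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


unique_rows = distinct(rows)
```

Rows are compared as a whole, so two rows count as duplicates only when every selected variable has the same value in both.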
  • 18. PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX dct: <http://purl.org/dc/terms/> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX bio: <http://purl.org/vocab/bio/0.1/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT DISTINCT ?name WHERE { ?person owl:sameAs <http://viaf.org/viaf/102333412> . ?person bio:event ?event . ?event rdf:type bio:Birth . ?event bio:date ?dob . ?e2 bio:date ?dob . ?e2 rdf:type bio:Birth . ?p2 bio:event ?e2 . ?p2 foaf:name ?name } LIMIT 1000 (https://gist.github.com/ostephens/7633750)
    A complex SELECT: it finds the names of all the people in BNB with the same birth year as the person identified by http://viaf.org/viaf/102333412 (Jane Austen). The shared variable ‘?dob’ is what links the first person’s birth event to the birth events of the other people.
  • 19. Example application using SPARQL
    Built using the SPARQL query from the previous slide. Possible because Wikipedia uses VIAF URIs to identify people... See http://www.meanboyfriend.com/overdue_ideas/2013/04/contemporaneous-part-one/
  • 20. Write your own SPARQL • BNB has a SPARQL endpoint and online SPARQL editor + sample queries • http://bnb.data.bl.uk/flint
    Many SPARQL endpoints provide a web-based SPARQL editor. Here are a few: BNB = http://bnb.data.bl.uk/flint; British Museum = http://collection.britishmuseum.org/sparql; Cambridge University Library (beta) = http://data.lib.cam.ac.uk/endpoint.php; Europeana = http://europeana.ontotext.com/sparql ...
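Outside a web editor, a query is sent to an endpoint over HTTP. The SPARQL Protocol allows the query text to be passed in a `query` parameter on a GET request. The Python sketch below only builds such a URL and does not send it. The endpoint address is an assumption made for illustration (the slide gives only the editor address), and `build_request_url` is a made-up helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# ASSUMPTION: this endpoint address is a guess for illustration; the slide only
# gives the address of the web editor (http://bnb.data.bl.uk/flint).
ENDPOINT = "http://bnb.data.bl.uk/sparql"

QUERY = """PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?title WHERE {
  ?book dct:creator <http://bnb.data.bl.uk/id/person/AustenJane1775-1817> .
  ?book dct:title ?title
} LIMIT 10"""


def build_request_url(endpoint, query):
    """Build a GET URL with the URL-encoded query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)

# Decoding the URL again recovers the original query text unchanged.
round_trip = parse_qs(urlsplit(url).query)["query"][0]
```

The resulting URL could be passed to any HTTP client. Which result formats an endpoint offers, and whether a given endpoint is still online, varies, so check the endpoint’s own documentation before relying on it.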
  • 21. Important things I haven’t mentioned • Using ‘filters’ in queries • Data types • Language tags • much more...
    See next slide for some further reading!
  • 22. Further reading • Learning SPARQL by Bob DuCharme (http://www.learningsparql.com/bio.html) • http://www.meanboyfriend.com/overdue_ideas/2011/12/experimenting-with-british-museum-data/ • http://www.meanboyfriend.com/overdue_ideas/2013/04/contemporaneous-part-one/ • Tutorial from Cambridge University Library Linked Data project: http://data.lib.cam.ac.uk/sparql.php