Department of Parliamentary Services
Parliamentary Library and Information Service

Linked Data:
thinking big, starting small

VALA
6 February 2014
Peter Neish
@peterneish

What will be covered
• Background
– What is Linked Data?

– Linked Data in Libraries and
Government

• What we did
– Linked Data Workflow

• What did we get out of it?

What is Linked Data?
[Diagram: the triple statement (Subject, Predicate, Object), slightly simplified example]
• Node labels: Denis Napthine, Ted Baillieu, premier, Liberal Party, United Australia Party, 1 October 1988, 31 August 1945
• Edge labels: hasRole, successorOf, party, elected, formationDate
• URIs shown: http://parliament.vic.gov.au/members/id/135, http://dbpedia.org/resource/Liberal_Party_of_Australia, http://www.w3.org/ns/org#memberOf
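
As a rough illustration of the triple statement, the sketch below holds a few subject, predicate, object rows as Python tuples and prints them in N-Triples syntax. Which edge joins which nodes is an assumed reading of the diagram, the `http://example.org/vocab#` namespace is a hypothetical placeholder, and telling URIs from literals by their prefix is a simplification.

```python
# Assumed reading of the slide's diagram: the pairing of labels, URIs and
# edges is a guess, and the EX namespace is a hypothetical placeholder.
NAPTHINE = "http://parliament.vic.gov.au/members/id/135"
LIBERAL = "http://dbpedia.org/resource/Liberal_Party_of_Australia"
MEMBER_OF = "http://www.w3.org/ns/org#memberOf"
EX = "http://example.org/vocab#"  # hypothetical vocabulary

triples = [
    (NAPTHINE, EX + "name", "Denis Napthine"),
    (NAPTHINE, MEMBER_OF, LIBERAL),
    (LIBERAL, EX + "formationDate", "31 August 1945"),
]


def term(value):
    """Render a URI as <...> and anything else as a quoted plain literal."""
    if value.startswith(("http://", "https://")):
        return "<" + value + ">"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_ntriples(rows):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in rows)


print(to_ntriples(triples))
```

Each printed line is one statement. Because the object of one triple is the subject of another, the separate statements join up into a graph.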

Linked Data in Libraries
• OCLC – 1.2 million resources – 80 million triples
• LOC – Subject headings, authority files

• British Library – 2.8 million records, 93 billion triples
• BIBFRAME

• Schema Bib Extend Community Group
• LODLAM

Linked Data in Parliament and Government
– 6.4 billion triples of open government data

Open Government

Project aims
• Is Linked Data useful in a
local context?
• Explore the process of using
Linked Data – where do you
start?
• Being able to interrogate our
data in new ways
• Use visualisation to gain new
insights into data

Databases at Parliament
People and
Organisations

government
agencies

Documents

media releases

parliamentary
debates (Hansard)

newspaper
clippings

Members of
Parliament

Media

parliamentary
papers

video and audio
clips

party policies

Linked Data Workflow
Preparation
• choose ontology
• investigate similar projects

Clean and reconcile data
• clean data (cluster, facet)
• named entity extraction
• reconcile with other data

Publish
• output RDF
• store data (files, triple store, etc.)
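
The "cluster" part of the cleaning step can be sketched with a key-collision approach: normalise each value to a fingerprint and group the values that share one. This is a minimal sketch in the spirit of fingerprint clustering, not the project's actual configuration, and the spelling variants in `names` are made up for illustration.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Normalise a string so near-duplicate spellings share one key."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))  # order and repeats no longer matter
    return " ".join(tokens)


def cluster(values):
    """Group values by fingerprint; keep only groups with more than one member."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


# Made-up variants of names that appear in the slides.
names = ["Napthine, Denis", "Denis Napthine", "denis  napthine ", "Ted Baillieu"]
print(cluster(names))
```

Each returned group is a candidate for merging into a single preferred form. A person would normally review the groups before accepting them.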

Preparation
• Investigate similar projects
– Don’t reinvent the wheel
– Collaborate

• Choose an ontology (or build your own)
– Linked Data Open Vocabularies (lov.okfn.org)

Popolo Ontology
popoloproject.com

• developing open government
specifications relating to the legislature
• prioritizes reuse over novelty
• attempts to make it easy to represent
real-world data
• consensus model – open to
contributions (W3C community
group, github)

Clean and
reconcile data

Publish
• create RDF (OpenRefine can do this
too)
• store data
– separate files
– embedded in html
– database mapping using D2RQ
– triple store
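
One way to picture the "embedded in html" option is a JSON-LD block placed inside a page. The slides do not say which embedding syntax was used, so JSON-LD is an assumption here (RDFa or microdata would be alternatives), and the record's field names are hypothetical.

```python
import json


def jsonld_script(member):
    """Build a <script> element carrying one member record as JSON-LD."""
    doc = {
        "@context": {
            "name": "http://xmlns.com/foaf/0.1/name",
            "memberOf": {
                "@id": "http://www.w3.org/ns/org#memberOf",
                "@type": "@id",  # the value is a URI, not a plain string
            },
        },
        "@id": member["uri"],
        "name": member["name"],
        "memberOf": member["party_uri"],
    }
    body = json.dumps(doc, indent=2)
    # Stop a literal "</script>" inside the data from closing the tag early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical record shape; the URIs are the ones shown on the triple slide.
record = {
    "uri": "http://parliament.vic.gov.au/members/id/135",
    "name": "Denis Napthine",
    "party_uri": "http://dbpedia.org/resource/Liberal_Party_of_Australia",
}
print(jsonld_script(record))
```

The same record could instead be written to a separate file or loaded into a triple store. Embedding keeps the data next to the catalogue page that describes it.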

What do we get out of it?
• Combined approach
– embedded data in catalogue

– Fuseki Triple Store

• Complex queries using SPARQL:
– what have previous speakers been saying
about the current issues in parliament?
– find all articles about transport that mention
members of the Road Safety Committee
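
The second query is a join across several triple patterns, which is what a SPARQL engine such as Fuseki evaluates. The toy matcher below mimics that join in memory so the logic is visible. It is a sketch only: every identifier in the data is invented, and a real query would use the project's own URIs and vocabulary.

```python
# Toy data: every identifier below is invented for illustration.
triples = {
    ("article:1", "about", "topic:transport"),
    ("article:1", "mentions", "member:a"),
    ("article:2", "about", "topic:transport"),
    ("article:2", "mentions", "member:b"),
    ("article:3", "about", "topic:health"),
    ("article:3", "mentions", "member:a"),
    ("member:a", "memberOf", "committee:road-safety"),
}


def match(pattern, bindings):
    """Yield extended bindings for one (s, p, o) pattern; '?x' marks a variable."""
    for triple in triples:
        new = dict(bindings)
        ok = True
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(want, have) != have:
                    ok = False
                    break
            elif want != have:
                ok = False
                break
        if ok:
            yield new


def query(patterns):
    """Join the patterns left to right, carrying variable bindings along."""
    results = [{}]
    for pattern in patterns:
        results = [b for prior in results for b in match(pattern, prior)]
    return results


rows = query([
    ("?article", "about", "topic:transport"),
    ("?article", "mentions", "?member"),
    ("?member", "memberOf", "committee:road-safety"),
])
print(sorted({row["?article"] for row in rows}))
```

The shared variables `?article` and `?member` are what link the three patterns, in the same way that shared URIs link the underlying databases.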

Links to related articles

Federal Preferences 2013 Election

Conclusion
• The process itself is valuable
• Aligning data with standards
(Popolo Ontology)
• Cleaning and reconciling
adds value to data

• Databases linked internally
• Can now provide Linked
Data externally

Further Information
Linked Data best practice and recipes
• freeyourmetadata.org

• linkeddatabook.com
• euclid-project.eu
@peterneish
github.com/peterneish

Linked Data: thinking big, starting small

  • 1. Department of Parliamentary Services Parliamentary Library and Information Service Linked Data: thinking big, starting small VALA 6 February 2014 Peter Neish @peterneish
  • 2. Department of Parliamentary Services Parliamentary Library and Information Service What will be covered • Background – What is Linked Data? – Linked Data in Libraries and Government • What we did – Linked Data Workflow • What did we get out of it?
  • 3. Department of Parliamentary Services Parliamentary Library and Information Service What is Linked Data? 1 October 1988 United Australia Party http://www.w3.org/ns/org#memberOf Predicate elected successorOf party Subject Object Denis Napthine Liberal Party http://parliament.vic.gov.au/members/id/135 hasRole premier successorOf http://dbpedia.org/resource/Liberal_Party_of_Australia party formationDate Ted Baillieu the triple statement slightly simplified example 31 August 1945
  • 4. (no transcribed text)
  • 5. (no transcribed text)
  • 6. (no transcribed text)
  • 7. Linked Data in Libraries • OCLC – 1.2 million resources – 80 million triples • LOC – Subject headings, authority files • British Library – 2.8 million records, 93 billion triples • BIBFRAME • Schema Bib Extend Community Group • LODLAM
  • 8. Linked Data in Parliament and Government – 6.4 billion triples of open government data
  • 9. Open Government
  • 10. Project aims • Is Linked Data useful in a local context? • Explore the process of using Linked Data – where do you start? • Being able to interrogate our data in new ways • Use visualisation to gain new insights into data
  • 11. Databases at Parliament. Groups: People and Organisations; Documents; Media. Diagram labels: government agencies, media releases, parliamentary debates (Hansard), newspaper clippings, Members of Parliament, parliamentary papers, video and audio clips, party policies
  • 12. Linked Data Workflow. Preparation: • choose ontology • investigate similar projects. Clean and reconcile data: • clean data (cluster, facet) • named entity extraction • reconcile with other data. Publish: • output RDF • store data (files, triple store etc.)
  • 13. Preparation • Investigate similar projects – Don’t reinvent the wheel – Collaborate • Choose an ontology (or build your own) – Linked Data Open Vocabularies (lov.okfn.org)
  • 14. Popolo Ontology (popoloproject.com) • developing open government specifications relating to the legislature • prioritizes reuse over novelty • attempts to make it easy to represent real-world data • consensus model – open to contributions (W3C community group, GitHub)
  • 15. Clean and reconcile data
  • 16. Clean and reconcile data
  • 17. Publish • create RDF (OpenRefine can do this too) • store data – separate files – embedded in HTML – database mapping using D2RQ – triple store
  • 18. What do we get out of it? • Combined approach – embedded data in catalogue – Fuseki triple store • Complex queries using SPARQL: – what have previous Speakers been saying about the current issues in parliament? – find all articles about transport that mention members of the Road Safety Committee
  • 19. (no transcribed text)
  • 20. Links to related articles
  • 21. Federal Preferences 2013 Election
  • 22. Conclusion • The process itself is valuable • Aligning data with standards (Popolo Ontology) • Cleaning and reconciling adds value to data • Databases linked internally • Can now provide Linked Data externally
  • 23. Further Information. Linked Data best practice and recipes • freeyourmetadata.org • linkeddatabook.com • euclid-project.eu. @peterneish, github.com/peterneish
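
The triple statement on slide 3 and the "complex queries" on slide 18 can be made a little more concrete with a small sketch. This is not the presenter's implementation (the talk used RDF in a Fuseki triple store queried with SPARQL); it is plain Python that only mimics the idea. The pairing of predicates with nodes is one reading of the flattened diagram, and every identifier starting with `ex:` is a hypothetical placeholder that does not appear in the slides.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# URIs that appear on slide 3.
ORG_MEMBER_OF = "http://www.w3.org/ns/org#memberOf"
NAPTHINE = "http://parliament.vic.gov.au/members/id/135"
LIBERAL = "http://dbpedia.org/resource/Liberal_Party_of_Australia"

# Assumed reading of the diagram; "ex:" names are made up for illustration.
TRIPLES = [
    (NAPTHINE, ORG_MEMBER_OF, LIBERAL),
    (NAPTHINE, "ex:hasRole", "premier"),
    (NAPTHINE, "ex:successorOf", "ex:Ted_Baillieu"),
    (LIBERAL, "ex:formationDate", "1945-08-31"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# Who is a member of the Liberal Party, according to this tiny graph?
members = [subj for subj, _, _ in match(TRIPLES, p=ORG_MEMBER_OF, o=LIBERAL)]
print(members)
```

A SPARQL basic graph pattern works on the same principle: fixed terms constrain the match and variables play the part of the `None` wildcards here.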

Editor's notes

  1. Modelling data in such a way that computers can understand. As humans we have some understanding of what is meant by party and membership, but computers don’t; they are stupid and need formal definitions of these things. We need identifiers; again, these are for computers, not for people. One database record could have hundreds of triple statements. Once it is linked in a graph and put on the web, lots of interesting things become possible.
  2. Finding cheesecake recipes. Possible because data is marked up semantically behind the scenes. Trivial example, but search is one of the primary drivers of Linked Data technologies. Microsoft, Google and Yahoo have agreed on a schema, schema.org, which cannot be ignored.
  3. I used to work at the Royal Botanic Gardens here in Melbourne, where we worked really hard with other botanic gardens to link up data across states. The problem was that names would vary across state boundaries. Linked Data was the answer, and this underpins the Atlas of Living Australia, which links up data on all living things held in Australian botanic gardens and museums.
  4. The biomedical area has been an early adopter of Linked Data. By linking gene sequences, proteins, drugs and clinical trials, new discoveries can be made.
  5. Libraries have been very active too, e.g. OCLC, Library of Congress, British Library. There are also groups working out the best ways to work with Linked Data and bibliographic records. BIBFRAME is concerned with using Linked Data to describe collections and the entire cataloguing process. Schema Bib Extend is concerned with making sure bibliographic information is discoverable, by working to get bibliographic information encoded in schema.org. LODLAM: Linked Open Data in Libraries, Archives and Museums.
  6. Move to Open Government, where governments release data they have collected (which has been paid for by the taxpayer). Makes governments more accountable. Others can use the data to build applications.
  7. Non-government organisations have appeared that strive to make governments more accountable. OpenAustralia republishes Hansard from the federal parliament. Sunlight Foundation in the US has many applications that track activities in Congress. OpenNorth in Canada. mySociety in the UK.
  8. How useful is Linked Data in a local context? Is that enough to justify investing in the technology?
  9. Databases are grouped into three related groups: “People and organisations” and “Digital Resources”. Currently on a variety of platforms, mainly DB/Textworks, but also MySQL and KE Texpress.
  10. Workflow to implement Linked Data. There are links to best practices at the end of the presentation.
  11. There is a natural tendency to think that our institution is unique and has unique requirements; that is probably not the case. Investigate, find others doing the same thing and possibly collaborate. Choosing an ontology requires decisions to be made about how to describe the things in our database using a standard ontology. One useful tool is a site from the Open Knowledge Foundation that lets you search about 400 well-known ontologies.
  12. Popolo was aligned with our data. Governments are complex: there are members, houses of parliament, legislation, acts, bills, speeches, and all this needs to be modelled in a standard way. If we can agree on a standard then we can collaborate on tools as well as the standards. We have been able to test our data against the ontology and provide feedback on where it falls short.
  13. OpenRefine (formerly Google Refine) is an amazing tool for cleaning and reconciling data. Excellent introductory and tutorial videos are available on the OpenRefine site, http://openrefine.org/. In brief, you can cluster and facet to find duplicate values, slight misspellings or syntax errors.
  14. Reconciling is about taking values stored as strings in your database and linking them to an authoritative source; this could be DBpedia, Freebase or the Library of Congress Subject Headings. Your term can be matched against the authoritative source and the best match chosen. You now have the identifier that creates a link between your data and the authoritative source.
  15. Google Refine can do this too. See links at the end on best practices and guides for publishing Linked Data.
  16. Fuseki (Jena project) using a TDB data store. How would these queries have been done before? First look up all the previous Speakers, then query our database for each of them and combine the results. Now our system knows which members have had the role of Speaker and we can use this in our query. SPARQL is a query language for querying triple store databases.
  17. Shows the topics of media releases from the last year. Colour is for party and the size is proportional to the number of media releases. Can quickly see what the important topics are for the week, month etc.
  18. We are using some semantic tools to identify entities in our content and can easily link these to other similar items.
  19. With an election coming up in November this year, we are keen to explore how we can link our information with that from the electoral commission. We should be able to link up electorates, demographics, news items, candidates, policies etc. This visualisation was possible because the Australian Electoral Commission produces data on elections in a standard format. In this case I was able to use the data to get a better insight into how preferences were being swapped between parties.
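
Note 13 mentions clustering to find duplicate values. As a rough illustration of what that can involve, here is a minimal sketch of key-collision clustering in the style of a "fingerprint" key: values that normalise to the same key are grouped as likely duplicates. It is written from scratch in plain Python and is not OpenRefine's code; the function names and the sample values are hypothetical. It only catches differences in case, punctuation, spacing and word order. Genuine misspellings need a fuzzier method.

```python
import string
from collections import defaultdict
from typing import Dict, Iterable, List


def fingerprint(value: str) -> str:
    """Normalise a string so that trivially different spellings collide."""
    value = value.strip().lower()
    # Drop ASCII punctuation, then sort the unique whitespace-separated tokens.
    value = value.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(value.split())))


def cluster(values: Iterable[str]) -> List[List[str]]:
    """Group values by fingerprint; keep only groups with differing spellings."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Hypothetical sample values, not taken from the Parliament's databases.
SAMPLE_NAMES = [
    "Liberal Party",
    "liberal party ",
    "Party, Liberal",
    "Australian Labor Party",
]

for group in cluster(SAMPLE_NAMES):
    print(group)
```

Each group is a candidate set for merging into one preferred value. Reconciliation (note 14) would then match that cleaned value against an authoritative source to obtain an identifier.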