Library linked data
LIS 551
Dorothea Salo
DPLA
• What is it?
• Where are its materials coming from?
• Where is its metadata coming from?
• What does that tell us about the metadata?
• How do you think they’ll collect the metadata?
• What will they need to do with the metadata, once
collected?
• What problems will they run into, do you think?
Some eternal verities
• What’s in our catalogs isn’t all the
metadata (broad sense) we have.
• BLASPHEMY: a lot of that catalog metadata probably
isn’t even the most important metadata academic
libraries have! Why might that be?
• Possibly not the most prolific source of metadata
either. This will be truer as time passes. Why?
• What about public libraries? Archives?
• The rest of our metadata exists in many
forms and formats.
• The major, often only, form of interaction
with our metadata is computer-mediated.
• Other people have metadata too!
Practical implications
• We need to design standards and practices around
what computers do well, and what they need in
order to do what they do.
• We need to design for being PART of the data
universe, not all of it.
• “Open world assumption”: no one body has all the data! Or all the
answers!
• And nobody can impose their view of the world on everybody
else. (Fortunately, nobody necessarily has to.)
• Designing for consistency, flexibility and
extensibility without sacrificing comprehensibility
• (This is a tall order; we’re not there yet. Is anyone?)
Things computers like
• Unique identifiers
• for anything you plan to discuss or refer to
• that NEVER CHANGE OR DISAPPEAR. (Sorry, name-authority strings.)
• How do we do this given the open-world assumption?
• Consistent, predictable, human-language-independent
data
• Free text (including punctuation) makes computers sad. They aren’t
human. They don’t understand it. They can be cued to PRODUCE it, but
only based on rules they’re given about the underlying data.
• Computers produce typography and layout, but don’t understand
those, either.
• Controlled vocabularies
• (If they’re well-provisioned with identifiers; see above.)
We have silos,
and we both love and hate them.
Photo: Doc Searls, “silos,” http://www.flickr.com/photos/docsearls/5500714140/ CC-BY
So how can we
de-silo-ize library data?
Possibility 1:
One standard to rule them all
• Issues with this?
• Technical issues
• Quality issues
• Language issues
• Sociological issues
• Who’s trying this? On what level?
Possibility 2:
Metasearch
• Issues with this?
• Technical issues
• Quality issues
• Language issues
• Sociological issues
• Who’s trying this? On what level?
Diagram: Angela Pratesi and Kalsang (by permission)
Possibility 3:
Big metadata bucket
• Issues with this?
• Technical issues
• Quality issues
• Language issues
• Sociological issues
• Who’s trying this? On what level?
Diagram: Angela Pratesi and Kalsang (by permission)
How do you make a big
metadata bucket?
• Given...
• Different file formats (XML, relational database,
Excel, plain text, etc.)
• Different structures with different granularity
• Different standards... or no standard at all
• Different controlled vocabularies... or none
• One option: the Google route
• But what do we lose there?
Crosswalking: the n×n problem
• As you build your bucket, you find that
people are using n metadata standards.
• You decide you want to be able to translate
any of them into any of the others.
• Guess what? You need to write n×n − n
(nearly n²) crosswalks.
• This gets impossibly unwieldy very quickly. How many
metadata standards do you know about, just from
this class?
• And how compatible will the standards be, anyway?
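The crosswalk arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name and the sample values of n are illustrative assumptions, not part of the lecture.

```python
def pairwise_crosswalks(n: int) -> int:
    """Count the one-way crosswalks needed to translate any of n standards into any other."""
    # There are n x n ordered pairs of standards; the n pairs that map a
    # standard onto itself need no crosswalk.
    return n * n - n


# Print the count for a few illustrative values of n.
for n in (3, 5, 10, 20):
    print(n, "standards ->", pairwise_crosswalks(n), "crosswalks")
```

Even a modest list of standards pushes the count into the dozens or hundreds, which is the "impossibly unwieldy" point made above.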
Okay, okay, master
standard, then!
• Crosswalk everything you take in to one
standard. Then you only need to write n
crosswalks!
• Issues with this?
• Technical issues
• Quality issues
• Language issues
• Sociological issues
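A minimal sketch of the master-standard idea, assuming two made-up incoming formats and a made-up set of master field names (every name in `CROSSWALKS` and in the sample record is hypothetical). It keeps one crosswalk per incoming standard and reports source fields that have no home in the master standard, which is one place the quality issues show up.

```python
# One crosswalk (a field-name mapping) per incoming standard, all targeting
# the same hypothetical master field names.
CROSSWALKS = {
    "format_a": {"ti": "title", "au": "creator", "yr": "date"},
    "format_b": {"Title": "title", "Author": "creator"},
}


def to_master(record, source_format):
    """Translate one record into the master field names.

    Returns the translated record plus the list of source fields that had
    no mapping and were therefore lost in translation.
    """
    mapping = CROSSWALKS[source_format]
    translated, dropped = {}, []
    for field, value in record.items():
        if field in mapping:
            translated[mapping[field]] = value
        else:
            dropped.append(field)
    return translated, dropped


record_b = {
    "Title": "Innkeeper at the Roach Motel",
    "Author": "Salo, Dorothea",
    "Shelf": "A3",  # no equivalent in the hypothetical master standard
}
print(to_master(record_b, "format_b"))
```

Adding a new incoming standard means adding one entry to `CROSSWALKS`, rather than one crosswalk per existing standard.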
Is there a better way?
... Maybe?
Five stars of linked data
(the first three, at least)
Sir Tim Berners-Lee:
Review: URLs as identifiers
• Where have we seen this already?
• Why URLs?
• What library-type stuff has already been
identified with URLs?
• What would need to be, do you think?
So, seriously...
• Every term in every controlled vocabulary, every
element in every metadata standard, every
“document” we might ever talk about (in all its
FRBRish permutations) needs its own URL?
• SERIOUSLY?
• ... basically, yep.
• Not every time. (Dates are dates. Human names are strings.)
• It gets worse, though: XML-based languages use element
nesting to carry meaning, and relational databases use table
membership and data typing. How do you translate THOSE
to URLs?
Example 1: Authority URIs
Example 2: Dublin Core
concepts
Use URIs in MODS!
The fundamental strategy
• Break down everything we can say about
the world into the smallest units of
meaning we can manage.
• That’s smaller than you’d think, as we’ll see!
• Build up search indexes, user displays,
and machine interactions from there.
• I’m being vague about “machine interactions.” Don’t
take that to mean they aren’t important! They’re
just a bit more than I can explain here and now.
• Try not to reinvent wheels.
• But if you must, make sure to link new and old.
Smallest units of meaning:
are these them?
Okay, so we have a
bunch of URIs.
What do we actually DO with them?
We plug them into RDF.
... vocabulary note
• “Semantic Web:” Tim Berners-Lee disappearing
into his own navel.
• Term is a bit out-of-favor these days.
• “Linked data:” a real-world effort to make large
datastores more interoperable
• RDF: invented by the SemWebbers, now a
cornerstone for linked data
• Does this mean that all data will be stored as RDF? NO, IT
DOES NOT (and you have my permission to slap anybody
who says it will).
• Totally possible to provide an RDF view onto non-RDF data,
IF AND ONLY IF the data structures and meanings are
thought through in an RDFfy way.
What to do with URIs
• RDF’s answer: “We say things about stuff.”
• At base, RDF really is that simple!
• Base unit of RDF: “triple”
• Subject, property, value/object. Much like subject-verb-object
in an English sentence.
• Example: “Dorothea Salo is the author of ‘Innkeeper at the
Roach Motel.’”
Diagram: Dorothea Salo → isAuthorOf → “Innkeeper at the Roach Motel”
... wait. Where’d all the URLs go?
URL-izing a triple
• Dorothea Salo: http://viaf.org/viaf/21599115/
• “Innkeeper at the Roach Motel”: http://digital.library.wisc.edu/1793/22088
• isAuthorOf
URL-izing a triple
• isAuthorOf: vocabularies! with URIs!
• dcterms:creator
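The URL-ized triple from these slides can be written out as data. The sketch below uses plain Python tuples rather than an RDF library; the two resource URLs come from the slides, and the property URI is the standard expansion of `dcterms:creator`. The helper name `to_ntriples` is an assumption made for illustration.

```python
# The work and the person, identified by the URLs shown on the slides.
WORK = "http://digital.library.wisc.edu/1793/22088"
PERSON = "http://viaf.org/viaf/21599115/"
# Dublin Core Terms "creator" property, the expansion of dcterms:creator.
DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"

# dcterms:creator points from a resource to its creator, so the work is the
# subject here (the reverse of the informal "isAuthorOf" phrasing).
triple = (WORK, DCTERMS_CREATOR, PERSON)


def to_ntriples(subject, predicate, obj):
    """Serialize a triple of three URIs as one N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


print(to_ntriples(*triple))
```

Nothing in the serialized line is a human-language string: subject, property, and object are all identifiers a machine can follow.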
Building up from triples
Diagram: Stephen J. Miller, “Teaching RDA after the National Implementation Decisions”
... which can get tangled
Diagram: Stephen J. Miller, “Teaching RDA after the National Implementation Decisions”
But... but...
• What if the same thing has two URIs?
• Foreseen problem! There are ways for linked data to express
URI equivalences... though there are huge arguments about
when two URIs are really-truly equivalent.
• My sense is that this decision is contextual. (AKA: “will
Amazon.com use FRBR?”) What’s equivalent for your
purposes may not be for mine. And that’s okay!
• Where do we get URIs from?
• This will be part of the new cataloging infrastructure a-
borning, but the answer works out to “a lot of the same
places we already get authority information and catalog
records from,” e.g. VIAF.
• But we’re no longer LIMITED to just those! Key point. Think
about ORCID!
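One way to picture "equivalence is contextual" is to let each consumer of the data supply its own table of accepted equivalences. The sketch below is an assumption-laden illustration: all `example.org` URIs are made up, the function name is hypothetical, and it only follows a single alias hop rather than chains of equivalences.

```python
CREATOR = "http://purl.org/dc/terms/creator"


def canonicalize(triples, same_as):
    """Rewrite subject and object URIs using the equivalences this consumer accepts.

    same_as maps an alias URI to the URI treated as canonical locally.
    """
    def canon(uri):
        return same_as.get(uri, uri)

    return {(canon(s), p, canon(o)) for s, p, o in triples}


# Two hypothetical sources describe works by what may be the same person.
triples = [
    ("http://example.org/work/1", CREATOR, "http://example.org/authority-a/person/42"),
    ("http://example.org/work/2", CREATOR, "http://example.org/authority-b/p/9000"),
]

# This consumer decides the two person URIs are equivalent for its purposes.
accepted = {
    "http://example.org/authority-b/p/9000": "http://example.org/authority-a/person/42",
}

merged_creators = {o for _, _, o in canonicalize(triples, accepted)}
separate_creators = {o for _, _, o in canonicalize(triples, {})}
print(len(merged_creators), len(separate_creators))
```

A different consumer can pass an empty table and keep the two people distinct, without either party changing the published data.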
But... but...
• Where’s the record? And standards for
the record?
• The record is what we make it! What’s useful to us,
we use. What isn’t, we ignore. That’s how the open
world assumption works.
• If we need to impose rules on the data we’ll be
putting out there (and we probably do!), there are
ways to do that.
• We just can’t expect to impose those ways on
anybody else. (Though we can put our rules out
there for others to follow, and we probably should!)
Trust: an unsolved problem
• Review: what happened with <meta> tags
on the web?
• Right. What’s to stop the same thing
happening in a linked-data environment?
• What’s to stop me from writing a triple that says
I’m Tchaikovsky?
• For our purposes? We’ll pick and choose the
vocabularies and domains we trust, I expect, just
as we already do.
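Picking the domains we trust can be sketched as a filter on where each triple was retrieved from. Everything here is hypothetical: the host names, the sample statements, and the function name are illustrative assumptions, and a real policy would need more than a host check.

```python
from urllib.parse import urlparse

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical local policy: hosts whose published triples we accept.
TRUSTED_HOSTS = {"authority.example.org"}


def from_trusted_sources(statements, trusted_hosts):
    """Keep only the triples retrieved from a host on the trusted list.

    statements is a list of (source_url, triple) pairs, where source_url
    records where the triple was published.
    """
    return [
        triple
        for source_url, triple in statements
        if urlparse(source_url).hostname in trusted_hosts
    ]


statements = [
    # A claim published by a source we have chosen to trust.
    ("http://authority.example.org/data.nt",
     ("http://authority.example.org/person/1", OWL_SAME_AS,
      "http://other.example.org/person/99")),
    # Anyone can publish a triple claiming to be Tchaikovsky.
    ("http://random.example.com/me.nt",
     ("http://random.example.com/me", OWL_SAME_AS,
      "http://authority.example.org/person/tchaikovsky")),
]

kept = from_trusted_sources(statements, TRUSTED_HOSTS)
print(kept)
```

The untrusted claim is not deleted from the web; it is simply ignored locally, which fits the open world assumption.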
Fine. Whatever.
So is anybody actually
DOING
this linked-data stuff?
Yes. And we’ll talk about that next week!
Thanks!
• Copyright 2013 by Dorothea Salo.
• This lecture and slide deck are licensed
under a Creative Commons Attribution
3.0 United States License.
• Please respect ownership and licensing
of included materials. Thanks!

Library Linked Data

  • 1. Library linked data LIS 551 Dorothea Salo
  • 2. DPLA• What is it? • Where are its materials coming from? • Where is its metadata coming from? • What does that tell us about the metadata? • How do you think they’ll collect the metadata? • What will they need to do with the metadata, once collected? • What problems will they run into, do you think?
  • 3. Some eternal verities • What’s in our catalogs isn’t all the metadata (broad sense) we have. • BLASPHEMY: a lot of that catalog metadata probably isn’t even the most important metadata academic libraries have! Why might that be? • Possibly not the most prolific source of metadata either. This will be truer as time passes. Why? • What about public libraries? Archives? • The rest of our metadata exists in many forms and formats. • The major, often only, form of interaction with our metadata is computer-mediated. • Other people have metadata too!
  • 4. Practical implications • We need to design standards and practices around what computers do well, and what they need in order to do what they do. • We need to design for being PART of the data universe, not all of it. • “open world assumption:” no one body has all the data! or all the answers! • And nobody can impose their view of the world on everybody else. (Fortunately, nobody necessarily has to.) • Designing for consistency, flexibility and extensibility without sacrificing comprehensibility • (this is a tall order; we’re not there yet. is anyone?)
  • 5. Things computers like • Unique identifiers • for anything you plan to discuss or refer to • that NEVER CHANGE OR DISAPPEAR. (Sorry, name-authority strings.) • How do we do this given the open-world assumption? • Consistent, predictable, human-language-independent data • Free text (including punctuation) makes computers sad. They aren’t human. They don’t understand it. They can be cued to PRODUCE it, but only based on rules they’re given about the underlying data. • Computers produce typography and layout, but don’t understand those, either. • Controlled vocabularies • (If they’re well-provisioned with identifiers; see above.)
  • 6. We have silos, and we both love and hate them. Photo: Doc Searls, “silos,” http://www.flickr.com/photos/docsearls/5500714140/ CC-BY
  • 7. So how can we de-silo-ize library data?
  • 8. Possibility 1: One standard to rule them all • Issues with this? • Technical issues • Quality issues • Language issues • Sociological issues • Who’s trying this? On what level?
  • 9. Possibility 2: Metasearch • Issues with this? • Technical issues • Quality issues • Sociological issues • Who’s trying this? On what level? Diagram: Angela Pratesi and Kalsang (by permission)
  • 10. Possibility 2: Metasearch • Issues with this? • Technical issues • Quality issues • Language issues • Sociological issues • Who’s trying this? On what level?
  • 11. Possibility 3: Big metadata bucket • Issues with this? • Technical issues • Quality issues • Sociological issues • Who’s trying this? On what level? Diagram: Angela Pratesi and Kalsang (by permission)
  • 12. Possibility 3: Big metadata bucket • Issues with this? • Technical issues • Quality issues • Language issues • Sociological issues • Who’s trying this? On what level?
  • 13. How do you make a big metadata bucket? • Given... • Different file formats (XML, relational-database, Excel, plain-text, etc) • Different structures with different granularity • Different standards... or no standard at all • Different controlled vocabularies... or none • One option: the Google route • But what do we lose there?
  • 14. Crosswalking: the n×n problem • As you build your bucket, you find that people are using n metadata standards. • You decide you want to be able to translate any of them into any of the others. • Guess what? You need to write n×n − n (nearly n²) crosswalks. • This gets impossibly unwieldy very quickly. How many metadata standards do you know about, just from this class? • And how compatible will the standards be, anyway?
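The arithmetic on slide 14 can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the "hub" count anticipates the master-standard idea on the next slide (one inbound crosswalk per standard).

```python
def pairwise_crosswalks(n: int) -> int:
    """Crosswalks needed to translate any of n standards into any other."""
    # n * n ordered pairs, minus the n pointless "translate to itself" cases
    return n * n - n


def hub_crosswalks(n: int) -> int:
    """Crosswalks needed if every standard is translated into one master standard."""
    # one inbound crosswalk per incoming standard
    return n


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(f"{n} standards: {pairwise_crosswalks(n)} pairwise, {hub_crosswalks(n)} via a hub")
```

With only 10 standards the pairwise approach already needs 90 crosswalks, which is why the bucket-builder starts looking for another way.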
  • 15. Okay, okay, master standard, then! • Crosswalk everything you take in to one standard. Then you only need to write n crosswalks! • Issues with this? • Technical issues • Quality issues • Language issues • Sociological issues
  • 16. Is there a better way? ... Maybe?
  • 17. Five stars of linked data (the first three, at least) Sir Tim Berners-Lee:
  • 18. Review: URLs as identifiers • Where have we seen this already? • Why URLs? • What library-type stuff has already been identified with URLs? • What would need to be, do you think?
  • 19. So, seriously... • Every term in every controlled vocabulary, every element in every metadata standard, every “document” we might ever talk about (in all its FRBRish permutations) needs its own URL? • SERIOUSLY? • ... basically, yep. • Not every time. (Dates are dates. Human names are strings.) • It gets worse, though: XML-based languages use element nesting to carry meaning, and relational databases use table membership and data typing. How do you translate THOSE to URLs?
  • 21. Example 2: Dublin Core concepts
  • 22. Use URIs in MODS!
  • 23. The fundamental strategy • Break down everything we can say about the world into the smallest units of meaning we can manage. • That’s smaller than you’d think, as we’ll see! • Build up search indexes, user displays, and machine interactions from there. • I’m being vague about “machine interactions.” Don’t take that to mean they aren’t important! They’re just a bit more than I can explain here and now. • Try not to reinvent wheels. • But if you must, make sure to link new and old.
  • 24. Smallest units of meaning: are these them?
  • 25. Okay, so we have a bunch of URIs. What do we actually DO with them? We plug them into RDF.
  • 26. ... vocabulary note • “Semantic Web:” Tim Berners-Lee disappearing into his own navel. • Term is a bit out-of-favor these days. • “Linked data:” a real-world effort to make large datastores more interoperable • RDF: invented by the SemWebbers, now a cornerstone for linked data • Does this mean that all data will be stored as RDF? NO, IT DOES NOT (and you have my permission to slap anybody who says it will). • Totally possible to provide an RDF view onto non-RDF data, IF AND ONLY IF the data structures and meanings are thought through in an RDFfy way.
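To make "an RDF view onto non-RDF data" a little more concrete, here is a rough sketch that turns one relational-style row into triples. Everything specific is an assumption for illustration: the `example.org` namespace, the sample row, and the choice to map columns onto Dublin Core Terms properties. It is not how any particular system does it.

```python
# Hypothetical namespace for things described by our database rows
BASE = "http://example.org/"

# Assumed mapping from column names to property URIs (Dublin Core Terms)
COLUMN_TO_PROPERTY = {
    "title": "http://purl.org/dc/terms/title",
    "language": "http://purl.org/dc/terms/language",
}


def row_to_triples(table, row):
    """Return (subject, property, value) triples for one database row."""
    # The row's primary key becomes part of the subject URL
    subject = f"{BASE}{table}/{row['id']}"
    triples = []
    for column, prop in COLUMN_TO_PROPERTY.items():
        value = row.get(column)
        if value is not None:  # empty columns simply produce no triple
            triples.append((subject, prop, value))
    return triples


if __name__ == "__main__":
    # Made-up sample row
    row = {"id": 42, "title": "Innkeeper at the Roach Motel", "language": None}
    for triple in row_to_triples("articles", row):
        print(triple)
```

The data stays in its table; the mapping is the part that has to be "thought through in an RDFfy way."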
  • 27. What to do with URIs • RDF’s answer: “We say things about stuff.” • At base, RDF really is that simple! • Base unit of RDF: “triple” • Subject, property, value/object. Much like subject-verb-object in an English sentence. • Example: “Dorothea Salo is the author of ‘Innkeeper at the Roach Motel.’” • As a triple: Dorothea Salo → isAuthorOf → “Innkeeper at the Roach Motel” • ... wait. Where’d all the URLs go?
  • 28. URL-izing a triple • Dorothea Salo: http://viaf.org/viaf/21599115/ • isAuthorOf: vocabularies! with URIs! • “Innkeeper at the Roach Motel”: http://digital.library.wisc.edu/1793/22088
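Slide 28's URL-ized triple can be written down as plain data. A minimal sketch, with two assumptions: the VIAF URL is taken to identify the author and the digital.library.wisc.edu URL the article, and the `isAuthorOf` property URL is hypothetical (the slide only says it would come from a vocabulary with URIs).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


AUTHOR = "http://viaf.org/viaf/21599115/"              # from the slide
ARTICLE = "http://digital.library.wisc.edu/1793/22088"  # from the slide
IS_AUTHOR_OF = "http://example.org/vocab/isAuthorOf"    # hypothetical property URL

triple = Triple(AUTHOR, IS_AUTHOR_OF, ARTICLE)


def is_url(value: str) -> bool:
    """Crude check that a value is an identifier rather than a free-text string."""
    return value.startswith(("http://", "https://"))


if __name__ == "__main__":
    # All three parts are now identifiers a computer can compare exactly
    print(all(is_url(part) for part in triple))
```

Compare this with the string version on the previous slide: nothing here depends on spelling, punctuation, or language.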
  • 30. Building up from triples Diagram: Stephen J. Miller, “Teaching RDA after the National Implementation Decisions”
  • 31. ... which can get tangled Diagram: Stephen J. Miller, “Teaching RDA after the National Implementation Decisions”
  • 32. But... but... • What if the same thing has two URIs? • Foreseen problem! There are ways for linked data to express URI equivalences... though there are huge arguments about when two URIs are really-truly equivalent. • My sense is that this decision is contextual. (AKA: “will Amazon.com use FRBR?”) What’s equivalent for your purposes may not be for mine. And that’s okay! • Where do we get URIs from? • This will be part of the new cataloging infrastructure a-borning, but the answer works out to “a lot of the same places we already get authority information and catalog records from,” e.g. VIAF. • But we’re no longer LIMITED to just those! Key point. Think about ORCID!
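One common way to express the URI equivalences mentioned on slide 32 is an `owl:sameAs` triple. The sketch below groups URIs that have been declared equivalent and picks one representative per group. The grouping code and the `example.org` URIs are illustrative assumptions, and which `sameAs` statements you feed it is exactly the contextual decision the slide describes.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def canonical_map(triples):
    """Map every URI mentioned in a sameAs triple to one representative URI."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # shorten the chain as we walk it
            uri = parent[uri]
        return uri

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Arbitrary but stable choice: the alphabetically first URI wins
                parent[max(root_s, root_o)] = min(root_s, root_o)

    return {uri: find(uri) for uri in list(parent)}


if __name__ == "__main__":
    triples = [
        ("http://viaf.org/viaf/21599115/", OWL_SAME_AS, "http://example.org/people/dsalo"),
        ("http://example.org/people/dsalo", OWL_SAME_AS, "http://example.org/staff/salo"),
    ]
    for uri, canonical in canonical_map(triples).items():
        print(uri, "->", canonical)
```

Two users can run the same code over different sets of equivalence statements and get different groupings, and that's okay.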
  • 34. But... but... • Where’s the record? And standards for the record? • The record is what we make it! What’s useful to us, we use. What isn’t, we ignore. That’s how the open world assumption works. • If we need to impose rules on the data we’ll be putting out there (and we probably do!), there are ways to do that. • We just can’t expect to impose those ways on anybody else. (Though we can put our rules out there for others to follow, and we probably should!)
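"The record is what we make it" can be sketched as a filter over triples: pick the subject, pick the properties that are useful to us, and ignore the rest. The function name, the field labels, and the `example.org` property are all hypothetical.

```python
def build_record(triples, subject, wanted):
    """Assemble a local 'record' for one subject from the properties we care about.

    `wanted` maps property URIs to our own field labels; everything else is ignored.
    """
    record = {}
    for s, p, o in triples:
        if s == subject and p in wanted:
            record.setdefault(wanted[p], []).append(o)
    return record


if __name__ == "__main__":
    article = "http://digital.library.wisc.edu/1793/22088"
    triples = [
        (article, "http://purl.org/dc/terms/title", "Innkeeper at the Roach Motel"),
        (article, "http://example.org/vocab/shelfColor", "green"),  # made-up, not useful to us
    ]
    wanted = {"http://purl.org/dc/terms/title": "title"}
    print(build_record(triples, article, wanted))
```

Someone else can build a different record from the same triples by passing a different `wanted` mapping.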
  • 35. Trust: an unsolved problem • Review: what happened with <meta> tags on the web? • Right. What’s to stop the same thing happening in a linked-data environment? • What’s to stop me from writing a triple that says I’m Tchaikovsky? • For our purposes? We’ll pick and choose the vocabularies and domains we trust, I expect, just as we already do.
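Picking and choosing the domains we trust might look something like the sketch below. It assumes each statement arrives paired with the URL it was published at; the trusted-host list and the `example.org` URLs are illustrative only, and this filter does not solve the trust problem, it just encodes one institution's choices.

```python
from urllib.parse import urlparse

# Illustrative list; each institution would decide its own
TRUSTED_HOSTS = {"viaf.org", "id.loc.gov"}


def is_trusted(source_url, trusted_hosts=TRUSTED_HOSTS):
    """True if the statement was published at a host we have chosen to trust."""
    host = urlparse(source_url).hostname
    # Exact host match only: "www.viaf.org" would need its own entry
    return host is not None and host in trusted_hosts


def filter_trusted(statements, trusted_hosts=TRUSTED_HOSTS):
    """Keep the triples from (source_url, triple) pairs whose source we trust."""
    return [triple for source, triple in statements if is_trusted(source, trusted_hosts)]


if __name__ == "__main__":
    statements = [
        ("http://viaf.org/viaf/21599115/",
         ("http://viaf.org/viaf/21599115/", "http://example.org/vocab/name", "Dorothea Salo")),
        # Anyone can publish a triple claiming to be Tchaikovsky...
        ("http://example.org/claims",
         ("http://example.org/people/me", "http://www.w3.org/2002/07/owl#sameAs",
          "http://example.org/people/tchaikovsky")),
    ]
    for triple in filter_trusted(statements):
        print(triple)
```

The Tchaikovsky claim is still out there on the web; we just decline to use it.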
  • 36. Fine. Whatever. So is anybody actually DOING this linked-data stuff?
  • 37. Yes. And we’ll talk about that next week!
  • 38. Thanks! • Copyright 2013 by Dorothea Salo. • This lecture and slide deck are licensed under a Creative Commons Attribution 3.0 United States License. • Please respect ownership and licensing of included materials. Thanks!