3. Welcome
• A chance to share knowledge, expertise, perspective; explore ideas
• Goal: a solid, basic, conceptual understanding of Linked Open Data
• Breaking it down into 3, and maybe 4 parts
• #lodlam teaser: http://youtu.be/YdrVI7emnt4
4. Linked Open Data in Cultural Context
• It’s not just Libraries, Archives &
Museums
• Linked Open Data has evolved in
the cultural context of shared
information, music, movies
• From rock to rap to hip-hop to
mashups
• Changing expectations from
audiences, curators, technologists
• http://mashupbreakdown.com/
5. LODLAM is a Growing Movement
• in its infancy, but picking up steam
• it requires experimentation
• small, niche, domain-specific implementations
• use cases, reasons for content providers to get excited about contributing
6. LODLAM is a product of our increasingly
connected culture.
• it’s an unfolding story, but it’s on...
• first funded projects in the US exploring Linked Open Data in the
humanities now underway: http://lod-lam.net
• 100 people gathering from around the world to forward LODLAM in the
next year
7. LODLAM is a product of our increasingly
connected culture.
• and that’s just the beginning...
Linked Open Data
9. Going from Tables to Graphs
• Our data and databases have been organized in tables
• which works, but only to a point
http://www.flickr.com/photos/thomasjwoods-com/2264301251
10. Going from Tables to Graphs
• The World Wide Web is much more like a graph, and the ability to link to
disparate datasets relies on our ability to understand data as nodes and
links in a graph
11. Going from Tables to Graphs
• As computing power increases, the ability to build more and more complex
graphs becomes a reality.
• Human vs. Machine readable
[Figure: graph of nodes and links. msulibraries links to lookbackmaps, internetarchive, and librarycongress; lookbackmaps links to internetarchive; internetarchive links to librarycongress]
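As a minimal sketch of the nodes-and-links idea, the five links shown on this slide can be held as pairs and turned into an adjacency map. The function name `build_graph` is illustrative and not from the talk, and treating the links as undirected is an assumption.

```python
from collections import defaultdict

# The five links shown on the slide, as (node, node) pairs.
LINKS = [
    ("msulibraries", "lookbackmaps"),
    ("msulibraries", "internetarchive"),
    ("msulibraries", "librarycongress"),
    ("lookbackmaps", "internetarchive"),
    ("internetarchive", "librarycongress"),
]


def build_graph(links):
    """Return a dict mapping each node to the set of nodes it is linked with."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)  # assumption: links are undirected
    return dict(graph)


graph = build_graph(LINKS)
print(sorted(graph["internetarchive"]))
```

The same five rows would fit in a two-column table, but the adjacency map makes it cheap to ask "what is connected to this node?", which is the question a graph is built to answer.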
12. Introducing Triples
• Quite simply: Subject, Predicate, Object
• gives us the ability to describe entities in a way that is machine readable
[Figure: Nodes & Links. jonvoss (subject) knows (predicate) copystar (object)]
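A rough sketch of the triple from this slide as a data structure. The class name `Triple` and the helper `describe` are illustrative names, not part of any Linked Data library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, a predicate, and an object."""
    subject: str
    predicate: str
    obj: str  # the "object" of the triple


def describe(triple):
    """Render a triple as a node, a labeled link, and another node."""
    return f"{triple.subject} --{triple.predicate}--> {triple.obj}"


triple = Triple("jonvoss", "knows", "copystar")
print(describe(triple))
```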
13. What do we know about the person: Ed Summers
(aside from the fact that he rocks)?
Bio: Hacker for libraries, digital archaeologist, pragmatist.
[Figure: graph of Ed Summers’s profile, with links labeled bio, knows, and depiction of. Source: http://inkdroid.org/ehs.rdf]
14. A quick word about vocabularies
• Caution: what libraries call vocabularies is not necessarily what we mean...
• This is how we organize information and triangulate the data we’re looking
for
• How we agree on predicates
• Ontologies and vocabularies like FOAF, OWL, http://id.loc.gov/, VIAF, etc.
15. Triples for machines
• now we’re ready to talk to machines
• triples follow the Resource Description Framework (RDF) model and can be
serialized in many different ways, including RDF/XML, RDFa, N3, Turtle, etc.,
but they all describe things in the <subject> <predicate> <object> format.
• of course, we need to be consistent and predictable for machines to
understand us.
• More info from old Semantic Focus article
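To make the <subject> <predicate> <object> format concrete, here is a minimal sketch that writes triples out in the line-based N-Triples style. The `example.org` URIs are hypothetical placeholders; the two FOAF property URIs are real. Treating any string that starts with `http://` or `https://` as a URI is a simplifying assumption, and the escaping covers only backslashes and double quotes.

```python
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
FOAF_NICK = "http://xmlns.com/foaf/0.1/nick"


def term(value):
    """Format a URI as <...>; format anything else as a quoted literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


# Hypothetical URIs for the two people on the earlier slide.
triples = [
    ("http://example.org/people/jonvoss", FOAF_KNOWS, "http://example.org/people/copystar"),
    ("http://example.org/people/jonvoss", FOAF_NICK, "jonvoss"),
]
print(to_ntriples(triples))
```

Each output line is one complete statement, which is why line-based serializations are convenient for machines to stream and merge.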
17. Tim Berners-Lee’s 4 rules of Linked Data
• http://www.w3.org/DesignIssues/LinkedData.html
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards
(RDF*, SPARQL)
4. Include links to other URIs, so that they can discover more things.
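Rules 2 and 3 can be sketched as an HTTP request that asks for machine-readable data. The block below only builds the request and does not send it; the helper name `rdf_request` is illustrative, and whether a given server honors the `Accept` header is an assumption that has to be checked per site.

```python
from urllib.request import Request


def rdf_request(uri):
    """Build (but do not send) a request asking for RDF/XML at an HTTP URI."""
    if not uri.startswith(("http://", "https://")):
        # Rule 2: names should be HTTP URIs so they can be looked up.
        raise ValueError(f"not an HTTP URI: {uri}")
    # Rule 3: ask for useful, standards-based information about the thing.
    return Request(uri, headers={"Accept": "application/rdf+xml"})


req = rdf_request("http://inkdroid.org/ehs.rdf")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the lookup. Rule 4 is then a matter of following the URIs found in the response.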
18. Now that we can see the code...
• RDF at Open Library (search for Civil War regiments:
http://openlibrary.org/search?q=regiment&has_fulltext=true&time_facet=Civil+War%2C+1861-1865)
• @musebrarian’s Of Ships and Men
project. http://bit.ly/h8W2yl
(vocabulary: minting URIs)
• Advanced: Ed Summers’s SNAC hacks post:
http://inkdroid.org/journal/2011/03/31/snac-hacks/
19. Tim Berners-Lee 2010 TED Talk
• what people are doing with Linked Data
• http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html
20. Civil War Data 150
• consider graph demo: http://civilwardata150.net
• Civil War vocabulary, or a way to link and traverse across datasets
• Regiments, battles, Freebase military schema
• Building apps
• How tools like Simile/Exhibit can use Linked Data in coordination with
Freebase (Conflict History:
http://conflicthistory.com/#/period/1861-1865/conflict/+en+american_civil_war)
21. In summary
Linked
• Graphs
• Human AND Machine readable
• Vocabulary, agreed terms for organizing info
• Triples, RDF
23. The “Open” part of Linked Open Data
Open
• 5 Stars
• Considerations and ramifications
• Difference between shared, published, open
• Legal tools
• Precedents/Examples
24. Tim Berners-Lee: 5 Stars of Linked Data
• More thanks to Ed Summers:
http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/
• This is NOT all or nothing
25. Expose yourself, be vulnerable
• This is the major cultural shift, the tide rising amongst institutions, that data
wants to be free in a culture economy.
• There is value in sharing
• It does require a leap of faith, but risks and rewards should be carefully
considered and calculated
• Excellent resource: JISC Open Bibliographic Data Guide http://obd.jisc.ac.uk/
26. What will happen to your data?
• If you want people to do something with your data/metadata, you have to
put it out there
• But once you do, it’s [mostly] out of your control. Yet it can be a part of
something much greater than any of the component parts
• Roots and Wings
27. What will happen to your data?
• working with Open Data from NOAA at WhereCamp 2011.
http://www.nauticalcharts.noaa.gov/history/CivilWar/
28. Metadata vs. data, assets, digital surrogates
• A key conceptual shift with Open Data is looking at metadata and data as
two separate things, that can have different licensing and permissions
30. Creative Commons
• In the last several years, Creative Commons has provided standardized,
portable legal tools that are easy for individuals and institutions to
use.
• http://creativecommons.org/licenses/
[Figure: Creative Commons tools arranged between Open Data and Published Data: CC0, Public Domain Mark, CC-BY, CC-BY-SA, CC-BY-ND, CC-BY-NC, CC-BY-NC-SA, CC-BY-NC-ND]
31. Open Data Commons
• ODC Public Domain Dedication and License
• http://www.opendatacommons.org/licenses/
• Building tools with a focus on databases
• May need a graphic artist?
32. Concerns and Limitations
• There is some argument about whether or not metadata can be protected
under copyright at all. Copyright protects a creative work, and some argue
that metadata is scientific fact, rather than creative work.
• Databases are protected differently in the EU and US, for example.
• Public Domain and No Known Copyright...
• Issuing blanket copyright over all works on a website, even though some
may be in the public domain
• Institutions that will not issue any kind of copyright statement due to
concerns or questions about ownership and copyright
33. Examples and precedents
• Bibliographic data:
• British Library (CC0), University of Michigan (CC0), Stanford (CC-BY)
have published large, raw datasets of bibliographic data they have
created (being careful not to publish OCLC or other vendor controlled or
licensed metadata)
34. Examples and precedents
• Civil War Data 150
• Metadata from contributing federal institutions is largely considered
to be Public Domain.
• State, local, university & individual researchers are considering
policies for metadata publishing on a case by case basis.
35. Sciences leading the way vs. Humanities
• In the sciences, there have been a lot of advances in the realm of Open
Data, which will provide models for humanities research as well
• Nano Publishing: the idea of publishing datasets separately from
research findings, so that they can more easily be built upon and integrated
into other datasets. Several scientific journals have already started this.
• Federally funded medical research must have a data management plan
and some funders are requiring that data be published separately from
analysis and findings as Open Data
36. In summary
Open
• put it out there... 5 stars
• published, shared, and/or open
• tools
• metadata vs. assets
38. Raw Data Now...
Open
• Looking at Civil War Data 150 workflow and strategy
• http://www.civilwardata150.net/join
• How we plan to take various datasets and:
• Clean
• Reconcile/Vocabulary Alignment
• Publish triples
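The three steps above can be sketched in a few lines. Everything in this block is hypothetical illustration, not the project's actual workflow: the `example.org` URIs, the `mentionsRegiment` predicate, the one-entry vocabulary, and the sample records are all made up to show the shape of the pipeline.

```python
# Hypothetical shared vocabulary: cleaned regiment name -> agreed URI.
VOCAB = {
    "54th massachusetts infantry": "http://example.org/regiment/54th-massachusetts-infantry",
}
# Hypothetical predicate linking a record to a regiment.
MENTIONS = "http://example.org/vocab/mentionsRegiment"


def clean(name):
    """Step 1: collapse stray whitespace and normalize case."""
    return " ".join(name.split()).lower()


def reconcile(name, vocab):
    """Step 2: align a cleaned name with a vocabulary URI, or None if unknown."""
    return vocab.get(clean(name))


def publish(records, vocab):
    """Step 3: emit a triple for every record that could be reconciled."""
    triples = []
    for record_uri, regiment_name in records:
        regiment_uri = reconcile(regiment_name, vocab)
        if regiment_uri is not None:
            triples.append((record_uri, MENTIONS, regiment_uri))
    return triples


records = [
    ("http://example.org/record/1", "  54th   Massachusetts Infantry "),
    ("http://example.org/record/2", "Unknown Unit"),
]
print(publish(records, VOCAB))
```

Records that fail to reconcile are skipped rather than guessed at, so the published triples only contain links that the shared vocabulary supports.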
39. Raw Data Now...
• One of our inspirations for this sort of workflow:
• Data.gov Wiki from RPI
• http://data-gov.tw.rpi.edu/wiki
40. Google Refine
• A tool for large datasets, cleaning and reconciling
• http://code.google.com/p/google-refine/
• Extremely powerful, though its scripting language is not yet very well
documented.
• Enables you to reconcile data against the 20 million + known entities in
Freebase
41. Sandbox
• Depending on time and interest,
some possibilities
• Demo Refine, or break into small
groups to work with datasets
• Look at MQL/SPARQL queries as
the next step of interacting with
the Global Graph
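The core idea behind a graph query is matching a triple pattern in which some positions are left open. The sketch below is plain Python, not MQL or SPARQL; the `linksTo` predicate and the `match` helper are illustrative assumptions, and the data reuses the links from the earlier graph slide.

```python
# Links from the earlier graph slide, written as triples.
TRIPLES = [
    ("msulibraries", "linksTo", "lookbackmaps"),
    ("msulibraries", "linksTo", "internetarchive"),
    ("msulibraries", "linksTo", "librarycongress"),
    ("lookbackmaps", "linksTo", "internetarchive"),
    ("internetarchive", "linksTo", "librarycongress"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# "What does msulibraries link to?"
print(match(TRIPLES, subject="msulibraries"))
# "What links to internetarchive?"
print(match(TRIPLES, obj="internetarchive"))
```

A real query language adds variables shared across several patterns, which is what allows a single query to traverse multiple datasets.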
42. What Would You Do?
• Conceptualizing domains, Linked Open Data projects, collaborations, etc
43. Join the LODLAM movement
• http://groups.google.com/group/lod-lam
• #lodlam hashtag on Twitter
• http://lod-lam.net proceedings online and on the road for the next year at
various annual meetings and conferences
• Contribute!
It started for me with the book Linked, which was first published in 2002. I don’t think I read it until 2003 or so, but it changed my life. The explanations of mathematical graph and network theory in lay terms helped me to see how an understanding of interconnectedness would allow us to do amazing things with the disparate datasets around us.
Where did we get all that info about Ed? He published it here.