Developments in catalogues and data sharing

1. The document discusses the evolving nature of library catalogues as data is increasingly shared and consumed outside of traditional library systems. It explores how catalogue data is being transformed, merged with other data sources, and used in new ways.
2. Key points addressed include the release of library catalogue data using open standards like RDF and Linked Data, as well as initiatives to make metadata more accessible to developers and the public. Challenges around aging data formats and the need for more community involvement in metadata standards are also covered.
3. The future may include greater programmatic access to catalogue data through APIs, as well as new lightweight metadata schemas that better support open data practices and the needs of non-library users.

Our local catalogues
National / international aggregations
Joe Public
Teenage software developer / hacker
Booksellers
Web start-ups
Search engines
Wikipedia
Other libraries
Research group website

Why not share?

http://discovery.ac.uk

<h1 itemprop="name">The Cambridge companion to Spenser edited by Andrew Hadfield. [electronic resource] /</h1>
<span style="display: none;" itemprop="publisher">Cambridge University Press,</span>
<span style="display: none;" itemprop="datePublished">2001.</span>
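
The markup above is what a crawler or developer would read. As a minimal sketch of how a consumer might pull out the tagged values, the following uses only Python's standard `html.parser`. The class name `ItempropExtractor` is hypothetical, and the code assumes `itemprop` elements are never nested, which holds for this slide but not for microdata in general.

```python
from html.parser import HTMLParser


class ItempropExtractor(HTMLParser):
    """Collect the text content of every element carrying an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.items = {}        # itemprop name -> text content
        self._current = None   # itemprop currently being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("itemprop")
        if prop:
            self._current = prop
            self.items[prop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.items[self._current] += data

    def handle_endtag(self, tag):
        # Sketch only: assumes itemprop elements are not nested.
        self._current = None


# The markup from the slide, with straight quotes throughout.
SLIDE_HTML = (
    '<h1 itemprop="name">The Cambridge companion to Spenser '
    'edited by Andrew Hadfield. [electronic resource] /</h1> '
    '<span style="display: none;" itemprop="publisher">Cambridge University Press,</span> '
    '<span style="display: none;" itemprop="datePublished">2001.</span>'
)

extractor = ItempropExtractor()
extractor.feed(SLIDE_HTML)
extractor.close()
for prop, text in extractor.items.items():
    print(f"{prop}: {text.strip()}")
```

Note that the extracted values still carry ISBD punctuation (the trailing comma, full stop and slash), so a consumer would probably need a further clean-up step.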

http://www.lib.cam.ac.uk/api/voyager/newtonSearch.cgi?searchArg=darwin&databases=depfacaedb
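
For a developer, the point of a URL like this is that it can be assembled programmatically. The sketch below rebuilds the slide's URL from its two visible parameters, `searchArg` and `databases`. It makes no network request: the response format is not shown on the slide, and the service may no longer be available. The function name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint exactly as shown on the slide.
BASE_URL = "http://www.lib.cam.ac.uk/api/voyager/newtonSearch.cgi"


def build_search_url(search_arg: str, databases: str) -> str:
    """Return a search URL using only the two parameters visible on the slide."""
    query = urlencode({"searchArg": search_arg, "databases": databases})
    return f"{BASE_URL}?{query}"


url = build_search_url("darwin", "depfacaedb")
print(url)
```

Using `urlencode` rather than string concatenation means a search term containing spaces or ampersands is escaped correctly.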

Marc21 …
001 1000346
245$aEarly medieval history of Kashmir : $b[with special reference to the Loharas] A.D. 1003-1171 /

DC XML …
<dc:identifier>1000346</dc:identifier>
<dc:title>Early medieval history of Kashmir : [with special reference to the Loharas] A.D. 1003-1171</dc:title>

RDF triples …
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/title> "Early medieval history of Kashmir : [with special reference to the Loharas] A.D. 1003-1171"
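
The three forms on this slide can be read as successive transformations of one record. The following is a rough sketch of that pipeline, not the conversion actually used at Cambridge. The dictionary layout for the MARC record and the function names are assumptions made for illustration; the URI prefix is copied from the slide.

```python
from xml.sax.saxutils import escape

# Hypothetical in-memory form of the Marc21 fields shown on the slide.
record = {
    "001": "1000346",
    "245": {
        "a": "Early medieval history of Kashmir :",
        "b": "[with special reference to the Loharas] A.D. 1003-1171 /",
    },
}

# URI prefix as it appears in the slide's RDF example.
ENTRY_BASE = "http://data.lib.cam.ac.uk/id/entry/cambrdgedb_"


def title_from_245(field: dict) -> str:
    """Join subfields $a and $b and drop the trailing ISBD ' /'."""
    parts = [field.get(code, "").strip() for code in ("a", "b")]
    return " ".join(p for p in parts if p).rstrip(" /")


def to_dc_xml(rec: dict) -> str:
    """Render the identifier and title as Dublin Core XML elements."""
    return (
        f"<dc:identifier>{escape(rec['001'])}</dc:identifier>\n"
        f"<dc:title>{escape(title_from_245(rec['245']))}</dc:title>"
    )


def to_title_triple(rec: dict) -> str:
    """Render the title as a single N-Triples statement."""
    literal = title_from_245(rec["245"]).replace("\\", "\\\\").replace('"', '\\"')
    return f'<{ENTRY_BASE}{rec["001"]}> <http://purl.org/dc/terms/title> "{literal}" .'


print(to_dc_xml(record))
print(to_title_triple(record))
```

Each step discards some MARC-specific detail (subfield codes, ISBD punctuation) in exchange for a form that non-librarians are more likely to recognise.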

<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/title> "Early medieval history of Kashmir : [with special reference to the Loharas] A.D. 1003-1171" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/type> <http://data.lib.cam.ac.uk/id/type/1cb251ec0d568de6a929b520c4aed8d1> .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/type> <http://data.lib.cam.ac.uk/id/type/46657eb180382684090fda2b5670335d> .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/identifier> "UkCU1000346" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/issued> "1981" .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/creator> <http://data.lib.cam.ac.uk/id/entity/cambrdgedb_a5a6f7a184ff02e08b1befedc1b3a4d0> .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://purl.org/dc/terms/language> <http://id.loc.gov/vocabulary/iso639-2/eng> .
<http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346> <http://RDVocab.info/Elements/placeOfPublication> <http://id.loc.gov/vocabulary/countries/ii> .
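
To show how little code a developer needs to consume triples like these, here is a minimal sketch that reads a few of the statements and groups the objects by subject and predicate. The regular expression covers only the subset of N-Triples used on the slide (URIs and plain string literals, with no language tags, datatypes or blank nodes); a real application would more likely use an RDF library. The names `TRIPLE` and `parse_triples` are hypothetical.

```python
import re
from collections import defaultdict

# Matches: <subject> <predicate> (<object-uri> | "plain literal") .
TRIPLE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"((?:[^"\\]|\\.)*)")\s*\.\s*$'
)

ENTRY = "http://data.lib.cam.ac.uk/id/entry/cambrdgedb_1000346"

# Three of the statements from the slide, one per line.
SAMPLE = f"""\
<{ENTRY}> <http://purl.org/dc/terms/identifier> "UkCU1000346" .
<{ENTRY}> <http://purl.org/dc/terms/issued> "1981" .
<{ENTRY}> <http://purl.org/dc/terms/language> <http://id.loc.gov/vocabulary/iso639-2/eng> .
"""


def parse_triples(text: str) -> dict:
    """Return {(subject, predicate): [objects]} for the supported subset."""
    grouped = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"Unsupported or malformed triple: {line!r}")
        subject, predicate, uri, literal = match.groups()
        grouped[(subject, predicate)].append(uri if uri is not None else literal)
    return dict(grouped)


triples = parse_triples(SAMPLE)
for (subject, predicate), objects in triples.items():
    print(predicate, "->", objects)
```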

The Linking Open Data cloud diagram - http://richard.cyganiak.de/2007/10/lod

Open Data Commons Public Domain Dedication and License (PDDL)

df100$aBradford, Gamaliel$d1863 - 1932.

<authorParsed>
<surname>Bradford</surname>
<restOfName> Gamaliel</restOfName>
<birthDate>1863</birthDate>
<birthDateNormalised>18630101</birthDateNormalised>
<deathDate>1932</deathDate>
<deathDateNormalised>19320101</deathDateNormalised>
</authorParsed>
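
The step from the 100 field to the `authorParsed` element can be sketched as follows. This is an illustration under stated assumptions, not the parser used to produce the slide: the function name `parse_author` is hypothetical, the name is assumed to be in "Surname, Rest" form, and the normalised dates simply append `0101` to the year, as the slide's output appears to do. Unlike the slide, the sketch strips the leading space from `restOfName`.

```python
import re
import xml.etree.ElementTree as ET

# Optional four-digit years either side of a hyphen, e.g. "1863 - 1932." or "1950-".
DATES = re.compile(r"\s*(\d{4})?\s*-\s*(\d{4})?")


def parse_author(name_a: str, dates_d: str) -> ET.Element:
    """Build an authorParsed element from subfields $a (name) and $d (dates)."""
    surname, _, rest = name_a.partition(",")
    author = ET.Element("authorParsed")
    ET.SubElement(author, "surname").text = surname.strip()
    ET.SubElement(author, "restOfName").text = rest.strip()

    match = DATES.match(dates_d)
    if match:
        for label, year in zip(("birth", "death"), match.groups()):
            if year:
                ET.SubElement(author, f"{label}Date").text = year
                # Assumption: normalise a bare year to 1 January of that year.
                ET.SubElement(author, f"{label}DateNormalised").text = year + "0101"
    return author


author = parse_author("Bradford, Gamaliel", "1863 - 1932.")
print(ET.tostring(author, encoding="unicode"))
```

Real MARC name headings are far messier than this example (titles, fuller forms of name, approximate dates), which is part of why parsing them centrally and publishing the result is useful to outside developers.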

Karen Coyle criticises the Marc21 Bibliographic Framework Transition Initiative for not including museums, publishing, and IT professionals … She argues that our data is not just for us to consume alone …

“The next data carrier for libraries needs to be developed as a truly open effort. It should be led by a neutral organization (possibly ad hoc) that can bring together the wide range of interested parties and make sure that all voices are heard. Technical development should be done by computer professionals with expertise in metadata design. The resulting system should be rigorous yet flexible enough to allow growth and specialization.”

http://kcoyle.blogspot.com/2011/08/bibliographic-framework-transition.html

Editor's Notes

I’ll look at how our catalogues, and thus the data within them, have changed to meet the changing expectations of users, and at cataloguing to provide data that better serves the needs of machines (and the people who program them). This is also something of a reflection on the changes that have taken place in the last 10 years …
I’m trying to frame the next 40 minutes or so as a narrative.
When attempting to guess where we are going, it helps to take a step back. 1) To simplify things (a little): librarians and cataloguers used to have full control of their data and the way it was used (consumed). - We created it (or paid others to do so for us). - Our readers consumed it in our libraries, served via ledgers, card indexes and OPACs. - We had, and still have, policies and standards (AACR2, Marc21), procedures (LOC authority control), organisations (RLUK, OCLC) and technology (Z39.50, OPACs).
- We created and largely owned a closed ecosystem for our readers, our data and ourselves, and it worked. Through control of production, control of consumption and control of material (total ownership), we were successful. Closed ecosystems can still be successful today: just look at Apple. The concept itself is not dead, but I believe it could not last for libraries …
Along with estate agents, travel agents, government, landlords and bookshops, we needed to think again … We already had our own networks, but now there was a global one with a rapid pace of change. Whilst Apple could grow a new ecosystem, ours was under threat.
2) We slowly lost our place as the single prime authority for data. Commercial, social and academic discovery mechanisms gave our users other sources of information to turn to, and eventually other sources of content. We also had to cope with a growth in digital content as publishing shifted to digital. (This took a while to affect us: journals came first, and they were only a small part of our business, as analytical cataloguing was not standard practice.) This is resulting in massive changes in metadata and discovery usage …
The library was still in its bubble. Alternative discovery mechanisms and academic data and content sources suddenly existed alongside our sealed environment, all very heavily branded, very slick and constantly evolving. Some we pay for, some we contribute to, some we view as inferior competition, but they exist. All are legitimate means of discovering bibliographic material of interest to the researcher or the scholar, and they act as a direct alternative to our traditional model. All have their own data environments, standards, procedures and protocols, not necessarily ours. In light of this I argue that we could no longer maintain the closed ecosystem. To argue otherwise has become a fallacy, even in the mighty libraries of Oxford and Cambridge with their world-class special collections.
In the new environment come new users, termed Generation Y. Generation Y, it is argued, have grown up and worked outside our bubble all along, and are used to a very different mode of consumption for data and resources. They were born between 1984 and 1990, but I would argue the concept can be stretched way back, probably to anyone who has studied science since the mid to late 1990s … The Cambridge Arcadia report (2009) found: a preference for the search engine over the catalogue; online over in-building; trust in peers over librarians; continued respect for the library ‘brand’. All of this has led to a direct and open questioning of the purpose of the academic library, never mind the public one.
I’ll now very quickly go over how our services and interfaces have responded to this need for change, in points one and two, and what else is to come, in points three and four …
Lightweight, simple things. We started small: a libraries gateway; search boxes on the website, made a focus (catalogue pages used to be a single link in a wall of text); a new approach to online services. Don’t hide things, don’t post rules: ‘15 tips from those in the know’, where it used to be that we guarded our knowledge. When your library looks like a prison, this is pretty vital.
Keyword-based discovery services. Rich faceting. Greater linking. New ways to … The OPAC is dead? It is in your case, and I’m quite jealous … All of this is possible due to the richness of the data: our authority-controlled catalogue records generally work quite well in faceted environments, and we gain a competitive edge over folk whose data is not in such good shape. Catalogues are easier to pick up, easier to teach and provide a more cohesive experience, even if they don’t always work in the way we as librarians would like. Our data is still in use; it is valuable and relevant, partly as a result of these changes in interface. And I know this because, when you launched Solo a couple of years ago, some of your undergrads became our postgrads and told us what they thought of our interfaces.
But our data has evolved along with the catalogue. Enrichment is made possible by the use of identifiers, something we do very well. External data can be indexed alongside yours, which redefines what a record means and breaks down the concept of it as a composite entity. Catalogue records co-exist alongside data from other sources, as part of a framework of data.
And it’s not just supplier data. Nowadays we let readers into our catalogues: tags, public lists, reader reviews. There is dramatic growth in access points, and input from true subject specialists (i.e. at least those who have read the book). There is a lack of structure (well, our structure is still there) and no quality control. A compromise of sanctity? I would argue that in the academic sphere this is fairly self-weeding: few people are likely to ‘deface’ a record on a specific niche subject area. These features are also popular: users expect them, they are useful to them, and we are doing it.
Web scale: the resource discovery concept taken further, to large centralised indexes. In products such as Summon, library catalogue records are taken, merged, modified and stored as part of a central index of over 800 million items. WorldCat local works on these lines, as does Ebsco Discovery service. You use Primo central; it’s doing the same thing but not holding onto local content … yet. These are the catalogue interfaces students will be using to search your records. A recent development here lies in full-text enrichment from Hathi Trust: records for print material are being boosted with full text where possible. And I’m sure that, now they have the technology to do this, Serials Solutions are looking elsewhere for other sources of full text. Early days, but a real paradigm shifter. Full text is not necessarily a substitute for metadata, but if handled correctly in the keyword-based world it could blow things wide open … Or it could fail dreadfully, but this development is being explored by major vendors. A true evolution of the catalogue to the ‘net’.
Catalogue data now goes through several processes. The record you create is not always the record readers will see. The same goes for the way it is searched and accessed. Yet we still build it with the same rules and container formats as we did 20 years ago.
This may not always be to our tastes and practices as librarians, but in terms of reader experience it has provided a dramatic improvement in service quality, as readers have a fighting chance of understanding our interfaces. If all this works, we are perhaps in better shape, perhaps on the same page as some of the competition. At least for Generation Y, and for a fair few others, I believe we are now doing a better job …
What comes next is rather unknown, with no map to guide us. In management speak, we may be able to meet Generation Y’s use cases, but we also need to be ready for the use cases we’ve not yet thought of …
(Talk over black screen …) We have to stop and think about what has changed. We’ve lost the bubble around the library. Our old, successful closed data ecosystem has been rendered obsolete, or at least severely degraded, by competition on the web. As data and service providers, we’ve had to change and innovate to get back to somewhere on the same page. Much of this change has come from our technology partners, our suppliers and our vendors, as well as a growing community of open source software developers in and around libraries. It’s been a pretty painful few years and we’ve all had to play a lot of catch-up. To a certain extent, we are still on the back foot. To cope better in the future, we need to get better at handling change: we need to be quicker and more ready to evolve. Stepping totally away from the closed model, we cannot exist in isolation. As well as vendors and open source providers, we need help from outside the library community to prepare and innovate for the future. One way to encourage this help is by finding ways to make our catalogue data easier to share and for others to reuse. Even in its new form of a discovery service, the library catalogue is still a silo and still exists as something of a barrier to sharing. And despite the changes in interfaces we’ve gone through, the way we create, own and share data has until now largely carried on as normal … The practices here are as much of a barrier to sharing as the technology around the data. Let’s look at how most research libraries currently share data …
One way to prepare is to open up. We need to share and open up our raw data and make it easier for others to re-use. I would argue that each of these groups has as much right to our raw data as we do, and each would have different use cases for it. By and large, in the field of online services, I’m talking about software developers, but this applies in many areas. Allow others to innovate on our data on our behalf, think of those use cases and explore them.
And there is demand. This slide is based on the ideas of a certain Cambridge academic. Bibliographic data is linked to many aspects of teaching and research: citation lists, which measure output; shared bibliographies, the core of research group work; reading lists, the backbone of undergraduate teaching. Quality of data: in terms of consistency, accuracy and form, we are much easier to handle than museums and archives. All of this exists already, but not in an open, linked capacity that can be tied quickly and easily into other institutional and external services.
This is my colleague Katie’s write-up of a talk led by Owen Stephens. It really sums it all up …
Success of distributed access outside of cultural heritage: Amazon can put a lot of their success down to distributed marketing. When you discover an Amazon product, it’s not necessarily on the Amazon site, but they’ve shared their data in such a way that you can get to their catalogue. And a single means or point of discovery is a myth: our one-stop shops are actually our first-stop shops, or second-stop shops. It is unrealistic to expect everyone who may want to access your collections to come to you, to your interfaces and domain. And in case we are worried about selling out our ‘IP’: most of this data was funded by the taxpayer, and that includes businesses and web start-ups.
We are not alone in thinking like this. There is a national and international trend for the public release of data. We should get ourselves into this domain.
This is recognised nationally by the JISC, who earlier this year launched the Discovery initiative. The Oxford Text Archive contributed a project, we did one with catalogue data, and they are funding some very exciting work …
Search engines: we’ve opened our Aquabrowser catalogue up to web crawlers. The argument in the past has always been ‘why should we bother?’; I would argue ‘why should we withhold?’ We’ve actually been here for a while (Worldcat), but no-one is using it (or are they?). We used semantic tags in the HTML to indicate some structure: author, title, format, availability. It is still a commercial application designed for advertising, so not a great fit. Schema data is getting better. What is the use case? Google is designed to sell … For what it’s worth, Google have only taken 10%. So perhaps the primary reason for doing this was to shut people up! I think there are better ways to share data than simply letting the spiders in …
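To make the semantic tagging mentioned above concrete, here is a minimal sketch of how a crawler-side script might read such markup back out. It assumes schema.org-style `itemprop` attributes like those in the catalogue page snippet; the class and function names (`ItempropExtractor`, `extract_itemprops`) and the sample HTML are illustrative, not the catalogue's actual code.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "img", "meta", "link", "hr", "input"}


class ItempropExtractor(HTMLParser):
    """Collect the text inside every element that carries an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}  # itemprop name -> accumulated text
        self._open = []       # itemprop name (or None) for each open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        name = dict(attrs).get("itemprop")
        self._open.append(name)
        if name is not None:
            self.properties.setdefault(name, "")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to every itemprop element that is currently open.
        for name in {n for n in self._open if n}:
            self.properties[name] += data


def extract_itemprops(html):
    """Return a dict of itemprop name -> stripped text for one HTML fragment."""
    parser = ItempropExtractor()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.properties.items()}


# Illustrative fragment modelled on a catalogue record page.
SAMPLE = (
    '<h1 itemprop="name">The Cambridge companion to Spenser</h1>'
    '<span itemprop="publisher">Cambridge University Press,</span>'
    '<span itemprop="datePublished">2001.</span>'
)

if __name__ == "__main__":
    for name, text in extract_itemprops(SAMPLE).items():
        print(f"{name}: {text}")
```

Note that the extracted values still carry the cataloguing punctuation ("Press," and "2001."), which is exactly the kind of clean-up a consumer of this data is left to do.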
All semantic web talks include this. It grows every year; it will be interesting to see if the growth is sustained.
More access points to records. Better mechanisms for record enrichment. Revised cataloguing workflows: imagine LOC subject and name authority entries that simply update themselves. Access to developers.
Hard to understand and decode. Supporting ‘stack’ not up to scratch. No seriously compelling use case (yet). Other ways to provide linked data.
Whatever the format, there are common challenges in getting our data to them … Now for some of the challenges involved in getting there.
For this to work, we need to lift the legal barriers to sharing data: make it public, make it open, and open as defined by the wider Internet community. We have tended to have a lot of restrictions on record re-sharing in the past, but there has been a lot of movement in this area in the past two years. Most Cambridge data could be released under a permissive license. Europeana Digital Library approve CC0 licensing of data. OCLC are looking at attribution-only licensing. British Library BNB: Creative Commons ‘Zero’. Move away from ‘non-commercial’ wording.
No one involved in the library open data movement wants OCLC to go under. Record vendors are valued partners when you have truckloads of legal deposit turning up every week. Focus on sharing ‘non-Marc21’ formats, which are of greater use to the non-librarian. I think we are seeing a shift in the way we operate.
Based on a 40-year-old format. Based on a need to print a human-readable card. Syntax, vocabulary, fields and content are all intertwined. According to OCLC Research: only 10% of all Marc tags in Worldcat appear in 100% of all Worldcat records; 65% of tags appear in less than 1% of records.
Marc: data rich, structurally poor. If you as cataloguers agonise about where to put punctuation, we developers and hackers agonise about taking it out. It’s waste. We are not printing out card indexes anymore, so why use formats and standards designed for that purpose? Provide truly granular data and let the interface render it depending on the rules .. Very difficult to map to RDF and emerging standards: that $d in particular, especially when you use bs, ds, and cs … Mixed fields (text and numbers) (020$a). Duplication: author name in 100 and 245$c? Format: 100 record fields?
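As a rough illustration of the "taking it out" work described above, the sketch below strips trailing ISBD punctuation from subfield values and splits a mixed 020$a value into its number and its text qualifier. The function names, the regular expressions and the sample values (including the made-up ISBN) are assumptions for illustration only; real records need more careful rules, since a naive strip will also remove a meaningful full stop after an initial or an abbreviation.

```python
import re

# Trailing ISBD punctuation that cataloguing rules place at the end of a subfield.
TRAILING_PUNCTUATION = re.compile(r"\s*[/:;,=.]+\s*$")

# An ISBN-like number, optionally followed by a qualifier in parentheses.
ISBN_WITH_QUALIFIER = re.compile(r"^\s*([0-9Xx-]{10,17})\s*(?:\((.*)\))?\s*$")


def strip_isbd(value):
    """Remove trailing ISBD punctuation from a single subfield value (naive)."""
    return TRAILING_PUNCTUATION.sub("", value).strip()


def split_isbn(value):
    """Split an 020$a value into (number, qualifier).

    Returns (None, original text) when the value does not look like an ISBN.
    """
    match = ISBN_WITH_QUALIFIER.match(strip_isbd(value))
    if match is None:
        return None, value.strip()
    number = match.group(1).replace("-", "").upper()
    return number, match.group(2)


if __name__ == "__main__":
    # Sample values are illustrative; the ISBN is not a real one.
    for raw in ("The Cambridge companion to Spenser /",
                "Cambridge University Press,",
                "2001."):
        print(repr(raw), "->", repr(strip_isbd(raw)))
    print(split_isbn("0123456789 (hbk.) :"))
```

Storing the number and the qualifier separately in the first place would make this kind of parsing unnecessary, which is the point about granular data.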
Marc21 is binary encoded: hard to crack, and it needs specialised code libraries that few software developers are willing to learn or support. Most developers know how to deal with XML / JSON and are happy with it. Also, all those bs, ds, and cs? It’s really hard to understand and remember. It’s a dark art. To be understood easily, data needs to be human readable, not numbers for field names with a whole website dedicated to explaining them. To learn to code with library data, you technically need to learn to catalogue: an artificial barrier. Bad encoding allowed by Marc can crash whole systems when imported; XML would stop this.
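To show what "hard to crack" means in practice, here is a minimal sketch of decoding one record from the ISO 2709 transmission layout that Marc21 uses (a 24-byte leader, a directory of 12-byte entries, then the field data) into a plain structure that can be dumped as JSON. The function names and the sample record are hypothetical, the JSON shape is just one possible choice, and the sketch does no error handling; it is not a substitute for a maintained MARC library.

```python
import json

FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
SUBFIELD_DELIMITER = b"\x1f"


def encode_record(fields):
    """Build a tiny ISO 2709 record from (tag, bytes) pairs, for demonstration only."""
    directory, body = b"", b""
    for tag, data in fields:
        data += FIELD_TERMINATOR
        # Directory entry: tag (3 chars), field length (4), start offset (5).
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    directory += FIELD_TERMINATOR
    base_address = 24 + len(directory)
    length = base_address + len(body) + 1  # +1 for the record terminator
    leader = f"{length:05d}nam a22{base_address:05d} a 4500"
    return leader.encode("ascii") + directory + body + RECORD_TERMINATOR


def decode_record(raw):
    """Decode one ISO 2709 record into a dict of leader and fields."""
    leader = raw[:24].decode("ascii")
    base_address = int(leader[12:17])       # where the field data starts
    directory = raw[24:base_address - 1]    # drop the directory's terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        begin = base_address + start
        data = raw[begin:begin + length - 1]  # drop the field terminator
        if tag.startswith("00"):
            # Control fields hold a single string, no indicators or subfields.
            fields.append({"tag": tag, "value": data.decode("utf-8")})
        else:
            indicators, *parts = data.split(SUBFIELD_DELIMITER)
            subfields = [{"code": chr(p[0]), "value": p[1:].decode("utf-8")}
                         for p in parts if p]
            fields.append({"tag": tag,
                           "indicators": indicators.decode("ascii"),
                           "subfields": subfields})
    return {"leader": leader, "fields": fields}


# Illustrative record: a made-up control number and a 245 title field.
SAMPLE_FIELDS = [
    ("001", b"demo-0001"),
    ("245", b"14" + SUBFIELD_DELIMITER + b"aThe Cambridge companion to Spenser /"
            + SUBFIELD_DELIMITER + b"cedited by Andrew Hadfield."),
]

if __name__ == "__main__":
    record = decode_record(encode_record(SAMPLE_FIELDS))
    print(json.dumps(record, indent=2))
```

Even in this small form, the reader has to know about byte offsets, control characters and numeric tags before seeing a title, whereas the JSON output can be read and queried with tools most developers already have.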
The LOC Bibliographic Framework Transition declares a shift away from Marc. Delay in the introduction of RDA until we get a ‘better container’. No system vendor is going forward with Marc21 as the internal storage mechanism in their next generation of systems. They may allow you to write records in it. It will take 10+ years. What is to come next?
So despite the change, it’s my worry that those in charge of Marc21 and RDA developments are not thinking widely enough about the new open ecosystem which our data must inhabit.
If we don’t try and shift … it becomes easier to go to Amazon, who have awesome APIs, or even Google Books (theirs are rubbish). Our status as authoritative data providers will be further eroded. No-one will want to play with us if we do not share.
