Datasets Programme
June 2010



John Kaye – Lead Content Specialist Datasets
The British Library



                      Exists for everyone who wants to do
                      research – for academic, personal, and
                      commercial purposes.

                      Covers all subject areas – sciences,
                      technology, medicine, arts, humanities,
                      social sciences…

                      Receives a copy of every item
                      published in the UK.

                      Holds over 150 million items, with 3
                      million items added each year.

                      Used by over 16,000 people each day
                      (on site and online).


Data and the Digital Landscape



 Seismic measurements taken by a
 geologist.



 Genetic data collected by a medical
 researcher.



 A survey of public opinions collected
 by a sociologist.


The Foundation for Research




Data is a crucial component of the scholarly record.

Re-acquisition may be impossible

Datasets are essential to the British Library’s mission
to advance the World’s knowledge.




Currently…


There is:
  No effective way to link between datasets and articles;
  No widely used method to identify datasets;
  No widely used method to cite datasets.

As a result, datasets are:
  Difficult to discover;
  Difficult to access;
  In danger of being lost.




Datasets Strategy - Vision



Researchers can discover, access, adapt, reuse and
reference datasets in the course of their research

Researchers will be able to track the impact that their datasets
have and receive appropriate credit

The British Library will be an essential component of an
interconnected network of service providers

Datasets from all disciplines remain intact, discoverable,
useable and vital for future generations




The Datasets Programme



We envision a future where researchers can:

  Discover, access, reuse, and reference
  datasets.
  Track the impact of the data that they
  generate and receive appropriate credit.

Our approach is to:

  Provide a focus for the community to
  establish needs, requirements and
  agreement.
  Explore novel technology and creative
  solutions.

Projects – DataCite




              DataCite is an international consortium which
                aims to:

                Establish easier access to scientific research
                data on the Internet

                Increase acceptance of research data as
                legitimate, citable contributions to the scientific
                record

                Support data archiving that will permit results
                to be verified and re-purposed for future study


Projects – DataCite



Founded on 1 Dec 2009
Now 12 members from 9 different countries

German National Library of Science and Technology (TIB)
British Library (BL), UK
ETH Zurich Library, Switzerland
Institute for Scientific and Technical Information (INIST-CNRS), France
National Technical Information Center (DTIC), Denmark
TU Delft Library, Netherlands
Canada Institute for Scientific and Technical Information (CISTI)
Australian National Data Service (ANDS)
California Digital Library (CDL), USA
Purdue University Libraries (PUL), USA
German National Library of Medicine (ZB MED)
GESIS - Leibniz Institute of Social Sciences, Germany

Projects – DataCite




DataCite:
  Supports researchers by enabling them to locate,
  identify, and cite research datasets with
  confidence
  Supports data centres by providing workflows and
  standards for data publication
  Supports publishers by enabling research articles
  to be linked to the underlying data



A Key Component for Many Goals




[Diagram: "Persistent Identification" at the centre, surrounded by the goals it supports: Cite, Make Visible, Reuse, Find, Verify, Access, Track Impact]


Connecting an Article with the Underlying Data



URLs are not persistent

 (e.g. Wren JD: URL decay in
 MEDLINE- a 4-year follow-up
 study. Bioinformatics. 2008, Jun
 1;24(11):1381-5).




Digital Object Identifiers (DOIs)
 offer a solution

 Most widely used identifier for
 scientific articles
 Researchers, authors, publishers
 know how to use them
 Put datasets on the same playing
 field as articles

Example dataset citation:

 Yancheva et al (2007). Analyses
 on sediment of Lake Maar.
 PANGAEA.
 doi:10.1594/PANGAEA.587840
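
As a rough illustration of how a DOI puts a dataset on the same footing as an article, the sketch below turns the PANGAEA DOI cited on this slide into a resolver link and a citation string. It is a minimal Python sketch, not part of the original presentation: the function names (`normalise_doi`, `doi_to_url`, `format_dataset_citation`) are hypothetical, the citation layout simply mirrors the example on the slide, and the only external fact assumed is that DOIs resolve through the public resolver at https://doi.org/.

```python
from urllib.parse import quote

# Public DOI resolver; prepending it to a DOI gives a persistent link.
DOI_RESOLVER = "https://doi.org/"


def normalise_doi(raw: str) -> str:
    """Strip whitespace and an optional 'doi:' prefix (hypothetical helper)."""
    doi = raw.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Very light sanity check: DOIs start with '10.' and contain a '/'.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def doi_to_url(raw: str) -> str:
    """Build a resolver URL, percent-encoding any unsafe characters."""
    return DOI_RESOLVER + quote(normalise_doi(raw), safe="/")


def format_dataset_citation(creators, year, title, publisher, raw_doi):
    """Format a citation in the layout used on the slide (an assumption,
    not an official citation standard)."""
    return f"{creators} ({year}). {title}. {publisher}. doi:{normalise_doi(raw_doi)}"


if __name__ == "__main__":
    doi = "doi:10.1594/PANGAEA.587840"
    print(doi_to_url(doi))
    print(format_dataset_citation(
        "Yancheva et al", 2007, "Analyses on sediment of Lake Maar", "PANGAEA", doi
    ))
```

The link stays valid even if the data centre moves the dataset, because only the resolver's record needs updating, which is the persistence that plain URLs lack.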

The Cost of Visibility




        DOI registration and search results:        €0.01 – €1

        Storage, quality assurance, and metadata:   €50 – €500
                                                    (approx 1% of data creation cost)

        Harvesting and production:                  €5,000 – €5,000,000



Projects – Search Our Catalogue




Social Science Collections and Research Datasets Strategy



 Content
   Continue to build existing content (print and electronic): OECD, World
   Bank, UN etc
   Enhance links to Economic and Social Data Service: International
   Government, Longitudinal, Qualidata
 Partnerships
   Key partners: UK Data Archive, ONS, The National Archives
   Involved in UK Data Forum; signatory to National Data Strategy for
   Economic and Social Data
 Resource discovery
   Resource/ user guides – add value to SSCR projects
   Dataset Cataloguing
   Census 2011 exhibition
 Capacity building
   Datasets Content Lead Recruited
   Training for Reference Team (Social Science, Science)


Challenges to Explore




  Long-term preservation of data

  Standards for data citation and metadata

  Methods for assuring quality and integrity of data

  Attribution and credit for data producers

  Effective discovery and accessibility




John Kaye
Lead Content Specialist – Datasets
Social Science Collections and Research
The British Library
96 Euston Road London
NW1 2DB

Telephone: 020 7412 7450
Email: john.kaye@bl.uk

datasets@bl.uk





British Library Datasets Programme 2010
