HATHITRUST
A Shared Digital Repository
HathiTrust:
An Above Campus Solution
Sarah Michalak
RLUK Birmingham
November 14, 2014
12/18/2014
Today’s Discussion - HathiTrust
• Mission and partnership
• Collections
• Services
• HathiTrust Research Center
• Benefits for Libraries
The Name
• The meaning behind the name
• Hathi (hah-tee)--Hindi for elephant
• Never forgets
• Full of wisdom
• Secure
• Trustworthy
• Big, strong
The Mission and Partnership
Mission
To contribute to the common good by collecting, organizing, preserving,
communicating, and sharing the record of human knowledge.
Efforts include, but are not limited to
…building comprehensive collections co-owned and managed by
partners.
…enabling access by users with print disabilities.
…supporting computational research with the collections.
…stimulating shared collection storage strategies among libraries.
HathiTrust Members
Allegheny College
Arizona State University
Baylor University
Boston College
Boston University
Brandeis University
Brown University
California Digital Library
Carnegie Mellon University
Colby College
Columbia University
Cornell University
Dartmouth College
Duke University
Emory University
Florida State University**
Getty Research Institute
Harvard University Library
Indiana University
Iowa State University
Johns Hopkins University
Kansas State University
Lafayette College
Library of Congress
Massachusetts Institute of Technology
McGill University
Michigan State University
Montana State University
Mount Holyoke College
New York Public Library
New York University
North Carolina Central University
North Carolina State University
Northwestern University
The Ohio State University
The Pennsylvania State University
Princeton University
Purdue University
Rutgers University
Stanford University
Syracuse University
Temple University
Texas A&M University
Texas Tech
Tufts University
Universidad Complutense de Madrid
University of Alabama
University of Alberta
University of Arizona
University of British Columbia
University of Calgary
University of California
Berkeley
Davis
Irvine
Los Angeles
Merced
Riverside
San Diego
San Francisco
Santa Barbara
Santa Cruz
The University of Chicago
University of Connecticut
University of Delaware
University of Florida
University of Houston
University of Illinois
University of Illinois at Chicago
The University of Iowa**
University of Kansas**
University of Maine
University of Maryland**
University of Massachusetts, Amherst
University of Miami
University of Michigan
University of Minnesota**
University of Missouri**
University of Nebraska-Lincoln**
University of New Mexico
The University of North Carolina at Chapel Hill**
University of Notre Dame
University of Oklahoma
University of Pennsylvania
University of Pittsburgh
University of Queensland
University of Tennessee, Knoxville**
University of Texas
University of Utah
University of Vermont
University of Virginia**
University of Washington**
University of Wisconsin-Madison**
Utah State University**
Vanderbilt University
Virginia Tech
Wake Forest University
Washington University
Yale University Library
November 3, 2014 7
How Are Costs Shared?
• Public domain volumes: All partners share in infrastructure costs
for each item.
• In copyright volumes: Partners share costs based on their
holdings.
• Infrastructure cost: ~$0.168 per volume per year.
• All partners pay an additional amount above costs to fund new
programs and investigations.
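As a rough illustration of the cost model above, the sketch below estimates each partner's annual infrastructure share. It is only a sketch under stated assumptions: the slide does not give the exact formula, so the equal split among partners (public domain) and among holders (in copyright) is assumed, and the function name `annual_shares`, the partner names, and the volume counts are hypothetical. Only the ~$0.168 figure comes from the slide.

```python
COST_PER_VOLUME = 0.168  # USD per volume per year (approximate figure from the slide)


def annual_shares(partners, public_domain_volumes, in_copyright_holders):
    """Estimate each partner's annual infrastructure share in USD.

    partners: list of partner names.
    public_domain_volumes: count of public domain volumes in the repository.
    in_copyright_holders: one set per in-copyright volume, naming the
        partners that hold the corresponding print volume.
    """
    shares = {p: 0.0 for p in partners}

    # Public domain: all partners share the cost of every volume.
    # ASSUMPTION: the split is equal across partners.
    pd_each = public_domain_volumes * COST_PER_VOLUME / len(partners)
    for p in partners:
        shares[p] += pd_each

    # In copyright: only partners holding the volume share its cost.
    # ASSUMPTION: the split is equal across holders; volumes with no
    # holder are skipped in this sketch.
    for holders in in_copyright_holders:
        if not holders:
            continue
        for p in holders:
            shares[p] += COST_PER_VOLUME / len(holders)

    return shares


# Hypothetical example: three partners, 300 public domain volumes,
# four in-copyright volumes with overlapping holdings.
shares = annual_shares(
    ["A", "B", "C"],
    300,
    [{"A"}, {"A", "B"}, {"B", "C"}, {"A", "B", "C"}],
)
for name, amount in sorted(shares.items()):
    print(f"{name}: ${amount:.3f} per year")
```

Under these assumptions, a partner whose print holdings overlap more of the in-copyright corpus pays more, while the public domain cost is the same for everyone.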
Collections and Access
HATHITRUST.ORG
12.5 million total volumes
6.4 million book titles
327,000 serial titles
575,889 government publications
4.6 million volumes in the public domain
(~37%)
Link takes you to HathiTrust
Records loaded into DPLA, local library
catalogs, and commercial databases
Collective Stewardship
• Leverage expertise across institutions
• Distributed Functions and Services
• Preservation repository and access services
• University of Michigan
• Mirror site: Indiana University
• Metadata management services
• California Digital Library
• HathiTrust Research Center
• Indiana University and University of Illinois
5 November 2014 13
Collection Sources
Michigan, 37.54%
California, 28.63%
Harvard, 6.15%
Wisconsin, 4.47%
Indiana, 4.19%
Cornell, 4.02%
Illinois (UC), 2.45%
NYPL, 2.35%
Princeton, 2.02%
PSU, 1.19%
Minnesota, 1.11%
Universidad Complutense, 0.92%
LoC, 0.87%
Keio, 0.72%
Columbia, 0.52%
Northwestern, 0.45%
Ohio State, 0.42%
Chicago, 0.41%
Virginia, 0.41%
Purdue, 0.38%
Yale, 0.19%
UNC Chapel Hill, 0.14%
Getty Research Institute, 0.13%
Massachusetts, 0.09%
Florida, 0.08%
Duke, 0.06%
Connecticut, 0.04%
Boston College, 0.03%
NC State, 0.03%
McGill, 0.01%
Texas A&M, 0.01%
Alberta, < 0.01%
Delaware, < 0.01%
Utah State, < 0.01%
Dates
2000-2009, 10%
1990-1999, 14%
1980-1989, 14%
1970-1979, 13%
1960-1969, 11%
1950-1959, 6%
1940-1949, 4%
1930-1939, 4%
1920-1929, 4%
1910-1919, 4%
1900-1909, 4%
1850-1899, 10%
1800-1849, 3%
1700-1799, 0.01%
1600-1699, 0.01%
1500-1599, 0.07%
0-1500, 0.04%
Language Distribution (1)
The top 10 languages make up ~87% of all content
English, 49%
German, 9%
French, 7%
Spanish, 5%
Chinese, 4%
Russian, 4%
Japanese, 3%
Italian, 3%
Arabic, 2%
Latin, 1%
Remaining Languages, 13%
Language Distribution (2)
Portuguese, 7%
Polish, 7%
Dutch, 5%
Hebrew, 5%
Hindi, 5%
Indonesian, 4%
Korean, 4%
Swedish, 4%
Thai, 3%
Urdu, 3%
Turkish, 3%
Danish, 3%
Czech, 3%
Croatian, 3%
Persian, 2%
Tamil, 2%
Hungarian, 2%
Bengali, 2%
Norwegian, 2%
Sanskrit, 2%
Greek, Modern (1453-), 2%
Vietnamese, 1%
Ukrainian, 1%
Serbian, 1%
Bulgarian, 1%
Greek, Ancient (to 1453), 1%
Armenian, 1%
Romanian, 1%
Marathi, 1%
Panjabi, 1%
Telugu, 1%
Catalan, 1%
Malay, 1%
Multiple languages, 1%
Malayalam, 1%
Finnish, 1%
Slovak, 1%
Slovenian, 1%
Turkish, Ottoman, 1%
Yiddish, 1%
Nepali, 0%
The next 40 languages make up ~12% of total
Copyright Distribution
In Copyright or undetermined, 63%
Public Domain Worldwide, 21%
US Government Documents, 5%
Public Domain (US), 11%
Open Access, 0.06%
Creative Commons, 0.06%
“Public domain”, 38%
Services
10 September, 2014 | 20
Preservation with Access
• Preservation
– TRAC-certified
– Long-term commitments to preserve digital content facilitate planning,
decision-making
• Discovery
– Bibliographic and full-text search of all materials
– Mechanisms for local loading of records
• Access and Use
– Full text search (all users)
– Public domain and open access works (all users)
– Collections and APIs (all users)
– Lawful uses of in-copyright works (members)
Access: Lawful uses of
in-copyright works
• Sensitive to multiple legal regimes
– Full-text search (everyone everywhere)
– Access to users who have print disabilities (through member proxy in
US, and where law permits)**
– Access works that are damaged or missing and also out of print and
unavailable (members in US only)
**Terms and conditions at http://www.hathitrust.org/access_use#ic-access
Collective Action: Copyright Review
• Copyright Review Management System
– Systematic manual review of copyright registrations to determine
status of portions of the HathiTrust Collection
– CRMS US: Published in US, 1923-1963
• 316,396 reviewed / 166,753 PD (~53%)
– CRMS-World: Published in UK (1874-1944), Canada, Australia (1894-1964)
• 145,804 reviewed / 75,775 PD-world (~52%)
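The approximate public domain rates quoted for the two review projects can be rechecked from the counts on the slide. The snippet below is a minimal sketch of that arithmetic; the helper name `public_domain_rate` is hypothetical and not part of any HathiTrust tool.

```python
def public_domain_rate(reviewed, public_domain):
    """Return the share of reviewed volumes found to be public domain, in percent."""
    return 100.0 * public_domain / reviewed


# Counts as reported on the slide.
crms_us = public_domain_rate(316_396, 166_753)
crms_world = public_domain_rate(145_804, 75_775)

print(f"CRMS-US:    {crms_us:.1f}% public domain")
print(f"CRMS-World: {crms_world:.1f}% public domain")
```

Rounded to whole percentages, both results agree with the ~53% and ~52% figures given above.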
21 October 2014 22
HathiTrust Research Center
• http://www.hathitrust.org/htrc
• Operated by the University of Illinois, Urbana-Champaign and
Indiana University, with additional financial support from
HathiTrust.
• Co-led by Beth Plale (Indiana) and Stephen Downie (Illinois).
• Goal: enable researchers worldwide to carry out
computational investigation of the HT repository.
Aims of the HTRC
• Focus on developing services to researchers
• Develop model for access: the ‘workset’
• Develop tools that facilitate research by digital humanities and
informatics communities
• Develop secure cyberinfrastructure that allows computational
investigation of the entire HathiTrust repository, both copyrighted
and public domain
Example Projects Supported by HTRC
• Muñoz, Trevor, University of Maryland. “Distributed Metadata Correction and Annotation.”
– Correction, annotation and enhancement of HT records and export as linked data
• Page, Kevin, Oxford University. “ElEPHãT: Early English Print in HathiTrust, a Linked Semantic
Workset Prototype”
– Development of secondary worksets based on both HT and the Early English Books Online Text
Creation Partnership (EEBO-TCP).
• Burton, Vernon. “The South as ‘Other,’ the Southerner as ‘Stranger.’”
– Explore how attitudes expressed in print about slavery, southerners, and non-southerners have
changed over both time and space.
• Ted Underwood, Associate Professor of English at the University of Illinois, Urbana-Champaign.
– Using public domain texts received from HathiTrust to explore changing relationships in literary
genres from 1700-1899.
HathiTrust overall benefits to libraries
• Digital Curation
– Drive costs down
– Reduce “bibliographic indeterminacy”
– Make meaningful decisions about formats and quality
– Increase discoverability, use
– Consolidate development talent
– Improve strength of archiving
• Print Curation
– Means to associate our print holdings
– Coordinated record-keeping
• Subsidiary benefits
– Quantify problems
– Collective attention to solving shared problems
– Understanding relationship between collective and local
Benefits for UNC-Chapel Hill
• Preservation solution for UNC digitized books and journals.
• Online access to hundreds of thousands of titles we do not
have in our collection.
• Live links to Hathi materials in our catalog are a convenience for
users and enrich our collections.
• Hathi-led “community developments” provide tools and
expertise we might not have otherwise.
• Digital humanities scholars and other researchers have the
benefit of computational research over the large-scale corpus.
The HathiTrust Digital Library
Large Scale Digital Preservation and Access
For the Public Good

More Related Content

What's hot

Scottish Open Education Declaration
Scottish Open Education Declaration Scottish Open Education Declaration
Scottish Open Education Declaration Lorna Campbell
 
Open scholarship : a US research library view in 2014 – Jisc and CNI conferen...
Open scholarship: a US research library view in 2014 – Jisc and CNI conferen...Open scholarship: a US research library view in 2014 – Jisc and CNI conferen...
Open scholarship : a US research library view in 2014 – Jisc and CNI conferen...Jisc
 
Introduction to the University Data Library and national data services
Introduction to the University Data Library and national data servicesIntroduction to the University Data Library and national data services
Introduction to the University Data Library and national data servicesEDINA, University of Edinburgh
 
The Clarke Studios Collection in TCD Library: A study in collaboration - Mar...
The Clarke Studios Collection in TCD Library:  A study in collaboration - Mar...The Clarke Studios Collection in TCD Library:  A study in collaboration - Mar...
The Clarke Studios Collection in TCD Library: A study in collaboration - Mar...CONUL Conference
 
IFLA ARL Webinar Series: Digital Preservation - Managing Publications and Dat...
IFLA ARL Webinar Series: Digital Preservation - Managing Publications and Dat...IFLA ARL Webinar Series: Digital Preservation - Managing Publications and Dat...
IFLA ARL Webinar Series: Digital Preservation - Managing Publications and Dat...IFLAAcademicandResea
 
Digital transformations: new challenges for the arts and humanities - Andrew ...
Digital transformations: new challenges for the arts and humanities - Andrew ...Digital transformations: new challenges for the arts and humanities - Andrew ...
Digital transformations: new challenges for the arts and humanities - Andrew ...Jisc
 
LinkedUp at Mozilla Festival Science Fair
LinkedUp at Mozilla Festival Science FairLinkedUp at Mozilla Festival Science Fair
LinkedUp at Mozilla Festival Science FairMarieke Guy
 
Creating a new collaborative future: the evolving role of libraries in today’...
Creating a new collaborative future: the evolving role of libraries in today’...Creating a new collaborative future: the evolving role of libraries in today’...
Creating a new collaborative future: the evolving role of libraries in today’...Jisc
 
Contributing to the global commons: Repositories and Wikimedia
Contributing to the global commons: Repositories and WikimediaContributing to the global commons: Repositories and Wikimedia
Contributing to the global commons: Repositories and WikimediaNick Sheppard
 
Digital Publishing in the Arts and Humanities
Digital Publishing in the Arts and HumanitiesDigital Publishing in the Arts and Humanities
Digital Publishing in the Arts and Humanitiesmattphillpott
 
Open access, universities as publishers - Jisc Digital Festival 2015
Open access, universities as publishers - Jisc Digital Festival 2015Open access, universities as publishers - Jisc Digital Festival 2015
Open access, universities as publishers - Jisc Digital Festival 2015Jisc
 
The Open Education Working Group: Bringing people and projects together
The Open Education Working Group: Bringing people and projects togetherThe Open Education Working Group: Bringing people and projects together
The Open Education Working Group: Bringing people and projects togetherMarieke Guy
 
Archaeological Training in an Open Access World: Lessons from the REWARD Proj...
Archaeological Training in an Open Access World: Lessons from the REWARD Proj...Archaeological Training in an Open Access World: Lessons from the REWARD Proj...
Archaeological Training in an Open Access World: Lessons from the REWARD Proj...ariadnenetwork
 
What are the key issues and opportunities in digital scholarship, and how sho...
What are the key issues and opportunities in digital scholarship, and how sho...What are the key issues and opportunities in digital scholarship, and how sho...
What are the key issues and opportunities in digital scholarship, and how sho...Stuart Dempster
 
LinkedUp - European Data Forum
LinkedUp - European Data ForumLinkedUp - European Data Forum
LinkedUp - European Data ForumMarieke Guy
 
What's Welsh for Crowdsourcing?: Citizen Science at the National Library of W...
What's Welsh for Crowdsourcing?: Citizen Science at the National Library of W...What's Welsh for Crowdsourcing?: Citizen Science at the National Library of W...
What's Welsh for Crowdsourcing?: Citizen Science at the National Library of W...lorna_hughes
 
Sustainable support for OER at the University of Edinburgh
Sustainable support for OER at the University of EdinburghSustainable support for OER at the University of Edinburgh
Sustainable support for OER at the University of EdinburghNick Sheppard
 

What's hot (20)

Scottish Open Education Declaration
Scottish Open Education Declaration Scottish Open Education Declaration
Scottish Open Education Declaration
 
Open scholarship : a US research library view in 2014 – Jisc and CNI conferen...
Open scholarship: a US research library view in 2014 – Jisc and CNI conferen...Open scholarship: a US research library view in 2014 – Jisc and CNI conferen...
Open scholarship : a US research library view in 2014 – Jisc and CNI conferen...
 
Introduction to the University Data Library and national data services
Introduction to the University Data Library and national data servicesIntroduction to the University Data Library and national data services
Introduction to the University Data Library and national data services
 
The Clarke Studios Collection in TCD Library: A study in collaboration - Mar...
The Clarke Studios Collection in TCD Library:  A study in collaboration - Mar...The Clarke Studios Collection in TCD Library:  A study in collaboration - Mar...
The Clarke Studios Collection in TCD Library: A study in collaboration - Mar...
 
IFLA ARL Webinar Series: Digital Preservation - Managing Publications and Dat...
IFLA ARL Webinar Series: Digital Preservation - Managing Publications and Dat...IFLA ARL Webinar Series: Digital Preservation - Managing Publications and Dat...
IFLA ARL Webinar Series: Digital Preservation - Managing Publications and Dat...
 
Digital transformations: new challenges for the arts and humanities - Andrew ...
Digital transformations: new challenges for the arts and humanities - Andrew ...Digital transformations: new challenges for the arts and humanities - Andrew ...
Digital transformations: new challenges for the arts and humanities - Andrew ...
 
EDINA Supporting Digital Research
EDINA Supporting Digital ResearchEDINA Supporting Digital Research
EDINA Supporting Digital Research
 
LinkedUp at Mozilla Festival Science Fair
LinkedUp at Mozilla Festival Science FairLinkedUp at Mozilla Festival Science Fair
LinkedUp at Mozilla Festival Science Fair
 
Creating a new collaborative future: the evolving role of libraries in today’...
Creating a new collaborative future: the evolving role of libraries in today’...Creating a new collaborative future: the evolving role of libraries in today’...
Creating a new collaborative future: the evolving role of libraries in today’...
 
Contributing to the global commons: Repositories and Wikimedia
Contributing to the global commons: Repositories and WikimediaContributing to the global commons: Repositories and Wikimedia
Contributing to the global commons: Repositories and Wikimedia
 
Digital Publishing in the Arts and Humanities
Digital Publishing in the Arts and HumanitiesDigital Publishing in the Arts and Humanities
Digital Publishing in the Arts and Humanities
 
Open access, universities as publishers - Jisc Digital Festival 2015
Open access, universities as publishers - Jisc Digital Festival 2015Open access, universities as publishers - Jisc Digital Festival 2015
Open access, universities as publishers - Jisc Digital Festival 2015
 
David Price, UCL #RLUK14
David Price, UCL #RLUK14David Price, UCL #RLUK14
David Price, UCL #RLUK14
 
The Open Education Working Group: Bringing people and projects together
The Open Education Working Group: Bringing people and projects togetherThe Open Education Working Group: Bringing people and projects together
The Open Education Working Group: Bringing people and projects together
 
Archaeological Training in an Open Access World: Lessons from the REWARD Proj...
Archaeological Training in an Open Access World: Lessons from the REWARD Proj...Archaeological Training in an Open Access World: Lessons from the REWARD Proj...
Archaeological Training in an Open Access World: Lessons from the REWARD Proj...
 
What are the key issues and opportunities in digital scholarship, and how sho...
What are the key issues and opportunities in digital scholarship, and how sho...What are the key issues and opportunities in digital scholarship, and how sho...
What are the key issues and opportunities in digital scholarship, and how sho...
 
Repositories Update (UK)
Repositories Update (UK) Repositories Update (UK)
Repositories Update (UK)
 
LinkedUp - European Data Forum
LinkedUp - European Data ForumLinkedUp - European Data Forum
LinkedUp - European Data Forum
 
What's Welsh for Crowdsourcing?: Citizen Science at the National Library of W...
What's Welsh for Crowdsourcing?: Citizen Science at the National Library of W...What's Welsh for Crowdsourcing?: Citizen Science at the National Library of W...
What's Welsh for Crowdsourcing?: Citizen Science at the National Library of W...
 
Sustainable support for OER at the University of Edinburgh
Sustainable support for OER at the University of EdinburghSustainable support for OER at the University of Edinburgh
Sustainable support for OER at the University of Edinburgh
 

Similar to Sarah Michalak, HathiTrust #RLUK14

Exploring Perpetual Access
Exploring Perpetual AccessExploring Perpetual Access
Exploring Perpetual AccessNASIG
 
Research into Practice case study 2: Library linked data implementations an...
	Research into Practice case study 2:  Library linked data implementations an...	Research into Practice case study 2:  Library linked data implementations an...
Research into Practice case study 2: Library linked data implementations an...Hazel Hall
 
Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Robin Rice
 
Building Capacities and Communities for Digital Scholarship: The "Digging Dee...
Building Capacities and Communities for Digital Scholarship: The "Digging Dee...Building Capacities and Communities for Digital Scholarship: The "Digging Dee...
Building Capacities and Communities for Digital Scholarship: The "Digging Dee...Harriett Green
 
DYAS: The Greek Research Infrastructure Network for the Humanities
DYAS: The Greek Research Infrastructure Network for the HumanitiesDYAS: The Greek Research Infrastructure Network for the Humanities
DYAS: The Greek Research Infrastructure Network for the Humanitiesariadnenetwork
 
Gore lyrasis dpla-2
Gore lyrasis dpla-2Gore lyrasis dpla-2
Gore lyrasis dpla-2Regan Harper
 
Oregon Explorer: the evolution of a natural resources digital library at OSU
Oregon Explorer: the evolution of a natural resources digital library at OSUOregon Explorer: the evolution of a natural resources digital library at OSU
Oregon Explorer: the evolution of a natural resources digital library at OSUInstitute for Natural Resources
 
National Library of Finland - open source solutions in the development of nat...
National Library of Finland - open source solutions in the development of nat...National Library of Finland - open source solutions in the development of nat...
National Library of Finland - open source solutions in the development of nat...Mindtrek
 
Report on the Rethinking Resource Sharing Initiative
Report on the Rethinking Resource Sharing InitiativeReport on the Rethinking Resource Sharing Initiative
Report on the Rethinking Resource Sharing Initiativekramsey
 
Next Steps for IMLS's National Digital Platform
Next Steps for IMLS's National Digital PlatformNext Steps for IMLS's National Digital Platform
Next Steps for IMLS's National Digital PlatformTrevor Owens
 
Challenges for researchers in the Digital Humanities
Challenges for researchers in the Digital HumanitiesChallenges for researchers in the Digital Humanities
Challenges for researchers in the Digital HumanitiesLIBIS
 
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020OpenAIRE
 
Change Management for Libraries
Change Management for LibrariesChange Management for Libraries
Change Management for LibrariesThomas King
 
Digital collections and humanities research
Digital collections and humanities researchDigital collections and humanities research
Digital collections and humanities researchHarriett Green
 
Manage it locally to share it globally: RDM and Wikimedia Commons
Manage it locally to share it globally: RDM and Wikimedia CommonsManage it locally to share it globally: RDM and Wikimedia Commons
Manage it locally to share it globally: RDM and Wikimedia CommonsNick Sheppard
 
Europeana Newspapers ICT2013 networking session
Europeana Newspapers ICT2013 networking sessionEuropeana Newspapers ICT2013 networking session
Europeana Newspapers ICT2013 networking sessionEuropeana Newspapers
 

Similar to Sarah Michalak, HathiTrust #RLUK14 (20)

NISO Virtual Conference: Web-Scale Discovery Services: Transforming Access to...
NISO Virtual Conference: Web-Scale Discovery Services: Transforming Access to...NISO Virtual Conference: Web-Scale Discovery Services: Transforming Access to...
NISO Virtual Conference: Web-Scale Discovery Services: Transforming Access to...
 
Exploring Perpetual Access
Exploring Perpetual AccessExploring Perpetual Access
Exploring Perpetual Access
 
Research into Practice case study 2: Library linked data implementations an...
	Research into Practice case study 2:  Library linked data implementations an...	Research into Practice case study 2:  Library linked data implementations an...
Research into Practice case study 2: Library linked data implementations an...
 
Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...
 
Building Capacities and Communities for Digital Scholarship: The "Digging Dee...
Building Capacities and Communities for Digital Scholarship: The "Digging Dee...Building Capacities and Communities for Digital Scholarship: The "Digging Dee...
Building Capacities and Communities for Digital Scholarship: The "Digging Dee...
 
DYAS: The Greek Research Infrastructure Network for the Humanities
DYAS: The Greek Research Infrastructure Network for the HumanitiesDYAS: The Greek Research Infrastructure Network for the Humanities
DYAS: The Greek Research Infrastructure Network for the Humanities
 
Gore lyrasis dpla-2
Gore lyrasis dpla-2Gore lyrasis dpla-2
Gore lyrasis dpla-2
 
Oregon Explorer: the evolution of a natural resources digital library at OSU
Oregon Explorer: the evolution of a natural resources digital library at OSUOregon Explorer: the evolution of a natural resources digital library at OSU
Oregon Explorer: the evolution of a natural resources digital library at OSU
 
National Library of Finland - open source solutions in the development of nat...
National Library of Finland - open source solutions in the development of nat...National Library of Finland - open source solutions in the development of nat...
National Library of Finland - open source solutions in the development of nat...
 
Report on the Rethinking Resource Sharing Initiative
Report on the Rethinking Resource Sharing InitiativeReport on the Rethinking Resource Sharing Initiative
Report on the Rethinking Resource Sharing Initiative
 
David ppt
David pptDavid ppt
David ppt
 
Next Steps for IMLS's National Digital Platform
Next Steps for IMLS's National Digital PlatformNext Steps for IMLS's National Digital Platform
Next Steps for IMLS's National Digital Platform
 
Challenges for researchers in the Digital Humanities
Challenges for researchers in the Digital HumanitiesChallenges for researchers in the Digital Humanities
Challenges for researchers in the Digital Humanities
 
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
 
Change Management for Libraries
Change Management for LibrariesChange Management for Libraries
Change Management for Libraries
 
Digital collections and humanities research
Digital collections and humanities researchDigital collections and humanities research
Digital collections and humanities research
 
Bibliotheca Digitalis Summer School: Humanities at Scale and Dariah-EU - Nico...
Bibliotheca Digitalis Summer School: Humanities at Scale and Dariah-EU - Nico...Bibliotheca Digitalis Summer School: Humanities at Scale and Dariah-EU - Nico...
Bibliotheca Digitalis Summer School: Humanities at Scale and Dariah-EU - Nico...
 
Manage it locally to share it globally: RDM and Wikimedia Commons
Manage it locally to share it globally: RDM and Wikimedia CommonsManage it locally to share it globally: RDM and Wikimedia Commons
Manage it locally to share it globally: RDM and Wikimedia Commons
 
Digitization and public libraries
Digitization and public librariesDigitization and public libraries
Digitization and public libraries
 
Europeana Newspapers ICT2013 networking session
Europeana Newspapers ICT2013 networking sessionEuropeana Newspapers ICT2013 networking session
Europeana Newspapers ICT2013 networking session
 

More from ResearchLibrariesUK

Examining the Potential Role of ...
Examining the Potential Role of ...Examining the Potential Role of ...
Examining the Potential Role of ...ResearchLibrariesUK
 
DCDC16 | Joining the dots: projects on conservation and research of Malian wr...
DCDC16 | Joining the dots: projects on conservation and research of Malian wr...DCDC16 | Joining the dots: projects on conservation and research of Malian wr...
DCDC16 | Joining the dots: projects on conservation and research of Malian wr...ResearchLibrariesUK
 
RLUK Warwick Meeting | Iron Mountain, Jeremy Suratt
RLUK Warwick Meeting | Iron Mountain, Jeremy SurattRLUK Warwick Meeting | Iron Mountain, Jeremy Suratt
RLUK Warwick Meeting | Iron Mountain, Jeremy SurattResearchLibrariesUK
 
RLUK Warwick Meeting | Academic Book of the Future, Samantha Rayner
RLUK Warwick Meeting | Academic Book of the Future, Samantha RaynerRLUK Warwick Meeting | Academic Book of the Future, Samantha Rayner
RLUK Warwick Meeting | Academic Book of the Future, Samantha RaynerResearchLibrariesUK
 
Warwick Library Symposium | Cathrine Harboe-Ree and David Groenewegen
Warwick Library Symposium |  Cathrine Harboe-Ree and  David GroenewegenWarwick Library Symposium |  Cathrine Harboe-Ree and  David Groenewegen
Warwick Library Symposium | Cathrine Harboe-Ree and David GroenewegenResearchLibrariesUK
 
Sarah Michalak, HathiTrust #RLUK14

  • 1. HATHITRUST A Shared Digital Repository HathiTrust: An Above Campus Solution Sarah Michalak RLUK Birmingham November 14, 2014
  • 3. Today’s Discussion - HathiTrust • Mission and partnership • Collections • Services • HathiTrust Research Center • Benefits for Libraries
  • 4. The Name • The meaning behind the name • Hathi (hah-tee)--Hindi for elephant • Never forgets • Full of wisdom • Secure • Trustworthy • Big, strong
  • 5. The Mission and Partnership
  • 6. Mission To contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge. Efforts include, but are not limited to …building comprehensive collections co-owned and managed by partners. …enabling access by users with print disabilities. …supporting computational research with the collections. …stimulating shared collection storage strategies among libraries.
  • 7. HathiTrust Members: Allegheny College; Arizona State University; Baylor University; Boston College; Boston University; Brandeis University; Brown University; California Digital Library; Carnegie Mellon University; Colby College; Columbia University; Cornell University; Dartmouth College; Duke University; Emory University; Florida State University**; Getty Research Institute; Harvard University Library; Indiana University; Iowa State University; Johns Hopkins University; Kansas State University; Lafayette College; Library of Congress; Massachusetts Institute of Technology; McGill University; Michigan State University; Montana State University; Mount Holyoke College; New York Public Library; New York University; North Carolina Central University; North Carolina State University; Northwestern University; The Ohio State University; The Pennsylvania State University; Princeton University; Purdue University; Rutgers University; Stanford University; Syracuse University; Temple University; Texas A&M University; Texas Tech; Tufts University; Universidad Complutense de Madrid; University of Alabama; University of Alberta; University of Arizona; University of British Columbia; University of Calgary; University of California (Berkeley, Davis, Irvine, Los Angeles, Merced, Riverside, San Diego, San Francisco, Santa Barbara, Santa Cruz); The University of Chicago; University of Connecticut; University of Delaware; University of Florida; University of Houston; University of Illinois; University of Illinois at Chicago; The University of Iowa**; University of Kansas**; University of Maine; University of Maryland**; University of Massachusetts, Amherst; University of Miami; University of Michigan; University of Minnesota**; University of Missouri**; University of Nebraska-Lincoln**; University of New Mexico; The University of North Carolina at Chapel Hill**; University of Notre Dame; University of Oklahoma; University of Pennsylvania; University of Pittsburgh; University of Queensland; University of Tennessee, Knoxville**; University of Texas; University of Utah; University of Vermont; University of Virginia**; University of Washington**; University of Wisconsin-Madison**; Utah State University**; Vanderbilt University; Virginia Tech; Wake Forest University; Washington University; Yale University Library. November 3, 2014
  • 8. How Are Costs Shared? • Public domain volumes: All partners share in infrastructure costs for each item. • In copyright volumes: Partners share costs based on their holdings. • Infrastructure cost per volume: ~$0.168 per volume per year. • All partners pay an additional amount above costs to fund new programs and investigations.
  • 11. 12.5 million total volumes 6.4 million book titles 327,000 serial titles 575,889 government publications 4.6 million volumes in the public domain (~37%)
  • 12. Link takes you to HathiTrust. Records loaded into DPLA, local library catalogs, and commercial databases.
  • 13. Collective Stewardship • Leverage expertise across institutions • Distributed Functions and Services • Preservation repository and access services • University of Michigan • Mirror site: Indiana University • Metadata management services • California Digital Library • HathiTrust Research Center • Indiana University and University of Illinois 5 November 2014 13
  • 14. Collection Sources: Michigan, 37.54%; California, 28.63%; Harvard, 6.15%; Wisconsin, 4.47%; Indiana, 4.19%; Cornell, 4.02%; Illinois (UC), 2.45%; NYPL, 2.35%; Princeton, 2.02%; PSU, 1.19%; Minnesota, 1.11%; Universidad Complutense, 0.92%; LoC, 0.87%; Keio, 0.72%; Columbia, 0.52%; Northwestern, 0.45%; Ohio State, 0.42%; Chicago, 0.41%; Virginia, 0.41%; Purdue, 0.38%; Yale, 0.19%; UNC Chapel Hill, 0.14%; Getty Research Institute, 0.13%; Massachusetts, 0.09%; Florida, 0.08%; Duke, 0.06%; Connecticut, 0.04%; Boston College, 0.03%; NC State, 0.03%; McGill, 0.01%; Texas A&M, 0.01%; Alberta, < 0.01%; Delaware, < 0.01%; Utah State, < 0.01%
  • 16. Language Distribution (1): The top 10 languages make up ~87% of all content. English, 49%; German, 9%; French, 7%; Spanish, 5%; Chinese, 4%; Russian, 4%; Japanese, 3%; Italian, 3%; Arabic, 2%; Latin, 1%; Remaining languages, 13%
  • 17. Language Distribution (2): The next 40 languages make up ~12% of total. Portuguese, 7%; Polish, 7%; Dutch, 5%; Hebrew, 5%; Hindi, 5%; Indonesian, 4%; Korean, 4%; Swedish, 4%; Thai, 3%; Urdu, 3%; Turkish, 3%; Danish, 3%; Czech, 3%; Croatian, 3%; Persian, 2%; Tamil, 2%; Hungarian, 2%; Bengali, 2%; Norwegian, 2%; Sanskrit, 2%; Greek, Modern (1453-), 2%; Vietnamese, 1%; Ukrainian, 1%; Serbian, 1%; Bulgarian, 1%; Greek, Ancient (to 1453), 1%; Armenian, 1%; Romanian, 1%; Marathi, 1%; Panjabi, 1%; Telugu, 1%; Catalan, 1%; Malay, 1%; Multiple languages, 1%; Malayalam, 1%; Finnish, 1%; Slovak, 1%; Slovenian, 1%; Turkish, Ottoman, 1%; Yiddish, 1%; Nepali, 0%
  • 18. Copyright Distribution: In copyright or undetermined, 63%; Public domain worldwide, 21%; US government documents, 5%; Public domain (US), 11%; Open access, 0.06%; Creative Commons, 0.06%; “Public domain”, 38%
  • 20. 10 September, 2014 | 20 Preservation with Access • Preservation – TRAC-certified – Long-term commitments to preserve digital content facilitate planning, decision-making • Discovery – Bibliographic and full-text search of all materials – Mechanisms for local loading of records • Access and Use – Full text search (all users) – Public domain and open access works (all users) – Collections and APIs (all users) – Lawful uses of in-copyright works (members)
  • 21. Access: Lawful uses of in-copyright works • Sensitive to multiple legal regimes – Full-text search (everyone everywhere) – Access to users who have print disabilities (through member proxy in US, and where law permits)** – Access works that are damaged or missing and also out of print and unavailable (members in US only) **Terms and conditions at http://www.hathitrust.org/access_use#ic-access
  • 22. Collective Action: Copyright Review • Copyright Review Management System – Systematic manual review of copyright registrations to determine status of portions of the HathiTrust Collection – CRMS US: Published in US, 1923-1963 • 316,396 reviewed / 166,753 PD (~53%) – CRMS-World: Published in UK (1874-1944), Canada, Australia (1894-1964) • 145,804 reviewed / 75,775 PD-world (~52%) 21 October 2014
  • 23. HathiTrust Research Center • http://www.hathitrust.org/htrc • Operated by the University of Illinois, Urbana-Champaign and Indiana University, with additional financial support from HathiTrust. • Co-led by Beth Plale (Indiana) and Stephen Downie (Illinois). • Goal: enable researchers world-wide to carry out computational investigation of HT repository.
  • 24. Aims of the HTRC • Focus on developing services to researchers • Develop model for access: the ‘workset’ • Develop tools that facilitate research by digital humanities and informatics communities • Develop secure cyberinfrastructure that allows computational investigation of entire copyrighted and public domain HathiTrust repository
  • 25. Example Projects Supported by HTRC • Muñoz, Trevor, University of Maryland. “Distributed Metadata Correction and Annotation.” – Correction, annotation and enhancement of HT records and export as linked data • Page, Kevin, Oxford University. “ElEPHãT: Early English Print in HathiTrust, a Linked Semantic Workset Prototype” – Development of secondary worksets based on both HT and the Early English Books Online Text Creation Partnership (EEBO-TCP). • Burton, Vernon. “The South as ‘Other,’ the Southerner as ‘Stranger.’” – Explore how attitudes expressed in print about slavery, southerners, and non-southerners have changed over both time and space. • Ted Underwood, Associate Professor of English at the University of Illinois, Urbana-Champaign. – Using public domain texts received from HathiTrust to explore changing relationships in literary genres from 1700-1899.
  • 26. HathiTrust overall benefits to libraries • Digital Curation – Drive costs down – Reduce “bibliographic indeterminacy” – Make meaningful decisions about formats and quality – Increase discoverability, use – Consolidate development talent – Improve strength of archiving • Print Curation – Means to associate our print holdings – Coordinated record-keeping • Subsidiary benefits – Quantify problems – Collective attention to solving shared problems – Understanding relationship between collective and local
  • 28. Benefits for UNC-Chapel Hill • Preservation solution for UNC digitized books and journals. • Online access to hundreds of thousands of titles we do not have in our collection. • Live links to Hathi materials in our catalog are a convenience for users and enrich our collections. • Hathi-led “community developments” provide tools and expertise we might not have otherwise. • Digital humanities scholars and other researchers have the benefit of computational research over the large-scale corpus.
  • 30. The HathiTrust Digital Library Large Scale Digital Preservation and Access For the Public Good

Editor's Notes

  1. 1
  2. I am University Librarian and Associate Provost for University Libraries at the University of North Carolina at Chapel Hill. UNC is a comprehensive research university with a law school, medical school, strong research programs in numerous disciplines, especially including bio-medicine and public health. There are 28,000 students. The library has about 7.6 million volumes and has twelve libraries in various campus locations. This is the Wilson Special Collections Library, located exactly at the center of our beautiful campus. I am speaking today as chair of the Board of Governors of the HathiTrust Digital Library but the UNC library is a member of HathiTrust and I will point out some of the benefits we receive along the way. Why
  3. 4
  4. 6
  5. 7
  6. Annual income: approximately $2.5 million (Operations: $1.688 m; Programmatic: $863K). Public domain formula: (PD*C*X)/N. In-copyright formula: IC = (C*X)/H. Computation of fees must be approved by the membership every year. We are now working on collecting holdings and computing costs for 2015.
  7. Here is where it all starts: the HathiTrust search page. Read services.
  8. 11
  9. WE HAVE WORKED TO IMPLEMENT A SHARED APPROACH TO CURATION AND STEWARDSHIP. We intend for our members to be able to develop new services over time. HOW WORK GETS DONE: At Michigan, 7 FTE on the operational side and 2 on the programs side, plus additional subsidized FTE that is not on the budget. Indiana and California are provided funds to support the services they operate. HTRC: The Board of Governors has approved some central funding, but Indiana and Illinois are also picking up about 2/3 to 3/4 of the costs, plus grant funds. Why are these schools supporting this? Because they get back something greater than what they pay in, and they can run these operations in a way that helps facilitate other mission-driven programs on their campuses. The intent is for the central staff to stay lean and promote contributed effort from other institutions. One of the goals: “To create and sustain this ‘public good’ in a way that mitigates the problem of free-riders.” In short: the costs are shared in a way that ensures that everyone pays for what they benefit most directly from. And since almost the entire collection is based on what we have reformatted from libraries, we are basing it on library holdings. We are NOT basing it on what your library has digitized and put in the repository.
  10. 15
  11. This slide is not so surprising….
  12. 17
  13. Two ways we determine copyright status. First: date of publication listed in catalog. Then, manually, through collective effort on certain subsets: large-scale review of materials in collection to check on copyright status. Have identified an additional 250,000 volumes as public domain in US or worldwide. 18 member libraries contributing at least ¼ of a person’s time to this work. US Govt Publications are not all assumed to be public domain. Certain publishers, such as Smithsonian, are not opened automatically. We also close NTIS materials published in the last 6 years. If a work is known to have substantial 3rd party copyrighted material in the publication, then we would close it. We have taken a more liberal approach to this than Google did; they go primarily by date, not content. We do have clearly published guidelines for receiving a complaint, and a process for taking down and reviewing status of a work.
  14. So how do you use all this stuff? Talk for just a second before going into end uses, about how libraries view HathiTrust.
  15. 20
  16. Our unofficial slogan is “We make lawful uses of published works.” Search. Print disabilities. Preservation access. In HathiTrust we’ve been willing to take vigorous advantage of rights and protections available to libraries and individuals in US copyright law, and in doing so I think we have been able to make the concept of “fair use” in our law much stronger and, in some ways, more available to libraries. Multiple courts have agreed that libraries can digitize items for the purposes of enabling computationally driven search, and that we can digitize items for the purposes of providing access to users with print disabilities. We feel very confident that our policies and practices around providing replacement copy access for damaged or lost items are lawful. But you see here how these differing laws affect the types of benefits our partner libraries receive. While we are confident that we can provide access to damaged or missing works to US institutions under the fair use and library exceptions in our laws, we cannot do so elsewhere.
  17. 22
  18. KEY FOCUS: EXPAND COMPUTATIONAL ACCESS, BOTH BY SIMPLIFYING FOR NON-EXPERTS AND BY DEVELOPING ROBUST INFRASTRUCTURE FOR SPECIALISTS. Focus on developing services to researchers. Develop model for access: the ‘workset’. Develop tools that facilitate research by digital humanities and informatics communities. Develop secure cyberinfrastructure that allows computational investigation of entire copyrighted and public domain HathiTrust repository. Beth is Director, Data to Insight Center; Managing Director, Pervasive Technology Institute (PTI); and Professor, School of Informatics and Computing, Indiana University. Stephen is Professor and Associate Dean for Research at the Graduate School of Library and Information Science. Management team includes Robert McDonald, Beth Sandore, and John Unsworth.
  19. A secure computing framework that trusts that the researcher will not deliberately leak repository data, but prevents malware acting on the user's behalf from leaking data. It enforces: Non-consumptive use: framework provides safe handling of large volumes of protected data. Openness: framework supports user-contributed analysis tools (that is, it does not limit uses to a known set of algorithms). Efficiency: framework supports user-contributed analysis tools without resorting to code walkthroughs prior to acceptance. Large-scale and low cost: protections can be extended to utilization of large-scale national (public) supercomputers. GOAL: ENABLE ACCESS FOR COMPUTATION WITHIN A SECURE FRAMEWORK. Pushing out the services through the Scholarly Commons: gives HT institutions exclusive access to training and learning materials that help them establish programs that integrate HTRC tools and services into their scholarly commons programs in libraries and digital humanities centers. Physically located in the University of Illinois Library’s Scholarly Commons. Supported by several Library staff and faculty. Key among these is the Digital Humanities Research Specialist, who will assist with the development of training and outreach initiatives in support of researchers working with the HathiTrust Research Center and HathiTrust Digital Library affiliates who seek to start their own HTRC research services. The effort involves planning, implementation, and continuous development of training materials, educational workshops, potential tools, and outreach activities in support of the usage of HTRC tools and datasets.
  20. “Workset Creation through Image Analysis of Document Pages”, Texas A&M University (PI: Keith Biggers). Biggers will work with Neal Audenaert and Natalie M. Houston to develop a software application that uses the visual characteristics of digitized printed pages to identify documents that contain three types of visually distinctive materials of interest to humanities researchers: poetry, music, and illustrations. This prototype will demonstrate the value of using visual analysis of document images in conjunction with more traditional textual analysis to enable scholars to ask more refined questions about texts and their physical manifestations. “Semantic Analysis of Documents from the HathiTrust Corpus”, Waikato University (PI: Annike Hinze). Hinze’s team will develop a suite of tools that analyze documents by the semantics of their content and metadata. Clustering documents by semantic similarity will open up a wealth of opportunities for scholarly research. The project is designed in close collaboration with two humanities scholars from the areas of Maori & Pacific Studies, and Historical Anthropology, who not only drive this project with research questions based on their scholarly practice, but also provide ongoing input and feedback during the development process. “Distributed Metadata Correction and Annotation”, Maryland Institute for Technology in the Humanities, University of Maryland (PI: Trevor Muñoz). Muñoz will collaborate with Peter Mallios and the Foreign Literatures in America (FLA) project team to develop a set of services and interfaces that will allow the FLA project (and other projects like it) to pull metadata records from the HathiTrust, correct and annotate these records using standardized vocabularies, gather corrections and annotations from other teams or scholars, and export enhanced metadata in formats suitable for publication as linked data. 
“ElEPHãT: Early English Print in HathiTrust, a Linked Semantic Workset Prototype”, Oxford University (PI: Kevin Page). Page will work with colleagues from the Bodleian Library to produce software that exposes the necessary metadata from individual collections for building aggregate worksets drawn from multiple sources. The prototype will build integrated worksets that combine resources from the HathiTrust and from the Early English Books Online Text Creation Partnership (EEBO-TCP) collection, which focuses on high quality images and accurate transcriptions of items usually found in libraries’ special collections.
  21. 26
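
The cost-sharing formulas in note 6, (PD*C*X)/N for public domain volumes and (C*X)/H for each in-copyright volume, can be sketched in a few lines of Python. The notes do not define the symbols, so the readings used here are assumptions: PD as the number of public domain volumes, C as the infrastructure cost per volume per year, X as a multiplier covering the additional amount above costs, N as the number of partners, and H as the number of partners holding a given in-copyright volume. The function names and the sample figures for X, N, and H are hypothetical; only the ~$0.168 cost per volume and the 4.6 million public domain volumes come from the slides.

```python
def public_domain_share(pd_volumes, cost_per_volume, multiplier, num_partners):
    """One partner's share of public domain costs: (PD * C * X) / N.

    Assumed reading: every partner pays an equal share of the cost
    of keeping all public domain volumes.
    """
    if num_partners <= 0:
        raise ValueError("num_partners must be positive")
    return pd_volumes * cost_per_volume * multiplier / num_partners


def in_copyright_share(holders_per_volume, cost_per_volume, multiplier):
    """One partner's in-copyright cost: the sum of (C * X) / H over its volumes.

    holders_per_volume lists, for each in-copyright volume the partner
    holds in print, how many partners hold that same volume (H).
    """
    total = 0.0
    for holders in holders_per_volume:
        if holders <= 0:
            raise ValueError("each volume needs at least one holder")
        total += cost_per_volume * multiplier / holders
    return total


if __name__ == "__main__":
    # Hypothetical inputs: 100 partners and no markup (X = 1.0).
    pd_fee = public_domain_share(4_600_000, 0.168, 1.0, 100)
    # Hypothetical partner holding three in-copyright volumes that are
    # held by 1, 2, and 4 partners respectively.
    ic_fee = in_copyright_share([1, 2, 4], 0.168, 1.0)
    print(f"public domain share: ${pd_fee:,.2f}")
    print(f"in-copyright share:  ${ic_fee:,.3f}")
```

Under this reading, a widely held in-copyright volume costs each holder less than a uniquely held one, which matches the statement that partners share in-copyright costs based on their holdings.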