This document discusses the mass scanning workflow of the Biodiversity Heritage Library (BHL), a digital library of taxonomic literature. It describes how the BHL member institutions collaborated on scanning priorities, avoiding duplication, and handling complex serials and monographs. Key aspects of the workflow included selecting materials, establishing scanning viability, metadata analysis, de-duplication processes, quality assurance, and making the materials accessible online. The BHL created effective tools and procedures to manage the large-scale digitization project.
Mass Scanning Workflow Discussion Sociological Aspects
1. A Mass Scanning Workflow Discussion Sociological aspects of a global mass scanning project Biodiversity Heritage Library Suzanne C. Pilsk- Smithsonian Institution Libraries Matthew Person – MBLWHOI Library, Woods Hole June 28, 2010
2. Biodiversity Heritage Library In any well-appointed Natural History Library there should be found every book and every edition of every book dealing in the remotest way with the subjects concerned. One never knows wherein one edition differs from or supplements the other and unless these are on the same table at the same time it is not possible to collate them properly. Moreover for accurate work it is necessary for the student to verify every reference he may find; it is not enough to copy from a previous author; he must verify each reference itself from the original. Charles Davies Sherborn, Epilogue to Index Animalium, March 1922 Charles Davies Sherborn (1861-1942)
48. Mass Scanning Workflow Local data flow Vendor data flow WonderFetch tm Return of data Return of material Quality Assurance Billing
49. EOL species need Curator Request “gap-fill” for other BHL library Pull from stacks Circ in ILS Preliminary metadata check and physical check Goin’ down the rows serial? “Bid” on title, select in picklist The Stacks Metadata check Preservation review Other library “bid”? Circ to cataloging for MARC editing Update picklist if item record has been changed during cataloging touch-up Circ to scanner Put on shipping cart, generate ‘packinglist’ invoice Evaluate title. Need is… pass yes pass fail fail no Carts delivered to scanner Picklist database stores select/reject/ship state & supplies item metadata to IA Select title in picklist, upload to monograph de-duper Duplicate? yes Bibliographic Data from SIRIS no Reject in picklist, Circ in Horizon Return to stacks no Reject in picklist, return to stacks
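The selection logic on this slide can be loosely sketched in code. The sketch below is illustrative only and assumes a simplified order of checks. The names `Item`, `PicklistState`, `evaluate` and `ship` are hypothetical and are not taken from the actual picklist database. The slide appears to route a failed metadata check to cataloging for MARC editing, but the sketch treats any failed check as a rejection.

```python
from dataclasses import dataclass
from enum import Enum


class PicklistState(Enum):
    """The three states the slide says the picklist database stores."""
    SELECTED = "select"
    REJECTED = "reject"
    SHIPPED = "ship"


@dataclass
class Item:
    # All field names are hypothetical stand-ins for the checks on the slide.
    barcode: str
    title: str
    metadata_ok: bool           # outcome of the metadata check
    preservation_ok: bool       # outcome of the preservation review
    bid_by_other_library: bool  # another BHL library has already "bid"
    is_duplicate: bool          # flagged by the monograph de-duper


def evaluate(item: Item) -> PicklistState:
    """Select an item only if every check passes; otherwise reject it."""
    if not item.metadata_ok or not item.preservation_ok:
        return PicklistState.REJECTED
    if item.bid_by_other_library or item.is_duplicate:
        return PicklistState.REJECTED
    return PicklistState.SELECTED


def ship(state: PicklistState) -> PicklistState:
    """Move a selected item to the shipped state (cart sent to the scanner)."""
    if state is not PicklistState.SELECTED:
        raise ValueError("only selected items can be shipped")
    return PicklistState.SHIPPED
```

In this sketch a rejected item simply stays rejected, matching the "Reject in picklist, return to stacks" boxes. A fuller model would need an extra path for items sent to cataloging.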
50. IA scanning process; Unique IA id is assigned; Metadata is gathered from SIRIS and the picklist db and associated with the scan; JP2000s generated & transformed; Served on archive.org; QA is done by IA on 10%; Books are returned, cart contents are verified against invoice; SIL does 20% QA, checking for metadata matching with item, scan quality etc.; Updated in picklist as scanned; Circ in Horizon; Place BHL sticker near barcode; Return to Stacks; Pass QA? (yes / no); BHL Portal periodically harvests Marc.xml (bib) and item records, along with JP2000 from Archive.org, to index and display in the portal; Update picklist to indicate rescan; Put on shipping cart, generate ‘packinglist’ invoice, alert scanning center; Carts delivered to scanner; Download .csv from portal with SIL barcodes, Portal URLs; Send URLs to SIRIS Office for batch updates
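Two steps on this slide lend themselves to a small sketch: verifying the returned cart against the invoice, and drawing a 20% sample for local QA. The function names `verify_cart` and `qa_sample` are hypothetical. Random sampling is an assumption, since the slide does not say how the 20% is chosen.

```python
import random


def verify_cart(invoice_barcodes, returned_barcodes):
    """Compare returned cart contents against the packing-list invoice."""
    invoice = set(invoice_barcodes)
    returned = set(returned_barcodes)
    return {
        "missing": sorted(invoice - returned),     # on the invoice, not returned
        "unexpected": sorted(returned - invoice),  # returned, not on the invoice
    }


def qa_sample(barcodes, percent=20, seed=None):
    """Pick roughly `percent`% of items for QA, at least one if any exist.

    Random selection is an assumption; the sample size is rounded up
    using integer arithmetic to avoid floating-point surprises.
    """
    items = sorted(set(barcodes))
    if not items:
        return []
    k = max(1, (len(items) * percent + 99) // 100)
    return random.Random(seed).sample(items, k)
```

Items in the sample that fail QA would then be flagged for rescan in the picklist, as the slide describes.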
83. 24 April 2009 The following came from a public librarian in Falmouth, Massachusetts: "We recently were asked the question: who discovered the zebra fish? In searching the Encyclopedia of Life I kept seeing the phrase “Hamilton, 1822” next to the “danio rerio”. Wondering who Hamilton was, I searched WorldCat and discovered that Hamilton was Francis Hamilton who had published in 1822 An account of the fishes found in the river Ganges and its branches . I looked at the EOL record and clicked on the Biodiversity Heritage Library link. One of the links was to a Hamilton book! In 1878 the book The Fishes of India was published which included a description and a image of the danio rerio. Links were provided to the exact place in the text where the fish was mentioned, as well as to the plate with the fish itself illustrated. Not only that, but I could send the patron the exact link to both pages which described her fish. How remarkable it was to find this Harvard University book available so easily through the Biodiversity Heritage Library. A great success for our patron, and we looked like magicians bringing the book to her."
85. “The Biodiversity Heritage Library is a valuable resource for acquiring crustacean literature. At present, a search there (http://www.biodiversitylibrary.org/Search.aspx?searchTerm=pycnogonid&searchCat=) will turn up 5 publications (one of which was not contributed by the Smithsonian). Also note that the BHL has scanned these and additional literature at the site for taxonomic terms, and provides links to those documents. There are 1592 "hits" for Pycnogonida. It is likely that you could turn up a lot of additional articles within larger works that way. Alternatively, you could perform searches for volumes of interest (if you know of specific references), to home in on the papers you want. There will be A LOT of additional material becoming available at that site.”
86. “Yesterday whilst reading the latest edition of The Entomologist's Record I was pleased to find that early editions of this invaluable publication, edited by the seminal entomologist James Tutt (no relation to Elvis's drummer as far as I am aware) are available digitised […] So I went there, and was amazed at what I found. They even have a blog. What a fantastic project!!!” From the blog: http://forteanzoology.blogspot.com/2009/03/fantastic-resource.html
87. […] Michael, a colleague researching wasps, was excited that he had discovered in the Biodiversity Heritage Library a copy of an obscure 1860s book: Saussure, H. de & Sichel, J. (1864). Catalogue des espèces de l'ancien genre Scolia, contenant les diagnoses, les descriptions et la synonymie des espèces, avec des remarques explicatives et critiques. Genève & Paris: Henri Georg & V. Masson et Fils, pp. 1–350. This book was not in our library, probably not in Australia, and almost impossible to get hold of without travelling to the northern hemisphere. Thanks to the BHL for their work in providing access to works of importance. Michael is now able to use detailed content of this book in his work. John Tann, Australian Museum
88. The Biodiversity Heritage Library : Advancing Metadata Practices in a Collaborative Digital Library Suzanne C. Pilsk, Smithsonian Institution Libraries; Matthew Person, MBLWHOI Library ; Joseph deVeer, Ernst Mayr Library, Museum of Comparative Zoology ; John F. Furfey, MBLWHOI Library ; Martin R. Kalfatovic, Smithsonian Institution Libraries Abstract: The Biodiversity Heritage Library is an open access digital library of taxonomic literature, forming a single point of access to this collection for use by a worldwide audience of professional taxonomists, as well as “citizen scientists.” A successful mass scanning digitization program, one that creates functional and findable digital objects, requires thoughtful metadata workflow that parallels the workflow of the physical items from shelf to scanner. This article examines the needs of users of taxonomic literature, specifically in relation to the transformation of traditional library material to digital form. It details the issues that arise in determining scanning priorities, avoiding duplication of scanning across the founding twelve natural history and botanical garden library collections, and the problems related to the complexity of serials, monographs, and series. Highlighted are the tools, procedures, and methodology for addressing the details of a mass scanning operation.
90. A Mass Scanning Workflow Discussion Thanks to All Staff of the BHL Members
Editor's notes
First slide, take a deep breath
All of these words apply to this project. Today’s talks have touched upon how these words are reflected in this project. It happens to be a fact that these words exist in a rather well balanced state in this project. Perhaps time will tell as to why that is.
Vision
What it takes; intensive interactive planning, cooperation and vision
Who the original team were
Stop here, perhaps ask question, how often do institutions like these form agreements to merge their collections?
Diane mentioned the huge amount of time processing takes as a central theme of workflow issues.
General search reveals search results
New search, bulletin, zoological: looks like a single title has 2 records; the first record has 4 institutions attached, the second record has a single institution attached
I’m not sure what this is although I could speculate…