As institutions put more collections online they will inevitably come across wide variations in the level of description within collections. Without unlimited resources (time, staff and funding) it can be a daunting task to describe the collection. An increasingly recognized way to gather descriptions is to use enthusiastic amateurs and subject experts inside and outside the organization and other interested parties through crowdsourcing. The National Institute of Standards and Technology’s Digital Archives is using crowdsourcing to elevate the description of artifacts in the museum collection. This presentation will outline the decision-making process, outreach efforts and a review of our successes and lessons learned in this endeavor.
The presentation will include metrics on the site visits to the various collections, emphasizing users’ interaction with the items we’ve crowdsourced. We will also illustrate how the new 6.0 user comment option was utilized in this process. There will be a review of our decisions on standards and best practices. We will also share our assessment of some of the research literature on crowdsourcing.
Crowdsourcing and the NIST Digital Archives: Using the 'crowd' to describe NIST Museum artifacts
1. Crowdsourcing and the NIST Digital Archives Using the “crowd” to describe NIST Museum artifacts Eastern CONTENTdm Users Group August 2, 2011 Towson University, Towson, MD
2. Introduction Regina Avila Digital Services Librarian CONTENTdm Administrator regina.avila@nist.gov 301-975-3575 Andrea Medina-Smith Metadata Librarian andrea.medina-smith@nist.gov 301-975-3575
3. National Institute of Standards and Technology Founded in 1901, NIST is a non-regulatory agency within the U.S. Department of Commerce. NIST's mission is to promote U.S. innovation and industrial competitiveness by advancing measurement science, standards, and technology in ways that enhance economic security and improve our quality of life.
4. ISO provides professional scientific and technical research assistance through three primary programs: Research Library Information Program Electronic Information and Publications Program NIST Museum and History Program Information Services Office
6. Complements ISO efforts to tell NIST’s story through publications and museum & history program to increase NIST’s impact Fulfilled long-term goal of creating digital surrogates to increase visibility of scientific instruments developed and used by NBS/NIST scientists Coincided with need to conduct inventory of NIST heritage assets and to move artifacts from storage into space within the Library Why Museum artifacts?
9. What is crowdsourcing, anyway? “Simply defined, crowdsourcing represents the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call.” - Jeff Howe June 2, 2006
10. Correction & Transcription Tasks Contextualization Collecting Classification Co-curation What does crowdsourcing do? http://bit.ly/pvWndt
15. Wired.com Popular Science Information Week Government Computer News R&D Magazine ReadWriteWeb Smithsonian The Gazette (Montgomery County) Radio Several Blogs Press attention
19. Some very helpful “Test apparatus for respirator masks. Such masks had to be designed to fit a large number of facial structures, so a ‘95% profile model’ was developed. The contours of this model were said to be common to 95% of the population. When designing a respirator mask, you need to have the edges seal tightly against the face, so possibly these wooden heads quantified what a ‘face’ is.”
20. Some not as useful … but funny “These items are middle managers. You can distinguish them from upper management which are made of bone instead of wood.”
21. Indirect Response: Answers from outside sources Google translation: “Raking piles in the attic, museum workers of the National Institute of Standards and Technology (NIST) stumble under an hour on a very sophisticated instruments and mechanisms. Lost documentation and manuals plunged scientists in gloom, because the purpose of these devices often do not identify at first glance (and it often happens that in the second). That's why they turn to you for help in identifying every piece of iron dust.”
24. Current and Retired NIST Employees Standards Alumni Organization Tapping the experts Photo credit: Chris Rossi/The Gazette
25. End of March to mid-April went from 1,165 Visits, 135 Unique hosts to 23,283 visits, 16,606 unique hosts Continued responses, press coverage “Long tail” effect Usage stats
26. 31 Countries Hungary Iceland Israel Italy Japan Mexico Netherlands New Zealand Norway Paraguay Poland Romania Russia Spain Switzerland USA Argentina Australia Belarus Belgium Brazil Canada Chile Colombia Denmark Egypt Finland France Germany Greece Guatemala
28. C. Anderson, The long tail, Wired Magazine, 12 (10), October 2004, http://www.wired.com/wired/archive/12.10/tail.html (accessed August 2, 2011). J. Howe, Crowdsourcing: A Definition, Crowdsourcing Blog, http://crowdsourcing.typepad.com/cs/2006/06/crowdsourcing_a.html (accessed August 2, 2011). M.R. Kalfatovic, E. Kapsalis et al., Smithsonian Team Flickr: A library, archives and museums collaboration in web 2.0 space, Arch. Sci. 2009, http://hdl.handle.net/10088/8156. M.G. Krause and E. Yakel, Interaction in virtual archives: The Polar Bear Expedition Digital Collections next generation finding aid, American Archivist 70 (2), 282-314 (2007). Bibliography
29. J. Oomen and L. Aroyo, Crowdsourcing in the cultural heritage domain: Opportunities and Challenges, in 5th International Conference on Communities & Technologies – C&T 2011, http://www.crowdsourcing.org/document/crowdsourcing-in-the-cultural-heritage-domain-opportunities-and-challenges/3136 (accessed August 2, 2011). C. Shirky, Power laws, weblogs, and inequality, Clay Shirky’s Writing About the Internet, February 8, 2003, http://www.shirky.com/writings/powerlaw_weblog.html (accessed August 2, 2011). M. Springer, B. Dulabahn et al., For the common good: The Library of Congress Flickr pilot project, Library of Congress, Washington, DC, October 2008, 50 pp., www.loc.gov/rr/print/flickr_report_final.pdf (accessed August 2, 2011). B. Stein, Crowdsourcing science history: NIST Digital Archives seeks help in identifying mystery artifacts, NIST Tech Beat, April 12, 2011, http://www.nist.gov/director/archives-041211.cfm (accessed August 2, 2011). Bibliography
Editor's notes
Introduce myself: Regina Avila, digital services librarian. Came from Colorado in early 2009 to start work on the repository. For close to two years I’ve been tasked with developing an online digital collection of NIST publications (mainly the Journal of Research), which has now expanded to include museum objects, with plans for other collections.
Andrea Medina-Smith – As the Metadata Librarian at the National Institute of Standards and Technology, Andrea Medina-Smith works with the digital repository, designs workflows for processing and describing digital collections, and helps document museum collections. Before coming to NIST she worked for the Jewish Women’s Archive as the digital archivist. She has an MS, LIS with a concentration in Archives Management from Simmons College and a BA from the University of California, Santa Cruz. Andrea has experience creating descriptive schemas for archives and special collections, using diverse standards, and is now diving headlong into the world of data curation.
The science NIST does is in everybody’s homes, everybody’s lives.
Accuracy of instruments used every day:
Water flow meter - garden hose
How much produce weighs at a grocery store
Accuracy of a pilot’s instruments
RLIP – Research and reference assistance
EIPP – Publishing assistance (NIST Journal of Research, technical papers, reports, etc.)
NIST Museum and History Program.
Addition to the NIST History Program
Addition to electronic products: NIST Virtual Library, NIST Virtual Museum, now we have the NIST Digital Archives.
Mission of preservation and dissemination of NIST history
NDA has NIST publications, but we also chose museum artifacts.
Selecting digital objects: Why museum objects – we took advantage of the fact that we were moving them closer to the museum. As they were moved we conducted an inventory and fulfilled a long-term goal of updating our database and adding to the library’s photo collection. The reason we chose to put images of the museum artifacts online was to make them more accessible to the public. This supports the ISO role in preserving NIST’s history, etc. A lot of interesting and historically significant pieces. Needed to update the descriptions, decided to crowdsource information about them.
Digitizing content – Museum objects – months-long project photographing items. Some were heavy, some very fragile, some were dirty or broken. Some were even hazardous (didn’t touch those).
Process the newly created digital objects – Conversion to archival formats – kept RAW and converted to DNG for dark storage (possibly TIFF later); JPG for CDM.
Edited images – Cropped and arranged photos
Research objects – verified information on artifacts, creators, use, dates, etc. Verified references we were given. Documented in the registrar’s database.
A review of the five Ws of crowdsourcing and why NIST chose to use it.
http://crowdsourcing.typepad.com/cs/2006/06/crowdsourcing_a.html
It is a call to participation by both amateurs and experts, those within and outside of an organization
Oomen, J. & Aroyo, L. (2011). Crowdsourcing in the cultural heritage domain: opportunities and challenges. In: 5th International Conference on Communities & Technologies. Brisbane, Australia, 29 June – 2 July 2011. Retrieved from: http://www.cs.vu.nl/~marieke/OomenAroyoCT2011.pdf (accessed July 11, 2011).
Correction/transcription – NYPL menu collection
Contextualization – “adding contextual knowledge to objects, e.g. by telling stories or writing articles/wiki pages w/ contextual data.” Google Maps mashups (for commercial reviews, Yelp style), Flickr Commons
Collecting – Katrina’s Jewish Voices
Classification – Flickr Commons, steve.museum, Beyond Brown Paper @ Plymouth College, NH (http://tagger.steve.museum/)
Co-curation – Brooklyn Museum “Click”
Funding – Kickstarter
http://menus.nypl.org/
http://www.flickr.com/commons?GXHC_gx_session_id_=6afecb2055a3c52c
http://katrina.jwa.org/
http://beyondbrownpaper.plymouth.edu/item/33831#comment-14945
http://www.brooklynmuseum.org/exhibitions/click/top_discussed.php
http://kickstarter.com
Add lots of images of LOC, Smithsonian, Powerhouse, Zooniverse
Science: World Wildlife Fund (The Moabi is a powerful online tool for tracking information spatially.), NOAA National Climatic Data Center (in prototype stage: Global Databank of historical climate data and a project to reanalyze and classify tropical storms)
http://www.flickr.com/photos/library_of_congress/
http://www.flickr.com/photos/smithsonian/
http://www.galaxyzoo.org/
http://www.flickr.com/photos/powerhouse_museum/
Enriched collections
Stats from LOC: in 10 months on Flickr Commons their 3k images received more than 7k comments. More than 2,500 people added 67k tags (keywords) – an average of 14 tags per image
Engaged users – creating a community: UMich Polar Bear Expeditions
Can embrace Web 2.0 by going where your audience already is (if you are posting to Flickr, Facebook, or using Twitter to push the information out to users)
Creates an interaction with your customers: makes them participants with your collections, not just passive users
Several objects lacked description, such as this: “Wood box with one dial at top”
Since CDM is web-based, it allowed us to call on people inside and outside of NIST, both employees and outsiders, subject experts and amateur enthusiasts
Great marketing tool
We used it as a hook to get people interested – starting with our own internal PR people.
Presents customers with a puzzle, a challenge to get them involved.
Presented at open house; internal PR decided to write an article.
After this initial NIST TechBeat article ran, the word spread.
http://www.nist.gov/director/archives-041211.cfm
TechBeat article generated a lot of press attention.
Two interviews with GCN and Wired.com – also UK wired.com
Asked to do a photo feature for an iPad-only publication, The Daily
Local newspaper
The NDA version of a “compound object”. Postcard or Cube weren’t options, so we combined multiple images of an object into a single frame.
Screenshot of front page and our mystery objects.
Front page reads: “You can help identify NIST Museum artifacts! Sometimes we have only minimal information about the artifacts we receive for the NIST Museum. Please browse these artifacts and tell us if you have more information on any of them. With your help, we can enhance the NIST Museum experience!”
Used custom search URL with “crowdsource” in a hidden metadata field.
Some have become demystified – we have the title, even a few references. But we’re still seeking information about how they were used at NIST.
Response
After the Wired.com article, response really took off.
Set up a dedicated e-mail address to receive responses: NDA@nist.gov
60 emails in first week, continuing to come in each week.
Some very useful.
Some not as useful.
Other website’s comments section.
Google Translation: “Raking piles in the attic, museum workers of the National Institute of Standards and Technology (NIST) stumble under an hour on a very sophisticated instruments and mechanisms. Lost documentation and manuals plunged scientists in gloom, because the purpose of these devices often do not identify at first glance (and it often happens that in the second). That's why they turn to you for help in identifying every piece of iron dust.”
Found a similar drafting instrument and described how part of the mechanism moves and is used.
Other website comments section. (Need to have mht files in same folder.)
Metafilter
Originally we had the idea of using our internal “crowd” of NIST employees. In fact, the first object we identified was thanks to the help of a NIST employee who attended our Open House when we debuted the NDA.
Since then we have continued to call on employees through our in-house blog, NewsCenter, and more recently have called on NIST’s Standards Alumni Association. They gave us a lead on the “box with one dial”.
Pictured: James Schooley, president of SAA. Authored NIST history book “Responding to National Needs: The NBS becomes NIST”
MDC says: “actually we tried something before crowd scourcing became aan in term we took aobut 20 objects and put an exhibit in the halll and a book for NIST employees to ID and make comments it was fairly successful and an early version of crowd sourcing.”
PHOTO CREDIT: Chris Rossi/The Gazette
Retired NIST scientist Jim Schooley looks through one of the museum storage cages in the agency’s basement in Gaithersburg. To help identify items stored in the basement, NIST has launched an online digital archive of artifacts and is working with retired employees to identify where and when each object was used.
http://www.gazette.net/article/20110713/NEWS/707139400/1020/nist-enlists-retired-scientists-to-help-identify-artifacts&template=gazette
Visits occur when some remote site makes a request for a page on your server for the first time. As long as the same site keeps making requests within a given timeout period, they will all be considered part of the same visit.
Hosts is the number of unique IP addresses/hostnames that made requests to the server. Care should be taken when using this metric for anything other than that. Many users can appear to come from a single site, and they can also appear to come from many IP addresses, so it should be used simply as a rough gauge as to the number of visitors to your server.
4/12/2011 - TechBeat article
4/19/2011 - Wired.com
6/15/2011 - Smithsonian
7/13/2011 - Gazette
Late July - radio interview
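The visit and host definitions above can be made concrete with a short sketch. This is an illustration only, not the statistics package that produced the numbers on the usage slide: the function name, the sample requests, and the 30-minute timeout are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def count_visits_and_hosts(requests, timeout=timedelta(minutes=30)):
    """Count visits and unique hosts from (host, timestamp) pairs.

    A request starts a new visit when its host has not been seen before,
    or when more than `timeout` has passed since that host's last request.
    The 30-minute default is an assumed value for illustration.
    """
    last_seen = {}  # host -> time of that host's most recent request
    visits = 0
    for host, when in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(host)
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[host] = when
    return visits, len(last_seen)


# Hypothetical sample log: two hosts, one of which returns after a long gap.
sample = [
    ("10.0.0.1", datetime(2011, 4, 12, 9, 0)),
    ("10.0.0.1", datetime(2011, 4, 12, 9, 10)),  # within timeout: same visit
    ("10.0.0.1", datetime(2011, 4, 12, 11, 0)),  # after timeout: new visit
    ("10.0.0.2", datetime(2011, 4, 12, 9, 5)),
]

visits, hosts = count_visits_and_hosts(sample)
print(visits, hosts)  # 3 2
```

Because hosts are counted by address, several people behind one shared address collapse into a single host, which is why the host figure is only a rough gauge of visitor numbers.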