METADATA MIGRATION AND
INTEGRATION AT DUMBARTON OAKS
Shalimar Fojas White
Image Collections and Fieldwork Archives
Dumbarton Oaks Research Library and Collection
Visual Resources Association Annual Conference
March 13, 2014, 1:35 PM
IMAGE COLLECTIONS and FIELDWORK ARCHIVES
(ICFA)
PHOTOGRAPH COLLECTIONS
FIELDWORK ARCHIVES
ARCHIVAL FINDING AIDS
IMAGE CATALOGING RECORDS
CATALOGING INVENTORY:
•EmbARK – 61,000 image records
•OLIVIA – 5,000 image records
•MS Access – 5,000 image records
•Word/PDF – 15 archival finding aids
METHODOLOGY
Objective evaluation
Team approach (user-defined)
Peer reviews and resources:
• Council of Nova Scotia Archives: AMS Review (2008)
• CLIR Report by Lisa Spiro (2009)
• Museum Association of New York CMS list (2011)
• Canadian Heritage Information Network (CHIN) guidance and template
SYSTEM REQUIREMENTS
• Handle archival and image descriptions
• Unlimited hierarchical relationships
• Afford complex geographic cataloging
• Item-level storage locations
• Global search
• Local authorities
• Data reuse
• Accessioning
• Import for legacy metadata migration
• Exports for Hollis, OASIS and VIA
SYSTEM EVALUATION
[Slide graphics: field mappings from MS Access, EmbARK, and OLIVIA/VIA to VRA Core; crosswalk columns for AtoM Qubit, Dublin Core, AtoM display labels, and VRA Core]
ANALYSIS AND PATTERN DETECTION
REMOVING EXTRA CHARACTERS
GOOGLE REFINE – Faceting and Clustering
GOOGLE REFINE – Transformations and Exporting
USABILITY TESTING AND HELP DOCUMENTATION
SPRING 2014
- Complete VRA Core Template Development
- Import Accession Records
SUMMER 2014
- Test Imports of Image Records
- Test Uploads of Digital Objects
FALL 2014
- Import Backlog Records and Images
- Contribute Collection-Level Records to HOLLIS, WorldCat, etc.
THE ICFA TEAM:
• Anne-Marie Viola, Metadata and Cataloging Specialist
• Rona Razon, Archivist
• Fani Gargova, Byzantine Research Associate
• Beth Bayley, Jessica Cebra, and Ameena Mohammad, Departmental and Archival Assistants
• Shalimar Fojas White, Manager
SPECIAL THANKS TO:
• Prathmesh Mengane, Database and CMS Specialist
• Alison Miner, ICFA Intern (Fall 2012)
• Artefactual Systems, Inc.

VRA 2014 Back to Basics, Fojas White

Editor's notes

  1. Dumbarton Oaks is a research institute of Harvard University, dedicated to supporting scholarship in three very distinct fields of study, Byzantine, Pre-Columbian, and Garden and Landscape Studies.
  2. My department, the Image Collections and Fieldwork Archives (ICFA), has a bifurcated name that reflects its history as two separate departments that were brought together in the 1990s.
  3. However, for most of the fifty years since Dumbarton Oaks’ founding in 1940, the Photograph Collection was distinct in organization, purpose, and physical location
  4. from the Fieldwork Archives, which collected papers related to the archaeological and conservation projects sponsored by Dumbarton Oaks and the Byzantine Institute.
  5. When I arrived at Dumbarton Oaks three years ago, this separate history was reflected in how the collections had been cataloged.
  6. The archival collections had finding aids as Word documents or PDFs and were described hierarchically.
  7. The image collections had item-level cataloging records, which could be found in several legacy systems. There was no way to search across all our holdings, which was particularly problematic given the nature of our collections – the fieldwork photography had been separated from the fieldwork archives, thereby losing valuable context.
  8. Since the rest of our half a million images had never been fully inventoried, we needed to establish full intellectual and physical control over our holdings. We embarked upon a comprehensive inventory of all the collections, which also included an inventory of all existing cataloging.
  9. In addition to rounding up all versions of the finding aids, we also turned up multiple legacy datasets for the image collections: EmbARK, OLIVIA/VIA, and Access databases, for a total of 71,000 image records. How we are attempting to create an aggregated data repository for all these datasets will be the subject of my talk today, but first, a
  10. Full disclosure, I am not a cataloger, nor have I played one on TV. The work I will describe has been spearheaded by Anne-Marie Viola, our Metadata and Cataloging Specialist, whom I immediately set out to hire once it became clear that for our inventory to work
  11. We needed an integrated system for collection management. Hired in January 2012, Anne-Marie dove into the system selection process with verve.
  12. She developed a methodology based on objective evaluation of systems against a set of user-defined criteria developed by our interdisciplinary team.
  13. We held several requirements-gathering sessions with our team to compile all of the necessary features of our ideal system. Our Archivist needed to be able to continue creating finding aids for our archival collections, with multiple levels of description, from the collection level, subgroups, series, subseries, down to the folder level. Our Byzantine Research Associate insisted on the necessity of browsability by place, the primary method by which scholars had traditionally used our collections, which are arranged geographically by medium. As the manager, I needed a way to record and track accessions, donors, and rights/restrictions, since our predecessors had only left us paper accession logs and records. And of course, there were those 71,000 legacy image cataloging records that we needed to bring together. Eventually, the requirements were whittled down to the high-priority deal-breakers.
  14. Anne-Marie reviewed approximately 20 systems, both proprietary and open source. She evaluated them against our requirements list, using the CHIN evaluation template. Eventually, we got it down to 4 candidate systems, two proprietary and two open source, and conducted demonstrations, peer reviews, and reference interviews. We settled on ICA-AtoM.
  15. Which stands for International Council on Archives Access to Memory, an open source archival collection management software.
  16. The system is based on standards developed and promulgated by the International Council on Archives (ICA) with initial funding from UNESCO. We selected the system for two main reasons. The first was flexibility - AtoM allows you to catalog in a number of different record templates: ISAD(G), MODS, RAD, and Dublin Core.
  17. Since it was open source, this would allow us to develop an additional cataloging template for VRA Core. Also, once data was entered into the system, we could get it out as XML, such as EAD XML, so that we could contribute our finding aids and records to other distribution channels like Harvard’s OASIS, WorldCat, and ArchiveGrid. Also, since it was open source, there was no ongoing licensing fee and we could devote our limited funds towards development of additional functionality. Simply put, we were priced out of the proprietary systems.
  18. Once we selected the system, we followed two parallel routes for feature development. We took on the development of the VRA Core cataloging template in house, since Prathmesh Mengane, our Database and CMS Specialist, was confident that he could work from the existing Dublin Core plugin to adapt it to VRA Core.
  19. For more complex functionality related to geographic description, we sponsored development through Artefactual Systems, the lead developers of AtoM and Archivematica, the digital preservation system. This team of archivists and programmers recognized the benefit of developing a browsable vocabulary of geographic terms for their larger user community and agreed to take on the project. So, we put our small budget for the CMS toward this sponsored development in a forthcoming version of AtoM. In the meantime, we focused on the in-house development…
  20. To guide Prathmesh’s development of the VRA Core cataloging template, Anne-Marie conducted an analysis of all extant cataloging in our legacy datasets.
  21. She evaluated all the fields used in EmbARK, OLIVIA, and the various Access databases, and mapped them to VRA Core’s elements.
  22. Then, she developed a crosswalk
  23. Between Qubit, the backend database for AtoM, the existing fields in the Dublin Core template, the VRA Core elements, and the new AtoM display labels we desired. Where elements did not map, Prathmesh would have to create new fields for our new VRA Core template.
  24. Finally, Anne-Marie created a Metadata Application Profile
  25. A reference document for Prathmesh that specified the encoding scheme, elements, attributes, and labels that will be used in the AtoM collection management system, along with the type of field required, whether free text, a controlled list, or linked to an existing taxonomy in AtoM (Names, Places, or Subjects).
  26. While Prathmesh got started with the development on a staging environment, Anne-Marie started to export datasets from the legacy systems. In some cases, this was relatively straightforward, as was the case with Access and EmbARK, which had a CSV export.
  27. Other cases, such as OLIVIA, required the assistance of technicians at Harvard’s Digital Repository System, since exporting data from this complex hierarchical system could only be achieved piecemeal, table by table. In the end, Anne-Marie had to reconstruct the original dataset from several spreadsheets. This points to another benefit of AtoM, the ability to get our data back out as XML. For those of you currently in the process of selecting a system, remember that you will always need an exit strategy. Any system, no matter how perfect now, will be obsolete within 5-7 years, so plan accordingly and make sure that you can get your data out for reuse and migration into another system.
  28. Next comes what I like to call data hygiene, otherwise known as data cleanup. Now, just because it was easy to export a dataset did not mean that data cleanup was equally straightforward.
  29. In the case of EmbARK, the system added additional characters that would cause column shifting and line breaking, which could be detected by scanning the dataset in Excel.
  30. Once we discovered the nature of the extra characters and the places where EmbARK would typically add them, Anne-Marie developed a method of globally replacing them in BBEdit, an HTML and text editor. Since most of the legacy cataloging had been done on a project basis, Anne-Marie could export discrete sets of data that were more manageable than the 71,000 total records.
  31. At this point, Anne-Marie and Fani reviewed the data sets to develop the work records that would be used, since they didn’t exist for the legacy records. We started with our fieldwork photography, which concentrated on specific Byzantine sites or monuments. This work helped Anne-Marie to plan out the next step of
  32. Data Quality Assurance
  33. So that the detailed quality control could really begin. We all know what messy and dirty data looks like, but in some cases, it amazed even us. Many of these datasets had been entered over the course of 20 years with no discernible common standard applied across them. In one dataset I worked on, describing 3,200 architectural drawings, the name of the department at the time, “Fieldwork Archive, Dumbarton Oaks,” was entered at least 48 different ways. Many fields were inconsistently used, with the Site field containing values representing varying levels of specificity from regional terms like “Asia Minor” to countries “Turkey” to cities “Istanbul” and even generic terms like “church.”
  34. Thankfully, one of our interns, Alison Miner, introduced us to the tool we are now using for most of our data standardization, OpenRefine.
  35. Previously known as Google Refine, the tool is something that you download and use within a browser to manipulate your datasets. You can facet any column and see all the values entered into that field, along with the numerical distribution, so that you can easily see outliers and clean up typos and variants in capitalization and spacing. OpenRefine also has a very powerful clustering tool that can suggest matches between words that are spelled similarly, sound alike, or might have been mistakenly keyed in.
  36. There are also common transformations that you can perform to remove leading and trailing spaces or globally change the case for your values. The best part? Everything is reversible and your original dataset has not been touched. Once you are done making your changes, you can simply export a fully cleaned version of your dataset as a CSV file, ready for additional review.
  37. The final step is another review with our Byzantine Subject Specialist to ensure that the values used not only conform to existing vocabularies, such as the Getty’s AAT and TGN, but also to accepted norms within the field of Byzantine studies. Since we serve such a specialized audience, we have to be fairly precise about things like using naos instead of nave, or providing alternative titles for monuments like Hagia Sophia, also variously known as Ayia Sofia, Santa Sofia, Saint Sophia, which is located in Istanbul / Constantinople. In certain cases, our local preferred term may diverge from those used by the Library of Congress or other authorities, but we always provide the alternate terms, as well. This essentially serves as a reality check for us librarians and archivists, which points to the benefit of a multidisciplinary team like ours in ICFA where we can incorporate all the perspectives to improve our users’ experience.
  38. In fact, while Anne-Marie was busy with the data cleanup, the rest of us were simultaneously busy populating the CMS.
  39. Fani was working on expanding and refining our taxonomy of geographic terms, adding URIs for matching concepts in vocabularies like TGN and Pleiades. The goal is to get this taxonomy imported into AtoM as SKOS XML. Wherever possible, we included URIs from Linked Data sets, since in the future, we will be able to export our authorities from AtoM as SKOS and hopefully be able to contribute to Linked Open Data repositories like those of the Getty, Pleiades, etc.
  40. We also developed over 400 authority records, all of which conform to the EAC-CPF standard (Encoded Archival Context – Corporate Bodies, Persons, Families). As with the other taxonomies, we tried as much as we could to match to existing records in vocabularies like VIAF, LCNAF, and ULAN. The AtoM authority records can be exported as EAC XML, which we also hope to eventually contribute to the National Archival Authorities Cooperative (or NAAC), which built off of the SNAC project (Social Networks and Archival Context) to use Linked Data and data visualizations to recreate historical social networks from archival records.
  41. Meanwhile, our Archivist, part-time staff, and interns were converting our existing Word finding aids into AtoM records.
  42. This was primarily accomplished by data entry, cutting and pasting text, but in the future, we plan to accomplish much of our pre-processing of archival collections through the system itself. The numbers of eyes and hands doing the data entry necessitated the development of two key documents, a finding aid Style Guide and a Workflow for the CMS.
  43. We planned for a soft launch of AtoM@DO in December 2013, making it available within the Dumbarton Oaks community only.
  44. This allowed us to conduct 2 weeks of usability testing, where we identified problems and addressed them before our public launch in February 2014. We also developed help documentation tailored to the issues brought up by our testers – who were rewarded with Starbucks gift cards for their time and input.
  45. Our February hard launch consisted of all of our existing archival finding aids, representing approximately 40 collections. Some of these are more robustly described than others, which have minimal collection-level records. However, we are still processing archival collections as we speak, so new content will be added going forward.
  46. But, more importantly, we are in the final stages of Prathmesh’s development of the VRA Core cataloging template.
  47. As of last week, we were down to three fields! The complications have mostly involved linking certain fields to the taxonomies, which we believe are crucial to the success of the venture. Our users’ ability to browse by Place and locate records by their creators’ Names will be vitally important to the success of AtoM@DO as an integrated data repository for our blended photograph and archival collections.
  48. So, next steps. Once the VRA Core template is completed, we plan to start test imports of Anne-Marie’s cleaned up datasets, which have been waiting in the wings, ready to go, along with their associated digital images.
  49. We hope to start importing VRA Core compliant image records into AtoM by this summer and to work steadily through the backlog of 71,000 records through the remainder of the year. I’ve also finally been able to OCR and transcribe our paper accession records and we plan to start importing those into AtoM’s accessions module in the coming months.
  50. As the manager of this complex, but infinitely fascinating, collection of photographs and archival records, one of the main selling points of AtoM was the fact that we could combine the collection management system that we so desperately needed with a discovery tool to enable our users to access our data online, even before we had gained full intellectual and physical control over our holdings.
  51. AtoM@DO is a work in progress, but is flexible enough to allow us to provide access while we rationalize collection management behind the scenes. For now, our users will only see our pretty records in AtoM@DO, without any idea of the messy paper records, multitude of old finding aids and inventories, small menagerie of legacy systems, and behemoth backlog of dirty data.
  52. Thankfully, with a great team in place, ICFA can go back to basics and concentrate on reuniting our separated collections through metadata, using the legacy that our predecessors have left us as a foundation from which to move forward into the future.
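
The cleanup steps described in notes 29 to 36 (removing stray characters, trimming whitespace, and clustering variant spellings of the same value) can be sketched roughly in Python. This is only an illustration, not the ICFA workflow itself, which used BBEdit and OpenRefine. The `fingerprint` function is loosely modeled on the key-collision "fingerprint" method that OpenRefine documents, and the function names and sample values are invented for this sketch.

```python
import re
import unicodedata
from collections import defaultdict


def strip_stray_characters(value: str) -> str:
    """Turn control characters (tabs, line breaks) into spaces, then trim."""
    cleaned = "".join(" " if unicodedata.category(ch) == "Cc" else ch for ch in value)
    return re.sub(r"\s+", " ", cleaned).strip()


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-identical spellings collide."""
    text = strip_stray_characters(value).lower()
    # Drop accents by decomposing and discarding combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove punctuation, then sort and de-duplicate the remaining tokens.
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group raw values by fingerprint; keep only groups with several variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical variants, invented for illustration only.
samples = [
    "Fieldwork Archive, Dumbarton Oaks",
    "fieldwork archive,  Dumbarton Oaks ",
    "Dumbarton Oaks, Fieldwork Archive",
    "Fieldwork Archive\tDumbarton Oaks",
    "Istanbul",
]
clusters = cluster(samples)
for key, variants in clusters.items():
    print(key, "<-", variants)
```

A reviewer would still choose the preferred form for each cluster by hand. A key collision only suggests that values may be the same, which is why the notes describe a further pass by a subject specialist.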