Preserving the Smithsonian Institution’s Web Presence

Presentation delivered by Lynda Schmitz Fuhrig, Electronic Archivist, and Jennifer Wright, Archivist, for the Smithsonian Institution Archives, at the Smithsonian Archives Fair on October 14, 2011 in Washington, DC. Although it first began capturing institutional websites in the late 1990s, the Smithsonian Institution Archives initiated a project in 2009 that uses the Heritrix crawler to capture the explosion of public websites and social media instances maintained by its many museums, research centers, and programs. This presentation reviews appraisal, accessioning, and capture issues in documenting the Smithsonian’s web presence in the early 21st century.

Preserving the Smithsonian Institution’s Web Presence
Lynda Schmitz Fuhrig and Jennifer Wright
Smithsonian Institution Archives Fair
Oct. 14, 2011

The Mission of SI Archives
• Appraise, acquire, and preserve the records of the Institution
• Offer a range of research and reference services
• Establish policy and provide expert guidance on record keeping practices
• Create and promote products and services that broaden understanding of the Smithsonian
• Provide professional archival and conservation expertise

Website and Social Media Registry
• A “record” is any official recorded information, regardless of medium or characteristics, created, received, and maintained by a Smithsonian museum, office, or employee
• Websites and social media accounts must be managed as records
• Registry allows staff from across the Smithsonian to add and update information about all of their websites and social media accounts
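
As a rough illustration of what one registry entry might hold, the Python sketch below models a site or account as a record that staff can add or update. The field names (`unit`, `url`, `kind`, `contact`, `last_updated`) and the `Registry` class are assumptions made for this example, not the Smithsonian's actual registry schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class RegistryEntry:
    """One website or social media account (hypothetical fields)."""
    unit: str            # museum, office, or program that maintains it
    url: str             # address of the site or account
    kind: str            # "website" or "social_media"
    contact: str         # staff member responsible for the entry
    last_updated: date   # when staff last confirmed the details


@dataclass
class Registry:
    """In-memory registry keyed by URL (illustrative only)."""
    entries: dict = field(default_factory=dict)

    def add_or_update(self, entry: RegistryEntry) -> None:
        # Adding an existing URL overwrites the earlier entry.
        self.entries[entry.url] = entry

    def stale(self, today: date, max_age_days: int = 365) -> list:
        # Entries nobody has confirmed within max_age_days.
        return [e for e in self.entries.values()
                if (today - e.last_updated).days > max_age_days]
```

A `stale` check of this kind is one possible way to flag entries that need a reminder; the one-year default is arbitrary.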

Appraising Records
• All records must be appraised to determine their ultimate disposition
• Records are appraised based on administrative, legal, historical, and research value
• Records with long-term value are transferred to the Archives

Appraising Traditional Websites
Websites are the public face of the Smithsonian
• Significant historical and research value
• Constantly changing
• Crawl annually and before and after major redesigns
• Work with webmasters to determine if crawls should be more or less frequent
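
The crawl rule above (annually, plus before and after major redesigns) could be sketched in Python roughly as follows. The function name `crawl_dates`, the one-week margin around a redesign, and the adjustable `interval_days` are all assumptions for illustration; the slides do not specify how crawl dates are actually chosen.

```python
from datetime import date, timedelta


def crawl_dates(last_crawl, redesigns, interval_days=365, margin_days=7):
    """Return sorted crawl dates: one routine crawl plus one before and
    one after each planned redesign (all parameters are assumptions)."""
    # Routine crawl; a webmaster might ask for a shorter or longer interval.
    dates = {last_crawl + timedelta(days=interval_days)}
    for launch in redesigns:
        dates.add(launch - timedelta(days=margin_days))  # capture old design
        dates.add(launch + timedelta(days=margin_days))  # capture new design
    return sorted(dates)


# Hypothetical site last crawled in January with a redesign planned for June.
for d in crawl_dates(date(2011, 1, 10), [date(2011, 6, 15)]):
    print(d.isoformat())
```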

Appraising Social Media Accounts
All social media accounts are used differently
• Each account appraised individually based on content
• Accounts containing significant original content will be fully captured each year
• Accounts consisting mostly of links to other resources will be captured occasionally to document existence
• Method and frequency of capture may depend on terms of service and ability to avoid capturing non-Smithsonian content
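
One way to picture the appraisal rule above is as a small decision function. This is only a sketch: the `original_share` estimate, the 0.5 cut-off, and the `tos_allows_crawl` flag are hypothetical inputs invented for the example, and in practice the appraisal is a judgment made by an archivist for each account.

```python
def capture_plan(original_share, tos_allows_crawl):
    """Suggest a capture plan for one social media account.

    original_share: appraiser's estimate (0.0 to 1.0) of how much of the
    account is original content. The 0.5 cut-off is an arbitrary assumption.
    tos_allows_crawl: whether the platform's terms of service permit crawling.
    """
    if original_share >= 0.5:
        scope = "full capture each year"
    else:
        scope = "occasional capture to document existence"
    # Terms of service may rule out crawling; the alternative is left open.
    method = "crawl" if tos_allows_crawl else "review terms of service first"
    return {"scope": scope, "method": method}
```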

Past Web Archiving Procedures
• Files transferred from the Smithsonian’s IT office
• HTTrack web crawler
• Scripts used to create XHTML preservation files, but the process was very manual and time-consuming

Heritrix
• Archival web crawler
• Open source
• Java
• Developed by the Internet Archive, the National Library of Norway, and the National and University Library of Iceland

WARC
WARC – Web ARChive file format
• International standard – ISO 28500:2009
• Extension of the ARC format in use since 1996
• Container format
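
To make "container format" concrete, the Python sketch below writes and reads back a deliberately simplified WARC file: each record is a block of named header fields, a blank line, the captured content, and two trailing line breaks, and records are simply concatenated. The helper names `warc_record` and `read_records` are made up for this example, and the sketch handles only uncompressed `resource` records; real files written by a crawler such as Heritrix contain further record types and are usually gzip-compressed.

```python
import uuid
from datetime import datetime, timezone


def warc_record(target_uri, payload, content_type="text/html"):
    """Serialize one simplified WARC 'resource' record as bytes."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {now}",
        f"WARC-Target-URI: {target_uri}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # A record ends with two CRLFs after the content block.
    return head + payload + b"\r\n\r\n"


def read_records(data):
    """Yield (header_fields, payload) for each record in a WARC byte string."""
    pos = 0
    while pos < len(data):
        head_end = data.index(b"\r\n\r\n", pos)
        lines = data[pos:head_end].decode("utf-8").split("\r\n")
        fields = dict(line.split(": ", 1) for line in lines[1:])
        start = head_end + 4
        length = int(fields["Content-Length"])
        yield fields, data[start:start + length]
        pos = start + length + 4  # skip the trailing CRLF CRLF


# Two captures stored back to back in one container.
archive = (warc_record("http://example.org/", b"<html>home</html>")
           + warc_record("http://example.org/notes.txt", b"notes", "text/plain"))
for fields, body in read_records(archive):
    print(fields["WARC-Target-URI"], len(body))
```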

STRI website in 1995
SIA Accession 05-032

Social Media
• Third-party issues
• Privacy concerns
• Different tools

Lessons Learned
• In-house archiving takes time
• No one-size-fits-all solution
• Master site registry requires regular updating

Contacts and Resources
Lynda Schmitz Fuhrig
Digital Services Division
schmitzfuhrigl@si.edu

Jennifer Wright
Archives and Information Management Team
wrightjm@si.edu

Smithsonian Institution Archives website:
http://siarchives.si.edu


Slides with titles only (image content not reproduced):
3. Smithsonian’s First Home Page, 1995
4. The Smithsonian Today
12. Crawling in Heritrix
16. Viewing a Crawl
17. More To Do
