The Web is a Mess
How I learned to stop
worrying and love web
archiving
We are a Digital Library
Mission Statement: Universal access to all knowledge
o Founded by Brewster Kahle in San Francisco,
California in 1996
o Officially designated a Library by the State of California
in 2007
About Internet Archive
http://flickr.com/photos/marfis75/
500,000
500,000
1,000,000
2,000,000
3,600,000
Books
Moving Images
Audio Recordings
Hours of TV
eBooks
The Archive is accessible to the public via
the website: www.archive.org
o Started collecting content in 1996
o First web pages publicly available in 2001
o 347+ billion web pages
o 200+ million websites
o Almost every domain
o Content in 140+ Languages
o Collect a broad summary of the web every 30-60
days - approximately 10 billion pages per
snapshot
Access to General Web Archive
Access to General Archive
What is Web Archiving?
Web archiving is the process of
collecting portions of web content,
preserving the collections, and then
providing access to the archives - for
use and reuse.
A web archive is a collection of archived
URLs grouped by theme, event, subject area,
or web address.
A web archive contains as much as possible
from the original resources and documents
the change over time. It is a priority to
recreate the same experience a user would
have had if they had visited the live site on
the day it was archived.
What is a Web Archive?
Who is archiving the web?
Why are We Doing This?
• Web archives preserve the web. They act as the web
equivalent of the archive or library. In this role, their
mission is to acquire and preserve the web, ensuring
its continued survival for future generations.
• Billions of people around the world have grown
accustomed to using the web as their primary
resource to acquire information.
• The availability of this electronic information is
taken for granted and it is a fallacy that if something
is on the web it will be there forever.
• There’s an essential need for people to understand
that the web represents who we are. It’s our culture
and our social fabric, and we don’t want to lose it.
Why should we archive
the web?
How long does a website live?
• A 1997 report in Scientific
American claims 44 days.
• A subsequent 2001 academic
study in IEEE suggests 75 days.
• A 2003 Washington Post
article indicates the number is
100 days.
• A 2013 study by Old Dominion
University says that after the
first year of publishing, nearly
11% of social media will be lost
and after that we will continue to
lose 0.02% per day
How long does a website live?
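The Old Dominion figures above can be turned into a rough estimate. The following is a minimal sketch, not the study's own model: it assumes the 0.02% per day is additive (percentage points on top of the first-year 11%), and the function name and constants are illustrative only.

```python
FIRST_YEAR_LOSS = 0.11      # "nearly 11%" lost after the first year (from the slide)
DAILY_LOSS_AFTER = 0.0002   # 0.02% per day after that (from the slide)


def estimated_loss_fraction(days_since_publication):
    """Estimated fraction of shared resources lost, for day 365 onwards.

    Assumption: the daily 0.02% is additive (percentage points), capped at 100%.
    """
    if days_since_publication < 365:
        raise ValueError("the quoted figures only cover day 365 onwards")
    extra_days = days_since_publication - 365
    return min(1.0, FIRST_YEAR_LOSS + DAILY_LOSS_AFTER * extra_days)
```

Under these assumptions, two years after publication gives 0.11 + 0.0002 × 365 = 0.183, or roughly 18% lost.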
• Create a thematic/topical web archive on a specific subject
• Capture ‘at risk’ content during a spontaneous event
• Fulfill organizational mandate to preserve institutional memory &
history
• Archive state/local agency publications no longer deposited
in print form
• Archive records to meet university and/or government retention
policies.
• Collect content to act as a research service for scholars to turn to
• Capture social media sites as part of organizational records
• Collect web-based information to augment physical holdings.
• Archive online art ephemera
• End of Life/Closure
Web Archiving Use Cases
How does web archiving
work?
What is a crawler?
A crawler is the
software that captures
and archives web
pages. A crawler visits a
page and indexes the
content included
therein
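As a rough illustration of what "visits a page and indexes the content" can mean, here is a minimal sketch of the capture step for a single page. It is not the Internet Archive's crawler; the class and function names are hypothetical, and the page is passed in as a string so the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href>, <link href>, <img src> and <script src>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Map each tag of interest to the attribute that holds its URL.
        wanted = {"a": "href", "link": "href", "img": "src", "script": "src"}
        attr_name = wanted.get(tag)
        if attr_name is None:
            return
        for name, value in attrs:
            if name == attr_name and value:
                # Resolve relative references against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def capture(url, html):
    """Return a record of one visited page: its URL, raw content and outgoing links."""
    parser = LinkExtractor(url)
    parser.feed(html)
    return {"url": url, "content": html, "links": parser.links}
```

A real crawler would fetch the HTML over HTTP and store the response in an archival format; this sketch only shows how the outgoing links that drive the next visits might be discovered.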
Some technical challenges in
capturing content
• Technical: dynamic content relies on
scripting technologies (Flash and
JavaScript). The web is a hodgepodge
of technologies, some old and
outdated, others at the cutting edge.
• Capturing social media sites has
become necessary as the web moves
away from HTML and towards
applications
• Explore capture mechanisms other
than a traditional crawler: hybrid
architectures, APIs, headless browsers
http://www.chaitalag.com/new/s/tubig
http://www.helenbrowngroup.com/2011/02/rescue-from-the-digital-firehose/gushing-firehose-by-joseph-robertson/
Amount of
content that is
being archived
Amount of data being
created by content
providers
Challenge: a lot of data
Challenge: How much to archive?
There Are Limits…
Challenge: What to archive?
…What is important to you? What do you
want people to know about? What are your
organization’s collecting activities? Vision?
Participant Poll
• Does any of this make
any sense?
Managing Collections
Starting a Collection
Collection: A group of
URLs crawled and
organized around a
common theme, topic or
domain
Ask Yourself:
• What is the topic of this collection?
• What websites would you like to archive as
part of this collection?
Collections Start with Seeds
• Seed: starting point URL
for the crawler. The crawler
will follow linked pages
from your seed URL and
archive them if they are ‘in
scope’.
• Document: any file with a
distinct URL (html, image,
PDF, video, etc).
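The seed and scope idea can be sketched as a breadth-first traversal. This is a minimal sketch under stated assumptions: the scoping rule (same host as the seed, path under the seed's path) is only one possible definition of "in scope", and `in_scope`, `crawl` and `get_links` are hypothetical names, not part of any real crawler's API.

```python
from collections import deque
from urllib.parse import urlparse


def in_scope(url, seed):
    """Assumed scoping rule: same host as the seed, and path under the seed's path."""
    u, s = urlparse(url), urlparse(seed)
    return u.netloc == s.netloc and u.path.startswith(s.path)


def crawl(seed, get_links, max_documents=100):
    """Breadth-first crawl from a seed; get_links(url) returns the URLs linked from url."""
    archived, queue, seen = [], deque([seed]), {seed}
    while queue and len(archived) < max_documents:
        url = queue.popleft()
        archived.append(url)  # each distinct URL counts as one document
        for link in get_links(url):
            if link not in seen and in_scope(link, seed):
                seen.add(link)
                queue.append(link)
    return archived
```

For example, with a seed of `http://example.org/blog/`, a link to `http://example.org/blog/post1` would be followed, while links to `http://example.org/about` or to another host would be skipped under this rule.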
Some of our Partners’
Digital Collections
• Stanford University (Palo Alto, California)
• American University in Cairo
• Biblioteca Nacional de España
Stanford University,
Islamic & Middle Eastern Collection
Use Case: harvest and preserve Iranian
Blogs
• Archiving over 300 blogs written by and for
Iran and the Iranian people
• Includes coverage of 2009 Iranian elections
and the current Middle East unrest
Stanford and New York Universities
Islamic and Middle Eastern Collection
American University in Cairo
Use Case: The American University in Cairo
Web Archive collects, preserves, and
provides access to the web content
published by students, faculty, departments,
and offices at AUC. The archive also collects
Web documents that have long-term
research or historical value.
January 25th Revolution and
University on the Square
Demonstrators in Tahrir Square.
Image courtesy of Ahmad and the American University in Cairo Rare Books and Special Collections Library.
Archivist Driven Captures
Thank you to Egypt's youth and Facebook.
Image courtesy of Martin and Amy Rowe and the American University in Cairo Rare Books and Special Collections Library.
Patron Driven Captures
Screenshot of the University on the Square Contribution form.
In addition to soliciting photos and videos, we asked content providers to
contribute websites, blogs, Twitter feeds, etc.
Archivist as Advocate
Protester documenting the demonstrations in Tahrir Square.
Image courtesy of Robeir Rasmy and the American University in Cairo Rare Books and Special Collections Library.
Breaking down the life cycle
• One of its top priorities as a memory institution is to
consolidate the strategies that lead to the complete
preservation of content published on the Spanish Internet, in
accordance with the library's mission as keeper and
disseminator of Spanish culture.
• Commitment to its patrons, who expect the web archive to
become a publicly and freely accessible key information
source for the study of the 21st century.
Biblioteca Nacional de España
Breaking down the life cycle
Use cases:
• 2011 Election crawl
• 2012 Humanities crawl
• 2009-present .es domain crawls
• 2013 .es Broad Survey Crawl, which visited the top-level page of
every website registered under .es (in partnership with Red.es)
• 2011-2013 Thematic curation (World Cups, Olympics, Global
Hunger)
Biblioteca Nacional de España
http://www.udatleticoisleño.es
http://www.facebook.com/eajpnv
http://twitter.com/xalmar
• Archived web page from
Facebook and/or Flickr
http://es.wikipedia.org/wiki/Partido_Pirata_(España)
http://www.estrelladigital.es
http://leer.es
http://iuabierta.blogspot.com
Not available on the live web
http://www.piratamadrid.es
Not available on the live web
Making sense of it all
• Web Archiving Life Cycle Model
• Internet Archive future objectives
– Social Media
– Distributed Content
– Visualization and analytical tools for more
useful interaction
– Search
– Mobile platforms
– Enhanced Researcher Access
Web Archiving Life Cycle Model
Web Archiving Life Cycle Model white paper available: http://www.archive-it.org/publications
Breaking down the life cycle
Outer layer:
• Vision and Objectives
• Resources and Workflow
• Access / Use / Reuse
• Preservation
• Risk Management
Inner Circle:
• Appraisal and Selection
• Scoping
• Data Capture
• Storage and Organization
• Quality Assurance and Analysis
Breaking down the life cycle
Participant Poll
• Are you confused yet?
I hope not. Happy to
answer questions!
The importance of web archiving
“As our digital world continues to grow at a
breathtaking pace and more and more of our daily
life occurs within its digital boundaries, we must
ensure that web archives are there to preserve our
collective global consciousness for future
generations”
Kalev H. Leetaru, University of Illinois
Kristine Hanna,
Director, Archiving Services
Internet Archive
kristine@archive.org
Thank you!
The web is a mess: how I learnt to stop worrying and love web archiving. Kristine Hanna

What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 

Dernier (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 

The web is a mess: how I learnt to stop worrying and love web archiving. Kristine Hanna

  • 1. The Web is a Mess How I learned to stop worrying and love web archiving
  • 2. About Internet Archive We are a Digital Library Mission Statement: Universal access to all knowledge • Founded by Brewster Kahle in San Francisco, California in 1996 • Officially designated a Library by the State of California in 2007
  • 9. Access to General Web Archive The Archive is accessible to the public via the website: www.archive.org • Started collecting content in 1996 • First web pages publicly available in 2001 • 347+ billion web pages • 200+ million websites • Almost every domain • Content in 140+ languages • Collect a broad summary of the web every 30-60 days, approximately 10 billion pages per snapshot
  • 10. What is Web Archiving? Web archiving is the process of collecting portions of web content, preserving the collections, and then providing access to the archives for use and reuse.
  • 11. What is a Web Archive? A web archive is a collection of archived URLs grouped by theme, event, subject area, or web address. A web archive contains as much as possible from the original resources and documents the change over time. It is a priority to recreate the same experience a user would have had if they had visited the live site on the day it was archived.
  • 12. Who is archiving the web?
  • 13. Why are We Doing This? • Web archives preserve the web. They act as the web equivalent of the archive or library. In this role, their mission is to acquire and preserve the web, ensuring its continued survival for future generations. • Billions of people around the world have grown accustomed to using the web as their primary resource to acquire information. • The availability of this electronic information is taken for granted, and it is a fallacy that if something is on the web it will be there forever. • There’s an essential need for people to understand that the web represents who we are. It’s our culture and our social fabric, and we don’t want to lose it.
  • 14. Why should we archive the web?
  • 15. How long does a website live? • A 1997 report in Scientific American claims 44 days. • A subsequent 2001 academic study in IEEE suggests 75 days. • A 2003 Washington Post article indicates the number is 100 days. • A 2013 study by Old Dominion University says that after the first year of publishing, nearly 11% of social media will be lost, and after that we will continue to lose 0.02% per day.
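
The 2013 figure quoted above can be turned into a rough back-of-the-envelope estimate. The sketch below is only one possible reading of that sentence: it assumes the 0.02% daily loss is additive (percentage points of the original material per day) and it makes no claim about the first 365 days, for which the slide gives no rate. The function name `fraction_lost` and both constants are illustrative, not taken from the study.

```python
# Rough estimate of how much shared material is gone after a given number of days,
# using the figures quoted on the slide: about 11% lost after the first year,
# then 0.02% more per day. ASSUMPTION: the daily loss is additive, not compounding.

FIRST_YEAR_LOSS = 0.11          # fraction lost after 365 days (from the slide)
DAILY_LOSS_AFTERWARDS = 0.0002  # 0.02% per day, read as percentage points (assumption)


def fraction_lost(days_since_publication: int) -> float:
    """Return the estimated fraction lost, for ages of one year or more."""
    if days_since_publication < 365:
        # The slide gives no figure for the first year, so none is invented here.
        raise ValueError("estimate is only defined from day 365 onwards")
    extra_days = days_since_publication - 365
    # Cap at 1.0: no more than everything can be lost.
    return min(1.0, FIRST_YEAR_LOSS + DAILY_LOSS_AFTERWARDS * extra_days)


if __name__ == "__main__":
    for years in (1, 2, 5):
        print(years, "year(s):", round(fraction_lost(365 * years), 3))
```

Under these assumptions the estimate grows by a little over seven percentage points for each additional year, which is the practical argument for capturing content early.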
  • 16. Web Archiving Use Cases • Create a thematic/topical web archive on a specific subject • Capture ‘at risk’ content during a spontaneous event • Fulfill organizational mandate to preserve institutional memory & history • Archive state/local agency publications no longer deposited in print form • Archive records to meet university and/or government retention policies • Collect content to act as a research service for scholars to turn to • Capture social media sites as part of organizational records • Collect web-based information to augment physical holdings • Archive online art ephemera • End of Life/Closure
  • 17. How does web archiving work?
  • 18. What is a crawler? A crawler is the software that captures and archives web pages. A crawler visits a page and indexes the content included therein
  • 19. Some technical challenges in capturing content • Technical: dynamic content relies on scripting technologies (Flash and JavaScript). The web is a hodgepodge of technologies, some old and outdated, others at the cutting edge. • Capturing social media sites has become necessary as the web moves away from HTML and towards applications • Explore capture mechanisms other than a traditional crawler: hybrid architecture/API/headless browsers
  • 21. Challenge: How much to archive? There are limits…
  • 22. Challenge: What to archive? …What is important to you? What do you want people to know about? What are your organization’s collecting activities? Vision?
  • 23. Participant Poll • Does any of this make any sense?
  • 25. Starting a Collection Collection: A group of URLs crawled and organized around a common theme, topic or domain Ask Yourself: • What is the topic of this collection? • What websites would you like to archive as part of this collection?
  • 26. Collections Start with Seeds • Seed: starting point URL for the crawler. The crawler will follow linked pages from your seed URL and archive them if they are ‘in scope’. • Document: any file with a distinct URL (HTML, image, PDF, video, etc.).
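
The seed and scope idea described above can be sketched in a few lines. This is a minimal illustration, not the crawler Internet Archive actually runs: the "web" is a hypothetical in-memory dictionary (`TOY_WEB`), no network requests are made, and the scope rule (same host as the seed) is an assumption chosen for simplicity. All names are made up for the example.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical stand-in for the live web: each URL maps to the URLs it links to.
TOY_WEB = {
    "http://example.org/": [
        "http://example.org/about",
        "http://example.org/logo.png",
        "http://other.example.com/",
    ],
    "http://example.org/about": [
        "http://example.org/",
        "http://example.org/report.pdf",
    ],
    "http://example.org/logo.png": [],
    "http://example.org/report.pdf": [],
    "http://other.example.com/": ["http://other.example.com/page"],
    "http://other.example.com/page": [],
}


def in_scope(url: str, seed: str) -> bool:
    """ASSUMED scope rule: a document is in scope if it is on the seed's host."""
    return urlparse(url).netloc == urlparse(seed).netloc


def crawl(seed: str, web: dict) -> list:
    """Breadth-first crawl from the seed; returns the documents 'archived'."""
    archived = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # unreachable document: nothing to archive
        archived.append(url)  # every distinct URL counts as one document
        for link in web[url]:
            if link not in seen and in_scope(link, seed):
                seen.add(link)
                queue.append(link)
    return archived


if __name__ == "__main__":
    for document in crawl("http://example.org/", TOY_WEB):
        print(document)
```

Starting from the seed `http://example.org/`, the sketch archives the HTML pages, the image, and the PDF on that host, and skips `other.example.com` because it falls outside the assumed scope.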
  • 27. Some of our Partners’ Digital Collections • Stanford University (Palo Alto, California) • American University in Cairo • Biblioteca Nacional de España
  • 28. Stanford University, Islamic & Middle Eastern Collection Use Case: harvest and preserve Iranian Blogs • Archiving over 300 blogs written by and for Iran and the Iranian people • Includes coverage of 2009 Iranian elections and the current Middle East unrest
  • 29. Stanford and New York Universities Islamic and Middle Eastern Collection
  • 31. American University in Cairo Use Case: The American University in Cairo Web Archive collects, preserves, and provides access to the web content published by students, faculty, departments, and offices at AUC. The archive also collects Web documents that have long-term research or historical value.
  • 32. January 25th Revolution and University on the Square Demonstrators in Tahrir Square. Image courtesy of Ahmad and the American University in Cairo Rare Books and Special Collections Library.
  • 33. Archivist Driven Captures Thank you to Egypt's youth and Facebook. Image courtesy of Martin and Amy Rowe and the American University in Cairo Rare Books and Special Collections Library.
  • 34. Patron Driven Captures Screenshot of the University on the Square Contribution form. In addition to soliciting photos and videos, we asked content providers to submit websites, blogs, Twitter feeds, etc.
  • 35. Archivist as Advocate Protester documenting the demonstrations in Tahrir Square. Image courtesy of Robeir Rasmy and the American University in Cairo Rare Books and Special Collections Library.
  • 36. Biblioteca Nacional de España • One of the library's top priorities as a memory institution is to consolidate the strategies that lead to the complete preservation of Spanish content published on the Internet, in accordance with its mission as keeper and disseminator of Spanish culture. • Commitment to its patrons, who expect the web archive to become a publicly and freely accessible key information source for the study of the 21st century.
  • 37. Biblioteca Nacional de España Use cases: • 2011 Election crawl • 2012 Humanities crawl • 2009-present .es domain crawls • 2013 .es Broad Survey Crawl, which visited the top-level page of every website registered under .es (in partnership with Red.es) • 2011-2013 Thematic curation (World Cups, Olympics, Global Hunger)
  • 40. http://twitter.com/xalmar • Archived web page from Facebook and/or Flickr
  • 45. Not available on the live web
  • 47. Not available on the live web
  • 48. Making sense of it all • Web Archiving life cycle model • Internet Archive future objectives – Social Media – Distributed Content – Visualization and analytical tools for more useful interaction – Search – Mobile platforms – Enhanced Researcher Access
  • 49. Web Archiving Life Cycle Model Web Archiving Life Cycle Model white paper available: http://www.archive-it.org/publications
  • 50. Breaking down the life cycle Outer layer: • Vision and Objectives • Resources and Workflow • Access / Use / Reuse • Preservation • Risk Management Inner circle: • Appraisal and Selection • Scoping • Data Capture • Storage and Organization • Quality Assurance and Analysis
  • 51. Participant Poll • Are you confused yet? I hope not. Happy to answer questions!
  • 52. The importance of web archiving “As our digital world continues to grow at a breathtaking pace and more and more of our daily lives occur within its digital boundaries, we must ensure that web archives are there to preserve our collective global consciousness for future generations” Kalev H. Leetaru, University of Illinois
  • 53. Kristine Hanna, Director, Archiving Services Internet Archive kristine@archive.org Thank you!