Digital
Assessing the Quality
of Web Archives
Michael L. Nelson
Scott G. Ainsworth, Justin F. Brunelle,
Mat Kelly, Hany SalahEldeen,
Michele C. Weigle
Old Dominion University
Web Science & Digital Libraries Research Group
ws-dl.cs.odu.edu
@WebSciDL
The State of Web Archiving
current: "Hooray! It's in the archive!"
vs.
future: "How well was it archived?"
http://web.archive.org/web/20140717152222/http://vk.com/strelkov_info
http://www.csmonitor.com/World/Europe/2014/0717/Web-evidence-points-to-pro-Russia-rebels-in-downing-of-MH17-video
Three Ways We're
Assessing Quality
• Weighting the "importance" of missing
embedded resources
– "damage" measure for comparing archived
pages
• Detecting "temporal violations"
– some rendered pages never existed
• Defining an archival tool benchmark
– "Archive Acid Test"
Not All Mementos Are Created Equal:
Measuring The Impact Of Missing Resources
JCDL 2014
http://www.cs.odu.edu/~mln/pubs/jcdl-2014/jcdl-2014-brunelle-damage.pdf
Synthetic Damage:
Removing Images From xkcd.com
• live web: M = 0.17, D = 0.09
• missing main: M = 0.24, D = 0.41
• missing logo + navigation: M = 0.29, D = 0.36
damage (D) differs from % missing (M)!
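A minimal sketch of why D can diverge from M, assuming each embedded resource carries a numeric importance weight. The resource names, the weights, and the formula used here (missing weight divided by total weight) are illustrative assumptions, not the measure defined in the JCDL 2014 paper.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    weight: float   # hypothetical importance weight, invented for illustration
    missing: bool


def percent_missing(resources):
    """M: fraction of embedded resources that are missing, all counted equally."""
    if not resources:
        return 0.0
    return sum(r.missing for r in resources) / len(resources)


def damage(resources):
    """D (assumed form): share of total importance weight that is missing."""
    total = sum(r.weight for r in resources)
    lost = sum(r.weight for r in resources if r.missing)
    return lost / total if total else 0.0


# Hypothetical page: one large central image and three small ones.
page = [
    Resource("main-comic.png", 8.0, True),
    Resource("logo.png", 1.0, False),
    Resource("nav-prev.png", 0.5, False),
    Resource("nav-next.png", 0.5, False),
]

print(f"M = {percent_missing(page):.2f}, D = {damage(page):.2f}")
```

With these made-up weights, losing one resource out of four costs most of the page's weight, so D is far larger than M.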
Was the missing resource important?
<img> and <embed> can leave hints about size and centrality.
For CSS, we look at the distribution of background color in the page divided into vertical thirds.
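One way to picture the vertical-thirds idea, as a sketch only: take a rendered page as a grid of color values, split its columns into three equal-width bands, and report the fraction of background-colored pixels in each band. The grid representation, the function name, and the toy colors are assumptions made for illustration; how the fractions are turned into a damage weight is not shown here.

```python
def background_fraction_by_thirds(pixels, background):
    """Return the background-color fraction for the left, middle, and right thirds.

    pixels: list of rows, each row a list of color values (assumed representation).
    background: the color value treated as page background.
    """
    width = len(pixels[0])
    if width < 3:
        raise ValueError("need at least three columns to form thirds")
    bounds = [0, width // 3, 2 * width // 3, width]
    fractions = []
    for left, right in zip(bounds, bounds[1:]):
        cells = [row[x] for row in pixels for x in range(left, right)]
        fractions.append(sum(c == background for c in cells) / len(cells))
    return fractions


# Toy 2x6 "screenshot": "T" is content, "W" is the background color.
screenshot = [
    ["T", "T", "W", "W", "W", "W"],
    ["T", "T", "W", "W", "W", "W"],
]

print(background_fraction_by_thirds(screenshot, "W"))
```

In this toy grid all content sits in the left third, so the other two thirds are entirely background.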
Weights from Turker Assessment of Damage
first: establish that Turkers can distinguish damaged from undamaged pages (81% of the time)
second: find weights that match Turkers' rankings of (real) differently damaged versions of the same page
Good News:
Although M is steady/increasing, D is decreasing
A Framework for Evaluation of
Composite Memento Temporal Coherence
(in preparation)
http://arxiv.org/abs/1402.0928
As Presented by IA
http://web.archive.org/web/20041209190926/http://www.wunderground.org/cgi-bin/findWeather/getForecast?query=50593 (now 404, but that's a different story…)
Not Everything Is
20041209190926
+ 9 months
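The 14-digit number in a Wayback Machine URL is a capture timestamp in YYYYMMDDhhmmss form, so the gap between a page and one of its embedded resources can be computed directly. The sketch below parses the timestamp from the page URL shown above; the embedded-image URL is hypothetical and exists only to show the arithmetic.

```python
import re
from datetime import datetime

# Matches the 14-digit timestamp that follows /web/ in a Wayback Machine URL.
WAYBACK_TS = re.compile(r"/web/(\d{14})")


def memento_datetime(uri):
    """Extract the capture datetime encoded in a Wayback Machine URL."""
    match = WAYBACK_TS.search(uri)
    if match is None:
        raise ValueError(f"no 14-digit timestamp in {uri!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


root = ("http://web.archive.org/web/20041209190926/"
        "http://www.wunderground.org/cgi-bin/findWeather/getForecast?query=50593")

# Hypothetical embedded-image memento, invented for illustration only.
embedded = "http://web.archive.org/web/20050909000000/http://example.org/radar.gif"

gap = memento_datetime(embedded) - memento_datetime(root)
print(f"embedded resource captured {gap.days} days after the page")
```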
Consider:
<html>
<img src="foo.jpeg">
</html>
[Timeline figure: captures labeled html, jpeg, jpeg, jpeg, with four changes marked]
Correct Archival Rendering
[Timeline figure: same captures and changes]
But Archives Miss Updates…
[Timeline figure: same captures; one of the four changes is marked "missed"]
You Can Choose the Closest
(closest is the current policy of most archives)
You Can Choose the Past
Or You Can "Bracket" the HTML
(when possible, brackets can be made via HTTP metadata or content comparison)
In this case, there is no right answer.
Either choice will result in a temporal violation.
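The three selection policies named on these slides can be sketched as follows, assuming each capture of the embedded resource is a (datetime, content digest) pair. The function names, the tuple layout, and the use of a digest comparison to decide whether a bracket is "unchanged" are illustrative assumptions, not the framework's actual implementation.

```python
from datetime import datetime


def closest(root_time, captures):
    """Pick the capture nearest in time to the HTML capture, before or after."""
    return min(captures, key=lambda c: abs(c[0] - root_time))


def past(root_time, captures):
    """Pick the latest capture at or before the HTML capture, if any."""
    earlier = [c for c in captures if c[0] <= root_time]
    return max(earlier, key=lambda c: c[0]) if earlier else None


def bracket(root_time, captures):
    """Return the captures on either side of the HTML capture.

    The third value is True only when both sides exist and carry the same
    digest, which is taken here (as an assumption) to mean the resource did
    not change across the HTML capture time.
    """
    before = past(root_time, captures)
    later = [c for c in captures if c[0] >= root_time]
    after = min(later, key=lambda c: c[0]) if later else None
    unchanged = (before is not None and after is not None
                 and before[1] == after[1])
    return before, after, unchanged


# Hypothetical captures of foo.jpeg around an HTML capture on 2004-12-09.
html_time = datetime(2004, 12, 9)
jpeg_captures = [
    (datetime(2004, 11, 1), "digest-a"),
    (datetime(2004, 12, 20), "digest-b"),
]

print(closest(html_time, jpeg_captures))
print(past(html_time, jpeg_captures))
print(bracket(html_time, jpeg_captures))
```

Here the two policies pick different captures and the bracket digests differ, which is the "no right answer" case: neither choice can be shown to match what the page embedded when the HTML was captured.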
Completeness vs. Coherence

                                 Closest    Closest    Bracket    Bracket
                                 Single     Multi-     Single     Multi-
Description                      Archive    Archive    Archive    Archive
Completeness
  Mean complete                  76.1%      80.2%      76.2%      80.3%
  Mean missing                   23.9%      19.8%      23.8%      19.7%
Temporal Coherence
  Mean prima facie coherent      41.0%      40.9%      54.7%      54.6%
  Mean possibly coherent         27.3%      27.3%      12.8%      14.2%
  Mean probably violative         2.5%       5.3%       2.5%       5.3%
  Mean prima facie violative      5.3%       5.3%       6.2%       6.2%

At least 5% of pages can be shown to be temporal violations
The Archival Acid Test:
Evaluating Archive Performance on
Advanced HTML and JavaScript
JCDL 2014
http://ws-dl.blogspot.com/2014/07/2014-07-14-archival-acid-test.html
http://acid.matkelly.com/
Inspired by the
Acid3 Test for Browsers
http://acid3.acidtests.org/
http://en.wikipedia.org/wiki/Acid3
The Archival Acid Test
[Figure: logos of the archiving tools (Heritrix, WARCreate, GNU Wget) and of the archives]
Archival Tools & Sites on Acid3
Archival Acid Tests
Archival Tools & Sites on AAT
(mummify.it died in early 2014)
Future of Web Archiving:
Increasing Quantitative Analysis
• Measure "damage" instead of completeness
of archived pages
– enables large-scale comparison of archives
• Even if an embedded resource is present, it
doesn't mean it's right
– ~5% of archived pages have temporal violations
• To improve the quality of the archives, we
need to be able to benchmark archival tools
– Archival Acid Test is an easy-to-use benchmark

 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 

Dernier (20)

A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 

Assessing the Quality of Web Archives

Editor's notes

  1. A memento as presented by the Internet Archive.
  2. mindist = minimum distance = closest bracket = closest, but choose bracket if available
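
The "minimum distance" idea in note 2 can be sketched as follows. This is only an illustration under stated assumptions: the function name `mindist`, its signature, and the sample datetimes are hypothetical and are not taken from the slides, and the "bracket" variants mentioned in the note are not modeled here.

```python
from datetime import datetime

def mindist(target, candidates):
    """Return the candidate datetime closest in time to `target`.

    Hypothetical helper: picks the capture with the smallest absolute
    temporal distance, whether it falls before or after the target.
    """
    if not candidates:
        raise ValueError("no candidate mementos")
    return min(candidates, key=lambda dt: abs(dt - target))

# Assumed example data: capture time of a root page and of three
# captures of one embedded resource.
root = datetime(2014, 7, 17, 15, 22, 22)
embedded = [
    datetime(2014, 7, 10, 0, 0, 0),
    datetime(2014, 7, 17, 16, 0, 0),
    datetime(2014, 8, 1, 0, 0, 0),
]
chosen = mindist(root, embedded)
print(chosen)
```

With these sample values the capture from 16:00 on 2014-07-17 is selected, since it is under an hour from the root capture while the others are days or weeks away.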