A new class of primary source?
Prospects and pitfalls in using web archives for research
Dr Peter Webster
Webster Research and Consulting
@pj_webster
A lost archive?
The web its own archive?
Open UK Web Archive 2004-13 comparison.
@anjacks0n http://britishlibrary.typepad.co.uk/webarchive/2014/10/what-is-still-on-the-web-after-10-years-of-archiving-.html
Disappearing predictably
Disappearing unpredictably
.. But safe and sound in the archive
Reasons to care about web archiving
• education and research
• enforcement of the law
• public accountability
Three archives for the UK
Archive | Temporal scope | Content scope | Access
Open UKWA | 2004-present | Selective (14.7k) | Online
Legal Deposit UKWA | 2013-present | Comprehensive (for UK) | Onsite
JISC UK Domain Dataset | 1996-2013 | Comprehensive (for .uk) | Index only
JISC UK Web Domain Dataset (1996-2013)
• copy of Internet Archive holdings for .uk
• bought by JISC, held by British Library
• 60TB of data
• no direct access to content
• prototype search at webarchive.org.uk/shine
• derived datasets in public domain
Web archives for NI and RoI
Archive | Temporal scope | Content scope | Access
NLI Web Archive | 2011-present | Selective (542) | Online
PRONI Web Archive | 2010-present | Selective (115) | Online
Legal Deposit UKWA | 2013-present | Comprehensive (for UK!) | Onsite (TCD)
Ways to use the archived web
• URL search -> single page
• Full-text search -> single page
• Visualisation -> trend -> page
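A URL search of the kind listed above can be sketched in a few lines. This is a minimal illustration, assuming the Internet Archive's Wayback Machine URL pattern (`https://web.archive.org/web/<timestamp>/<url>`); the function name `wayback_url` is hypothetical. The Wayback Machine normally redirects to the capture nearest the requested timestamp, so the sketch does not claim that a capture exists at that exact moment.

```python
from datetime import date

# Base of the Wayback Machine replay URLs.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, capture_date: date) -> str:
    """Build a replay URL asking for the capture nearest to capture_date.

    The timestamp is padded to the 14-digit YYYYMMDDhhmmss form,
    using midnight because only the day is known.
    """
    timestamp = capture_date.strftime("%Y%m%d") + "000000"
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


# Hypothetical usage: look for a copy of gov.ie from around 15 August 2000.
url = wayback_url("http://gov.ie/", date(2000, 8, 15))
print(url)
```

Opening the printed URL in a browser is the "URL search -> single page" route; it only helps when the address of the vanished page is already known.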
Changing aesthetics
gov.ie, captured by archive.org, 15 August 2000
Vanished content
southtippcoco.ie, captured by archive.org, 4 Jan 2014
Visualising trends: Ngram
http://www.webarchive.org.uk/shine/graph
Ways to use the archived web
• URL search -> single page
• Full-text search -> single page
• Visualisation -> trend -> page
• Direct access to WARC
• Derived datasets
• API access
Derived datasets from the BL
From JISC UK Web Domain Dataset (1996-2010)
• File format profile
• Geo-index
• Crawled URL Index (CDX)
• Host Link Graph
Public domain at data.webarchive.org.uk
Creationism?
• non-evolutionary account of human origins
• modern
• a long history
• a feature of some parts of evangelicalism
• (anti-evolutionism, Intelligent Design)
The creationist web: three questions
A justified conspiracy theory about marginalisation of creationist voices?
A real danger or a moral panic (Truth in Science)?
The web as friend of the marginalised opinion?
http://peterwebster.me/2014/11/18/reading-creationism-in-the-web-archive/
UK Host Link Graph (1996-2010)
2008 | newsimg.bbc.co.uk | youtube.com | 45
2008 | archbishopofyork.org.uk | flickr.com | 1
2002 | secularism.org.uk | geocities.com | 1
Public domain at: data.webarchive.org.uk
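The sample rows above can be read into typed records with a short parser. This is a sketch only: the slide does not label the columns, so reading them as year, linking host, linked-to host and number of links is an assumption, and the names `HostLink` and `parse_host_link` are hypothetical.

```python
from typing import NamedTuple


class HostLink(NamedTuple):
    """One row of the link graph (column meanings are assumed)."""
    year: int
    source_host: str   # host on which the links were found
    target_host: str   # host the links point to
    link_count: int    # number of links from source to target that year


def parse_host_link(line: str) -> HostLink:
    """Parse one 'year | source | target | count' row."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 pipe-separated fields: {line!r}")
    year, source_host, target_host, link_count = fields
    return HostLink(int(year), source_host, target_host, int(link_count))


# The three sample rows shown on the slide.
rows = [
    "2008 | newsimg.bbc.co.uk | youtube.com | 45",
    "2008 | archbishopofyork.org.uk | flickr.com | 1",
    "2002 | secularism.org.uk | geocities.com | 1",
]
links = [parse_host_link(row) for row in rows]
```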
Approach
• selection of key UK creationist sites
• extraction of all unique inbound referring hosts for 1996-2010
• inspection and classification
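The extraction step might look something like the following. It is a rough sketch, not the method actually used: it assumes link graph rows already parsed into (year, source host, target host, link count) tuples, the host names are invented placeholders, and dropping self-links is an added assumption. The link count is deliberately ignored, so each referring host counts once however many links it carries.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (year, source host, target host, link count); this layout is assumed.
Link = Tuple[int, str, str, int]


def inbound_hosts_by_year(
    links: Iterable[Link],
    target_hosts: Set[str],
    first_year: int = 1996,
    last_year: int = 2010,
) -> Dict[int, Set[str]]:
    """Collect, per year, the distinct hosts linking to any selected site."""
    by_year: Dict[int, Set[str]] = defaultdict(set)
    for year, source, target, _count in links:
        in_range = first_year <= year <= last_year
        # Skip self-links: a site linking to itself is not a referrer.
        if in_range and target in target_hosts and source != target:
            by_year[year].add(source)
    return dict(by_year)


def unique_inbound_hosts(by_year: Dict[int, Set[str]]) -> Set[str]:
    """Merge the per-year sets into one set of unique referring hosts."""
    return set().union(*by_year.values())


# Invented rows for illustration only; none of this is real data.
sample_links = [
    (2007, "church.example", "creationist.example", 3),
    (2007, "secularist.example", "creationist.example", 12),
    (2007, "creationist.example", "creationist.example", 40),  # self-link
    (2010, "church.example", "creationist.example", 1),
    (2010, "church.example", "unrelated.example", 5),          # other target
    (2011, "school.example", "creationist.example", 2),        # out of range
]

by_year = inbound_hosts_by_year(sample_links, {"creationist.example"})
referrers = unique_inbound_hosts(by_year)
```

The resulting set of hosts is what then goes forward for manual inspection and classification; the per-year sets allow one year to be compared with another.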
Caveats on method
• partial nature of the dataset
• benchmarking of absolute numbers
• selective sample
• what does a link mean, anyway?
• not looking at number of linking resources per host
Truth in Science: how significant?
• only 46 unique inbound hosts
• ... of which many were other creationists or secularist sites
• two churches, one school
• fewer in 2010 than 2007
Conclusions
• a utopian dream unfulfilled
• a genuine moral panic
• a justified conspiracy theory
Next steps (1)
1. NI the 'creationism capital of Europe'? Analysis of:
• links from GB organisations to NI creationists
• links from NI to RoW
2. What about creationism in .ie?
Next steps (2)
Project: EU National Web Spheres
• part of resaw.eu
• investigating the nature of a national web domain
• ... including the interlinking between them
• case study I: Anglican & Presbyterian churches in Ireland, north and south
Web Archives for Historians
@HistWebArchives, http://webarchivehistorians.org/
Questions?
Peter Webster
peter@websterresearchconsulting.com
@pj_webster
peterwebster.me
websterresearchconsulting.com

Prospects and pitfalls in using web archives for research

  • 1. A new class of primary source? Prospects and pitfalls in using web archives for research Dr Peter Webster Webster Research and Consulting @pj_webster
  • 6. The web its own archive? Open UK Web Archive 2004-13 comparison. @anjacks0n http://britishlibrary.typepad.co.uk/webarchive/2014/10/what-is-still-on-the-web-after-10-years-of-archiving-.html
  • 9. .. But safe and sound in the archive
  • 10. Reasons to care about web archiving • education and research • enforcement of the law • public accountability
  • 11. Three archives for the UK (temporal scope; content scope; access) • Open UKWA: 2004-present; selective (14.7k); online • Legal Deposit UKWA: 2013-present; comprehensive (for UK); onsite • JISC UK Domain Dataset: 1996-2013; comprehensive (for .uk); index only
  • 12. JISC UK Web Domain Dataset (1996-2013) • copy of Internet Archive holdings for .uk • bought by JISC, held by British Library • 60TB of data • no direct access to content • prototype search at webarchive.org.uk/shine • derived datasets in public domain
  • 13. Web archives for NI and RoI (temporal scope; content scope; access) • NLI Web Archive: 2011-present; selective (542); online • PRONI Web Archive: 2010-present; selective (115); online • Legal Deposit UKWA: 2013-present; comprehensive (for UK!); onsite (TCD)
  • 14. Ways to use the archived web • URL search -> single page • Full-text search -> single page • Visualisation -> trend -> page
  • 15. Changing aesthetics gov.ie, captured by archive.org, 15 August 2000
  • 16. Vanished content southtippcoco.ie, captured by archive.org, 4 Jan 2014
  • 18. Ways to use the archived web • URL search -> single page • Full-text search -> single page • Visualisation -> trend -> page • Direct access to WARC • Derived datasets • API access
  • 19. Derived datasets from the BL From JISC UK Web Domain Dataset (1996-2010) • File format profile • Geo-index • Crawled URL Index (CDX) • Host Link Graph. Public domain at data.webarchive.org.uk
  • 20. Creationism? • non-evolutionary account of human origins • modern • a long history • a feature of some parts of evangelicalism • (anti-evolutionism, Intelligent Design)
  • 21. The creationist web: three questions • A justified conspiracy theory about marginalisation of creationist voices? • A real danger or a moral panic (Truth in Science)? • The web as friend of the marginalised opinion? http://peterwebster.me/2014/11/18/reading-creationism-in-the-web-archive/
  • 22. UK Host Link Graph (1996-2010) • 2008 | newsimg.bbc.co.uk | youtube.com | 45 • 2008 | archbishopofyork.org.uk | flickr.com | 1 • 2002 | secularism.org.uk | geocities.com | 1 Public domain at: data.webarchive.org.uk
  • 23. Approach • selection of key UK creationist sites • extraction of all unique inbound referring hosts for 1996-2010 • inspection and classification
  • 24. Caveats on method • partial nature of the dataset • benchmarking of absolute numbers • selective sample • what does a link mean, anyway? • not looking at number of linking resources per host
  • 25. Truth in Science: how significant? • only 46 unique inbound hosts • … many of which were other creationist or secularist sites • two churches, one school • fewer in 2010 than 2007
  • 27. Conclusions • a utopian dream unfulfilled • a genuine moral panic • a justified conspiracy theory
  • 28. Next steps (1) 1. NI the 'creationism capital of Europe'? (Analysis of: • links from GB organisations to NI creationists • links from NI to RoW) 2. What about creationism in .ie ?
  • 29. Next steps (2) Project: EU National Web Spheres • part of resaw.eu • investigating the nature of a national web domain • .. including the interlinking between them • case study I: Anglican & Presbyterian churches in Ireland, north and south
  • 30. Web Archives for Historians @HistWebArchives , http://webarchivehistorians.org/
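
The approach on slide 23 (collecting the unique inbound referring hosts for a set of target sites) could be sketched roughly as below. This is only an illustration, not the author's actual method: it assumes each Host Link Graph row reads "year | source host | target host | link count", as the sample rows on slide 22 suggest, and every host name beginning with "example-" is an invented placeholder rather than real data.

```python
from collections import defaultdict

# Rows in the layout shown on the Host Link Graph slide. The column meaning
# (year | source host | target host | link count) is an assumption, and the
# last three rows are invented placeholders, not real data.
SAMPLE = """\
2008 | newsimg.bbc.co.uk | youtube.com | 45
2008 | archbishopofyork.org.uk | flickr.com | 1
2002 | secularism.org.uk | geocities.com | 1
2007 | example-church.org.uk | example-creationist.org.uk | 3
2007 | example-school.sch.uk | example-creationist.org.uk | 1
2010 | example-church.org.uk | example-creationist.org.uk | 2
"""


def parse_row(line):
    """Split one pipe-delimited row into (year, source, target, count)."""
    year, source, target, count = (field.strip() for field in line.split("|"))
    return int(year), source, target, int(count)


def inbound_hosts(lines, targets, first_year=1996, last_year=2010):
    """Return {year: set of hosts that link to any host in `targets`}."""
    by_year = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        year, source, target, _count = parse_row(line)
        in_range = first_year <= year <= last_year
        # Self-links are ignored; that is a choice made for this sketch.
        if in_range and target in targets and source != target:
            by_year[year].add(source)
    return dict(by_year)


# Hypothetical target site chosen for the example.
targets = {"example-creationist.org.uk"}
by_year = inbound_hosts(SAMPLE.splitlines(), targets)

# Union across years gives the unique inbound hosts for the whole period.
all_hosts = set().union(*by_year.values()) if by_year else set()

print(sorted(all_hosts))
for year in sorted(by_year):
    print(year, len(by_year[year]))
```

Counting hosts rather than summing the link-count column mirrors the caveat on slide 24 that the number of linking resources per host is not examined; the per-year counts are the kind of figure behind a comparison such as "fewer in 2010 than 2007".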