SlideShare une entreprise Scribd logo
1  sur  28
Reconstructing the past with
MediaWiki:
Programmatic Issues and Solutions
Shawn M. Jones
sjone@cs.odu.edu
Old Dominion University
Reconstructing the Past with the
Internet Archive
HTML
Images
JavaScript
CSS
Our goal: Temporal Coherence
Make the page look as it looked at the time it was archived.
Some Results from the Internet
Archive Are Lacking
Images change between the time
the Archive crawls the main page
and the time it gets to the images
Sometimes embedded images
are missing when the Archive
gets to them
Sometimes the page is designed
for a specific browser in mind
Image from “A Framework for Evaluation of Composite Memento Temporal Coherence”
by S. Ainsworth, M. L. Nelson, H. Van de Sompel. http://arxiv.org/abs/1402.0928
MediaWiki Shouldn’t Have This
Problem
HTML Images
JavaScript
CSS
What we’re not doing
Interest in Reconstructing the Past
With MediaWiki
Simplified Memento Overview
Rules for Reconstructing the Past With
MediaWiki
Do not modify any existing MediaWiki
code!
Conform to
MediaWiki
coding standards
And…
Reconstructing the Past
Articles
Templates
Embedded Images
Embedded JavaScript
Embedded CSS
Accessing Old Article Text
The oldid argument references a revision of a page
within MediaWiki's database
Merely visiting the URI with the oldid will give you the
text content of the page as it existed at that revision
Reconstructing the Past
Articles
Handled by
Memento MediaWiki Extension
Templates
Embedded Images
Embedded JavaScript
Embedded CSS
Including the Right Template
This gives us:
$title - the Title object for the given page
$parser - the Parser object for the given page
$id - the revision ID (oldid) for the Template page
Using $parser, and $title, we can change the $id and
fetch an old revision of the Template
Reconstructing the Past
Articles
Handled by
Memento MediaWiki Extension
Templates
Handled by
Memento MediaWiki Extension
Embedded Images
Embedded JavaScript
Embedded CSS
But What About Images?
This Map is important to
understanding the
content of this article
This image is changed
as the article is
changed, to reflect its
content
It’s the same map if we look at the
June 6, 2013 revision now
Users can't view this
embedded resource as
it looked on June 2013
while reading the article
from that time period
What should have happened
This is the the map from
June, 2013 that should
have been displayed
This is the current map
The content of the article won't match the data in this visual aide, possibly
confusing a user who wanted historical information on this topic
We Tried To Solve This
Upon further inspection of the code in MediaWiki, the $time argument
from this function is never used as detailed here
We Just Solved This
Upon further inspection of the code in MediaWiki, the $file argument’s
getHistory() function can be used to acquire previous revisions of images
Reconstructing the Past
Articles
Handled by
Memento MediaWiki Extension
Templates
Handled by
Memento MediaWiki Extension
Embedded Images
Prototyped for future version of
Memento MediaWiki Extension
Embedded JavaScript
Embedded CSS
What about CSS/JavaScript?
The present CSS of
this page conflicts
with the past
Template.
We Couldn’t Solve This
The data is present, but we could not find any way for an
extension to access or render it.
Recap on Reconstructing the Past
Articles
Handled by
Memento MediaWiki Extension
Templates
Handled by
Memento MediaWiki Extension
Embedded Images
Prototyped for future version of
Memento MediaWiki Extension
Embedded JavaScript
Requires changes to MediaWiki
Embedded CSS
Requires changes to MediaWiki
Uniform solution
• RFC 7089, Memento, was designed to provide
uniform access to past versions of all resources
on the Web
• Memento provides a web standard to access
these resources
Resources
• Memento Protocol: http://tools.ietf.org/html/rfc7089
• Memento Website: http://www.mementoweb.org/
• Memento MediaWiki Extension:
http://www.mediawiki.org/wiki/Extension:Memento
• Memento Chrome Extension:
http://bit.ly/memento-for-chrome
• More details:
http://ws-dl.blogspot.com/2014/04/2014-04-01-
yesterdays-wiki-page-todays.html
• Contact me: sjone@cs.odu.edu
Backup Slides
Sample URI-R (Step 1) HTTP Response
HTTP/1.1 200 OK
Date: Sun, 25 May 2014 21:39:02 GMT
Server: Apache
X-Content-Type-Options: nosniff
Link: http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen;
rel="original latest-version",
http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen;
rel="timegate",
http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen;
rel="timemap”; type="application/link-format”
Content-language: en
Vary: Accept-Encoding,Cookie
Cache-Control: s-maxage=18000, must-revalidate, max-age=0
Last-Modified: Sat, 17 May 2014 16:48:28 GMT
Connection: close
Content-Type: text/html; charset=UTF-8
Sample URI-G (Step 2) HTTP Response
HTTP/1.1 302 Found
Date: Sun, 25 May 2014 21:43:08 GMT
Server: Apache
X-Content-Type-Options: nosniff
Vary: Accept-Encoding, Accept-Datetime
Location: http://ws-dl-
05.cs.odu.edu/demo/index.php?title=Daenerys_Targaryen&oldid=1499
Link: <http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>;
rel="timemap”; type="application/link-format",
<http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>;
rel="original latest-version”
Connection: close
Content-Type: text/html; charset=UTF-8
Sample URI-M (Step 3) HTTP Response
HTTP/1.1 200 OK
Date: Sun, 25 May 2014 21:46:12 GMT
Server: Apache
X-Content-Type-Options: nosniff
Memento-Datetime: Sun, 22 Apr 2007 15:01:20 GMT
Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>;
rel="original latest-version”,
<http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>;
rel="timegate”,
<http://ws-dl-
05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>;
rel="timemap”; type="application/link-format”
Content-language: en
Vary: Accept-Encoding,Cookie
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: private, must-revalidate, max-age=0
Connection: close
Content-Type: text/html; charset=UTF-8

Contenu connexe

Similaire à Reconstructing Past Wiki Pages Programmatically

Designing CSS Layouts for the Flexible Web
Designing CSS Layouts for the Flexible WebDesigning CSS Layouts for the Flexible Web
Designing CSS Layouts for the Flexible WebZoe Gillenwater
 
SharePoint as a Web CMS
SharePoint as a Web CMSSharePoint as a Web CMS
SharePoint as a Web CMSCraig Bailey
 
Adapting to a Responsive Design at Untangle the Web on 29th July 2013
Adapting to a Responsive Design at Untangle the Web on 29th July 2013Adapting to a Responsive Design at Untangle the Web on 29th July 2013
Adapting to a Responsive Design at Untangle the Web on 29th July 2013Matt Gibson
 
Bootstrap for Beginners
Bootstrap for BeginnersBootstrap for Beginners
Bootstrap for BeginnersD'arce Hess
 
WordPress Theme Structure
WordPress Theme StructureWordPress Theme Structure
WordPress Theme Structurekeithdevon
 
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin..."Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...Yandex
 
Pablo Villalba -
Pablo Villalba - Pablo Villalba -
Pablo Villalba - .toster
 
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine DevelopmentEECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine DevelopmentFortySeven Media
 
Introduction to Web Components
Introduction to Web ComponentsIntroduction to Web Components
Introduction to Web ComponentsRich Bradshaw
 
3) web development
3) web development3) web development
3) web developmenttechbed
 
2012.10 Liferay Europe Symposium, Alistair Oldfield
2012.10 Liferay Europe Symposium, Alistair Oldfield2012.10 Liferay Europe Symposium, Alistair Oldfield
2012.10 Liferay Europe Symposium, Alistair OldfieldEmeldi Group
 
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to DevelopmentWordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to DevelopmentEvan Mullins
 
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!Evan Mullins
 
CreateJS hackathon in Zurich
CreateJS hackathon in ZurichCreateJS hackathon in Zurich
CreateJS hackathon in ZurichHenri Bergius
 
Semantic Media Wiki & Semantic Forms
Semantic Media Wiki & Semantic FormsSemantic Media Wiki & Semantic Forms
Semantic Media Wiki & Semantic FormsSergeyChernyshev
 
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...Inexture Solutions
 

Similaire à Reconstructing Past Wiki Pages Programmatically (20)

Designing CSS Layouts for the Flexible Web
Designing CSS Layouts for the Flexible WebDesigning CSS Layouts for the Flexible Web
Designing CSS Layouts for the Flexible Web
 
SharePoint as a Web CMS
SharePoint as a Web CMSSharePoint as a Web CMS
SharePoint as a Web CMS
 
Aaug sample slides
Aaug sample slidesAaug sample slides
Aaug sample slides
 
Adapting to a Responsive Design at Untangle the Web on 29th July 2013
Adapting to a Responsive Design at Untangle the Web on 29th July 2013Adapting to a Responsive Design at Untangle the Web on 29th July 2013
Adapting to a Responsive Design at Untangle the Web on 29th July 2013
 
Bootstrap for Beginners
Bootstrap for BeginnersBootstrap for Beginners
Bootstrap for Beginners
 
WordPress Theme Structure
WordPress Theme StructureWordPress Theme Structure
WordPress Theme Structure
 
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin..."Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
"Responsive Web Design: Clever Tips and Techniques". Vitaly Friedman, Smashin...
 
MANAGE STATIC RESOURCES IN SITECORE IN HELIX WAY
MANAGE STATIC RESOURCES IN SITECORE IN HELIX WAYMANAGE STATIC RESOURCES IN SITECORE IN HELIX WAY
MANAGE STATIC RESOURCES IN SITECORE IN HELIX WAY
 
Pablo Villalba -
Pablo Villalba - Pablo Villalba -
Pablo Villalba -
 
2012.10 Oldfield
2012.10 Oldfield2012.10 Oldfield
2012.10 Oldfield
 
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine DevelopmentEECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
EECI2009 - From Design to Dynamic - Rapid ExpressionEngine Development
 
Introduction to Web Components
Introduction to Web ComponentsIntroduction to Web Components
Introduction to Web Components
 
3) web development
3) web development3) web development
3) web development
 
React in 2018
React in 2018React in 2018
React in 2018
 
2012.10 Liferay Europe Symposium, Alistair Oldfield
2012.10 Liferay Europe Symposium, Alistair Oldfield2012.10 Liferay Europe Symposium, Alistair Oldfield
2012.10 Liferay Europe Symposium, Alistair Oldfield
 
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to DevelopmentWordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
WordCamp Greenville 2018 - Beware the Dark Side, or an Intro to Development
 
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
WordCamp Asheville 2017 - So You Wanna Dev? Join the Team!
 
CreateJS hackathon in Zurich
CreateJS hackathon in ZurichCreateJS hackathon in Zurich
CreateJS hackathon in Zurich
 
Semantic Media Wiki & Semantic Forms
Semantic Media Wiki & Semantic FormsSemantic Media Wiki & Semantic Forms
Semantic Media Wiki & Semantic Forms
 
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
A Comprehensive Guide on Building Lightning-Fast Websites with React Static S...
 

Plus de Shawn Jones

Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Shawn Jones
 
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...Shawn Jones
 
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Shawn Jones
 
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...Shawn Jones
 
Improving Collection Understanding For Web Archives With Storytelling: Shinin...
Improving Collection Understanding For Web Archives With Storytelling: Shinin...Improving Collection Understanding For Web Archives With Storytelling: Shinin...
Improving Collection Understanding For Web Archives With Storytelling: Shinin...Shawn Jones
 
Automatically Selecting Striking Images for Social Cards
Automatically Selecting Striking Images for Social CardsAutomatically Selecting Striking Images for Social Cards
Automatically Selecting Striking Images for Social CardsShawn Jones
 
SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration)
SHARI(StoryGraph Hypercane ArchiveNow Raintale Integration)SHARI(StoryGraph Hypercane ArchiveNow Raintale Integration)
SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration)Shawn Jones
 
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...Shawn Jones
 
Storytelling With Web Archives
Storytelling With Web ArchivesStorytelling With Web Archives
Storytelling With Web ArchivesShawn Jones
 
Combining Social Media Storytelling With Web Archives
Combining Social Media Storytelling With Web ArchivesCombining Social Media Storytelling With Web Archives
Combining Social Media Storytelling With Web ArchivesShawn Jones
 
Improving Understanding of Web Archive Collections Through Storytelling - PhD...
Improving Understanding of Web Archive Collections Through Storytelling - PhD...Improving Understanding of Web Archive Collections Through Storytelling - PhD...
Improving Understanding of Web Archive Collections Through Storytelling - PhD...Shawn Jones
 
The Off-Topic Memento Toolkit
The Off-Topic Memento ToolkitThe Off-Topic Memento Toolkit
The Off-Topic Memento ToolkitShawn Jones
 
The Many Shapes of Archive-It
The Many Shapes of Archive-ItThe Many Shapes of Archive-It
The Many Shapes of Archive-ItShawn Jones
 
Improving Collection Understanding in Web Archives
Improving Collection Understanding in Web ArchivesImproving Collection Understanding in Web Archives
Improving Collection Understanding in Web ArchivesShawn Jones
 
Where Can We Post Stories Summarizing Web Archive Collections
Where Can We Post Stories Summarizing Web Archive CollectionsWhere Can We Post Stories Summarizing Web Archive Collections
Where Can We Post Stories Summarizing Web Archive CollectionsShawn Jones
 

Plus de Shawn Jones (16)

Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
 
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
DIRA 2022 Poster -- Abstract Images Have Different Levels of Retrievability P...
 
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
Abstract Images Have Different Levels of Retrievability Per Reverse Image Sea...
 
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G...
 
Improving Collection Understanding For Web Archives With Storytelling: Shinin...
Improving Collection Understanding For Web Archives With Storytelling: Shinin...Improving Collection Understanding For Web Archives With Storytelling: Shinin...
Improving Collection Understanding For Web Archives With Storytelling: Shinin...
 
Automatically Selecting Striking Images for Social Cards
Automatically Selecting Striking Images for Social CardsAutomatically Selecting Striking Images for Social Cards
Automatically Selecting Striking Images for Social Cards
 
SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration)
SHARI(StoryGraph Hypercane ArchiveNow Raintale Integration)SHARI(StoryGraph Hypercane ArchiveNow Raintale Integration)
SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration)
 
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
Social Cards Probably Provide For Better Understanding Of Web Archive Collect...
 
Storytelling With Web Archives
Storytelling With Web ArchivesStorytelling With Web Archives
Storytelling With Web Archives
 
Combining Social Media Storytelling With Web Archives
Combining Social Media Storytelling With Web ArchivesCombining Social Media Storytelling With Web Archives
Combining Social Media Storytelling With Web Archives
 
Improving Understanding of Web Archive Collections Through Storytelling - PhD...
Improving Understanding of Web Archive Collections Through Storytelling - PhD...Improving Understanding of Web Archive Collections Through Storytelling - PhD...
Improving Understanding of Web Archive Collections Through Storytelling - PhD...
 
The Off-Topic Memento Toolkit
The Off-Topic Memento ToolkitThe Off-Topic Memento Toolkit
The Off-Topic Memento Toolkit
 
The Many Shapes of Archive-It
The Many Shapes of Archive-ItThe Many Shapes of Archive-It
The Many Shapes of Archive-It
 
Improving Collection Understanding in Web Archives
Improving Collection Understanding in Web ArchivesImproving Collection Understanding in Web Archives
Improving Collection Understanding in Web Archives
 
Reference Rot
Reference RotReference Rot
Reference Rot
 
Where Can We Post Stories Summarizing Web Archive Collections
Where Can We Post Stories Summarizing Web Archive CollectionsWhere Can We Post Stories Summarizing Web Archive Collections
Where Can We Post Stories Summarizing Web Archive Collections
 

Dernier

OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 
SoftTeco - Software Development Company Profile
SoftTeco - Software Development Company ProfileSoftTeco - Software Development Company Profile
SoftTeco - Software Development Company Profileakrivarotava
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Anthony Dahanne
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 

Dernier (20)

OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
SoftTeco - Software Development Company Profile
SoftTeco - Software Development Company ProfileSoftTeco - Software Development Company Profile
SoftTeco - Software Development Company Profile
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 

Reconstructing Past Wiki Pages Programmatically

  • 1. Reconstructing the past with MediaWiki: Programmatic Issues and Solutions Shawn M. Jones sjone@cs.odu.edu Old Dominion University
  • 2. Reconstructing the Past with the Internet Archive HTML Images JavaScript CSS Our goal: Temporal Coherence Make the page look as it looked at the time it was archived.
  • 3. Some Results from the Internet Archive Are Lacking Images change between the time the Archive crawls the main page and the time it gets to the images Sometimes embedded images are missing when the Archive gets to them Sometimes the page is designed for a specific browser in mind Image from “A Framework for Evaluation of Composite Memento Temporal Coherence” by S. Ainsworth, M. L. Nelson, H. Van de Sompel. http://arxiv.org/abs/1402.0928
  • 4. MediaWiki Shouldn’t Have This Problem HTML Images JavaScript CSS
  • 6. Interest in Reconstructing the Past With MediaWiki
  • 8. Rules for Reconstructing the Past With MediaWiki Do not modify any existing MediaWiki code! Conform to MediaWiki coding standards And…
  • 9. Reconstructing the Past Articles Templates Embedded Images Embedded JavaScript Embedded CSS
  • 10. Accessing Old Article Text The oldid argument references a revision of a page within MediaWiki's database Merely visiting the URI with the oldid will give you the text content of the page as it existed at that revision
  • 11. Reconstructing the Past Articles Handled by Memento MediaWiki Extension Templates Embedded Images Embedded JavaScript Embedded CSS
  • 12. Including the Right Template This gives us: $title - the Title object for the given page $parser - the Parser object for the given page $id - the revision ID (oldid) for the Template page Using $parser, and $title, we can change the $id and fetch an old revision of the Template
  • 13. Reconstructing the Past Articles Handled by Memento MediaWiki Extension Templates Handled by Memento MediaWiki Extension Embedded Images Embedded JavaScript Embedded CSS
  • 14. But What About Images? This Map is important to understanding the content of this article This image is changed as the article is changed, to reflect its content
  • 15. It’s the same map if we look at the June 6, 2013 revision now Users can't view this embedded resource as it looked on June 2013 while reading the article from that time period
  • 16. What should have happened This is the the map from June, 2013 that should have been displayed This is the current map The content of the article won't match the data in this visual aide, possibly confusing a user who wanted historical information on this topic
  • 17. We Tried To Solve This Upon further inspection of the code in MediaWiki, the $time argument from this function is never used as detailed here
  • 18. We Just Solved This Upon further inspection of the code in MediaWiki, the $file argument’s getHistory() function can be used to acquire previous revisions of images
  • 19. Reconstructing the Past Articles Handled by Memento MediaWiki Extension Templates Handled by Memento MediaWiki Extension Embedded Images Prototyped for future version of Memento MediaWiki Extension Embedded JavaScript Embedded CSS
  • 20. What about CSS/JavaScript? The present CSS of this page conflicts with the past Template.
  • 21. We Couldn’t Solve This The data is present, but we could not find any way for an extension to access or render it.
  • 22. Recap on Reconstructing the Past Articles Handled by Memento MediaWiki Extension Templates Handled by Memento MediaWiki Extension Embedded Images Prototyped for future version of Memento MediaWiki Extension Embedded JavaScript Requires changes to MediaWiki Embedded CSS Requires changes to MediaWiki
  • 23. Uniform solution • RFC 7089, Memento, was designed to provide uniform access to past versions of all resources on the Web • Memento provides a web standard to access these resources
  • 24. Resources • Memento Protocol: http://tools.ietf.org/html/rfc7089 • Memento Website: http://www.mementoweb.org/ • Memento MediaWiki Extension: http://www.mediawiki.org/wiki/Extension:Memento • Memento Chrome Extension: http://bit.ly/memento-for-chrome • More details: http://ws-dl.blogspot.com/2014/04/2014-04-01- yesterdays-wiki-page-todays.html • Contact me: sjone@cs.odu.edu
  • 26. Sample URI-R (Step 1) HTTP Response HTTP/1.1 200 OK Date: Sun, 25 May 2014 21:39:02 GMT Server: Apache X-Content-Type-Options: nosniff Link: http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen; rel="original latest-version", http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen; rel="timegate", http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen; rel="timemap”; type="application/link-format” Content-language: en Vary: Accept-Encoding,Cookie Cache-Control: s-maxage=18000, must-revalidate, max-age=0 Last-Modified: Sat, 17 May 2014 16:48:28 GMT Connection: close Content-Type: text/html; charset=UTF-8
  • 27. Sample URI-G (Step 2) HTTP Response HTTP/1.1 302 Found Date: Sun, 25 May 2014 21:43:08 GMT Server: Apache X-Content-Type-Options: nosniff Vary: Accept-Encoding, Accept-Datetime Location: http://ws-dl- 05.cs.odu.edu/demo/index.php?title=Daenerys_Targaryen&oldid=1499 Link: <http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap”; type="application/link-format", <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version” Connection: close Content-Type: text/html; charset=UTF-8
  • 28. Sample URI-M (Step 3) HTTP Response HTTP/1.1 200 OK Date: Sun, 25 May 2014 21:46:12 GMT Server: Apache X-Content-Type-Options: nosniff Memento-Datetime: Sun, 22 Apr 2007 15:01:20 GMT Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version”, <http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>; rel="timegate”, <http://ws-dl- 05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap”; type="application/link-format” Content-language: en Vary: Accept-Encoding,Cookie Expires: Thu, 01 Jan 1970 00:00:00 GMT Cache-Control: private, must-revalidate, max-age=0 Connection: close Content-Type: text/html; charset=UTF-8