SlideShare a Scribd company logo
1 of 19
IIPC GA 2015 – Stanford CA
Restoring US First Website
AHMED ALSUM
WEB PALEONTOLOGIST
IIPC GA 2015 – Stanford CA2
IIPC GA 2015 – Stanford CA
The Early WWW Exhibit on 2006
http://www.slac.stanford.edu/history/earlyweb/firstpages.shtml
IIPC GA 2015 – Stanford CA
Where are the pieces?
IIPC GA 2015 – Stanford CA
What we got from SLAC
• Yearly UNIX backup files.
• Mix of web pages and system artifacts like UNIX backup notes and
console history.
IIPC GA 2015 – Stanford CA
Where can we replay it at Stanford?
epa.gov
IIPC GA 2015 – Stanford CA
Reconstruction Strategy
Page Content
URI
Timestamp
CDX File WARC File
example.org 19980722050200 …. 22222 file1.warc
example.org 19991201081010 …. 33333 file1.warc
.
.
.
.
…..
……..
……….
……….
……….
……….
IIPC GA 2015 – Stanford CA
Page Content
CRAWLING THE ARCHIVE
 Determine a list of pages.
 Using wget command to combine these pages in WARC format.
 wget generates CDX file but with today’s date!
Page1.html
Page2.html
Page3.html
WARC File
Page1.html
…
…
Page2.html
…
…
Page3.html
…
…
wget
CDX File
Localhost/... 2014 …….
Localhost/... 2014 …….
Localhost/... 2014 …….
IIPC GA 2015 – Stanford CA
URI Challenges
 There was no one-one mapping between directory structure and URI.
 Host name changed through time.
IIPC GA 2015 – Stanford CA
URI Discovery
R
R
R R
R
Inlinks within the collection Backlinks in SE HTML Source code
Web Archives Publication about Early Web Interviews with SLAC staff
IIPC GA 2015 – Stanford CA
URI Discovery
FIRST SET OF URIS WERE ONE LEVEL
 File path: 1993/SLACVM/www/192/rl2971/default.html
 URL: http://slacvm.slac.stanford.edu/FIND/default.html
TO LIMIT ACCESS TO SLAC NETWORK
 http://slacvm.slac.stanford.edu/SLACONLY/something
IIPC GA 2015 – Stanford CA
Timestamp
 We depend on the UNIX back context file that listed the last-modified
date for each file.
 We updated the wget-generated CDX file with the old dates.
IIPC GA 2015 – Stanford CA
Access Problem
WE USED OPEN WAYBACK 2.0, BUT…
 For historical reason, the start-date was hard-coded to 1996, we’ve to
modify to start on 1991.
 Early websites have capital letters in the URI.
 SLAC early logo was on XBM format which becomes obsolete now.
IIPC GA 2015 – Stanford CA
Quality Assurance
WE SHARED THE RESTORED SITE WITH THE EARLY SLAC STAFF.
IIPC GA 2015 – Stanford CA
Results
Year Number of Discovered URI
1991 52
1992 119
1993 356
1994 663
1995 5264
1996 4911
1997 2026
1998 2378
IIPC GA 2015 – Stanford CA
SLAC homepage evolution
IIPC GA 2015 – Stanford CA
Home Page, we didn’t have homepage
1991 – DEFAULT PAGE
1995 – THREE HOMEPAGES
1998 – MODERN HOMEPAGE
IIPC GA 2015 – Stanford CA
First Under-construction page
IIPC GA 2015 – Stanford CA
Visit swap.stanford.edu

More Related Content

Similar to Restoring US First Website

Apache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics RedefinedApache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics RedefinedEdureka!
 
Infinum Android Talks #04 - How to publish an Android archive (.aar) to Maven...
Infinum Android Talks #04 - How to publish an Android archive (.aar) to Maven...Infinum Android Talks #04 - How to publish an Android archive (.aar) to Maven...
Infinum Android Talks #04 - How to publish an Android archive (.aar) to Maven...Denis_infinum
 
Infinum Android Talks #04 - How to publish an android archive (.aar) to Maven...
Infinum Android Talks #04 - How to publish an android archive (.aar) to Maven...Infinum Android Talks #04 - How to publish an android archive (.aar) to Maven...
Infinum Android Talks #04 - How to publish an android archive (.aar) to Maven...Infinum
 
Session 3 - CloudStack Test Automation and CI
Session 3 - CloudStack Test Automation and CISession 3 - CloudStack Test Automation and CI
Session 3 - CloudStack Test Automation and CItcloudcomputing-tw
 
Emerging storage-trends-for-containers
Emerging storage-trends-for-containersEmerging storage-trends-for-containers
Emerging storage-trends-for-containerskiran mova
 
Introduction to Apache Spark 2.0
Introduction to Apache Spark 2.0Introduction to Apache Spark 2.0
Introduction to Apache Spark 2.0Knoldus Inc.
 
Web Front End Performance
Web Front End PerformanceWeb Front End Performance
Web Front End PerformanceChris Love
 
ARK identifiers: lessons learnt at BnF: paths forward
ARK identifiers: lessons learnt at BnF: paths forwardARK identifiers: lessons learnt at BnF: paths forward
ARK identifiers: lessons learnt at BnF: paths forwardJohn Kunze
 
Into the TarPit: A TarMK Deep Dive
Into the TarPit: A TarMK Deep DiveInto the TarPit: A TarMK Deep Dive
Into the TarPit: A TarMK Deep DiveMichael Dürig
 
JAMstack your Angular Applications with Scully
JAMstack your Angular Applications with ScullyJAMstack your Angular Applications with Scully
JAMstack your Angular Applications with ScullySteffi Keran Rani J
 
Maa wp-10g-racprimaryracphysicalsta-131940
Maa wp-10g-racprimaryracphysicalsta-131940Maa wp-10g-racprimaryracphysicalsta-131940
Maa wp-10g-racprimaryracphysicalsta-131940gopalchsamanta
 
5 steps to faster web sites & HTML5 games - updated for DDDscot
5 steps to faster web sites & HTML5 games - updated for DDDscot5 steps to faster web sites & HTML5 games - updated for DDDscot
5 steps to faster web sites & HTML5 games - updated for DDDscotMichael Ewins
 
From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondVasia Kalavri
 
Web ARChive (WARC) File Format
Web ARChive (WARC) File FormatWeb ARChive (WARC) File Format
Web ARChive (WARC) File FormatSawood Alam
 
Icinga @ OSMC 2014
Icinga @ OSMC 2014Icinga @ OSMC 2014
Icinga @ OSMC 2014Icinga
 
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry confluent
 
OSMC 2014: Current state of Icinga | Icinga Team
OSMC 2014: Current state of Icinga | Icinga TeamOSMC 2014: Current state of Icinga | Icinga Team
OSMC 2014: Current state of Icinga | Icinga TeamNETWAYS
 
インフラ野郎Azureチーム Night
インフラ野郎Azureチーム Nightインフラ野郎Azureチーム Night
インフラ野郎Azureチーム NightToru Makabe
 
Implementing a Backup Catalog… on a Student Budget
Implementing a Backup Catalog… on a Student BudgetImplementing a Backup Catalog… on a Student Budget
Implementing a Backup Catalog… on a Student BudgetRoy Zimmer
 

Similar to Restoring US First Website (20)

Apache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics RedefinedApache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
 
Infinum Android Talks #04 - How to publish an Android archive (.aar) to Maven...
Infinum Android Talks #04 - How to publish an Android archive (.aar) to Maven...Infinum Android Talks #04 - How to publish an Android archive (.aar) to Maven...
Infinum Android Talks #04 - How to publish an Android archive (.aar) to Maven...
 
Infinum Android Talks #04 - How to publish an android archive (.aar) to Maven...
Infinum Android Talks #04 - How to publish an android archive (.aar) to Maven...Infinum Android Talks #04 - How to publish an android archive (.aar) to Maven...
Infinum Android Talks #04 - How to publish an android archive (.aar) to Maven...
 
Session 3 - CloudStack Test Automation and CI
Session 3 - CloudStack Test Automation and CISession 3 - CloudStack Test Automation and CI
Session 3 - CloudStack Test Automation and CI
 
Emerging storage-trends-for-containers
Emerging storage-trends-for-containersEmerging storage-trends-for-containers
Emerging storage-trends-for-containers
 
Introduction to Apache Spark 2.0
Introduction to Apache Spark 2.0Introduction to Apache Spark 2.0
Introduction to Apache Spark 2.0
 
Web Front End Performance
Web Front End PerformanceWeb Front End Performance
Web Front End Performance
 
ARK identifiers: lessons learnt at BnF: paths forward
ARK identifiers: lessons learnt at BnF: paths forwardARK identifiers: lessons learnt at BnF: paths forward
ARK identifiers: lessons learnt at BnF: paths forward
 
Into the TarPit: A TarMK Deep Dive
Into the TarPit: A TarMK Deep DiveInto the TarPit: A TarMK Deep Dive
Into the TarPit: A TarMK Deep Dive
 
JAMstack your Angular Applications with Scully
JAMstack your Angular Applications with ScullyJAMstack your Angular Applications with Scully
JAMstack your Angular Applications with Scully
 
Maa wp-10g-racprimaryracphysicalsta-131940
Maa wp-10g-racprimaryracphysicalsta-131940Maa wp-10g-racprimaryracphysicalsta-131940
Maa wp-10g-racprimaryracphysicalsta-131940
 
5 steps to faster web sites & HTML5 games - updated for DDDscot
5 steps to faster web sites & HTML5 games - updated for DDDscot5 steps to faster web sites & HTML5 games - updated for DDDscot
5 steps to faster web sites & HTML5 games - updated for DDDscot
 
EVOLVE'14 | Enhance | Anshul Chhabra & Akhil Aggrawal | Cisco - AEM High Avai...
EVOLVE'14 | Enhance | Anshul Chhabra & Akhil Aggrawal | Cisco - AEM High Avai...EVOLVE'14 | Enhance | Anshul Chhabra & Akhil Aggrawal | Cisco - AEM High Avai...
EVOLVE'14 | Enhance | Anshul Chhabra & Akhil Aggrawal | Cisco - AEM High Avai...
 
From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyond
 
Web ARChive (WARC) File Format
Web ARChive (WARC) File FormatWeb ARChive (WARC) File Format
Web ARChive (WARC) File Format
 
Icinga @ OSMC 2014
Icinga @ OSMC 2014Icinga @ OSMC 2014
Icinga @ OSMC 2014
 
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
 
OSMC 2014: Current state of Icinga | Icinga Team
OSMC 2014: Current state of Icinga | Icinga TeamOSMC 2014: Current state of Icinga | Icinga Team
OSMC 2014: Current state of Icinga | Icinga Team
 
インフラ野郎Azureチーム Night
インフラ野郎Azureチーム Nightインフラ野郎Azureチーム Night
インフラ野郎Azureチーム Night
 
Implementing a Backup Catalog… on a Student Budget
Implementing a Backup Catalog… on a Student BudgetImplementing a Backup Catalog… on a Student Budget
Implementing a Backup Catalog… on a Student Budget
 

Recently uploaded

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Restoring US First Website

  • 1. IIPC GA 2015 – Stanford CA Restoring US First Website AHMED ALSUM WEB PALEONTOLOGIST
  • 2. IIPC GA 2015 – Stanford CA2
  • 3. IIPC GA 2015 – Stanford CA The Early WWW Exhibit on 2006 http://www.slac.stanford.edu/history/earlyweb/firstpages.shtml
  • 4. IIPC GA 2015 – Stanford CA Where are the pieces?
  • 5. IIPC GA 2015 – Stanford CA What we got from SLAC • Yearly UNIX backup files. • Mix of web pages and system artifacts like UNIX backup notes and console history.
  • 6. IIPC GA 2015 – Stanford CA Where can we replay it at Stanford? epa.gov
  • 7. IIPC GA 2015 – Stanford CA Reconstruction Strategy Page Content URI Timestamp CDX File WARC File example.org 19980722050200 …. 22222 file1.warc example.org 19991201081010 …. 33333 file1.warc . . . . ….. …….. ………. ………. ………. ……….
  • 8. IIPC GA 2015 – Stanford CA Page Content CRAWLING THE ARCHIVE  Determine a list of pages.  Using wget command to combine these pages in WARC format.  wget generates CDX file but with today’s date! Page1.html Page2.html Page3.html WARC File Page1.html … … Page2.html … … Page3.html … … wget CDX File Localhost/... 2014 ……. Localhost/... 2014 ……. Localhost/... 2014 …….
  • 9. IIPC GA 2015 – Stanford CA URI Challenges  There was no one-one mapping between directory structure and URI.  Host name changed through time.
  • 10. IIPC GA 2015 – Stanford CA URI Discovery R R R R R Inlinks within the collection Backlinks in SE HTML Source code Web Archives Publication about Early Web Interviews with SLAC staff
  • 11. IIPC GA 2015 – Stanford CA URI Discovery FIRST SET OF URIS WERE ONE LEVEL  File path: 1993/SLACVM/www/192/rl2971/default.html  URL: http://slacvm.slac.stanford.edu/FIND/default.html TO LIMIT ACCESS TO SLAC NETWORK  http://slacvm.slac.stanford.edu/SLACONLY/something
  • 12. IIPC GA 2015 – Stanford CA Timestamp  We depend on the UNIX back context file that listed the last-modified date for each file.  We updated the wget-generated CDX file with the old dates.
  • 13. IIPC GA 2015 – Stanford CA Access Problem WE USED OPEN WAYBACK 2.0, BUT…  For historical reason, the start-date was hard-coded to 1996, we’ve to modify to start on 1991.  Early websites have capital letters in the URI.  SLAC early logo was on XBM format which becomes obsolete now.
  • 14. IIPC GA 2015 – Stanford CA Quality Assurance WE SHARED THE RESTORED SITE WITH THE EARLY SLAC STAFF.
  • 15. IIPC GA 2015 – Stanford CA Results Year Number of Discovered URI 1991 52 1992 119 1993 356 1994 663 1995 5264 1996 4911 1997 2026 1998 2378
  • 16. IIPC GA 2015 – Stanford CA SLAC homepage evolution
  • 17. IIPC GA 2015 – Stanford CA Home Page, we didn’t have homepage 1991 – DEFAULT PAGE 1995 – THREE HOMEPAGES 1998 – MODERN HOMEPAGE
  • 18. IIPC GA 2015 – Stanford CA First Under-construction page
  • 19. IIPC GA 2015 – Stanford CA Visit swap.stanford.edu