Client-Assisted Memento Aggregation Using the Prefer Header
Mat Kelly, Sawood Alam, Michael L. Nelson, and Michele C. Weigle
Old Dominion University
Web Science & Digital Libraries Research Group
{mkelly, salam, mln, mweigle}@cs.odu.edu
@machawk1 • @WebSciDL
Web Archiving and Digital Libraries (WADL) Workshop
June 6, 2018, Fort Worth, TX
3. @machawk1
A Framework for Aggregating Private and Public Web Archives
JCDL 2018 • June 5, 2018 • Fort Worth, TX
Today’s Memento Aggregation
Archives Queried (A0)
Motivation
Archives Queried (A0)
> Include personal archives
> Include other non-aggregated archives
6. @machawk1
Client-Assisted Memento Aggregation Using the Prefer Header
WADL 2018 • June 6, 2018 • Fort Worth, TX
State of Aggregators’ Capabilities
● Mementoweb aggregator
○ Cannot customize set of archives aggregated
○ Open source? Unavailable for individuals’ deployment
● MemGator
○ Open source ✔ https://github.com/oduwsdl/MemGator
○ Requires a static set of archives at launch
○ Set is still specified by the server; clients have no say
● With each, the set of archives is determined on the “server”.
● Neither allows client to specify set of archives aggregated.
HTTP Prefer
● RFC 7240 (June 2014)
● CLIENT requests with HTTP Header:
○ Prefer: foo; bar=""
● SERVER may respond with HTTP Header:
○ Preference-Applied: foo
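As a rough illustration of the exchange above, the sketch below parses a Prefer header value into preferences and parameters. The function names are hypothetical and the parser covers only the common cases of the RFC 7240 syntax. It splits on commas and semicolons only outside double quotes, which matters once a quoted value itself contains those characters.

```python
import re


def split_outside_quotes(text, sep):
    """Split text on sep, ignoring separators inside double-quoted strings."""
    parts, current, in_quotes, escaped = [], [], False, False
    for ch in text:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == "\\" and in_quotes:
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]


def unquote(value):
    """Strip surrounding double quotes and undo backslash escapes."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return re.sub(r"\\(.)", r"\1", value[1:-1])
    return value


def split_pair(text):
    """Split 'name=value' into (name, value); value is '' when absent."""
    name, sep, value = text.partition("=")
    if not sep:
        return name.strip(), ""
    return name.strip(), unquote(value.strip())


def parse_prefer(header_value):
    """Return {preference: (value, {parameter: value})} for a Prefer value."""
    preferences = {}
    for item in split_outside_quotes(header_value, ","):
        pieces = split_outside_quotes(item, ";")
        name, value = split_pair(pieces[0])
        params = dict(split_pair(p) for p in pieces[1:])
        # Assumption (our reading of RFC 7240): if a preference is
        # repeated, only the first instance is considered.
        preferences.setdefault(name, (value, params))
    return preferences


print(parse_prefer('foo; bar=""'))
```

A server that honors a preference could then echo its name in Preference-Applied, for example `Preference-Applied: foo`.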
OUR APPROACH:
Prefer: archives="data:application/json;charset=utf-8;base64,Ww0KIC7...NCn0="
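A minimal sketch of how a client might build such a preference, and how an aggregator might read it back, is shown here. The JSON layout of the archive list (the `id` and `timemap` keys), the example URIs, and both function names are assumptions made for illustration; the slides only show that the value is a base64-encoded JSON data URI.

```python
import base64
import json

DATA_URI_PREFIX = "data:application/json;charset=utf-8;base64,"


def build_archives_preference(archives):
    """Serialize the archive list as JSON and wrap it in a base64 data URI."""
    payload = json.dumps(archives).encode("utf-8")
    encoded = base64.b64encode(payload).decode("ascii")
    return f'archives="{DATA_URI_PREFIX}{encoded}"'


def read_archives_preference(preference):
    """Recover the archive list from an archives="data:..." preference."""
    name, _, value = preference.partition("=")
    if name.strip() != "archives":
        raise ValueError("not an archives preference")
    uri = value.strip().strip('"')
    if not uri.startswith(DATA_URI_PREFIX):
        raise ValueError("unsupported data URI")
    raw = base64.b64decode(uri[len(DATA_URI_PREFIX):])
    return json.loads(raw.decode("utf-8"))


# Hypothetical archive set: one personal archive and one public archive.
archives = [
    {"id": "personal", "timemap": "http://localhost:8080/timemap/link/"},
    {"id": "public", "timemap": "https://archive.example.org/timemap/link/"},
]

preference = build_archives_preference(archives)
print("Prefer: " + preference)
print(read_archives_preference(preference))
```

Encoding the set inline keeps the request self-describing, at the cost of a longer header on every request.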
Prefer + Memento
● S. Jones, H. Van de Sompel, et al. “Mementos in the Raw” 1
○ Prefer: original-content, original-links, original-headers
○ Mitigate replay system rewriting, make “raw” information more accessible
● D.S.H. Rosenthal “Content negotiation and Memento” 2
○ none, screenshot, altered-dom, url-rewritten, banner-inserted
○ Additional focus on derived representations
1 http://ws-dl.blogspot.com/2016/08/2016-08-15-mementos-in-raw-take-two.html
2 https://blog.dshr.org/2016/08/content-negotiation-and-memento.html
Memento Meta-Aggregator (MMA) 1
● Additional responsibilities beyond aggregation
● Provide hierarchical querying model to other aggregators
● Advanced querying models like Precedence and Short-Circuiting
● Systematic interaction and aggregation with Private and Personal Web archives
1 Kelly et al. “A Framework for Aggregating Private and Public Web Archives”, JCDL 2018
Potential Approaches Toward Archival Set Persistence for Subsequent Queries
1. Maintain state
○ content-location: /timemap/link/5bd...8e9/http://fox.cs.vt.edu/wadl2017.html
○ Not something we want to do with HTTP
2. Require re-specification with each request
○ Not portable to other users
3. Server-side set caching
○ Combinatorial explosion
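To give a rough sense of the combinatorial explosion in option 3, the sketch below counts the distinct non-empty archive sets a server could be asked to cache. It assumes the cache is keyed by an unordered set of archives, which the slides do not specify; if order also matters (as with precedence), the count is larger still. The function names and archive labels are illustrative only.

```python
from itertools import combinations


def count_archive_sets(n):
    """Number of non-empty subsets of n archives: 2^n - 1."""
    return 2 ** n - 1


def enumerate_archive_sets(archives):
    """Yield every non-empty, unordered combination of the given archives."""
    for size in range(1, len(archives) + 1):
        yield from combinations(archives, size)


# Three hypothetical archives already give seven possible sets.
for archive_set in enumerate_archive_sets(["A", "B", "C"]):
    print(archive_set)

# The count doubles (plus one) with each additional archive.
for n in (5, 10, 20):
    print(n, count_archive_sets(n))
```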