10. Archiving workflow
Collect Analyse Archive Present
Two-stage archiving strategy: web -> analysing storage -> archive
Archivist describes target
HTML and API crawlers fetch content
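
The collect step above can be sketched as a small data structure. This is only an illustration: `CrawlTarget`, its fields, and `plan_fetches` are hypothetical names, not part of any ARCOMEM interface, and the real target description is likely richer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CrawlTarget:
    """Hypothetical target description written by the archivist."""
    seed_urls: List[str] = field(default_factory=list)    # entry points for the HTML crawler
    api_queries: List[str] = field(default_factory=list)  # queries for the API crawlers
    keywords: List[str] = field(default_factory=list)     # topics that define relevance


def plan_fetches(target: CrawlTarget) -> List[Tuple[str, str]]:
    """Turn a target description into (crawler, request) jobs."""
    jobs = [("html", url) for url in target.seed_urls]
    jobs += [("api", query) for query in target.api_queries]
    return jobs
```

The point of the split is that one description drives both crawler types, so the archivist specifies the target once.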
11. Archiving workflow
Different modules analyse semantic
information & social context to filter
relevant content
HBase and RDF triple storage
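
A minimal sketch of the filtering idea above, assuming a toy scoring scheme: the keyword-overlap "semantic" score, the share-count "social" score, the weights, and the threshold are all illustrative assumptions, not the actual analysis modules. Documents are plain dictionaries here; HBase and the RDF triple storage are not modelled.

```python
def relevance(doc, keywords, weight_social=0.3):
    """Combine a crude semantic score with a crude social score (both in [0, 1])."""
    words = doc["text"].lower().split()
    # Semantic signal (assumed): fraction of words that match a target keyword.
    semantic = sum(word in keywords for word in words) / max(len(words), 1)
    # Social signal (assumed): share count, capped at 100 shares.
    social = min(doc.get("shares", 0) / 100, 1.0)
    return (1 - weight_social) * semantic + weight_social * social


def filter_relevant(docs, keywords, threshold=0.2):
    """Keep only documents whose combined score reaches the threshold."""
    return [doc for doc in docs if relevance(doc, keywords) >= threshold]
```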
12. Archiving workflow
Only relevant content is preserved in
(W)ARC format
Semiautomatic content selection
Heritrix and Wayback compatible
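
To make the (W)ARC format concrete, here is a simplified sketch that serializes one WARC/1.0 `resource` record using only the Python standard library. The function name `warc_resource_record` is hypothetical. Crawlers such as Heritrix typically write `response` records that include the HTTP headers, so this shows the record layout only, not the actual archive output.

```python
import uuid
from datetime import datetime, timezone


def warc_resource_record(uri: str, payload: bytes, content_type: str = "text/html") -> bytes:
    """Serialize one WARC/1.0 'resource' record as bytes."""
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {timestamp}",
        f"WARC-Target-URI: {uri}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Header lines end with CRLF; an empty line separates headers from the block.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # Every record is terminated by two CRLF pairs.
    return head + payload + b"\r\n\r\n"
```

Records built this way can be concatenated into a single `.warc` file, since each record declares its own length.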
13. Archiving workflow
Full-text search and faceted browsing
Semantic and social contextualization
Visualizations to be developed on top
(not in ARCOMEM scope)
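
The full-text search and facet browsing named on this slide can be illustrated with a toy in-memory inverted index. This is a sketch under assumptions: the document fields (`text`, `source`) and the function names are hypothetical, and a real deployment would use a dedicated search engine rather than this code.

```python
from collections import Counter, defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document positions containing it."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(docs):
        for word in doc["text"].lower().split():
            index[word].add(doc_id)
    return index


def search(docs, index, query, facet):
    """Return documents matching all query terms, plus counts per facet value."""
    terms = query.lower().split()
    if not terms:
        return [], Counter()
    # A document is a hit only if it contains every query term.
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    results = [docs[i] for i in sorted(hits)]
    return results, Counter(doc[facet] for doc in results)
```

The facet counts let a user narrow a result list, for example by choosing only the results from one source.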