Making Future-proof Library Content for the Web: Metadata-driven Workflows and Doing Things the “Right” Way

Making future-proof library content for the Web
Metadata-driven workflows & doing things the “right” way
Tuesday, June 4, 2013

About us
NTNU UB
Gunnerus special collections
Data, since 2009
Extremists? -> ODC PDDL, CC-BY-SA, moving towards RDF as a sole format
Disagree with the trend towards discovery
Reject ideas of working around legacy crap
Personal journey -> from scripter to coder, architect…planner

The big idea
“Like, dude, we totally need a new webpage”
depression, they mean a webpage
The average library manager is aware of their IT shortcomings
We need a new way of getting data and assets to users
Process of asset production, ingestion, documentation, preservation, storage and provision

Capability stack
data provision
resource linking
search
addressing

From webapp to Web
“You try finding a suitable image, smartass”
Web scale? No, the only webscale thing is the Web
Web means via HTTP and standard Web tech…not weird library shit
Being a part of the Web is more important than anything else
Do what serious Web companies do
Consume data to provide data

Technology stack
HTTP
RDF
JSON-LD, RDF/XML
Indexer
HTML5
Apache + Tomcat + JAX-RS + Jena
Elasticsearch
Google, Yandex…you
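
One way to picture the HTTP, JSON-LD and HTML5 layers working together is content negotiation: the same resource address answers with data or with a page depending on what the client asks for. The following is a minimal sketch only, written in Python rather than the JAX-RS/Jena stack named on the slide. The base URI, the record, and the function names are all hypothetical, and q-values in the Accept header are ignored for brevity.

```python
import html
import json

# Assumed values for illustration; not the real NTNU namespace or data.
BASE_URI = "http://example.org/resource/"
RECORD = {"id": "example-0001", "title": "Example photograph"}


def to_jsonld(record):
    """Serialize the record as JSON-LD using Dublin Core terms."""
    doc = {
        "@context": {"dcterms": "http://purl.org/dc/terms/"},
        "@id": BASE_URI + record["id"],
        "dcterms:title": record["title"],
    }
    return json.dumps(doc, indent=2, sort_keys=True)


def to_html(record):
    """Serialize the same record as a tiny HTML5 page."""
    title = html.escape(record["title"])
    return f"<!DOCTYPE html><title>{title}</title><h1>{title}</h1>"


SERIALIZERS = {
    "application/ld+json": to_jsonld,
    "text/html": to_html,
}


def negotiate(accept_header):
    """Return the first offered media type we can serve, else text/html."""
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in SERIALIZERS:
            return media_type
    return "text/html"


def render(record, accept_header):
    """Produce the representation the client asked for."""
    return SERIALIZERS[negotiate(accept_header)](record)


if __name__ == "__main__":
    print(render(RECORD, "application/ld+json"))
```

A browser sending `text/html` would get the page, while an indexer or another library asking for `application/ld+json` would get the data from the same address.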

Challenges and issues
Here
Status quo: IT policy
We need to revise everything (IT plan from early 2000s)
Where we are vs. where we need to be
Partners
Architectural choices

From metadata to data-driven
“Hi there!”
What we’re doing right now
Adopting linked data changed the way we looked at metadata
Much more at the centre of the process
Workflows are more important than publishing data
Data is very important
Scripting removes 2/3 of the workload, data drives the scripts
Killing holy cows…quality of data and image quality…
We can do a lot…
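
To make "data drives the scripts" concrete, here is a hedged sketch of what that can look like: the metadata table, not the script, decides which files get processed and what the outputs are called. The column names, derivative profiles and file names are assumptions made up for this example, not the actual NTNU workflow.

```python
import csv
import io

# Hypothetical metadata export; in practice this would come from the catalogue.
SAMPLE = """identifier,source_file,title
example-0001,scan_001.tif,Example photograph
example-0002,scan_002.tif,Example map
"""

# Assumed derivative profiles: name -> (file extension, maximum width in pixels).
DERIVATIVES = {
    "web": ("jpg", 1600),
    "thumb": ("jpg", 200),
}


def plan_tasks(csv_text):
    """Turn metadata rows into a list of conversion tasks.

    Nothing about individual files is hard-coded: adding a row to the
    metadata adds work, and adding a profile adds a derivative.
    """
    tasks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, (extension, max_width) in DERIVATIVES.items():
            tasks.append({
                "source": row["source_file"],
                "target": f"{row['identifier']}_{name}.{extension}",
                "max_width": max_width,
            })
    return tasks


if __name__ == "__main__":
    for task in plan_tasks(SAMPLE):
        print(task)
```

The task list could then be handed to whatever image tool is in use; the point of the sketch is only that the script stays generic while the data changes.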

scanning
unique identifier
(meta-)data
cataloguing
preservation
transformation
Web
storage/delivery
ingestion
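
The first steps of the workflow above can be sketched as follows: a scan gets a unique identifier and a checksum at ingestion, so later preservation and delivery steps can refer to it and verify it. This is an illustration under assumptions, not the deck's implementation: random UUIDs and SHA-256 are stand-ins chosen for the example, and `ingest` and its record layout are hypothetical.

```python
import hashlib
import uuid


def mint_identifier():
    """Mint a unique identifier (a random UUID is assumed here)."""
    return str(uuid.uuid4())


def fixity(data):
    """Compute a SHA-256 checksum for later preservation checks."""
    return hashlib.sha256(data).hexdigest()


def ingest(scan_bytes, metadata):
    """Bundle identifier, checksum and metadata into one ingest record."""
    return {
        "id": mint_identifier(),
        "sha256": fixity(scan_bytes),
        "metadata": dict(metadata),  # copy so the caller's dict is not shared
    }


def verify(record, scan_bytes):
    """Check that stored bytes still match the checksum taken at ingest."""
    return record["sha256"] == fixity(scan_bytes)


if __name__ == "__main__":
    record = ingest(b"fake image bytes", {"title": "Example photograph"})
    print(record["id"], verify(record, b"fake image bytes"))
```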

Documents, data, search & discovery
“Something, right here, really stinks”
The problem with discovery: it’s not Web, it’s just on the web…sort of
Search: One page of many millions of pages
Come to us via your preferred route
Add links to enrich
Provide content
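
"Add links to enrich" can be illustrated with a small sketch that attaches `owl:sameAs` links to a JSON-LD description, pointing at the same thing in someone else's dataset. The URIs below are placeholders and `add_links` is a hypothetical helper; which vocabulary and targets to use is a choice the slide does not specify.

```python
import json


def add_links(doc, same_as):
    """Return a copy of a JSON-LD document with owl:sameAs links added.

    The input is left untouched and duplicate links are skipped.
    """
    enriched = dict(doc)
    context = dict(enriched.get("@context", {}))
    context["owl"] = "http://www.w3.org/2002/07/owl#"
    enriched["@context"] = context

    links = list(enriched.get("owl:sameAs", []))
    for uri in same_as:
        node = {"@id": uri}
        if node not in links:
            links.append(node)
    enriched["owl:sameAs"] = links
    return enriched


if __name__ == "__main__":
    # Placeholder document and target URI, for illustration only.
    doc = {
        "@context": {"dcterms": "http://purl.org/dc/terms/"},
        "@id": "http://example.org/resource/example-0001",
        "dcterms:title": "Example photograph",
    }
    linked = add_links(doc, ["http://example.com/authority/123"])
    print(json.dumps(linked, indent=2))
```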

The right tools for the job
“We’re going to need a bigger hammer…”
Documentation at every stage, code, processing, etc.
Technology choices
Nothing wrong with being custom…
Scripting IS on UNIX…
You’re saddled with legacy crap —WinXP? There is a solution

It grows
“Yeah… it’s this way…”
Seen from our own experience, e.g. face detection for image cataloguing
Provide solutions for real problems, people come back
Acceptance? In current climate?
…on offer from commercial providers is the same old stuff

Extending to the institutional level
“Well, this is nice”
No reason to not extend this thinking to every level
PDF/A…
Partners with content or DIY
Slow and uphill struggle
Better than the alternative

Takeaways
Om nom nom
Talk
Work towards the goal of being of the Web
Provide data in the formats for the Web
Consume and use the same data
ELAG2013 -> I see common movement, consensus: Sven Schlarb, Joachim Neubert, Niklas

rurik.greenall@ub.ntnu.no
@brinxmat
folk.ntnu.no/greenall
Thanks, folks!

“Uret innpakket i plast” ©2013 Nils Eikeland/NTNU, CC-BY-SA http://creativecommons.org/licenses/by-sa/3.0/
Hoodie Dude (http://www.flickr.com/photos/elvissa/6653254409/) / CC BY 2.0 (http://creativecommons.org/licenses/by/2.0/)
The wall (http://www.flickr.com/photos/86778817@N00/100046174/in/photostream/) / CC BY 2.0 (http://creativecommons.org/licenses/by/2.0/)
OK, let me drive... (http://www.flickr.com/photos/fhmira/5307432721/) / CC BY-SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0/)
Tasmania-4200 (http://www.flickr.com/photos/julieedgley/3258094114/lightbox/) / CC BY-SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0/)
“Metadata 00000001” ©2013 Rurik Greenall/NTNU, CC-BY-SA http://creativecommons.org/licenses/by-sa/3.0/
In Search Of... (http://www.flickr.com/photos/satterwhiteb/5518767608/) / CC BY 2.0 (http://creativecommons.org/licenses/by/2.0/)
LART: Essential for every BOFH (http://www.flickr.com/photos/bike/198959253/) / CC BY-SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0/)
Pork dumplings (http://www.flickr.com/photos/secretlondon/3349332093/) / CC BY-SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0/)
following the crowd (http://www.flickr.com/photos/colhou/663746322/lightbox/) / CC BY 2.0 (http://creativecommons.org/licenses/by/2.0/)
“Spesialsamlingsgjengen” ©2013 Nils Eikeland/NTNU, CC-BY-SA http://creativecommons.org/licenses/by-sa/3.0/
