Imagine a world where publishers can create a single publication that functions flawlessly both online and offline, packaged or unpackaged. As publishing technologies continue to converge, it's not a far-off dream. Learn how Web Publications, EPUBs, and related standards for accessibility, annotations, and image interoperability are coming together in this presentation.
This presentation was originally given by Bill Kasdorf, VP and Principal Consultant at Apex CoVantage, at the STM Digital Publishing 2016 conference in London, UK.
The Interoperability Imperative
1. Bill Kasdorf
VP and Principal Consultant, Apex Content Solutions
Member of IDPF Board, EPUB 3 WG, W3C DPUB IG
The Interoperability Imperative
How Publishing Technologies Continue to Converge
3. Books in Browsers 2014:
“Bridging the Web and Digital Publishing”
Unofficial Draft 30 June 2015:
“EPUB+WEB” (AKA “EPUB-WEB”)
GitHub, 24 September 2015:
“Portable Web Documents for the OWP”
W3C Working Draft, 15 October 2015:
“Portable Web Publications for the OWP”
W3C Editor’s Draft, 28 November 2016:
“Web Publications for the OWP”
(a work in progress)
4. (P)WP
Recent Realization:
First we need to define a Web Publication!
Meaning an arbitrarily extensive and complex
collection of resources on the web
(web pages, CSS, fonts, images, media, scripts, etc.)
that has an identity, that can be referenced, etc.
Whether/how it’s packaged is a separate issue.
“PUBLICATION” ≠ “DOCUMENT”
but
“BUNCH OF STUFF ON THE WEB”
might = “PUBLICATION”
6. (P)WP
The “P” is coming to mean “packaged” more than “portable.”
7. [Infographic: STM Tech Trends: Outlook 2020, “The Technology Floodgates Are Open.”
Sections: 1 Publishing; 2 Users; 3 Research; 4 Privacy and Security; 5 Technology.
Headline themes: User-centered Publishing delivers Precision Information; The Machine is the New Reader; Science as a Social Machine; Data Privacy requires a Web of Trust.]
It’s not
just about text.
And almost all of this
depends on Web
technologies.
8. Minor update to HTML 5 on 1 Nov. 2016
Mostly fine-tuning to align with actual practice.
A few new elements, like <details> and <summary>,
for info users can choose whether or not to read.
Removed some features, mostly very technical.
A few changes, like <figcaption> anywhere in <figure>.
HTML 5.2 is being worked on, due late 2017.
This will continue to evolve.
This is a good thing!
HTML 5.1
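To make the new elements on this slide concrete, here is a minimal Python sketch that holds a small HTML 5.1 fragment and uses the standard-library parser to list the elements it contains. The fragment's wording and the image file name are invented for illustration; they are not from the presentation.

```python
from html.parser import HTMLParser

# Illustrative HTML 5.1 fragment (text and file name are made up).
# <details>/<summary> wrap content a reader may choose to expand;
# <figcaption> sits in the middle of <figure>, not first or last.
FRAGMENT = """
<details>
  <summary>Supplementary methods</summary>
  <p>Optional detail that a reader can expand or leave collapsed.</p>
</details>
<figure>
  <img src="figure1.png" alt="Dose-response curve">
  <figcaption>Figure 1. Dose-response curve.</figcaption>
  <p>Source data are available on request.</p>
</figure>
"""


class TagCollector(HTMLParser):
    """Record every start tag in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def collect_tags(html_text):
    """Return the start tags found in html_text, in order."""
    parser = TagCollector()
    parser.feed(html_text)
    parser.close()
    return parser.tags


tags = collect_tags(FRAGMENT)
print(tags)
```

The parser only lists tags; it does not validate the fragment against the HTML 5.1 content model.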
9. Working toward print-quality rendering.
“Grid” and “flexbox” for complex layouts.
CSS variables and the “calc” function for adaptability.
New font features for sophisticated display.
Microsoft is working on a new spec for table behavior;
goal is interoperability among browsers.
These will help make complex (STM!) content
more reliable and responsive.
CSS is modular: ongoing progress.
CSS
10. The Web Publication Vision:
ONE PUBLICATION FOR BOTH
ONLINE AND OFFLINE USE.
The same content in two different “states”:
Offline, packaged or cached;
Online, with all essential resources linked.
A canonical URL that leads to both.
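One way to picture "one publication, two states" is the rough Python sketch below. It is purely hypothetical: the class name `WebPublication`, its fields, and the `locate` method are assumptions made for illustration and do not come from any W3C draft. It only shows a single canonical URL acting as the publication's identity while each resource resolves either online or from a packaged or cached copy.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WebPublication:
    """Hypothetical model: one identity, many resources, two states."""

    canonical_url: str                       # the publication's identity
    resources: List[str] = field(default_factory=list)  # relative paths
    local_copy: Optional[str] = None         # packaged/cached location, if any

    def locate(self, resource: str, online: bool) -> str:
        """Return where to fetch a resource in the current state."""
        if resource not in self.resources:
            raise KeyError(resource)
        if online:
            return f"{self.canonical_url.rstrip('/')}/{resource}"
        if self.local_copy is None:
            raise LookupError("no packaged or cached copy available offline")
        return f"{self.local_copy.rstrip('/')}/{resource}"


# Example values are invented.
pub = WebPublication(
    canonical_url="https://publisher.example.org/pubs/interop",
    resources=["index.html", "css/main.css", "img/fig1.png"],
    local_copy="/cache/interop",
)
online_location = pub.locate("index.html", online=True)
offline_location = pub.locate("index.html", online=False)
print(online_location)
print(offline_location)
```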
11. Wouldn’t it be great if
there were no difference between
an online publication
and an EPUB?
18. EPUB 3 has become essential
to the publishing ecosystem
E-Readers
It’s the “master format” for virtually all systems.
Accessibility
It’s the format for interchange of accessible content.
Education
Platforms are built on the EPUB for Education profile.
Not Just Books
It’s used for all kinds of publications.
Global
Widely adopted in US, EU, Far East, Israel.
20. We want to avoid two competing specs.
These need to be the same thing.
Could be one master spec,
or a layered spec with “profiles”:
e.g., PWP as a profile of a WP (a type of WP),
and “EPUB 4” in turn as a profile of PWP
(like EPUB for Education is for EPUB),
a type of PWP requiring more predictability,
accessibility, archivability.
EPUB 4 vs. (P)WP
This is why
we’re working on
combining the IDPF
into the W3C.
21. Publishing Business Group
Provides a formal voice for publishing in the W3C.
Dues are comparable to IDPF dues.
IDPF members are automatically members
for two years at current IDPF dues.
Publishing Working Group
Full participation in W3C standards at W3C dues.
EPUB 3 Community Group
Free to all; maintains EPUB 3.
Here’s the plan.
23. People are afraid publishers
will become “lost” in the W3C.
This plan is designed to ensure
that publishers have
an even greater presence and influence
and can participate
at costs that even
small publishers can bear.
25. Tightens up EPUB 3 without breaking it.
New spec format, better integrated.
Deprecates unused features of EPUB 3.0.
Clarifies/extends support for remote resources
(e.g., metadata, fonts, datasets).
Improves CSS behavior between author, reading system (RS), and user.
Improved and stricter accessibility support.
Expected to advance to member vote
as final Recommended Spec on Thursday.
EPUB 3.1
26. Better alignment with OWP.
Undated references to HTML & SVG to stay in synch.
No EPUB CSS profile; uses CSS WG “official definition.”
Some metadata improvements.
Prioritizes linked bibliographic metadata records.
Deprecates @refines attribute and adds
more explicit attributes for specific functionality.
Final notice on NCX.
Slated for removal in next major revision of EPUB.
EPUB 3.1
27. EPUB in Microsoft Edge
Read EPUBs natively just as you can with PDFs.
HTML5-based version of WoodWing
Magazine industry’s leading production system
moves from proprietary to OWP software.
VitalSource Content Studio
Easy cloud-based creation of complex
educational media as EPUB 3.
IBM adopts EPUB 3 for all documents
Significant move away from PDF.
EPUB is Not Just for Books!
29. Separate spec devoted to accessibility.
Clear guidelines to enable certification of accessibility
and discovery of accessible features in an EPUB.
Based on WCAG 2.0: Level A is required, Level AA is recommended.
Adds publication-specific requirements.
Requires accessibility-specific metadata.
Techniques document provides “how to do it” advice.
Applicable and referenceable by
any version of EPUB and other specs too.
EPUB Accessibility 1.0
30. Categories of compliance.
“Discovery-Enabled”: Just metadata.
“Accessible”: Metadata + WCAG 2.0 + EPUB requirements.
“Optimized”: Metadata + specific features.
Metadata aligned with schema.org.
accessMode (textual, visual, auditory, tactile).
accessibilityFeature (what features does it have).
accessibilityHazard (e.g., flashing can cause seizures).
accessibilitySummary (human-readable explanation).
accessModeSufficient (e.g. text + alt text = textual).
EPUB Accessibility 1.0
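As a rough illustration of the schema.org-aligned metadata listed above, the Python sketch below describes one hypothetical publication, checks that the discovery properties are present, and writes them out as `<meta property="schema:…">` elements of the kind used in an EPUB package document. The metadata values are invented. Treating four of the five properties as the minimum for discovery is an assumption of this sketch; check it against the specification.

```python
from xml.sax.saxutils import escape

# Hypothetical metadata for one publication; values are illustrative.
metadata = {
    "accessMode": ["textual", "visual"],
    "accessModeSufficient": ["textual"],
    "accessibilityFeature": ["alternativeText", "tableOfContents"],
    "accessibilityHazard": ["noFlashingHazard"],
    "accessibilitySummary": [
        "All images have alt text; the navigation document covers every chapter."
    ],
}

# Assumption of this sketch: these properties are the minimum for discovery.
REQUIRED_FOR_DISCOVERY = (
    "accessMode",
    "accessibilityFeature",
    "accessibilityHazard",
    "accessibilitySummary",
)


def missing_discovery_properties(md):
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED_FOR_DISCOVERY if not md.get(name)]


def to_package_meta(md):
    """Render each property value as one package-document <meta> element."""
    lines = []
    for prop, values in md.items():
        for value in values:
            lines.append(f'<meta property="schema:{prop}">{escape(value)}</meta>')
    return lines


missing = missing_discovery_properties(metadata)
meta_lines = to_package_meta(metadata)
print("missing:", missing)
print("\n".join(meta_lines))
```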
32. W3C Web Annotation Working Group
Web Annotation Data Model
Web Annotation Vocabulary
Web Annotation Protocol
Provide interoperable “data structures” for annotations:
Can exchange annotations between systems.
Can store annotations on an annotation server.
Put annotations on text, images, videos, etc.
Final Recommendations to be published in February.
Annotations
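The interoperable "data structure" mentioned above is a JSON-LD document. The Python sketch below builds one minimal annotation following the Web Annotation Data Model: a plain-text comment attached to a quoted passage of a web page. The helper name `make_annotation`, the example.org URLs, and the sample text are assumptions for illustration only.

```python
import json


def make_annotation(anno_id, target_url, quoted_text, comment):
    """Build a minimal Web Annotation as a JSON-serializable dict."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        # The body is what the annotator says.
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        # The target is what the annotation is about: here, a quoted
        # passage inside a web page, selected by its exact text.
        "target": {
            "source": target_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": quoted_text,
            },
        },
    }


# Example values are invented.
annotation = make_annotation(
    anno_id="https://annotations.example.org/anno/1",
    target_url="https://publisher.example.org/articles/42.html",
    quoted_text="interoperability among browsers",
    comment="Compare with the table-behavior work mentioned for CSS.",
)
print(json.dumps(annotation, indent=2))
```

Because the result is ordinary JSON, the same structure can be exchanged between systems or stored on an annotation server.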
33. Annotating All Knowledge
Coalition of over 70 scholarly publishers, platforms,
libraries, and technology organizations.
Open source, standards-based, supports key formats
(HTML, PDF, EPUB, images, video and data).
Ambitious 3-year timeline:
Pilots at JSTOR, arXiv, eLife, etc.
Force11: Community Platform, Working Group.
Annotations
34. Ambitious AAK 3-Year Timeline
Year 1, Design & Build: Interview users, run
experiments, gather requirements. Discussions with key
platforms. Ship standards. Write software.
Year 2, Deploy: Make annotation available with
articles, books and other media objects.
Year 3, Market: Drive adoption through partnerships
and targeted programs.
(Thanks to Maryann E. Martone, Hypothes.is)
Annotations
36. IIIF: The International Image
Interoperability Framework
A Community . . .
Over 600 national libraries, research institutions,
museums, tech firms, aggregators, and projects.
. . . that creates APIs . . .
Image (the pixels); Presentation (human readable info);
Authentication (almost finished); Search (to come).
. . . that it uses to create interoperable services.
Focusing on providing a good UX.
Interoperable Images
37. Image API
URL and identifier for image; can express regions,
size, mirror, rotation, rights info, multiple versions.
Presentation API
Structure, properties (labels, rights, technical info, links),
can associate transcription, translation, commentary,
etc. with regions of an image.
Working on Audio, Video; 3D in future
All based on Web technologies, including Annotations.
Interoperable Images
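The Image API expresses region, size, mirroring, and rotation directly in the image URL. The Python sketch below assembles such a URL using the path pattern `{identifier}/{region}/{size}/{rotation}/{quality}.{format}` with Image API 2.x parameter values. The function name `iiif_image_url`, the server address, and the identifier are hypothetical; the keyword `full` for size is a 2.x assumption and may differ in other versions.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation=0, mirror=False, quality="default", fmt="jpg"):
    """Assemble an Image API request URL (sketch, 2.x-style parameters)."""
    # A leading "!" on the rotation segment requests mirroring.
    rotation_segment = f"{'!' if mirror else ''}{rotation}"
    # Identifiers may contain reserved characters, so percent-encode them.
    encoded_id = quote(identifier, safe="")
    return (f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/"
            f"{rotation_segment}/{quality}.{fmt}")


# Server and identifier are invented for illustration.
BASE = "https://images.example.org/iiif"

whole_image = iiif_image_url(BASE, "page-001")
detail = iiif_image_url(
    BASE, "page-001",
    region="100,200,800,600",  # x,y,width,height in pixels
    size="400,",               # scale to 400 pixels wide
    rotation=90,
    mirror=True,
)
print(whole_image)
print(detail)
```

Because every variant is just a URL, any client that understands the pattern can request the same region or size from any conforming server.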
38. WEB PUBLICATIONS
Web Publications for the Open Web Platform: https://w3c.github.io/dpub-pwp/
HTML 5.1
https://www.w3.org/TR/html/
EPUB 3.1
http://www.idpf.org/epub/31/spec/epub-spec.html
ACCESSIBILITY
EPUB Accessibility 1.0: http://www.idpf.org/epub/a11y/
EPUB Accessibility Techniques: http://www.idpf.org/epub/a11y/techniques/techniques.html
WEB ANNOTATIONS
Data Model: https://www.w3.org/TR/annotation-model/
Vocabulary: https://www.w3.org/TR/2016/CR-annotation-vocab-20160906/
Protocol: https://www.w3.org/TR/2016/CR-annotation-protocol-20160906/
ANNOTATING ALL KNOWLEDGE
https://hypothes.is/annotating-all-knowledge/
INTERNATIONAL IMAGE INTEROPERABILITY FRAMEWORK
http://iiif.io/
Resources