Keynote presentation at JCDL 2019 at UIUC, on the interaction between standards (development and usage) and communities. Looking at Linked Open Data, digital library protocols, and evaluation of standards practices.
@azaroth42 · rsanderson · @getty.edu
IIIF: Interoperability, Standards and Communities
IIIF Design Patterns
1. Scope design through shared use cases
2. Design for international use
3. As simple as possible, but no simpler
4. Make easy things easy, complex things possible
5. Avoid dependency on specific technologies
6. Use REST / Don’t break the web
7. Separate concerns, keep APIs loosely coupled
8. Design for JSON-LD, using LOD principles
9. Follow existing standards, best practices
10. Define success, not failure (for extensibility)
https://iiif.io/api/annex/notes/design_patterns/
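As an illustrative sketch (not from the slides), patterns 8 and 10 can be seen together in a minimal IIIF Presentation 3.0 manifest: the JSON-LD context makes the document Linked Open Data, and a consumer that checks only for what it needs, rather than rejecting what it doesn't recognize, tolerates extensions gracefully. The URI, label, and extension property below are hypothetical.

```python
import json

# Hypothetical minimal IIIF Presentation 3.0 manifest: the @context makes it
# JSON-LD / Linked Open Data (pattern 8); the URI and label are invented.
manifest = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/iiif/book1/manifest",  # hypothetical URI
    "type": "Manifest",
    "label": {"en": ["Example Book"]},
    # A hypothetical extension property the consumer was never told about:
    "ex:customFeature": "extra data",
}

def is_usable_manifest(doc):
    # Pattern 10, "define success, not failure": check for the properties we
    # need and ignore anything unknown, rather than rejecting extensions.
    return doc.get("type") == "Manifest" and "label" in doc

round_tripped = json.loads(json.dumps(manifest))
print(is_usable_manifest(round_tripped))  # True
```

The point of the success-oriented check is that the publisher and consumer stay loosely coupled (pattern 7): either side can evolve without breaking the other.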
Linked Art Profile
A Linked Open Usable Data model, collaboratively designed
to work across cultural heritage organizations, that is easy
to publish and enables a variety of consuming applications.
Main Design Principles:
• Focused on Usability, not 100% precision / completeness
• Consistently solves actual challenges from real data
• Development is iterative, as new use cases are found
• Solve 90% of use cases, with 10% of the effort
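As a hedged sketch of what "easy to publish" means in practice: a Linked Art description of an object is a small JSON-LD document with readable keys rather than raw ontology URIs. The URI and title below are invented, and the property names follow the published Linked Art profile as I understand it.

```python
# Hypothetical record following the Linked Art JSON-LD profile; the URI and
# title are invented; "_label", "identified_by" and "Name" follow the profile.
record = {
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "https://example.org/object/1",  # hypothetical URI
    "type": "HumanMadeObject",
    "_label": "Example Painting",
    "identified_by": [
        {"type": "Name", "content": "Example Painting"}
    ],
}

# The "usable" part: a consumer gets a display title with plain dictionary
# access -- no RDF tooling required, though the document is still valid JSON-LD.
title = record["identified_by"][0]["content"]
print(title)  # Example Painting
```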
Linked Art Collaboration
Working to formalize the profile, funded by Kress & AHRC
• Getty
• Rijksmuseum
• Louvre
• Metropolitan Museum of Art
• Smithsonian
• MoMA
• V&A
• NGA
• Philadelphia Museum of Art
• Indianapolis Museum of Art
• The Frick Collection
• Harvard University
• Princeton University
• Yale Center for British Art
• Oxford University
• Academia Sinica
• ETH Zurich
• FORTH
• Zeri Foundation (U. Bologna)
• Canadian Heritage Info. Network
• American Numismatic Society
• Europeana
And thank you especially to Patricia for starting us off on such a thought provoking topic! I don’t have her courage for a short talk and lots of questions, but I believe I’ll have a loud 5 minute warning about 10am.
I’d like to focus in a little on the interactions between standards and communities – their creation, adoption, and maintenance, rather than the technical details of the standards themselves.
How we can make those interactions, and this one, a dialogue and not just consumption. How can we balance innovation, with consistency and sustainability?
What are some of the pitfalls that prevent good standards from reaching the necessary critical mass, such that as a community, we can provide better, sustainable services.
That sounds like a big topic for a year long course, let alone a single discussion! I’ll focus on digital cultural heritage standards that I’ve had some part in across a variety of organizations and timeframes, and the lessons that I have learnt along the way.
I realize this is kind of a CV slide, but I promise this is the only one. And if you don’t know me, it’s actually a pretty good introduction.
Z39.50: a non-web protocol for search and browse of library catalogs.
SRU: a web-oriented attempt to capture the good parts of Z39.50.
OAI-PMH: a web service for harvesting metadata records, from the same time as SRU.
ResourceSync: a more recent attempt to bring PMH into the modern web stack, via sitemaps.
OAI-ORE: an early ontology for Linked Data aggregations of web resources.
Memento: time-travel for the web! HTTP-layer access to web archives.
Web Annotation: a Linked Data model and protocol for annotations of web resources.
IIIF: access to images, and the presentation information to render them.
And we'll get to this one towards the end!
Apart from Z39.50 and OAI-PMH, I’ve been part of the editorial group for all of these specifications. I was part of the Z39.50 Implementers Group or ZIG, and spoke often with folks in the early days of PMH while I was working on SRU, and implemented both.
Why spend so much time working on standards?
Standards are important for consistency. Consistency of models, of data, of systems.
And consistency is a prerequisite for interoperability of systems, which is the important thing.
I borrow Stefan Gradmann's division of interoperability into six facets, but intend to focus on technological standards, through the lens of the interactions between people, organizations and communities.
[Functional = goals]
Okay, I still haven’t answered why I really care about standards.
To me it’s about Impact – In order to connect people with the information they’re seeking, and with each other, we need standards for interoperability, because achieving that aim cannot be done in isolation.
Connected systems and connected, consistent data are needed to truly connect people with knowledge and the world’s cultures.
And that connection, where the water hits the rock, is where impact is generated.
As an aside, I have taken this exact photo, but my personal photography digital library is just a year by year folder backed up to dropbox.
Which brings me to “digital”…
To me, Digital Libraries are just libraries, because “Digital” is a means and not an end. Similarly, Digital is not just the web, although we might be excused for thinking that way these days, to Vint Cerf’s horror.
I believe that Digital is really shorthand for ”connected to the network”, which exists to connect people with each other and with knowledge and culture. Or ads and cat videos, as it turns out … to everyone’s horror.
To put it another way, standards let us work together, without having hundreds of people in hundreds of meetings. They let us implement technologies and create information independently, safe in the knowledge that the standards we’re conforming to will ensure consistent data and APIs, that in turn will drive highly usable applications. That they successfully separate and balance the concerns of the data publisher and the data consumer.
Why a photograph of keas, other than the New Zealand connection? Keas will work together to accomplish their objectives, to the same degree as chimpanzees.
Sometimes those objectives are a bit harder to achieve, like figuring out how to get into your car or what the windshield wipers do.
Some standards are very successful. Some have seen very little adoption, regardless of how good they might be in isolation.
What I’m going to explore is the interactions at a personal and community level, not the technical details of how to unlock the doors and replace windshield wiper blades.
How people interact in the context of defining and using technical interoperability standards, rather than how the machines that work for us interact.
Which is why I was so happy to see the theme of this year’s conference!
Standards are developed by connecting people that share similar goals, that would be advanced by working together. This raises questions about the scale of the organization or standards body, the related scope for the potential audience and the process by which the standards are created, disseminated, adopted, and maintained.
There are four questions to answer about the interaction between the standards process, and the relevant communities.
[…]
The answers are different at different scales, from small groups up to global organizations.
And in thinking about the scale of communities, how should we consider the effects of it on the standards, the process and their adoption?
Direction and vision are required... the focus of the community needs to be articulated clearly and convincingly. Communities need awareness and understanding of the problems that they are facing, and the motivation to work together to overcome them.
There needs to be leadership, but not at the expense of being open and welcoming.
Participation is the key requirement in solving practical challenges, not reputation. And active participation, not just lurking. The community needs to not only think about the end result, but actively consider and adapt how it is getting there.
So Flexible, not Agile? I prefer the notion of the community being flexible but strong – when the willow tree sways with the wind it's flexible, but is true to its purpose. A cat agilely avoiding a dangerous situation doesn't address the challenge, it just avoids it. Agility lets you dance around the problem; Flexibility lets you overcome it. Focused, Open, Active, Flexible … or FOAF. That should be Google-unique, right?
PMH, ResourceSync, ORE.
I know Memento is not an OAI spec, instead it’s RFC7089, but the participants and scale are approximately the same.
OAI work is typically funded by grants, rather than by the participating institutions. This provides some benefits in terms of diversity, funding dedicated work and holding outreach events, but conversely makes it harder to sustain once the money runs out. At project scale, it’s much harder to be involved. Either you’re part of the project, or you’re not. There aren’t clear on-ramps to bring people in to the work, or for how they can participate meaningfully.
Project scale relies heavily on brilliant, strong leaders with a great vision.
This may look familiar! LC as an organization has more irons in the fire than just Z39.50 or SRU; there were connections between MARC development and Z39.50 development, of course. Which brings it up slightly towards community scale, but without most of the benefits of that scale.
The biggest challenge for work at the project scale is the same as for any research project – sustainability not of the documentation, but of the community of implementers. The constant attention needed to grow the active user base is very hard when many participants have moved on to something else.
$10k or $2500 to be part of the consortium, but it doesn’t limit access to participation.
Slippery slope – hope the front row didn’t /feel/ that.
From $8k to $80k for big commercial organizations, and does limit participation other than for world class experts.
ISO has 164 countries as its members. The US representative is ANSI, which has 270 thousand members. It accredits NISO, which has 221 members, as the US representative to TC46 for information standards. NISO then also works with other groups to provide more recognized support for standards work, such as Z39.99 … or ResourceSync.
ICOM, the International Council of Museums and thus MUCH smaller than ISO, still has 40 thousand members. It then has a documentation committee, CIDOC, which has a subcommittee for the CIDOC Conceptual Reference Model. Which is foreshadowing for what this Linked Art thing is.
Why this slide with so much punctuation, and so little content?
Standards are made by people, and it is hard to generate consensus with more than about 20 people in a room.
The W3C recognizes this and explicitly says that working groups should be about 20 people, and should be split if they grow much larger.
As more people get involved, it is harder for everyone to participate meaningfully, the process needs to be more formal, and building consensus takes longer. The global organizations simply handle member management and do not work directly on the standards; instead they subdivide back down again to a manageable scale.
And here is the primary conundrum of standards work -- if only 20 people can work on a specification, how can there be a community that is involved?
Katherine Skinner, Executive Director of Educopia introduced me to this notion of the community engagement pyramid. There are a few people at the top of the pyramid, and increasingly many as you move down the tiers. In the IIIF community, for example, there are probably 5 or so active leaders, but a good 20+ experts and advocates, maybe 50-100 contributors, 500 or more members that aren't constantly engaged but are actively following, and then an impossible to determine number of people on the edges looking in. Or rather, looking up.
The point is not to have a hierarchical and fixed structure, but to recognize that people look upwards and it is the sign of a healthy community when everyone on the tier above is reaching down to help those who want to take on a bigger role to do so. In order to be successful, we need to have a solid understanding of how to work together strategically, while advancing our own organizations' immediate goals.
Catherine Bracy gave an outstanding keynote at the Museum Computer Network conference when she was Director of Community for Code for America, where she explained these four requirements for successful community leadership.
First, know your audience. Who is the community, and who, as a community leader, do you need to be working directly with.
Secondly, reach out to those people on their terms, not on yours. You need them to participate, and for that, they need to understand and agree with the goals and direction.
Thirdly, have a continuing conversation with the community about everything!
And make opportunities for people to actively and meaningfully participate. Through participation you're building the ladders to bring them to the next level of the pyramid.
Given those two notions of the pyramid and the necessity for intentionally creating opportunities to participate at all of the levels, what sort of design principles are needed for the process and structure of standards? How do we engage with the community, such that they feel confident in adopting the work and, more importantly, contributing to it because it meets their immediate needs?
Michael Barth has six fundamental features for API evaluation, which relate directly to the value of the API as a standard for use. This seems like a good starting point for standards for digital interoperability.
Abstraction Level -- is the abstraction of the data and functionality appropriate to the audience and use cases? An end user of the "car" API presses a button or turns a key. A "car" developer needs access to the engine directly.
Comprehensibility -- is the audience able to understand how to use it to accomplish their goals
Consistency -- if you know the "rules" of the API, how well does it stick to them? Or how many exceptions are there to a core set of design principles
Documentation -- How easy is it to find out the functionality of the API?
Domain Correspondence -- if you understand the core domain, how closely do the data and the API align with that understanding of the domain?
And what barriers to getting started are there?
There are two more that I think are important.
Several of these factors are not orthogonal, but instead correlate in some fashion. I think there are three areas where trade-offs between the factors become unavoidable.
The first is the basic trade-off between functionality and comprehensibility. As you add more functionality, it becomes harder to understand. Without enough functionality, you don’t have a product that anyone wants. Without comprehensibility, everyone might want it, but no one is willing to make it.
Secondly there is always some trade off between the publisher of the API or data, and the consumer of it. For example, publishing a plain JSON document statically on the web is very easy for the publisher, but can make for a lot of network requests and following links for the consumer. Conversely, something like GraphQL allows the consumer to specify by example exactly the information and structure that it needs, but takes a lot more work for the publisher.
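To make that trade-off concrete, here is a toy sketch (all documents, URLs and names are hypothetical): a consumer of statically published JSON pays one request per linked resource, while a GraphQL-style endpoint shifts that work onto the publisher.

```python
# Toy model of the publisher/consumer trade-off; all documents and URLs are
# hypothetical. "fetch" stands in for an HTTP GET of a static JSON file.
DOCS = {
    "/person/1": {"name": "Ada", "knows": ["/person/2", "/person/3"]},
    "/person/2": {"name": "Grace"},
    "/person/3": {"name": "Alan"},
}

requests_made = 0

def fetch(url):
    # Each call is one network round trip, paid by the consumer.
    global requests_made
    requests_made += 1
    return DOCS[url]

# Static publishing: trivial for the publisher, N+1 requests for the consumer.
root = fetch("/person/1")
friends = [fetch(u)["name"] for u in root["knows"]]
print(requests_made, friends)  # 3 ['Grace', 'Alan']

# A query-by-example endpoint inverts the cost: one request returns data
# shaped for the consumer, but the publisher must implement the resolution.
def resolve(url):
    doc = dict(DOCS[url])
    doc["knows"] = [DOCS[u]["name"] for u in doc.get("knows", [])]
    return doc

print(resolve("/person/1"))
```

Neither side of the trade-off is wrong; the right balance depends on who the audience is and how much work each side can reasonably take on.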
And finally there’s a complex interdependency around the scale and overlap of the people defining the specification, and the scale of the intended audience. Two other factors come in here: the correct abstraction and domain correspondence depend on the audience … but are managed by the community defining the standard.
Okay, some opinions that will maybe cause some controversy, as to where each of the standards fall on these axes.
I’ve learnt that the easiest way to generate controversy, or at times outrage, is to present personal, qualitative opinions in a quantitative style. So here are some graphs that look like real data, but really just represent my opinions for you to disagree with at the end.
Scope creep drags functionality up and comprehensibility down.
Observing the effects of these sorts of qualities of APIs, and their effect on adoption and the engagement of the community, has then been codified in various design principles or design patterns documents.
The IIIF community has the following patterns, which address the points from the previous graphs.
The design principles for perhaps the most successful information standard ever, HTML, are also based on the interaction of the standard with people, and termed the priority of constituencies.
[…]
Wait, a *semantic* architect is agreeing that theoretical purity of the ontology is the lowest priority? Is the world ending? What have I done with the real Rob Sanderson??
This is where we get to consistency of data, and its interaction with people. I am strongly convinced that LOD is the right technical solution … but it hasn’t seen strong adoption, and we can easily see why in terms of the previous criteria for how people and standards interact. What we really need is not just LOD but …
LOUD. Linked Open Usable Data. Standards allow data to be consistent, but for usable applications to be built, the data itself also needs to be usable, and the standard needs to be explicitly concerned with that application usability.
What do I mean by “Usable”? In a well established tradition, the wikipedia definition clarifies that ..
So, * who wants to do * what, * how and in * what environment.
Unlike the entirely objective five stars of LOD publishing, any recommendations about usability need to take into account the consumer.
* Usability is thus dependent on, and determined by, the Audience
Or put another way, the API is the User Interface of the Developer. As a New Zealander, I must regretfully announce that the Australians have this absolutely 100% correct.
The Australian government wrote a fantastic API Design Guide in 2015 that nails it on core principles and the important notion of requiring empathy for developers, the same way that any user interface should be accessible and comfortable for its audience.
To reapply the priority of constituencies in the context of LOUD …
But this is another correlated trade-off. In Linked Data, the API is directly tied to the model, and therefore there is a relationship between the usability of the API and the completeness or semantic correctness of the model.
Bibframe 1.0 was terrible. It was complex without actually addressing the issues, worse than Simple Dublin Core, which at least has enough to get /something/ done with. Then you have frameworks further up the usability scale, like Web Annotations and EDM: a little more complete, but by no means everything you want to say.
The IIIF Presentation API is, demonstrably, about as usable as a linked data API can get ... but we're constantly fighting to stay at high usability by resisting requests to add arbitrary features.
Then comes the slippery slope that schema.org is further down ... still usable for now, but they're constantly adding to it ... and not in a sustainable or directed fashion ... until you hit rock bottom with CIDOC CRM and the meta-meta-meta statements available via CRMInf.
The zone to be in is where the area under the point is maximized. So if the line is the likely possible values, anywhere in this target zone is the optimal area for a semantic API.
Part of the good practices or design considerations, though, is that if you only need half of the completeness, you should not be punished in terms of usability for the half that you don’t need. You should be able to get close to the maximum usability for the particular use case’s completeness requirements.
We think that we have the right people, with the right commitment to participation, and have learnt from the open but formal processes of previous work to have a good chance of dramatically improving the way that museum metadata is published.
Okay, so to summarize what we have learnt, and thus what we hope to apply in Linked Art …
Standards are made by people for people, and require a commitment to engage with the community that constitutes the audience.
That community must feel welcome, and have the ability to participate meaningfully, not just in a community group walled garden or in meaningless voting procedures.
The organization that defines the specification doesn’t need to be a globally recognized standards body, it just needs to speak for and work on behalf of the community.
Standards need to have enough functionality to be useful, but not so much as to be incomprehensible.
Similarly, they need to be complete enough to fulfil the use cases, but not so complete as to become unusable.
They need to balance the ease of implementation by publishers and consumers.
There may or may not be a spoon, but there certainly is no magic bullet. There is no one correct answer that magically solves all standards problems.
The right answer for one community, specification or set of use cases might be completely wrong for another.
I’m afraid I don’t have a short, pithy checklist of actions to take a photo of and apply at home
The only way to know that you’re on the right path is to keep focused on the task, be open to meaningful participation, remain active (or at least attentive), and be flexible in the face of changing requirements and environment.
Most importantly, the process needs to actively create opportunities for meaningful participation by the people that the standard is intended to serve. Without engaging with the audience, not only is it impossible to know if the standard is successful, I believe a stronger assertion, that it is impossible for the standard to be successful.