Linked Open Data is great for recommendations about publishing data, but we need five more stars for the consumer -- How can it be both complete and usable? Design principles for Linked Open Usable Data.
RDF and the "Semantic Web" changed the way we think about data in general. Instead of relational tables hidden behind an HTML interface, without standards for the data, we began to think about managing information in a graph with shared definitions for the relationships and classes.
But it was initially focused on consumption in the same way as a relational database: as input to our own internal processing with a grand vision of powering a web-scale semantic artificial intelligence. Lovely for academics writing papers, but the only practical, broadly adopted effect was to change the way we think about our data.
Linked Open Data, with its five stars of excellence, then changed the way we publish data on the web. It gave us a short and very practical checklist that we could use to go from thinking to doing. It promotes openness as a necessity for re-use. It promotes standards as a necessity for re-use. It promotes linking between systems as a necessity for re-use.
But … all of the stars are concerns of publishing systems, not consumers. It's like Sir Tim came down the mountain with 10 commandments but only gave us the publishing tablet. Maybe he dropped the other one, maybe it was too heavy to carry. So, while LOD brought about a massive upsurge in the publishing of data, I believe that it's also not enough.
The web community has started to recognize that we're missing the other 5 stars. If our data isn't used, there's no value gained from the resources that were invested in its creation, publication, maintenance and improvement. And if we want our data to be used, the data needs to be …
Usable. And, I argue, it needs to be designed to be usable. We need Linked Open *Usable* Data. LOUD not just LOD.
What do I mean by “Usable”? In a well-established tradition, the Wikipedia definition clarifies that …
So, * who wants to do * what, * how and in * what environment.
Unlike the entirely objective five stars of publishing, any recommendations about usability need to take into account the consumer.
* Usability is thus dependent on, and determined by, the Audience
And who is the audience for Linked Open Data, as published by cultural heritage organizations such as ourselves? My reaction in thinking about this was initially “researchers”. And researchers quite broadly, with school students being just as important as university professors.
But I was falling into the “South Park gnomes” trap. There is a * “magic happens here” step before we get to * “profit”.
And that magic is carried out by Developers! Researchers (that could not also be considered as developers) interact with visualizations and user interfaces, not the raw data directly. We need the developer role in the middle, to translate the unreadable RDF into a web application that can be understood by many.
If the “who” is Developers, how can we get to the What, How and in which Context of Usability? For this, I turn to Catherine Bracy’s four points on community.
Know your Audience … who are you targeting with your product, or who is participating in your community?
Meet on their terms … if you’re looking to expand your community or product usage, you need to talk to your audience in a way that makes them comfortable and included, not in your own internal language
Have a Conversation … don’t just present at them, or direct them to read the documentation, discuss the XXXXX
Create Opportunities … While you’re discussing, and afterwards, give them ways to participate, not just consume. The feeling of ownership is an important motivator.
Okay … so … usability?
The audience is developers, and Usability is meeting on their terms. Steps 3 and 4 bring them into the community, and then keep them engaged, thereby building usage.
Having a conversation lets you customize that for the particular needs of individuals within the community, if possible.
Or put another way, the API is the User Interface of the Developer. As a New Zealander, I must regretfully announce that the Australians have this absolutely 100% correct.
The Australian government wrote a fantastic API Design Guide in 2015 that nails it on core principles and the important notion of requiring empathy for developers, the same way that any user interface should be accessible and comfortable for its audience.
In Linked Open Data, the API is built on top of HTTP like any other web API. It’s RESTful – when you dereference a URI, you receive useful information about the resource that is identified by that URI. That response uses open standards such as … ahhh :(
In Linked Open Data, the ontology determines the API up front. The ontology is almost exclusively designed to meet the requirements of the publisher of the data, and not the consumer.
This is not meeting on the audience’s terms. Let’s go back to Pat’s wonderful picture, but add in the core metric for success for each step…
The model is successful when it is semantically complete and precise, but the output is successful when the API is Usable. If that information is also accurate, then the researcher is happy. So the question for us is how to optimize between the success of the model and the success of the API: Complete vs Usable.
If you only need half of the completeness, you should not be punished in terms of usability. You should be able to get close to the maximum usability for the particular use case’s completeness requirements.
As easy as ABC … or it would be if there were three stars. So also D and E.
Don’t learn new vocabulary.
We expect users to understand a website’s UI in seconds or leave, yet we expect developers to read documentation on their UI (the API) for hours before doing anything.
Learning by introspection gets you started quickly, but clear and complete documentation about the data is just as important.
With complete, relevant examples that work if you cut and paste them into your system … because that’s exactly what people are going to do.
Consistent patterns.
Inconsistency, even when it fits the individual case, is very jarring. Every exception needs to be memorized separately, instead of learning one rule to follow.
With apologies for the resume slide …
All of these specifications, and many others, follow five design principles towards ensuring usability.
All of the projects required use cases for every feature. And not only use cases, but data to support those use cases, and preferably implementations that made use of the data. This avoided infinitely long and pointless discussions about how many hypothetical E39 Actors could dance on the head of an E22 Man-Made Object… except the location of an E7 Activity must be an E53 Place, requiring a … you get the picture.
Another core principle is the maxim attributed to Einstein – As simple as possible, and no simpler. IIIF has done a great job meeting this principle by avoiding technology dependencies, and only adding complexity when those use cases are shared by multiple organizations. This increases the likelihood of adoption and reuse, in which IIIF has been very successful. As simple as possible means fewest barriers to entry.
As recognized from the beginning of LOD, it’s important to be of the web, not just on the web. This means a resource oriented paradigm, such that web caches are used to their full potential, and to make it easier for static implementations that just put files on disk. The more cacheable, the more performant with no additional cost. The web also runs on standards, and adopting appropriate standards and best practices is essential.
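To make the caching point a little more concrete, here is a minimal sketch of a conditional GET on a static resource. The function names `make_etag` and `respond`, the example record, the one-hour `max-age`, and the choice of a truncated SHA-256 digest as the ETag are all assumptions for illustration, not something taken from any of the specifications discussed here.

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Derive an ETag from the bytes of the representation (illustrative choice)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET on a static resource.

    Simplification: only a single, exact If-None-Match value is compared.
    Real servers also handle lists of ETags and weak comparison.
    """
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=3600"}
    if if_none_match == etag:
        # The client or cache already holds this representation.
        return 304, headers, b""
    return 200, headers, body

# Hypothetical record that could just as well be a file on disk.
record = b'{"id": "https://example.org/object/1", "type": "Thing"}'

status, headers, payload = respond(record)
status_again, _, payload_again = respond(record, if_none_match=headers["ETag"])
```

The second request costs almost nothing to answer, which is the sense in which a resource-oriented design gets performance from the web's existing machinery.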
This is a derivation of Postel’s Law: Be liberal in what you accept, and conservative in what you send. When applied to Linked Open Data and APIs, it means that clients should expect to see data that they don’t understand, and publishers should be careful to respect the model where specified. This facilitates experimentation with extensions as part of iterative development towards new versions. It is especially important for linked.art to allow unknown features from the rest of CIDOC-CRM without getting in the way of the core profile.
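On the client side, "expect to see data that you don't understand" might look something like the following sketch. The set `KNOWN_KEYS`, the function `read_record`, and the property `experimental_extension` are hypothetical names invented for this example. They are not part of linked.art or CIDOC-CRM.

```python
import json

# Hypothetical set of properties that this particular client understands.
KNOWN_KEYS = {"id", "type", "label"}

def read_record(text: str) -> dict:
    """Keep the properties we understand and skip the rest without failing."""
    data = json.loads(text)
    return {key: value for key, value in data.items() if key in KNOWN_KEYS}

# A publisher experimenting with an extension the client has never seen.
incoming = json.dumps({
    "id": "https://example.org/object/1",
    "type": "Thing",
    "label": "Example object",
    "experimental_extension": {"note": "something new the publisher is trying"},
})

record = read_record(incoming)
```

The client keeps working when the publisher adds something new, and the publisher can try out extensions without breaking anyone.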
I think Manu Sporny perfectly captured the sentiment that JSON-LD aims to avoid: When developers hear “RDF” they think: Not in my back yard!
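As a rough illustration of why JSON-LD helps here, a developer can treat a JSON-LD document as ordinary JSON and never touch an RDF library. Everything in this sketch is made up for the example: the context URL, the URIs, and the property names `label` and `created_by` are assumptions, and the hypothetical context is assumed to map `id` and `type` to the JSON-LD keywords.

```python
import json

# A hypothetical JSON-LD document. The "@context" line is what makes it
# Linked Data, and a developer who only wants the values can ignore it.
document = """
{
  "@context": "https://example.org/context.json",
  "id": "https://example.org/object/1",
  "type": "Painting",
  "label": "Example painting",
  "created_by": {
    "id": "https://example.org/person/1",
    "label": "Example artist"
  }
}
"""

# Plain JSON parsing and plain dictionary access: no triples, no SPARQL.
record = json.loads(document)
artist_name = record["created_by"]["label"]
summary = f'{record["label"]} by {artist_name}'
```

The graph is still there for anyone who wants it, but the first experience is the same as with any other web API.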