6. about the COAR Next
Generation Repositories
Working Group
7.
8. Next Generation Repositories Working Group
• Eloy Rodrigues, chair (COAR, Portugal)
• Andrea Bollini (CINECA, Italy)
• Alberto Cabezas (LA Referencia, Chile)
• Donatella Castelli (OpenAIRE/CNR, Italy)
• Les Carr (Southampton University, UK)
• Leslie Chan (University of Toronto at Scarborough, Canada)
• Rick Johnson (SHARE/University of Notre Dame, US)
• Petr Knoth (Jisc and Open University, UK)
• Paolo Manghi (CNR, Italy)
• Lazarus Matizirofa (NRF, South Africa)
• Pandelis Perakakis (Open Scholar, Spain)
• Oya Rieger (Cornell University, US)
• Jochen Schirrwagen (University of Bielefeld, Germany)
• Daisy Selematsela (NRF, South Africa)
• Kathleen Shearer (COAR, Canada)
• Tim Smith (CERN, Switzerland)
• Herbert Van de Sompel (Los Alamos National Laboratory, US)
• Paul Walk (Antleaf, UK)
• David Wilcox (Duraspace/Fedora, Canada)
• Kazu Yamaji (National Institute of Informatics, Japan)
9. To position repositories as the
foundation for a distributed, globally
networked infrastructure for scholarly
communication…
10. objectives
• cross-repository interoperability
• encourage the emergence of added-value services
• transform the scholarly communication system by emphasising:
  • collective, open and distributed management of open content
  • collective innovation
11. principles
• distribution of control of scholarly resources
• inclusiveness: different institutions and regions have particular needs (e.g. diverse languages, policies and priorities) and this must be supported
• for the public good
• intelligent openness
12. Intended outputs
• direct outputs:
  • the Next Generation Working Group will collectively produce:
    • reports
    • conceptual models
    • recommendations for particular technologies
• indirect outputs:
  • some individuals, independently of the Next Generation Working Group, will:
    • implement software changes to repository platforms
    • build infrastructure (micro-services)
13. design assumptions
• focus on resources
  • not just associated metadata - treat them equally
• pragmatism
  • favour the simpler approach
• evolution, not revolution
  • use existing software and systems where possible
• convention over configuration
  • standardise only where necessary and minimise constraints
• engage with users where they are:
  • integrate into environments and systems where users are already engaged
Not all users are human; some are machines!
15. “behaviours”
• Supporting discovery of content
• exposing identifiers and links between resources
• supporting navigation
• supporting batch discovery
• actively sharing or exposing notifications
• Participating in the social network
• Global identification of people in the repository network
• Annotation, commenting and reviews - e.g. Open Peer Review
• Logging and exposing of user interaction data across repositories
• Preservation
• Supporting other processes
• Declaring licenses at a resource level
• Exposing standardised usage metrics
• Content transfer (e.g. for text and data mining)
16. user stories
as <some actor>,
I want to <do something>,
in order to gain <some benefit>
17. user stories relating to repository “behaviours”
Example user stories for the behaviour “Discovery through navigation”:
• as a human or machine user, I want to easily and uniformly identify the metadata in a repository record, so that I can ascertain the relevance of the resource.
• as a repository manager, I want to be able to access the metadata in my repository in real time through an API in order to build views or services on any platform using the data.
• as a research manager (funder or institution), I want to be able to track the research outputs related to a specific funded project to demonstrate value and compliance with policy.
19. repositories must be deeply connected
• outgoing:
  • individual content resources
    • directly accessible on the network
  • individual metadata records
    • not just in batches
  • individual users
    • as part of a variety of professional and social networks
• incoming:
  • using all appropriate global identifier systems
  • accepting automated deposit of content and data from other systems (e.g. scientific instruments)
  • allowing external services to interact with content
    • content mining
    • annotation services
    • etc.
20. repositories need to be active
• the next generation repository needs to talk to the world
  • publishing events to notification hubs and notifying users
• and to listen, and respond:
  • respond to requests for content and metadata, equally
  • continuously improve the information it has, adding value where it can by:
    • responding to and supporting annotation and peer review
    • not just allowing text/data-mining, but supporting it and benefitting from the derived information
    • supporting user workflows - providing and accepting data
21. active repositories
• repositories could become pro-active components in an event-driven scholarly system
• publishing “events” such as the addition of a new item to one or more notification hubs
• third-party systems “subscribing” to these notifications - many potential applications
• would involve very little or no effort by repository administrators
• modest software development
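The publish/subscribe pattern described on this slide could look roughly like the sketch below. It is an illustration only, not a design from the working group: the `Event` fields, the `"item.added"` event type, the in-memory `NotificationHub` class and the `announce_new_item` helper are all hypothetical names, and a real hub would be a networked service rather than a Python object.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, Iterable, List


@dataclass(frozen=True)
class Event:
    """A repository event, e.g. the addition of a new item (hypothetical shape)."""
    event_type: str   # assumed label such as "item.added"
    repository: str   # URL of the repository publishing the event
    resource: str     # URL of the resource the event is about


Subscriber = Callable[[Event], None]


class NotificationHub:
    """In-memory stand-in for a notification hub (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Subscriber]] = defaultdict(list)

    def subscribe(self, event_type: str, callback: Subscriber) -> None:
        """Register a third-party system's callback for one event type."""
        self._subscribers[event_type].append(callback)

    def publish(self, event: Event) -> int:
        """Deliver the event to each subscriber; return how many were notified."""
        callbacks = list(self._subscribers[event.event_type])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


def announce_new_item(hubs: Iterable[NotificationHub], repository: str, resource: str) -> int:
    """Publish an "item.added" event to one or more hubs; return total deliveries."""
    event = Event("item.added", repository, resource)
    return sum(hub.publish(event) for hub in hubs)


if __name__ == "__main__":
    hub = NotificationHub()
    inbox: List[Event] = []
    hub.subscribe("item.added", inbox.append)  # a third-party service "subscribing"
    announce_new_item([hub], "https://repo.example.org", "https://repo.example.org/item/1")
    print(inbox)
```

The repository only has to emit the event; what subscribers do with it (indexing, alerting, overlay review) is decided elsewhere, which is why the slide expects little effort from repository administrators.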
22. being of, not just on, the Web
• obvious… but not really done yet
• the “splash page” requiring human mediation is a real problem
• “signposting the scholarly web”
• HTTP Link headers
• would involve very little or no effort by repository administrators
• a small amount of software development in repository systems
http://signposting.org
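As a rough illustration of what such headers might contain, the sketch below builds and parses an HTTP Link header for a repository landing page. The relation types `cite-as`, `item` and `describedby` are among those used by Signposting; the URLs, the DOI and the function names are made up for the example, and the parser only handles the simple quoted form produced here, not every variant that RFC 8288 allows.

```python
import re
from typing import Dict, List, Optional, Tuple

# (target URL, relation type, optional media type)
Link = Tuple[str, str, Optional[str]]


def build_link_header(links: List[Link]) -> str:
    """Serialise links as the value of an HTTP Link header."""
    parts = []
    for target, rel, media_type in links:
        part = f'<{target}>; rel="{rel}"'
        if media_type:
            part += f'; type="{media_type}"'
        parts.append(part)
    return ", ".join(parts)


# Matches one <target> followed by zero or more ;name="value" parameters.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
_PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def parse_link_header(value: str) -> Dict[str, List[str]]:
    """Group link targets by relation type, as a machine client might."""
    by_rel: Dict[str, List[str]] = {}
    for target, params in _LINK_RE.findall(value):
        attrs = dict(_PARAM_RE.findall(params))
        for rel in attrs.get("rel", "").split():
            by_rel.setdefault(rel, []).append(target)
    return by_rel


if __name__ == "__main__":
    # Hypothetical landing page for one repository item.
    header = build_link_header([
        ("https://doi.org/10.1234/example", "cite-as", None),
        ("https://repo.example.org/item/1/paper.pdf", "item", "application/pdf"),
        ("https://repo.example.org/item/1/metadata.xml", "describedby", "application/xml"),
    ])
    print("Link:", header)
    print(parse_link_header(header))
```

With links like these in the response headers, a machine client can go from the splash page to the content file, the metadata record and the persistent identifier without scraping HTML.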
24. conclusion
• the goal:
  • To position repositories as the foundation for a distributed, globally networked infrastructure for scholarly communication…
• we already have much of what is needed:
  • ubiquitous distribution of open repository platforms
  • the desire to challenge the status quo
to work in the square (meydan), not the tower (kule)
together, we can establish a scholarly communications infrastructure that we can be proud of, and that our children will thank us for!
25. Paul Walk
Director, Antleaf
Managing Director, Dublin Core Metadata Initiative (DCMI)
Web: http://www.paulwalk.net
Email: paul@paulwalk.net
Twitter: @paulwalk
www.antleaf.com
www.dublincore.org
Teşekkürler! (Thank you!)
More information:
http://bit.ly/coar-repo-ng
Editor's Notes
Thank you for inviting me to speak - it is an honour to be invited to Izmir.
I should tell you that I no longer work for the University of Edinburgh - I have decided to start my own consultancy company instead, working in the area of open access and research data management. Today I am representing COAR.
I'd like to start by proposing three cheers for the current generation of repositories - three important aspects:
most of our repository systems are built from technology which has been in near-continuous development for more than a decade.
the community support for repository systems is considerable - look around you for the evidence of that! :-)
the resources within our repositories are under the control of our institutions, not under the control of a handful of publishers
monopoly avoidance strategy
the most important aspect from my point of view
Confederation of Open Access Repositories
international association with >100 members from 35 countries - 5 continents represented
libraries, universities, research institutions, government funding agencies etc.
the WG includes some luminaries from the world of repositories. And I'm in there too.
this is the goal - implicit in this is competition for the current global infrastructure which is largely owned and deployed by the commercial academic publishers
to support:
discovery, access, annotation, real-time curation, sharing, quality assessment, content transfer, analytics, provenance tracing, etc.
by intelligent openness, I mean actually supporting re-use, not just making something “open”
we are already seeing some of the repository platforms adopt some of the recommendations emerging from this work (for example the adoption of “signposting”)
this provisional list of behaviours allows us to group related technologies together and apply them to addressing user stories
from agile development methodology - a useful way to simply frame users' priorities
not just connected in a general sense, connected at every level:
repositories are nodes in the network
content items are nodes in the network
metadata records are nodes in the network
users are nodes in the network
and the network is The Web
This blog post is why I was invited to join the COAR working group
some interesting musing about peer-to-peer distributed control!
alternatives to high-latency aggregation
Herbert Van de Sompel & Michael Nelson
make the webpage itself both human and machine readable
resources are linked through a common vocabulary and URL that expresses the relationship between content and metadata.