Presented at Digital Life 2018, Bergen, March 2018, in the Trust and Accountability session.
In recent years we have seen a change in expectations for the management and availability of all the outcomes of research (models, data, SOPs, software etc.) and for greater transparency and reproducibility in the method of research. The “FAIR” (Findable, Accessible, Interoperable, Reusable) Guiding Principles for stewardship [1] have proved to be an effective rallying-cry for community groups and for policy makers.
The FAIRDOM Initiative (FAIR Data Models Operations, http://www.fair-dom.org) supports Systems Biology research projects with their research data, methods and model management, with an emphasis on standards and sensitivity to asset sharing and credit anxiety. Our aim is a FAIR Research Commons that blends together the doing of research with the communication of research. The Platform has been installed by over 30 labs/projects and our public, centrally hosted FAIRDOMHub [2] supports the outcomes of 90+ projects. We are proud to support projects in Norway’s Digital Life programme.
2018 is our 10th anniversary. Over the past decade we have learned a lot about trust between researchers, between researchers and platform developers and curators, and between both these groups and funders. We have experienced the Tragedy of the Commons but also seen shifts in attitudes.
In this talk we will use our experiences in FAIRDOM to explore the political, economic, social and technical practicalities of Trust.
[1] Wilkinson et al (2016) The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3. doi:10.1038/sdata.2016.18
[2] Wolstencroft et al (2016) FAIRDOMHub: a repository and collaboration environment for sharing systems biology research. Nucleic Acids Research, 45(D1): D404-D407. doi:10.1093/nar/gkw1032
Trust and Accountability: experiences from the FAIRDOM Commons Initiative.
1. Trust and Accountability: experiences from the FAIRDOM Commons Initiative.
Professor Carole Goble, carole.goble@manchester.ac.uk
The University of Manchester, UK
The FAIRDOM Association Coordinator
ELIXIR-UK Head of Node
Co-lead ELIXIR Interoperability Platform
Digital Life 2018, Bergen, Norway 21-22 March 2018
4. Systems and Synthetic Biology Projects
Practically address (Open) Assets Management
Support transparency, reproducibility, personal productivity
In an ecosystem of platforms and an egosystem of research projects
10 Year Anniversary!
5. Why? Programmes
• Foster stewardship & skills
• Stimulate sharing
• Ensure retention
• Capitalise on investments
• Audit & Compliance
• Respect global community,
local project resources
Synthetic Biology for
Growth Programme
6. … FAIR model reuse and reproducibility …
Stanford et al. The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851, DOI 10.15252/msb.20156053
7. 1. Ecosystem
• public collections & archives
• data centres
• journals
• Institutional repositories
• most researchers
• labs & universities
• my resources
*Meso too – but too complicated for 20 minutes! See http://www.knowledge-exchange.info/event/ke-approach-open-scholarship
9. Capitalising on investments
Retaining results post-project
Pooling, transfer, sharing results
Public collections
Skilling workforce
Compliance audit/metrics
New publishable assets
Business models
Reproducibility
Doing science with collaborators
Publishing & getting credit
Productivity
Access to resources, results, collections
Retention of my results post student
Repeatability - reviewer wants more
Competitiveness, protecting assets
Managing costs
Compliance
Stakeholder Accountability Values
overlaps, mismatches?
10. The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18
Metadata
Identifiers
Access policies
Licences
Technical: Political
Social
Economic:
Rally
1. Make everything FAIR
11. 2. Improve Knowledge Flow
A FAIRDOM Commons & Catalogue
• Draw together scattered
resources, platforms, people
• Coordination, collaboration
• Reproducible, transparent
[original figure: Josh Sommer]
Commons
Goble, De Roure, Bechhofer, Accelerating Knowledge Turns, I3CK, 2013, ISBN: 978-3-642-37186-8
“Resources collectively
created, owned or shared
and used between or
among a community (with
Governance)”
13. What is FAIRDOM?
multi-institution collaboration, FAIRDOM Association e.V.
Social
SW Platforms Processes
Stewardship Support
Tech Solns & Support
Public Commons
FAIRDOMHub.org
Policy, Advocacy
Community Work
90+ Projects on Hub
30+ Private
installations
Standards
14. FAIRDOM Platform
collecting metadata
Web-based portal
Project spaces
Metadata catalogue
Yellow pages
Results repository
Collaboration
Archives Gateway
Sharing Organisation
Front end
Projects Hub
Entry Point
On site
Tracking
Pipelines
LIMS, Instruments
Large data
Samples
Auto-archiving
Back end
Onsite storage &
analytics
15. Two Flavours
Trust vs Responsibility
I’ll run my own – when the
project ends can you host it for
me …
Service hosted at HITS
Institutional Guarantee
2029
16. Projects, People, Assets
• Project spaces
– Upload or link to data
• One place catalogue
– Regardless of physical store
– Standards-compliant metadata
• Linked with other systems
– Project repositories
– Public deposition archives
– Tools
19. 20
Programme: Overarching research theme (The Digital Salmon)
Project: Research grant (DigiSal, GenoSysFat)
Investigation: A particular biological process, phenomenon or thing (typically corresponds to [plans for] one or more closely related papers)
Study: Experiment whose design reflects a specific biological research question
Assay: Standardized measurement or diagnostic experiment using a specific protocol (applied to material from a study)
Jon Olav Vik, Norwegian University of Life Sciences
Investigation, Study, Assay
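The nesting on this slide (Programme, Project, Investigation, Study, Assay) can be sketched as a small data structure. This is only an illustration of the hierarchy as described here, not the FAIRDOM platform's actual data model or API: every class name, field name and example title below is an assumption, apart from "The Digital Salmon", "DigiSal" and "GenoSysFat", which come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    protocol: str  # hypothetical field: the SOP the measurement follows


@dataclass
class Study:
    question: str  # the specific biological research question
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)


@dataclass
class Project:
    name: str  # a research grant
    investigations: List[Investigation] = field(default_factory=list)


@dataclass
class Programme:
    theme: str  # the overarching research theme
    projects: List[Project] = field(default_factory=list)


def count_assays(programme: Programme) -> int:
    """Walk the hierarchy top-down and count every assay."""
    return sum(
        len(study.assays)
        for project in programme.projects
        for investigation in project.investigations
        for study in investigation.studies
    )


# Illustrative content only: investigation, study and assay titles are invented.
digisal = Project(
    name="DigiSal",
    investigations=[
        Investigation(
            title="Example investigation",
            studies=[
                Study(
                    question="Example research question",
                    assays=[
                        Assay(title="Example assay A", protocol="SOP-1"),
                        Assay(title="Example assay B", protocol="SOP-2"),
                    ],
                )
            ],
        )
    ],
)
programme = Programme(
    theme="The Digital Salmon",
    projects=[digisal, Project(name="GenoSysFat")],
)
print(count_assays(programme))
```

Keeping each level as its own type means assets such as data files, models and SOPs can later be attached at the level they belong to, which is what lets questions like "which data belong to which publications?" be answered by walking the tree.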
24. Designed by PALs
What methods are being used to determine enzyme activity?
What SOP was used for
this sample?
Where is the validation data for this model?
Is there any group generating kinetic data?
Is this data available?
Track versions of my model
What's the relationship between the data and model?
Which data belong to
which publications?
25. Transparent Publication
16 datafiles (kinetic, flux inhibition, runout)
19 models (kinetics, validation)
13 SOPs
3 studies (model analysis, construction,
validation)
24 assays/analyses (simulations, model
characterisations)
Penkler, G., du Toit, F., Adams, W., Rautenbach, M.,
Palm, D. C., van Niekerk, D. D. and Snoep, J. L. (2015),
Construction and validation of a detailed kinetic model
of glycolysis in Plasmodium falciparum. FEBS J, 282:
1481–1511. doi:10.1111/febs.13237
32. 1. Resourcing
• Software and data are free,
like free puppies
• Puppies are not a one-off
cost
“we want FAIR data but we
will only fund research”
The economics of data infrastructures needs new brave
funding models …
34. FAIR Play Sensitivities
Open science applies to you but not me … not available = not citable
Jurgen Haanstra
Vrije Universiteit,
Amsterdam
Using FAIRDOM my own
lab colleagues saw what
I was doing and called to
collaborate!
• Licenses
• Negotiated access
• Embargos
• Permission controls
• Staged sharing
• Private walled gardens
35. me
ME
my team
close
colleagues
peers
Staged Spiral – Data Lifecycle
organisation – collaboration - dissemination
The number of assets
reduces
The richness of metadata
needed increases
As reach of sharing
increases
Staged sharing
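The staged-sharing idea on this slide can be sketched in a few lines: an asset is visible to a widening circle, and the metadata it needs grows as the circle widens. This is a minimal sketch under stated assumptions, not how the FAIRDOM platform implements permissions. The stage names follow the slide; the required field lists and both function names are hypothetical.

```python
from enum import IntEnum
from typing import Dict, List


class Stage(IntEnum):
    """Sharing stages from the slide, ordered from narrowest to widest reach."""
    ME = 0
    MY_TEAM = 1
    CLOSE_COLLEAGUES = 2
    PEERS = 3


# Hypothetical metadata requirements: richer metadata as reach increases.
REQUIRED_FIELDS: Dict[Stage, List[str]] = {
    Stage.ME: ["title"],
    Stage.MY_TEAM: ["title", "description"],
    Stage.CLOSE_COLLEAGUES: ["title", "description", "protocol"],
    Stage.PEERS: ["title", "description", "protocol", "licence"],
}


def can_view(asset_stage: Stage, viewer_circle: Stage) -> bool:
    """An asset shared out to `asset_stage` is visible to that circle and any closer one."""
    return viewer_circle <= asset_stage


def missing_metadata(metadata: Dict[str, str], target: Stage) -> List[str]:
    """List the fields still to be filled in before sharing at `target`."""
    return [name for name in REQUIRED_FIELDS[target] if not metadata.get(name)]


draft = {"title": "Kinetic measurements"}
print(can_view(Stage.MY_TEAM, Stage.ME))
print(can_view(Stage.MY_TEAM, Stage.PEERS))
print(missing_metadata(draft, Stage.PEERS))
```

Modelling the stages as an ordered enumeration keeps the rule simple: widening the reach of an asset is a single step up the order, and the metadata check says what extra description that step requires.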
36. Tragedy of the Commons
metadata & identifier quality
https://ncip.nci.nih.gov/blog/face-new-tragedy-commons-remedy-better-metadata/
https://metadatacenter.org
“The challenge for all the data-commons initiatives — is
that many online datasets are annotated with metadata
that are simply terrible…. Creating good metadata takes
considerable work ….
When investigators act in their own self-
interest, taking short cuts to generate
metadata as quickly as possible, we
should expect that the overall utility of
the resource will decline.
The creation of a data commons requires the ability to deal with extremely
varied — and often unanticipated — metadata patterns and data types …. a
need for easy-to-use solutions that are generic to provide guidance over the
entire life cycle of metadata — streamlining metadata creation, discovery,
and access, as well as supporting metadata publication to third-party
repositories”
Mark Musen
37. The Tragedy of the Commons
community socialisation
Value Systems
• of assets, of reproducibility,
of metadata
• economics of infrastructure
• priorities
• public vs personal good
Sweatshops
• collaborating but competing
• burden - time, skills
• short term, shortcuts
• leadership sets the tone
38. rightfield.org.uk
templates
spreadsheets, notebooks
seamless system join-ups
automated metadata
“Last / First Mile” “Born FAIR”
“FAIR Ramps” “FAIR by Design”
Knowledge Exchange Report: http://www.knowledge-exchange.info/event/ke-approach-open-scholarship
The ‘last mile’ challenge for European research e-infrastructures https://doi.org/10.3897/rio.2.e9933
Semantic
Annotation by
Stealth
39. 3. Adoption - stewards, champions, PALs
respected, embedded not tolerated, external
• 500,000 stewards needed in
Europe*
• Specialist skills
• Career pathways
Curation and management
• Supported, resourced
• Recognised, rewarded
• By the Projects, Programmes
and PIs
Building trust between FAIRDOM
and the projects
* Realising the European Open Science Cloud (2016)
44. Systems and Synthetic Biology Projects
Practically address (Open) Assets Management
Support transparency, reproducibility, personal productivity
In an ecosystem of platforms and an egosystem of research projects