- Data transfer
- Compute access
- Training
Data Hub:
- Secure data sharing
- Data management
- Metadata catalogues
AAI:
- Single sign-on
- User attributes
- Group management
National
resources
1. ELIXIR: European infrastructure for
biological information
Data infrastructure for
Europe’s life-science
research:
www.elixir-europe.org @ELIXIREurope
Data
Interoperability
Tools
Compute
Training
Marine metagenomics
Human data
Crop and forest plants
Rare diseases
2.
3. A network of data Nodes
• ELIXIR Nodes are funded
nationally
• ELIXIR Nodes build on
national strengths and
priorities
• ELIXIR Nodes provide a
national framework for long-term
resource management
de.NBI - The German Network for
Bioinformatics Infrastructure
de.NBI consortium
• 39 project partners
• 30 institutions
• 8 service centers
• designated national German
node in ELIXIR
www.denbi.de
4. Common Services, Common Standards,
Data deposition:
ENA, EGA, PDBe, EuropePMC, …
Bioinformatics tools:
bio.tools, Containers, Galaxy
Data Interoperability:
Standards, Identifiers, Ontologies
Compute:
Secure data transfer, cloud computing, AAI
Community partnerships:
Human access controlled, Plant, Marine, Rare
Disease, Proteomics, Metabolomics, Industry
outreach
Training:
TeSS, Data Carpentry, eLearning
Data management:
Genome annotation
Data management plans
Added value data:
UniProt, Ensembl, Orphanet, …
6. What are ELIXIR Core Data Resources?
Fundamental
importance
Complete
collections
of generic
value
High levels
of usage,
scientific quality
and service
7. ELIXIR Core Data Resources – fundamentally important to life-science research
• The ELIXIR Deposition Databases meet the
technical quality and governance criteria
expected of ELIXIR Core Data Resources
• ELIXIR is committed to Open Access as a core
principle for publicly funded research.
• ELIXIR Core Data Resources should reflect this
commitment and have terms of use or a licence
that enables the reuse and remixing of data.
• See “Identifying ELIXIR Core Data Resources”
• Agreed collectively by 21 Node directors
https://www.elixir-europe.org/platforms/data/core-data-resources
9. Towards a global effort
Significant interest internationally, e.g.
NIH/NSF
Global coalition organized by HFSP:
Presented by W. Anderson/Eric Green at
HIRO (June 2017)
10. Changing landscape with many actors
• Highly distributed data-generating &
monitoring
• Distributed analysis requires reference
datasets (organized centrally, locally or
in distributed networks)
• Manage Legal requirements in
transnational settings
International
Resources
National data
centres
Institutional
data centres
11. ELIXIR’s principles on FAIR data management
• Open sharing of research data is a core principle for publicly-funded research and ELIXIR
encourages all funders to adopt Open Data mandates.
• Data Management is a crucial part of good scientific practice and research excellence.
• Whenever possible, biological research data should be submitted to the recommended
community deposition databases.
• All data submitted to Open Data archives must be annotated in accordance with
community-defined standards.
• ELIXIR Nodes are the national implementation of a harmonised FAIR Data Management
programme for the life sciences.
• FAIR data management requires professional skills and adequate resources.
• Good research data management requires appropriate funding for data infrastructures.
ELIXIR position paper on FAIR data management in the life sciences
(doi: 10.7490/f1000research.1114985.1)
12. “Whenever possible, biological research data should be submitted to
the recommended community deposition databases”
• The ELIXIR Deposition Databases meet the
technical quality and governance criteria
expected of ELIXIR Core Data Resources
• See “Identifying ELIXIR Core Data Resources”
• Agreed collectively by 21 Node directors
• International collaborative effort
https://elixir-europe.org/platforms/data/elixir-deposition-databases
13. “All data submitted to Open Data archives must be annotated in
accordance with community-defined standards”
https://elixir-europe.org/platforms/interoperability
14. “FAIR data management requires professional skills and adequate resources”
Bring your own data workshops
• Problem-centered
workshops
• Integration experts -
Data resources - Users
• With national nodes or
pan-European projects
15. “ELIXIR Nodes are the national implementation of a harmonised FAIR
Data Management programme for the life sciences”
18. Bioschemas.org
Search engines
Registries
Data
Aggregators
• Standardised
metadata
• Metadata
publish and
harvest
without APIs
or special
feeds
• Feed bio
registries and
aggregators
A community initiative built on top of Schema.org to
improve Findability and Accessibility in Life Sciences
• Rapid markup
• Exposed to harvesting
• Find
Major data
resources
Smaller
datasets
Bioschemas Bioschemas
19. Data Repository and Datasets Descriptions
Information about repositories
with consistent structured data
Align overlapping registry efforts
around certain metadata.
Help with consistency of
metadata collected by registries
With:
omicsDI
Bioschemas.org
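Illustration: the kind of markup the two Bioschemas slides describe could look like the sketch below. It builds a plain Schema.org Dataset description as JSON-LD and wraps it in the script element that search engines and registries harvest from a web page. The property names are standard Schema.org Dataset properties; the helper names and all example values are hypothetical, and the sketch does not claim to satisfy any particular Bioschemas profile.

```python
import json


def dataset_jsonld(name, description, url, identifier, keywords, license_url):
    """Build a Schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
        "keywords": keywords,
        "license": license_url,
    }


def as_script_tag(doc):
    """Wrap JSON-LD in the script element that harvesters read from a page."""
    body = json.dumps(doc, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical example values, not a real dataset record.
example = dataset_jsonld(
    name="Example marine metagenome samples",
    description="Placeholder description of a small dataset.",
    url="https://example.org/datasets/1",
    identifier="https://example.org/datasets/1",
    keywords=["metagenomics", "marine"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
print(as_script_tag(example))
```

Because the markup lives in the page itself, a harvester needs no dedicated API or feed: it only has to fetch the page and parse the embedded JSON.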
21. Dataset index
Scientific File
PID
Dataset index
Scientific File
PID
Dataset index
Scientific File
PID
Earth Life ...
Common Access Common Access Common Access
Data
Services
Compute Storage Transfer …
“Science schemas” as Emerging federation architecture in EOSC
EOSC Catalogue
22. Find and Access human genomic data
• Accept that much sensitive data are
stored locally
• Data discovery & data access services
• EGA central to ELIXIR approach
• Underpinned by GA4GH standards
23. Human genomics in research & health needs…
Standards Networks of trust Reference archives
24. Reliable electronic identification of users (ELIXIR ID) is
needed to access the key services and capacities of ELIXIR
• Existing user accounts can be used to create your ELIXIR
ID today at www.elixir-europe.org. ELIXIR AAI allows
users to continue using their federated academic,
corporate or social media identity by linking it to a
personal ELIXIR ID.
• In production since January 2017 (running 08/2016)
• 292 Identity providers
• 584 ELIXIR identities in 101 groups
• 16 relying ELIXIR services
• ELIXIR AAI credential accepted by e-infrastructures: EGI
Check-in and pilot on EUDAT B2ACCESS
ELIXIR AAI
The ELIXIR service providers connected
to ELIXIR AAI benefit from a centralised
user identity and access management
service
Permanent ID even if affiliation changes
Protocols: SAML2, OpenID Connect
Contact: Mikael Linden, Michal Prochazka
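Illustration: a relying service that uses OpenID Connect starts a login by redirecting the user to the identity provider's authorisation endpoint. The sketch below builds such a request URL for the authorisation code flow using only standard OpenID Connect parameters. The endpoint, client ID and redirect URI are placeholders (assumptions), not the real ELIXIR AAI configuration.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            scopes=("openid",)):
    """Return (url, state, nonce) for an OpenID Connect code-flow request."""
    # state guards the redirect against CSRF; nonce binds the ID token
    # to this particular login attempt.
    state = secrets.token_urlsafe(16)
    nonce = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
        "nonce": nonce,
    }
    return authorize_endpoint + "?" + urlencode(params), state, nonce


# Placeholder values: a hypothetical provider and a hypothetical service.
url, state, nonce = build_authorization_url(
    "https://aai.example.org/oidc/authorize",
    "demo-client",
    "https://service.example.org/callback",
)
query = parse_qs(urlparse(url).query)
print(sorted(query))
```

The service keeps state and nonce in the user's session and compares them with the values that come back after the redirect.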
25. Federated AAI: What is a “registered ELIXIR user”?
• Identification (ELIXIR ID)
• Group/role and attribute (such as researcher's home
organisation)
• Authentication (via GEANT/eduGAIN, social media or ORCID)
• Strong step-up authentication (for sensitive services)
• Personal authorisation management (for datasets that require DAC
approval)
• International mutual recognition – codes of conduct, policies
• Institutional maturation models (cf OECD)
• Bona fide researcher status management (e.g. restricted services)
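Illustration: the elements listed on this slide can be read as inputs to an access decision made by each relying service. The sketch below is one possible way to combine them, assuming a simple rule order (group, bona fide status, step-up authentication, DAC approval). All class names, field names and example identifiers are hypothetical; this is not how ELIXIR AAI itself is implemented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    elixir_id: str
    groups: set = field(default_factory=set)
    bona_fide: bool = False            # bona fide researcher status
    step_up_done: bool = False         # strong step-up authentication passed
    dac_approvals: set = field(default_factory=set)  # approved dataset IDs


@dataclass
class Service:
    name: str
    required_group: Optional[str] = None
    requires_bona_fide: bool = False
    requires_step_up: bool = False


def may_access(user, service, dataset_id=None):
    """Return (allowed, reason) for a user, a service and an optional dataset."""
    if service.required_group and service.required_group not in user.groups:
        return False, "missing group membership"
    if service.requires_bona_fide and not user.bona_fide:
        return False, "bona fide researcher status required"
    if service.requires_step_up and not user.step_up_done:
        return False, "step-up authentication required"
    if dataset_id is not None and dataset_id not in user.dac_approvals:
        return False, "no DAC approval for dataset"
    return True, "ok"


# Hypothetical user, service and dataset identifiers.
alice = User("alice@example", groups={"human-data"}, bona_fide=True,
             step_up_done=True, dac_approvals={"DATASET-1"})
archive = Service("sensitive-archive", required_group="human-data",
                  requires_bona_fide=True, requires_step_up=True)
print(may_access(alice, archive, "DATASET-1"))
print(may_access(alice, archive, "DATASET-2"))
```

Keeping the decision in one small function makes it easy for a service to log the reason for a refusal and to tighten a single rule without touching the others.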
28. ELIXIR Innovation and SME programme: 2017/2018
Previous Events Upcoming Events
• France - Paris 14-15 November
2017: Rare diseases and
personalized medicine
• Cambridge – UK 24-25 January
2018: Discovery of data, tools and
training
29. CONFIRMED SPEAKERS:
• Jean-François Deleuze (Director of the Centre National de Génotypage, CNG-IG-CEA)
• Daria Julkowska (Programme coordinator eRARE)
• Frederic Revah (President, Yposkesi - CEO, Genethon)
• Ana Rath (Orphanet)
DATA RESOURCE SHOWCASE | TRAINING | FLASH-TALK
Registration OPEN:
https://sme_paris.eventbrite.co.uk
31. ELIXIR in numbers
• ~ 180 institutes involved
• 600+ staff
• 11 Implementation Studies
currently in operation
• 10 papers in ELIXIR F1000R
channel
• 223 live events in TeSS
• 200 companies attended
Innovation and SME
programme
33. ELIXIR Platforms and Use cases leads
Interoperability:
Chris Evelo (ELIXIR NL), Carole Goble
(ELIXIR UK), Helen Parkinson (ELIXIR EMBL-EBI)
Tools:
Søren Brunak (ELIXIR DK),
Alfonso Valencia (ELIXIR ES)
Compute:
Tommi Nyrönen (ELIXIR FI), Luděk Matyska
(ELIXIR CZ), Steven Newhouse (ELIXIR EMBL-EBI)
Data:
Jo McEntyre (ELIXIR EMBL-EBI),
Christine Durinx (ELIXIR CH)
Training:
Celia van Gelder (ELIXIR NL), Gabry Rustici (ELIXIR
UK), Patricia Palagi (ELIXIR CH)
Human data:
Jordi Rambla (ELIXIR ES), Thomas Keane (ELIXIR
EMBL-EBI)
Plants:
Celia Miguel (ELIXIR PT), Paul
Kersey (ELIXIR EMBL-EBI)
Rare diseases:
Ivo Gut (ELIXIR ES),
Marco Roos (ELIXIR NL)
Marine metagenomics:
Nils Willassen (ELIXIR NO),
Rob Finn (ELIXIR EMBL-EBI)
34. ELIXIR Compute: 3 core pan-ELIXIR services for data sharing
Link national resources using
common standards, shared
services and user management
protocols
Editor's notes
ELIXIR unites Europe’s leading life science organisations in managing and safeguarding the increasing volume of data being generated by publicly funded research.
ELIXIR coordinates, integrates and sustains bioinformatics resources across its member states and enables users in academia and industry to access vital data, tools, standards, compute and training services for their research.
ELIXIR provides data infrastructure for Europe’s 500,000 life-science researchers.
ELIXIR is organised in five technical platforms: Data, Interoperability, Tools, Compute and Training.
The four Use Cases of ELIXIR connect the technical activities to the real needs of user communities in the life sciences: Marine metagenomics, Crop and forest plants, Human data, Rare diseases.
The ELIXIR network currently counts 17 Members and 2 Observers.
The Network is coordinated from the ELIXIR Hub, based alongside EMBL-EBI in Hinxton, UK.
Five new ELIXIR Members in 2015-2016: France, Spain, Belgium, Italy, Slovenia. Luxembourg, Germany and Ireland; Hungary joined in January 2017.
Creating a robust infrastructure for biological information is a bigger task than any individual organisation or nation can take on alone
These are issues of such complexity that no single institution or country can tackle alone
ELIXIR Nodes ensure local bioinformatics capacity throughout Europe
Represented by SIB - Swiss Institute of Bioinformatics
The largest ELIXIR Node – 48 research groups, 650 scientists
Co-Leader of ELIXIR Platform on Data Services
At heart of long-term sustainability objectives
www.sib.swiss
ELIXIR’s services include databases, tools, standards and guidance on how to achieve interoperability of standards, training courses, compute resources, data management support, and industry cooperation and technology transfer. The list of services is constantly evolving.
Data deposition: ELIXIR Nodes run several data deposition archives where researchers can save their data. Deposition is usually done online and is free as a public resource. The major deposition archives run by ELIXIR Nodes include ENA, EGA, PDBe, EuropePMC
Added-value databases: Added-value databases process, analyse and annotate data (adding comments and other information) from deposition archives and make them accessible to the wider scientific community. The added value comes from data processing, additional annotation, or mapping to standardised vocabularies or ontologies (a formal specification of terms and the relationships among them). They facilitate the discovery and use of relevant data. Examples of the major knowledge bases run by ELIXIR Nodes include: UniProt, Ensembl, Orphanet
Using Compute services, researchers can carry out computationally intensive modelling and simulation studies and make the most of the big data revolution in the life sciences. Several ELIXIR Nodes offer cloud computing services, which enable researchers to use compute resources on demand, without the need to manage their own hardware or data centre in-house. Services currently available:
Computerome (ELIXIR Denmark)
ePouta, cPouta (ELIXIR Finland)
Embassy cloud (EMBL-EBI)
Training: ELIXIR Training trains European developers, trainers and researchers within ELIXIR communities. Developers are trained to achieve better performance and to use relevant methodologies when implementing software. Trainers are trained to deliver courses on using ELIXIR services. Researchers are trained to use the tools and services offered by ELIXIR effectively. The ELIXIR training portal TeSS collects training-related materials: users can browse, discover and organise life science training resources aggregated automatically from ELIXIR Nodes and third-party providers.
Data management: Several ELIXIR Nodes offer practical help to research teams in developing and implementing their data management plans. The support ranges from practical help to research projects in drafting their data management plans to services
Industry: ELIXIR's Innovation and SME programme is a series of specialised events that bring together operators of ELIXIR services with industry and SME representatives, who have the opportunity to learn how to effectively use the bioinformatics tools and services offered by ELIXIR.
Tools:
We defined the ELIXIR Core Data Resources as a set of data resources that are of fundamental importance to the broad life science community and the long-term preservation of biological data. They provide complete collections of generic value to life science, and show high levels of usage, scientific quality and service.
Information about repositories with consistent structured data to help index, search & tools
Align overlapping registry efforts to collect & curate certain metadata.
Help with consistency of metadata collected by registries
Use Case
Properties
Data sharing requires compatible technology. GA4GH has an incredibly important role as the global, neutral body that looks after and preserves standards. From our perspective it has been a pleasure to collaborate.
It is important to recognise the role of global, open, community-driven organisations in setting the technical and scientific standards. As an international organisation supported by government funders, we can support, we can facilitate and we can help sustain. But the standards need to be driven by the community.
Networks of trust: sharing human-derived, potentially identifiable data requires trust between the parties. This is one of the key aspects of the European General Data Protection Regulation: it is positive towards sharing research data, but it sets out a number of legal requirements for this to happen. The Data Controller is responsible for ensuring that these are met by the recipient, and so we need to broker trust. (Just like in the mobile phone case: you go roaming and AT&T trusts that Vodafone will reimburse. Vodafone trusts that you pay the bills, and you display trustworthiness via a credit rating and by signing a contract.)
We also need the reference databases: lighthouses that help us navigate an ocean of data. BRCA Exchange, but also the GenBanks, genome builds and nomenclature committees without which it is impossible to do genomics research.
Largest of all ESFRI RIs
The output and activities to the
In production since January 2017 (running 08/2016)
292 Identity providers
584 ELIXIR identities in 101 groups
16 relying ELIXIR services
ELIXIR AAI credential accepted
Including two private companies that provide cloud brokering services (nationally)