DIRISA for Open Data and Open Science/Anwar Vahed

Presented during the SA-EU Dialogue Facility, 15-16 May 2018.

Published in: Data & Analytics

  1. DIRISA for Open Data and Open Science
     SA-EU Open Science Dialogue Project, 15 May 2018
  2. National Integrated Cyberinfrastructure System (NICIS)
     • Advanced integrated cyber platform offering services for:
       • HPC
       • Data
       • Networking
     • Priority science and education
     • Overarching coordination implemented by CSIR
     Diagram labels: Core services; Networked resources; Skills & expertise
     Diagram labels: Computing Services (CHPC +); Networking Services (SANReN); Data Services (DIRISA); e-Research environments (Cloud)
     Diagram labels: Materials & Manuf.; Energy; Earth & Environment; Phy Sci & Eng.; Humans & Society; Health, Bio & Food
     © CSIR, 2018
  3. DIRISA Objectives
     Build national data infrastructure
     • Build and maintain Tier 1 nodes and services
     • Start Tier 2 domain nodes
     Develop human capital and skills
     • e-Science postgraduate programs
     • Conferences and training workshops
     Research data management
     • DMP tool
     • PID service
     • User policies and practices
     Advocate and coordinate
     • R&D initiatives
     • Stakeholder workshops
  4. National Data Infrastructure
     • “I just want to store/preserve my data (reliably)”
     • “I just want to share my data (in a controlled way)”
     • “I just want to process my data”
     • NICIS-DIRISA role:
       • Link into Tier 0
       • Build and maintain Tier 1
       • Support starting up Tier 2
       • Link into Tier 3
     One-Stop-Shop: Federated access to research data
  5. Underpinning Open Data & Open Science
     1. National infrastructure and services for Open data
        • DIRISA Tier 1 (8 PB) store & Research Data Management services
        • Regional Tier 2 Node
     2. Human capital development
        • National e-Science Masters
        • Data Science training
     3. Data management
        • PID Allocation: Handle and DOI registries
        • SA_DMP: SA Data Management Planning tool
        • Policies across data life cycle
     4. Outreach and coordination
        • Conferences and workshops: SA Data Conference 19-21 June
        • Africa Open Science Platform
        • Big Data strategy
  6. South African National Data Infrastructure (SANDI)
     • DSubscribe: Subscribe as DIRISA user
     • DataDrop: Deposit and store data reliably
     • FindGet: Discover and download data sets
     • SafeShare: Safely share data with users
     • DataStage: Prepare data for HPC
     User documentation; Help & support; Core services (DMP, PID)
     Phase 1: Research Data Management
     • My data management plans
     • My workflows
     • My data sets and outputs
     • My communities
     Phase 2: Collaborative Research Environments
     (References: EUDAT, ANDS, JISC, Data.gov)
  7. Data Access Spectrum: Open by default
     Small – Medium – Big data
     Personal – Business – Government
     Closed – Shared – Open
     Internal access
     • Private
     • Confidential
     • Sensitive
     • Surveillance data
     Named access
     • Assigned by contract
     • Regulation authorised
     • Driver's licences
     Group based access
     • Project assigned
     • Selected membership
     • Genomic data
     Public access
     • Licence that limits use
     • Terms and conditions
     • Geospatial data
     Anyone
     • Open to public
     • No limits on use
     • Weather data
     (ODI)
  8. Actions
     Data custodians/stewards: individuals; institutions; groups/consortia; government; business
     • Advocate and promote: Increase visibility and benefits of open data
     • Clarify Open data and related concepts: Governance, Stewardship, Custodianship, IP, Copyright,…
     • Change accreditation model: data citation recognition, altmetrics, etc.
     • Develop policies: institutional strategies, standards, protocols, principles and recommended practices
     • Change training: Include Open data concepts in (data) science curricula
     • Funding model: incentives/requirements for Open data principles
     • Harmonise privacy and openness regulation
  9. Beyond FAIR
     • FAIR: Findable, Accessible, Interoperable, Reusable
     • FAIReR: FAIR + Reproducible
     • FAIReST: FAIR + Stewardship + Transparency (truth and trust)
     [Liz Lyon, University of Pittsburgh]
     Open Science => New roles…
  10. Thank you
      www.dirisa.ac.za
      dirisa@csir.co.za
  13. Accelerating Data Intensive Research
      “We need to get greater value (benefit, impact) from our investments in data”
  14. Architecture
      • Open (FAIR) Data & Open Science
      • Federated locally and globally (“One-stop-shop” catalogue)
      • Certified as Trusted Repository
      • Linked to funder systems
      • Suite of services for RDM and data intensive analytics
      Diagram labels: 40 PB; 2 PB Archival data & staging: VM access; 8 PB Active data: near real time interactive access; 0.5 PB Services & staging between DIRISA and CHPC storage systems
      Diagram labels: Storage Virtualisation Server; CHPC Lustre or POSIX storage systems; CHPC compute systems; * PB; Software defined storage hierarchy (Small, fast; Big, slow); iRODS; DIRISA cloud portal
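The Data Access Spectrum on slide 7 can be read as a small classification scheme: five access levels grouped into three bands. The Python sketch below is illustrative only and is not part of the presentation. The level names come from the slide. The grouping of Named, Group based and Public access under "Shared" is an assumption taken from the ODI Data Spectrum that the slide credits, and `Band`, `AccessLevel`, `levels_in`, `is_open` and `DEFAULT_LEVEL` are hypothetical names.

```python
from enum import Enum


class Band(Enum):
    """The three bands named on the slide."""
    CLOSED = "closed"
    SHARED = "shared"
    OPEN = "open"


class AccessLevel(Enum):
    """The five access levels from the slide.

    The band assigned to each level is an assumption based on the
    ODI Data Spectrum; the slide text alone does not spell it out.
    """
    INTERNAL = ("Internal access", Band.CLOSED)
    NAMED = ("Named access", Band.SHARED)
    GROUP = ("Group based access", Band.SHARED)
    PUBLIC = ("Public access", Band.SHARED)
    ANYONE = ("Anyone", Band.OPEN)

    def __init__(self, label, band):
        self.label = label
        self.band = band


# "Open by default" (the slide title), expressed as a hypothetical default.
DEFAULT_LEVEL = AccessLevel.ANYONE


def levels_in(band):
    """Return the access levels in a band, in spectrum order."""
    return [level for level in AccessLevel if level.band is band]


def is_open(level):
    """True only for data that anyone may use without limits."""
    return level.band is Band.OPEN


if __name__ == "__main__":
    for band in Band:
        labels = ", ".join(level.label for level in levels_in(band))
        print(f"{band.value}: {labels}")
```

A repository could attach one such level to each data set and require a stated reason before moving away from `DEFAULT_LEVEL`; that policy is a design suggestion, not something the slides specify.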
