Architecture and Standards

Presentation by Hugo Leroux and Liming Zhu, CSIRO, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.

  1. Architecture and Standards: Technology Considerations, Capabilities and Roadmap. Health and Biosecurity / Data61. Liming Zhu and Hugo Leroux, 6 March 2019
  2. Information platform and standards
     • Platform
       – At data collection or at study completion? The choice has an impact on the architecture
     • Standards
       – Greater focus on the FAIR (Findable, Accessible, Interoperable, Reusable) data principles, to enable researchers to use datasets more easily in their research
       – Metadata should be encapsulated within the data, and must include the processes/tools needed to facilitate reproducibility of results
       – Data collection: CDISC CDASH, HL7 FHIR, SNOMED CT, LOINC, etc.
       – Transport protocols: CDISC ODM, HL7 FHIR
       – Adoption and implementation of data standards from the start
  3. Information Architecture
     Olivier alluded to six possible options:
     • Locally managed by individual investigators
     • Directory of studies with limited query ability
     • Data buckets centrally managed for a specific domain/application
     • Large centralised open repository
     • Large federated system: locally managed with global access
     • Novel model: locally managed, with analysis software sent to the data
  4. Architecture comparison
     Table rating six architectures against six criteria. The per-cell ratings were graphical and are not preserved in this transcript.
     • Architectures (columns): Locally managed, Directory, Data buckets, Centralised, Federated, Novel
     • Criteria (rows): Uncovering study objectives; Facilitating data mapping; Ensuring data/model harmonisation; Enforcing data integrity; Data ownership; Scalability
     • Legend: Strongly aligned, Aligned, Weakly aligned, Not aligned
  5. Improving the Current State
     Diagram. Labels: Data Source (two), Linkage, Unit Records, Aggregates, Secure Container, Direct Access, None, Share (two), Link, Integrate, Use, Protect, Access, Masking, Authorise, Risk Assessment, Privacy Preserving Query API, Workflow, Grouping
  6. Sensitive Data: Improving the Current State
     Same diagram as slide 5, annotated with work in progress:
     • Exploring: automating manual linkage with machine learning
     • Building: Re-identification Risk Ready Reckoner (R4)
     • Building: PROTARI API for safe queries of sensitive data
     • Building: SenDA – decision support and data access workflow
     • Building: Anonlink – secure privacy-preserving linkage
  7. Future State – Federated Pull model
     Diagram. Labels: Data Source (two), Linkage, Secure Container, Direct Access, Share (two), Link, Data Virtualisation, Use, Assess & Authorise, Risk Assessment & Defence, Privacy Preserving Query API, Auth Workflow, Masking (two), Virtual Table, Synthesis & Remote analysis, Discovery, Encrypted Analytics
  8. Future State – Federated Pull model
     Same diagram as slide 7, with annotations:
     • Choice based on quantified risk assessment
     • Data is close to “live”
     • Many options for end use based on the 5 Safes
  9. Summary
     • Still too early for technology and infrastructure choices
       – But emerging technologies and new capabilities may inform directions
     • Avoid overcomplicating the solution/architecture from the beginning
       – Consider a staged, future-proof architecture with short-term wins
     • Depends on the value propositions, use cases and risk profiles of the data, the purpose of use and the users
       – Not a single technology/solution but a fit-for-purpose spectrum of technologies
     • Consider learning from other fields and new regulations
       – E.g. Consumer Data Rights legislation
  10. Thank you
     • Data61: Liming Zhu, Research Director. t +61 2 9490 5638, e firstname.surname@csiro.au, w www.data61.csiro.au
     • Health and Biosecurity: Hugo Leroux, Research Scientist. t +61 7 3253 3614, e firstname.surname@csiro.au, w www.aehrc.com
  11. Appendix
  12. Option 1: Locally managed by Chief Investigator
     • Examples: typical NHMRC project
     • Pros:
       – Status quo, simplest solution
       – Research quality controlled by the project team
     • Cons:
       – No guarantee of FAIR principles
       – Burden on the research community to aggregate large datasets
       – Difficulty for community-driven alternative hypotheses
  13. Option 2: Directory of studies
     • Examples: GAAIN, ANZCTR, australianclinicaltrials.gov.au
     • Pros:
       – Awareness of existing studies, facilitates collaboration
       – Often allows searching through study details, eligibility criteria and participants
       – Inexpensive
     • Cons:
       – Same as Option 1
  14. Option 3: Data buckets centrally managed
     • Examples: NIH repositories, CSIRO DAP, ADRC
     • Pros:
       – Semi-open access to data
       – Projects maintain data ownership
       – Scalable
     • Cons:
       – Data standard, quality and access not guaranteed
       – Large overheads to aggregate datasets
  15. Option 4: Centralised data repository
     • Examples: LONI, UK Biobank, Clinicaltrials.gov, Clinicaltrialsregister.eu
     • Pros:
       – FAIR compatible
       – Homogeneous data quality and standard
       – QC often included
       – Low overheads to access datasets
       – Single point for data and access management
       – Cost-effective infrastructure
     • Cons:
       – High cost for the administering institution
       – Might not be easily scalable, depending on design
  16. Option 5: Large federated system
     • Examples: EcoPathways
     • Pros:
       – FAIR compatible
       – Individual projects retain data ownership
       – Scalable
     • Cons:
       – Complex infrastructure and deployment
       – High cost
       – Challenging governance and enforcement
       – Distributed infrastructure with potential for duplication; small projects disadvantaged
  17. Option 6: Local data, distributed compute
     • Example: Enigma, to some extent
     • Pros:
       – FAIR compatible
       – Individual projects retain data ownership
       – Theoretically scalable
     • Cons:
       – Very complex, relatively untested on a large scale
       – Relies on each individual site for compute and data management infrastructure
       – Distributed infrastructure with potential for duplication; small projects disadvantaged
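
The "analysis software sent to the data" idea in Option 6 can be illustrated with a minimal Python sketch. It is not the design of Enigma or of any system named in the slides. The `Site`, `LocalSummary`, `summarise` and `pooled_mean_and_variance` names are hypothetical, and the sketch assumes each site is willing to release simple aggregates (count, sum, sum of squares) but never its unit records.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LocalSummary:
    """Aggregates a site releases in place of unit records (assumed policy)."""
    n: int
    total: float
    total_sq: float


class Site:
    """Hypothetical data custodian; unit records stay inside this object."""

    def __init__(self, name: str, values: List[float]):
        self.name = name
        self._values = list(values)  # locally managed unit records

    def run(self, analysis: Callable[[List[float]], LocalSummary]) -> LocalSummary:
        # The analysis code travels to the data; only its summary travels back.
        return analysis(self._values)


def summarise(values: List[float]) -> LocalSummary:
    """The 'analysis software' that is shipped to every site."""
    return LocalSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pooled_mean_and_variance(summaries: List[LocalSummary]) -> Tuple[float, float]:
    """Combine per-site aggregates into a pooled mean and population variance."""
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no records at any site")
    mean = sum(s.total for s in summaries) / n
    variance = sum(s.total_sq for s in summaries) / n - mean ** 2
    return mean, variance


# Usage: two sites, one shared analysis, no unit records exchanged.
sites = [Site("site_a", [1.0, 2.0, 3.0]), Site("site_b", [4.0, 5.0])]
mean, variance = pooled_mean_and_variance([s.run(summarise) for s in sites])
```

The pooled result matches what a centralised repository would compute over all five values, while each site keeps ownership of its records. A real deployment would also need the governance, authorisation and risk-assessment steps discussed in slides 5 to 8.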

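Slide 2's point that metadata should be encapsulated within the data, including the processes and tools needed for reproducibility, can also be sketched in Python. This is an illustrative assumption only: the `DatasetPackage` and `ProcessingStep` classes and all field names are hypothetical and do not follow CDISC ODM, HL7 FHIR or any other standard's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ProcessingStep:
    """One tool or process applied to the data (hypothetical fields)."""
    tool: str
    version: str
    description: str


@dataclass
class DatasetPackage:
    """Data and its descriptive metadata travel together as one object."""
    identifier: str          # persistent identifier, supports findability
    title: str
    licence: str             # states the terms of reuse
    standards: List[str]     # e.g. names of the data standards used
    provenance: List[ProcessingStep] = field(default_factory=list)
    records: List[Dict[str, object]] = field(default_factory=list)

    def missing_metadata(self) -> List[str]:
        """Return the names of required metadata fields that are empty."""
        required = {
            "identifier": self.identifier,
            "title": self.title,
            "licence": self.licence,
            "standards": self.standards,
            "provenance": self.provenance,
        }
        return [name for name, value in required.items() if not value]

    def to_json(self) -> str:
        """Serialise data and metadata into a single document."""
        return json.dumps(asdict(self), indent=2)


# Usage: a package is only shared once its metadata is complete.
package = DatasetPackage(
    identifier="example-study-001",
    title="Example study",
    licence="example-licence",
    standards=["CDISC CDASH"],
    provenance=[ProcessingStep("example-cleaner", "1.0", "Removed duplicate visits")],
    records=[{"subject": "S1", "visit": 1}],
)
incomplete = DatasetPackage("example-study-002", "Draft study", "", [])
```

Checking `missing_metadata()` before release is one simple way to enforce standards from the start of a study instead of retrofitting them at completion.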