Data Management for Grown Ups
1. Data Management for Grown Ups
Terrell Russell, Ph.D.
@terrellrussell
Senior Data Scientist, iRODS Consortium
Renaissance Computing Institute (RENCI), UNC-Chapel Hill
3. The iRODS Consortium
The iRODS Consortium was created to ensure the sustainability of iRODS and to
further its adoption and continued evolution. To this end, the Consortium
works to standardize the definition, development, and release of iRODS-based
data middleware technologies, evangelize iRODS among potential users,
promote new advances in iRODS, and expand the adoption of iRODS-based
data middleware technologies through the development, release, and support
of an open-source, mission-critical, production-level distribution of iRODS.
Current Members:
RENCI, DICE, Seagate, DDN, Novartis, IBM, Complete Genomics, Wellcome Trust
Sanger Institute, UCL, Cleversafe, EMC, and the NASA Atmospheric Science Data
Center
16. Four Verticals → Four Case Studies
Health Care & Life Science
Oil & Gas
Media & Entertainment
Archives & Records Management
17. Health Care & Life Science
Genomics Use Case - Data begins as a series of images
from a sequencer and is converted to bases (A, T, C, G),
then fragmented, aligned, annotated for variants, filtered,
and analyzed.
Extensive Data Pipelines
Saved State
Diverse Data Products
Share Results
18. Health Care & Life Science
Priorities:
reproducibility
multi-institutional collaboration
19. Oil & Gas
Ingest Use Case - As existing storage fills up, two
complementary strategies apply: 1) migrate data from active
storage to a slower, cheaper archive and 2) add more active storage.
Traditional HSM has limited flexibility (access date,
physical location, etc.) and additional namespaces
just add more complexity.
Diverse Data Sources
Spread Geographically
Computationally Intense
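The ingest use case above can be made concrete with a small sketch. This is an illustration only, not an iRODS or HSM product API: the DataObject class, the tier names "active" and "archive", the project field, the paths, and the dates are all hypothetical. It shows a migration policy that uses access date plus one extra piece of metadata (project), while the logical path users see stays unchanged.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DataObject:
    logical_path: str      # name users see; does not change when data moves
    tier: str              # "active" or "archive" (hypothetical tier names)
    last_access: datetime
    project: str           # example of metadata beyond access date


def select_for_archive(objects, now, max_idle_days=90, keep_projects=()):
    """Return active objects idle longer than max_idle_days, skipping pinned projects."""
    cutoff = now - timedelta(days=max_idle_days)
    return [
        obj for obj in objects
        if obj.tier == "active"
        and obj.last_access < cutoff
        and obj.project not in keep_projects
    ]


def migrate(objects):
    """Mark the selected objects as archived; logical paths stay the same."""
    for obj in objects:
        obj.tier = "archive"
    return objects


# Made-up sample data for illustration.
now = datetime(2015, 6, 1)
catalog = [
    DataObject("/zone/seismic/survey_a.segy", "active", datetime(2014, 1, 10), "north_sea"),
    DataObject("/zone/seismic/survey_b.segy", "active", datetime(2015, 5, 20), "north_sea"),
    DataObject("/zone/seismic/survey_c.segy", "active", datetime(2013, 7, 4), "gulf"),
]
moved = migrate(select_for_archive(catalog, now, keep_projects=("gulf",)))
print([obj.logical_path for obj in moved])
```

Only survey_a is migrated here: survey_b was accessed recently, and survey_c is pinned by its project even though it is the oldest. Combining criteria this way is the kind of flexibility the slide says a purely date-driven policy lacks.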
21. Media & Entertainment
Born Digital Use Case - New valuable creative
content (movie assets, original musical tracks)
requires large, robust, long-term, flexible,
accessible infrastructure.
Popular Content
Unique
Largely Video and Games
23. Archives & Records Management
Provenance Use Case - Libraries, museums, and
other cultural institutions have a 100+ year view on
their digital assets. Must maintain archival and
dissemination copies. Lots of metadata.
Cultural Heritage
Original and Derivative Copies
Quality Search and Browse
27. Open Source Data Management Middleware
iRODS enables data discovery using a metadata catalog that
describes every file, every directory, and every storage
resource in the data grid.
iRODS automates data workflows, with a rule engine that
permits any action to be initiated by any trigger on any server
or client in the grid.
iRODS enables secure collaboration, so users only need to
log in to their home grid to access data hosted on a remote
grid.
iRODS implements data virtualization, allowing access to
distributed storage assets under a unified namespace, and
freeing organizations from getting locked in to single-vendor
storage solutions.
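As a rough illustration of three of the ideas above (metadata catalog, rule engine, and unified namespace), here is a toy sketch. It is not the iRODS API: the ToyGrid class, its method names, the "put" event name, and the storage locations are all hypothetical, and federation between grids is not modeled.

```python
from collections import defaultdict


class ToyGrid:
    """Illustrative stand-in for a data grid; not the iRODS API."""

    def __init__(self):
        self.replicas = {}                 # logical path -> physical location
        self.metadata = defaultdict(dict)  # logical path -> {attribute: value}
        self.rules = defaultdict(list)     # event name -> list of actions

    def on(self, event, action):
        """Register an action to run whenever the named event fires."""
        self.rules[event].append(action)

    def put(self, logical_path, physical_location):
        """Record a file under the unified namespace, then fire the 'put' rules."""
        self.replicas[logical_path] = physical_location
        for action in self.rules["put"]:
            action(self, logical_path)

    def query(self, attribute, value):
        """Return logical paths whose metadata has attribute == value."""
        return sorted(
            path for path, attrs in self.metadata.items()
            if attrs.get(attribute) == value
        )


def tag_by_extension(grid, logical_path):
    """Example rule: record the file extension as metadata at ingest time."""
    extension = logical_path.rsplit(".", 1)[-1]
    grid.metadata[logical_path]["type"] = extension


grid = ToyGrid()
grid.on("put", tag_by_extension)
# Files live on different (made-up) storage systems but share one namespace.
grid.put("/zone/home/alice/reads.bam", "s3://vendor-a/bucket/0001")
grid.put("/zone/home/alice/notes.txt", "file://nas-b/vol2/0002")
grid.put("/zone/home/bob/sample.bam", "file://nas-b/vol7/0003")
print(grid.query("type", "bam"))
```

The query finds both .bam files by metadata alone, regardless of which storage system holds them, and the metadata was attached automatically by a rule triggered on ingest.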