Genomics isn't Special

The science driving genomic analyses is rapidly changing, but the operational problems of processing data from DNA sequencers quickly and reliably are not new.

I present an analysis of the parallels between the fundamental limiting components of the '90s internet boom and those of the DNA sequencing boom currently underway, and illustrate how Hadoop, a proven application architecture used widely in BigData and commercial internet applications, can be reused in the genomics sector.
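As a rough illustration of the kind of reuse described above, the sketch below runs the map, shuffle, and reduce steps that Hadoop distributes across a cluster, here in a single Python process. The task (counting k-mers, i.e. length-k substrings, in sequencer reads) and every function name are assumptions chosen for illustration; none of it is taken from the talk.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k):
    """Map step: emit a (k-mer, 1) pair for every length-k window in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads, k):
    """Run map, shuffle, and reduce in-process (hypothetical helper)."""
    mapped = chain.from_iterable(map_read(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


# Two toy reads; real input would be the sequencer's output files.
reads = ["ACGTAC", "GTACGT"]
counts = count_kmers(reads, 3)
print(counts)
```

Each read is mapped independently of the others, so the map step can in principle be spread over as many machines as there are input splits. This is the same property that made the pattern a fit for web-scale log processing.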

Genomics isn't Special

  1. Genomics isn’t Special* (diagram labels: Sequencer, Apps, Analytics). © 2014 MapR Technologies. * http://www.slideshare.net/urilaserson/genomics-is-not-special-towards-data-intensive-biology
  2. Genomics Follows the Standard BigData Workflow (diagram labels: Sensor, ETL, BI)
  3. Genomics is a Big Opportunity (diagram labels: Sensor, ETL, BI, MapR-DB, MapR-FS)
  4. Biggest Opportunity is to Save Lives (Clinical). Diagram labels: Digitized DNA; Clinical, Pharma, Agriculture, Manufacturing, Energy, … ~28 PB of DNA digitized per year (2013). ~250K human genomes sequenced (2013). ~4M babies born (2013, USA). http://www.technologyreview.com/news/531091/emtech-illumina-says-228000-human-genomes-will-be-sequenced-this-year/
  5. Clinical Applications are Launching Now. 2014: US$ 2B, mostly research, mostly chemical costs. 2020: US$ 20B, mostly clinical apps, mostly analytics costs. [Chart: Clinical vs. Non-Clinical, 2014 and 2020.] Source: Macquarie Capital, 2014. Genomics 2.0: It’s just the beginning.
  6. Clinical COGS: Analytics > Chemistry. 2014: US$ 2B, mostly research, mostly chemical costs. 2020: US$ 20B, mostly clinical apps, mostly analytics costs. [Chart: Clinical vs. Non-Clinical, 2014 and 2020.] Why?
  7. Historical Perspective – eCommerce Boom
  8. [Chart over years: CPU transistors/mm², HDD GB/mm², Internet GB/s]
  9. Early 1990s: Early eCommerce Vendor Setup (diagram labels: Website, Back Office, Storage; read/write links)
  10. Late 1990s: Workload became too big (diagram labels: Website ×4, Back Office ×2, Storage; read/write links)
  11. 2003-4: GFS+MapReduce (Hadoop) Published (diagram labels: Website ×4, Back Office ×2, Storage + Compute Cluster; read/write links)
  12. Genomics Boom
  13. DNA Sequencing, pre-2004 [chart over years: CPU transistors/mm², HDD GB/mm², DNA bp/$ (pre-2004)]
  14. DNA Sequencing, pre-2004 (diagram labels: Sequencer, Coordinator / Edge Node, High-Performance Compute Cluster, Storage; write-only and read/write links)
  15. DNA Sequencing, 2004 Disruption [chart over years: CPU transistors/mm², HDD GB/mm², DNA bp/$ (pre-2004), DNA bp/$ (post-2004)]
  16. DNA Sequencing, post-2004 (diagram labels: DNA Sequencer Cluster (e.g. Illumina X-Ten), Coordinator / Edge Node, High-Performance Compute Cluster, Storage; write-only and read/write links; annotations: HPC bottleneck, Sequencer back-pressure)
  17. DNA Sequencing, 2014 @ Major Sequencing Vendor (diagram labels: DNA Sequencer Cluster (e.g. Illumina X-Ten), Storage + Compute Cluster; write-only link; annotation: Decentralize I/O)
  18. DNA Analytics Can Now Scale Out (diagram labels: HPC Analytics, Hadoop / Spark Analytics)
  19. Back to Market Analysis…
  20. Clinical COGS: Analytics > Chemistry. 2014: US$ 2B, mostly research, mostly chemical costs. 2020: US$ 20B, mostly clinical apps, mostly analytics costs. [Chart: Clinical vs. Non-Clinical, 2014 and 2020.]
  21. Genomics Market Value Chain (diagram labels: Sequencing Tech, Basic R&D, Pharma, CLIA, Research Hospitals, Patients)
  22. Seven Billion Humans Today (diagram labels: Seq. Tech, CLIA, Pharma, Res. Hospitals, MapR-DB, MapR-FS; annotations: Linear Growth with # of Humans, Exponential Growth with # of Humans)
  23. Thanks! Questions? @allenday, @mapr, aday@mapr.com, linkedin.com/in/allenday