Customer Success Story
National Institutes of Health

The National Center for Biotechnology Information (NCBI), a division of the
National Library of Medicine (NLM) at the National Institutes of Health (NIH),
is a national resource for molecular biology information, serving research
groups from around the world. Established in 1988, NCBI develops new
information technologies to aid in the understanding of fundamental molecular
and genetic processes that control health and disease. NCBI creates public
databases, conducts research in computational biology, develops software
tools for analyzing genomic data, and disseminates biomedical information.
Some 450 people (ranging from NCBI researchers and staff scientists to
programmers, curators, and indexers) generate, store, and access NCBI
databases.

SUMMARY

Industry:
Life Sciences/Government

THE CHALLENGE
Meet demands of researchers from around the globe accessing the NCBI
public database to conduct genome research. Eliminate I/O bottlenecks and
maximize computing resources for public databases, including an estimated
1.5 PB of genetic information for the 1000 Genomes Project.

THE SOLUTION
PAS Storage system with the PanFS™ parallel file system, 1800-core Dell
PowerEdge Cluster, Cisco 6509 Network Switch

THE RESULT
  • 5X application performance improvement
  • Timelier database updates with faster time-to-results
  • High performance irrespective of access patterns/dataset size
  • Affordable scalability for fast-growing archives
  • Administrative efficiencies across primary and secondary storage

The Challenge
Researchers at NCBI depend on high-performance compute clusters to run
complex analyses of genotyping and sequencing data. The existing storage
architecture did not effectively scale to support such efforts as the 1000
Genomes Project, an ambitious endeavor to sequence the genomes of at least
1,000 people from around the world.

The project, creating the most detailed and medically useful picture to date
of human genetic variation, is expected to generate more than 1.5 PB of
genetic information. NCBI will be required to archive and provide timely
investigator access to as much as 3 TB of new genome data arriving weekly
from each of the six institutes participating in the 1000 Genomes Project.
To accommodate the expected high demand for data access, NCBI requires a
storage solution that is reliable, manageable, and affordable.

The Solution
NCBI selected Panasas Storage for the Center’s Dell PowerEdge compute farm.
The decision was based in part on testing results that indicated the Panasas
Storage solution delivers a significant performance improvement over
existing installed storage. Designed specifically to accelerate the
performance of applications deployed on Linux compute clusters, the Panasas
storage cluster effectively eliminated the research-impacting I/O
bottlenecks.

PAS storage now provides scalable performance and capacity to multiple
internal production systems (both Linux- and Windows-based platforms),
including NCBI’s 1800-core Dell PowerEdge cluster that provides computing
resources to some 80 applications used by ten NCBI research groups. Panasas
Storage supports much of the daily computation that generates the data for
such high-visibility services as NCBI’s PubMed resource that brings together
more than 18 million citations from MEDLINE and other life science journals
for biomedical articles.

Most recently, NCBI implemented a PAS Storage system that provides
economical second-tier storage for the high-density data requirements of the
1000 Genomes Project. The PAS solution also provides storage resources to
projects such as the NCBI Short Read Archive (SRA), a central repository for
short read sequencing data, and the dbGaP public repository of genotypes and
phenotypes.
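
For a rough sense of scale, the ingest figures quoted in The Challenge (as much as 3 TB per week from each of six institutes, against a project total of more than 1.5 PB) can be turned into a back-of-the-envelope estimate. The sketch below is illustrative only: the function names are hypothetical, decimal units (1 PB = 1000 TB) are an assumption, and the 3 TB figure is an upper bound rather than a measured rate.

```python
# Back-of-the-envelope ingest estimate using the figures quoted in the text.
# Assumptions (not from the source): decimal units, constant peak arrival rate.

TB_PER_PB = 1000  # assumption: decimal petabytes


def peak_weekly_ingest_tb(per_institute_tb=3, institutes=6):
    """Upper-bound weekly ingest if every institute sends its maximum."""
    return per_institute_tb * institutes


def weeks_to_reach(total_pb, weekly_tb, tb_per_pb=TB_PER_PB):
    """Weeks needed to accumulate total_pb at a constant weekly_tb rate."""
    return total_pb * tb_per_pb / weekly_tb


if __name__ == "__main__":
    weekly = peak_weekly_ingest_tb()
    print(f"Peak weekly ingest: {weekly} TB")
    print(f"Weeks to reach 1.5 PB at that rate: {weeks_to_reach(1.5, weekly):.1f}")
```

At the quoted upper bound, this works out to 18 TB of new data per week, so reaching 1.5 PB would take a little over 83 weeks of sustained peak arrivals.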


1-888-panasas | www.panasas.com

The Result
Flexibility and Efficiency Advance Discovery
Technology advances that have brought down the cost of sequencing
(from billions to millions per project, and freefalling rapidly toward
the industry’s goal of $10K or even as low as $1K for a single run)
have also contributed to an explosion of data. Taking advantage of the
PAS solution for receipt and storage of genome and other project data
helps NCBI keep pace with the volume and complexity of incoming
information in a cost-effective manner.

“Technology advances that have brought down the cost of sequencing
have contributed to an explosion of data...Panasas helps NCBI keep pace
with the volume and complexity of incoming information in a
cost-effective manner.”

Performance, Scalability for Fast-Growing Archives
Panasas solutions help address the research community’s
storage needs despite highly unpredictable demand.
Whether it’s unexpected demand for particular research
findings, storage requirements that mushroom from 150 TB
to 1.5 PB almost overnight, or datasets that vary from 3 TB to
30 TB in size, the needs of the scientific community dictate
storage flexibility and maximum uptime. In addition to the
inherent administrative efficiencies of a common architecture,
the Panasas unified storage platform for Tier 1 and secondary
storage applications gives NCBI the flexibility to support a
scientific user community striving for discoveries that directly
impact the understanding of genetics and its role in health and
disease analysis. NCBI’s mission is to help researchers better leverage
and build on the work of the larger biotechnology community,
avoiding both the cost and the time penalties of reworking
data.




About Panasas
Panasas, Inc., the leader in high-performance scale-out NAS storage solutions, enables enterprise customers to rapidly solve
complex computing problems, speed innovation and bring new products to market faster. All Panasas solutions leverage the
patented PanFS™ storage operating system to deliver exceptional performance, scalability and manageability.
PW-10-21000

Phone: 1-888-PANASAS | www.panasas.com
© 2010 Panasas Incorporated. All rights reserved. Panasas is a trademark of Panasas, Inc. in the United States and other countries.
