© Copyright 2013 EMC Corporation. All rights reserved.
Research and Technology Explosion in the Scale-Out Storage Era
Exploring the new frontier of perpetual data growth and how it will affect us
Ryan Sayre
Technical Strategist, EMEA
EMC ISD Office of the CTO
June 2013
What Is Big Data?
Data that challenges the capabilities of a system to capture,
manage, and process it within an acceptable elapsed time
~ Wikipedia ~
The Big Data Challenge
[Chart: storage capacity in exabytes (0 to 90), 2009 to 2014]
By 2013, 80% of all storage capacity sold will be for file-based data
Source: “Scale Out Storage in the Content Driven Enterprise: Unleashing the Value of Information Assets,” IDC White Paper (2010 Enterprise Disk Storage Consumption Model), June 2011
File based: 61.8% CAGR Block based: 23.7% CAGR
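To put the two growth rates in perspective, the short sketch below compounds them. It assumes, purely for illustration, that each CAGR holds for the five year-over-year steps spanned by the chart's 2009 to 2014 axis; the function name and the choice of period are assumptions, not taken from the IDC paper.

```python
def growth_multiple(cagr_percent: float, years: int) -> float:
    """Return the factor by which a quantity grows at a constant CAGR."""
    return (1 + cagr_percent / 100) ** years


YEARS = 5  # assumption: 2009 -> 2014, i.e. five compounding periods

file_multiple = growth_multiple(61.8, YEARS)   # file-based capacity
block_multiple = growth_multiple(23.7, YEARS)  # block-based capacity

print(f"File-based:  x{file_multiple:.1f} over {YEARS} years")
print(f"Block-based: x{block_multiple:.1f} over {YEARS} years")
```

Under these assumptions file-based capacity grows roughly elevenfold, while block-based capacity grows roughly threefold.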
Media & Entertainment, Design & Simulation, Healthcare, Bioinformatics, Data Analytics, File Shares & Archives
Genomics Size
[Figure: relative data volume, EMR : Radiology : Genomics*; legible labels: "1000", "Volume", "50GB"]
88 million outpatient visits to NHS hospitals in 2010/2011
*finished data
Sources: Dr. Halamka, BIDMC; S. Joshi, internal research; HIMSS; internal EMC data
Bioinformatics: A “data tsunami”
• Already a cliché in 2006:
– “Data Deluge”, “Data Tsunami” …
• What changed starting in 2007: terabyte-scale laboratory instruments
– “Next Generation” DNA Sequencers
– Confocal Microscopy & Live cell imaging
– Other Imaging (fMRI, CT, Ultrasound, etc.)
• 2010: Faster adoption of next-generation sequencing
• 2013: Scale-Out Storage is the only way to keep surviving!
Vast quantities of data
• Terabyte-scale issues have traditionally been “lab” or “workgroup” problems
• Individual researchers & lab instruments can generate terabyte volumes of data per experiment
– Average of 40TB storage for each Solexa instrument
– A recent “100TB Single-namespace” project was for a lab with a single 454 instrument
Sequencing throughput over time
(Data from one vendor’s platform)
[Chart: gigabases of sequence per run (0 to 20) over time; annotation: 15x]
Throughput Outpacing Moore’s Law
• 1000 Genomes Project
– Could generate 90Tbase of raw data (@ 30x coverage)
• International Cancer Genome Consortium
– 50,000+ samples could generate 5,000Tbase of raw data
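The 90 Tbase figure can be reproduced with back-of-the-envelope arithmetic. The sketch below assumes a human genome of roughly 3 gigabases and exactly 1,000 genomes; both values, and the helper name, are assumptions for illustration rather than figures stated on the slide.

```python
HUMAN_GENOME_BASES = 3e9  # assumption: ~3 gigabases per human genome
TERA = 1e12
GIGA = 1e9


def raw_terabases(samples: int, coverage: float,
                  genome_bases: float = HUMAN_GENOME_BASES) -> float:
    """Raw sequence volume in terabases for a sequencing project."""
    return samples * coverage * genome_bases / TERA


# 1000 Genomes Project at 30x coverage
thousand_genomes = raw_terabases(samples=1000, coverage=30)
print(f"1000 Genomes: {thousand_genomes:.0f} Tbase")

# ICGC: 5,000 Tbase spread over 50,000 samples implies this much per sample
icgc_gbase_per_sample = 5000 * TERA / 50_000 / GIGA
print(f"ICGC: {icgc_gbase_per_sample:.0f} Gbase of raw data per sample")
```

Read the same way, the ICGC estimate works out to about 100 Gbase of raw data per sample, which is in the same range as 30x coverage of a 3 Gbase genome.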
[Chart: sequencing throughput in kb/day versus CPU, log scale from 1 to 1,000,000, 1996 to today]
Sequencer capacity is growing enormously
Dependent infrastructure has become a significant and critical factor
[Chart: G per instrument (0 to 80) over time, annotated with three stages]
• Home grown storage and compute resources are capable of supporting data reduction and alignment
• Specialized HPC and storage architectures are required to meet aggregate throughput and processing demands
• Current HPC architectures can be resource prohibitive at the quantity required to manage data output
Broad Institute Sequencing Data
Big Data Apps Need Big Data Storage
Data intensive, HPC workflows
Medical Imaging, Gene Sequencing, Seismic Exploration, Media & Entertainment, Product Development, Satellite Images
Big Data Archive Challenge
Relentless Data Growth
Primary Storage Overloaded with Unstructured Files
– Constant upgrade requirements
Performance Issues
– Hinders regulatory responses and e-discovery applications
Storage Islands
Many Systems or 2-way clusters and Points of Management
Numerous File Systems/Volumes
My own Big Data Growth Story…
Started out at 1 Terabyte of shared storage in 2004
– Image Processing and Visualisation
– Quickly grew to 5 Terabytes within 5 months
– Was worrying about storage every day, needed a way out!
– Transitioned to Scale-Out, Scaled to 300 TB within 3 years
Current organisation is over 2 Petabytes of storage
– No dedicated storage administrator
– I/O patterns are managed by policy and tier now
UK Case Study: Life Sciences Institute
Bioinformatics organisation needing not only to store but also to cross-reference multiple genome types to create a mega database of genomic structural variants across all species
Share across multiple organisations across the UK and into greater Europe
Need to grow to 20 Petabytes and beyond
UK Case Study: Engineering Design Automation
Performance requirements of over 1 million operations a second to simulate complex electrical pathways
Time to market required more rapid simulations to advance technology roadmap
Multiple protocols across Windows and Linux systems
Growing for both performance and capacity (PBs)
The Scale-Out / Scale-Up Dilemma
Scale-out (Isilon OneFS) vs. scale-up (other storage platforms)
Scalability
• Scale-out: performance, capacity, or both
• Scale-up: capacity only, limited performance options
Performance
• Scale-out: true linear predictability
• Scale-up: degradation of performance & capacity at scale
What does this look like?
Isilon Scale-Out NAS Architecture
[Diagram: servers in the client/application layer connect over the Ethernet layer to the OneFS operating environment; an intra-cluster communication layer links the nodes, which present a single file system/volume]
Protocols: CIFS, NFS, FTP, HTTP, HDFS for Hadoop
Single storage pool for application consolidation
Isilon Scale-Out Innovation
Simple to scale
– Manage 20+ PB like a 1 TB drive
Predictable performance
– Grows linearly
Efficient and Easy to operate
– Maximize utilization to 80%+
– Automate tiering
Highly resilient
– Survives multiple failures
Enterprise proven
– Management and protection tools that customers expect
No data migrations
More scalable than traditional storage systems
Largest and Most Scalable File System
OneFS scales from 18 TB to more than 20 PB in a single file system, single volume
Under 60 seconds to scale with no downtime
World’s fastest performance and capacity scaling
Over 100 GB/s of throughput
Gain New Levels of Efficiency
• AutoBalance automatically moves content to new storage nodes while the system is online and in production
• Eliminates “hot spots”
• Enables unmatched storage capacity utilization of more than 80%
AutoBalance
Automated data balancing across nodes reduces
costs, complexity, and risks for scaling storage
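As a rough illustration of what balancing across nodes means, the toy sketch below evens out used capacity after an empty node joins a cluster. It is not the OneFS AutoBalance algorithm, whose internals the slides do not describe; the function name, the greedy strategy, and the TB figures are all hypothetical.

```python
def rebalance(used_tb: list[float]) -> list[tuple[int, int, float]]:
    """Greedy plan of (source_node, target_node, tb_moved) transfers that
    brings every node to the cluster-wide average usage."""
    target = sum(used_tb) / len(used_tb)
    # Surplus per node: positive means the node must give data away.
    surplus = [u - target for u in used_tb]
    donors = [i for i, s in enumerate(surplus) if s > 0]
    takers = [i for i, s in enumerate(surplus) if s < 0]
    moves = []
    d = t = 0
    while d < len(donors) and t < len(takers):
        src, dst = donors[d], takers[t]
        amount = min(surplus[src], -surplus[dst])
        moves.append((src, dst, amount))
        surplus[src] -= amount
        surplus[dst] += amount
        if surplus[src] <= 1e-9:   # donor is now at the average
            d += 1
        if surplus[dst] >= -1e-9:  # taker is now at the average
            t += 1
    return moves


# Hypothetical cluster: four full 40 TB nodes plus one newly added empty node.
usage = [40.0, 40.0, 40.0, 40.0, 0.0]
plan = rebalance(usage)
for src, dst, tb in plan:
    usage[src] -= tb
    usage[dst] += tb
print(plan)
print(usage)
```

In this example each of the four original nodes hands 8 TB to the new node, leaving all five at 32 TB.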
[Figure: Isilon AutoBalance, showing nodes labeled EMPTY and FULL before balancing and BALANCED afterwards]
Optimize Resources with Automated Tiering
• Single point of management
– Single file system/single volume
– Multiple performance tiers
• Automatic data movement
– Policy-based tiering management
– Transparent reallocation
– NO application changes
• Optimize storage resources
– Automatically match storage resources with data requirements
– Eliminate data migration
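Policy-based tiering can be pictured as a rule that maps file attributes to a tier. The sketch below is a hypothetical illustration only: the age thresholds, the `choose_tier` helper, the example paths, and the mapping of file age to the S-, X-, and NL-Series tiers are assumptions, not SmartPools configuration syntax or defaults.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    days_since_access: int


# Hypothetical policy: (maximum age in days, tier), checked in order.
POLICY = [
    (30, "S-Series (performance)"),
    (180, "X-Series (collaboration)"),
]
DEFAULT_TIER = "NL-Series (active archive)"


def choose_tier(f: FileInfo) -> str:
    """Return the first tier whose age limit the file satisfies."""
    for max_age_days, tier in POLICY:
        if f.days_since_access <= max_age_days:
            return tier
    return DEFAULT_TIER


# Hypothetical files; paths are invented for the example.
files = [
    FileInfo("/ifs/run42/reads.fastq", 2),
    FileInfo("/ifs/projects/design.cad", 90),
    FileInfo("/ifs/archive/2009/scan.dcm", 900),
]
placement = {f.path: choose_tier(f) for f in files}
for path, tier in placement.items():
    print(f"{path} -> {tier}")
```

Because such a policy acts on file metadata within a single namespace, a file's path stays the same when its tier changes, which is one way to read the "NO application changes" point.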
Isilon SmartPools
• S-Series: performance
• NL-Series: active archives
• X-Series: collaboration
[Figure: files placed across the tiers; axis label: reduced cost/TB]
Unmatched Data Protection and Availability
Highly resilient, clustered architecture
With N+1 protection, data is 100% available even if a single drive or node fails
With N+2, N+3, and N+4 protection, data is 100% available if multiple drives or nodes fail
[Figure: cluster nodes each labeled 100%, with two nodes marked FAILED]
And with Isilon, the
more nodes in the
cluster, the faster
drive rebuild time
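The N+M notation can be read as N data units protected by M additional units, so that up to M simultaneous failures are survivable. The sketch below works through that reading in generic erasure-coding terms; the stripe width of 8 and both helper names are hypothetical, and the actual OneFS on-disk layout is not described in the slides.

```python
def protection_summary(data_units: int, protection_units: int) -> dict:
    """Summarize an N+M protection scheme in generic erasure-coding terms."""
    total = data_units + protection_units
    return {
        "scheme": f"{data_units}+{protection_units}",
        "failures_tolerated": protection_units,
        "usable_fraction": data_units / total,
    }


def is_available(protection_units: int, failed_units: int) -> bool:
    """Data stays readable while failures do not exceed the protection level."""
    return failed_units <= protection_units


# Hypothetical stripes with 8 data units each.
for m in (1, 2, 3, 4):
    s = protection_summary(8, m)
    print(f"N+{m}: tolerates {s['failures_tolerated']} failure(s), "
          f"{s['usable_fraction']:.0%} of raw capacity usable")
```

The trade-off this makes visible is that each extra level of protection tolerates one more failure at the cost of a smaller usable share of raw capacity.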
Interoperability for Operational Flexibility
Platform REST API
– Simplify management and integration
– Third-party application integration
VMware integration
– VAAI: vStorage APIs for Array Integration
– VASA: vSphere APIs for Storage Awareness
– Virtual Server writeable clones
Multi-protocol support
– Integrated support for industry-standard protocols
– Native HDFS support
The Cost Advantage of Scale-Out
Ease of use and management simplicity
IDC: Isilon improves IT productivity by 48%, reduces OPEX*
[Chart: FTE hours per TB in use (0.0 to 2.0), Isilon vs. traditional storage, for each task below]
Storage allocation
Storage provisioning
Managing capacity
Managing backup
Space reclamation
Adding new applications
Uploading or re-loading data
* Source: “Quantifying the Business Benefits of Scale-Out NAS Solutions,” IDC White Paper, November 2011
The Cost Advantage of Scale-Out
Reduces Big Data storage costs by 40%
[Chart: average annual cost per TB in use ($0 to $2,500), traditional vs. Isilon, broken down into OPEX, IT staff, and CAPEX]
Source: “Quantifying the Business Benefits of Scale-Out NAS Solutions,” IDC White Paper, November 2011
Research and technology explosion in scale-out storage

Contenu connexe

Tendances

HPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big DataHPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big DataHPC DAY
 
High performance computing for research
High performance computing for researchHigh performance computing for research
High performance computing for researchEsteban Hernandez
 
Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Jisc
 
High Performance Computing - The Future is Here
High Performance Computing - The Future is HereHigh Performance Computing - The Future is Here
High Performance Computing - The Future is HereMartin Hamilton
 
Interoperability and scalability with microservices in science
Interoperability and scalability with microservices in scienceInteroperability and scalability with microservices in science
Interoperability and scalability with microservices in scienceOla Spjuth
 
HPC DAY 2017 | HPE Strategy And Portfolio for AI, BigData and HPC
HPC DAY 2017 | HPE Strategy And Portfolio for AI, BigData and HPCHPC DAY 2017 | HPE Strategy And Portfolio for AI, BigData and HPC
HPC DAY 2017 | HPE Strategy And Portfolio for AI, BigData and HPCHPC DAY
 
Panasas ® Los Alamos National Laboratory
Panasas ® Los Alamos National LaboratoryPanasas ® Los Alamos National Laboratory
Panasas ® Los Alamos National LaboratoryPanasas
 
HPC DAY 2017 | FlyElephant Solutions for Data Science and HPC
HPC DAY 2017 | FlyElephant Solutions for Data Science and HPCHPC DAY 2017 | FlyElephant Solutions for Data Science and HPC
HPC DAY 2017 | FlyElephant Solutions for Data Science and HPCHPC DAY
 
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open SourceRCCSRENKEI
 
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSyncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightPrecisely
 
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSyncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSteven Totman
 
Synergy 2014 - Syn122 Moving Australian National Research into the Cloud
Synergy 2014 - Syn122 Moving Australian National Research into the CloudSynergy 2014 - Syn122 Moving Australian National Research into the Cloud
Synergy 2014 - Syn122 Moving Australian National Research into the CloudCitrix
 
spectrum Storage Whitepaper
spectrum Storage Whitepaperspectrum Storage Whitepaper
spectrum Storage WhitepaperCarina Kordan
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceinside-BigData.com
 
Best Practices in Data Center Energy Resource Management
Best Practices in Data Center Energy Resource ManagementBest Practices in Data Center Energy Resource Management
Best Practices in Data Center Energy Resource ManagementViridity Software
 

Tendances (19)

HPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big DataHPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big Data
 
Gaurav slides
Gaurav slidesGaurav slides
Gaurav slides
 
High performance computing for research
High performance computing for researchHigh performance computing for research
High performance computing for research
 
Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...
 
EPCC MSc industry projects
EPCC MSc industry projectsEPCC MSc industry projects
EPCC MSc industry projects
 
High Performance Computing - The Future is Here
High Performance Computing - The Future is HereHigh Performance Computing - The Future is Here
High Performance Computing - The Future is Here
 
Interoperability and scalability with microservices in science
Interoperability and scalability with microservices in scienceInteroperability and scalability with microservices in science
Interoperability and scalability with microservices in science
 
HPC DAY 2017 | HPE Strategy And Portfolio for AI, BigData and HPC
HPC DAY 2017 | HPE Strategy And Portfolio for AI, BigData and HPCHPC DAY 2017 | HPE Strategy And Portfolio for AI, BigData and HPC
HPC DAY 2017 | HPE Strategy And Portfolio for AI, BigData and HPC
 
Panasas ® Los Alamos National Laboratory
Panasas ® Los Alamos National LaboratoryPanasas ® Los Alamos National Laboratory
Panasas ® Los Alamos National Laboratory
 
HPC DAY 2017 | FlyElephant Solutions for Data Science and HPC
HPC DAY 2017 | FlyElephant Solutions for Data Science and HPCHPC DAY 2017 | FlyElephant Solutions for Data Science and HPC
HPC DAY 2017 | FlyElephant Solutions for Data Science and HPC
 
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
 
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSyncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
 
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSyncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
 
Synergy 2014 - Syn122 Moving Australian National Research into the Cloud
Synergy 2014 - Syn122 Moving Australian National Research into the CloudSynergy 2014 - Syn122 Moving Australian National Research into the Cloud
Synergy 2014 - Syn122 Moving Australian National Research into the Cloud
 
Welcome to HDF Workshop V
Welcome to HDF Workshop VWelcome to HDF Workshop V
Welcome to HDF Workshop V
 
ELIXIR
ELIXIRELIXIR
ELIXIR
 
spectrum Storage Whitepaper
spectrum Storage Whitepaperspectrum Storage Whitepaper
spectrum Storage Whitepaper
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
Best Practices in Data Center Energy Resource Management
Best Practices in Data Center Energy Resource ManagementBest Practices in Data Center Energy Resource Management
Best Practices in Data Center Energy Resource Management
 

En vedette

Ultra-scale e-Commerce Transaction Services with Lean Middleware
Ultra-scale e-Commerce Transaction Services with Lean Middleware Ultra-scale e-Commerce Transaction Services with Lean Middleware
Ultra-scale e-Commerce Transaction Services with Lean Middleware WSO2
 
Scale-Out Block Storage
Scale-Out Block StorageScale-Out Block Storage
Scale-Out Block StorageRandy Bias
 
Spinning Brown Donuts: Why Storage Still Counts
Spinning Brown Donuts: Why Storage Still CountsSpinning Brown Donuts: Why Storage Still Counts
Spinning Brown Donuts: Why Storage Still CountsSparkhound Inc.
 
Cloud level scalability - Nuxeo Tour 2014
Cloud level scalability - Nuxeo Tour 2014Cloud level scalability - Nuxeo Tour 2014
Cloud level scalability - Nuxeo Tour 2014Nuxeo
 
Developing High Performance and Scalable ColdFusion Applications Using Terrac...
Developing High Performance and Scalable ColdFusion Applications Using Terrac...Developing High Performance and Scalable ColdFusion Applications Using Terrac...
Developing High Performance and Scalable ColdFusion Applications Using Terrac...Shailendra Prasad
 
Product roadmap nuxeo tour 2014
Product roadmap   nuxeo tour 2014 Product roadmap   nuxeo tour 2014
Product roadmap nuxeo tour 2014 Nuxeo
 
Future of Data Storage in the Cloud
Future of Data Storage in the CloudFuture of Data Storage in the Cloud
Future of Data Storage in the CloudBret Piatt
 
Scalability and Reliability in the Cloud
Scalability and Reliability in the CloudScalability and Reliability in the Cloud
Scalability and Reliability in the Cloudgmthomps
 
Digital Pragmatism with Business Intelligence, Big Data and Data Visualisation
Digital Pragmatism with Business Intelligence, Big Data and Data VisualisationDigital Pragmatism with Business Intelligence, Big Data and Data Visualisation
Digital Pragmatism with Business Intelligence, Big Data and Data VisualisationJen Stirrup
 
Leveraging Advertising And Technology To Scale Your Business
Leveraging Advertising And Technology To Scale Your BusinessLeveraging Advertising And Technology To Scale Your Business
Leveraging Advertising And Technology To Scale Your BusinessPremier Agent | Zillow & Trulia
 
Leadership in the Big Data era
Leadership in the Big Data eraLeadership in the Big Data era
Leadership in the Big Data eraMick Yates
 
Chapter 1 introduction to scaling networks
Chapter 1   introduction to scaling networksChapter 1   introduction to scaling networks
Chapter 1 introduction to scaling networksJosue Wuezo
 
Future Information Growth And Storage Device Reliability 2007
Future Information Growth And Storage Device Reliability 2007Future Information Growth And Storage Device Reliability 2007
Future Information Growth And Storage Device Reliability 2007Andrei Khurshudov
 
2014 Future of Cloud Computing - 4th Annual Survey Results
2014 Future of Cloud Computing - 4th Annual Survey Results2014 Future of Cloud Computing - 4th Annual Survey Results
2014 Future of Cloud Computing - 4th Annual Survey ResultsMichael Skok
 
Creative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage StartupsCreative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage StartupsTommaso Di Bartolo
 
IT in Healthcare
IT in HealthcareIT in Healthcare
IT in HealthcareNetApp
 

En vedette (17)

Ultra-scale e-Commerce Transaction Services with Lean Middleware
Ultra-scale e-Commerce Transaction Services with Lean Middleware Ultra-scale e-Commerce Transaction Services with Lean Middleware
Ultra-scale e-Commerce Transaction Services with Lean Middleware
 
Scale-Out Block Storage
Scale-Out Block StorageScale-Out Block Storage
Scale-Out Block Storage
 
Spinning Brown Donuts: Why Storage Still Counts
Spinning Brown Donuts: Why Storage Still CountsSpinning Brown Donuts: Why Storage Still Counts
Spinning Brown Donuts: Why Storage Still Counts
 
Cloud level scalability - Nuxeo Tour 2014
Cloud level scalability - Nuxeo Tour 2014Cloud level scalability - Nuxeo Tour 2014
Cloud level scalability - Nuxeo Tour 2014
 
Developing High Performance and Scalable ColdFusion Applications Using Terrac...
Developing High Performance and Scalable ColdFusion Applications Using Terrac...Developing High Performance and Scalable ColdFusion Applications Using Terrac...
Developing High Performance and Scalable ColdFusion Applications Using Terrac...
 
Product roadmap nuxeo tour 2014
Product roadmap   nuxeo tour 2014 Product roadmap   nuxeo tour 2014
Product roadmap nuxeo tour 2014
 
Future of Data Storage in the Cloud
Future of Data Storage in the CloudFuture of Data Storage in the Cloud
Future of Data Storage in the Cloud
 
Scalability and Reliability in the Cloud
Scalability and Reliability in the CloudScalability and Reliability in the Cloud
Scalability and Reliability in the Cloud
 
Digital Pragmatism with Business Intelligence, Big Data and Data Visualisation
Digital Pragmatism with Business Intelligence, Big Data and Data VisualisationDigital Pragmatism with Business Intelligence, Big Data and Data Visualisation
Digital Pragmatism with Business Intelligence, Big Data and Data Visualisation
 
Leveraging Advertising And Technology To Scale Your Business
Leveraging Advertising And Technology To Scale Your BusinessLeveraging Advertising And Technology To Scale Your Business
Leveraging Advertising And Technology To Scale Your Business
 
Leadership in the Big Data era
Leadership in the Big Data eraLeadership in the Big Data era
Leadership in the Big Data era
 
Chapter 1 introduction to scaling networks
Chapter 1   introduction to scaling networksChapter 1   introduction to scaling networks
Chapter 1 introduction to scaling networks
 
Future Information Growth And Storage Device Reliability 2007
Future Information Growth And Storage Device Reliability 2007Future Information Growth And Storage Device Reliability 2007
Future Information Growth And Storage Device Reliability 2007
 
Scalability Design Principles - Internal Session
Scalability Design Principles - Internal SessionScalability Design Principles - Internal Session
Scalability Design Principles - Internal Session
 
2014 Future of Cloud Computing - 4th Annual Survey Results
2014 Future of Cloud Computing - 4th Annual Survey Results2014 Future of Cloud Computing - 4th Annual Survey Results
2014 Future of Cloud Computing - 4th Annual Survey Results
 
Creative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage StartupsCreative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage Startups
 
IT in Healthcare
IT in HealthcareIT in Healthcare
IT in Healthcare
 

Similaire à Research and technology explosion in scale-out storage

Transform Your Business with Big Data Storage
Transform Your Business with Big Data StorageTransform Your Business with Big Data Storage
Transform Your Business with Big Data StorageEMC
 
EMC Isilon Solutions for Data Archives
EMC Isilon Solutions for Data ArchivesEMC Isilon Solutions for Data Archives
EMC Isilon Solutions for Data Archivessolarisyougood
 
The Marriage of the Data Lake and the Data Warehouse and Why You Need Both
The Marriage of the Data Lake and the Data Warehouse and Why You Need BothThe Marriage of the Data Lake and the Data Warehouse and Why You Need Both
The Marriage of the Data Lake and the Data Warehouse and Why You Need BothAdaryl "Bob" Wakefield, MBA
 
Tendencias Storage
Tendencias StorageTendencias Storage
Tendencias StorageFran Navarro
 
Times ten 18.1_overview_meetup
Times ten 18.1_overview_meetupTimes ten 18.1_overview_meetup
Times ten 18.1_overview_meetupByung Ho Lee
 
Workload Centric Scale-Out Storage for Next Generation Datacenter
Workload Centric Scale-Out Storage for Next Generation DatacenterWorkload Centric Scale-Out Storage for Next Generation Datacenter
Workload Centric Scale-Out Storage for Next Generation DatacenterCloudian
 
Chip ICT | Hgst storage brochure
Chip ICT | Hgst storage brochureChip ICT | Hgst storage brochure
Chip ICT | Hgst storage brochureMarco van der Hart
 
The Architecture of Decoupling Compute and Storage with Alluxio
The Architecture of Decoupling Compute and Storage with AlluxioThe Architecture of Decoupling Compute and Storage with Alluxio
The Architecture of Decoupling Compute and Storage with AlluxioAlluxio, Inc.
 
times ten in-memory database for extreme performance
times ten in-memory database for extreme performancetimes ten in-memory database for extreme performance
times ten in-memory database for extreme performanceOracle Korea
 
Accelerating Analytics for the Future of Genomics
Accelerating Analytics for the Future of GenomicsAccelerating Analytics for the Future of Genomics
Accelerating Analytics for the Future of GenomicsAmazon Web Services
 
Clouds, Grids and Data
Clouds, Grids and DataClouds, Grids and Data
Clouds, Grids and DataGuy Coates
 
Storage For Science Wp
Storage For Science WpStorage For Science Wp
Storage For Science Wpsydcarr
 
Data core overview - haluk-final
Data core overview - haluk-finalData core overview - haluk-final
Data core overview - haluk-finalHaluk Ulubay
 
Emc isilon technical deep dive workshop
Emc isilon technical deep dive workshopEmc isilon technical deep dive workshop
Emc isilon technical deep dive workshopsolarisyougood
 
Systems oracle overview_hardware
Systems oracle overview_hardwareSystems oracle overview_hardware
Systems oracle overview_hardwareFran Navarro
 
Webinar: What's Best for VDI, Hybrid or All-Flash Storage?
Webinar: What's Best for VDI, Hybrid or All-Flash Storage?Webinar: What's Best for VDI, Hybrid or All-Flash Storage?
Webinar: What's Best for VDI, Hybrid or All-Flash Storage?Storage Switzerland
 
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...IBM India Smarter Computing
 
S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3Tony Pearson
 
Scale-on-Scale : Part 3 of 3 - Disaster Recovery
Scale-on-Scale : Part 3 of 3 - Disaster RecoveryScale-on-Scale : Part 3 of 3 - Disaster Recovery
Scale-on-Scale : Part 3 of 3 - Disaster RecoveryScale Computing
 

Similaire à Research and technology explosion in scale-out storage (20)

Transform Your Business with Big Data Storage
Transform Your Business with Big Data StorageTransform Your Business with Big Data Storage
Transform Your Business with Big Data Storage
 
EMC Isilon Solutions for Data Archives
EMC Isilon Solutions for Data ArchivesEMC Isilon Solutions for Data Archives
EMC Isilon Solutions for Data Archives
 
The Marriage of the Data Lake and the Data Warehouse and Why You Need Both
The Marriage of the Data Lake and the Data Warehouse and Why You Need BothThe Marriage of the Data Lake and the Data Warehouse and Why You Need Both
The Marriage of the Data Lake and the Data Warehouse and Why You Need Both
 
Tendencias Storage
Tendencias StorageTendencias Storage
Tendencias Storage
 
Times ten 18.1_overview_meetup
Times ten 18.1_overview_meetupTimes ten 18.1_overview_meetup
Times ten 18.1_overview_meetup
 
Workload Centric Scale-Out Storage for Next Generation Datacenter
Workload Centric Scale-Out Storage for Next Generation DatacenterWorkload Centric Scale-Out Storage for Next Generation Datacenter
Workload Centric Scale-Out Storage for Next Generation Datacenter
 
Chip ICT | Hgst storage brochure
Chip ICT | Hgst storage brochureChip ICT | Hgst storage brochure
Chip ICT | Hgst storage brochure
 
Emc data domain
Emc data domainEmc data domain
Emc data domain
 
The Architecture of Decoupling Compute and Storage with Alluxio
The Architecture of Decoupling Compute and Storage with AlluxioThe Architecture of Decoupling Compute and Storage with Alluxio
The Architecture of Decoupling Compute and Storage with Alluxio
 
times ten in-memory database for extreme performance
times ten in-memory database for extreme performancetimes ten in-memory database for extreme performance
times ten in-memory database for extreme performance
 
Accelerating Analytics for the Future of Genomics
Accelerating Analytics for the Future of GenomicsAccelerating Analytics for the Future of Genomics
Accelerating Analytics for the Future of Genomics
 
Clouds, Grids and Data
Clouds, Grids and DataClouds, Grids and Data
Clouds, Grids and Data
 
Storage For Science Wp
Storage For Science WpStorage For Science Wp
Storage For Science Wp
 
Data core overview - haluk-final
Data core overview - haluk-finalData core overview - haluk-final
Data core overview - haluk-final
 
Emc isilon technical deep dive workshop
Emc isilon technical deep dive workshopEmc isilon technical deep dive workshop
Emc isilon technical deep dive workshop
 
Systems oracle overview_hardware
Systems oracle overview_hardwareSystems oracle overview_hardware
Systems oracle overview_hardware
 
Webinar: What's Best for VDI, Hybrid or All-Flash Storage?
Webinar: What's Best for VDI, Hybrid or All-Flash Storage?Webinar: What's Best for VDI, Hybrid or All-Flash Storage?
Webinar: What's Best for VDI, Hybrid or All-Flash Storage?
 
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
Positioning IBM Flex System 16 Gb Fibre Channel Fabric for Storage-Intensive ...
 
S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3
 
Scale-on-Scale : Part 3 of 3 - Disaster Recovery
Scale-on-Scale : Part 3 of 3 - Disaster RecoveryScale-on-Scale : Part 3 of 3 - Disaster Recovery
Scale-on-Scale : Part 3 of 3 - Disaster Recovery
 

Research and technology explosion in scale-out storage

  • 1. Research and Technology Explosion in the Scale-Out Storage Era: exploring the new frontier of perpetual data growth and how it will affect us. Ryan Sayre, Technical Strategist, EMEA, EMC ISD Office of the CTO, June 2013. © Copyright 2013 EMC Corporation. All rights reserved.
  • 2. What Is Big Data? “Data that challenges the capabilities of a system to capture, manage, and process it within an acceptable elapsed time” (Wikipedia)
  • 3. The Big Data Challenge. By 2013, 80% of all storage capacity sold will be for file-based data. [Chart: exabytes per year, 2009 to 2014; file based: 61.8% CAGR, block based: 23.7% CAGR.] Segments shown: Media & Entertainment, Design & Simulation, Healthcare, Bioinformatics, Data Analytics, File Shares & Archives. Source: “Scale Out Storage in the Content Driven Enterprise: Unleashing the Value of Information Assets,” IDC White Paper (2010 Enterprise Disk Storage Consumption Model), June 2011
  • 4. Genomics Size. [Graphic comparing data volume for EMR, Radiology, and Genomics; labels: 1000, Volume 50GB*.] 88 million outpatient visits to NHS hospitals in 2010/2011. *Finished data. Sources: Dr. Halamka, BIDMC; S. Joshi, internal research; HIMSS; internal EMC data
  • 5. [Image-only slide; no text.]
  • 6. Bioinformatics: A “data tsunami”
      • Already a cliché in 2006: “Data Deluge”, “Data Tsunami”, and so on
      • What changed starting in 2007: terabyte-scale laboratory instruments (“Next Generation” DNA sequencers; confocal microscopy and live cell imaging; other imaging such as fMRI, CT, and ultrasound)
      • 2010: Faster adoption of next-generation sequencing
      • 2013: Scale-out storage is the only way to keep surviving!
  • 7. Vast quantities of data
      • Terabyte-scale issues have traditionally been “lab” or “workgroup” problems
      • Individual researchers and lab instruments can generate terabyte volumes of data per experiment: an average of 40TB of storage for each Solexa instrument, and a recent “100TB single-namespace” project was for a lab with a single 454 instrument
  • 8. Sequencing throughput over time (data from one vendor’s platform). [Chart: gigabases of sequence per run, axis from 0 to 20; annotation: 15x.]
  • 9. Throughput Outpacing Moore’s Law
      • 1000 Genomes Project: could generate 90 Tbase of raw data (at 30x coverage)
      • International Cancer Genome Consortium: 50,000+ samples could generate 5,000 Tbase of raw data
      • [Chart: kb/day on a logarithmic axis from 1 to 1,000,000, from 1996 to today, with a CPU series for comparison.]
  • 10. Sequencer capacity is growing enormously. [Chart: G per instrument (axis from 0 to 80) over time.]
      • Dependent infrastructure has become a significant and critical factor
      • Home-grown storage and compute resources are capable of supporting data reduction and alignment
      • Specialized HPC and storage architectures are required to meet aggregate throughput and processing demands
      • Current HPC architectures can be resource prohibitive at the quantity required to manage data output
  • 11. Broad Institute Sequencing Data
  • 12. Big Data Apps Need Big Data Storage. Data-intensive HPC workflows: Medical Imaging, Gene Sequencing, Seismic Exploration, Media & Entertainment, Product Development, Satellite Images
  • 13. Big Data Archive Challenge
      • Relentless data growth
      • Primary storage overloaded with unstructured files: constant upgrade requirements
      • Performance issues: hinders regulatory responses and e-discovery applications
      • Storage islands: many systems or 2-way clusters and points of management; numerous file systems/volumes
  • 14. My own Big Data growth story
      • Started out with 1 terabyte of shared storage in 2004, used for image processing and visualisation
      • Quickly grew to 5 terabytes within 5 months
      • Was worrying about storage every day and needed a way out
      • Transitioned to scale-out and scaled to 300 TB within 3 years
      • Current organisation has over 2 petabytes of storage, with no dedicated storage administrator; I/O patterns are now managed by policy and tier
  • 15. UK Case Study: Life Sciences Institute
      • Bioinformatics organisation needing not only to store but also to cross-reference multiple genome types to create a mega database of genomic structural variants across all species
      • Share across multiple organisations across the UK and into greater Europe
      • Need to grow to 20 petabytes and beyond
  • 16. UK Case Study: Engineering Design Automation
      • Performance requirements of over 1 million operations a second to simulate complex electrical pathways
      • Time to market required more rapid simulations to advance the technology roadmap
      • Multiple protocols across Windows and Linux systems
      • Growing for both performance and capacity (PBs)
  • 17. The Scale-Out / Scale-Up Dilemma: scale-out (Isilon OneFS) compared with scale-up (other storage platforms)
      • Scalability: scale-out grows performance, capacity, or both; scale-up grows capacity only, with limited performance options
      • Performance: scale-out offers true linear predictability; scale-up shows degradation of performance and capacity at scale
  • 18. What does this look like?
  • 19. Isilon Scale-Out NAS Architecture. [Diagram labels: Client/Application Layer (servers); Ethernet Layer; protocols CIFS, NFS, FTP, HTTP, and HDFS for Hadoop; OneFS Operating Environment (single FS/volume); Intra-cluster Communication Layer.]
  • 20. Isilon Scale-Out Innovation: single storage pool for application consolidation
      • Simple to scale: manage 20+ PB like a 1TB drive
      • Predictable performance: grows linearly
      • Efficient and easy to operate: maximize utilization to 80%+; automate tiering
      • Highly resilient: survives multiple failures
      • Enterprise proven: management and protection tools that customers expect
      • No data migrations
  • 21. Largest and Most Scalable File System: more scalable than traditional storage systems
      • OneFS scales from 18 TB to more than 20 PB in a single file system, single volume
      • Under 60 seconds to scale, with no downtime
      • World’s fastest performance and capacity scaling
      • Over 100 GB/s of throughput
  • 22. Gain New Levels of Efficiency: Isilon AutoBalance. Automated data balancing across nodes reduces costs, complexity, and risks for scaling storage
      • AutoBalance automatically moves content to new storage nodes while the system is online and in production
      • Eliminates “hot spots”
      • Enables unmatched storage capacity utilization of more than 80%
      • [Graphic: nodes labeled FULL and EMPTY become BALANCED.]
  • 23. Optimize Resources with Automated Tiering: Isilon SmartPools
      • Single point of management: single file system/single volume; multiple performance tiers
      • Automatic data movement: policy-based tiering management; transparent reallocation; NO application changes
      • Optimize storage resources: automatically match storage resources with data requirements; eliminate data migration
      • [Graphic: files placed across tiers at reduced cost/TB: S-Series (performance), X-Series (collaboration), NL-Series (active archives).]
  • 24. Unmatched Data Protection and Availability: highly resilient, clustered architecture
      • With N+1 protection, data is 100% available even if a single drive or node fails
      • With N+2, N+3, and N+4 protection, data is 100% available if multiple drives or nodes fail
      • And with Isilon, the more nodes in the cluster, the faster the drive rebuild time
  • 25. Interoperability for Operational Flexibility
      • Platform REST API: simplify management and integration; third-party application integration
      • VMware integration: VAAI (vStorage APIs for Array Integration); VASA (vSphere APIs for Storage Awareness); virtual server writeable clones
      • Multi-protocol support: integrated support for industry-standard protocols; native HDFS support
  • 26. The Cost Advantage of Scale-Out: ease of use and management simplicity. IDC: Isilon improves IT productivity by 48%, reduces OPEX*. [Chart: FTE hours per TB in use (axis from 0.0 to 2.0), Isilon vs. traditional, for storage allocation, storage provisioning, managing capacity, managing backup, space reclamation, adding new applications, and uploading or re-loading data.] * Source: “Quantifying the Business Benefits of Scale-Out NAS Solutions,” IDC White Paper, November 2011
  • 27. The Cost Advantage of Scale-Out: reduces Big Data storage costs by 40%. [Chart: average annual cost per TB in use (axis from $0 to $2,500), traditional vs. Isilon, broken down into OPEX, IT staff, and CAPEX.] Source: “Quantifying the Business Benefits of Scale-Out NAS Solutions,” IDC White Paper, November 2011
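
The raw-data estimates on the “Throughput Outpacing Moore’s Law” slide can be checked with simple arithmetic. The following is a minimal sketch, not the presenter's own calculation. It assumes a human genome of roughly 3 gigabases and 1,000 samples for the 1000 Genomes Project; both values are assumptions, since the slide states only the totals, the 30x coverage, and the 50,000+ sample count. The function names are hypothetical.

```python
# Assumption (not stated on the slide): one human genome is about 3 gigabases.
HUMAN_GENOME_GBASE = 3.0


def raw_tbase(samples, coverage, genome_gbase=HUMAN_GENOME_GBASE):
    """Raw sequence output, in terabases, for a sequencing project."""
    return samples * coverage * genome_gbase / 1000.0


def implied_coverage(total_tbase, samples, genome_gbase=HUMAN_GENOME_GBASE):
    """Coverage depth implied by a quoted raw-data total."""
    return total_tbase * 1000.0 / (samples * genome_gbase)


if __name__ == "__main__":
    # 1000 Genomes Project: 1,000 samples (assumed) at 30x coverage.
    print(f"1000 Genomes: {raw_tbase(1000, 30):.0f} Tbase")
    # ICGC: the slide quotes 5,000 Tbase for 50,000+ samples.
    print(f"ICGC at 30x: {raw_tbase(50_000, 30):.0f} Tbase")
    print(f"ICGC implied coverage: {implied_coverage(5000, 50_000):.1f}x")
```

Under these assumptions the first figure reproduces the slide's 90 Tbase. The quoted 5,000 Tbase for the International Cancer Genome Consortium corresponds to slightly more than 30x coverage, or to somewhat more than 50,000 samples.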

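The exabyte estimate in the speaker notes (a quarter of the 88 million NHS outpatients opting for genomic analysis) appears to follow from the 50GB volume shown on the “Genomics Size” slide. The sketch below reproduces that arithmetic under stated assumptions: 50GB is taken as the finished data per genome, decimal units are used (1 EB = 10^9 GB), and the constant and function names are hypothetical.

```python
OUTPATIENT_VISITS = 88_000_000   # NHS outpatient visits, 2010/2011 (from the slide)
OPT_IN_FRACTION = 0.25           # "a quarter of the patients" (from the notes)
GENOME_GB = 50                   # assumption: 50GB of finished data per genome
GB_PER_EB = 1e9                  # decimal units: 1 EB = 10^9 GB


def genomics_storage_eb(visits, fraction, gb_each):
    """Total storage, in exabytes, if a fraction of patients are sequenced."""
    return visits * fraction * gb_each / GB_PER_EB


def days_to_process(total_eb, pb_per_day):
    """Days needed to process a volume at a given daily rate (1 EB = 1000 PB)."""
    return total_eb * 1000.0 / pb_per_day


if __name__ == "__main__":
    total = genomics_storage_eb(OUTPATIENT_VISITS, OPT_IN_FRACTION, GENOME_GB)
    print(f"Estimated genomics storage: {total:.2f} EB")
    # 20,000 TB/day (20 PB/day) is the 2008 figure in the cited TechCrunch URL.
    print(f"Days at 20 PB/day: {days_to_process(total, 20):.0f}")
```

At the 2008 rate of 20 PB per day the volume would take about 55 days to process, so the “about 10 days” comparison in the notes implies a rate of roughly 110 PB per day after extrapolating for growth.
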
Editor's notes

  1. High Performance Computing has changed the way we manage our scientific endeavours in the UK and beyond. The evolution of scale-out compute infrastructure also affects the way we store data. The traditional islands of data storage used in previous eras cannot scale to meet the current challenges of bioinformatics, complex scientific simulations, and technical innovation. Scaling out is the only way to manage the size of the problems being solved today. UK case studies in research and technology, and related opportunities, will be discussed.
  2. Note to Presenter: View in Slide Show mode for animation. We hear a lot about Big Data, but sometimes the definition isn’t clear. Here is a useful definition from Wikipedia: Big Data is data that challenges the capabilities of a system to capture, manage, and process it within a tolerable elapsed time. In the context of today’s presentation, the two key attributes we’ll be discussing are the volume of the data and its composition. In terms of “volume,” we’ll focus on the multi-terabyte to multi-petabyte range. For “composition,” we’ll focus primarily on unstructured, file-based data. In this context, Big Data includes audio, video, graphics, images, and enterprise file data sets such as office files, home directories, VMDKs, and large-scale file archives. Isilon supports all kinds of unstructured and file-based data.
  3. Source for the 88 million outpatient visits: http://www.nhsconfed.org/priorities/political-engagement/Pages/NHS-statistics.aspx. If a quarter of the patients opted for genomic analysis because of a possible genetic factor in their health, this would amount to over an exabyte of storage. To put that into perspective, that is about 10 days’ worth of the data that all of the servers at Google process, extrapolating for data growth from the 2008 figure reported at http://techcrunch.com/2008/01/09/google-processing-20000-terabytes-a-day-and-growing/
  4. Here’s an example of one of these next-generation sequencing machines. It’s beautiful, and it can output a lot of useful data that scientists can sift through to discover meaning.
  5. These are prime examples of data-intensive industries where Isilon storage systems have been proven to deliver significant customer benefits: Medical Imaging; Gene Sequencing; Seismic Exploration in the Oil & Gas industry; Video & Graphics (Media & Entertainment); Satellite Images; Product Development. Companies in these industries have been at the leading edge because large-scale files and unstructured data (Big Data) have caused these firms to adopt innovative storage approaches and embrace Isilon.
  6. Legacy scale-up file systems and volume sizes are inadequate. This leads to multiple file systems and hundreds of volumes, which increases management overhead, lowers capacity efficiency, and adds complexity.
  7. Here we see the dilemma of scale-out and scale-up in graphic form. Scalability: scale-up achieves capacity growth only, with limited performance options. In contrast, scale-out provides both performance and capacity scalability. Performance: with scale-up, we see a true degradation of performance and capacity at scale. In contrast, scale-out has true linear predictability.
  8. Isilon scale-out NAS is an ideal storage platform for consolidation of your application data. Note to Presenter: Click now in Slide Show mode for animation. We’ll go into these capabilities in more detail later, but here is a summary of a number of important innovations from Isilon:
      • Isilon storage is easy to scale and can support over 20PB of data in a single Isilon cluster
      • Unlike traditional storage alternatives, Isilon storage performance increases linearly with growth in storage capacity
      • Isilon storage is highly efficient; you can achieve over 80 percent storage utilization with Isilon’s scale-out NAS solutions
      • Isilon’s storage systems are highly resilient and can maintain 100 percent data availability, even with multiple component failures (including disk drives or entire nodes)
      • Isilon provides a comprehensive portfolio of data protection and management software to help you get the full value of your Isilon storage systems
      • And with Isilon, you never need to migrate data again
  9. Note to Presenter: Click now in Slide Show mode for animation. This slide shows how Isilon SmartPools software can help you optimize storage resources with automated tiering. SmartPools is integrated with the Isilon OneFS operating system to allow a single point of management, with a single scalable file system that offers multiple tiers of performance, depending on the data. The automated, policy-based data movement is transparent to the users, and there are no application changes required.
  10. Note to Presenter: View in Slide Show mode for animation. Isilon storage systems are highly resilient and provide unmatched data protection and availability. Isilon uses the proven Reed-Solomon erasure encoding algorithm rather than RAID to provide a level of data protection that goes far beyond traditional storage systems. Here is an example of the flexibility and types of data protection that are standard in an Isilon cluster: with N+1 protection, data is 100 percent available even if a single drive or node fails. This is similar to RAID 5 in conventional storage.
      Note to Presenter: Click now in Slide Show mode for animation. N+2 protection allows two components to fail within the system, similar to RAID 6. With N+3 or N+4 protection, three or four components can fail while the data remains 100 percent available. Isilon FlexProtect is the foundation for data resiliency and availability in Isilon storage solutions.
      Legacy “scale-up” systems are still dependent on traditional data protection. They typically use traditional RAID, which consumes 30 to 50 percent of the available disk capacity. The time to rebuild a RAID group after a drive failure continues to increase with drive capacity, and data is susceptible to loss from a two-disk failure.
      Isilon’s industry-leading data protection will provide 100 percent accessibility to data with one-, two-, three-, or four-node failures in a pool. Data protection levels can be established at the file, directory, or file system level, so all data can be treated independently and SLAs can be met based on the application or type of data.
      Due to the distributed yet symmetric nature of the cluster, all nodes participate in accelerating the restoration of the portions of files from a failed drive. As the cluster grows, rebuild times become faster and more efficient, making the adoption of larger-capacity drives very simple. With Isilon, a replaced drive can be rebuilt quickly: the larger the storage system, the faster the rebuild. In Isilon solutions, drives are also hot pluggable and hot swappable with no downtime.
  11. With Isilon, you can streamline your storage infrastructure by consolidating large-scale file and unstructured data assets, eliminating silos of storage.
      • Platform REST API: Isilon solutions incorporate a platform REST (representational state transfer) API to provide you and third-party ISVs with a robust control interface to the Isilon OneFS operating system for further automation, orchestration, and provisioning of your Isilon storage cluster.
      • VMware integration: Isilon storage solutions readily integrate with your VMware environment and incorporate VMware VAAI and VASA APIs to simplify storage management in your virtualized IT environment.
      • Multi-protocol support: Isilon scale-out NAS includes integrated support for a wide range of industry-standard protocols, including NFS, SMB, HTTP, FTP, and native Hadoop HDFS, to simplify your business analytics initiatives, simplify and consolidate workflows, increase flexibility, and get more value from your enterprise applications and data.
      These levels of interoperability help you leverage your large data assets more flexibly with a broad range of applications and workloads, and across a diverse IT infrastructure environment.
  12. Isilon storage systems are extremely easy to use. This “simple to manage” approach translates into significant cost savings for you. A recent IDC white paper details Isilon’s cost advantages for enterprise environments. As shown in the graphic on the left, IDC investigated the relative amount of time needed by IT professionals to perform a wide range of data and storage management functions (listed on the y axis) for Isilon as well as for traditional storage systems. Isilon storage is easier to manage and requires less time. The study showed that with Isilon scale-out NAS, enterprises were able to increase IT productivity by 48 percent and thereby reduce OPEX (operating expenditures).
  13. The IDC study also found that as a result of Isilon storage systems’ unmatched efficiency (over 80 percent storage utilization), organizations were able to reduce CAPEX (capital expenditures) significantly. With the reduced CAPEX and the increase in IT productivity, enterprise customers were able to reduce their overall storage costs by 40 percent with Isilon scale-out NAS (compared to traditional storage systems).
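
The N+M protection idea described in the notes can be illustrated with a small sketch. This is not Isilon's FlexProtect implementation and not Reed-Solomon coding: it uses plain XOR parity, which covers only the N+1 case (one lost block per stripe), as a stand-in. The stripe widths passed to the overhead calculation are illustrative assumptions, and all function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def protect(data_blocks):
    """Return an N+1 stripe: the N data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


def protection_overhead(n, m):
    """Fraction of raw capacity consumed by M protection blocks per N data blocks."""
    return m / (n + m)


if __name__ == "__main__":
    stripe = protect([b"ACGT", b"TTAG", b"GGCA"])
    # Simulate losing the second data block and rebuilding it from the rest.
    assert rebuild(stripe, 1) == b"TTAG"
    # Hypothetical stripe widths, chosen only to show how overhead scales.
    for n, m in [(4, 1), (8, 2), (16, 4)]:
        print(f"N+M = {n}+{m}: overhead {protection_overhead(n, m):.0%}")
```

With wider stripes the protection overhead falls (for example, 8+2 uses 20 percent of raw capacity), which is the general reason erasure-coded layouts can be more space-efficient than the 30 to 50 percent RAID overhead cited in the notes. Tolerating two or more simultaneous failures requires a code such as Reed-Solomon rather than single XOR parity.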