DDN: Massively-Scalable Platforms and Solutions Engineered for the Big Data and Cloud Era
1. ddn.com ©2013 DataDirect Networks. All Rights Reserved.
2013 ISC HPC User Group
Massively-Scalable Platforms and Solutions
Engineered for the Big Data and Cloud Era
June, 2013
Dr. James Coomer
Senior Technical Advisor
2.
HPC Drives DDN Forward
DDN’s Multi-Year Investment in HPC & Exascale Computing
€75,000,000 Exascale Investment
2013 Academic Research Prize
€75,000 WARP Prize
A DOE FastForward I/O Partner
3.
The High Performance Leader, Again.
System Performance: 1TB/s+
Capacity: 40.3PB (raw)
File System: Lustre®
I/O Platform: 36 x DDN SFA12K-40
Media: 20,160 HDDs
DDN and ORNL Are Building
The World’s Fastest Storage
System; Supporting Titan
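As a rough sanity check, the headline figures on this slide can be divided across the 36 appliances. The sketch below assumes decimal units (1 TB/s = 1,000 GB/s, 1 PB = 1,000 TB) and a perfectly even split of bandwidth and drives; the slide states neither, and the function name is only illustrative.

```python
# Per-appliance breakdown of the ORNL/Titan figures quoted on the slide.
# Assumptions (not from the slide): decimal units and an even split.

SYSTEM_BANDWIDTH_GBS = 1000.0   # "1TB/s+" taken as a 1,000 GB/s floor
RAW_CAPACITY_PB = 40.3          # raw capacity from the slide
APPLIANCES = 36                 # 36 x DDN SFA12K-40
HDD_COUNT = 20_160              # media count from the slide


def per_appliance_breakdown():
    """Return (GB/s per appliance, drives per appliance, TB per drive)."""
    gbs_each = SYSTEM_BANDWIDTH_GBS / APPLIANCES
    drives_each = HDD_COUNT / APPLIANCES
    tb_per_drive = RAW_CAPACITY_PB * 1000 / HDD_COUNT
    return gbs_each, drives_each, tb_per_drive


if __name__ == "__main__":
    gbs, drives, tb = per_appliance_breakdown()
    print(f"{gbs:.1f} GB/s per appliance")
    print(f"{drives:.0f} drives per appliance")
    print(f"{tb:.2f} TB per drive")
```

Under these assumptions each appliance would need to sustain roughly 28 GB/s across 560 drives of about 2 TB each.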
4.
DDN | The “Big” In Big Data
800%
PayPal accelerates
stream processing
and fraud analytics
by 8x with DDN,
saves $100Ms.
1TB/s
The world’s fastest
file system, to power
the US’s fastest
supercomputer, is
powered by DDN.
Tier 1
Tier1 CDN accelerates
the world’s video traffic
using DDN technology
to exceed customer
SLAs.
If any company is well poised to take on the challenges of exascale
computing and big data, it's DDN, since this is its heritage.
451 Group
5.
HPC Drives DDN’s Success
2011: $7B
2014: $12B
19% Y/Y Growth
Commercial HPC
LifeSci, FSI, O&G, MFG
Web/Cloud
Service Providers
Media
HPC Labs &
Universities
Government &
Digital Security
6.
DDN | The Technology Behind The World’s
Leading Data-Driven Organizations
HPC &
Big Data Analysis
Cloud &
Web Infrastructure
Content Security
7.
Multi-Dimensional Product Portfolio
Core Storage
SFA: The World’s
Fastest Core Storage
Appliances
Analytics
hScaler:
High-Speed Hadoop
Infrastructure
Object Storage
WOS: Cloud-Enabled Storage,
Real-time Collaboration, Distributed
Data Protection, Massive Scale
8.
The Broadest Portfolio For Big Data
Application-Defined Storage for Ingest, Processing & Distribution
Traditional Storage Access
▶ Parallel File Storage & NAS
▶ MapReduce Computing
▶ Data Warehousing
▶ In-Memory Computing
Cloud Storage Access
▶ Web Services Storage
▶ Archival & Archive Analytics
▶ Content Distribution
▶ Disaster Recovery
Integrated Big Data Platforms
Scale Out
Cloud Storage Tiering
Scale Up
Blocks | GridScaler | NAS | Lustre® | Hadoop™
Objects | NAS | CDN
9.
Application-Defined Storage Intelligence
DDN’s Big Data Storage Access
Parallel File System Namespace
DDN GridScaler | Lustre® | Hadoop
API
API
10.
DDN | Storage Fusion Architecture (SFA)
Software Stack
Accelerating Big Data and Cloud, Optimizing TCO
Over 1 Million Lines of S/W Code – First Customer Shipped 2008
Designed Specifically for Big Data and Cloud Workloads
Parallel State-Machine Design
Maximum Performance, Lowest
Latency
Virtualized Processing
Optimized Environment for Big Data
Application Hosting
Robust Data Protection
Quality of Service and Performance
Without Compromise
Flexible & Massively Scalable
Best-In-Class Scalability and Density
Storage Fusion Architecture™
[Core Storage S/W Engine]
In-Storage Processing™ Engine & DMA Driver
DirectMon™: Infrastructure Management
Low-Latency Connect: FC, IB, Memory
Real-Time, Interrupt-Free Storage Processing
ReACT™ Adaptive Cache Technology
DirectProtect™ Data Integrity Management
Quality of Service Engine
Storage Fusion Fabric™
Storage Fusion Xcelerator (SFX™) Flash Caching
EXAScaler™
Lustre® Storage
GRIDScaler™
hScaler™
Hadoop/HDFS
11.
SFA12K-40 Performance Summary
EXAScaler                Write        Read
Raw Device               32.6 GB/s    39.9 GB/s
EXAScaler (obdfilter)    28.5 GB/s    33.4 GB/s
RAID 6; DirectProtect DIF: On; ReACT: On; 1 MB I/Os
GRIDScaler               Write        Read
Raw Device               32.6 GB/s    39.9 GB/s
GRIDScaler (IOR)         32.3 GB/s    35.6 GB/s
RAID 6; DirectProtect DIF: On; ReACT: On; 4 MB I/Os
The World’s Fastest HPC Storage Foundation
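One way to read the performance summary is as the fraction of raw-device throughput that survives the file-system layer. The sketch below computes that ratio from the slide's figures; the dictionary layout and function names are illustrative only, and the two file systems were measured with different benchmarks and I/O sizes (obdfilter at 1 MB, IOR at 4 MB), so the rows are not a like-for-like comparison.

```python
# File-system throughput as a percentage of raw-device throughput (GB/s),
# using the figures from the SFA12K-40 performance summary slide.

RAW = {"write": 32.6, "read": 39.9}
FILE_SYSTEM = {
    "EXAScaler (obdfilter)": {"write": 28.5, "read": 33.4},
    "GRIDScaler (IOR)": {"write": 32.3, "read": 35.6},
}


def efficiency_pct(fs_gbs, raw_gbs):
    """Share of raw-device bandwidth delivered through the file system."""
    return 100.0 * fs_gbs / raw_gbs


def efficiency_table():
    """Return {file system: {operation: efficiency in percent}}."""
    return {
        name: {op: round(efficiency_pct(gbs, RAW[op]), 1) for op, gbs in rates.items()}
        for name, rates in FILE_SYSTEM.items()
    }


if __name__ == "__main__":
    for name, row in efficiency_table().items():
        print(f"{name}: write {row['write']}%, read {row['read']}%")
```

With the slide's numbers this gives about 87.4% (write) and 83.7% (read) for EXAScaler, and about 99.1% (write) and 89.2% (read) for GRIDScaler.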
12.
SFA7700 | Platform Scaling
Fully Integrated Hybrid Appliance (base system): 2-60 drives
180 TB in 4U with 3 TB drives
240 TB in 4U with 4 TB drives
Appliance with 1 additional enclosure: 2-120 drives
360 TB in 8U with 3 TB drives
480 TB in 8U with 4 TB drives
Appliance with 4 additional enclosures: 2-300 drives with 4 x SS7000 enclosures; 2-396 drives with 4 x SS8460 enclosures (next release)
1.2 PB in 20U with 3 TB drives
1.5 PB in 20U with 4 TB drives
Throughput: 7 GB/sec | 10.2 GB/sec | 10.2 GB/sec
Shown with SS7000 enclosures
Pay-as-you-go scalability
13.
sgpdd-survey
Pliant SSD
[Chart: sgpdd-survey (SSD, RAID6, write). Throughput (MB/sec, axis 0 to 2500) vs. total number of threads (1 to 4096); one series per crg value: 1, 2, 4, 8, 16, 32, 64, 128, 256]
[Chart: sgpdd-survey (SSD, RAID6, read). Throughput (MB/sec, axis 0 to 4500) vs. total number of threads (1 to 4096); one series per crg value: 1, 2, 4, 8, 16, 32, 64, 128, 256]
14.
DRAM Cache
SFX & ReACT
The Many Dimensions Of SFX Acceleration
HDD Tier
SFX Write*
Partial Writes to RAM
Aligned Writes to Flash
ReACT
Cache Flush
Cache Fill
SFX Read
Instant Commit
Context Commit API
In-Band * Out Of Band
* EoY 2013
15.
The Power of Hybrid Storage, Today.
A Simple, Current Performance Case Study
Goal: 20 GB/s
Without SFX: 400 NL SAS Drives
With SFX: 40 SSDs + 200 NL SAS Drives

               Mono       Hybrid             Gain
Drives         400 HDD    40 SSD; 200 HDD    -
Power          4,400W     2,420W             45% Power Consumption Gain
Data Center    28U        16U                42% Reduction in Footprint
Cost (SRP)     $496K      $379K              25% Cost Advantage
As NVRAM Prices Decline & Concurrency Compounds, The Benefits of Hybrid Grow
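The gain figures in this case study can be recomputed from the mono and hybrid values. The sketch below assumes each gain is a simple percentage reduction relative to the mono (HDD-only) configuration; the slide does not say how its percentages were derived, and the names used here are illustrative.

```python
# Recompute the hybrid-vs-mono savings from the case-study values.
# Assumption: "gain" means percentage reduction relative to the mono system.

MONO = {"power_w": 4400, "rack_units": 28, "cost_k_usd": 496}
HYBRID = {"power_w": 2420, "rack_units": 16, "cost_k_usd": 379}


def percent_reduction(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (before - after) / before


def savings():
    """Return {metric: percentage reduction of hybrid vs. mono}."""
    return {key: percent_reduction(MONO[key], HYBRID[key]) for key in MONO}


if __name__ == "__main__":
    for metric, pct in savings().items():
        print(f"{metric}: {pct:.1f}% lower with hybrid")
```

On this reading, power comes out at 45.0% and footprint at about 42.9%, in line with the slide. Cost comes out at about 23.6%, a little under the quoted 25%, so the slide may be rounding or using a different baseline.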
16.
In-Storage Processing Virtualization
DDN Hypervisor Minimizes Latency & Saves on TCO
Multi-core CPU Application Processor (AP)
Back-End
SAS HBAs
FileServer
Dedicated
I/O Bridge
Multi-core CPU RAID Processor (RP)
Memory Pointers
(Virtual Disks)
Multi-Threaded Real-Time
RAID Engine, Hypervisor
Dedicated
I/O Bridge
Cache
Memory
InfiniBand Client Ports
Ethernet Client Ports
High Speed Bus
FileServer
FileServer
Application
Memory
Virtual Disk
Block Driver
Dedicated
PCI-e I/O
……
Many Ways To Save:
• Data Center Space
• Latency
• File System Licenses
• Management Overhead
• Networking
17.
SFA12K™-20E
Parallel File Storage Appliances
• SFA12K-20E available with DDN | EXAScaler™ and
DDN | GRIDScaler™ parallel file storage solutions
• Integrate multiple appliances to scale to over 1,000 GB/s
and tens of petabytes
EXAScaler
SFA12K-20E
20GB/s
Up To 5.3PB*
Usable capacity
GRIDScaler
SFA12K-20E
20GB/s
Up To 5.3PB*
Usable capacity
* - Initial release limited to 840 Drives
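To put the scaling claim in concrete terms, the sketch below estimates how many 20 GB/s appliances would be needed to pass a throughput target. It assumes throughput adds linearly across appliances and that each one is populated to the "up to 5.3 PB" maximum; both are simplifying assumptions rather than statements from the slide, and the function names are hypothetical.

```python
import math

# Figures from the slide: per-appliance throughput and maximum usable capacity.
APPLIANCE_GBS = 20.0
APPLIANCE_MAX_USABLE_PB = 5.3


def appliances_needed(target_gbs, per_appliance_gbs=APPLIANCE_GBS):
    """Smallest appliance count reaching target_gbs, assuming linear scaling."""
    return math.ceil(target_gbs / per_appliance_gbs)


def max_aggregate_capacity_pb(count, per_appliance_pb=APPLIANCE_MAX_USABLE_PB):
    """Upper bound on usable capacity if every appliance is fully populated."""
    return count * per_appliance_pb


if __name__ == "__main__":
    n = appliances_needed(1000.0)
    print(f"{n} appliances for 1,000 GB/s")
    print(f"up to {max_aggregate_capacity_pb(n):.0f} PB usable at full population")
```

Under these assumptions, 50 appliances reach 1,000 GB/s. The capacity figure is only a ceiling, since real deployments may carry fewer drives per appliance.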
18.
DDN | Web Object Scaler (WOS®)
Software Stack
Enabling Real-Time Global Collaboration
Web-Scale, High-Performance Cloud Storage Appliances
99% Efficiency, Petabyte-Class Peer-to-Peer Object Storage
ObjectAssure™ Erasure Coding
Replication Engine
WOS Policy Engine
De-clustered Data Management & Fast Rebuild
Self-Healing Object Storage Clustering
Latency-Aware Access Manager
WOS Core: Peer-to-Peer Object Storage
WOS Cluster Management Utility
Connectors
Limitless Scale & Speed
Eliminates limitations of traditional
file systems - access data at
millions of objects per second
Store Objects Intelligently
User-Defined Metadata allows
customers to understand their data
Global, Peer-to-Peer
Distribute data across 100s of sites
in one namespace
Self-Healing
Intelligent Data Management system
recovers from failures rapidly and
autonomously
User-Defined Metadata
NFS & CIFS
GRIDScaler HSM
Android, iOS & S3
Multi-Tenancy
Layer
WOS API
C++, Java, Python, PHP, HTTP
HTTP CDN Caching
EXAScaler HSM
19.
Introducing DDN’s hScaler Appliance
Factory-Delivered Enterprise Hadoop Appliance
Engineered for Speed: 7X Faster Than Commodity Compute
Integrated, Turnkey Management & ETL Toolset
Efficient Compute + RDMA Storage
56Gb/s Data Access; Better Than DAS Performance
100-2000 Nodes, One Hadoop Cluster Appliance
Fully Tuned, No Optimization Needed, Embedded ETL
Offload & Accelerate HDFS Data Management
Powered By DDN’s SFA Appliances
40GB/s; 1.4M Sustained SSD IOPS in Real-Time
Scale Capacity & Performance Independently
Efficient Compute + RDMA Storage
840 Disks Per Rack
No Cluster CPU Penalty for Rebuild
25% Disk Storage Overhead vs. 200% With Commodity
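The overhead comparison can be reproduced with simple arithmetic. The sketch below assumes the 200% commodity figure refers to HDFS's default three-way replication and that the 25% figure corresponds to a RAID 6 layout of 8 data drives plus 2 parity drives; the slide states neither the replication factor nor the RAID geometry, so both are assumptions.

```python
def replication_overhead_pct(copies):
    """Extra raw capacity, as a percentage of usable data, for N-way replication."""
    return 100.0 * (copies - 1)


def parity_overhead_pct(data_drives, parity_drives):
    """Extra raw capacity, as a percentage of usable data, for parity RAID."""
    return 100.0 * parity_drives / data_drives


if __name__ == "__main__":
    # Assumed configurations (not given on the slide).
    print(f"3-way replication: {replication_overhead_pct(3):.0f}% overhead")
    print(f"RAID 6 (8+2): {parity_overhead_pct(8, 2):.0f}% overhead")
```

With those assumed configurations the results are 200% and 25%, matching the figures quoted on the slide.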
20.
DirectMon™ | Simple, Scalable Appliance UI
DDN | DirectMon™
Storage Management
Made Simple
▶ A powerful, intuitive single
pane of glass to monitor &
manage your environment
▶ Simplify the administration
of DDN SFA and hScaler*
environments
▶ Leverages DDN SFA Mgmt API
21.
Concurrency: A key exascale challenge
                    2015         2018
Performance (TF)    20,000       1,000,000
Concurrency         5,000,000    1,000,000,000
[Chart: performance and concurrency, 2015 vs. 2018]
Future Gains In Performance Are At The Expense of Concurrency
Exascale Computing Can Only Be Achieved With Major Advances in
Applications, HPC I/O Middleware, Object Stores and Tiered Storage
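The projection can be restated as performance per unit of concurrency. The sketch below treats the concurrency figures as thread counts (an assumption) and uses decimal units (1 TF = 1,000 GF); the variable and function names are illustrative.

```python
# Projection figures from the slide; concurrency is assumed to count threads.
PROJECTION = {
    2015: {"tflops": 20_000, "threads": 5_000_000},
    2018: {"tflops": 1_000_000, "threads": 1_000_000_000},
}


def gflops_per_thread(year):
    """Average performance available to each thread, in GF."""
    point = PROJECTION[year]
    return point["tflops"] * 1000 / point["threads"]


def growth_factor(metric):
    """How many times larger the 2018 value is than the 2015 value."""
    return PROJECTION[2018][metric] / PROJECTION[2015][metric]


if __name__ == "__main__":
    print(f"performance grows {growth_factor('tflops'):.0f}x")
    print(f"concurrency grows {growth_factor('threads'):.0f}x")
    for year in (2015, 2018):
        print(f"{year}: {gflops_per_thread(year):.1f} GF per thread")
```

On these numbers performance grows 50x while concurrency grows 200x, so the average per-thread share falls from 4 GF to 1 GF. That is the sense in which the performance gain is bought with concurrency.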
22.
DDN’s Exascale Drivers
Scale:
Concurrency in the range of O(1B) Threads
Consistency at the Exabyte Level is hard
Much can be learned from Web2.0
Affordability:
Disk-Only Technology Projected To Cost $100Ms
Flash-Only Approaches Will Cost Even More
Megawatts Are Also Expensive
Efficiency:
Single-Media Solutions Will Require Megawatts
Analytic Toolsets Can Make Apps Information-Aware
New Approaches to Defensive I/O Pose Opportunity
10^18
23.
Status Quo: Use Disk Based Shared Global
Parallel File System to Provide Dump Space
Notice that using these modeling parameters, we finally reach the predicted cross over point of buying disk for BW and not Capacity in 2012
Buying disk for capacity is reasonably priced but buying disk for bandwidth gets expensive fast!
2018 medium memory machine
•4166 IO nodes, 175k disks
•File System sees 50-100k way parallelism (assumes IOFSL)
•$225M pessimistic purchase (assumes no technologies pushing disk other than Flash)
•Power 1.5MWatts
Miracle Needed!
Source: LANL
24.
Use MLC Based Shared Global Parallel
File System to Provide Dump Space
Notice that buying MLC for capacity is expensive but buying it for Bandwidth is cheaper
2018 medium memory machine
•4166 IO nodes
•File System sees 50-100k way parallelism (assumes IOFSL)
•$625M pessimistic purchase (assumes no technologies pushing disk other than Flash)
•Power 2.5MWatts (have to buy so much to get capacity)
Miracle Needed!
Source: LANL
25.
Let's Try to Buy Disks for Capacity and
MLC for Bandwidth == Hybrid Model
•Twin tailed non global MLC connected to N compute nodes N I/O Nodes, compute nodes dump to MLC at 10% MTTI time and IO nodes bleed to global disk without causing jitter at 1/10th the dump burst data rate or less
•3 memory dumps in MLC
•30 dumps in global disk
Source: LANL
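The hybrid model's ratios can be turned into a small sizing sketch. It reads "dump to MLC at 10% MTTI time" as meaning one memory dump must complete within 10% of the mean time to interrupt, and takes the drain rate at its upper bound of 1/10th the burst rate. That reading, the function name, and the example inputs (1,000 TB of memory, a one-hour MTTI) are assumptions for illustration and are not figures from the slide.

```python
def hybrid_sizing(memory_tb, mtti_s, dump_fraction=0.10, drain_ratio=0.10,
                  mlc_dumps=3, disk_dumps=30):
    """Size an MLC burst tier and a disk tier from the slide's ratios.

    memory_tb and mtti_s are caller-supplied; the defaults mirror the slide:
    dump within 10% of MTTI, drain at 1/10th of the burst rate,
    3 dumps held in MLC and 30 dumps held on global disk.
    """
    dump_window_s = dump_fraction * mtti_s
    burst_tb_per_s = memory_tb / dump_window_s    # compute nodes -> MLC
    drain_tb_per_s = burst_tb_per_s * drain_ratio  # I/O nodes -> global disk
    return {
        "burst_tb_per_s": burst_tb_per_s,
        "drain_tb_per_s": drain_tb_per_s,
        "drain_time_s": memory_tb / drain_tb_per_s,
        "mlc_capacity_tb": mlc_dumps * memory_tb,
        "disk_capacity_tb": disk_dumps * memory_tb,
    }


if __name__ == "__main__":
    # Hypothetical machine: 1,000 TB of memory, MTTI of one hour.
    for key, value in hybrid_sizing(memory_tb=1000, mtti_s=3600).items():
        print(f"{key}: {value:,.2f}")
```

With the slide's ratios, draining one dump to disk takes about one full MTTI, so the disk tier only has to sustain a tenth of the burst bandwidth while the MLC tier absorbs the spikes.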
26.
DRAM Cache
SFX & ReACT
The Many Dimensions Of SFX Acceleration
HDD Tier
SFX Write
SFX Read
Burst Buffer
File System
27.
We’re Just Getting Started…
NVM
Storage Fusion
Xcelerator
Storage Fusion
Architecture
Parallel File Storage Active Cloud-Enabled Archive
SFA: Hybrid Flash Arrays WOS Object Storage
Appliances
Software Platform
Hardware Platform
Buffer Parallel File System Archive / Cloud
TBA
TBA
Distributed Caching Layer
Lustre®
28.
End-to-End Architecture
Buffer + FS + Archive + Cloud
A Fully Integrated Exascale I/O Platform To Minimize The Cost of Big Data
Computing & Real-Time Analytics
Our opportunity resides in addressing the end-to-end efficiency and
scalability challenge at 10^18…
… we’re thinking BIG! Stay Tuned.