Brent Compton and Kyle Bader of Red Hat took the stage at Red Hat Storage Day New York on January 19, 2016, to share best practices and lessons learned for architecting solutions with Red Hat Ceph Storage.
2. CLUSTER BUILDING BLOCKS
A Ceph cluster is layered from standard building blocks:
• WORKLOADS
• ACCESS: Ceph block & object clients
• PLATFORM: Ceph storage cluster
• NETWORK: standard NICs and switches
• Standard servers and media (HDD, SSD, PCIe)
3. CLUSTER DESIGN CONSIDERATIONS
Six decisions lead to a target cluster architecture:
1. Qualify need for scale-out storage
2. Design for target workload IO profile(s)
3. Choose storage access method(s)
4. Identify capacity
5. Determine fault-domain risk tolerance
6. Select data protection method
5. BROAD SERVER SIZE TRENDS
Cluster size tiers: OpenStack Starter (100TB), S (500TB), M (1PB), L (2PB)
IOPS OPTIMIZED
• 2-4x PCIe/NVMe slot servers (PCIe)
• 12x 2.5” SSD bay servers (SAS/SATA)
THROUGHPUT OPTIMIZED
• 12-16x 3.5” bay servers (smaller clusters)
• 24-36x 3.5” bay servers (larger clusters)
COST-CAPACITY OPTIMIZED
• 60-72x 3.5” bay servers
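To make these tiers concrete, here is a back-of-the-envelope sizing sketch (the 6 TB drive size is an assumption for illustration, not a figure from the slides):

```python
import math

# Rough node counts for a 1 PB (raw) tier. Drive sizes are assumptions;
# actual capacities vary by drive generation and vendor.

def nodes_needed(target_raw_tb, drives_per_node, drive_tb):
    """Servers of a given form factor needed to reach a raw-capacity target."""
    return math.ceil(target_raw_tb / (drives_per_node * drive_tb))

print(nodes_needed(1000, 24, 6))  # 24-bay 3.5" servers w/ 6 TB HDDs -> 7 nodes
print(nodes_needed(1000, 72, 6))  # 72-bay 3.5" servers w/ 6 TB HDDs -> 3 nodes
```

Note how the dense 72-bay chassis reaches 1 PB with only 3 nodes, the bare supported minimum; step 5 below explains why fault-domain risk pushes node counts higher than raw capacity alone would require.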
6. BROAD SERVER CONFIGURATION TRENDS
Across the same size tiers (OpenStack Starter 100TB through L 2PB):
IOPS OPTIMIZED
• Ceph RBD (block)
• OSDs on all-flash media (SATA SSD or PCIe)
• High-bin, dual-socket CPU
• 2x replication w/ backup, or 3x replication
• Multiple OSDs per drive (if PCIe)
THROUGHPUT OPTIMIZED
• Ceph RBD (block) or RGW (object)
• OSDs on HDD media with dedicated SSD write journals (4:1 HDD:SSD ratio)
• Mid-bin, dual-socket CPU (single socket adequate for servers with <=12 OSDs)
• 3x replication (read-intensive RBD/RGW) or erasure coding (write-intensive RGW)
• High-bandwidth networking, >10Gb (for servers with >12 OSDs)
COST-CAPACITY OPTIMIZED
• Ceph RGW (object)
• OSDs on HDD media (write journals co-located on HDDs)
• Mid-bin, single-socket CPU (dual socket for servers with >12 OSDs)
• Erasure-coded data protection (vs. replication)
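As an illustration of the 4:1 journal ratio in the throughput-optimized column, here is a minimal sketch for a 12-OSD server (device paths are placeholders, not a prescribed layout):

```python
# Map each HDD-backed OSD to an SSD write-journal device at the 4:1
# HDD:SSD ratio recommended above. Device names are illustrative only.
HDD_PER_SSD = 4

def journal_layout(hdds, ssds):
    """Assign each HDD-backed OSD an SSD journal device, in groups of four."""
    assert len(hdds) <= HDD_PER_SSD * len(ssds), "not enough SSD journal devices"
    return {hdd: ssds[i // HDD_PER_SSD] for i, hdd in enumerate(hdds)}

hdds = [f"/dev/sd{chr(ord('b') + i)}" for i in range(12)]   # 12 OSD data HDDs
ssds = ["/dev/sdn", "/dev/sdo", "/dev/sdp"]                 # 12 / 4 = 3 journal SSDs
for osd_dev, journal_dev in journal_layout(hdds, ssds).items():
    print(osd_dev, "->", journal_dev)
```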
7. STEP 1: QUALIFY NEED FOR SCALE-OUT STORAGE
• Elastic provisioning across a cluster of storage servers
• Standardized servers and networking
• Petabyte scale: 10s, 100s, or 1000s of servers per cluster
• Data HA across ‘islands’ of scale-up storage servers
• Performance and capacity scaled independently
• Incremental vs. forklift upgrades
8. STEP 2: DESIGN FOR TARGET WORKLOADS
• Performance vs. ‘cheap-and-deep’?
• Performance: throughput- vs. IOPS-intensive?
• Small-block vs. large-block IO?
• Sequential vs. random IO?
• Read vs. write mix?
• Latency: absolute targets vs. consistency of latency?
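One way to see how these questions feed the profiles used throughout this deck is a rough triage sketch (the thresholds are illustrative assumptions, not Red Hat guidance):

```python
# Rough IO-profile triage mapping workload answers to an optimization
# target. Block-size thresholds are assumptions for illustration.

def classify_workload(block_kb, random_io):
    """Pick an optimization target from a coarse IO profile."""
    if random_io and block_kb <= 16:
        return "IOPS optimized"         # small-block random IO, e.g. databases
    if not random_io and block_kb >= 64:
        return "throughput optimized"   # large-block sequential IO, e.g. streaming
    return "mixed profile: benchmark before choosing"

print(classify_workload(block_kb=8, random_io=True))       # IOPS optimized
print(classify_workload(block_kb=4096, random_io=False))   # throughput optimized
```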
9. STEP 3: CHOOSE STORAGE ACCESS METHODS
All three access methods sit atop the same Ceph storage cluster:
• Distributed file*
• Object
• Block**
* Support for CephFS is not yet included in Red Hat Ceph Storage
** RBD supported with replicated data protection only
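For a feel of the native object interface beneath both the block (RBD) and object (RGW) methods, here is a minimal python-rados sketch; it assumes a reachable cluster, a valid /etc/ceph/ceph.conf with client keyring, and an existing pool named 'mypool':

```python
# Write and read one object via librados, the layer beneath RBD and RGW.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('mypool')             # pool name is an assumption
    try:
        ioctx.write_full('greeting', b'hello ceph')  # store one object
        print(ioctx.read('greeting'))                # read it back
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```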
11. STEP 5: DETERMINE FAULT-DOMAIN RISK TOLERANCE
How much cluster capacity can you tolerate on one node?
• With fewer nodes in the cluster, performance is more degraded during recovery
  • Each node must devote a greater % of its compute/IO utilization to recovery operations
• With fewer nodes in the cluster, maximum node utilization is limited
  • Each node must contribute a greater % of its reserve capacity to backfill/recovery operations
Guidelines:
• Minimum supported (Red Hat Ceph Storage): 3 OSD nodes per cluster
• Minimum recommended (performance cluster): 10 OSD nodes per cluster, so that 1 node represents <10% of total cluster capacity
• Minimum recommended (cost/capacity cluster): 7 OSD nodes per cluster, so that 1 node represents <15% of total cluster capacity
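With uniform nodes, each node holds 1/N of cluster capacity, which is exactly the quantity these guidelines bound; a quick sketch:

```python
# Per-node capacity share for a few cluster sizes, assuming uniform nodes.
for nodes in (3, 7, 10, 20):
    print(f"{nodes:>2} uniform nodes -> each node holds {1 / nodes:.1%} of cluster capacity")
```

At 7 nodes each node holds about 14.3% of capacity, just inside the <15% cost/capacity guideline, while 3 nodes puts a third of the cluster behind every node failure.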
12. STEP 6: SELECT DATA PROTECTION METHOD
Replication
• Data is copied n times and spread across different disks on different servers
• Clusters can tolerate n-1 disk failures without data loss
• 3 replicas is a popular configuration
Erasure coding (analogous to network RAID)
• Data is encoded into k data chunks plus m parity chunks and spread across different disks on different servers
• Clusters can tolerate m disk failures without data loss
• A k+m of 8+3 is a popular configuration
This decision will affect the initial cost of your cluster more than any other.
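The cost impact follows directly from the usable fraction of raw capacity each scheme leaves behind; a sketch using the slide's own figures (the 1 PB raw figure is a hypothetical):

```python
# Usable capacity and failure tolerance of the two protection schemes above.

def usable_replication(raw, n):
    return raw / n            # n full copies of every object

def usable_erasure(raw, k, m):
    return raw * k / (k + m)  # k data chunks + m parity chunks per object

raw_pb = 1.0  # hypothetical 1 PB of raw cluster capacity
print(f"3x replication: {usable_replication(raw_pb, 3):.2f} PB usable, survives 2 disk failures")
print(f"8+3 erasure:    {usable_erasure(raw_pb, 8, 3):.2f} PB usable, survives 3 disk failures")
```

8+3 erasure coding yields roughly 0.73 PB usable per raw PB versus 0.33 PB for 3x replication, while tolerating one more failure; the trade-offs are CPU overhead and, per the footnote in step 3, no RBD support with erasure coding at the time.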
13. CLUSTER DESIGN CONSIDERATIONS (RECAP)
Taken together, the six decisions yield the target cluster architecture:
1. Qualify need for scale-out storage
2. Design for target workload IO profile(s)
3. Choose storage access method(s)
4. Identify capacity
5. Determine fault-domain risk tolerance
6. Select data protection method
14. RESOURCES
Ceph on Supermicro Performance & Sizing Guide
http://www.redhat.com/en/resources/red-hat-ceph-storage-clusters-supermicro-storage-servers
Ceph on Cisco UCS C3160 Whitepaper
http://www.cisco.com/c/en/us/products/collateral/servers-unified-computing/ucs-c-series-rack-servers/whitepaper-C11-735004.html
Ceph on Scalable Informatics Whitepaper
https://www.scalableinformatics.com/assets/documents/Unison-Ceph-Performance.pdf
15. RED HAT STORAGE TEST DRIVES
• Gluster test drive: bit.ly/glustertestdrive
• Ceph test drive: bit.ly/cephtestdrive