Ceph at DreamHost
A Storage Journey
About Me
• One of the original four founders of DreamHost
• Still active daily at DreamHost
• Have spent a lot of time working on the
Ops side.
DreamHost
• Hosting company founded in 1997
• Sage’s other company
• Shared hosting, virtual servers, dedicated servers, cloud storage, cloud computing
• 375k customers, 1.3 million websites
Storage Journey
A long strange trip
His name was Destro
... and then there
were more.
The First NetApp
Remote Failover
Meanwhile...
... and still more.
Lots of NetApps
• Peak of around 125 individual NetApps
• Smallish capacity on each (8TB)
• Internal software continuously moving
data between NetApps
• Lots of time spent managing nearly full
filers
Ideal
Reality
Hosting Landscape
• Included storage had grown from 50MB to gigabytes, then terabytes
• Prices stayed the same
• Eventually went to unlimited storage
• Usage per customer skyrocketed
Failed Experiments
• ATAoE and XFS-based
systems
• Performance &
Stability issues
• 2006 era gear
Failed Experiments
• High capacity
• Nice features
• Expensive
• 85% full and it
failed
Some Success
• First on Sun hardware
then Supermicro
• Great stability
• Not enough IO for
front-line network
storage
Back to Basics
Local RAID
• SATA drives had grown in capacity and
were very cheap
• 4-6TB per hosting server
• Less dependence on congested
network
• Smaller failure domains
The Good
Local RAID
• No more quotas: the filesystem is too slow to scan
• No more fast failovers
• Multi-hour filesystem checks with ext3
• More failure domains
The Bad
Local RAID
• Complete RAID loss more common
than anticipated
• Multiple days to fully restore from
backup
The Ugly
Storage Today
Light at the end of the tunnel
Hybrid Mix
• We learned something from every step
of the way
• No one size fits all when it comes to
storage
• Use whatever is best for the job
• Be ready to change
Best Tool For The Job
A Bit of Everything
• Clustered NetApps and NFS for email
• Local RAID in hosting servers
• ZFS and OpenSolaris backup servers
• Ceph for DreamObjects and
DreamCompute
Best Tool For The Job
DreamObjects
• Object Storage, S3/Swift compatible
• 2+ Petabytes raw storage
• 3x replication, 900+ OSDs
• RGW behind HAProxy
• Row, rack, node and disk fault tolerant
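As a rough illustration of what 3x replication with row, rack, node and disk fault tolerance implies, the sketch below places three replicas so that no two share a row. It is a toy model and not Ceph's CRUSH algorithm; the topology, the `Osd` type and the `build_topology` and `place_replicas` helpers are all hypothetical.

```python
import hashlib
from typing import NamedTuple


class Osd(NamedTuple):
    """One disk-backed OSD and its position in a made-up topology."""
    row: str
    rack: str
    node: str
    osd_id: int


def build_topology(rows=3, racks_per_row=2, nodes_per_rack=2, osds_per_node=2):
    """Build a small, hypothetical cluster layout (not DreamHost's real one)."""
    osds = []
    next_id = 0
    for r in range(rows):
        for k in range(racks_per_row):
            for n in range(nodes_per_rack):
                for _ in range(osds_per_node):
                    osds.append(Osd(f"row{r}", f"row{r}-rack{k}",
                                    f"row{r}-rack{k}-node{n}", next_id))
                    next_id += 1
    return osds


def place_replicas(object_name, osds, replicas=3):
    """Pick `replicas` OSDs for an object, at most one per row.

    OSDs are ranked by a hash of (object name, OSD id), so placement is
    deterministic and differs per object; the first OSD seen in each row
    wins. Rows contain racks, nodes and disks, so distinct rows imply
    distinct racks, nodes and disks as well.
    """
    def score(osd):
        return hashlib.sha256(f"{object_name}:{osd.osd_id}".encode()).hexdigest()

    chosen, used_rows = [], set()
    for osd in sorted(osds, key=score):
        if osd.row not in used_rows:
            chosen.append(osd)
            used_rows.add(osd.row)
        if len(chosen) == replicas:
            return chosen
    raise ValueError("not enough rows to keep every replica in its own row")


if __name__ == "__main__":
    cluster = build_topology()
    for osd in place_replicas("my-bucket/photo.jpg", cluster):
        print(osd)
```

Losing any single disk, node, rack or row in this model removes at most one of the three copies, which is the property the slide describes.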
DreamCompute
• OpenStack-based Public Cloud
• 3+ Petabytes raw storage
• All storage is on Ceph RBD
• Boot and Attachable Volumes
• Nicira SDN + Ceph, Live Migration
Cockpit Pod
• HA Load Balancer
• MySQL / PostgreSQL
• OpenStack services: Horizon, Glance, Keystone, Nova, Quantum, Cinder
• Nicira NVP
• Glance Store (Ceph)
• OS Mirrors (apt)
• Ceph Monitors
• Opscode Chef
• Logstash + Graphite
• Networking Gear
Compute Pod
• 8x Hypervisor Nodes (192 GB RAM, 64 AMD cores each)
• 14x Storage Nodes (12x 3TB disks each)
• Networking Gear
Pods
• 512 cores
• 1.5TB of RAM
• 504TB raw storage
• 168TB redundant storage
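The pod totals follow directly from the per-node figures given for a compute pod. The short sketch below reproduces that arithmetic; the constant names are mine, and the replication factor of 3 is an assumption inferred from the 504TB raw to 168TB redundant ratio.

```python
# Per-pod hardware figures taken from the Compute Pod slide.
HYPERVISOR_NODES = 8
CORES_PER_HYPERVISOR = 64
RAM_GB_PER_HYPERVISOR = 192
STORAGE_NODES = 14
DISKS_PER_STORAGE_NODE = 12
DISK_TB = 3
# Assumption: 3x replication, inferred from 504TB raw vs 168TB redundant.
REPLICATION_FACTOR = 3


def pod_totals():
    """Return the aggregate resources of one compute pod."""
    raw_tb = STORAGE_NODES * DISKS_PER_STORAGE_NODE * DISK_TB
    return {
        "cores": HYPERVISOR_NODES * CORES_PER_HYPERVISOR,
        "ram_tb": HYPERVISOR_NODES * RAM_GB_PER_HYPERVISOR / 1024,
        "raw_tb": raw_tb,
        "redundant_tb": raw_tb / REPLICATION_FACTOR,
    }


if __name__ == "__main__":
    for name, value in pod_totals().items():
        print(f"{name}: {value:g}")
```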
Networking
• ODM switches w/ Linux
• 10Gbps everywhere
• IPv6 from the ground up
• Spine and leaf topology
• 120 Gbps between pods (!)
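To put the 120 Gbps figure in context, the sketch below estimates how a pod's total edge bandwidth compares with its inter-pod bandwidth. It assumes one 10 Gbps link per node and counts only the 8 hypervisor and 14 storage nodes of a compute pod; the slides do not state the number of links per node, so treat the resulting ratio as illustrative only.

```python
HYPERVISOR_NODES = 8
STORAGE_NODES = 14
LINK_GBPS = 10         # "10Gbps everywhere"
INTER_POD_GBPS = 120   # "120 Gbps between pods"
# Assumption: one 10 Gbps link per node; bonded or dual-homed links raise this.
LINKS_PER_NODE = 1


def pod_edge_gbps(links_per_node=LINKS_PER_NODE):
    """Total bandwidth of all node-facing links in one compute pod."""
    nodes = HYPERVISOR_NODES + STORAGE_NODES
    return nodes * links_per_node * LINK_GBPS


def oversubscription(links_per_node=LINKS_PER_NODE):
    """Ratio of node-facing bandwidth to inter-pod bandwidth."""
    return pod_edge_gbps(links_per_node) / INTER_POD_GBPS


if __name__ == "__main__":
    print(f"edge bandwidth: {pod_edge_gbps()} Gbps")
    print(f"oversubscription: {oversubscription():.2f}:1")
```

Under these assumptions the ratio stays below 2:1, which fits the slide's emphasis on generous bandwidth between pods.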
The Internets
Thar be dragons here!
Nicira NVP
CephFS & The Future
• The return of Failovers
• No more backup servers
• No more major disk-related outages
• Fault tolerant low cost hosting
Storage Panacea?
Thanks!
@dallas
dallas@dreamhost.com

Presented at Ceph Day Santa Clara.
Data Cloud, More than a CDP by Matt Robison
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Ceph Day Santa Clara: Ceph at DreamHost

  • 1. Ceph at DreamHost A Storage Journey
  • 2. About Me • One of the original four of DreamHost • Still active daily at DreamHost • Have spent a lot of time working on the Ops side.
  • 3. • Hosting company founded in 1997 • Sage’s other company • shared hosting, virtual servers, dedicated servers, cloud storage, cloud computing • 375k customers, 1.3MM websites
  • 4. Storage Journey A long strange trip
  • 5. His name was Destro
  • 6. ... and then there were more.
  • 11. ... and still more.
  • 12. Lots of NetApps • Peak of around 125 individual NetApps • Smallish capacity on each (8TB) • Internal software continuously moving data between NetApps • Lots of time spent managing nearly full filers
  • 13. Ideal
  • 15. Hosting Landscape • Included storage had grown from 50MB to gigabytes, then terabytes. • Prices stayed the same. • Eventually went to unlimited storage. • Usage per customer skyrocketed.
  • 17. Failed Experiments • ATAoE and XFS-based systems • Performance & stability issues • 2006-era gear
  • 18. Failed Experiments • High capacity • Nice features • Expensive • 85% full and it failed
  • 19. Some Success • First on Sun hardware then Supermicro • Great stability • Not enough IO for front-line network storage
  • 21. Local RAID: The Good • SATA drives had grown in capacity and were very cheap • 4-6TB per hosting server • Less dependence on congested network • Smaller failure domains
  • 22. Local RAID: The Bad • No more quotas; too slow to scan the filesystem • No more fast failovers • Multi-hour filesystem checks with ext3 • More failure domains
  • 23. Local RAID: The Ugly • Complete RAID loss more common than anticipated • Multiple days to fully restore from backup
  • 24. Storage Today Light at the end of the tunnel
  • 25. Hybrid Mix: Best Tool For The Job • We learned something from every step of the way • No one size fits all when it comes to storage • Use whatever is best for the job • Be ready to change
  • 26. A Bit of Everything: Best Tool For The Job • Clustered NetApps and NFS for email • Local RAID in hosting servers • ZFS and OpenSolaris backup servers • Ceph for DreamObjects and DreamCompute
  • 27. • Object Storage, S3/Swift compatible • 2+ Petabytes raw storage • 3x replication, 900+ OSDs • RGW behind HAProxy • Row, rack, node and disk fault tolerant
  • 28. • OpenStack-based Public Cloud • 3+ Petabytes raw storage • All storage is on Ceph RBD • Boot and Attachable Volumes • Nicira SDN + Ceph, Live Migration
  • 29. Diagram labels: The Internets ("Thar be dragons here!"), Nicira NVP (x3) • Cockpit Pod: HA Load Balancer, MySQL / PostgreSQL, Horizon, Glance, Keystone, Nova, Quantum, Cinder, Nicira NVP, Glance Store (Ceph), OS Mirrors (apt), Ceph Monitors, Opscode Chef, Logstash + Graphite, Networking Gear • Compute Pod (three shown, identical): 8x Hypervisor Nodes (192 GB RAM, 64 AMD cores), 14x Storage Nodes (12x 3TB disks), Networking Gear • Pods • 512 cores • 1.5TB of RAM • 504TB raw storage • 168TB redundant storage • Networking • ODM switches w/ Linux • 10Gbps everywhere • IPv6 from the ground up • Spine and leaf topology • 120 Gbps between pods (!)
  • 30. CephFS & The Future: Storage Panacea? • The return of failovers • No more backup servers • No more major disk-related outages • Fault-tolerant, low-cost hosting
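
The per-pod totals on slide 29 can be checked from the per-node figures on the same slide. The sketch below is only that arithmetic: the constant and function names are illustrative, not anything from DreamHost's tooling, and the 3x replication factor is an assumption carried over from slide 27 (it is what makes 504TB raw come out to 168TB redundant).

```python
# Per-node figures as listed on slide 29. All names here are hypothetical.
HYPERVISORS_PER_POD = 8
CORES_PER_HYPERVISOR = 64
RAM_GB_PER_HYPERVISOR = 192
STORAGE_NODES_PER_POD = 14
DISKS_PER_STORAGE_NODE = 12
DISK_TB = 3

# Assumption: the 3x replication quoted for DreamObjects (slide 27)
# also applies to the compute pods.
REPLICATION_FACTOR = 3


def pod_totals():
    """Return the aggregate compute and storage figures for one pod."""
    cores = HYPERVISORS_PER_POD * CORES_PER_HYPERVISOR
    ram_tb = HYPERVISORS_PER_POD * RAM_GB_PER_HYPERVISOR / 1024
    raw_tb = STORAGE_NODES_PER_POD * DISKS_PER_STORAGE_NODE * DISK_TB
    # Every object is stored REPLICATION_FACTOR times, so the usable
    # ("redundant") capacity is the raw capacity divided by that factor.
    redundant_tb = raw_tb / REPLICATION_FACTOR
    return {
        "cores": cores,
        "ram_tb": ram_tb,
        "raw_tb": raw_tb,
        "redundant_tb": redundant_tb,
    }


if __name__ == "__main__":
    for name, value in pod_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the results match the slide: 512 cores, 1.5TB of RAM, 504TB raw and 168TB redundant storage per pod.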