Building a highly available cloud with Ceph
and Apache CloudStack
Building a highly available cloud with Ceph and Apache CloudStack - #CloudStackCephDay
Who am I?
• Wido den Hollander (1986)
• Owner and founder of 42on.com, Ceph Training and Consultancy Company
• Co-owner and CTO @ PCextreme B.V. (Dutch hosting company)
• Developed the Ceph (RBD) integration for libvirt storage drivers and Apache
CloudStack
• Wrote PHP and Java bindings for librados
• Ceph Board member
• Apache CloudStack PMC member
Ceph
Being technical and owning a hosting company you quickly get frustrated with storage. It is
expensive, difficult and just annoying. Storage is just always the problem.
While searching the internet I found this little project called Ceph, at the time still hosted at
http://ceph.newdream.net/, and there was this guy Sage publishing some tarballs.
At that time, late 2009, there was version 0.18 and I gave it a try. It worked!
Ever since I've been involved with the project. I write code, patches, documentation and I am a
member of the board.
Apache CloudStack
From my role as CTO at PCextreme I first came into contact with CloudStack shortly after Citrix
bought cloud.com in 2012.
I travelled to the Citrix office in California for their first cloud.com/CloudStack conference, at which
they announced the donation of CloudStack to the Apache Software Foundation.
Ever since I've been working with CloudStack as a committer and PMC member. It is a vital part
of my company.
PCextreme
I started the company from home when I was 16. It has now grown to a company with 30
employees and runs thousands of servers.
Dutch hosting company focused on:
• Traditional webhosting
• Cloud computing
Ceph and CloudStack
At PCextreme we offer our cloud services under the name Aurora.
We offer various products under this umbrella:
• Compute (Virtual Machines)
• DNS (Intelligent DNS service)
• Objects (Amazon S3 compatible service)
All services are available over both IPv4 and IPv6.
Aurora Compute
We offer our Compute service in two different flavors:
• Stamina
• Agile
Both are managed by Apache CloudStack (KVM hypervisor), but only Stamina uses Ceph.
Our services are offered from different locations in the world:
• Amsterdam (main location with 3 datacenters)
• Barcelona
• Miami
• Tokyo
• Los Angeles
• Antwerp (Soon)
• Frankfurt (Soonish)
Compute: Stamina
Virtual Machines running on Stamina run in Highly Available mode, using Ceph with 3x data
replication for their storage.
Should a hypervisor fail, CloudStack will start the VM on a different hypervisor.
If a Ceph node (or disk) fails, the high availability of Ceph makes sure all data stays
available and safely stored.
Compute: Agile
Agile Virtual Machines are deployed on hypervisors with local SSD storage (Samsung PM863a
SATA and Samsung PM963 NVMe) and are not Highly Available.
There is no RAID or any other form of data replication for these Virtual Machines.
This offering is for customers who build their applications to be cloud-aware and can afford to lose a
Virtual Machine.
Apache CloudStack
For those not familiar with CloudStack and its terminology a quick introduction:
CloudStack uses the concept of a Management Server (or a cluster of them) which manages an
Agent running on a KVM, Xen, VMware or Hyper-V hypervisor.
These Agents manage Primary Storage (virtual disks) and Secondary Storage (templates and
snapshots).
In addition there are Console Proxy, Secondary Storage and Virtual Router Virtual Machines
which are all under control of the Management Server(s).
Ceph and CloudStack
When we first started using CloudStack (early 2013) there was no integration between
CloudStack and Ceph's block device RBD.
I wrote the integration in libvirt and the Java bindings in order to support Ceph storage for KVM in
CloudStack.
Let's say that the road to our current situation was an interesting one. We learned a few lessons:
• Do not use HDDs behind VMs. Customers will always complain about performance.
• Apply I/O limitations to all customers. A small group of customers will eat all IOps.
• KISS! Overcomplicating things resulted in hiccups because we tried to make it too fancy.
• Failure domains! Yes, large environments are financially very attractive, but they will go
down at some point and all customers start calling.
HDDs vs SSDs in the cloud
At first we focused on a low price per GB as we thought that would attract customers. And it did.
We soon learned that customers expect unlimited IOps with the Gigabytes they are provided.
HDDs are slow, very, very slow compared to SSDs. Although SSDs are more expensive per GB,
they are very cheap for the IOps they provide.
We found out that the latter, the price per IOps, is actually the most important one.
Even if you use SSDs for storage you still need to apply I/O limitations to all virtual disks provided
to Virtual Machines.
Customers will use all the IOps they can get. People will run very I/O intensive applications like
databases on their Virtual Machines.
For every GB a customer buys we allocate X amount of IOps to that disk. CloudStack supports
I/O limitations on virtual disks, which are enforced by libvirt/KVM.
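A minimal sketch of the per-GB allocation described above, assuming a linear ratio. The ratio used in the example (10 IOps per GB) is a made-up placeholder, not the value we actually use, and the helper names are hypothetical. The generated element follows libvirt's disk <iotune> format, which is how a KVM disk limit is expressed; how CloudStack passes it down is not shown here.

```python
import xml.etree.ElementTree as ET


def iops_limit(size_gb: int, iops_per_gb: float, floor: int = 0) -> int:
    """Return the IOps cap for a virtual disk of size_gb gigabytes.

    iops_per_gb is the "X" from the slide; any value passed in here is an
    assumption for illustration only.
    """
    return max(floor, int(size_gb * iops_per_gb))


def iotune_element(total_iops: int) -> str:
    """Build the <iotune> fragment that sits inside a libvirt <disk> element."""
    iotune = ET.Element("iotune")
    ET.SubElement(iotune, "total_iops_sec").text = str(total_iops)
    return ET.tostring(iotune, encoding="unicode")


# Placeholder ratio: 10 IOps per GB on a 100 GB disk.
limit = iops_limit(size_gb=100, iops_per_gb=10.0)
print(iotune_element(limit))
# prints: <iotune><total_iops_sec>1000</total_iops_sec></iotune>
```

The floor parameter is there because a purely linear rule would leave very small disks with almost no IOps; whether such a minimum is wanted is a product decision.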
Failure domains
Both Ceph and CloudStack scale to very large sizes. Ceph is able to manage thousands of
disks, petabytes of storage and even more.
CloudStack can handle hundreds of hypervisors and tens of thousands of Virtual Machines.
What we learned the hard way, however, is that things will fail at some point. Human failure,
software bugs, a combination of them or just bad luck. Murphy will come along at some point.
We now build moderately sized cloud environments which we glue together in our Control Panel.
The customer is not aware of this abstraction as our Control Panel talks to different CloudStack
Management servers.
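The gluing of several smaller environments can be sketched as a simple lookup in front of the CloudStack API. This is only an illustration of the idea: the island names, the example.invalid URLs and the VM index are all hypothetical, and the real Control Panel is not shown in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudIsland:
    """One moderately sized CloudStack environment (a failure domain)."""
    name: str
    api_url: str  # hypothetical Management server endpoint


# Hypothetical registry; names and URLs are placeholders, not real hosts.
ISLANDS = {
    "island-a": CloudIsland("island-a", "https://cs-a.example.invalid/client/api"),
    "island-b": CloudIsland("island-b", "https://cs-b.example.invalid/client/api"),
}

# Hypothetical index kept by the control panel: which island owns which VM.
VM_TO_ISLAND = {"vm-1001": "island-a", "vm-2002": "island-b"}


def island_for_vm(vm_id: str) -> CloudIsland:
    """Resolve the island whose Management server should receive the API call."""
    try:
        return ISLANDS[VM_TO_ISLAND[vm_id]]
    except KeyError:
        raise LookupError(f"no island known for VM {vm_id!r}") from None


print(island_for_vm("vm-2002").api_url)
```

Because the customer only ever sees the VM, an outage of one Management server or one Ceph cluster affects only the VMs mapped to that island.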
Control Panel
Our current design
With the lessons learned from running Ceph and CloudStack for over 5 years now, we have
come up with a design which works really well for us.
CloudStack Hypervisors (8 to 12 per cluster):
• Dual Intel Xeon CPU
• 256 ~ 512GB of Memory per node
• Redundant 10Gbit connection
Ceph cluster:
• 3 Monitors
• 15 to 25 1U nodes with 10 SSDs each
• Redundant 10Gbit connection
Our current design
SuperMicro SYS-1018R-WC0R
• Intel Xeon E5-1650 v3 3.50GHz 6-core
• 64GB Memory
• 64GB SATA-DOM (Operating System)
• 10x Samsung PM863a 1.92TB (SATA)
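As a rough worked example, the numbers above (15 to 25 nodes, 10 SSDs of 1.92TB each) combined with the 3x replication used for Stamina give the following capacity per cluster. This is back-of-the-envelope arithmetic only: it assumes decimal terabytes and ignores BlueStore overhead and the free space a cluster needs to recover from failures, so real usable capacity is lower.

```python
def cluster_capacity_tb(nodes: int, ssds_per_node: int = 10,
                        ssd_tb: float = 1.92, replicas: int = 3):
    """Return (raw_tb, usable_tb) for one Ceph cluster.

    Usable capacity is simply raw capacity divided by the replica count;
    overhead and recovery headroom are deliberately left out.
    """
    raw_tb = nodes * ssds_per_node * ssd_tb
    return raw_tb, raw_tb / replicas


for nodes in (15, 25):
    raw, usable = cluster_capacity_tb(nodes)
    print(f"{nodes} nodes: {raw:.0f} TB raw, {usable:.0f} TB usable")
# prints:
# 15 nodes: 288 TB raw, 96 TB usable
# 25 nodes: 480 TB raw, 160 TB usable
```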
Our current design
All hardware (SuperMicro) is installed in a single rack with two 10Gbit (Arista) Top-of-Rack
switches which also handle the gateways (IPv4 and IPv6).
This allows us to run thousands of Virtual Machines in a single 19" rack with a redundant
32A/7kW power supply.
Scaling larger is technically possible, but by confining things to smaller islands we are able to
spread risks and reduce the load on our staff in case of an outage.
SSDs
Our Ceph clusters all run on Samsung PM-series SSDs. Most of our clusters run on SATA with
the PM863a SSDs, but we recently ordered our first batch of hardware with the PM963 NVMe
SSDs.
The hardware has been installed, but we have not been able to run any tests on it yet. We do expect a
rather large performance improvement going from SATA to NVMe.
Overall the PM-series has performed very, very well for us. The failure rate is <1% and we haven't
seen any of them wear out while they are being used heavily in a 24/7 application.
As we are very satisfied with the performance and price the Samsung PM-series provide we
have not looked at other vendors.
Our new design
SuperMicro SYS-1029U-TN10RT
• Dual Intel Xeon Silver 4110 Processor 8-core 2.10GHz
• 96GB Memory
• 128GB SATA-DOM (Operating System)
• 10x Samsung PM963 1.92TB (NVMe)
Software
Our complete infrastructure runs on Ubuntu LTS versions:
• Ubuntu 14.04
• Ubuntu 16.04
CloudStack and Ceph:
• CloudStack 4.9 and 4.10
• Ceph 12.2.X Luminous Release with BlueStore
Ceph BlueStore
Traditionally Ceph eventually stored all data on an XFS filesystem running underneath the OSD.
This backend is called FileStore. This had some serious performance implications as well as
some scaling problems.
With the release of Ceph Luminous (12.2.X) a new storage backend BlueStore was introduced,
which provides:
• More performance by consuming a raw block device directly
• CRC checking on all data
• Compression
In the last few weeks we migrated all our data (~3PB) from FileStore to BlueStore without any
downtime. We re-installed all machines from Ubuntu 14.04 to 16.04 and at the same time wiped
all OSDs running with FileStore and migrated them to BlueStore.
Looking at statistics we see a latency improvement (lower IO-wait) inside Virtual Machines of
about 20% overall.
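The slides do not list the commands used for the FileStore to BlueStore migration. As a hedged sketch, the function below only builds (and does not run) a per-OSD command sequence modelled on the "mark out and replace" approach in the upstream Ceph BlueStore migration guide. The exact commands and flags are an assumption and should be checked against the documentation for your release; the function name, OSD id and device path are hypothetical.

```python
from typing import List


def bluestore_conversion_commands(osd_id: int, device: str) -> List[str]:
    """Return the shell commands for converting one OSD, in order.

    Nothing is executed here; the list is meant to be reviewed by an
    operator. Command names and flags are assumed from upstream docs.
    """
    return [
        # Stop placing data on this OSD and let Ceph move it elsewhere.
        f"ceph osd out {osd_id}",
        # Wait until all data is safely replicated on other OSDs.
        f"while ! ceph osd safe-to-destroy {osd_id}; do sleep 60; done",
        f"systemctl stop ceph-osd@{osd_id}",
        f"umount /var/lib/ceph/osd/ceph-{osd_id}",
        # Wipe the old FileStore data from the device.
        f"ceph-volume lvm zap {device}",
        f"ceph osd destroy {osd_id} --yes-i-really-mean-it",
        # Re-create the OSD with the same id on BlueStore.
        f"ceph-volume lvm create --bluestore --data {device} --osd-id {osd_id}",
    ]


for command in bluestore_conversion_commands(osd_id=12, device="/dev/sdX"):
    print(command)
```

Doing this one OSD (or one node) at a time keeps the other replicas online, which is what makes a migration without downtime possible with 3x replication.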
IPv6
All our services are IPv6-ready and enabled. This means that each Virtual Machine deployed on
our infrastructure obtains both an IPv4 and an IPv6 address.
An interesting fact is that internally Ceph and CloudStack are IPv6-only. Our hypervisors and Ceph
nodes only have an IPv6 address over which they are managed and controlled.
Conclusion
We manage tens of thousands of Virtual Machines running with Ceph and CloudStack and we
think it is a winning and golden combination:
• Open Source
• Easy to access projects
• Easy to deploy and maintain
• Stable and fast
Thank you!
Thanks for listening, questions?
Contact me:
• Github: wido
• Twitter: @widodh
• LinkedIn: widodh
• E-Mail: wido@pcextreme.nl
Building a highly available cloud with Ceph and Apache CloudStack - #CloudStackCephDay

Contenu connexe

Tendances

Tendances (20)

Building software defined clouds - Boyan Ivanov
Building software defined clouds - Boyan Ivanov  Building software defined clouds - Boyan Ivanov
Building software defined clouds - Boyan Ivanov
 
Paul Angus – Backup & Recovery in CloudStack
Paul Angus – Backup & Recovery in CloudStackPaul Angus – Backup & Recovery in CloudStack
Paul Angus – Backup & Recovery in CloudStack
 
CloudStack usage service
CloudStack usage serviceCloudStack usage service
CloudStack usage service
 
Adam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStackAdam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStack
 
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
 
OpenStack Best Practices and Considerations - terasky tech day
OpenStack Best Practices and Considerations  - terasky tech dayOpenStack Best Practices and Considerations  - terasky tech day
OpenStack Best Practices and Considerations - terasky tech day
 
Wido den Hollander - 10 ways to break your Ceph cluster
Wido den Hollander - 10 ways to break your Ceph clusterWido den Hollander - 10 ways to break your Ceph cluster
Wido den Hollander - 10 ways to break your Ceph cluster
 
vBACD - Deploying Infrastructure-as-a-Service with CloudStack - 2/28
vBACD - Deploying Infrastructure-as-a-Service with CloudStack - 2/28vBACD - Deploying Infrastructure-as-a-Service with CloudStack - 2/28
vBACD - Deploying Infrastructure-as-a-Service with CloudStack - 2/28
 
Disaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoFDisaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoF
 
Guaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike TutkowskiGuaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike Tutkowski
 
Building clouds with apache cloudstack apache roadshow 2018
Building clouds with apache cloudstack   apache roadshow 2018Building clouds with apache cloudstack   apache roadshow 2018
Building clouds with apache cloudstack apache roadshow 2018
 
Monitoring CloudStack and components
Monitoring CloudStack and componentsMonitoring CloudStack and components
Monitoring CloudStack and components
 
Introduction and news
Introduction and newsIntroduction and news
Introduction and news
 
CloudStack - Top 5 Technical Issues and Troubleshooting
CloudStack - Top 5 Technical Issues and TroubleshootingCloudStack - Top 5 Technical Issues and Troubleshooting
CloudStack - Top 5 Technical Issues and Troubleshooting
 
1 sysadmin vs 250 clusters de stockage
1 sysadmin vs 250 clusters de stockage1 sysadmin vs 250 clusters de stockage
1 sysadmin vs 250 clusters de stockage
 
Building a Microsoft cloud with open technologies
Building a Microsoft cloud with open technologiesBuilding a Microsoft cloud with open technologies
Building a Microsoft cloud with open technologies
 
Simplifying the Move to OpenStack
Simplifying the Move to OpenStackSimplifying the Move to OpenStack
Simplifying the Move to OpenStack
 
Scaling DataStax in Docker
Scaling DataStax in DockerScaling DataStax in Docker
Scaling DataStax in Docker
 
Meshing OpenStack and Bare Metal Networks with EVPN - David Iles, Mellanox Te...
Meshing OpenStack and Bare Metal Networks with EVPN - David Iles, Mellanox Te...Meshing OpenStack and Bare Metal Networks with EVPN - David Iles, Mellanox Te...
Meshing OpenStack and Bare Metal Networks with EVPN - David Iles, Mellanox Te...
 
Whats New in Apache CloudStack Version 4.5
Whats New in Apache CloudStack Version 4.5Whats New in Apache CloudStack Version 4.5
Whats New in Apache CloudStack Version 4.5
 

Similaire à Wido den Hollander - building highly available cloud with Ceph and CloudStack

Similaire à Wido den Hollander - building highly available cloud with Ceph and CloudStack (20)

In-Ceph-tion: Deploying a Ceph cluster on DreamCompute
In-Ceph-tion: Deploying a Ceph cluster on DreamComputeIn-Ceph-tion: Deploying a Ceph cluster on DreamCompute
In-Ceph-tion: Deploying a Ceph cluster on DreamCompute
 
Designing Lean CloudStack Environments for the Edge - IndiQus - CloudStack E...
 Designing Lean CloudStack Environments for the Edge - IndiQus - CloudStack E... Designing Lean CloudStack Environments for the Edge - IndiQus - CloudStack E...
Designing Lean CloudStack Environments for the Edge - IndiQus - CloudStack E...
 
Wido den hollander cloud stack and ceph
Wido den hollander   cloud stack and cephWido den hollander   cloud stack and ceph
Wido den hollander cloud stack and ceph
 
Cloudstack: the best kept secret in the cloud
Cloudstack: the best kept secret in the cloudCloudstack: the best kept secret in the cloud
Cloudstack: the best kept secret in the cloud
 
[BarCamp2018][20180915][Tips for Virtual Hosting on Kubernetes]
[BarCamp2018][20180915][Tips for Virtual Hosting on Kubernetes][BarCamp2018][20180915][Tips for Virtual Hosting on Kubernetes]
[BarCamp2018][20180915][Tips for Virtual Hosting on Kubernetes]
 
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
 
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
 
Tlu introduction-to-cloud
Tlu introduction-to-cloudTlu introduction-to-cloud
Tlu introduction-to-cloud
 
Cassandra and Docker Lessons Learned
Cassandra and Docker Lessons LearnedCassandra and Docker Lessons Learned
Cassandra and Docker Lessons Learned
 
Cloud Hosting: Lessons from the trenches
Cloud Hosting: Lessons from the trenchesCloud Hosting: Lessons from the trenches
Cloud Hosting: Lessons from the trenches
 
Cloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs GoogleCloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs Google
 
JOSA TechTalks - Downgrade your Costs
JOSA TechTalks - Downgrade your CostsJOSA TechTalks - Downgrade your Costs
JOSA TechTalks - Downgrade your Costs
 
MyCloud for $100k
MyCloud for $100kMyCloud for $100k
MyCloud for $100k
 
Apache Drill (ver. 0.1, check ver. 0.2)
Apache Drill (ver. 0.1, check ver. 0.2)Apache Drill (ver. 0.1, check ver. 0.2)
Apache Drill (ver. 0.1, check ver. 0.2)
 
Private cloud cloud-phoenix-april-2014
Private cloud cloud-phoenix-april-2014Private cloud cloud-phoenix-april-2014
Private cloud cloud-phoenix-april-2014
 
Docker and CloudStack
Docker and CloudStackDocker and CloudStack
Docker and CloudStack
 
OSCON 2013 - Planning an OpenStack Cloud - Tom Fifield
OSCON 2013 - Planning an OpenStack Cloud - Tom FifieldOSCON 2013 - Planning an OpenStack Cloud - Tom Fifield
OSCON 2013 - Planning an OpenStack Cloud - Tom Fifield
 
QNAP NAS training 2016 Q3
QNAP NAS training 2016 Q3QNAP NAS training 2016 Q3
QNAP NAS training 2016 Q3
 
App Deployment on Cloud
App Deployment on CloudApp Deployment on Cloud
App Deployment on Cloud
 
20150425 experimenting with openstack sahara on docker
20150425 experimenting with openstack sahara on docker20150425 experimenting with openstack sahara on docker
20150425 experimenting with openstack sahara on docker
 

Plus de ShapeBlue

Plus de ShapeBlue (20)

CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlueCloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
 
CloudStack Tooling Ecosystem – Kiran Chavala, ShapeBlue
CloudStack Tooling Ecosystem – Kiran Chavala, ShapeBlueCloudStack Tooling Ecosystem – Kiran Chavala, ShapeBlue
CloudStack Tooling Ecosystem – Kiran Chavala, ShapeBlue
 
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
 
VM Migration from VMware to CloudStack and KVM – Suresh Anaparti, ShapeBlue
VM Migration from VMware to CloudStack and KVM – Suresh Anaparti, ShapeBlueVM Migration from VMware to CloudStack and KVM – Suresh Anaparti, ShapeBlue
VM Migration from VMware to CloudStack and KVM – Suresh Anaparti, ShapeBlue
 
How We Grew Up with CloudStack and its Journey – Dilip Singh, DataHub
How We Grew Up with CloudStack and its Journey – Dilip Singh, DataHubHow We Grew Up with CloudStack and its Journey – Dilip Singh, DataHub
How We Grew Up with CloudStack and its Journey – Dilip Singh, DataHub
 
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
 
CloudStack 101: The Best Way to Build Your Private Cloud – Rohit Yadav, VP Ap...
CloudStack 101: The Best Way to Build Your Private Cloud – Rohit Yadav, VP Ap...CloudStack 101: The Best Way to Build Your Private Cloud – Rohit Yadav, VP Ap...
CloudStack 101: The Best Way to Build Your Private Cloud – Rohit Yadav, VP Ap...
 
How We Use CloudStack to Provide Managed Hosting - Swen Brüseke - proIO
How We Use CloudStack to Provide Managed Hosting - Swen Brüseke - proIOHow We Use CloudStack to Provide Managed Hosting - Swen Brüseke - proIO
How We Use CloudStack to Provide Managed Hosting - Swen Brüseke - proIO
 
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
 
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
 
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.OnlineKVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
 
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
 
Use Existing Assets to Build a Powerful In-house Cloud Solution - Magali Perv...
Use Existing Assets to Build a Powerful In-house Cloud Solution - Magali Perv...Use Existing Assets to Build a Powerful In-house Cloud Solution - Magali Perv...
Use Existing Assets to Build a Powerful In-house Cloud Solution - Magali Perv...
 
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...
 
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
 
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
 
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlueElevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
 
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
 
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
 
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
 

Dernier

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Dernier (20)

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Wido den Hollander - building highly available cloud with Ceph and CloudStack

  • 1. Building a highly available cloud with Ceph and Apache CloudStack Building a highly available cloud with Ceph and Apache CloudStack - #CloudStackCephDay
  • 2. Who am I? • Wido den Hollander (1986) • Owner and founder of 42on.com, Ceph Training and Consultancy Company • Co-owner and CTO @ PCextreme B.V. (Dutch hosting company) • Developed the Ceph (RBD) integration for libvirt storage drivers and Apache CloudStack • Wrote PHP and Java bindings for librados • Ceph Board member • Apache CloudStack PMC member Building a highly available cloud with Ceph and Apache CloudStack - #CloudStackCephDay
  • 3. Ceph Being technical and owning a hosting company, you quickly get frustrated with storage. It is expensive, difficult and just annoying. Storage is always the problem. While searching the internet I found this little project called Ceph, at the time still hosted at http://ceph.newdream.net/, and there was this guy Sage publishing some tarballs. At that time, late 2009, it was at version 0.18 and I gave it a try. It worked! I have been involved with the project ever since. I write code, patches and documentation, and I am a member of the board.
  • 4. Apache CloudStack In my role as CTO at PCextreme I first came into contact with CloudStack shortly after Citrix bought cloud.com in 2012. I travelled to the Citrix office in California for their first cloud.com/CloudStack conference, at which they announced the donation to the Apache Software Foundation. I have been working with CloudStack ever since as a committer and PMC member. It is a vital part of my company.
  • 5. PCextreme I started the company from home when I was 16. It has now grown into a company with 30 employees that runs thousands of servers. Dutch hosting company focused on: • Traditional webhosting • Cloud computing
  • 6. Ceph and CloudStack At PCextreme we offer our cloud services under the name Aurora. We offer various products under this umbrella: • Compute (Virtual Machines) • DNS (Intelligent DNS service) • Objects (Amazon S3 compatible service) All services are available over both IPv4 and IPv6.
  • 7. Aurora Compute We offer our Compute service in two different flavors: • Stamina • Agile Both are managed by Apache CloudStack (KVM hypervisor), but only Stamina uses Ceph. Our services are offered from different locations in the world:
  • 8. • Amsterdam (main location with 3 datacenters) • Barcelona • Miami • Tokyo • Los Angeles • Antwerp (Soon) • Frankfurt (Soonish)
  • 9. Compute: Stamina Virtual Machines running on Stamina run in Highly Available mode by using Ceph with 3x data replication for their storage. Should a hypervisor fail, CloudStack will start the VM on a different hypervisor. In case of a failure of a Ceph node (or disk), the high availability of Ceph makes sure all data stays available and is safely stored.
  • 10. Compute: Stamina
  • 11. Compute: Agile Agile Virtual Machines are deployed on hypervisors with local SSD storage (Samsung PM863a SATA and Samsung PM963 NVMe) and are not Highly Available. There is no RAID nor any form of data replication for these Virtual Machines. This offering is for customers who build their applications to be cloud-aware and can afford to lose a Virtual Machine.
  • 12. Compute: Agile
  • 13. Apache CloudStack For those not familiar with CloudStack and its terminology, a quick introduction: CloudStack uses the concept of a Management Server (or a cluster of them) which manages an Agent running on a KVM, Xen, VMware or Hyper-V hypervisor. These Agents manage Primary Storage (virtual disks) and Secondary Storage (templates and snapshots). In addition there are Console Proxy, Secondary Storage and Virtual Router Virtual Machines, which are all under control of the Management Server(s).
  • 14. Ceph and CloudStack When we first started using CloudStack (early 2013) there was no integration between CloudStack and Ceph's block device, RBD. I wrote the integration in libvirt and the Java bindings in order to support Ceph storage for KVM in CloudStack. Let's say that the road to our current situation was an interesting one. We learned a few lessons: • Do not use HDDs behind VMs. Customers will always complain about performance. • Apply I/O limitations to all customers. A small group of customers will eat all IOps. • KISS! Overcomplicating things resulted in hiccups because we tried to make it too fancy. • Failure domains! Yes, large environments are financially very attractive, but they will go down at some point and all customers start calling.
  • 15. HDDs vs SSDs in the cloud At first we focused on a low price per GB as we thought that would attract customers. And it did. We soon learned that customers expect unlimited IOps with the gigabytes they are provided. HDDs are slow, very, very slow compared to SSDs. Although SSDs are more expensive per GB, they are very cheap for the IOps they provide. We found out that the latter is actually what matters most. Even if you use SSDs for storage you still need to apply I/O limitations to all virtual disks provided to Virtual Machines. Customers will use all the IOps they can get. People will run very I/O intensive applications like databases on their Virtual Machines. For every GB a customer buys we allocate X amount of IOps to that disk. CloudStack supports I/O limitations on virtual disks, which are enforced by libvirt/KVM.
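The "X IOps per GB" rule from this slide can be sketched in a few lines of Python. This is only an illustration: the talk does not say what X is, so the ratio is a parameter the caller must supply, and both the `minimum_iops` floor and the example ratio of 10 are assumptions made for the sketch, not values PCextreme uses.

```python
def iops_limit(disk_size_gb: int, iops_per_gb: int, minimum_iops: int = 0) -> int:
    """Return the IOPS cap for a virtual disk, scaled linearly with its size.

    iops_per_gb is the "X" from the slide; its real value is not given in the
    talk. minimum_iops is a hypothetical floor so tiny disks stay usable.
    """
    if disk_size_gb <= 0:
        raise ValueError("disk size must be positive")
    if iops_per_gb <= 0:
        raise ValueError("IOPS per GB must be positive")
    return max(minimum_iops, disk_size_gb * iops_per_gb)


# Placeholder ratio purely for illustration (assumption, not from the talk).
EXAMPLE_IOPS_PER_GB = 10

print(iops_limit(50, EXAMPLE_IOPS_PER_GB))    # 500
print(iops_limit(200, EXAMPLE_IOPS_PER_GB))   # 2000
```

The resulting number would then be configured as the I/O limit on the virtual disk, which, per the slide, libvirt/KVM enforces. Tying the cap to disk size means a customer who needs more IOps buys a larger disk, so the allocation stays predictable for the whole cluster.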
  • 16. Failure domains Both Ceph and CloudStack scale to very large sizes. Ceph is able to manage thousands of disks, petabytes of storage and even more. CloudStack can handle hundreds of hypervisors and tens of thousands of Virtual Machines. What we learned the hard way, however, is that things will fail at some point. Human failure, software bugs, a combination of them or just bad luck. Murphy will come along at some point. We now build moderately sized cloud environments which we glue together in our Control Panel. The customer is not aware of this abstraction, as our Control Panel talks to different CloudStack Management servers.
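The idea of gluing several moderately sized environments together behind one Control Panel can be sketched as follows. This is a minimal illustration and not PCextreme's actual Control Panel: the class names, island names, VM ids and URLs are all hypothetical, and no real CloudStack API call is made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudIsland:
    """One moderately sized, independent CloudStack + Ceph environment."""
    name: str
    management_url: str  # hypothetical endpoint, for illustration only


class ControlPanel:
    """Presents several islands to the customer as a single cloud."""

    def __init__(self, islands):
        self._islands = {island.name: island for island in islands}
        self._vm_location = {}  # VM id -> island name

    def register_vm(self, vm_id: str, island_name: str) -> None:
        if island_name not in self._islands:
            raise KeyError(f"unknown island: {island_name}")
        self._vm_location[vm_id] = island_name

    def island_for(self, vm_id: str) -> CloudIsland:
        """Find which management server is responsible for a VM."""
        return self._islands[self._vm_location[vm_id]]

    def affected_vms(self, failed_island: str):
        """List only the VMs that an outage of one island would touch."""
        return sorted(
            vm for vm, name in self._vm_location.items() if name == failed_island
        )


panel = ControlPanel([
    CloudIsland("island-a", "https://mgmt-a.example.invalid/"),
    CloudIsland("island-b", "https://mgmt-b.example.invalid/"),
])
panel.register_vm("vm-1", "island-a")
panel.register_vm("vm-2", "island-b")
panel.register_vm("vm-3", "island-a")

print(panel.island_for("vm-2").management_url)
print(panel.affected_vms("island-a"))
```

The point of the sketch is the failure-domain property: an outage of one island only shows up for the VMs registered there, while requests for every other VM keep going to a management server that is unaffected.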
  • 17. Control Panel
  • 18. Control Panel
  • 19. Our current design With the lessons learned from running Ceph and CloudStack for over 5 years, we have come up with a design which works really well for us. CloudStack Hypervisors (8 to 12 per cluster): • Dual Intel Xeon CPU • 256 to 512GB of Memory per node • Redundant 10Gbit connection Ceph cluster: • 3 Monitors • 15 to 25 1U nodes with 10 SSDs each • Redundant 10Gbit connection
  • 20. Our current design SuperMicro SYS-1018R-WC0R • Intel Xeon E5-1650 v3 3.50GHz 6-core • 64GB Memory • 64GB SATA-DOM (Operating System) • 10x Samsung PM863a 1.92TB (SATA)
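As a rough back-of-the-envelope check, the numbers on these slides (15 to 25 nodes, 10 SSDs of 1.92TB each, and the 3x replication mentioned for Stamina) can be combined into a capacity estimate. The sketch below assumes every node carries the same drives and uses decimal units; it deliberately ignores BlueStore overhead and the free-space headroom a Ceph cluster needs, so real usable capacity would be lower.

```python
def usable_capacity_gb(nodes: int, ssds_per_node: int, ssd_size_gb: int, replicas: int) -> int:
    """Raw capacity divided by the replica count, ignoring all overhead."""
    raw_gb = nodes * ssds_per_node * ssd_size_gb
    return raw_gb // replicas


SSD_SIZE_GB = 1920    # 1.92TB Samsung PM863a, from the slide
SSDS_PER_NODE = 10    # from the slide
REPLICAS = 3          # 3x data replication, from the Stamina slide

for nodes in (15, 25):
    raw_tb = nodes * SSDS_PER_NODE * SSD_SIZE_GB / 1000
    usable_tb = usable_capacity_gb(nodes, SSDS_PER_NODE, SSD_SIZE_GB, REPLICAS) / 1000
    print(f"{nodes} nodes: {raw_tb:.0f} TB raw, {usable_tb:.0f} TB usable before overhead")
```

Under these assumptions a single cluster lands between roughly 96 TB and 160 TB of replicated capacity, which gives a sense of how large one "island" is.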
  • 21. Our current design All hardware (SuperMicro) is installed in a single rack with two 10Gbit (Arista) Top-of-Rack switches which also handle the gateways (IPv4 and IPv6). This allows us to run thousands of Virtual Machines in a single 19" rack with a redundant 32A/7kW power supply. Scaling larger is technically possible, but by confining things to smaller islands we are able to spread risks and reduce the load on our staff in case of an outage.
  • 22. SSDs Our Ceph clusters all run on Samsung PM-series SSDs. Most of our clusters run on SATA with the PM863a SSDs, but we recently ordered our first batch of hardware with the PM963 NVMe SSDs. The hardware has been installed, but we have not been able to run any tests on it yet. We do expect a rather large performance improvement going from SATA to NVMe. Overall the PM-series has performed very, very well for us. The failure rate is <1% and we haven't seen any of them wear out while they are being used heavily in a 24/7 application. As we are very satisfied with the performance and price the Samsung PM-series provides, we have not looked at other vendors.
  • 23. Our new design SuperMicro SYS-1029U-TN10RT • Dual Intel Xeon Silver 4110 Processor 8-core 2.10GHz • 96GB Memory • 128GB SATA-DOM (Operating System) • 10x Samsung PM963 1.92TB (NVMe)
  • 24. Software Our complete infrastructure runs on Ubuntu LTS versions: • Ubuntu 14.04 • Ubuntu 16.04 CloudStack and Ceph: • CloudStack 4.9 and 4.10 • Ceph 12.2.X Luminous Release with BlueStore
  • 25. Ceph BlueStore Traditionally Ceph eventually stored all data on an XFS filesystem running underneath the OSD. This backend is called FileStore. It had some serious performance implications as well as some scaling problems. With the release of Ceph Luminous (12.2.X) a new storage backend, BlueStore, was introduced, which provides: • More performance by consuming a raw block device • CRC checking on all data • Compression In the last few weeks we migrated all our data (~3PB) from FileStore to BlueStore without any downtime. We re-installed all machines from Ubuntu 14.04 to 16.04 and at the same time wiped all OSDs running with FileStore and migrated them to BlueStore. Looking at statistics we see a latency improvement (lower IO-wait) inside Virtual Machines of about 20% overall.
  • 26. IPv6 All our services are IPv6-ready and enabled. This means that each Virtual Machine deployed on our infrastructure obtains both an IPv4 and an IPv6 address. An interesting fact is that internally Ceph and CloudStack are IPv6-only. Our hypervisors and Ceph nodes only have an IPv6 address over which they are managed and controlled.
  • 27. Conclusion We manage tens of thousands of Virtual Machines running with Ceph and CloudStack and we think it is a winning and golden combination: • Open Source • Easy to access projects • Easy to deploy and maintain • Stable and fast
  • 28. Thank you! Thanks for listening, questions? Contact me: • Github: wido • Twitter: @widodh • LinkedIn: widodh • E-Mail: wido@pcextreme.nl