Presenter and company intro
Who are we and what do we do?
Migration to OpenNebula and StorPool
To fix our scalability problems we pinpointed
the need for a virtualization layer and distributed
storage. After thorough research we settled on
OpenNebula and StorPool.
What is Inoreader, and what challenges did we
face while building and maintaining it?
We were facing numerous scalability issues
while at the same time we had an array of
servers doing almost nothing, mostly because their
storage was full. At a certain point we hit a brick wall.
If you have any questions, I will gladly answer them.
Some useful takeaways for you.
I have 10+ years of experience in the Telco IT sector, working with large enterprise solutions as well as
building specialized solutions from scratch.
I founded a company called Innologica in 2013 with the mission of developing Next-Gen OSS and
BSS solutions. A side project called Inoreader was born back then, which quickly turned into a leading
platform for content consumption and is now a core product of the company.
Who Are We?
We are not a sweatshop.
We make successful products.
Our customers are all over the world.
We do not push the devs,
but we cherish top talent.
The team is small, but
each member brings great value.
RSS News aggregator and information hub
We have 150k daily active users (DAU) and more than
30k simultaneous sessions at peak times. We are closing in on
1M registered users, with 10k premium subscribers and counting.
15,000,000,000 articles in MySQL and ES
We keep the full archive in enormous MySQL databases
and a separate Elasticsearch cluster just for searching.
Around 20TB of data without the replicas. 10M+ new
articles per day.
1,000,000 feed updates per hour
We need to update our 10+ million feeds in a timely
manner. A lot of machines are dedicated to this task.
40 VMs and 10 physical hosts
The platform currently runs on 40 virtual machines,
mainly in our main DC. There are also some physical
hosts that were not good candidates for virtualization.
The old and the new setup
No more services running
directly on bare-metal.
Smaller footprint: 300% more capacity with
60% of the previous servers,
with room to grow.
Huge compute and storage capacity.
Maintainability is a breeze
Our main drivers to migrate to a fully virtualized setup
We needed to constantly buy new servers just to keep up with the
growing databases, because local storage was quickly being filled
up. We were using expensive RAID cards and RAID-10 setups for all
databases. Those servers never used more than 10% of their CPUs,
so it was a complete waste of resources.
Not so common but always hair-pulling
All components are bound to fail. Whenever we lost a server, there
was always at least some service disruption, if not a whole outage.
All databases needed replication, which skyrocketed server
costs and didn't provide automatic HA. If a hard drive fails in a
RAID-10 setup, you need to replace it ASAP, and bigger drives are
more prone to errors while rebuilding.
Large databases on RAID-10 are slow to recover from crashes, so
replicas should be carefully set up and should run on identical
(expensive) hardware in case one needs to be promoted to a master.
Nobody likes to go to a DC on Saturday to replace a failed drive,
reinstall OS and rotate replications. We much prefer to ride bikes!
We chose to virtualize everything using
OpenNebula + StorPool
Nov 2017 – Jan 2018
We knew for quite a while
that we needed a solution to
the growth problem.
PLANNING AND FIRST TESTS
While the hardware was in
transit we took our time to
learn OpenNebula and test
it as much as possible
We finally migrated
our last server, and all VMs
were happily running on
OpenNebula and StorPool.
CHOOSING A SOLUTION
We held some meetings
with vendors and evaluated solutions.
We migrated all
servers through several
iterations, which are
described in more detail later.
We chose three standard SuperMicro SC836 3U servers.
As recommended by StorPool, we chose the Quanta LB8 for
the 10G network and the Quanta LB4-M for the Gigabit network.
We reused our old servers, but upgraded their CPUs.
10G LAN cards and cables
StorPool recommends using commodity hardware. Supermicro
offers a good platform without vendor-specific requirements for RAID
cards, etc., and is very budget friendly.
• Supermicro CSE-836B chassis
• Supermicro X10SRL-F motherboard
• 1x Intel Xeon E5-1620 v4 CPU (8 threads @ 3.5GHz)
• 64GB DDR4-2666 RAM
• Avago 3108L RAID controller with 2G cache
• Intel X520-DA2 10G Ethernet card
• 8x 4TB HDD LFF SATA3 7200 RPM
• 8x 2TB HDD LFF SATA3 7200 RPM (reused from older servers)
Around 3300 EUR per server
Gigabit Network – Quanta LB4M
We were struggling with some old TP-Link SG2424 switches that we
wanted to upgrade, so we used the opportunity to upgrade the
regular 1G network too. We chose the Quanta LB4M.
• 48x Gigabit RJ45 ports
• 2x 10G SFP+ ports
• Redundant power supplies
• Very cheap!
• EOL – You might want to stack up some spare switches!
• Stable (4 months without a single flop for now)
Around 250 EUR per switch from eBay.
10G Network – Quanta LB8
Again on StorPool's recommendation, we procured three Quanta
LB8 switches. They seem to be performing great so far.
• 48x 10G SFP+ ports
• Redundant power supplies
• Very cheap for what they offer!
• EOL – You might want to stack up some spare switches!
• Stable (4 months without a single flop for now)
700-1000 EUR per switch from eBay including customs taxes.
We have reused our old servers, but with some significant upgrades.
We currently have 12 hypervisors with the following configuration:
• Supermicro 1U chassis with X9DRW motherboards
• 2x Intel Xeon E5-2650 v2 CPU (32 total threads)
• Dual power supply
• 128G DDR3 12800R Memory
• Intel X520-DA2 10G card
• 2x HDD in mdraid for the OS only
We rented a new rack in our colocation center, since we didn't
have any more space available in the old rack.
The idea was simple – Deploy StorPool in the new rack only and
gradually migrate hypervisors.
The servers landed in our office in late January.
It was Friday afternoon, but we quickly installed them in the lab and
let the StorPool guys do their magic over the weekend.
The next Monday StorPool finished all tests and the equipment was
ready to be installed in our DC.
Fast forward several hours and we had our first StorPool cluster up
and running. Still no hypervisors: StorPool needed to perform a full
cluster check in the real environment to see if everything worked well.
The very next day we installed our first hypervisors – the temporary
ones that were holding VMs installed during our test period. Those
VMs were still running on local storage and NFS.
The next step was to migrate them to StorPool.
VM Migration to StorPool
01. Shut down the VM — use Sunstone or the CLI to
shut down the VM.
02. Create StorPool volumes — on the host, use the StorPool CLI
to create volume(s) for the VM
with the exact size of the originals.
03. Copy the volumes — use dd or qemu-img convert for raw
and qcow2 images respectively
to copy the images to the StorPool volumes.
04. Reattach images — detach the local images and attach the
StorPool ones. Mind the order.
There's a catch with large images.*
05. Power up the VM —
check that the VM boots properly.
We're not done yet…
06. Finalize the migration — to fully migrate persistent VMs, use
the Recover -> delete-recreate
function to redeploy all files to StorPool.
*Large images (100G+) take forever to detach on slow local storage, so we had to kill the cp process and use the onevm recover success
option to lie to OpenNebula that the detach actually completed. This is risky but saves a LOT of downtime.
After all VMs are migrated, you can delete the old system and image datastores and leave only the StorPool datastores.
At this point we are completely on StorPool!
StorPool helps their customers with this step, but here’s the summary of what we did.
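The steps above can be sketched as a small shell script. This is only an illustrative dry run: the VM ID, image path, volume name and size are made up, and the run() wrapper prints each command instead of executing it — verify the exact StorPool CLI syntax for your deployment before doing anything for real.

```shell
#!/bin/sh
# Dry-run sketch of the per-VM migration steps above.
# All names and sizes are hypothetical; run() echoes instead of executing.
run() { echo "+ $*"; }

VMID=42                                     # hypothetical VM ID
IMG=/var/lib/one/datastores/0/$VMID/disk.0  # hypothetical local image path

# 01: shut down the VM (Sunstone works too)
run onevm poweroff "$VMID"

# 02: create a StorPool volume with the exact size of the original image
run storpool volume vm42-disk0 create size 100G

# 03: copy the image — dd for raw images, qemu-img convert for qcow2
run dd if="$IMG" of=/dev/storpool/vm42-disk0 bs=1M
run qemu-img convert -O raw "$IMG" /dev/storpool/vm42-disk0

# 04-06: detach the local images, attach the StorPool ones, power up,
# then Recover -> delete-recreate for persistent VMs (Sunstone or CLI).
```

The echo wrapper makes it safe to read through the whole flow before replacing run() with real execution one step at a time.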
From here on we had several iterations that consisted of roughly the following steps:
• Create a list of servers for migration. The more hypervisors we have, the
more servers we can move in a single iteration.
• Create VMs and migrate the services there.
• Use the opportunity to untangle microservices running on the same machines.
• Make sure the servers are completely drained of any services.
• Shut down the servers and plan a visit to the DC the next day.
• Continue on the next slide…
Install 10G card and smaller HDDs and reinstall OS
Install the servers in the new rack and hand over to StorPool
RINSE AND REPEAT
At each iteration we moved more servers at
once, because we had more capacity for the migrated VMs.
At the end we have achieved 3x capacity boost in terms of
processing power and memory with just a fraction of our previous
servers, because with virtualization we can distribute the resources
however we’d like. In terms of storage we are on a completely
different level, since we are no longer restricted to a single machine’s
capacity; we have 3x redundancy and all the performance we need.
We did it!
A glimpse at our OpenNebula dashboard.
336 CPU cores and 1.2TB of RAM in just 12 hypervisors.
All hypervisors are nicely balanced using the default scheduling policy.
There’s always enough room to move VMs around in case a
hypervisor crashes or if we need to reboot a host.
Optimize CPU for homogeneous clusters
Available as a template setting since OpenNebula 5.4.6. Set it to host-passthrough.
This option presents the real CPU model to the VMs instead of the
default QEMU CPU. It can substantially increase performance,
especially if instructions like aes are needed.
Do not use it if you have different CPU models across the cluster,
since it will cause VMs to crash after live migration.
For older OpenNebula setups, set this as RAW DATA in the template.
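As a sketch, the two variants look roughly like this in the VM template — the CPU_MODEL attribute for newer versions, and a libvirt RAW snippet for older setups. Treat the exact syntax as an assumption and verify it against your OpenNebula version:

```
CPU_MODEL = [ MODEL = "host-passthrough" ]

RAW = [ TYPE = "kvm",
        DATA = "<cpu mode='host-passthrough'/>" ]
```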
Beware of mkfs.xfs on large StorPool volumes inside VMs
We noticed that when doing mkfs.xfs on large StorPool volumes
(e.g. 4TB) there was a big delay before the command completed.
What’s worse is that during this time all VMs on this host starve for
IO, because the storpool_block.bin process is using 100% CPU.
The image shown on the left is for a 1TB volume.
The reason is that mkfs uses TRIM (discard) by default and the StorPool
driver honors it.
To remedy this, use the -K option for mkfs.xfs or -E nodiscard for mkfs.ext4:
• mkfs.xfs -K /dev/sdb1
• mkfs.ext4 -E nodiscard /dev/sdb1
Use the 10G network for OpenNebula too
This is probably an obvious one, but it deserves to be mentioned. By
default your hosts will probably resolve each other via the regular Gigabit
network. Forcing them to talk over the 10G storage network will
drastically improve live VM migration. The migration is not IO
bound, so it will completely saturate the network.
Usually a simple /etc/hosts modification.
Consult with StorPool for your specific use case before doing that.
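For illustration, this can be as simple as pinning the hypervisor hostnames to their 10G storage-network IPs in /etc/hosts on every host (the addresses and names below are made up):

```
# /etc/hosts - resolve peer hypervisors via the 10G storage network
10.10.10.11  hv1
10.10.10.12  hv2
10.10.10.13  hv3
```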
Live migrating a VM with 8G of RAM takes 7 seconds on 10G. The
same VM takes about 1.5 minutes on a Gigabit network and will
probably disturb VM communications if the network is saturated.
Live migration on highly loaded VMs can take significantly longer
and should be monitored. In some cases it’s enough to stop busy
services for just a second for the migration to complete.
Those are the more obvious ones that probably everyone uses in
production, but still worth mentioning.
• Use cache=none, io=native when attaching volumes
• Use virtio networking instead of the default rtl8139 NIC. The latter
has performance issues and drops packets when host IO is high
• Measure IO latency instead of IO load to judge saturation. We
have several machines with constant 99% IO load which are
doing perfectly fine.
DISK = [ driver = "raw" , cache = "none", io = "native",
discard = "unmap", bus = "scsi" ]
NIC = [ filter = "clean-traffic", model="virtio" ]
We have adapted the OpenNebula Dashboards with
Graphite and Grafana scripts by Sebastian Mangelkramer
and used them to create our own Grafana dashboards so
we can see at a glance which hypervisors are most loaded
and how much overall capacity we have.
Grafana TV Dashboard
Why not have a master dashboard on the TV at the office? This
gives our team a very quick and easy way to tell if everything is
fine. If all you see is green, we’re good.
This dashboard shows our main DC on the first row, our backup DC
on the second, and then some other critical aspects of our system.
It’s still a WIP, hence the empty space.
At the top is our Geckoboard that we use for more business KPIs.
Server Power Usage in Grafana
Part of our virtualization project was to optimize the
electricity bill by using fewer servers. We were able to easily
measure our power usage with Graphite and Grafana.
If you are interested, the script for getting the data into
Graphite is here:
The Grafana Dashboard can be found here:
Obviously you will need to tweak it, especially the formula
for the power bill.
StorPool were kind enough to give us access to their own
Grafana instance, where they collect a lot of internal data
about the system and KPIs. It gives us great insights that
we couldn’t get otherwise, so we can plan and estimate the
system load very well.
We are currently using only an HDD pool, but we could
benefit from a smaller SSD pool for picky MySQL databases.
Add more hypervisors
As the service grows, our needs will too. We will probably
have enough rack space for the years to come.
Add more StorPool nodes
We have maxed out the HDD bays on our current
nodes, so we’ll probably need to add more nodes in the future.
Upgrade StorPool nodes to 40G
Currently the nodes use 2x10G ports, like the
hypervisors. After adding an SSD pool we are
considering upgrading to 40G.
THANK YOU !
READ MORE ON
GET THIS PRESENTATION FROM ino.to/one-