Talk we gave at OSCON 2015 in Portland, OR, as part of the Open Messaging Day, on how to scale a mailbox management platform with Cyrus and the OpenIO object storage solution.
6. FRANCE: 3 avenue Antoine Pinay, Parc d'Activité des 4 Vents, 59510 Hem
USA: 180 Sansom Street, FI 4, San Francisco, 94104 CA
@OpenIO OpenIO
github.com/open-io
7. Summary
• Cyrus scalability issues: Storage infrastructure challenge
• OpenIO technology: Open source object storage
• Cyrus IMAP 3.0: OpenIO for a scale-out Cyrus
8. Cyrus scalability issues
Share-nothing clusters
• Migrate mailboxes to new clusters as local storage becomes full
• Storage cost with SAS drives
• Ends up with unused CPU/RAM resources on Cyrus servers
9. Be smart with storage
[Chart: Email distribution by size (based on actual data). Number of emails and disk used plotted against email size, with an IOPS zone (SSD drives) and a capacity zone (SATA drives).]
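The split shown on this slide can be sketched as a size-based placement rule. This is only an illustration: the 1 MiB threshold, the pool names and the functions `choose_pool` and `disk_usage_by_pool` are assumptions made for the example, not values from the talk and not part of Cyrus or OpenIO. It also assumes, as the speaker notes suggest, that the few large emails are the ones sent to the capacity zone.

```python
# Hypothetical size-based tiering rule; threshold and pool names are assumptions.
SIZE_THRESHOLD_BYTES = 1024 * 1024  # assumed cut-off between the two zones


def choose_pool(email_size_bytes: int) -> str:
    """Return the pool an email of this size would be placed on."""
    if email_size_bytes < SIZE_THRESHOLD_BYTES:
        return "ssd"   # IOPS zone: the many small emails
    return "sata"      # capacity zone: the few large emails


def disk_usage_by_pool(email_sizes):
    """Sum the bytes each pool would hold for a list of email sizes."""
    usage = {"ssd": 0, "sata": 0}
    for size in email_sizes:
        usage[choose_pool(size)] += size
    return usage


# Two small emails and one large one: the large one dominates disk usage.
print(disk_usage_by_pool([1_000, 2_000, 2 * 1024 * 1024]))
```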
15. Existing technologies
Single name node
• Good for few large files (large web indexes, NoSQL tables, …)
• Bad for numerous small objects (emails, …)
Distributed Hash Tables / Consistent Hashing
• Good for trillions of objects
• Bad because part of the data must be rebalanced when adding capacity
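The rebalancing cost of consistent hashing can be illustrated with a small simulation. This is a generic textbook-style hash ring written for this example, not the implementation of any particular product; the node names, key names and the number of virtual nodes are all assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring point after the key's hash, wrapping around at the end.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_fraction(keys, old_nodes, new_nodes):
    """Fraction of keys whose owning node changes between two node sets."""
    old, new = HashRing(old_nodes), HashRing(new_nodes)
    moved = sum(1 for key in keys if old.node_for(key) != new.node_for(key))
    return moved / len(keys)


keys = [f"email-{i}" for i in range(10_000)]
nodes = [f"node-{i}" for i in range(10)]
fraction = moved_fraction(keys, nodes, nodes + ["node-10"])
print(f"{fraction:.1%} of objects change node after adding one node")
```

Going from 10 to 11 nodes moves about one eleventh of the objects in expectation. That is a small fraction, but on a platform holding petabytes it still means physically copying a large amount of data every time capacity is added.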
16. OpenIO is different
Trillions of objects? Let’s do the math
1,000,000,000,000 objects = 100,000,000 mailboxes x 10,000 emails per mailbox
Do not track objects, track containers!
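The arithmetic on this slide, written out as a short check. The variable names are chosen for the example; the figures are the ones on the slide.

```python
# Figures from the slide.
mailboxes = 100_000_000
emails_per_mailbox = 10_000

total_objects = mailboxes * emails_per_mailbox

# A directory that tracks every object needs one entry per email;
# a directory that tracks containers needs one entry per mailbox.
entries_tracking_objects = total_objects
entries_tracking_containers = mailboxes
reduction_factor = entries_tracking_objects // entries_tracking_containers

print(f"{total_objects:,} objects, {entries_tracking_containers:,} containers")
print(f"directory is {reduction_factor:,}x smaller when tracking containers")
```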
21. Grid & Conscience
Grid of nodes
• Massively distributed
• Each node takes part in directory & storage services
• No SPOF: resilient to node failures
Conscience: on-the-fly best matchmaking
• Collects metrics of each node
• Computes node scores
• Real time asynchronous process
• Advanced load balancing: select the best nodes at a particular time, route requests to them
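The collect, score and select loop described on this slide can be sketched as follows. The slide does not give the scoring formula, so the one used here (idle CPU multiplied by free disk, scaled to 0-100) is purely an assumption, as are the `NodeMetrics` fields and the node names. It should not be read as OpenIO's actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Metrics collected for one node (fields are assumed for this sketch)."""
    name: str
    cpu_idle: float   # fraction of CPU currently idle, 0..1
    disk_free: float  # fraction of disk currently free, 0..1


def score(metrics: NodeMetrics) -> int:
    """Assumed formula: 0-100, dropping to 0 when either resource runs out."""
    return round(100 * metrics.cpu_idle * metrics.disk_free)


def best_nodes(all_metrics, count):
    """Return the names of the `count` highest-scoring nodes right now."""
    ranked = sorted(all_metrics, key=score, reverse=True)
    return [m.name for m in ranked[:count]]


metrics = [
    NodeMetrics("node-a", cpu_idle=0.9, disk_free=0.2),
    NodeMetrics("node-b", cpu_idle=0.5, disk_free=0.8),
    NodeMetrics("node-c", cpu_idle=0.7, disk_free=0.7),
]
print(best_nodes(metrics, 2))
```

With these sample values node-a has plenty of idle CPU but little free disk, so it scores lowest and requests are routed to node-c and node-b instead.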
26. Cyrus 3.0 + OpenIO
Steps
#1 Scale out archive storage (Available)
#2 Bring Conscience to Cyrus Murder (Sept 2015): to provide the best Cyrus IMAP node when a mailbox is created or migrated
#3 Scale-out backend for calendars & contacts (End of 2015): to provide per-user tiny databases spread across the grid of nodes
Laurent Denel - 36
CEO of OpenIO - OpenIO is an open source Object storage solution
growing market - existing solutions, if you look at the market,
are either expensive or complicated to set up and run
we bring one that is very easy to set up and very simple to maintain
I’m glad to be here at OSCON with you guys to speak about OpenIO and the reboot of open messaging with the open messaging team!
keep this logo in mind - a new start-up in object storage
Some important figures about the company
7 happy co-founders - Only tech guys
50 million active users
this timeline shows that we have nearly 10 years of experience in object storage,
reaching the 10+ PB we manage today
Based on this success, we created a fork of this technology this year to give the market the fastest and easiest object storage technology
OpenIO has been designed for email storage, it’s part of our DNA, that’s why we’re here today with the open messaging team
Today it is used also for cloud, video, archives, virtual machines images, etc.
we have 2 offices: headquarters and R&D in France, a business office in SF
we have a github account, open « dash » io
I’ll talk about what we bring to cyrus 3.0
object storage and more
from a previous life, we know Cyrus: independent Cyrus clusters
that’s not scalability, it was just duplication
worse than the 80-20 rule
a small number of emails generates most of the disk usage
let’s archive them on the right drive technology
they do not grow at the same pace
makes sense to separate storage from cyrus server
let’s put openio in the middle, it will act as an abstraction layer
make the storage scale independently from cyrus
what is object storage?
seems like magic, doesn’t it?
Wikipedia definition: https://en.wikipedia.org/wiki/Object_storage
thousands of hard drives means billions/trillions of objects
directory of containers, not objects
on mail platforms: containers map to mailboxes, perfect fit
Grid & Conscience: the brain of the system
to build OpenIO we used 2 key components:
a directory, and the Grid and Conscience system
no consistent hashing - what do we do?
I had an old professor at university who told me this
data is not collocated with containers, containers only hold pointers
so when new object stores are available, use them right away
because moving data is bad
it eats performance
it puts data at risk
rebalance takes weeks at petabyte scale!!!
openio is efficient and saves money and time
what I call scalability
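The idea in these notes, a directory of containers holding pointers so that new stores can be used without moving data, can be sketched like this. The class, method and store names are hypothetical and written for this example; this is not OpenIO's API.

```python
class ContainerDirectory:
    """Toy directory: one entry per container, each holding pointers to data."""

    def __init__(self, stores):
        self.stores = list(stores)
        self.containers = {}  # container name -> list of (store, object name)

    def add_store(self, store):
        # New capacity is usable immediately; no existing pointer changes,
        # so no stored data has to move.
        self.stores.append(store)

    def put(self, container, object_name, pick_store):
        """Record a new object; `pick_store` chooses among current stores."""
        store = pick_store(self.stores)
        self.containers.setdefault(container, []).append((store, object_name))
        return store

    def locate(self, container, object_name):
        """Follow the pointer to the store holding this object."""
        for store, name in self.containers.get(container, []):
            if name == object_name:
                return store
        raise KeyError(object_name)


directory = ContainerDirectory(["store-1", "store-2"])
directory.put("alice", "msg-1", pick_store=lambda stores: stores[0])
directory.add_store("store-3")
directory.put("alice", "msg-2", pick_store=lambda stores: stores[-1])
print(directory.locate("alice", "msg-1"), directory.locate("alice", "msg-2"))
```

The older message stays where it was written and the newer one lands on the store that was just added, which is the opposite of the rebalancing behaviour of a hash-based layout.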
Cyrus 3.0 - The keystone for next open source messaging
What do we bring?
New object store API
a local SATA archive pool is now the default
easier than ever to connect to an object store
we bring native object storage support
openio as scale-out storage for cyrus
OpenIO’s not only object storage
it’s also the solution to scale your application backends
it’s what we call Grid for Apps
you can even use free CPU and RAM on your servers
get the box: it has cyrus 3.0 pre-release and OpenIO