How We Scaled Freshdesk to
Handle 150M Requests/Week
Kiran Darisi
Director, Technical Operations at Freshdesk
Our customer base grew by 400% and the number of requests
per week boomed from 10 million to 65 million in a year (2013).
Cool for a 3-year-old startup?
Not from an engineering perspective.
We used a bunch of methods to scale vertically in a really
short amount of time.
Sure, we eventually had to shard our databases, but some
of these techniques helped us stay afloat, for quite a while.
MOORE’S WAY
Increasing the RAM, CPU and I/O
We upgraded from a first-generation Amazon EC2 Medium
instance to a High Memory Quadruple Extra Large instance
(thus increasing our RAM from 3.75 GB to 64 GB).
But the RAM and CPU cycles we added did not bring a
proportional increase in the workload we got out of the
instance. So we stayed put at 64 GB.
THE READ/WRITE SPLIT
Using MySQL replication and distributing
the reads between master and slave
R/W split increased the number of I/Os
performed on our databases, but it didn’t
do much for write performance.
We assigned a dedicated role to each
slave, because using a round-robin
algorithm to select different slaves for
different queries proved ineffective.
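The idea of giving each slave a dedicated role, rather than picking one round-robin, can be sketched as below. This is a minimal illustration, not Freshdesk's code: the host names and the role names ("reports", "search") are hypothetical, since the slides do not say which roles were used.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaRouter:
    """Sends writes to the master and reads to the slave reserved for a role."""

    master: str
    # Hypothetical mapping of role name to slave host.
    slaves_by_role: dict[str, str] = field(default_factory=dict)

    def pick(self, is_write: bool, role: str = "default") -> str:
        # Every write goes to the master so replication flows one way only.
        if is_write:
            return self.master
        # Reads go to the slave dedicated to this role; a role with no
        # dedicated slave falls back to the master.
        return self.slaves_by_role.get(role, self.master)


router = ReplicaRouter(
    master="db-master",
    slaves_by_role={"reports": "db-slave-1", "search": "db-slave-2"},
)
print(router.pick(is_write=False, role="reports"))
print(router.pick(is_write=True, role="reports"))
```

Because a given kind of query always lands on the same slave, that slave's buffer cache stays warm for that workload, which is one plausible reason dedicated roles beat round-robin selection.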
MYSQL PARTITIONING
Using the MySQL 5 built-in
partitioning capability.
We chose the partition key and the number
of partitions, and the table was partitioned
automatically.
Post-partitioning, our read performance
increased dramatically but again, the write
performance was a problem.
Things to keep in mind while performing MySQL partitioning
1. Choose the partition key carefully or alter the current schema to
follow the MySQL partition rules.
2. The number of partitions you start with will affect the I/O
operations on the disk directly.
3. If you use a hash-based algorithm with hash-based keys, you
cannot control who goes where. This means you’ll be in trouble if
two or more noisy customers fall within the same partition.
4. Make sure that every query contains the MySQL partition key. A
query without the partition key ends up scanning all the
partitions. Performance is sure to take a dive.
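Points 3 and 4 above can be illustrated with a toy model of a hash-partitioned table. This is only a sketch: the partition count, the `customer_id` key, and the class name are assumptions for the example. MySQL's `PARTITION BY HASH` does place a row by the modulus of an integer expression, which is what `partition_for` imitates.

```python
NUM_PARTITIONS = 4  # assumed value, chosen only for the illustration


def partition_for(customer_id: int) -> int:
    # Hash partitioning places a row by the modulus of the key expression.
    return customer_id % NUM_PARTITIONS


class PartitionedTickets:
    """Toy stand-in for a table partitioned by customer_id."""

    def __init__(self) -> None:
        self.partitions = [[] for _ in range(NUM_PARTITIONS)]
        self.partitions_scanned = 0

    def insert(self, customer_id: int, subject: str) -> None:
        self.partitions[partition_for(customer_id)].append((customer_id, subject))

    def by_customer(self, customer_id: int) -> list[str]:
        # The query carries the partition key, so one partition is enough.
        self.partitions_scanned = 1
        rows = self.partitions[partition_for(customer_id)]
        return [subject for cid, subject in rows if cid == customer_id]

    def by_subject(self, subject: str) -> list[int]:
        # No partition key in the query: every partition has to be scanned.
        self.partitions_scanned = NUM_PARTITIONS
        return [cid for rows in self.partitions for cid, s in rows if s == subject]


tickets = PartitionedTickets()
tickets.insert(3, "login fails")
tickets.insert(7, "login fails")
tickets.insert(4, "refund")
```

Customers 3 and 7 both hash to the same partition here, which is the "noisy neighbours" problem from point 3: with a hash key there is no way to steer one of them elsewhere.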
CACHING
Caching objects that rarely
change in their lifetime
We cached ActiveRecord objects as well as
HTML partials (bits and pieces of HTML) using
Memcached.
We chose Memcached because it scales well
with multiple clusters. The Memcached client
used makes a lot of difference in response
time so we went with dalli.
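The caching pattern described above is commonly implemented as cache-aside: look in the cache first, load from the database on a miss, and drop the entry when the object changes. The sketch below uses an in-process dictionary as a stand-in for Memcached and a hypothetical `load_account` function as a stand-in for the database read; it does not use dalli or any real Memcached client.

```python
from typing import Any, Callable


class CacheAside:
    """Cache-aside over a dict; the dict stands in for a Memcached cluster."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._store: dict[str, Any] = {}
        self._loader = loader
        self.misses = 0

    def fetch(self, key: str) -> Any:
        # On a miss, load from the source of truth and remember the result.
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._loader(key)
        return self._store[key]

    def invalidate(self, key: str) -> None:
        # Call this when the underlying record changes.
        self._store.pop(key, None)


def load_account(key: str) -> dict:
    # Hypothetical database read for a rarely changing object.
    return {"key": key, "plan": "estate"}


cache = CacheAside(load_account)
cache.fetch("account:42")
cache.fetch("account:42")  # served from the cache, no second load
```

This pays off only for objects that rarely change, as the slide says: every write forces an invalidation, so frequently updated records would mostly miss.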
DISTRIBUTED FUNCTIONS
Keeping response time low by
using different storage engines for
different purposes
We started using Amazon RedShift for
analytics and data mining, and Redis to
store state information and background
jobs for Resque.
But because Redis can’t scale or fall back,
we don’t use it for atomic operations.
But scaling vertically can only get you so far.
We decided that scaling horizontally by sharding was
the only cost-effective way to increase write scalability
beyond the instance size.
Two main concerns we had before we took the final call
on sharding:
1. No distributed transactions – We wanted all tenant
details to be in one shard.
2. Rebalancing the shards should be easy – We wanted
control over which tenant sits in which shard and to
be able to move them around when needed.
A little research showed us that directory-based
sharding was the only way to go.
REASONS FOR CHOOSING DIRECTORY-BASED SHARDING
It is simpler than hash key-based
or range-based sharding.
Rebalancing shards is easier here
than in other methods.
A typical directory entry looks like this:

tenant_info         shard_details   shard_status
Stark Industries    shard1          Read & Write

• tenant_info - unique key referring to the DB entry
• shard_details - shard in which that tenant exists
• shard_status - tells what kind of activity the tenant is ready for (we have
multiple shard statuses like Not Ready, Only Reads, Read & Write etc.)
How directory lookups work
The API wrapper is tuned to
accept the tenant
information in multiple
forms like tenant URL,
tenant ID etc.
The sharding API even
acts as a unique ID
generator so that the
tenant ID generated is
unique across shards.
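A directory with these two properties (lookup by tenant URL or tenant ID, and tenant IDs that are unique across shards because one place issues them) might look like the sketch below. The class, method names, and URLs are hypothetical; the slides do not show the actual sharding API.

```python
from dataclasses import dataclass
from itertools import count
from typing import Union


@dataclass
class DirectoryEntry:
    tenant_id: int
    shard: str
    status: str


class ShardDirectory:
    """Central directory: issues tenant IDs and maps tenants to shards."""

    def __init__(self) -> None:
        # A single counter owned by the directory keeps IDs unique
        # no matter which shard the tenant's data lives in.
        self._next_id = count(1)
        self._by_id: dict[int, DirectoryEntry] = {}
        self._id_by_url: dict[str, int] = {}

    def register(self, url: str, shard: str, status: str = "Read & Write") -> int:
        tenant_id = next(self._next_id)
        self._by_id[tenant_id] = DirectoryEntry(tenant_id, shard, status)
        self._id_by_url[url] = tenant_id
        return tenant_id

    def lookup(self, tenant: Union[int, str]) -> DirectoryEntry:
        # Accept either a tenant ID (int) or a tenant URL (str).
        tenant_id = tenant if isinstance(tenant, int) else self._id_by_url[tenant]
        return self._by_id[tenant_id]


directory = ShardDirectory()
stark_id = directory.register("stark.example.com", "shard1")
wayne_id = directory.register("wayne.example.com", "shard2")
```

Every request resolves its tenant through `lookup` before touching a database, which is what makes it possible to move a tenant later by editing one directory row.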
Why we care about rebalancing
Sometimes a customer grows from processing 1,000 tickets per day
to 10,000 tickets per day. This will affect the performance of the
whole shard.
We can’t solve this by splitting up customer data into multiple
shards because we didn’t want the mess of distributed transactions.
So, in these cases, we’d move the noisy customer to a shard of its
own. That way, everybody wins.
Steps to Rebalance a Shard

1. Every shard will have its own slave to scale the reads.
For example, say Wayne Enterprises and Stark
Industries are in shard1.
The directory entry looks like this:

Wayne Enterprises   shard1   Read & Write
Stark Industries    shard1   Read & Write

2. If Wayne Enterprises grows at a breakneck
pace, we would decide to move it to
another shard
(averting the danger of Bruce Wayne and
Tony Stark being mad at us at the same time).

3. So we would boot up a new slave to shard1 and call it
shard2. Then, we’d attach a read replica to the new
slave and wait for it to sync with the master.

4. We would then stop the writes for Wayne Enterprises
by changing the shard status in the directory.

Wayne Enterprises   shard1   Read Only
Stark Industries    shard1   Read & Write

5. Then we would stop the replication of master data in
shard2 and promote it to master.
Now the directory entry will be updated accordingly.

Wayne Enterprises   shard2   Read & Write
Stark Industries    shard1   Read & Write

6. This effectively moves Wayne Enterprises to its own shard.
Batman is happy and so is Iron Man.
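The directory side of the rebalancing steps can be sketched as two status transitions. This is a simplified model under stated assumptions: the dictionary layout and function names are hypothetical, and the replica creation, sync, and promotion from steps 3 and 5 happen in the database layer and appear here only as a comment.

```python
directory = {
    "Wayne Enterprises": {"shard": "shard1", "status": "Read & Write"},
    "Stark Industries": {"shard": "shard1", "status": "Read & Write"},
}


def can_write(tenant: str) -> bool:
    return directory[tenant]["status"] == "Read & Write"


def move_tenant(tenant: str, new_shard: str) -> list[dict]:
    """Apply the directory changes of steps 4 and 5; return each snapshot."""
    snapshots = []

    # Step 4: stop writes for the tenant while it still points at the old shard.
    directory[tenant]["status"] = "Read Only"
    snapshots.append(dict(directory[tenant]))

    # Step 5: at this point the new shard has stopped replicating and has
    # been promoted to master (done outside this sketch), so the tenant
    # can be repointed and writes re-enabled.
    directory[tenant] = {"shard": new_shard, "status": "Read & Write"}
    snapshots.append(dict(directory[tenant]))
    return snapshots


history = move_tenant("Wayne Enterprises", "shard2")
```

Only the moving tenant loses write access, and only between the two snapshots; other tenants on the old shard are untouched throughout.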
Word of caution
1. Don’t do it unless it’s absolutely necessary. You will have to
rewrite code for your whole app, and maintain it.
2. You could use functional partitioning (moving an over-sized table
to another DB altogether) to completely avoid sharding if writes
are not a problem.
3. Choosing the right sharding algorithm is a bit tricky as each has
its own benefits and drawbacks. You need to make a thorough
study of all your requirements while picking one.
4. You will have to take care of the Unique ID generation across
shards.
What’s next for Freshdesk
We get 250,000 tickets across Freshdesk every day and 100 million
queries in the same period (with a peak of 3-4k QPS). We have a
separate shard now for all new sign-ups. And each shard can
roughly carry 20,000 tenants.
In the future, we’d like to explore Multi-pod architecture and also
look at Proxy architecture using MySQL Fabric, Scalebase etc.
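To put the daily query figure and the peak QPS figure on the same scale, the arithmetic can be worked as below. The only inputs are the numbers quoted on the slide; treating the peak as a 3,000 to 4,000 range is a reading of "3-4k".

```python
SECONDS_PER_DAY = 24 * 60 * 60

queries_per_day = 100_000_000       # from the slide
peak_low, peak_high = 3_000, 4_000  # "3-4k QPS" from the slide

# Average rate if the daily load were spread evenly over the day.
average_qps = queries_per_day / SECONDS_PER_DAY

# How far the quoted peak sits above that average.
peak_to_average = (peak_low / average_qps, peak_high / average_qps)

print(f"average: {average_qps:.0f} QPS")
print(f"peak: {peak_to_average[0]:.1f}x to {peak_to_average[1]:.1f}x the average")
```

The average works out to roughly 1,157 QPS, so the quoted peak is about 2.6 to 3.5 times the average load.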
“Behind every slideshare is a
great blogpost”
Read more about scaling Freshdesk here:
http://blog.freshdesk.com/how-freshdesk-scaled-using-sharding/
