Gimme More!
Supporting User Growth in a
Performant and Efficient Fashion
Arun Kejariwal (@arun_kejariwal), Winston Lee (@winstl)

Capacity Engineering @ Twitter

November 2013


@Twitter 1
User Experience
•  Anytime, Anywhere, Any device
q  5.2 billion mobile users by 2017 [1]
q  More than 10 billion mobile devices/connections by 2017 [1]
q  Worldwide mobile data traffic will reach 11.2 exabytes/month by 2017 (13x increase) [1]

•  Real-time performance








[1] http://newsroom.cisco.com/release/1135354 (Feb. 5, 2013)

@Twitter 2
Capacity Planning: Why bother?
•  Organic growth
q  Over 230M monthly active users [1]

•  User engagement
•  Evolving product landscape
q  Cards, Photos, Vines
§  Mobile video will increase 16-fold between 2012 and 2017 [2]

•  Events planned or unplanned





[1] http://www.sec.gov/Archives/edgar/data/1418091/000119312513400028/d564001ds1a.htm
[2] http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html

@Twitter 3
Approaches to Capacity Planning
•  Throw hardware at the problem
o  How much?
o  What kind? (Inventory management etc.)
o  Operationally inefficient!

•  Reactive approach


Bottomline

Poor UX



@Twitter 4
Systematic Capacity Planning
•  Objectives
q  Check under-allocation
§  Performance
§  Availability
o  Adversely impacts user experience

q  Check over-allocation
§  Operational efficiency
o  Adversely impacts bottom line

•  Determine capacity needed proactively via forecasting
q  Business metrics
q  System resource usage



@Twitter 5
Systematic Capacity Planning: Forecasting
•  Key questions
q  Which data?
§  Raw
§  Periodic Max
§  Moving average

q  Data granularity
§  Minutely
§  Daily
o  Depends

q  Which model?
§  Linear
§  Spline
§  Holt-Winters
§  ARIMA

Non-Trivial!
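
The "Which data?" options above can be illustrated with a minimal sketch. The sample readings, the period of 4 and the window of 4 are hypothetical values chosen for illustration; the slides do not say which settings were used.

```python
from statistics import mean

def periodic_max(series, period):
    """Collapse a series into the maximum of each consecutive block of `period` points."""
    return [max(series[i:i + period]) for i in range(0, len(series), period)]

def moving_average(series, window):
    """Trailing moving average; the output has len(series) - window + 1 points."""
    return [mean(series[i:i + window]) for i in range(len(series) - window + 1)]

# Hypothetical raw readings: 3 periods of 4 samples each.
raw = [10, 12, 30, 11, 13, 15, 14, 40, 16, 18, 17, 19]
peaks = periodic_max(raw, period=4)
smooth = moving_average(raw, window=4)
print(peaks)
print(smooth[:3])
```

Periodic maxima keep the peaks that capacity must absorb, while a moving average damps them; which input is appropriate depends on what the forecast is used for.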



@Twitter 6
Good old Linear Regression
[Figure: Linear Regression based Forecast. Series: Raw Data, Forecast. Adjusted R-squared: 0.6062]
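
As a rough illustration of how a linear-trend forecast and the adjusted R-squared quoted on this slide can be computed, here is a minimal sketch using ordinary least squares against the time index. The sample series is hypothetical and the function names are illustrative; the slides do not show the authors' actual tooling.

```python
def fit_line(y):
    """Ordinary least squares fit of y against t = 0, 1, ..., n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept

def adjusted_r_squared(y, slope, intercept, predictors=1):
    """R-squared penalised for the number of predictors in the model."""
    n = len(y)
    y_mean = sum(y) / n
    ss_res = sum((v - (intercept + slope * t)) ** 2 for t, v in enumerate(y))
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - predictors - 1)

def forecast(slope, intercept, n, horizon):
    """Extrapolate the fitted line for `horizon` steps past the last observation."""
    return [intercept + slope * t for t in range(n, n + horizon)]

# Hypothetical noisy series with an upward trend.
y = [1, 3, 2, 5, 4]
slope, intercept = fit_line(y)
adj_r2 = adjusted_r_squared(y, slope, intercept)
future = forecast(slope, intercept, len(y), horizon=2)
print(slope, intercept, adj_r2, future)
```

A low adjusted R-squared, as on the slide, indicates that a straight line explains only part of the variance, which is one reason to look at models that capture seasonality.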

@Twitter 7
Linear Regression using periodic max
[Figure: Linear Regression Using Maxes based Forecast. Series: Raw Data, Forecast. Adjusted R-squared: 0.5673. Standard Error: 2.45x]

@Twitter 8
Splines
•  Smooth Spline
q  λ: penalty for “wiggliness”
[Figure: Spline based Fitting. Series: Raw Data, Fitted.]
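
For reference, the role of λ can be written down using the standard textbook smoothing-spline objective. The slide does not state the objective itself, so this form and the symbols y_i (observations), t_i (time points) and f (the fitted curve) are assumptions.

```latex
\hat{f} = \arg\min_{f} \; \sum_{i=1}^{n} \bigl(y_i - f(t_i)\bigr)^2
          + \lambda \int \bigl(f''(t)\bigr)^2 \, dt
```

The first term rewards closeness to the data and the second penalises curvature: as λ grows the fit approaches a straight line, and as λ approaches 0 it interpolates the data.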

@Twitter 9
Splines
[Figure: Spline based Forecast. Series: Raw Data, Forecast.]

@Twitter 10
Splines
•  Sensitive to the nature of the time series at the boundary
[Figure: time series annotated with Boundary 1 and Boundary 2.]

@Twitter 11
Splines – Take 2
[Figure: Spline based Forecast (Boundary 1). Series: Raw Data, Forecast. Annotation: 8.31x higher than end of time series]

@Twitter 12
Splines – Take 3
[Figure: Spline based Forecast (Boundary 2). Series: Raw Data, Forecast. Annotation: 3.77x higher than end of time series]

@Twitter 13
Holt-Winters
•  Triple exponential smoothing
q  Estimate of linear trend
q  Seasonal correction factors
[Figure: Holt-Winters based Fitting. Series: Raw Data, Fitted.]
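
A minimal sketch of triple exponential smoothing follows, assuming the additive variant and a simple initialisation from the first two seasons; the slide does not say which variant or initialisation was used. The weekly pattern and the smoothing parameters (alpha, beta, gamma) are hypothetical.

```python
def holt_winters_additive(y, period, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing; returns `horizon` point forecasts."""
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first, second = y[:period], y[period:2 * period]
    # Initial state from the first two seasons (one common convention, assumed here).
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [v - level for v in first]
    for t in range(period, len(y)):
        s = seasonal[t % period]
        prev_level = level
        # Level: deseasonalised observation blended with the previous projection.
        level = alpha * (y[t] - s) + (1 - alpha) * (prev_level + trend)
        # Trend: estimate of the linear trend from successive levels.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonal correction for this position in the cycle.
        seasonal[t % period] = gamma * (y[t] - level) + (1 - gamma) * s
    n = len(y)
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Hypothetical daily metric with a weekly cycle and no growth.
week = [10, 12, 14, 13, 11, 8, 7]
history = week * 3
next_week = holt_winters_additive(history, period=7, alpha=0.5,
                                  beta=0.5, gamma=0.5, horizon=7)
print([round(v, 2) for v in next_week])
```

On this perfectly periodic input the forecast reproduces the weekly pattern; on real data the three parameters are usually tuned by minimising one-step-ahead error.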

@Twitter 14
Holt-Winters
[Figure: Holt-Winters based Forecast. Series: Raw Data, Forecast, Upper 95% CI.]

@Twitter 15
ARIMA
•  Auto-Regressive Integrated Moving Average
q  (p, d, q)
§  p: Autoregressive order
§  d: Integrated order
§  q: Moving Average order
q  Autoregressive component
q  Moving Average component
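
The equation these labels annotated did not survive extraction. As a reference point only, the standard textbook form of an ARIMA(p, d, q) model is given below; the notation (y' for the series after d rounds of differencing, c for a constant, ε for the error term) is assumed rather than taken from the slide.

```latex
y'_t = c
     + \underbrace{\sum_{i=1}^{p} \phi_i \, y'_{t-i}}_{\text{autoregressive component}}
     + \underbrace{\sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}}_{\text{moving average component}}
     + \varepsilon_t
```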

@Twitter 16
ARIMA
•  Fitting

[Figure: Auto ARIMA based Fitting. Series: Raw Data, Fitted.]

@Twitter 17
ARIMA – Take 1
[Figure: ARIMA based Forecast. (p, d, q): (0,1,1)(0,1,1)[7]. Series: Raw Data, Forecast, Upper 95% CI.]
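
In the order (0,1,1)(0,1,1)[7], the two middle 1s denote one round of ordinary differencing and one round of seasonal differencing with a period of 7. The sketch below shows only that differencing step, on a hypothetical daily series built from linear growth plus a weekly pattern; it does not fit the moving-average terms.

```python
def difference(series, lag=1):
    """Return series[i] - series[i - lag] for every i >= lag."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

# Hypothetical daily series: linear growth plus a repeating weekly pattern.
weekly = [5, 3, 0, -2, -4, 1, 6]
series = [100 + 2 * t + weekly[t % 7] for t in range(28)]

seasonal_diff = difference(series, lag=7)      # seasonal differencing, period 7
stationary = difference(seasonal_diff, lag=1)  # ordinary differencing, d = 1
print(seasonal_diff[:3], stationary[:3])
```

Seasonal differencing removes the weekly pattern and leaves the weekly growth (14 per week here); the ordinary difference then removes that constant, leaving a series with no trend or seasonality for the remaining terms to model.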

@Twitter 18
ARIMA – Take 2
[Figure: Auto ARIMA based Forecast. (p, d, q): (1,1,1)(2,0,0)[7]. Series: Raw Data, Forecast, Upper 95% CI.]

@Twitter 19
Impact of Outliers

@Twitter 20
Forecast without outlier
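
The slides do not say how outliers were identified or removed. One common option, shown here purely as an assumed illustration, is the modified z-score based on the median absolute deviation, with flagged points replaced by the median before fitting. The threshold of 3.5 and the sample data are hypothetical.

```python
from statistics import median

def replace_outliers(series, threshold=3.5):
    """Replace points whose modified z-score exceeds `threshold` with the median."""
    med = median(series)
    mad = median([abs(v - med) for v in series])
    if mad == 0:
        # No spread to measure against; leave the data untouched.
        return list(series)
    cleaned = []
    for v in series:
        score = 0.6745 * abs(v - med) / mad
        cleaned.append(med if score > threshold else v)
    return cleaned

# Hypothetical metric with a single spike.
observed = [10, 11, 9, 10, 12, 95, 11, 10]
cleaned = replace_outliers(observed)
print(cleaned)
```

Replacing with the global median is crude for a series with trend or seasonality; a rolling median or interpolation between neighbours would be a more careful choice there.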

@Twitter 21
Good “enough”?

@Twitter 22
Impact of “Corrections”

@Twitter 23
Implications of data characteristics
[Figure: ARIMA based Forecast. Series: Raw Data, Forecast, Upper 95% CI.]

@Twitter 24
Forecast without the boundary case
[Figure: ARIMA based Forecast - Without initial spike. Series: Raw Data, Forecast, Upper 95% CI.]

@Twitter 25
Forecast with truncation
[Figure: ARIMA based Forecast - Truncated and Without initial spike. Series: Raw Data, Forecast, Upper 95% CI.]

@Twitter 26
Lessons learned
•  Data fidelity
q  Anomalies
q  Absence of seasonality

•  Modeling
q  Never perfect
§  Assess forecasting error

q  Continuous refinement
§  Incoming data stream is dynamic
o  Organic growth
o  New products
o  Behavioral aspect
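
One way to "assess forecasting error" is to hold out the most recent observations, forecast them, and compare. The sketch below uses mean absolute percentage error (MAPE) as the metric; the slides do not name a metric, so this choice and the sample numbers are assumptions.

```python
def holdout_split(series, holdout):
    """Split a series into a training part and the last `holdout` points."""
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    return series[:-holdout], series[-holdout:]

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical history; the last three points are held out for evaluation.
history = [80, 90, 95, 100, 200, 400]
train, test = holdout_split(history, holdout=3)
predicted = [110, 180, 400]  # hypothetical forecasts for the held-out points
error = mape(test, predicted)
print(train, test, round(error, 2))
```

Re-running this check as new data arrives is a simple way to support the continuous refinement called for above.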



@Twitter 27
Acknowledgements
•  Capacity Engineering Team
•  Management team

@Twitter 28
Join the Flock

Like problem solving? 

Like challenges? 

Be at the cutting edge

Make an impact

•  We are hiring!!
q  https://twitter.com/JoinTheFlock
q  https://twitter.com/jobs
q  Contact us: @arun_kejariwal, @winstl

@Twitter 29

More Related Content

What's hot

Deep Learning for Public Safety in Chicago and San Francisco
Deep Learning for Public Safety in Chicago and San FranciscoDeep Learning for Public Safety in Chicago and San Francisco
Deep Learning for Public Safety in Chicago and San FranciscoSri Ambati
 
Apache Spark and future of advanced analytics
Apache Spark and future of advanced analyticsApache Spark and future of advanced analytics
Apache Spark and future of advanced analyticsMuralidhar Somisetty
 
Using Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
Using Apache Pulsar to Provide Real-Time IoT Analytics on the EdgeUsing Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
Using Apache Pulsar to Provide Real-Time IoT Analytics on the EdgeDataWorks Summit
 
Data Philly Meetup - Big (Geo) Data
Data Philly Meetup - Big (Geo) DataData Philly Meetup - Big (Geo) Data
Data Philly Meetup - Big (Geo) DataAzavea
 
Event Processing Using Semantic Web Technologies
Event Processing Using Semantic Web TechnologiesEvent Processing Using Semantic Web Technologies
Event Processing Using Semantic Web TechnologiesMikko Rinne
 
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Spark Summit
 
Power of Splunk Search Processing Language (SPL) ...
Power of Splunk Search Processing Language (SPL)                             ...Power of Splunk Search Processing Language (SPL)                             ...
Power of Splunk Search Processing Language (SPL) ...Splunk
 

What's hot (8)

Deep Learning for Public Safety in Chicago and San Francisco
Deep Learning for Public Safety in Chicago and San FranciscoDeep Learning for Public Safety in Chicago and San Francisco
Deep Learning for Public Safety in Chicago and San Francisco
 
Apache Spark and future of advanced analytics
Apache Spark and future of advanced analyticsApache Spark and future of advanced analytics
Apache Spark and future of advanced analytics
 
Using Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
Using Apache Pulsar to Provide Real-Time IoT Analytics on the EdgeUsing Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
Using Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
 
Data Philly Meetup - Big (Geo) Data
Data Philly Meetup - Big (Geo) DataData Philly Meetup - Big (Geo) Data
Data Philly Meetup - Big (Geo) Data
 
Event Processing Using Semantic Web Technologies
Event Processing Using Semantic Web TechnologiesEvent Processing Using Semantic Web Technologies
Event Processing Using Semantic Web Technologies
 
An Analytics Platform for Connected Vehicles
An Analytics Platform for Connected VehiclesAn Analytics Platform for Connected Vehicles
An Analytics Platform for Connected Vehicles
 
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
 
Power of Splunk Search Processing Language (SPL) ...
Power of Splunk Search Processing Language (SPL)                             ...Power of Splunk Search Processing Language (SPL)                             ...
Power of Splunk Search Processing Language (SPL) ...
 

Similar to Gimme More! Supporting User Growth in a Performant and Efficient Fashion

Real Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsReal Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsArun Kejariwal
 
Qualcomm Institute Winter IoT Program - Final Presentation
Qualcomm Institute Winter IoT Program - Final PresentationQualcomm Institute Winter IoT Program - Final Presentation
Qualcomm Institute Winter IoT Program - Final PresentationMookeunJi
 
This is not about Tweeting and Driving
This is not about Tweeting and DrivingThis is not about Tweeting and Driving
This is not about Tweeting and DrivingSylvain Carle
 
Analysing high throughput data in real time
Analysing high throughput data in real timeAnalysing high throughput data in real time
Analysing high throughput data in real timeHotstar
 
MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming Wor...
MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming Wor...MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming Wor...
MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming Wor...MongoDB
 
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)Em Campbell-Pretty
 
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)Context Matters
 
osisoft.ppt
osisoft.pptosisoft.ppt
osisoft.pptIwl Pcu
 
Building Larimer County's Road Event Status System (RESS) (NAGW 2016)
Building Larimer County's Road Event Status System (RESS) (NAGW 2016)Building Larimer County's Road Event Status System (RESS) (NAGW 2016)
Building Larimer County's Road Event Status System (RESS) (NAGW 2016)Gregg Turnbull, CGDSP, CGCIO
 
Interactive Realtime Dashboards on Data Streams using Kafka, Druid and Superset
Interactive Realtime Dashboards on Data Streams using Kafka, Druid and SupersetInteractive Realtime Dashboards on Data Streams using Kafka, Druid and Superset
Interactive Realtime Dashboards on Data Streams using Kafka, Druid and SupersetHortonworks
 
Spark Streaming and IoT by Mike Freedman
Spark Streaming and IoT by Mike FreedmanSpark Streaming and IoT by Mike Freedman
Spark Streaming and IoT by Mike FreedmanSpark Summit
 
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon RedshiftAmazon Web Services
 
AdhearsionConf 2013 Keynote
AdhearsionConf 2013 KeynoteAdhearsionConf 2013 Keynote
AdhearsionConf 2013 KeynoteMojo Lingo
 
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Data Con LA
 
Báo cáo xu hướng sử dụng kỉ thuật số của người tiêu dùng năm 2015
Báo cáo xu hướng sử dụng kỉ thuật số của người tiêu dùng năm 2015Báo cáo xu hướng sử dụng kỉ thuật số của người tiêu dùng năm 2015
Báo cáo xu hướng sử dụng kỉ thuật số của người tiêu dùng năm 2015Bui Thi Quynh Duong
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersVoltDB
 
Data At Pollfish, Dec. 2015, Euangelos Linardos
Data At Pollfish, Dec. 2015, Euangelos LinardosData At Pollfish, Dec. 2015, Euangelos Linardos
Data At Pollfish, Dec. 2015, Euangelos LinardosEuangelos Linardos
 
Data at Pollfish
Data at PollfishData at Pollfish
Data at PollfishPollfish
 

Similar to Gimme More! Supporting User Growth in a Performant and Efficient Fashion (20)

Real Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsReal Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and Systems
 
Qualcomm Institute Winter IoT Program - Final Presentation
Qualcomm Institute Winter IoT Program - Final PresentationQualcomm Institute Winter IoT Program - Final Presentation
Qualcomm Institute Winter IoT Program - Final Presentation
 
Dash Wireframe
Dash WireframeDash Wireframe
Dash Wireframe
 
This is not about Tweeting and Driving
This is not about Tweeting and DrivingThis is not about Tweeting and Driving
This is not about Tweeting and Driving
 
Analysing high throughput data in real time
Analysing high throughput data in real timeAnalysing high throughput data in real time
Analysing high throughput data in real time
 
MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming Wor...
MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming Wor...MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming Wor...
MongoDB World 2016: Scaling Targeted Notifications in the Music Streaming Wor...
 
Institutional GIS
Institutional GISInstitutional GIS
Institutional GIS
 
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
 
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
Scaling Agile Data Warehousing with the Scaled Agile Framework (SAFe)
 
osisoft.ppt
osisoft.pptosisoft.ppt
osisoft.ppt
 
Building Larimer County's Road Event Status System (RESS) (NAGW 2016)
Building Larimer County's Road Event Status System (RESS) (NAGW 2016)Building Larimer County's Road Event Status System (RESS) (NAGW 2016)
Building Larimer County's Road Event Status System (RESS) (NAGW 2016)
 
Interactive Realtime Dashboards on Data Streams using Kafka, Druid and Superset
Interactive Realtime Dashboards on Data Streams using Kafka, Druid and SupersetInteractive Realtime Dashboards on Data Streams using Kafka, Druid and Superset
Interactive Realtime Dashboards on Data Streams using Kafka, Druid and Superset
 
Spark Streaming and IoT by Mike Freedman
Spark Streaming and IoT by Mike FreedmanSpark Streaming and IoT by Mike Freedman
Spark Streaming and IoT by Mike Freedman
 
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
 
AdhearsionConf 2013 Keynote
AdhearsionConf 2013 KeynoteAdhearsionConf 2013 Keynote
AdhearsionConf 2013 Keynote
 
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
 
Báo cáo xu hướng sử dụng kỉ thuật số của người tiêu dùng năm 2015
Báo cáo xu hướng sử dụng kỉ thuật số của người tiêu dùng năm 2015Báo cáo xu hướng sử dụng kỉ thuật số của người tiêu dùng năm 2015
Báo cáo xu hướng sử dụng kỉ thuật số của người tiêu dùng năm 2015
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top Contenders
 
Data At Pollfish, Dec. 2015, Euangelos Linardos
Data At Pollfish, Dec. 2015, Euangelos LinardosData At Pollfish, Dec. 2015, Euangelos Linardos
Data At Pollfish, Dec. 2015, Euangelos Linardos
 
Data at Pollfish
Data at PollfishData at Pollfish
Data at Pollfish
 

More from Arun Kejariwal

Anomaly Detection At The Edge
Anomaly Detection At The EdgeAnomaly Detection At The Edge
Anomaly Detection At The EdgeArun Kejariwal
 
Serverless Streaming Architectures and Algorithms for the Enterprise
Serverless Streaming Architectures and Algorithms for the EnterpriseServerless Streaming Architectures and Algorithms for the Enterprise
Serverless Streaming Architectures and Algorithms for the EnterpriseArun Kejariwal
 
Sequence-to-Sequence Modeling for Time Series
Sequence-to-Sequence Modeling for Time SeriesSequence-to-Sequence Modeling for Time Series
Sequence-to-Sequence Modeling for Time SeriesArun Kejariwal
 
Sequence-to-Sequence Modeling for Time Series
Sequence-to-Sequence Modeling for Time SeriesSequence-to-Sequence Modeling for Time Series
Sequence-to-Sequence Modeling for Time SeriesArun Kejariwal
 
Model Serving via Pulsar Functions
Model Serving via Pulsar FunctionsModel Serving via Pulsar Functions
Model Serving via Pulsar FunctionsArun Kejariwal
 
Designing Modern Streaming Data Applications
Designing Modern Streaming Data ApplicationsDesigning Modern Streaming Data Applications
Designing Modern Streaming Data ApplicationsArun Kejariwal
 
Correlation Analysis on Live Data Streams
Correlation Analysis on Live Data StreamsCorrelation Analysis on Live Data Streams
Correlation Analysis on Live Data StreamsArun Kejariwal
 
Deep Learning for Time Series Data
Deep Learning for Time Series DataDeep Learning for Time Series Data
Deep Learning for Time Series DataArun Kejariwal
 
Correlation Analysis on Live Data Streams
Correlation Analysis on Live Data StreamsCorrelation Analysis on Live Data Streams
Correlation Analysis on Live Data StreamsArun Kejariwal
 
Live Anomaly Detection
Live Anomaly DetectionLive Anomaly Detection
Live Anomaly DetectionArun Kejariwal
 
Finding bad apples early: Minimizing performance impact
Finding bad apples early: Minimizing performance impactFinding bad apples early: Minimizing performance impact
Finding bad apples early: Minimizing performance impactArun Kejariwal
 
Statistical Learning Based Anomaly Detection @ Twitter
Statistical Learning Based Anomaly Detection @ TwitterStatistical Learning Based Anomaly Detection @ Twitter
Statistical Learning Based Anomaly Detection @ TwitterArun Kejariwal
 
Days In Green (DIG): Forecasting the life of a healthy service
Days In Green (DIG): Forecasting the life of a healthy serviceDays In Green (DIG): Forecasting the life of a healthy service
Days In Green (DIG): Forecasting the life of a healthy serviceArun Kejariwal
 
Techniques for Minimizing Cloud Footprint
Techniques for Minimizing Cloud FootprintTechniques for Minimizing Cloud Footprint
Techniques for Minimizing Cloud FootprintArun Kejariwal
 
A Tool for Practical Garbage Collection Analysis In the Cloud
A Tool for Practical Garbage Collection Analysis In the CloudA Tool for Practical Garbage Collection Analysis In the Cloud
A Tool for Practical Garbage Collection Analysis In the CloudArun Kejariwal
 

More from Arun Kejariwal (16)

Anomaly Detection At The Edge
Anomaly Detection At The EdgeAnomaly Detection At The Edge
Anomaly Detection At The Edge
 
Serverless Streaming Architectures and Algorithms for the Enterprise
Serverless Streaming Architectures and Algorithms for the EnterpriseServerless Streaming Architectures and Algorithms for the Enterprise
Serverless Streaming Architectures and Algorithms for the Enterprise
 
Sequence-to-Sequence Modeling for Time Series
Sequence-to-Sequence Modeling for Time SeriesSequence-to-Sequence Modeling for Time Series
Sequence-to-Sequence Modeling for Time Series
 
Sequence-to-Sequence Modeling for Time Series
Sequence-to-Sequence Modeling for Time SeriesSequence-to-Sequence Modeling for Time Series
Sequence-to-Sequence Modeling for Time Series
 
Model Serving via Pulsar Functions
Model Serving via Pulsar FunctionsModel Serving via Pulsar Functions
Model Serving via Pulsar Functions
 
Designing Modern Streaming Data Applications
Designing Modern Streaming Data ApplicationsDesigning Modern Streaming Data Applications
Designing Modern Streaming Data Applications
 
Correlation Analysis on Live Data Streams
Correlation Analysis on Live Data StreamsCorrelation Analysis on Live Data Streams
Correlation Analysis on Live Data Streams
 
Deep Learning for Time Series Data
Deep Learning for Time Series DataDeep Learning for Time Series Data
Deep Learning for Time Series Data
 
Correlation Analysis on Live Data Streams
Correlation Analysis on Live Data StreamsCorrelation Analysis on Live Data Streams
Correlation Analysis on Live Data Streams
 
Live Anomaly Detection
Live Anomaly DetectionLive Anomaly Detection
Live Anomaly Detection
 
Finding bad apples early: Minimizing performance impact
Finding bad apples early: Minimizing performance impactFinding bad apples early: Minimizing performance impact
Finding bad apples early: Minimizing performance impact
 
Velocity 2015-final
Velocity 2015-finalVelocity 2015-final
Velocity 2015-final
 
Statistical Learning Based Anomaly Detection @ Twitter
Statistical Learning Based Anomaly Detection @ TwitterStatistical Learning Based Anomaly Detection @ Twitter
Statistical Learning Based Anomaly Detection @ Twitter
 
Days In Green (DIG): Forecasting the life of a healthy service
Days In Green (DIG): Forecasting the life of a healthy serviceDays In Green (DIG): Forecasting the life of a healthy service
Days In Green (DIG): Forecasting the life of a healthy service
 
Techniques for Minimizing Cloud Footprint
Techniques for Minimizing Cloud FootprintTechniques for Minimizing Cloud Footprint
Techniques for Minimizing Cloud Footprint
 
A Tool for Practical Garbage Collection Analysis In the Cloud
A Tool for Practical Garbage Collection Analysis In the CloudA Tool for Practical Garbage Collection Analysis In the Cloud
A Tool for Practical Garbage Collection Analysis In the Cloud
 

Recently uploaded

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Recently uploaded (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam

Gimme More! Supporting User Growth in a Performant and Efficient Fashion

  • 1. Gimme More! Supporting User Growth in a Performant and Efficient Fashion. Arun Kejariwal (@arun_kejariwal), Winston Lee (@winstl). Capacity Engineering @ Twitter. November 2013
  • 2. User Experience
      • Anytime, Anywhere, Any device
          • 5.2 billion mobile users by 2017 [1]
          • More than 10 billion mobile devices/connections by 2017 [1]
          • Worldwide mobile data traffic will reach 11.2 exabytes/month by 2017 (13x increase) [1]
      • Real-time performance
      [1] http://newsroom.cisco.com/release/1135354 (Feb. 5, 2013)
  • 3. Capacity Planning: Why bother?
      • Organic growth
          • Over 230M monthly active users [1]
      • User engagement
      • Evolving product landscape
          • Cards, Photos, Vines
              • Mobile video will increase 16-fold between 2012 and 2017 [2]
      • Events, planned or unplanned
      [1] http://www.sec.gov/Archives/edgar/data/1418091/000119312513400028/d564001ds1a.htm
      [2] http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html
  • 4. Approaches to Capacity Planning
      • Throw hardware at the problem
          • How much?
          • What kind? (Inventory management etc.)
          • Operationally inefficient!
      • Reactive approach
      (Diagram labels: Bottom line; Poor UX)
  • 5. Systematic Capacity Planning
      • Objectives
          • Check under-allocation (performance, availability): adversely impacts user experience
          • Check over-allocation (operational efficiency): adversely impacts the bottom line
      • Determine capacity needed proactively via forecasting
          • Business metrics
          • System resource usage
  • 6. Systematic Capacity Planning: Forecasting
      • Key questions
          • Which data? Raw, periodic max, or moving average
          • Data granularity: minutely or daily (it depends)
          • Which model? Linear, spline, Holt-Winters, or ARIMA
      (Callout: Non-trivial!)
  • 7. Good old Linear Regression
      • Plot: Linear Regression based Forecast (legend: Raw Data, Forecast)
      • Adjusted R-squared: 0.6062
  • 8. Linear Regression using periodic max
      • Plot: Linear Regression Using Maxes based Forecast (legend: Raw Data, Forecast)
      • Adjusted R-squared: 0.5673
      • Standard error: 2.45x
  • 9. Splines
      • Smooth spline
          • λ: penalty for “wiggliness”
      • Plot: Spline based Fitting (legend: Raw Data, Fitted)
  • 10. Splines
      • Plot: Spline based Forecast (legend: Raw Data, Forecast)
  • 11. Splines
      • Sensitive to the nature of the time series at the boundary
      • Plot annotations: Boundary 1, Boundary 2
  • 12. Splines – Take 2
      • Plot: Spline based Forecast (Boundary 1) (legend: Raw Data, Forecast)
      • Forecast is 8.31x higher than the end of the time series
  • 13. Splines – Take 3
      • Plot: Spline based Forecast (Boundary 2) (legend: Raw Data, Forecast)
      • Forecast is 3.77x higher than the end of the time series
  • 14. Holt-Winters
      • Triple exponential smoothing
          • Estimate of linear trend
          • Seasonal correction factors
      • Plot: Holt-Winters based Fitting (legend: Raw Data, Fitted)
  • 15. Holt-Winters
      • Plot: Holt-Winters based Forecast (legend: Raw Data, Upper 95% CI, Forecast)
  • 16. ARIMA
      • Auto-Regressive Integrated Moving Average
          • (p, d, q): autoregressive order, integrated order, moving average order
      • Equation annotations: autoregressive component, moving average component
  • 17. ARIMA
      • Fitting
      • Plot: Auto ARIMA based Fitting (legend: Raw Data, Fitted)
  • 18. ARIMA – Take 1
      • Plot: ARIMA based Forecast (legend: Raw Data, Upper 95% CI, Forecast)
      • (p, d, q): (0,1,1)(0,1,1)[7]
  • 19. ARIMA – Take 2
      • Plot: Auto ARIMA based Forecast (legend: Raw Data, Upper 95% CI, Forecast)
      • (p, d, q): (1,1,1)(2,0,0)[7]
  • 24. Implications of data characteristics
      • Plot: ARIMA based forecast (legend: Raw Data, Upper 95% CI, Forecast)
  • 25. Forecast without the boundary case
      • Plot: ARIMA based Forecast, without initial spike (legend: Raw Data, Upper 95% CI, Forecast)
  • 26. Forecast with truncation
      • Plot: ARIMA based Forecast, truncated and without initial spike (legend: Raw Data, Upper 95% CI, Forecast)
  • 27. Lessons learned
      • Data fidelity
          • Anomalies
          • Absence of seasonality
      • Modeling
          • Never perfect
              • Assess forecasting error
          • Continuous refinement
              • Incoming data stream is dynamic: organic growth, new products, behavioral aspect
  • 28. Acknowledgements
      • Capacity Engineering Team
      • Management team
  • 29. Join the Flock
      • Like problem solving? Like challenges? Be at the cutting edge. Make an impact.
      • We are hiring!!
          • https://twitter.com/JoinTheFlock
          • https://twitter.com/jobs
          • Contact us: @arun_kejariwal, @winstl
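
The Holt-Winters slides (triple exponential smoothing with a linear trend estimate and seasonal correction factors) and the "assess forecasting error" lesson can be illustrated with a minimal sketch. It is not the authors' implementation. The additive form, the smoothing constants (`alpha`, `beta`, `gamma`), the weekly period of 7, the synthetic series, and the MAPE error metric are all assumptions chosen for illustration.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing; returns `horizon` forecasts."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons to initialise")
    # Initialise trend from the change in mean between the first two seasons.
    first = sum(series[:period]) / period
    second = sum(series[period:2 * period]) / period
    trend = (second - first) / period
    # Seasonal factors are the first season's deviations from the trend line.
    mid = (period - 1) / 2
    seasonal = [series[i] - (first + (i - mid) * trend) for i in range(period)]
    # Level at the last point of the first season.
    level = first + mid * trend
    for t in range(period, len(series)):
        s = seasonal[t % period]
        prev_level = level
        level = alpha * (series[t] - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (series[t] - level) + (1 - gamma) * s
    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic daily metric (assumption): linear growth plus a weekly pattern.
weekly_pattern = [3, -1, -2, 0, 1, 4, -5]
data = [10 + 2 * t + weekly_pattern[t % 7] for t in range(42)]

# Hold out the last week and forecast it from the first five weeks.
train, holdout = data[:35], data[35:]
forecast = holt_winters_additive(train, period=7, alpha=0.3, beta=0.1,
                                 gamma=0.2, horizon=7)
error = mape(holdout, forecast)

# Same forecast after injecting a single anomaly into the training data.
spiked = list(train)
spiked[30] += 100
forecast_spiked = holt_winters_additive(spiked, period=7, alpha=0.3, beta=0.1,
                                        gamma=0.2, horizon=7)
error_spiked = mape(holdout, forecast_spiked)

print(f"holdout MAPE, clean training data:  {error:.4f}%")
print(f"holdout MAPE, with one anomaly:     {error_spiked:.4f}%")
```

Holding out the most recent season gives a simple way to compare models or data treatments before trusting a forecast, and the spiked run shows how a single anomaly near the end of the series shifts the level, trend, and seasonal estimates.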