Everything I Learned About
Scaling Online Games I
Learned at Google and eBay
Randy Shoup 
@randyshoup
linkedin.com/in/randyshoup
Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/kixeye-scalability
Presented at QCon San Francisco
www.qconsf.com
Background
CTO at KIXEYE
•  Making awesome games awesomer (and scalabler
and reliabler)

Director of Engineering for Google App Engine
•  World’s largest Platform-as-a-Service

Chief Engineer at eBay
•  Multiple generations of eBay’s real-time search
infrastructure
Engineering “Fun”
Whole user / player experience
•  Think holistically about the full end-to-end
experience of the user
•  UX, functionality, performance, bugs, etc.

All useful metrics are *proxies* for fun
•  Performance: load time, frame rate, lag
•  Technology: latency, availability
•  Business: acquisition, retention, monetization
Real-Time Strategy Games are …
Real-time
Spiky
Diverse
Constantly evolving
Constantly pushing boundaries

è Technically and operationally demanding
Know Your Requirements
Less is more
•  More wood, fewer arrows
•  Solve 100% of one problem rather than 50% of two
•  Release one great feature instead of two iffy ones

Understand the requirements
•  e.g., Battle replay
•  Ephemeral combat
•  Immutable recording
•  Manageable storage footprint
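One way to read the battle replay requirements is sketched below. It assumes, and this is an assumption rather than anything the slides state, that combat is simulated deterministically, so that storing only the player inputs is enough to replay a battle. The names `InputEvent`, `record`, and `simulate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)  # frozen: a recorded event cannot be modified afterwards
class InputEvent:
    tick: int      # simulation step at which the input applies
    player: str
    dx: int        # hypothetical command: move the player's unit by dx


def record(events: Iterable[InputEvent]) -> Tuple[InputEvent, ...]:
    """Freeze the inputs of a finished battle into an immutable recording."""
    return tuple(events)


def simulate(events: Iterable[InputEvent]) -> Dict[str, int]:
    """Toy deterministic simulation: apply moves in (tick, player) order."""
    positions: Dict[str, int] = {}
    for ev in sorted(events, key=lambda e: (e.tick, e.player)):
        positions[ev.player] = positions.get(ev.player, 0) + ev.dx
    return positions


live_inputs = [
    InputEvent(tick=1, player="red", dx=2),
    InputEvent(tick=1, player="blue", dx=-1),
    InputEvent(tick=2, player="red", dx=3),
]
live_result = simulate(live_inputs)      # the ephemeral battle itself
recording = record(live_inputs)          # only the inputs are stored
replay_result = simulate(recording)      # replay reproduces the same outcome
```

Under this assumption the storage footprint grows with the number of inputs rather than with the full game state at every tick.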
Know Your Bottlenecks
Log everything
Monitor relentlessly
Measure bottlenecks and attack the first
•  “When you solve problem one, problem two gets a
promotion”
•  Theory of Constraints: attacking *any* other problem
yields no improvement


Accept that your intuition is WRONG (!)
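The Theory of Constraints point can be illustrated with a small sketch. It assumes a serial pipeline whose throughput is capped by its slowest stage; the stage names and request rates below are made up for illustration.

```python
def pipeline_throughput(capacities):
    """Throughput of a serial pipeline is limited by its slowest stage."""
    return min(capacities.values())


# Hypothetical per-stage capacities in requests per second.
stages = {"client": 900, "game_server": 400, "services": 650, "persistence": 1200}

baseline = pipeline_throughput(stages)

# Doubling a stage that is not the bottleneck changes nothing.
faster_persistence = pipeline_throughput({**stages, "persistence": 2400})

# Doubling the bottleneck helps, and the next-slowest stage becomes the new limit.
faster_game_server = pipeline_throughput({**stages, "game_server": 800})

print(baseline, faster_persistence, faster_game_server)
```

In this toy model, fixing the game server moves the limit to the services layer, which is the "problem two gets a promotion" effect.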
Know Your Distributions
“Normal” distribution is *not* normal
•  Only works for quantities physically constrained on
both sides, clustered around a mean
•  E.g., adult height or weight

Assuming a normal distribution leads to invalid analysis and conclusions
•  Removing outliers 
•  Ignoring real problems
•  Your (trained) intuition is WRONG (!)
Know Your Distributions
Exponential (“Long Tail”) distribution *much*
more common
•  Income, latency, human connections, etc.
•  Also easy to reason about – only single parameter

Percentiles are your best friends (!)
•  Reasonably characterize any distribution
•  Measure 90%ile, 99%ile, 99.9%ile
•  Focus on the real problems
•  Mean and Standard Deviation are useless
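A minimal sketch of the percentile advice, assuming latencies that follow an exponential distribution with a mean of 50 ms. The samples are synthetic and seeded, not measurements, and `percentile` is a hand-written nearest-rank helper rather than a library call.

```python
import math
import random


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    index = min(len(ordered) - 1, max(0, rank - 1))
    return ordered[index]


rng = random.Random(42)  # fixed seed so the run is repeatable
latencies_ms = [rng.expovariate(1 / 50) for _ in range(10_000)]  # synthetic long-tail data

mean_ms = sum(latencies_ms) / len(latencies_ms)
summary = {p: percentile(latencies_ms, p) for p in (50, 90, 99, 99.9)}

print(f"mean   = {mean_ms:.1f} ms")
for p, value in summary.items():
    print(f"p{p:<5} = {value:.1f} ms")
```

For long-tailed data like this the mean sits above the median and says nothing about how slow the slowest requests are, while the 99th and 99.9th percentiles show the tail directly.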
Layering and Responsibility
Multiple layers
•  Client
•  Game server
•  Services
•  Persistence

Clarify roles and responsibilities
•  Client- vs. server-authoritative
•  Google service layering (+)
Distribution of Data / Work
Load-balancing (for stateless work)
•  Web servers, proxies
•  Most services


Sharding (for stateful work)
•  Combat servers
•  Matchmaking
•  Leaderboards
•  Databases
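The difference between the two strategies can be sketched as follows. This is an illustration only: `RoundRobin` and `shard_for` are hypothetical names, and hash-modulo sharding is just one simple scheme, not necessarily the one used at KIXEYE.

```python
import hashlib
import itertools


class RoundRobin:
    """Load-balancing for stateless work: any backend can serve any request."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


def shard_for(key: str, num_shards: int) -> int:
    """Sharding for stateful work: the same key always maps to the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


web = RoundRobin(["web-1", "web-2", "web-3"])
print([web.pick() for _ in range(4)])   # requests rotate across the servers

print(shard_for("player-42", 8))        # this player's state lives on one shard
```

A known drawback of plain hash-modulo sharding is that changing the shard count remaps most keys; consistent hashing is a common alternative when shards are added or removed often.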
Services
Simple, well-defined interface
Single-purpose
Modular and independent
Small team
Autonomy and responsibility
Component Isolation
Combat server for TOME
•  Highly “twitchy” real-time MOBA combat
•  Very latency-sensitive

Real-time interactions isolated to a single,
ephemeral component
•  No coordination with any central service

Highly dynamic load distribution
•  Router assigns battle to least-loaded server
•  Requires latency-fairness between players
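The routing rule above, assigning each battle to the least-loaded server, can be sketched in a few lines. `BattleRouter` is a hypothetical name, load is counted simply as the number of active battles, and the latency-fairness requirement between players is not modeled here.

```python
class BattleRouter:
    """Toy router that places each new battle on the least-loaded combat server."""

    def __init__(self, servers):
        self.load = {server: 0 for server in servers}   # active battles per server
        self.placement = {}                             # battle id -> server

    def assign(self, battle_id):
        # Pick the server with the fewest active battles; ties break by name.
        server = min(self.load, key=lambda s: (self.load[s], s))
        self.load[server] += 1
        self.placement[battle_id] = server
        return server

    def finish(self, battle_id):
        # Battles are ephemeral: release the slot when the battle ends.
        server = self.placement.pop(battle_id)
        self.load[server] -= 1


router = BattleRouter(["combat-1", "combat-2"])
print(router.assign("battle-a"), router.assign("battle-b"), router.assign("battle-c"))
```

A production router would more likely weigh CPU, memory, and network distance to each player instead of a bare battle count.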
Asynchrony: Do Work Up Front
Custom asset pipeline
•  Spriting, compression, etc.


Pre-render “movies” instead of real-time particle
effects

Tons of caching
Asynchrony: Client Liveness
Client continues seamlessly if disconnected
•  Gameplay more important than immediate
synchronization

Event loop for rendering
•  Keep up with the frame rate (!)

Default to background processing
•  Refresh assets
•  Save client state
Asynchrony: Reactive Server
Minimize request latency
•  Respond as rapidly as possible to client
•  Queue events / messages for complex work
•  Service interactions via reliable events

Functional Reactive programming
•  Heavy use of Scala and Akka
•  Never block (!)
•  eBay, Google programming models (-)
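A minimal sketch of the "respond fast, queue the complex work" pattern. The slides name Scala and Akka; this sketch swaps in Python's `asyncio` purely for illustration, and the in-memory queue here is not a reliable event channel. `handle_request` and `worker` are hypothetical names.

```python
import asyncio


async def handle_request(queue, request):
    """Acknowledge immediately and defer the expensive work to the queue."""
    queue.put_nowait(request)  # does not block on an unbounded queue
    return {"status": "accepted", "id": request["id"]}


async def worker(queue, results):
    """Background consumer that performs the deferred work."""
    while True:
        request = await queue.get()
        await asyncio.sleep(0)          # stand-in for slow, complex processing
        results.append(request["id"])
        queue.task_done()


async def main():
    queue = asyncio.Queue()
    results = []
    task = asyncio.create_task(worker(queue, results))

    acks = [await handle_request(queue, {"id": i}) for i in range(3)]

    await queue.join()                  # wait until all queued work is processed
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return acks, results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The client gets its acknowledgement before any of the deferred work has run, which is the latency property the slide is after.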
Small, Independent Teams
Studio System
•  Full-stack, independent game teams
•  Near-complete autonomy on technology choices,
development processes

Vendor-customer discipline
•  Google service teams (+)

Reduces contention and coherence
Hire and Retain Top People
Hire ‘A’ Players
•  Difference between top and bottom performers is
not 1.5x; it’s 10x (!) 
•  (+) Google hiring process

Virtuous Cycle
•  A players bring A players
•  B players bring C players
•  Constantly raise the bar

Reduces contention and coherence
Play to People’s Strengths
People are not cogs, not fungible
•  (-) eBay “Train seats”
•  Destroyed incentives, personal pride, long-term
ownership

Align work with skills and passion
•  Symphony instead of Factory (!)
•  Skills in Flash, Scala, etc.
•  Build customizability for target developer, not
builder (DSL >> code)
Small Details Matter
In the very large, the very small matters a *lot*
•  Subatomic physics and cosmology
•  eBay and variable-byte encoding (+) 
•  GAE and memcache slab memory allocation (+)
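For readers unfamiliar with variable-byte encoding, the sketch below shows one common variant: 7 data bits per byte, least-significant group first, with the high bit marking continuation. The slides do not say which variant eBay used, so treat this as a generic illustration with hypothetical function names.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 data bits per byte."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)          # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single integer produced by encode_varint."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


for value in (0, 127, 128, 300, 1_000_000):
    encoded = encode_varint(value)
    print(value, len(encoded), decode_varint(encoded))
```

Small values take a single byte instead of a fixed four or eight, which is why the encoding pays off when most of the stored numbers are small.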

Discipline is *which* details matter
•  Combat server and memory contention
•  40% improvement from six characters …
•  “const ”
Join us!
www.kixeye.com [jobs]