GAMING IN
THE CLOUD
HOW GEARBOX USES
AMAZON WEB SERVICES
TO REACH
MILLIONS OF PLAYERS
Jimmy Sieben @jimmys
A WORD ABOUT ME
I’ve been programming for 25+ years
Making games since 1995; at Gearbox for 12 years
Network Programming on multiple titles
• Halo: Combat Evolved (2003: PC)
• Brothers in Arms: Road to Hill 30 (2005: PC/Xbox)
• Brothers in Arms: Hell’s Highway (2008: PC/PS3/Xbox 360)
• Borderlands (2009: PC/PS3/Xbox 360)
• Borderlands 2 (2012: PC/PS3/Xbox 360)
Currently directing Spark team and SHiFT
A WORD ABOUT
GEARBOX SOFTWARE
AWARD-WINNING
INDEPENDENT VIDEO
GAME DEVELOPER
BASED IN PLANO, TX
OVER 200 ARTISTS,
DESIGNERS, ENGINEERS,
DEVELOPERS, ADMINS
OUR BRANDS
A WORD ABOUT
BORDERLANDS
A WORD ABOUT
BORDERLANDS 2
Franchise sales of over 16.5 Million
SHIFT
SHiFT: Our online service
• In-game, Web, Mobile
WHY BUILD THIS?
Games are increasingly social, connected experiences
Next generation of games: Always on, Always Connected
AAA Games must go beyond the box
• Embrace the web and mobile, companion experiences
• Engage with players any time, anywhere
• Build the brand
Ultimately, all about the customer
• Connection directly to the fans
• Enable the community to forge connections
SPARK
Our backend platform
Internal name, describes the Team and Technology
Small team of 10 devops
Services-Oriented Architecture
Source: geek-and-poke.com
SPARK
Amazon EC2, Amazon EMR, Amazon Kinesis, Amazon Route 53
Elastic Load Balancing, Amazon VPC, Auto Scaling
Amazon S3, Amazon EBS, CloudFront
DynamoDB, Amazon RDS, ElastiCache, Amazon Redshift
CloudWatch, AWS Data Pipeline
AWS CloudFormation, AWS CloudTrail, IAM
Amazon SES, Amazon SNS, Amazon SQS
GETTING
STARTED
THE CHALLENGE OF
AAA GAMES
Startups & mobile teams reference a soft launch, gradual run-up to inflection point
(John Mayer tweets about Words with Friends)
[Chart: startup growth curve; x-axis Day 0, Day 1, Day 30, Day 120]
THE CHALLENGE OF
AAA GAMES
AAA game launches are the opposite:
Vertical, long tail and plateau
[Chart: Startup vs. AAA curves; x-axis Day 0, Day 1, Day 30, Day 120]
BUILDING THE
SERVICE
Research
Build a team
Start coding
Ship it 2-3 years later?
…. This isn’t easy. Is there a better way?
BUILDING A BETA
We used Borderlands 1 as a testbed for Borderlands 2
Built on Slicehost
• At the time all Gearbox websites were hosted there
• Ran our own MySQL and ActiveMQ instances
Manually provisioned hardware and configured software
• Took a couple of weeks to get everything working
• A bit of a painful, heroic effort
BENEFITS OF BETA
Clock synchronization problem on server
• Servers slowly drifted away from game clients
• Some crash reports early…
• …By Saturday morning, all clients crashing!
• Workaround server side, instantly fixed crashes!
Lessons
• Some test vectors are very difficult to predict
• Server tunability is incredibly valuable
• Tuesdays are the Best Days! (Not Friday!)
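A minimal sketch of why server tunability helped with the clock drift. The deck does not show the actual workaround, so everything here is an assumption: the function name `is_timestamp_acceptable`, the `max_skew_seconds` setting, and the idea that the server compares client timestamps against its own clock. The point is only that a tolerance held on the server can be widened without patching clients.

```python
import time

# Hypothetical server-side setting; in a real service this would live in
# configuration that can be changed without shipping a new game client.
DEFAULT_MAX_SKEW_SECONDS = 30.0


def is_timestamp_acceptable(client_ts, server_ts=None,
                            max_skew_seconds=DEFAULT_MAX_SKEW_SECONDS):
    """Return True if the client timestamp is within the allowed skew."""
    if server_ts is None:
        server_ts = time.time()
    return abs(server_ts - client_ts) <= max_skew_seconds


# A server clock that has drifted 45 s ahead rejects an honest client...
strict = is_timestamp_acceptable(client_ts=1000.0, server_ts=1045.0)
# ...until the tolerance is widened on the server side.
relaxed = is_timestamp_acceptable(client_ts=1000.0, server_ts=1045.0,
                                  max_skew_seconds=120.0)
print(strict, relaxed)
```

Keeping the threshold as a parameter rather than a constant baked into clients is what makes a same-day fix possible in this sketch.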
BETA CAPACITY
PLANNING
Looked at Steam data in March
Predictable decline to July Launch
[Chart: Steam data; x-axis March, May, July, September]
BETA CAPACITY
PLANNING
We shipped BTest in September…
Steam Summer Sale!
Borderlands 2 announced!
[Chart: Planned vs. Actual; x-axis March, May, July, September]
BETA CAPACITY
PLANNING
Scrambled to handle dramatically higher load
• Resized DBs, more servers, reconfiguration
• Painful!
Lessons:
• Pay close attention and adjust constantly
• Be plugged in to PR and Business
• Be agile, use tools to help agility
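To make the planned-versus-actual gap concrete, here is a rough capacity calculation. The request rates, per-server throughput, and the 50% headroom are invented for illustration; the talk gives no numbers. `servers_needed` is a hypothetical helper, not part of any Gearbox tooling.

```python
import math


def servers_needed(peak_rps, per_server_rps, headroom=0.5):
    """Servers required to carry peak_rps with a fractional safety margin."""
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    return math.ceil(peak_rps * (1.0 + headroom) / per_server_rps)


# Hypothetical numbers: the plan assumed a declining player base,
# but a sale and a sequel announcement pushed the real peak far higher.
planned = servers_needed(peak_rps=2000, per_server_rps=500)
actual = servers_needed(peak_rps=6000, per_server_rps=500)
print(planned, actual)
```

Re-running a calculation like this whenever PR or business plans change is one cheap way to "adjust constantly".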
DO ANOTHER BETA!
Source: geek-and-poke.com
SPARK -> CLOUD
BTest1 was hard to operate on Slicehost
• Capacity hard to adjust, and we didn’t get it right
• We knew we needed to design for more flexibility
• Tools didn’t support the agility we needed
BTest2 Shipped on Amazon Web Services
• EC2, RDS, ELB
• Puppet to configure instances
• Steep learning curve, but paid off
• Didn’t get everything right…
BTEST2: HOLIDAY
STABILITY
We launched and were pretty stable
However, problem Christmas evening!
• Our game was still selling, new people playing
• Queues were backing up, not severe
• A few days later, CPU is pegged!
• The Cloud to the rescue! Deploy more, bigger instances!
Lessons:
• Queue storage in cloud gave wiggle room
• It was actually pretty easy to recover from CPU peg
• Capacity planning still hard!
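The "wiggle room" a queue provides can be sketched with simple arithmetic: while consumers lag, the backlog grows, and once throughput is raised it drains. The rates below are made up, and both helper functions are hypothetical.

```python
def backlog_after(hours, arrivals_per_hour, processed_per_hour, start=0):
    """Queue depth after `hours`, never dropping below zero."""
    return max(0, start + (arrivals_per_hour - processed_per_hour) * hours)


def hours_to_drain(backlog, arrivals_per_hour, processed_per_hour):
    """Hours needed to clear a backlog, or None if consumers cannot keep up."""
    surplus = processed_per_hour - arrivals_per_hour
    if surplus <= 0:
        return None
    return backlog / surplus


# Hypothetical rates: consumers fall behind for 48 hours...
backlog = backlog_after(48, arrivals_per_hour=12000, processed_per_hour=10000)
# ...then larger instances raise throughput and the queue drains.
drain_hours = hours_to_drain(backlog, 12000, 20000)
print(backlog, drain_hours)
```

As long as the queue's storage can hold the backlog, no messages are lost while the operators react.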
BTEST2: MISSED
OPPORTUNITIES
New to AWS, Deployed classic EC2 instances
Skipped VPC
• This turned out to be a mistake
• More difficult to secure some resources like we wanted
• Had to build load balancing logic into app layer
Lessons:
• Embrace as much of the feature set as you can
• Don’t be afraid to choose long term over short term
• Especially for a Beta!
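For a sense of what "load balancing logic in the app layer" can involve, here is a small round-robin picker that skips unhealthy backends. The deck does not describe Gearbox's actual logic; the class name, methods, and addresses are all hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Cycle through backends, skipping any marked unhealthy."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._unhealthy = set()
        self._cycle = itertools.cycle(self._backends)

    def mark_unhealthy(self, backend):
        self._unhealthy.add(backend)

    def mark_healthy(self, backend):
        self._unhealthy.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate not in self._unhealthy:
                return candidate
        raise RuntimeError("no healthy backends available")


balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
balancer.mark_unhealthy("10.0.0.2")
picks = [balancer.next_backend() for _ in range(4)]
print(picks)
```

Code like this has to be written, tested, and operated by the application team, which is the cost the slide alludes to.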
MOVING TO
LAUNCH
LAUNCHING
BORDERLANDS 2
Borderlands 2 launch: September 18, 2012
Applied some lessons from BTest2
• Doubled down on load testing
• Improved our usage of Puppet and Capistrano
• Pre-warmed our ELBs with Amazon and established LOC
Latest capacity info from industry friends and experts
projected we would survive
• But still, wave of terror washed over me at T-6 hrs
• Capacity planning is hard!
SMOOTH SAILING
(MOSTLY)
DAY 2: KEEPING
TELEMETRY GOING
Launch week capacity was tough to manage
We wanted to keep costs in check, but had not implemented
AWS Auto-Scaling Groups
Manually added/removed instances at set times
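Time-based manual scaling amounts to following a schedule table. The sketch below shows the idea; the hours and instance counts are invented, and `desired_instances` and `scaling_action` are hypothetical helpers rather than anything from the talk or from AWS.

```python
# Hypothetical schedule: (start_hour, end_hour, instance_count), hours in UTC,
# end exclusive. The real times and counts are not given in the talk.
SCHEDULE = [
    (0, 8, 4),     # overnight lull
    (8, 16, 8),    # daytime
    (16, 24, 16),  # evening peak
]


def desired_instances(hour, schedule=SCHEDULE):
    """Return the instance count the schedule calls for at a given hour."""
    for start, end, count in schedule:
        if start <= hour < end:
            return count
    raise ValueError(f"hour {hour} is not covered by the schedule")


def scaling_action(current, hour):
    """Describe the manual change an operator would make at this hour."""
    delta = desired_instances(hour) - current
    if delta > 0:
        return f"add {delta}"
    if delta < 0:
        return f"remove {-delta}"
    return "no change"


print(scaling_action(current=8, hour=17))
print(scaling_action(current=16, hour=2))
```

A fixed schedule keeps costs predictable but cannot react to unplanned spikes, which is the gap Auto Scaling groups are meant to close.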
SHIFT CODES!
A week post-launch we were stable enough to use SHiFT Codes
• Randy got things started with some quick tests
• Engaged directly with devops team to measure results
• Got a little TOO engaged…
SHIFT CODES: CHAOS
Lessons:
Try not to intermingle monitoring for different components
Be extra careful querying 100MM record datasets!
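One common way to stay careful with very large tables is to read them in bounded, key-ordered batches instead of one unbounded query. The sketch below uses Python's built-in `sqlite3` as a stand-in database; the `events` table, its columns, and `iter_in_batches` are hypothetical, and the talk does not say which query caused trouble.

```python
import sqlite3


def iter_in_batches(conn, batch_size=1000):
    """Yield rows from `events` in primary-key order, one bounded batch at a time."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)",
    [(i, f"event-{i}") for i in range(1, 2501)],
)
batch_sizes = [len(batch) for batch in iter_in_batches(conn, batch_size=1000)]
print(batch_sizes)
```

Each query touches at most `batch_size` rows and resumes from the last key seen, so the load on the database stays bounded however large the table grows.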
SHIFT CODES:
UNEXPECTED BEHAVIOR
Telemetry traffic pattern changes when a code drops
Users Save & Exit game, wait to redeem in menu
Causes spike and lull in telemetry traffic
TAKING SPARK TO 1.0
We shipped Borderlands 2 on something like a 0.8
Spent next 6 months improving every aspect of platform
ADDED MORE SERVICES
AND TITLES
Borderlands 2 was a success!
Quickly integrated into Aliens: Colonial Marines
Developed a News service to communicate directly to fans
BACKPORTED NEWS
TO BORDERLANDS 2
BEYOND THE
GAME
GOT EXPERIENCE
WITH HADOOP & EMR
Processing time for 1 month of raw data (chart, log scale in hours from 1 to 10000):
• Generation 1: 3 months
• Generation 2: 3 days
• Generation 3: 3 hours
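Reading the chart labels in order (3 months, 3 days, and 3 hours for Generations 1 to 3), the improvement can be worked out in hours. The 30-day month is an assumption made only to get round numbers.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: a 30-day month

generations = {
    "Generation 1": 3 * DAYS_PER_MONTH * HOURS_PER_DAY,  # 3 months
    "Generation 2": 3 * HOURS_PER_DAY,                   # 3 days
    "Generation 3": 3,                                   # 3 hours
}
baseline = generations["Generation 1"]
speedups = {name: baseline / hours for name, hours in generations.items()}
print(speedups)
```

Under that assumption, Generation 2 is about 30 times faster than Generation 1, and Generation 3 about 720 times faster.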
SWEEPSTAKES!
Borderlands 2 Game of the Year Edition release October 2013
BEHIND THE LOOT
HUNT
Inception to ship in 2 months
No changes to the Game or Core systems
Goals
• Put the R&D EMR effort into production
• Try ElastiCache with Redis
• Learn something about running a live community event
Amazon EMR, AWS Data Pipeline
LOOT HUNT
RESULTS
LOOT THE WORLD!
WHY DID WE
SUCCEED?
Great team that believed in the vision
Adopt Devops Mentality
WHY DID WE
SUCCEED?
Start simple and build piece-by-piece
Learn as you go
• Optimize
• Refactor
• Measure
WHAT’S NEXT?
New services look appealing to us
Amazon Kinesis, AWS Data Pipeline, AWS CloudFormation
WHAT’S NEXT?
Evaluate other Cloud Providers
WHAT’S NEXT?
COMING THIS FALL!
GEARBOX IS HIRING!
Come join the team!
• Designers
• Artists
• Programmers
• Devops
http://www.gearboxsoftware.com/jobs
Jimmy Sieben @jimmys

Gaming in the Cloud: How Gearbox Software Uses Amazon Web Services to Reach Millions of Gamers

  • 1. GAMING IN THE CLOUD HOW GEARBOX USES AMAZON WEB SERVICES TO REACH MILLIONS OF PLAYERS Jimmy Sieben @jimmys
  • 2. A WORD ABOUT ME I’ve been programming for 25+ years Making games since 1995; at Gearbox for 12 years Network Programming on multiple titles • Halo: Combat Evolved (2003: PC) • Brothers in Arms: Road to Hill 30 (2005: PC/Xbox) • Brothers in Arms: Hell’s Highway (2008: PC/PS3/Xbox 360) • Borderlands (2009: PC/PS3/Xbox 360) • Borderlands 2 (2012: PC/PS3/Xbox 360) Currently directing Spark team and SHiFT
  • 7. OVER 200 ARTISTS, DESIGNERS, ENGINEERS, DEVELOPERS, ADMINS
  • 13. A WORD ABOUT BORDERLANDS 2 Franchise sales of over 16.5 Million
  • 14. SHIFT SHiFT: Our online service • In-game, Web, Mobile
  • 15. WHY BUILD THIS? Games are increasingly social, connected experiences Next generation of games: Always on, Always Connected AAA Games must go beyond the box • Embrace the web and mobile, companion experiences • Engage with players any time, anywhere • Build the brand Ultimately, all about the customer • Connection directly to the fans • Enable the community to forge connections
  • 16. Our backend platform Internal name, describes the Team and Technology Small team of 10 devops Services-Oriented Architecture SPARK Source: geek-and-poke.com
  • 17. SPARK
  • 18. Amazon EC2, Amazon EMR, Amazon Kinesis, Amazon Route 53, Elastic Load Balancing, Amazon VPC, Auto Scaling, Amazon S3, Amazon EBS, CloudFront, DynamoDB, Amazon RDS, ElastiCache, Amazon Redshift, CloudWatch, AWS Data Pipeline, AWS CloudFormation, AWS CloudTrail, IAM, Amazon SES, Amazon SNS, Amazon SQS, virtual private cloud
  • 20. THE CHALLENGE OF AAA GAMES Startups & mobile teams reference a soft launch, gradual run-up to inflection point (John Mayer tweets about Words with Friends) Day 0 Day 1 Day 30 Day 120
  • 21. THE CHALLENGE OF AAA GAMES AAA game launches are the opposite: Vertical, long tail and plateau Day 0 Day 1 Day 30 Day 120 Startup AAA
  • 22. BUILDING THE SERVICE Research Build a team Start coding Ship it 2-3 years later? …. This isn’t easy. Is there a better way?
  • 24. BUILDING A BETA We used Borderlands 1 as a testbed for Borderlands 2 Built on Slicehost • At the time all Gearbox websites were hosted there • Ran our own MySQL and ActiveMQ instances Manually provisioned hardware and configured software • Took a couple of weeks to get everything working • A bit of a painful, heroic effort
  • 25. BENEFITS OF BETA Clock synchronization problem on server • Servers slowly drifted away from game clients • Some crash reports early… • …By Saturday morning, all clients crashing! • Workaround server side, instantly fixed crashes! Lessons • Some test vectors are very difficult to predict • Server tunability is incredibly valuable • Tuesdays are the Best Days! (Not Friday!)
  • 26. BETA CAPACITY PLANNING Looked at Steam data in March Predictable decline to July Launch March May July September
  • 27. BETA CAPACITY PLANNING We shipped Btest in September… Steam Summer Sale! Borderlands 2 announced! March May July September Planned Actual
  • 28. BETA CAPACITY PLANNING Scrambled to handle dramatically higher load • Resized DBs, more servers, reconfiguration • Painful! Lessons: • Pay close attention and adjust constantly • Be plugged in to PR and Business • Be agile, use tools to help agility
  • 29. DO ANOTHER BETA! Source: geek-and-poke.com
  • 30. SPARK -> CLOUD BTest1 was hard to operate on Slicehost • Capacity hard to adjust, and we didn’t get it right • We knew we needed to design for more flexibility • Tools didn’t support the agility we needed BTest2 Shipped on Amazon Web Services • EC2, RDS, ELB • Puppet to configure instances • Steep learning curve, but paid off • Didn’t get everything right…
  • 31. BTEST2: HOLIDAY STABILITY We launched and were pretty stable However, problem Christmas evening! • Our game was still selling, new people playing • Queues were backing up, not severe • A few days later, CPU is pegged! • The Cloud to the rescue! Deploy more bigger! Lessons: • Queue storage in cloud gave wiggle room • It was actually pretty easy to recover from CPU peg • Capacity planning still hard!
  • 32. BTEST2: MISSED OPPORTUNITIES New to AWS, Deployed classic EC2 instances Skipped VPC • This turned out to be a mistake • More difficult to secure some resources like we wanted • Had to build load balancing logic into app layer Lessons: • Embrace as much of the feature set as you can • Don’t be afraid to choose long term over short term • Especially for a Beta!
  • 34. LAUNCHING BORDERLANDS 2 Borderlands 2 launch: September 18, 2012 Applied some lessons from BTest2 • Doubled down on load testing • Improved our usage of Puppet and Capistrano • Pre-warmed our ELBs with Amazon and established LOC Latest capacity info from industry friends and experts projected we would survive • But still, wave of terror washed over me at T-6 hrs • Capacity planning is hard!
  • 36. DAY 2: KEEPING TELEMETRY GOING Launch week capacity was tough to manage We wanted to keep costs in check, but had not implemented AWS Auto-Scaling Groups Manually add/remove instances at set times
  • 37. SHIFT CODES! A week post-launch we were stable enough to use SHiFT Codes • Randy got things started with some quick tests • Engaged directly with devops team to measure results • Got a little TOO engaged…
  • 38. SHIFT CODES: CHAOS Lessons: Try not to intermingle monitoring for different components Be extra careful querying 100MM record datasets!
  • 39. SHIFT CODES: UNEXPECTED BEHAVIOR Telemetry traffic pattern changes when a code drops Users Save & Exit game, wait to redeem in menu Causes spike and lull in telemetry traffic
  • 40. TAKING SPARK TO 1.0 We shipped Borderlands 2 on something like a 0.8 Spent next 6 months improving every aspect of platform
  • 41. ADDED MORE SERVICES AND TITLES Borderlands 2 was a success! Quickly integrated into Aliens: Colonial Marines Developed a News service to communicate directly to fans
  • 44. GOT EXPERIENCE WITH HADOOP & EMR Processing time for 1 month of raw data (chart, log scale in hours): Generation 1: 3 months • Generation 2: 3 days • Generation 3: 3 hours
  • 45. SWEEPSTAKES! Borderlands 2 Game of the Year Edition release October 2013
  • 46. BEHIND THE LOOT HUNT Inception to ship in 2 months No changes to the Game or Core systems Goals • Put the R&D EMR effort into production • Try ElastiCache with Redis • Learn something about running a live community event Amazon EMR AWS Data Pipeline
  • 54. WHY DID WE SUCCEED? Great team that believed in the vision Adopt Devops Mentality
  • 55. WHY DID WE SUCCEED? Start simple and build piece-by-piece Learn as you go • Optimize • Refactor • Measure
  • 56. WHAT’S NEXT? New services look appealing to us Amazon Kinesis AWS Data Pipeline AWS CloudFormation
  • 57. WHAT’S NEXT? Evaluate other Cloud Providers
  • 59. GEARBOX IS HIRING! Come join the team! • Designers • Artists • Programmers • Devops http://www.gearboxsoftware.com/jobs Jimmy Sieben @jimmys
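
The clock-drift bug on slide 25 (servers drifting away from game clients until tickets were rejected as coming "from the future") is the kind of failure a tunable skew tolerance can absorb. The talk does not describe the actual server-side workaround, so the following is only a minimal sketch under assumptions: the function name, the five-minute tolerance, and the one-hour ticket lifetime are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tunables; the talk only says the server was adjustable.
MAX_FUTURE_SKEW = timedelta(minutes=5)   # how far ahead of the server a ticket may be
TICKET_LIFETIME = timedelta(hours=1)     # how long a ticket stays usable


def ticket_is_valid(issued_at, now,
                    max_future_skew=MAX_FUTURE_SKEW,
                    lifetime=TICKET_LIFETIME):
    """Return True if a ticket stamped `issued_at` is acceptable at server time `now`."""
    if issued_at - now > max_future_skew:
        return False  # too far in the future, even allowing for clock drift
    if now - issued_at > lifetime + max_future_skew:
        return False  # expired
    return True


# Illustrative timestamps only.
now = datetime(2011, 9, 10, 12, 0, tzinfo=timezone.utc)
slightly_ahead = now + timedelta(minutes=2)   # client clock a little fast
far_ahead = now + timedelta(hours=3)          # clearly bogus

print(ticket_is_valid(slightly_ahead, now))   # tolerated
print(ticket_is_valid(far_ahead, now))        # rejected
```

Keeping the tolerance as a server-side parameter matches the lesson that server tunability is valuable: the limit can be widened without patching clients.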

Editor's notes

  1. Gearbox is an Award-Winning Independent Video Game Developer Based in Plano, TX. Shot of our playtest lab, where our User Research department conducts studies on how people play and respond to our games.
  2. Borderlands 2 received a lot of awards in 2012: Game of the Year from X-Play; IGN People's Choice Award for Best Overall Game; Most Played New Game from Raptr; Best Cooperative Multiplayer from Game Informer; Best Shooter and Character of the Year (Claptrap) from Spike; Amazon.com Editor's Pick. The list goes on and on, with other awards coming in from US Military Gamers, PlayStation Blog, Wired, Yahoo, Complex, Mature Gaming, Rev3Games, The Speaky's (Kotaku Community Awards), All That's Epic and the community-voted G4TV Videogame Deathmatch, just to name a few.
  3. Artists at work
  4. Common area in our studio
  5. Borderlands. Introduced in 2009. Co-op Shooter Looter. FPS Action, Action-RPG Mechanics. 4-player Cooperative, drop-in drop-out.
  6. Borderlands 2 released in 2012, and to date there are over 16.5 million sales in the franchise. Refined Shooter Looter, enhanced co-op play. Built SHiFT and Spark to connect to the community. A new initiative, something we’ve never done before.
  7. Customer-facing, Fan and reward-focused
  8. Linux, Ruby, Rails, MongoDB, Redis, Puppet, Java, MySQL, Hadoop. All running on Amazon Web Services.
  9. Big believers in open source, love technology that has strong user communities. Linux, Ruby, Rails, MongoDB, Redis, Puppet, Java, MySQL, Hadoop. All running on Amazon Web Services, tying it together.
  10. 23 Products and Features of AWS in Use Today. We feel like AWS has been a huge enabler for Spark, especially with a small team like ours.
  11. Launch night was big, the first weekend was the peak. Sustained traffic for the first couple of months, boosted by DLC, eventually settling into a stable player base.
  12. I decided we needed a beta to prove out what we were doing
  13. Launched Friday, September 9, 2011
  14. Bad (or maybe good) luck that the servers were up just long enough that by BTest launch they had drifted enough to expose the bug: we rejected tickets from the future. Patch Tuesday is a thing for a reason. We also release Micropatches on Tuesdays; we built our hotfix workflow around this lesson.
  15. We started in March, looked at Steam data. Predictable decline to planned July launch.
  16. Unfortunately, R&D caused our schedule to slip a bit, so we didn’t launch until September. Meanwhile, marketing was doing what they do best: selling our game! A couple of things happened. And we announced Borderlands 2!
  17. Coming out of the BTest1 experience we knew we had some unanswered questions. We wanted to try new tools and infrastructure, and we wanted to get experience with them with less than a year to go until the Borderlands 2 retail launch.
  18. Steep learning curve, as the team was familiar with a traditional IT environment. Experience with virtualization, but on a much smaller scale for internal resources. Still, we jumped right in and started digesting the APIs, and had an environment up and running pretty quickly, with a LONG list of things to improve on post-BTest2. Launched Tuesday, December 13, 2011.
  19. Nice validation of our decision to deploy on AWS
  20. 9 months from BTest2 to finish up the core system implementation and get ready for the vertical AAA launch. Constant communication with Business and Marketing to understand the expected Day 1 / Week 1 sales.
  21. Things generally worked out OK! There were a few issues to solve in the first week of launch, but the team largely survived unscathed.
  22. Valley to peak was about a factor of 4. Ran custom scripts to change capacity, very painful.
  23. While looking for real-time code redemption results, I issued a bad query that impacted some monitoring. Took most of the afternoon and evening to recover. Redis failover scripts did not work as expected. Restart monitoring node, stabilize cluster. Move some monitoring functionality to new node. Lessons: Try not to intermingle monitoring for different components. Be extra careful querying 100MM record datasets!
  24. Right at 2pm, code drops and traffic spikes; users had to redeem the code in the main menu. Does not recover for quite some time, users' play is interrupted.
  25. Implemented ASG based on AMIs. Completely overhauled deployment. Implemented centralized log collection and search. Fixed memory leaks in some apps. Implemented VPC. Made apps stateless (mostly). ActiveMQ is something we are still using but want to move away from, as it is hard to scale dynamically.
  26. Find opportunities to get things built that improve the platform as a whole. Did some other things too.
  27. Later that year, integrated News back into Borderlands 2. Community team thanked us. Live team thanked us as well.
  28. Using EMR, we were able to finally get a handle on our data
  29. Confidence in our systems and data processing felt real. Launched a sweepstakes, called the Borderlands 2 $100,000 Loot Hunt. Gave away cash and prizes for playing our game. 30-day event. Daily challenge: kill this enemy and earn a special weapon reward. Community goal: take that weapon and kill some other enemies with it, work towards the total as a community. Day 7 rewards for hitting all the goals: a bonus weapon drop on the daily challenge.
  30. Nearly killed me. Everything on the backend; tweak game via Micropatches. EMR for data processing. Data Pipeline to tie it together… …but eventually replaced that as it didn’t feel ready for us.
  31. Tremendous participation, over 1 million entries from fans. Put together infographic as a result.
  32. iOS and Android apps integrated with SHiFT. Built a new service for tracking Item Collection. Built OAuth service to permit logins. Connects to existing News and Account service.
  33. Team of 10 launched Spark and supports it today. You built it, you operate it! Some specialization in the team, but in general a lot of collaboration. Everyone works together to keep Spark running.
  34. Started with just 3 services (Auth, Configuration, Telemetry). Built up over time, now over 25 apps in the backend; saw this with Aliens bringing News, LootTheWorld bringing OAuth and Item Collections. Every new piece gives almost a geometric new capability to the platform.
  35. Kinesis can solve ActiveMQ issues. Give Data Pipeline a whirl again as it matures. Implement CloudFormation to make deploying an entire application and full environment turnkey.
  36. Keep our eyes open. Azure has some compelling services. Xbox Live Cloud Compute, powering the new Titanfall game on Xbox One, is very interesting. I hear Google has great performance.
  37. New titles in development. Designers find new ways to use existing services.
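
The launch-week capacity notes (instances added and removed manually at set times, with valley to peak about a factor of 4) amount to a fixed step schedule. The sketch below illustrates that idea only; it is not Gearbox's script. The hours and instance counts are hypothetical, chosen so that the peak is four times the valley, and no AWS calls are made.

```python
# Hypothetical step schedule: (start hour, instance count).
# Only the roughly 4x valley-to-peak ratio comes from the talk.
SCHEDULE = [(0, 10), (10, 20), (16, 40), (23, 20)]


def desired_instances(hour, schedule=SCHEDULE):
    """Return the instance count for the schedule step that covers `hour` (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    ordered = sorted(schedule)
    count = ordered[-1][1]  # hours before the first step wrap to the last step
    for start, instances in ordered:
        if hour >= start:
            count = instances
    return count


def capacity_changes(schedule=SCHEDULE):
    """List the (hour, delta) adjustments an operator script would apply each day."""
    ordered = sorted(schedule)
    changes = []
    previous = ordered[-1][1]  # the day wraps around from the last step
    for start, instances in ordered:
        changes.append((start, instances - previous))
        previous = instances
    return changes


for hour, delta in capacity_changes():
    print(f"{hour:02d}:00 change capacity by {delta:+d}")
```

A fixed schedule like this cannot react to surprises such as a SHiFT code drop, which is one reason the later move to Auto Scaling groups (note 25) is a natural next step.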