Firefall is a free-to-play cooperative online shooter with a "shardless" world and instance-based maps. The developers chose to build the game infrastructure in the cloud to cope with unpredictable player numbers and development changes, and to take advantage of the cost savings that cyclical player behavior makes possible. Their goals were quick regional expansion, on-demand scalability, disaster recovery with minimal downtime, and self-healing systems. Over time they evolved their AWS architecture to expand globally and to improve platform features such as zero-downtime updates and global player mobility. They use both third-party and custom tools to monitor and manage the cloud infrastructure.
4. • Free-to-play cooperative open world shooter
• “Shardless” world
• Instance-based maps
• Both persistent and transient map types
• Possible for many instances of a map to exist at the same time
8. Why build in the cloud?
• Players are unpredictable
• Developers are unpredictable
• Cyclical player behavior opens up opportunities for significant cost savings
9. Why build in the cloud?
• Players are unpredictable
– Forecasts can be (and usually are) wrong
• Too little hardware, Too many players
– Bad for everyone
• Too much hardware, Too few players
– Good for players (sort of) but bad for the business
– What if they don’t stick around?
• Developers are unpredictable
• Cyclical player behavior opens up opportunities for significant cost savings
11. Why build in the cloud?
• Players are unpredictable
• Developers are unpredictable
– Active development has risks
• Performance can change drastically
• New services can “appear” the day of the patch
– MMOs are ALWAYS being actively developed!
• (If you want to be successful…)
• Cyclical player behavior opens up opportunities for significant cost savings
17. Infrastructure Goals
• Deployment and Recovery: How do we make site management better?
– Expansion
– Scalability
– Disaster Recovery
– Self-Healing
• Platform: How can the platform make the player experience better?
– Downtime
– Player Mobility
19. Deployment and Recovery Goals
• Quick regional expansion
• On-demand scalability
• Disaster recovery with minimal downtime
• Self-healing
20. Deployment and Recovery Goals
• Quick regional expansion
– Traditionally, expansion is a multi-month process
• Contracts, Purchase and Shipping, Installation, etc.
– Today, adding a region is about a week long task
• Additional improvements are in the works
• On-demand scalability
• Disaster recovery with minimal downtime
• Self-healing
22. Deployment and Recovery Goals
• Quick regional expansion
• On-demand scalability
– Automated scale up and down without* limits
• Instance sizes desired may not always be available, however
• Disaster recovery with minimal downtime
• Self-healing
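The scalability goal above, including the caveat about instance sizes, can be illustrated with a minimal sketch. It is not Firefall's actual scaler: the capacity figures, the headroom factor, and the function names `desired_servers` and `pick_instance_type` are all hypothetical, chosen only to show the shape of the decision.

```python
import math
from typing import List, Optional, Set

# Hypothetical tuning values; none of these numbers come from the talk.
PLAYERS_PER_SERVER = 100
MIN_SERVERS = 2
HEADROOM = 0.25  # keep 25% spare capacity for sudden spikes


def desired_servers(players: int) -> int:
    """Return how many game servers a pool should run for a player count."""
    needed = math.ceil(players * (1 + HEADROOM) / PLAYERS_PER_SERVER)
    return max(MIN_SERVERS, needed)


def pick_instance_type(preferred: List[str], available: Set[str]) -> Optional[str]:
    """Return the first preferred instance type that can currently be launched."""
    for instance_type in preferred:
        if instance_type in available:
            return instance_type
    return None  # nothing launchable right now; the caller must retry later
```

The ordered preference list is one way to handle the case where the desired instance size is not available: the scaler falls back to the next acceptable size instead of failing the scale-up.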
24. Deployment and Recovery Goals
• Quick regional expansion
• On-demand scalability
• Disaster recovery with minimal downtime
– Traditionally, DR sites are expensive and are not always properly maintained
– Our goal is to automate disaster recovery safely
• We do a lot manually at present
• Self-healing
27. Platform Goals
• Zero downtime game updates
• Players can play globally without restrictions
28. Platform Goals
• Zero downtime game updates
– Blue-Green deployment
– Doesn’t preclude scheduled maintenance
• Some things are more safely done offline
• Players can play globally without restrictions
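The Blue-Green idea on this slide can be sketched as follows. This is a generic illustration of the pattern, not the deployment tooling described in the talk; the class `BlueGreenRouter`, its fields, and the `is_healthy` callback are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BlueGreenRouter:
    """Tracks which of two server stacks receives new player sessions."""
    versions: Dict[str, str] = field(
        default_factory=lambda: {"blue": "v1", "green": "v1"}
    )
    active: str = "blue"

    def idle(self) -> str:
        """Return the name of the stack that is not serving players."""
        return "green" if self.active == "blue" else "blue"

    def deploy(self, version: str, is_healthy: Callable[[str], bool]) -> bool:
        """Install `version` on the idle stack and switch only if it is healthy."""
        target = self.idle()
        self.versions[target] = version
        if not is_healthy(target):
            return False  # players stay on the old stack, so no downtime
        self.active = target
        return True
```

Because the old stack keeps serving until the new one passes its health check, a bad build never takes players offline; it simply never becomes active.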
30. Platform Goals
• Zero downtime game updates
• Players can play globally without restrictions
– Characters won’t be held hostage
• Player data is available everywhere they want to be
• We don’t charge a player so that they can play with their friends
– Prefer closest healthy region, however
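"Prefer closest healthy region" can be read as a simple selection rule. The sketch below assumes that closeness is measured as client latency and that health is a per-region flag; both are assumptions, as is the function name `choose_region`. The region names match those on the architecture slides, and the latency values in the usage lines are made up.

```python
from typing import Dict, Optional, Set


def choose_region(latency_ms: Dict[str, float], healthy: Set[str]) -> Optional[str]:
    """Return the lowest-latency region that is currently healthy, if any."""
    candidates = {r: ms for r, ms in latency_ms.items() if r in healthy}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical latencies measured from one player's client.
latency = {"us-west-2": 40.0, "us-east-1": 90.0, "eu-west-1": 160.0}
nearest = choose_region(latency, {"us-west-2", "us-east-1", "eu-west-1"})
fallback = choose_region(latency, {"us-east-1", "eu-west-1"})
```

With these inputs the player is sent to `us-west-2` while it is healthy and to `us-east-1` once it is not, which is only possible because player data is available in every region.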
35. Alpha (October 2011)
[Architecture diagram]
• Region: US-West-1 (Availability Zone: b)
• Groups shown: Outside-LB, Outside-Game, Inside-LB, Inside-Game, Inside-Core, Inside-App, Inside-DB
• Other labels: INET, Operator, CORP, AWS ELB, Chef, Log, MCP, MD, MM, PvP
• AWS Services in Use
– Elastic Compute Cloud (EC2)
– Simple Storage Service (S3)
– Elastic Load Balancing (ELB)
– Simple Queue Service (SQS)
37. Closed Beta (April 2012)
[Architecture diagram]
• Region: US-West-1 (Availability Zones: b and c)
• Groups shown: Outside-LB, Outside-Game, Inside-LB, Inside-Game, Inside-Core, Inside-App
• Other labels: INET, Operator, CORP, AWS ELB, AWS RDS, Chef, Log, MCP, MD, MM, OW, PvP
• AWS Services in Use
– Elastic Compute Cloud (EC2)
– Simple Storage Service (S3)
– Elastic Load Balancing (ELB)
– Simple Queue Service (SQS)
– Relational Database Service (RDS)
– ElastiCache
39. Gamescom (August 2012)
[Architecture diagram]
• Regions: US-West-2 (Availability Zone: b), US-East-1, EU-West-1
• Groups shown: Outside-LB, Outside-Game, Inside-Game, Inside-Core, Inside-App, Inside-AppTasks, Inside-DB
• Other labels: INET, Operator, CORP, HQ, VPC, AWS ELB, Chef, Log, MCP, MD, MM, OW, NPE, PvE, PvP, Task
• AWS Services in Use
– Elastic Compute Cloud (EC2)
– Simple Storage Service (S3)
– Elastic Load Balancing (ELB)
– Simple Queue Service (SQS)
– Relational Database Service (RDS)
– ElastiCache
– CloudFront
– Virtual Private Cloud (VPC)
41. Open Beta (July 2013)
[Architecture diagram]
• Regions: US-West-2 (Availability Zone: b), US-East-1, EU-West-1
• Groups shown: Outside-LB, Outside-Game, Inside-LB, Inside-Game, Inside-Core, Inside-App, Inside-AppTasks, Inside-Search, Inside-DB
• Other labels: INET, Operator, CORP, HQ, Ops, VPC, AWS ELB, Chef, Log, MCP, MD, MM, ES, OW, NPE, PvE, PvP, Task
• AWS Services in Use
– Elastic Compute Cloud (EC2)
– Simple Storage Service (S3)
– Elastic Load Balancing (ELB)
– Simple Queue Service (SQS)
– CloudFront
– Virtual Private Cloud (VPC)
– Elastic MapReduce (EMR)
43. Today (November 2013)
[Architecture diagram]
• Regions: US-West-2 (Availability Zone: b), US-East-1, EU-West-1, AP-NorthEast-1, SA-East-1
• Groups shown: Outside-LB, Outside-Game, Inside-LB, Inside-Game, Inside-Core, Inside-App, Inside-AppTasks, Inside-Search, Inside-DB
• Other labels: INET, Operator, CORP, HQ, Ops, VPC, AWS ELB, Chef, Log, MCP, MD, MM, ES, OW, NPE, PvE, PvP, Task
• AWS Services in Use
– Elastic Compute Cloud (EC2)
– Simple Storage Service (S3)
– Elastic Load Balancing (ELB)
– Simple Queue Service (SQS)
– CloudFront
– Virtual Private Cloud (VPC)
– Elastic MapReduce (EMR)
50. Internal Tools
• Architect
• Cartographer
– Builds new game server stacks
– Replaces failed game server components
– Scales up (or down) the servers within a pool depending on player demand
• Dashboards (Everywhere)
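The two pool-management duties listed for Cartographer (replace failed components, scale with demand) can be pictured as one planning pass over a pool. This is a hypothetical sketch, not Cartographer's real logic or interface; the function `reconcile` and its data shapes are invented for illustration.

```python
from typing import Dict, List, Tuple


def reconcile(pool: Dict[str, bool], target: int) -> Tuple[List[str], int]:
    """Plan one pass over a server pool.

    `pool` maps server id -> healthy flag. Returns the ids to terminate and
    the number of new servers to launch so that `target` healthy servers remain.
    """
    failed = sorted(sid for sid, ok in pool.items() if not ok)
    healthy = sorted(sid for sid, ok in pool.items() if ok)
    surplus = healthy[target:]  # healthy servers beyond the target (scale down)
    to_launch = max(0, target - len(healthy))  # replacements plus scale up
    return failed + surplus, to_launch
```

A real scale-down would presumably wait for players to leave a server before terminating it; the sketch leaves that step out.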