More than two years ago Spil Games started building a global cross-datacenter storage solution with MySQL as one of its key components. This session will focus on our road to get there and share some of the best practices we learned along the way.

To give our users the best experience possible we needed to expand our datacenters globally and bring sharded data closer to our end users through a storage abstraction layer called SSP (Spilgames Storage Platform). This required a new way of thinking and a complete overhaul of our architecture, which meant we also had to change or replace many of our infrastructure components. When we started building this new architecture we took into account that failure is bound to happen at some point. The new architecture has been built with failure in mind: we can cope with split-brain scenarios and, to a certain degree, even with a full datacenter outage. We achieved this by building Master datacenters containing the persistent storage and Satellite datacenters containing memory-based storage. This approach required new techniques such as write-through for inter-datacenter writes and enabled us to implement asynchronous (fire-and-forget) writes to our persistent layer.

At the same time we had to cope with non-shardable data that needed to be available in all our datacenters at all times and, of course, in a similar fashion as our sharded data. We solved this by creating a second storage layer, ROAR (Read Often, Alter Rarely), that can cope with highly concurrent reads.

The architecture is built with MySQL, Erlang, Memcache, OpenStack and a hybrid cloud as its building blocks.
Spil Games is a (social) gaming company that grew in a short time from an internet startup to a global online gaming leader with more than 180 million users.
4. Facts
• Game publisher
• Company founded in 2001
• 350+ employees worldwide
• 180M+ unique visitors per month
• Over 60M registered users
• 45 portals in 19 languages
• Casual games
• Social games
• Real-time multiplayer games
• Mobile games
• 40+ MySQL clusters
  • 65k queries per second
• 8 Sphinx servers
  • 8k queries per second
8. Getting closer to the user means
• Static content
  • CDN strategy
• Dynamic content
  • Multi-datacenter setup
  • Store data in a distributed fashion
  • Sharding on user level is inevitable
  • Replace old functionally sharded applications
9. Dynamic content
• Portal content
  • Hardly changes at all
  • Can be cached heavily
• User data
  • Written frequently
  • Read only on demand
  • Difficult to cache
• Recommendations
  • Written on demand by the DWH (data warehouse)
  • Written and read frequently
10. Solving other problems
• Transparent API layer
  • No direct access to the DB
• Transparent storage system
  • Use MySQL where possible
• Migrate data between systems
  • Bulk migrations from/to shards
  • Bulk migrations from/to datacenters
• Connection pooling
11. Problem isolation
• Frequently written data
  • SSP (Spilgames Storage Platform)
• Infrequently written data
  • ROAR (Read Often, Alter Rarely)
14. What was our wishlist?
• Depend on a single storage platform
  • No more platform-specific query language
• Differentiate writes
  • Optimistic (asynchronous)
  • Pessimistic (synchronous)
• Shard data better
  • Partition on user and function
  • Cluster information by user, not by function
• Global expansion
  • Partition on geographic location
• Solve uneven usage of data storage
  • Move data from shard to shard
• Use a (small) connection pool
  • Prevent bursts of writes from creating a connection stampede
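The two write modes on the wishlist can be sketched as follows. This is a minimal Python illustration (the real implementation is in Erlang): a queue-backed background worker stands in for the persistent layer, the function names are hypothetical.

```python
import queue
import threading

store = {}                    # stands in for the persistent storage layer
write_queue = queue.Queue()   # buffer for optimistic writes

def writer_loop():
    # Background worker that drains the optimistic-write queue.
    while True:
        gid, attrs = write_queue.get()
        store.setdefault(gid, {}).update(attrs)
        write_queue.task_done()

threading.Thread(target=writer_loop, daemon=True).start()

def write_optimistic(gid, attrs):
    """Asynchronous, fire and forget: enqueue and return immediately."""
    write_queue.put((gid, attrs))

def write_pessimistic(gid, attrs):
    """Synchronous: only return once the write has been applied."""
    store.setdefault(gid, {}).update(attrs)

write_optimistic(1234, {"name": "Roy"})
write_pessimistic(1235, {"name": "Moss"})
write_queue.join()            # for the demo, wait until the async write lands
```

The trade-off is the usual one: the optimistic path gives the caller low latency but no delivery guarantee at call time, while the pessimistic path blocks until the data is durable.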
18. Our building blocks
• Everything written in Erlang
• SSP utilizes local caching (Memcache)
• Flexible (persistent) storage layer
  • MySQL
  • Couchbase
  • Could be any other storage product
20. Why Erlang?
• Functional language
• High availability: designed for telecom solutions
• Excels at concurrency, distribution and fault tolerance
• Do more with less!
• Used by many other companies
21. How do we shard?
• We use the bucket model:
  • Each record has one unique owner attribute, the GID
  • A GID (Global IDentifier) identifies different entity types
  • One or more buckets per piece of functionality
  • A bucket holds structured data
  • Attributes contain the data of a record
  • Attributes do not have to correspond to the storage schema
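As a rough sketch of the bucket model in Python: the bit layout of the GID and the attribute names below are illustrative assumptions, not the production encoding.

```python
# Hypothetical GID layout: the top byte encodes the entity type,
# the remaining bits hold a unique id. The real encoding may differ.
TYPE_USER, TYPE_GAME = 1, 2

def make_gid(entity_type, local_id):
    return (entity_type << 56) | local_id

def gid_type(gid):
    return gid >> 56

# One bucket per piece of functionality; attributes are free-form
# dicts and do not have to match a fixed storage schema.
profile_bucket = {}   # GID -> attributes

roy = make_gid(TYPE_USER, 1234)
profile_bucket[roy] = {"name": "Roy", "friends": [make_gid(TYPE_USER, 1235)]}
```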
28. Global distribution
• Route users to the nearest datacenter (DC)
• Satellite DC
  • Processing and caching
  • Does not own/store data
• Storage DC
  • Processing, caching and persistent storage
• Store all data of the same user in the same DC
  • Partition on user globally
  • Global IDentifier (GID) per user
29. The lookup server
• Contains GIDs and their master DC
• A GID's master DC is predefined
• Migrated GIDs get updated
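A minimal sketch of the lookup server's bookkeeping; the datacenter names and the single default rule are assumptions for illustration:

```python
DEFAULT_MASTER_DC = "dc-eu"   # assumed predefined master DC for new GIDs
migrated = {}                 # overrides recorded when a GID migrates

def master_dc(gid):
    """Return the master DC for a GID: predefined unless migrated."""
    return migrated.get(gid, DEFAULT_MASTER_DC)

def record_migration(gid, new_dc):
    """Update the lookup entry after a GID has moved to another DC."""
    migrated[gid] = new_dc

record_migration(1234, "dc-us")
```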
30. How does this work?
• Globally sharded on GID
• (Local) GID lookup
[Diagram: a GID lookup routes each request to Shard 1 or Shard 2, both backed by persistent storage]
32. Tools
• Add a new shard
• Add a new bucket (and/or version)
• Mark a shard inactive (read only)
• Migrate a user (bucket + version)
• Migrate a shard (bulk)
• Migrate a bucket version (bulk)
33. Why do we need data migration?
• Spread data evenly across shards
  • Migration of buckets between shards
• GID migration between DCs
  • Creating a new storage DC requires data migration
  • Users are automatically migrated after visiting another DC many times
34. Seamless schema upgrades
• Versioning on bucket definitions
• GIDs are assigned to a bucket version
• Data in old bucket versions remains (read only)
• New data only gets written to the new bucket version
• Updates migrate data to the new bucket version
• Migrations can also be triggered explicitly
35. Seamless schema upgrades

Demobucket v2 adds a gender attribute. New records (e.g. Patricia, GID 1241) are written to v2 only; existing data stays in the read-only v1:

Demobucket v1                   Demobucket v2
GID   name                      GID   name      gender
1234  Roy                       1241  Patricia  f
1235  Moss
1236  Jen
1237  Douglas
1238  Denholm
1239  Richmond

When Moss (1235) is updated, the record migrates to v2:

Demobucket v1                   Demobucket v2
GID   name                      GID   name      gender
1234  Roy                       1241  Patricia  f
1236  Jen                       1235  Moss      m
1237  Douglas
1238  Denholm
1239  Richmond

After Jen (1236) is updated as well:

Demobucket v1                   Demobucket v2
GID   name                      GID   name      gender
1234  Roy                       1241  Patricia  f
1237  Douglas                   1235  Moss      m
1238  Denholm                   1236  Jen       f
1239  Richmond
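The migration-on-update behaviour in the Demobucket example can be sketched in Python. This is a simplification under assumed semantics: reads fall through to older versions, updates land in the latest version and remove the older copy.

```python
# Versioned buckets for "Demobucket": v1 is read only, v2 adds gender.
buckets = {
    1: {1234: {"name": "Roy"}, 1235: {"name": "Moss"}, 1236: {"name": "Jen"}},
    2: {1241: {"name": "Patricia", "gender": "f"}},
}
LATEST = 2

def read(gid):
    # Newest version wins; old versions still serve unmigrated records.
    for version in sorted(buckets, reverse=True):
        if gid in buckets[version]:
            return buckets[version][gid]
    return None

def update(gid, attrs):
    # An update migrates the record into the latest bucket version.
    record = dict(read(gid) or {})
    record.update(attrs)
    for version in buckets:
        buckets[version].pop(gid, None)
    buckets[LATEST][gid] = record

update(1235, {"gender": "m"})   # Moss moves from v1 to v2
```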
36. Initial design: Multi-Master writes
• Every cluster (two masters) contains two shards
• Data written interleaved
• HA for both shards
• No warmup needed
  • Both masters active and “warmed up”
[Diagram: the SSP writing to Shard 1 and Shard 2]
37. SSP Master-Master setup
[Diagram: Server-1 through Server-n read and write, via MMM, to two active MySQL masters, db-ssp001 (192.168.2.1) and db-ssp002 (192.168.2.2), kept in sync with asynchronous replication]
38. SSP Master-Master setup (failure)
[Diagram: the master on db-ssp001 (192.168.2.1) is broken; MMM redirects all reads and writes from Server-1 through Server-n to the remaining active master on db-ssp002 (192.168.2.2)]
39. New design: SSP Galera setup
[Diagram: Server-1 through Server-n read/write to any node through a load balancer in front of a Galera cluster of MySQL nodes using synchronous replication]
40. Why does Galera make sense?
• GID write isolation
  • One single process is responsible for a GID
  • The same rows can’t be written at the same time
• Synchronous replication ensures consistency
  • Replication can’t fall behind
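The write-isolation property can be sketched as follows. In production this is one Erlang process owning each GID; here a Python worker thread per GID stands in, and the single-threaded dispatch is an assumption of the sketch.

```python
import queue
import threading

data = {}        # stands in for the Galera-backed storage
workers = {}     # one writer queue per GID

def writer_loop(q):
    # Single consumer: all writes for one GID are applied in order,
    # so the same rows are never written concurrently.
    while True:
        gid, attrs = q.get()
        data.setdefault(gid, {}).update(attrs)
        q.task_done()

def write(gid, attrs):
    # Assumes a single dispatching thread, like the owning Erlang process.
    if gid not in workers:
        q = queue.Queue()
        threading.Thread(target=writer_loop, args=(q,), daemon=True).start()
        workers[gid] = q
    workers[gid].put((gid, attrs))

write(1234, {"name": "Roy"})
write(1234, {"score": 10})
workers[1234].join()   # wait for the GID's queue to drain
```

Because writes for a GID never race, Galera's certification-based replication sees no conflicting transactions on the same rows.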
41. SSP in practice
• Not finished yet
  • Difficult to add development resources
  • Teething problems delayed the migration off the old architecture
• Master-Master was a (very) bad idea
• 2 Galera clusters already in production
• Tools had to be built from scratch
• Initial design lacked parallel writes (bulk imports)
• Erlang uses a connection pool to MySQL
  • Initially set too small
  • Bursts of writes larger than the queue
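The pool-sizing pitfall above can be illustrated with a toy pool; the sizes and the fail-fast behaviour are illustrative only, the real pool lives in the Erlang layer.

```python
import queue

class ConnPool:
    """Fixed-size pool that fails fast when a write burst exhausts it."""

    def __init__(self, size):
        self.conns = queue.Queue()
        for i in range(size):
            self.conns.put(f"conn-{i}")   # stand-in for a MySQL connection

    def acquire(self):
        try:
            return self.conns.get_nowait()
        except queue.Empty:
            raise RuntimeError("pool exhausted: burst larger than the pool")

    def release(self, conn):
        self.conns.put(conn)

pool = ConnPool(size=2)
held = [pool.acquire(), pool.acquire()]
try:
    pool.acquire()              # a third in-flight write overflows the pool
    overflowed = False
except RuntimeError:
    overflowed = True
pool.release(held.pop())
```

A real pool would typically queue waiters with a timeout rather than fail immediately; either way, the pool and its wait queue must be sized for the largest expected write burst.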
43. ROAR
• Stores infrequently written data (e.g. the game catalogue)
• Uses GIDs for games
• Data can be cached aggressively
• API is the same as the SSP’s
  • Data access patterns are different
• Data is not sharded but replicated
• Components used:
  • Memcache: data caching
  • MySQL: denormalized PK data storage
  • Sphinx: string-to-GID translation
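ROAR's read path can be sketched as a read-through cache; the dicts below stand in for Memcache and the denormalized MySQL tables, and the function name is hypothetical.

```python
cache = {}                                     # stands in for Memcache
database = {101: {"title": "Bubble Shooter"}}  # toy denormalized catalogue

def get_game(gid):
    """Read often: serve from cache; fall back to the database on a miss."""
    if gid in cache:
        return cache[gid]
    record = database.get(gid)                 # stands in for a MySQL PK lookup
    if record is not None:
        cache[gid] = record                    # alter rarely: safe to cache hard
    return record

first = get_game(101)    # cache miss, hits the database
second = get_game(101)   # served from cache
```

Because the data is altered rarely, entries can be cached with long lifetimes and only need invalidation on the infrequent writes.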
44. SSP Architecture overview
[Diagram: a portal, JsLib (JavaScript, NginX) and (3rd party) games or apps talk through a load balancer to the SPAPI (NginX, Piqi, Mochiweb), native Erlang services (NginX, Piqi, Webmachine) and ROAR (NginX, PiqiRPC, MySQL), all backed by MySQL]
47. ROAR MySQL
[Diagram: Server-1 through Server-n connect through a load balancer (port 3306 for writes, 3307 for reads) to a Galera cluster of MySQL nodes; two additional read-only MySQL replicas, each paired with a Sphinx instance, are fed by asynchronous replication]
50. ROAR in practice
• Sphinx can handle 4k queries per second
  • Small increases in response times are critical
• Sphinx indexing is CPU intensive
  • Used taskset to pin indexing to a single CPU
  • Real-time indexing
• Denormalized tables designed for a limited set of data
  • Large increases in data are bad
• Not globally distributed yet
52. Thank you!
• Presentation can be found at:
  http://spil.com/plsc2014storage
• If you wish to contact me:
  Email: art@spilgames.com
  Twitter: @banpei
• Engineering @ Spil Games
  Blog: http://engineering.spilgames.com
  Twitter: @spilengineering
53. Photo sources
• Getting closer to the user:
  https://www.flickr.com/photos/wallyonwater54/10360023775/
  (Creative Commons license)
• ROAR:
  https://www.flickr.com/photos/nickharris1/5939895418/in/photostream/
  (Creative Commons license)