
Snowplow: open source game analytics powered by AWS

This is a presentation by Alex Dean and Yali Sassoon of Snowplow on open source game analytics powered by AWS. It was presented at the Game Developers Conference (GDC) in San Francisco in February 2017.



  1. Snowplow: open source game analytics powered by AWS
  2. Hello! We’re Alex and Yali, the cofounders of Snowplow
     • Snowplow is an open source event data pipeline built on AWS tech
     • Collect granular, rich, event-level data across digital platforms
     • Validate, enrich, model and deliver that data to the places where it can be analysed and acted on
  3. Wonder at what the data makes possible drove us to create Snowplow
     • Digital event data is rich, behavioural information on how millions of people do things (play, work, socialise, flirt, unwind etc.), collected at scale
     • Endless possibilities to ask and answer different questions, build intelligence and act on that intelligence
     • Packaged solutions do a poor job of enabling companies to realise all the different possibilities presented by this data
     • Lots of companies build their own event data pipelines to realise those possibilities. If we can build a standard pipeline, companies can focus on doing things with the data
  4. A call to arms for games analysts
  5. Games companies are typically very analytically sophisticated
     • They invest, often early, in an event data warehouse / data pipeline
     • Analytics is often very specific to each game: packaged solutions can only get you so far
     • Data sophistication is a competitive advantage
     • Larger game studios typically have very large data teams (engineering, science and analysis) and significant analytics infrastructure that they have built themselves
  6. But you don’t need to build your own event data pipeline from scratch
     • We have a tried and tested open-source stack that you can deploy directly to your own AWS account
     • Built on top of AWS services incl. Kinesis, Lambda, Redshift, Elasticsearch, S3 and EMR
     • Use your data engineers to build analyses specific to your game, not to re-build the pipe!
  7. Building high-quality event data pipelines is hard
     • Data quality
     • Schema evolution
     • Enrichment
     • Data modeling
  8. Today Snowplow is used by games studios… and by companies in other sectors
  9. Snowplow and our early gaming influences
  10. Early work with games studios heavily influenced our thinking
      • Flexible data schemas that evolve!
      • Event grammar: events vs entities
      • Evolving data models: understanding sequences of play
  11. Game analytics has grown up
  12. Game analytics encompasses a lot
      • Product analytics: use data to improve the game
      • Customer acquisition analytics: sustainably drive user growth
      • Game health analytics: monitor the game
      • Data-driven applications within the game, e.g. player matching
      • Plenty more that is specific to your game
  13. We distinguish between analytics on read and analytics on write
      Analytics on read:
      • Decide how you want to process the data at the point of query
      • Prioritise having the flexibility to query the data in rich and varied ways
      • De-prioritise query latency
      • Example: product analytics
      Analytics on write:
      • Define in advance how the data will be queried
      • Prioritise low latency
      • De-prioritise query flexibility
      • Example: game health monitoring
      Different architectures are appropriate for these two cases
  14. With Snowplow, we meet both requirements via a Lambda Architecture
      • Analytics on write: Kinesis + AWS Lambda / Spark Streaming
      • Analytics on read: Redshift / Spark / Athena
  15. Analytics on read
  16. Analytics on read example 1: A/B testing to drive product development
      • Limitless possibilities for experiments
      • A wide set of metrics that you might be looking to influence with each experiment
      • Tracking the experiments should be easy
      • All enabled by the flexibility to compute segments and metrics after the fact (at query time)
  17. Delivering the A/B testing framework with Redshift and/or Spark on EMR
      Process:
      • Product manager defines each A/B test in advance, incl. KPI and success threshold
      • A rolling programme of tests runs each week
      • Test history is documented
      Technology:
      • An event is tracked to indicate that a user has been assigned to a specific group and that a particular experiment is running (see the tracking sketch below)
      • The KPI can be measured after the fact
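To make the tracking side concrete, here is a minimal sketch using the Snowplow Python tracker (the snowplow-tracker package on PyPI). The collector host, schema URI, experiment name and variant are hypothetical, and exact tracker method names vary between tracker versions.

    # Track an A/B test assignment as a self-describing event; the KPI is
    # then computed after the fact by joining this event to the rest of the
    # event stream in Redshift / Spark.
    from snowplow_tracker import Emitter, SelfDescribingJson, Tracker

    emitter = Emitter("collector.mygamestudio.com")  # hypothetical collector host
    tracker = Tracker(emitter, app_id="my-game")

    tracker.track_self_describing_event(SelfDescribingJson(
        "iglu:com.mygamestudio/experiment_assignment/jsonschema/1-0-0",  # hypothetical schema
        {"experiment": "new-tutorial-flow", "variant": "B"},
    ))

Because the assignment is just another event in the stream, any metric computable from the event data can become the experiment’s KPI at query time.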
  18. Analytics on read example 2: level optimisation analytics
  19. Delivering level analytics with Redshift and/or Spark on EMR
      Process:
      • Define key metrics to understand player engagement with each level
      • Build out a data modeling process to compute level aggregations on the underlying event stream
      • Extend over time: build out more sophisticated metrics as your understanding of play evolves
      Technology:
      • Attach level metadata to all events
      • Aggregate the event stream in Redshift / Spark (see the sketch below)
      • Recompute over historical data as new metrics are developed
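As one way to implement the aggregation step, here is a minimal PySpark sketch for Spark on EMR. The S3 paths, event names and metrics are illustrative assumptions, not Snowplow’s canonical data model.

    # Compute per-level engagement metrics from the enriched event stream in
    # S3; because the raw events are kept, the job can be re-run over all
    # history whenever a new metric is defined.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("level-analytics").getOrCreate()

    events = spark.read.json("s3://my-bucket/enriched/events/")  # hypothetical path

    level_stats = (
        events
        .filter(F.col("event_name").isin("level_started", "level_completed"))
        .groupBy("level_name")
        .agg(
            F.countDistinct("user_name").alias("unique_players"),
            F.count(F.when(F.col("event_name") == "level_completed", 1)).alias("completions"),
        )
        .withColumn("completion_rate", F.col("completions") / F.col("unique_players"))
    )

    level_stats.write.mode("overwrite").parquet("s3://my-bucket/models/level_stats/")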
  20. AWS provides a rich and growing toolkit for analytics on read
      • EMR, enabling Hadoop, Spark and Flink
      • Athena
      • Redshift
      • Elasticsearch Service
  21. Analytics on write
  22. Analytics on write example 1: surface aggregate play data in the game
      • Example: https://next.codecombat.com/play/dungeon
  23. Delivering aggregate play data into the game with Kinesis, Lambda and DynamoDB
      • Example: calculating the number of users live on each level right now
      • Elegantly handles computing complex metrics (count distincts) in real time
      • Pipeline: Kinesis event stream of { event_name, level_name, user_name, timestamp } events → AWS Lambda computes player state → DynamoDB player state table + stream of updates to player state → AWS Lambda computes level state → DynamoDB level state table (see the sketch below)
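A minimal sketch of the first Lambda in that pipeline, written with boto3. The table name, key names and event fields are illustrative assumptions; the second Lambda (maintaining per-level counts from the DynamoDB stream) would follow the same pattern.

    # Consume game events from the Kinesis stream and upsert each player's
    # current level into a hypothetical "player_state" DynamoDB table. The
    # DynamoDB stream on that table then feeds a second Lambda that keeps
    # the per-level player counts up to date.
    import base64
    import json

    import boto3

    player_state = boto3.resource("dynamodb").Table("player_state")  # hypothetical table

    def handler(event, context):
        for record in event["Records"]:
            # Kinesis delivers record payloads base64-encoded
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            if payload.get("event_name") != "level_started":
                continue
            player_state.update_item(
                Key={"user_name": payload["user_name"]},
                UpdateExpression="SET current_level = :l, updated_at = :t",
                ExpressionAttributeValues={
                    ":l": payload["level_name"],
                    ":t": payload["timestamp"],
                },
            )

Splitting the computation into two keyed state tables is what makes the count-distinct cheap: each player is a single DynamoDB item, so “users live on each level now” reduces to counting items per level rather than de-duplicating the raw stream.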
  24. Analytics on write example 2: tiered support based on player LTV
      Triage users based on expected lifetime value (LTV):
      1. Standard user: minimise support cost
      2. Silver user: personalised service
      3. Platinum user: concierge service
  25. Delivering tiered support using Kinesis, Lambda, DynamoDB and API Gateway
      • Example: computing customer lifetime value and serving it from a customer API
      • Pipeline: Kinesis event stream of { event_name, user_name, transaction_value, timestamp } events → AWS Lambda computes player lifetime value → DynamoDB player state table + stream → API Gateway serves player state → support tooling triages the player’s support tier (see the sketch below)
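A minimal sketch of the LTV-computing Lambda, again with boto3. The table name, field names and tier thresholds are illustrative assumptions.

    # Accumulate each player's transaction values into a running lifetime
    # value and derive a support tier from it; API Gateway can then serve
    # the item straight out of the player_state table.
    import base64
    import json
    from decimal import Decimal

    import boto3

    player_state = boto3.resource("dynamodb").Table("player_state")  # hypothetical table

    def support_tier(ltv):
        # Illustrative thresholds only
        if ltv >= 500:
            return "platinum"
        if ltv >= 100:
            return "silver"
        return "standard"

    def handler(event, context):
        for record in event["Records"]:
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            if "transaction_value" not in payload:
                continue
            # ADD is atomic, so concurrent invocations cannot lose updates
            updated = player_state.update_item(
                Key={"user_name": payload["user_name"]},
                UpdateExpression="ADD lifetime_value :v",
                ExpressionAttributeValues={":v": Decimal(str(payload["transaction_value"]))},
                ReturnValues="UPDATED_NEW",
            )
            ltv = updated["Attributes"]["lifetime_value"]
            player_state.update_item(
                Key={"user_name": payload["user_name"]},
                UpdateExpression="SET tier = :t",
                ExpressionAttributeValues={":t": support_tier(ltv)},
            )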
  26. AWS provides a rich and growing toolkit for analytics on write
      Stream processing frameworks:
      • Spark Streaming on EMR
      • Kinesis Client Library
      Serverless event processing:
      • AWS Lambda
      • Kinesis Analytics
  27. Design considerations for game analytics
  28. 1. Keep your analytics stack independent from your game’s stack
      • Evolve game and analytics independently: helpful for larger teams; reduces fragility
      • Best-of-breed components for analytics and game: limited overlap between the best tools for game engines and the best tools for event analytics
      • Handle order-of-magnitude different scale requirements: game event volumes will dwarf active game data
  29. 2. Develop your analytics on read first, then migrate them to on write
      • Example: a customer acquisition model that sets bid prices for different user cohorts
      • The model is developed, tested and trained on historical data in the data warehouse
      • The model is then put live on real-time data, in-stream
  30. 3. Have a formal framework for managing change
      • Change is inevitable through the lifetime of the game:
        • The game evolves
        • Analysts and scientists ask new questions of the game
      • The analytics team must agree a framework to handle:
        • Updates to the in-game event and entity schemas (affects the developers; see the schema versioning sketch below)
        • Evolution of the event data modeling (affects the wider company)
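Snowplow handles the first kind of change with self-describing events: every payload names the exact schema version it conforms to, versioned with SchemaVer (MODEL-REVISION-ADDITION). A small sketch, with a hypothetical vendor and event name:

    # A self-describing event names the exact schema it conforms to via an
    # Iglu URI, so the pipeline can validate it and downstream consumers
    # know precisely what shape to expect.
    level_started_v1 = {
        "schema": "iglu:com.mygamestudio/level_started/jsonschema/1-0-0",
        "data": {"level_name": "dungeon-3", "user_name": "player-42"},
    }

    # Adding an optional field is a non-breaking change, so only the
    # ADDITION component of the version is bumped; events sent against
    # 1-0-0 remain valid alongside the new ones.
    level_started_v2 = {
        "schema": "iglu:com.mygamestudio/level_started/jsonschema/1-0-1",
        "data": {
            "level_name": "dungeon-3",
            "user_name": "player-42",
            "difficulty": "hard",  # new optional field
        },
    }

Breaking changes (renaming or removing a field, changing a type) bump the MODEL component instead, which gives developers and the wider company an explicit signal that downstream data models need to follow.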
  31. A call to arms for games analysts
  32. Standardise on your event data pipeline
      • Why re-invent the wheel?
      • Deploy our tried and tested open-source stack directly in your AWS account
      • Use your data engineers to build analyses specific to your game, not to re-build the pipe!
  33. Learn more
      • http://snowplowanalytics.com
      • https://github.com/snowplow/snowplow
  34. Thank you for attending #AmazonDevDay. Please take a moment to complete our survey for a chance to win the grand prize: bit.ly/DevDaySurvey. Q&A will be in a room on the third floor.
