Automated Feature Engineering at Zynga
Ben Weber
Data Science @ Zynga
November 14, 2019
Zynga Games
Our Challenge
• We have tens of millions of players and dozens of games across multiple platforms
• Our games have diverse event taxonomies
• We want to build accurate models for personalizing our gameplay experiences
“One of the holy grails of machine learning is to automate more and more of the feature engineering process.”
Pedro Domingos, CACM 2012
Our Approach
• Leverage ML libraries to automate feature engineering
• Develop Portfolio-Scale data products
• Empower our game studios with ML models
Use Cases
Applications
Propensity Models: What actions are players performing?
Segmentation: Who are our players?
Anomaly Detection: Which players are bad actors?
Recommendation: What actions should they take?
Feature Encoding
Input Dataset
• Thousands of events per player
Feature Generation
• Aggregation with FeatureTools
Output Dataset
• A single row per player
[Figure: Raw Event Data → Player Summaries]
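The encoding step above, many events in and one row per player out, can be sketched in plain Python. This is only an illustration of the aggregation idea, not the FeatureTools API; the event tuples, the `summarize_players` helper, and the column names are assumptions made for the example.

```python
from collections import defaultdict

def summarize_players(events):
    """Collapse (player_id, event_type, value) events into one row per player."""
    rows = defaultdict(lambda: defaultdict(float))
    for player_id, event_type, value in events:
        row = rows[player_id]
        row[f"COUNT({event_type})"] += 1          # how often the event occurred
        row[f"SUM({event_type}.value)"] += value  # total of the event's value
    return {player: dict(row) for player, row in rows.items()}

# Hypothetical raw events: (player_id, event_type, value)
events = [
    ("p1", "level_up", 1.0),
    ("p1", "purchase", 4.99),
    ("p1", "purchase", 0.99),
    ("p2", "level_up", 1.0),
]
summaries = summarize_players(events)
```

A player who never triggered an event simply lacks that column here; a real wide table would fill those gaps, typically with zeros.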
Propensity Models
• We predict which users are likely to act using classification models
• Game studios use propensity scores to define experiment groups
• Feature generation reduces the need for manual feature engineering
[Figure: Data Extract → Feature Engineering → Feature Application → Model Training → Model Publish]
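One way propensity scores could be turned into experiment groups is to rank players by score and split at a cutoff. The sketch below assumes the scores already exist (the classifier is not shown); the `assign_groups` helper, the group labels, and the 40% cutoff are hypothetical choices for illustration, not the studios' actual procedure.

```python
def assign_groups(scores, top_fraction=0.2):
    """Label the highest-scoring fraction of players; everyone else is 'other'."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, round(len(ranked) * top_fraction))
    likely = set(ranked[:cutoff])
    return {
        player: ("high_propensity" if player in likely else "other")
        for player in scores
    }

# Hypothetical propensity scores in [0, 1] produced by a classifier.
scores = {"p1": 0.91, "p2": 0.15, "p3": 0.62, "p4": 0.08, "p5": 0.40}
groups = assign_groups(scores, top_fraction=0.4)
```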
Segmentation
• Generated features are used as input to k-means clustering
• Archetype labels are assigned based on qualitative analysis
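For reference, k-means itself can be written in a few lines. The sketch below is a plain-Python version of Lloyd's algorithm with fixed starting centroids; the two-dimensional points stand in for generated player features and are made up for the example. In practice a library implementation would be used.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    labels = [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]
    return centroids, labels

# Hypothetical 2-feature player summaries forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The cluster indices carry no meaning on their own, which is why archetype labels are assigned afterwards by inspecting each cluster.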
Anomaly Detection
• Players are represented as 1D images
• We train an autoencoder to reduce dimensionality
• Players with large vector differences are flagged as suspect
[Figure: Autoencoder. Input vectors (players × features) pass through the input layer into a latent space, then through the output layer to produce output vectors (players × features).]
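The flagging step can be sketched on its own. The code below assumes an autoencoder has already been trained and has produced a reconstruction for each player; it only compares inputs with reconstructions. The Euclidean distance, the threshold value, and the sample vectors are assumptions for illustration.

```python
import math

def flag_suspects(inputs, reconstructions, threshold):
    """Return ids of players whose reconstruction error exceeds the threshold."""
    suspects = []
    for player_id, vector in inputs.items():
        error = math.dist(vector, reconstructions[player_id])
        if error > threshold:
            suspects.append(player_id)
    return suspects

# Hypothetical feature vectors and the autoencoder's outputs for them.
inputs = {"p1": [0.2, 0.4, 0.1], "p2": [0.9, 0.0, 0.8]}
reconstructions = {"p1": [0.25, 0.35, 0.1], "p2": [0.3, 0.3, 0.2]}
suspects = flag_suspects(inputs, reconstructions, threshold=0.5)
```

Players the autoencoder reconstructs poorly behave unlike the bulk of the population it was trained on, which is what makes them worth a closer look.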
Recommendation Systems
• Feature engineering is used for item & guild recommendations
• Cosine similarity is applied to normalized generated features
Item Recommendations
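\[ \mathrm{sim}(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert} \]
\[ \mathrm{weight}_i = \sum_{w \,\in\, \text{user neighborhood}} \mathrm{sim}(u, w) \cdot \mathrm{rating}(w, i) \]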
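The two formulas translate directly into code. This is a minimal sketch assuming each user is a list of normalized generated features and each neighbor has a dictionary of item ratings; the function names and sample data are illustrative only.

```python
import math

def cosine_similarity(u, v):
    """sim(u, v) = (u . v) / (||u|| * ||v||); returns 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def item_weight(user, neighbors, ratings, item):
    """weight_i = sum over neighbors w of sim(u, w) * rating(w, i)."""
    return sum(
        cosine_similarity(user, vector) * ratings[name].get(item, 0.0)
        for name, vector in neighbors.items()
    )

# Hypothetical feature vectors and item ratings.
user = [1.0, 0.0]
neighbors = {"w1": [1.0, 0.0], "w2": [0.0, 1.0]}
ratings = {"w1": {"sword": 5.0}, "w2": {"sword": 3.0}}
weight = item_weight(user, neighbors, ratings, "sword")
```

Here `w2` is orthogonal to the user, so only the rating from `w1` contributes to the weight.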
Feature Engineering
FeatureTools
• A Python library for deep feature synthesis
• Represents data as entity sets
• Identifies feature descriptors for transforming your data into a shallow and wide format
• Open-source version maintained by Feature Labs
Kaggle NHL Dataset
Data Frames
game_df
plays_df
Entity Sets
• Define the tables and relationships for DFS
• Operate on Pandas data frames
1-Hot Encoding
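As a reminder of what one-hot encoding does, the sketch below turns a categorical column into one 0/1 column per category using plain Python. The event names are illustrative values, not taken from the slides, and the `one_hot` helper is hypothetical; with Pandas data frames a library routine would normally do this.

```python
def one_hot(values):
    """Return the sorted category names and one 0/1 row per input value."""
    categories = sorted(set(values))
    encoded = [[1 if value == c else 0 for c in categories] for value in values]
    return categories, encoded

# Hypothetical play-event types for a single game.
event_types = ["Shot", "Goal", "Hit", "Shot"]
columns, encoded = one_hot(event_types)

# Summing each column gives a per-game count for every event type.
totals = [sum(column) for column in zip(*encoded)]
```

Once the categories are numeric columns, aggregations such as sums and means can be applied to them like any other feature.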
Deep Feature Synthesis
Applying FeatureTools
• We translate our raw tracking events into player summaries
• Supports dozens of games with diverse taxonomies
• Minimizes manual steps in our data science workflows
• Scales to millions of players and billions of records
Deployment
Tech Stack
• Databricks for PySpark
• FeatureTools for generation
• Pandas UDFs for distribution
• MLlib for predictive modeling
Pandas UDFs
• Introduced in Spark 2.3
• Provide Scalar and Grouped map operations
• Partitioned using a groupby clause
• Enable distributing code that uses Pandas
Grouped Map UDFs
[Figure: Spark Input → Pandas Input → UDF → Pandas Output → Spark Output, with several Pandas Input → UDF → Pandas Output instances running in parallel]
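The grouped map pattern is split, apply, combine: rows are partitioned by a key, a function runs on each partition independently, and the outputs are concatenated. The sketch below imitates that flow in plain Python so it runs without a cluster. It is an analogy, not PySpark code: in Spark each group arrives as a Pandas data frame and groups are processed on different workers, whereas here groups are lists of dictionaries processed one after another. The `grouped_map` and `summarize` names are hypothetical.

```python
from collections import defaultdict

def grouped_map(rows, key, func):
    """Partition rows by `key`, apply `func` to each group, and combine the results."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    output = []
    for group_key in sorted(groups):  # each group could run on a separate worker
        output.extend(func(groups[group_key]))
    return output

def summarize(group):
    """Per-group function: emit one summary row for the player in this group."""
    return [{"player_id": group[0]["player_id"], "event_count": len(group)}]

# Hypothetical event rows keyed by player.
rows = [
    {"player_id": "p1", "event": "spin"},
    {"player_id": "p2", "event": "spin"},
    {"player_id": "p1", "event": "purchase"},
]
result = grouped_map(rows, "player_id", summarize)
```

Because the per-group function only ever sees one group, code written against a single in-memory data frame can be reused unchanged across many groups.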
Feature Generation at Scale
AutoModel System
• Generates hundreds of propensity models
• Powers features in our games & live services
[Figure: Data Extract → Feature Engineering → Feature Application → Model Training → Model Publish]
Wrapping Up
Machine Learning at Zynga
Old Approach
• Custom data science and engineering work per model
• Months-long development cycles
• Ad-hoc process for deploying models to production
New Approach
• Minimal effort spent on the feature engineering stage
• No custom work for new games
• Model outputs are published to application databases
Takeaways
• Zynga is leveraging automated feature engineering to build Portfolio-Scale data products
• We are using PySpark to scale to tens of millions of players
• Feature generation has unlocked novel data products
Automated Feature Engineering at Zynga
Ben Weber
Distinguished Data Scientist
bweber@zynga.com
https://www.zynga.com/jobs/
