Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017

Byron Galbraith is the Chief Data Scientist and co-founder of Talla, where he works to translate the latest advancements in machine learning and natural language processing to build AI-powered conversational agents. Byron has a PhD in Cognitive and Neural Systems from Boston University and an MS in Bioinformatics from Marquette University. His research expertise includes brain-computer interfaces, neuromorphic robotics, spiking neural networks, high-performance computing, and natural language processing. Byron has also held several software engineering roles including back-end system engineer, full stack web developer, office automation consultant, and game engine developer at companies ranging in size from a two-person startup to a multi-national enterprise.

Abstract Summary:

Bayesian Bandits:
What color should that button be to convert more sales? What ad will most likely get clicked on? What movie recommendations should be displayed to keep subscribers engaged? What should we have for lunch? These are all examples of iterated decision problems — the same choice has to be made repeatedly with the goal being to arrive at an optimal decision strategy by incorporating the results of the previous decisions. In this talk I will describe the Bayesian Bandit solution to these types of problems, how it adaptively learns to minimize regret, how additional contextual information can be incorporated, and how it compares to the more traditional A/B testing solution.

1. Bayesian Bandits. Byron Galbraith, PhD, Cofounder / Chief Data Scientist, Talla. 2017.03.24
2. Bayesian Bandits for the Impatient: (1) online adaptive learning, “earn while you learn”; (2) a powerful alternative to A/B testing for optimization; (3) can be efficient and easy to implement.
3. Dining Ware VR Experiences on Demand
4. Dining Ware VR Experiences on Demand
5. Iterated Decision Problems: What product recommendations should we present to subscribers to keep them engaged?
6. A/B Testing
7. Exploit vs Explore - What should we do? Choose what seems best so far: 🙂 feel good about our decision; 🙂 there still may be something better. Try something new: 😄 discover a superior approach; 😧 regret our choice.
8. A/B/n Testing
9. Regret - What did that experiment cost us?
10. The Multi-Armed Bandit Problem. http://blog.yhat.com/posts/the-beer-bandit.html
11. Bandit Solutions. The k-armed bandit is the tuple $k\text{-MAB} = \langle A, Y, P, r \rangle$, and the cumulative regret after $T$ rounds is $R_T = \sum_{t=1}^{T} \left[ r(Y_t^{a^*}) - r(Y_t^{a_t}) \right]$.
   • Incremental value estimate: $\bar{r}_a^{\,n+1} = \bar{r}_a^{\,n} + \frac{1}{n_a}\left(r_a^{\,t} - \bar{r}_a^{\,n}\right)$
   • Upper confidence bound (UCB) selection: $a_t = \operatorname{argmax}_i \left[ \bar{r}_i^{\,t} + c\sqrt{\frac{\log t}{n_i}} \right]$
   • Gradient bandit (softmax policy): $P(A_t = a) = \frac{e^{h_a^n}}{\sum_{b=1}^{k} e^{h_b^n}} = \pi_t(a)$, with preference updates $h_a^{n+1} = h_a^{n} + \alpha\left(r_a^{\,t} - \bar{r}_a^{\,n}\right)\left(1 - \pi_t(a)\right)$ and $h_b^{n+1} = h_b^{n} - \alpha\left(r_a^{\,t} - \bar{r}_a^{\,n}\right)\pi_t(b)$ for $b \neq a$
   • Bayesian approach: Beta prior $P(X = x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}$, Binomial likelihood $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$, posterior $\mathrm{Beta}_a(\alpha + r_a,\, \beta + N - r_a)$, via Bayes' rule $P(X \mid Y, Z) = \frac{P(Y \mid X, Z)\, P(X \mid Z)}{P(Y \mid Z)}$
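The incremental estimate and the UCB selection rule translate almost line for line into code. A minimal Python sketch, assuming Gaussian rewards and made-up arm means (nothing here is from the talk's own code):

```python
import math
import random

def ucb1(true_means, T=10_000, c=2.0):
    """Run a k-armed bandit with UCB1 selection and incremental mean estimates."""
    k = len(true_means)
    counts = [0] * k    # n_i: number of pulls per arm
    means = [0.0] * k   # incremental estimates of each arm's mean reward
    for t in range(1, T + 1):
        if t <= k:
            a = t - 1   # pull each arm once to initialize the estimates
        else:
            # a_t = argmax_i [ r_i + c * sqrt(log t / n_i) ]
            a = max(range(k),
                    key=lambda i: means[i] + c * math.sqrt(math.log(t) / counts[i]))
        reward = random.gauss(true_means[a], 1.0)  # simulated noisy reward
        counts[a] += 1
        means[a] += (reward - means[a]) / counts[a]  # r_a += (r - r_a) / n_a
    return means, counts

_, counts = ucb1([0.1, 0.5, 0.9])
print(counts)  # the best arm (index 2) should receive most of the pulls
```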
12. Thompson Sampling: P(θ | r, a) ∝ P(r | θ, a) · P(θ | a), i.e. posterior ∝ likelihood × prior.
13. Bayesian Bandits – The Model. Model whether a recommendation will result in user engagement. • Bernoulli distribution: p is the likelihood of the event occurring. How do we find p? • Use a conjugate prior: the Beta distribution, where α is the number of hits and β is the number of misses. • We only need to keep track of two numbers per option: the number of hits and the number of misses.
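Conjugacy is what makes this cheap: with a Beta prior and Bernoulli observations, the posterior is again a Beta whose parameters are just the running hit and miss counts. A small illustration using SciPy (the counts are invented for the example):

```python
from scipy.stats import beta

prior_alpha, prior_beta = 1, 1   # uniform Beta(1, 1) prior: no evidence yet
hits, misses = 27, 73            # hypothetical engagement outcomes for one option

posterior = beta(prior_alpha + hits, prior_beta + misses)
print(posterior.mean())          # ≈ 0.27, the estimated engagement rate p
print(posterior.interval(0.95))  # 95% credible interval for p
```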
14. Bayesian Bandits – The Algorithm. (1) Initialize αᵢ = βᵢ = 1 for every option i (uniform prior). (2) For each user request for recommendations t: sample pᵢ ~ Beta(αᵢ, βᵢ) for each option; choose the action with the largest sampled pᵢ; observe reward rₜ; update the chosen action's counts, α += rₜ and β += 1 − rₜ.
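A minimal NumPy sketch of that loop; the engagement rates and the number of simulated requests are assumptions for illustration, not figures from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
true_rates = [0.04, 0.06, 0.10]       # hypothetical per-option engagement rates
k = len(true_rates)
alpha = np.ones(k)                    # step 1: uniform Beta(1, 1) priors
beta_counts = np.ones(k)

for t in range(10_000):               # step 2: one iteration per user request
    p = rng.beta(alpha, beta_counts)  # sample p_i ~ Beta(alpha_i, beta_i)
    a = int(np.argmax(p))             # choose the action with the largest sample
    r = float(rng.random() < true_rates[a])  # observe simulated 0/1 reward
    alpha[a] += r                     # update only the chosen action's counts
    beta_counts[a] += 1 - r

print(alpha / (alpha + beta_counts))  # posterior means approach the true rates
```

Because each arm's belief is sampled rather than point-estimated, uncertain arms still get tried occasionally, which is how Thompson Sampling trades off exploration against exploitation.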
15.–19. Belief Adaptation (a sequence of figure slides)
20. Bandit Regret
21. But behavior is dependent on context. • Categorical contexts: one bandit model per category, or a one-hot context vector. • Real-valued contexts: can capture interrelatedness of context dimensions, but more difficult to incorporate effectively.
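For the categorical case, "one bandit model per category" can be as simple as keeping an independent set of Beta counts per context value. A sketch under that assumption (the category key and action count are invented for illustration):

```python
import random
from collections import defaultdict

class PerCategoryBandits:
    """One independent Beta-Bernoulli Thompson sampler per categorical context."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        # per-category [alpha, beta] pairs, all starting at Beta(1, 1)
        self.counts = defaultdict(lambda: [[1, 1] for _ in range(n_actions)])

    def choose(self, category):
        samples = [random.betavariate(a, b) for a, b in self.counts[category]]
        return max(range(self.n_actions), key=samples.__getitem__)

    def update(self, category, action, reward):
        self.counts[category][action][0] += reward      # alpha += r
        self.counts[category][action][1] += 1 - reward  # beta += 1 - r

bandits = PerCategoryBandits(n_actions=3)
a = bandits.choose("weekend_user")       # hypothetical context category
bandits.update("weekend_user", a, reward=1)
```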
22. So why would I ever A/B test again? Test intent (optimization vs understanding); difficulty with non-stationarity (Monday vs Friday behavior); deployment (few turnkey options, specialized skill set). https://vwo.com/blog/multi-armed-bandit-algorithm/
23. Bayesian Bandits for the Patient: (1) Thompson Sampling balances exploitation and exploration while minimizing decision regret; (2) no need to pre-specify decision splits or a time horizon for experiments; (3) can model a variety of problems and complex interactions.
24. Resources: https://github.com/bgalbraith/bandits
