Smokey and the Multi-Armed Bandit featuring BERT Reynolds and Reinforcement Learning

377 views

https://datascienceonaws.com

https://github.com/data-science-on-aws/workshop

First, I will train and deploy multiple natural language understanding (NLU) models and compare them in live production using reinforcement learning to dynamically shift traffic to the winning model.

Next, I will describe the differences between A/B and multi-armed bandit tests including exploration-exploitation, reward-maximization, and regret-minimization.

Last, I will dive deep into the details of building and scaling a multi-armed bandit deployment on AWS using a real-time, stream-based text classifier with TensorFlow, PyTorch, and BERT on 150+ million reviews from the Amazon Customer Reviews Dataset.
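The contrast between a static A/B split and a bandit that shifts traffic toward the winning model can be sketched in a few lines of Python. This is only an illustrative simulation, not the talk's implementation: the model names, the reward rates (0.6 and 0.9), the round count, and the `epsilon` value are all assumptions chosen for the example, and regret is measured here as the expected reward lost by not always routing to the best model.

```python
import random


def epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Route traffic with an epsilon-greedy bandit; return (pulls, regret)."""
    rng = random.Random(seed)
    names = list(true_rates)
    pulls = {n: 0 for n in names}
    wins = {n: 0 for n in names}
    best = max(true_rates.values())
    regret = 0.0
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pick a model uniformly at random.
            choice = rng.choice(names)
        else:
            # Exploit: pick the model with the best observed reward rate.
            # Untried models score infinity so each one is tried at least once.
            choice = max(
                names,
                key=lambda n: wins[n] / pulls[n] if pulls[n] else float("inf"),
            )
        reward = 1 if rng.random() < true_rates[choice] else 0
        pulls[choice] += 1
        wins[choice] += reward
        regret += best - true_rates[choice]
    return pulls, regret


def static_ab_split(true_rates, rounds=5000):
    """Alternate traffic evenly between models, as in a fixed A/B test."""
    names = list(true_rates)
    best = max(true_rates.values())
    pulls = {n: 0 for n in names}
    regret = 0.0
    for i in range(rounds):
        choice = names[i % len(names)]
        pulls[choice] += 1
        regret += best - true_rates[choice]
    return pulls, regret


# Hypothetical reward rates for two models (assumed for illustration only).
rates = {"model_a": 0.6, "model_b": 0.9}
bandit_pulls, bandit_regret = epsilon_greedy(rates)
ab_pulls, ab_regret = static_ab_split(rates)
print("bandit:", bandit_pulls, "regret:", round(bandit_regret, 1))
print("a/b:   ", ab_pulls, "regret:", round(ab_regret, 1))
```

The static split keeps sending half of the traffic to the weaker model for the whole experiment, while the bandit spends only a small exploration budget on it once its estimates settle.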

Published in: Technology

  1. Chris Fregly, Developer Advocate, AI and Machine Learning @ AWS • Smokey and the Multi-Armed Bandit Featuring BERT Reynolds
  2. Abstract: First, I will train and deploy multiple natural language understanding (NLU) models and compare them in live production using reinforcement learning to dynamically shift traffic to the winning model. Next, I will describe the differences between A/B and multi-armed bandit tests including exploration-exploitation, reward-maximization, and regret-minimization. Last, I will dive deep into the details of building and scaling a multi-armed bandit deployment on AWS using a real-time, stream-based text classifier with TensorFlow, PyTorch, and BERT on 150+ million reviews from the Amazon Customer Reviews Dataset.
  3. Me • Developer Advocate, AI and Machine Learning @ AWS (Based in San Francisco) • Co-Author of the O'Reilly Book, "Data Science on AWS" • Founder of the Advanced Kubeflow Meetup (Global) • https://www.datascienceonaws.com • data-science-on-aws • @cfregly • linkedin.com/in/cfregly • https://meetup.com/Advanced-Kubeflow
  4. Data Science on AWS – Book and Workshop Outline • https://www.datascienceonaws.com/
  5. Agenda • Compare A/B Tests vs. Multi-Armed Bandit Tests • Optimize Bandits with Reinforcement Learning • Train 2 BERT Language Models with TensorFlow • Train a Multi-Armed Bandit Model with Vowpal Wabbit • Test 2 BERT Models with a Bandit • DEMO: Scale Multi-Armed Bandits on AWS
  6. Traditional A/B Tests • Static • Cannot Add New Models After Test Begins • Static Traffic Split Between Models A and B • May Negatively Impact Business Metrics • Must Run Experiment to Completion • No Concept of Reward for Winning Model
  7. Multi-Armed Bandit Tests • Add New Models • Dynamically Shift Traffic • Explore-Exploit Strategy • Finish Experiment Early - or Run Longer! • Minimize Regret (Business Impact) • Maximize Reward
  8. Train 2 BERT Models with TensorFlow (Models A & B) • BERT Mania! • Fine-Tuning BERT
  9. Train a Bandit Model with Reinforcement Learning (RL) • Popular Reinforcement Learning Strategies • Epsilon Greedy • Thompson Sampling • Online Cover • Bagging • Implemented in Vowpal Wabbit (VW)! • Try Our Open Source RL Containers • https://github.com/aws/sagemaker-rl-container
  10. Test 2 BERT Models with a Multi-Armed Bandit Model
  11. DEMO: Scale Multi-Armed Bandits on AWS
  12. DEMO! © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  13. More Resources • O’Reilly Book - Data Science on AWS – Early Release Available! • https://datascienceonaws.com • GitHub Repo • https://github.com/data-science-on-aws/workshop • AWS Blog Post on Multi-Armed Bandits • https://aws.amazon.com/blogs/machine-learning/power-contextual-bandits-using-continual-learning-with-amazon-sagemaker-rl/ • Bandit Algorithms • https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Contextual-Bandit-algorithms • Open Source SageMaker Reinforcement Learning Containers • https://github.com/aws/sagemaker-rl-container
  14. Thank you! • Chris Fregly • data-science-on-aws • @cfregly • linkedin.com/in/cfregly
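Slide 9 names Thompson sampling among the bandit strategies. The idea can be sketched in plain Python as follows. This is a minimal illustration using Beta posteriors over binary rewards, not the Vowpal Wabbit implementation the slides refer to; the model names, reward rates, uniform Beta(1, 1) prior, and round count are assumptions made for the example.

```python
import random


def thompson_sampling(true_rates, rounds=5000, seed=0):
    """Route traffic by sampling each model's Beta posterior; return pulls."""
    rng = random.Random(seed)
    # Start every model from a uniform Beta(1, 1) prior (an assumption).
    successes = {n: 1 for n in true_rates}
    failures = {n: 1 for n in true_rates}
    pulls = {n: 0 for n in true_rates}
    for _ in range(rounds):
        # Draw one plausible reward rate per model from its posterior,
        # then send this request to the model with the highest draw.
        samples = {
            n: rng.betavariate(successes[n], failures[n]) for n in true_rates
        }
        choice = max(samples, key=samples.get)
        pulls[choice] += 1
        # Observe a simulated binary reward and update the posterior.
        if rng.random() < true_rates[choice]:
            successes[choice] += 1
        else:
            failures[choice] += 1
    return pulls


# Hypothetical reward rates for two models (assumed for illustration only).
ts_pulls = thompson_sampling({"model_a": 0.6, "model_b": 0.9})
print(ts_pulls)
```

Because uncertain models produce widely spread samples, they still win some draws early on; as evidence accumulates, traffic concentrates on the model with the higher observed reward rate without a fixed exploration parameter.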