Practical Considerations for Continual Learning
Tom Diethe
tdiethe@amazon.com
Continual AI Meetup:
“Real-world Applications of Continual Learning”
April 28 2020
Continual Learning at Amazon
Tom Diethe (Amazon) Practical Considerations for Continual Learning April 28 2020 1 / 11
Alexa AI
What is Alexa?
A cloud-based voice service that can help you with tasks, entertainment, general information, shopping, and more
The more you talk to Alexa, the more Alexa adapts to your speech patterns, vocabulary, and personal preferences
How do we ensure that ...
we create robust and efficient AI systems?
the privacy of customer data is safeguarded?
customers are treated fairly by ML algorithms?
Failure Modes
Unintentional failures: ML system produces a formally correct but completely unsafe outcome
Outliers/anomalies
Dataset shift
Limited memory
Intentional failures: failure is caused by an active adversary attempting to subvert the system to attain her goals, such as to:
misclassify the result
infer private training data
steal the underlying algorithm
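Strict stationarity: the joint distribution of the process is unchanged by a shift in time,
$$F_X(x_{t_1}, \ldots, x_{t_n}) = F_X(x_{t_1+\tau}, \ldots, x_{t_n+\tau})$$
for all $\tau, t_1, \ldots, t_n$ and for all $n \in \mathbb{N}$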
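The equation above states that the distribution of the data does not depend on when it is observed; dataset shift is a violation of that condition. As a rough sketch only (the slides do not say which test is used), one way to probe it is to compare a reference window against a recent window with a two-sample Kolmogorov-Smirnov statistic. The window sizes, the Gaussian toy data, and the function name `ks_statistic` are all assumptions made for illustration.

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is attained at one of the observed data points.
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]       # older window
same_window = [rng.gauss(0.0, 1.0) for _ in range(500)]     # same distribution
shifted_window = [rng.gauss(1.5, 1.0) for _ in range(500)]  # mean has moved

ks_same = ks_statistic(reference, same_window)
ks_shifted = ks_statistic(reference, shifted_window)
print(f"no shift: {ks_same:.3f}, shifted: {ks_shifted:.3f}")
```

A statistic near zero is consistent with stationarity between the two windows, while a large value suggests the distribution has moved. Choosing an alert threshold is a separate decision that depends on window size and tolerance for false alarms.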
Sagemaker
Robustness & Transparency via Continual Learning
Data arrive continually
(Possibly) non-IID
Tasks may change over time (e.g. trends/fashions in shopping)
New tasks may emerge (e.g. new product categories, new marketplaces)
Robustness: How can we adapt to new data whilst retaining existing knowledge?
Transparency: How can we have systems that can signal when they’re going wrong?
Standard approaches:
Train individual models on each task, then train a combination
Maintain a single model and use regularization to fix influential parameters
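The second standard approach can be illustrated with a quadratic penalty that anchors each parameter to its old value in proportion to an importance weight. This is in the style of elastic weight consolidation, which is one common instance; the slides do not name a specific method. The toy loss, the importance values, and the function name `penalized_step` are assumptions for illustration only.

```python
def penalized_step(theta, grad_new, theta_old, importance, lam=1.0, lr=0.005):
    """One gradient step on: new-task loss + (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2."""
    return [
        t - lr * (g + lam * f * (t - t_old))
        for t, g, t_old, f in zip(theta, grad_new, theta_old, importance)
    ]


theta_old = [1.0, 1.0]      # parameters learned on the old task
importance = [100.0, 0.0]   # hypothetical importance: first parameter matters, second does not
target = [0.0, 0.0]         # the new task pulls both parameters towards zero

theta = list(theta_old)
for _ in range(2000):
    # Gradient of the toy new-task loss 0.5 * (theta_i - target_i)^2
    grad_new = [t - tgt for t, tgt in zip(theta, target)]
    theta = penalized_step(theta, grad_new, theta_old, importance)

print(theta)  # first parameter stays close to 1.0, second moves to about 0.0
```

The important parameter is held near its old value while the unimportant one is free to adapt, which is the trade-off between retaining existing knowledge and fitting new data.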
Bayesian Continual Learning [Nguyen 2018]
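Given e.g. data in task $t$ as $D_t = \{x_t^{(n_t)}, y_t^{(n_t)}\}_{n_t=1}^{N_t}$, parameters $\theta$ (e.g. BLR, BNN, GP ...)
$$
p(\theta \mid D_{1:T}) \propto p(\theta)\, p(D_{1:T} \mid \theta)
= p(\theta) \prod_{t=1}^{T} \prod_{n_t=1}^{N_t} p\big(y_t^{(n_t)} \mid \theta, x_t^{(n_t)}\big)
\propto p(\theta \mid D_{1:T-1})\, p(D_T \mid \theta).
$$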
Natural recursive algorithm!
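The recursion can be checked numerically in a case where the posterior is available in closed form. The sketch below swaps in a deliberately simple model, a one-parameter linear regression y = theta * x + noise with a Gaussian prior and known noise variance, rather than any model used in the talk. The names `update` and `NOISE_VAR`, the prior, and the synthetic tasks are assumptions for illustration.

```python
import math
import random

NOISE_VAR = 0.25  # assumed known observation noise variance


def update(prior_mean, prior_prec, xs, ys):
    """Fold one task's data into a Gaussian belief over theta (mean, precision)."""
    post_prec = prior_prec + sum(x * x for x in xs) / NOISE_VAR
    weighted = prior_prec * prior_mean + sum(x * y for x, y in zip(xs, ys)) / NOISE_VAR
    return weighted / post_prec, post_prec


rng = random.Random(0)
true_theta = 2.0
tasks = []
for _ in range(3):
    xs = [rng.uniform(-1.0, 1.0) for _ in range(20)]
    ys = [true_theta * x + rng.gauss(0.0, math.sqrt(NOISE_VAR)) for x in xs]
    tasks.append((xs, ys))

# Recursive: the posterior after task t-1 becomes the prior for task t.
mean, prec = 0.0, 1.0
for xs, ys in tasks:
    mean, prec = update(mean, prec, xs, ys)

# Batch: condition on all tasks at once, starting from the original prior.
all_xs = [x for xs, _ in tasks for x in xs]
all_ys = [y for _, ys in tasks for y in ys]
batch_mean, batch_prec = update(0.0, 1.0, all_xs, all_ys)

print(mean, batch_mean)
print(prec, batch_prec)
```

The two routes agree up to floating-point rounding, so earlier data need not be stored once its posterior has been computed. For models without a closed-form posterior the update has to be approximated, and the approximation error can accumulate across tasks.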
Engineering a Continual Learning System
Automating Data Retention Policies:
Sketcher/Compressor: when the data rate is too high
Joiner: when labels arrive late
Shared infrastructure: optimal use of space, like an OS cache
Automating Monitoring and Quality Control:
Data monitoring: dataset shift detection, anomaly detection
Prediction monitoring: monitor performance of models
Automating the ML Life-Cycle:
Trainer and HPO (hyperparameter optimization): store provenance, warm-start training
Model policy engine: ensure retraining is performed at the right cadence
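As one possible illustration of a data retention policy for a stream whose rate is too high to store in full, the sketch below keeps a fixed-size uniform sample using reservoir sampling (Algorithm R). The slides do not state which sampling or sketching method is used; the class name `Reservoir` and its interface are hypothetical.

```python
import random


class Reservoir:
    """Fixed-size uniform random sample of a stream (Algorithm R)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self._rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            # Still filling up: keep everything.
            self.items.append(item)
        else:
            # Keep the new item with probability capacity / seen,
            # evicting a uniformly chosen resident item.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item


reservoir = Reservoir(capacity=100, seed=0)
for record in range(10_000):
    reservoir.add(record)

print(len(reservoir.items), reservoir.seen)
```

Memory use stays bounded regardless of stream length, and every record seen so far has the same chance of being retained. That makes the retained sample usable as training or validation data without a second pass over the stream.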
“Zero-Touch” Machine Learning
[Figure: system architecture diagram. Recoverable component labels: Streams; Sketcher/Sampler; Joiner; Data Monitoring (anomaly detection, distribution shift measurement; data statistics); Predictor (predictions); Prediction Monitoring (accuracy, shift; prediction statistics); Business Logic (business metrics, costs, desired accuracy); Model Policy Engine (retrain, rollback); Trainer; HPO; Model Stream; System State DB; Diagnostic Logs; Shared Infrastructure (Model DB, Training Data Reservoir, Validation Data Reservoir).]
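The diagram labels suggest that the model policy engine consumes monitoring signals and business targets and emits retrain or rollback actions. The following is a purely hypothetical sketch of such a decision rule. The field names, the `decide` function, and the rule itself are assumptions, not the system described in the talk.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    accuracy: float           # latest accuracy from prediction monitoring
    previous_accuracy: float  # accuracy of the previously deployed model
    shift_detected: bool      # flag raised by data monitoring
    desired_accuracy: float   # target supplied by business logic


def decide(s: Signals) -> str:
    """Return 'rollback', 'retrain', or 'keep' for the current model."""
    below_target = s.accuracy < s.desired_accuracy
    # The data looks unchanged and the previous model met the target,
    # so the current model is the likely culprit.
    if below_target and not s.shift_detected and s.previous_accuracy >= s.desired_accuracy:
        return "rollback"
    # The data has moved or the model underperforms: refresh the model.
    if s.shift_detected or below_target:
        return "retrain"
    return "keep"


print(decide(Signals(0.80, 0.95, False, 0.90)))  # rollback
print(decide(Signals(0.95, 0.95, True, 0.90)))   # retrain
```

A production policy would also need to weigh retraining cost and label delay, which this sketch leaves out.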
Summary: Continual Learning
Bayesian methods are a natural fit for continual learning
However, it’s tricky to make them work well with deep learning methods
Many interesting methodological improvements are happening, but most are still not production-ready
An engineering viewpoint is also required
Questions?
tdiethe@amazon.com