
Alexandra Johnson, Software Engineer, SigOpt, at MLconf NYC 2017

Alexandra Johnson, Software Engineer, SigOpt
Alexandra works on everything from infrastructure to product features to blog posts. Previously, she worked on growth, APIs, and recommender systems at Polyvore (acquired by Yahoo). She majored in computer science at Carnegie Mellon University with a minor in discrete mathematics and logic, and during the summers she A/B tested recommendations at internships with Facebook and Rent the Runway.

Abstract Summary:

Common Problems In Hyperparameter Optimization: All large machine learning pipelines have tunable parameters, commonly referred to as hyperparameters. Hyperparameter optimization is the process by which we find the values for these parameters that cause our system to perform the best. SigOpt provides a Bayesian optimization platform that is commonly used for hyperparameter optimization, and I’m going to share some of the common problems we’ve seen when integrating it into machine learning pipelines.


Alexandra Johnson, Software Engineer, SigOpt, at MLconf NYC 2017

  1. 1. Common Problems in Hyperparameter Optimization Alexandra Johnson @alexandraj777
  2. 2. What are Hyperparameters?
  3. 3. Hyperparameter Optimization ● Hyperparameter tuning, model tuning, model selection ● Finding "the best" values for the hyperparameters of your model
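     To make the definition above concrete, here is a minimal sketch (not from the talk) of what "finding the best values" means in code: treat validation performance as a function of the hyperparameters and keep the best configuration evaluated. The model, data set, and candidate values are illustrative choices.

        from sklearn.datasets import load_digits
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.model_selection import cross_val_score

        X, y = load_digits(return_X_y=True)

        def objective(learning_rate, max_depth):
            # Hyperparameters in, a single performance number out.
            model = GradientBoostingClassifier(learning_rate=learning_rate, max_depth=max_depth)
            return cross_val_score(model, X, y, cv=3).mean()

        # Hand-picked candidates purely for illustration; the rest of the talk
        # is about how to choose candidates well.
        candidates = [(0.01, 2), (0.1, 3), (0.3, 5)]
        best = max(candidates, key=lambda hp: objective(*hp))
        print("best (learning_rate, max_depth):", best)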
  4. 4. Better Performance ● +315% accuracy boost for TensorFlow ● +49% accuracy boost for xgboost ● 41% error reduction for a recommender system
  5. 5. #1 Trusting the Defaults
  6. 6. Default Values ● Default values are an implicit choice ● Defaults are not always appropriate for your model ● You may build a classifier that looks like this:
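     The slide's own code image is not reproduced in this transcript; as a hedged stand-in, the scikit-learn snippet below shows how omitting hyperparameters silently accepts the library authors' defaults, while the tuned version makes those choices explicit (the specific values are placeholders, not recommendations).

        from sklearn.ensemble import RandomForestClassifier

        # Leaving the constructor empty is still a choice: you get whatever
        # defaults your library version ships with.
        clf_default = RandomForestClassifier()
        print(clf_default.get_params())   # inspect what was chosen for you

        # Making the hyperparameters explicit (values here are placeholders).
        clf_tuned = RandomForestClassifier(
            n_estimators=300,
            max_depth=12,
            min_samples_leaf=2,
        )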
  7. 7. #2 Using the Wrong Metric
  8. 8. Choosing a Metric ● Balance long-term and short-term goals ● Question underlying assumptions ● Example from Microsoft
  9. 9. Choose Multiple Metrics ● Composite Metric ● Multi-metric
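     As one way to read "composite metric", the sketch below (weights are hypothetical, not from the talk) folds several quantities you care about into a single number an optimizer can maximize; a multi-metric setup would instead report each quantity separately.

        def composite_metric(accuracy, latency_ms, w_acc=1.0, w_latency=0.01):
            # Reward predictive accuracy, penalize slow inference.
            # The weights encode the trade-off you decided to make.
            return w_acc * accuracy - w_latency * latency_ms

        print(composite_metric(accuracy=0.93, latency_ms=42.0))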
  10. 10. #3 Overfitting
  11. 11. Metric Generalization ● Cross validation ● Backtesting ● Regularization terms
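     For example, reporting a cross-validated score as the metric being optimized (rather than accuracy on a single split) is one guard against the tuner overfitting a lucky split; the model and data set below are illustrative.

        from sklearn.datasets import load_breast_cancer
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        X, y = load_breast_cancer(return_X_y=True)
        model = LogisticRegression(C=0.5, max_iter=5000)   # C controls regularization strength
        score = cross_val_score(model, X, y, cv=5).mean()  # 5-fold cross validation
        print("cross-validated accuracy:", score)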
  14. 14. #4 Too Few Hyperparameters
  15. 15. Optimize all Parameters at Once
  16. 16. Include Feature Parameters
  18. 18. Example: xgboost ● Optimized model always performed better with tuned feature parameters ● No matter which optimization method
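     The exact xgboost pipeline from the talk is not shown here; the hedged sketch below illustrates the idea of treating a feature parameter (number of PCA components, chosen for illustration) as just another hyperparameter and tuning it jointly with the model's own parameters.

        from sklearn.datasets import load_digits
        from sklearn.decomposition import PCA
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import Pipeline
        from xgboost import XGBClassifier

        X, y = load_digits(return_X_y=True)

        def objective(n_components, max_depth, learning_rate):
            pipe = Pipeline([
                ("pca", PCA(n_components=n_components)),    # feature parameter
                ("xgb", XGBClassifier(max_depth=max_depth,  # model parameters
                                      learning_rate=learning_rate,
                                      n_estimators=100)),
            ])
            return cross_val_score(pipe, X, y, cv=3).mean()

        # Any optimization method can now search over all three together.
        print(objective(n_components=30, max_depth=4, learning_rate=0.1))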
  19. 19. #5 Hand Tuning
  20. 20. What is an Optimization Method?
  21. 21. You are not an Optimization Method ● Hand tuning is time consuming and expensive ● Algorithms can quickly and cheaply beat expert tuning
  22. 22. Use an Algorithm ● Grid Search ● Random Search ● Bayesian Optimization
  23. 23. #6 Grid Search
  24. 24. No Grid Search
      Hyperparameters    Model Evaluations
      2                  100
      3                  1,000
      4                  10,000
      5                  100,000
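     The table above is consistent with roughly 10 candidate values per hyperparameter: a full grid over d hyperparameters requires 10^d model evaluations, as the short check below illustrates (the grid values themselves are arbitrary).

        from itertools import product

        # Four hypothetical hyperparameters, ten candidate values each.
        grid = {f"hp{i}": [round(0.1 * v, 1) for v in range(1, 11)] for i in range(4)}
        configs = list(product(*grid.values()))
        print(len(configs))  # 10**4 = 10,000 model evaluations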
  25. 25. #7 Random Search
  26. 26. Random Search ● Theoretically more effective than grid search ● Large variance in results ● No intelligence
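     A common way to run random search is scikit-learn's RandomizedSearchCV: hyperparameters are sampled from distributions under a fixed evaluation budget. The estimator, distributions, and budget below are illustrative, and re-running with a different seed shows the variance the slide mentions.

        from scipy.stats import randint, uniform
        from sklearn.datasets import load_digits
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import RandomizedSearchCV

        X, y = load_digits(return_X_y=True)
        search = RandomizedSearchCV(
            RandomForestClassifier(),
            param_distributions={
                "n_estimators": randint(50, 500),
                "max_depth": randint(2, 20),
                "max_features": uniform(0.1, 0.9),
            },
            n_iter=25,       # fixed evaluation budget
            cv=3,
            random_state=0,  # change the seed and the result can change noticeably
        )
        search.fit(X, y)
        print(search.best_params_, search.best_score_)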
  27. 27. Use an Intelligent Method ● Genetic algorithms ● Bayesian optimization ● Particle-based methods ● Convex optimizers ● Simulated annealing ● To name a few...
  28. 28. SigOpt: Bayesian Optimization Service Three API calls: 1. Define hyperparameters 2. Receive suggested hyperparameters 3. Report observed performance
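     A hedged sketch of that three-call loop, written against the SigOpt Python client as documented around the time of this talk (exact method names and arguments may differ in current versions); the API token, parameter bounds, and evaluate_model function are placeholders.

        from sigopt import Connection

        conn = Connection(client_token="YOUR_API_TOKEN")  # placeholder token

        # 1. Define hyperparameters
        experiment = conn.experiments().create(
            name="model tuning",
            parameters=[
                {"name": "max_depth", "type": "int", "bounds": {"min": 2, "max": 12}},
                {"name": "learning_rate", "type": "double", "bounds": {"min": 0.01, "max": 0.5}},
            ],
        )

        for _ in range(30):
            # 2. Receive suggested hyperparameters
            suggestion = conn.experiments(experiment.id).suggestions().create()
            value = evaluate_model(suggestion.assignments)  # placeholder: your training/eval code
            # 3. Report observed performance
            conn.experiments(experiment.id).observations().create(
                suggestion=suggestion.id,
                value=value,
            )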
  29. 29. Thank You!
  30. 30. References - by Section
      Intro: Ian Dewancker. SigOpt for ML: TensorFlow ConvNets on a Budget with Bayesian Optimization. ● Ian Dewancker. SigOpt for ML: Unsupervised Learning with Even Less Supervision Using Bayesian Optimization. ● Ian Dewancker. SigOpt for ML: Bayesian Optimization for Collaborative Filtering with MLlib.
      #1 Trusting the Defaults: Keras recurrent layers documentation.
      #2 Using the Wrong Metric: Ron Kohavi et al. Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained. ● Xavier Amatriain. 10 Lessons Learned from Building ML Systems [video at 19:03]. ● Image from PhD Comics. ● See also: SigOpt in Depth: Intro to Multicriteria Optimization.
      #4 Too Few Hyperparameters: Image from TensorFlow Playground. ● Ian Dewancker. SigOpt for ML: Unsupervised Learning with Even Less Supervision Using Bayesian Optimization.
      #5 Hand Tuning: On algorithms beating experts: Scott Clark, Ian Dewancker, and Sathish Nagappan. Deep Neural Network Optimization with SigOpt and Nervana Cloud.
      #6 Grid Search: NoGridSearch.com
  31. 31. References - by Section
      #7 Random Search: James Bergstra and Yoshua Bengio. Random Search for Hyper-Parameter Optimization. ● Ian Dewancker, Michael McCourt, Scott Clark, Patrick Hayes, Alexandra Johnson, George Ke. A Stratified Analysis of Bayesian Optimization Methods.
      Learn More: blog.sigopt.com ● sigopt.com/research
