Dietterich
2nd edition
1
Dietterich #MLSEV 2
State of the Art in
Machine Learning
Tom Dietterich
Chief Scientist, BigML, Inc
Dietterich #MLSEV 3
• Carnegie-Mellon University
• Organizers: Jaime Carbonell, Tom Mitchell,
Ryszard Michalski
• Attendees: ~30
• Topics:
• Exact learning
• Compression
• Supervised learning with noise-free labels
1980: First Machine
Learning Workshop
Dietterich #MLSEV 4
• Generalization
• Feature Engineering
• Explanation and Interpretability
• Uncertainty Quantification
• Run-Time Monitoring
• Application-Specific Metrics
Outline:
Six Challenges for ML
Dietterich #MLSEV 5
Challenge #1: Generalization
Dietterich #MLSEV 6
• Ross Quinlan introduced ID3
• Decision tree learning algorithm
• Goal: Compress chess endgame tables into
simple decision rules
• Ken Thompson had reverse-enumerated the
winning positions for certain chess endgames →
Large table of (board position, outcome) pairs
• ID3 was applied to compress these into a more
understandable representation
• Notes:
• No generalization, Noise Free
• Interpretability was important
Decision Tree Method: ID3
[Figure: "Win in 10" (Breda, 2006); ID3]
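To make the compression idea concrete, here is a minimal sketch of the split criterion ID3 uses (information gain) on a tiny table of (position features, outcome) pairs. The feature names and the four rows are hypothetical and are not taken from Quinlan's chess data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the table on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy table; the real endgame tables are far larger.
rows = [
    {"king_in_corner": True, "rook_attacked": False},
    {"king_in_corner": True, "rook_attacked": True},
    {"king_in_corner": False, "rook_attacked": False},
    {"king_in_corner": False, "rook_attacked": True},
]
labels = ["win", "win", "draw", "draw"]

gains = {a: information_gain(rows, labels, a) for a in rows[0]}
best = max(gains, key=gains.get)  # attribute ID3 would test at the root
print(best, gains)
```

In this toy table one attribute determines the outcome on its own, so a single test reproduces all four rows. That is the sense in which a tree compresses a noise-free table.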
Dietterich #MLSEV 7
• Generalization for iid data
• Assume training and runtime data are drawn
from the same distribution
• Strong theoretical guarantees
• Generalization across domains
• Causal Transportability
• Domain-Adversarial Training
Today:
Generalization is the Key
Dietterich #MLSEV 8
• Predicting Lung Cancer
• T: Lung Cancer
• C: Chest Pain
• A: Patient is taking aspirin
• K: Patient is a smoker (not observed)
• S: The distribution of A may change between training and
deployment (change of hospital)
• Goal: Create a predictive model that does not depend on S
• Guaranteed to generalize to new hospital (assuming this
causal model is correct)
Causal Transportability
(Pearl & Bareinboim, 2011)
Dietterich #MLSEV 9
• Generate all models that can make T
independent of S
• Evaluate each model on validation
data
• Keep the best model
• Guaranteed to transport across
hospitals provided that the causal
diagram is correct
Graph Surgery Technique
Encourages thinking ahead about possible changes at
deployment time
(Subbaswamy et al., 2018)
Dietterich #MLSEV 10
• Given:
• Training data points from two or more domains: $D_1$, $D_2$
• $D_1$ points are labeled pairs $(x_i, y_i)$
• $D_2$ points are unlabeled $x_i$
• Training:
• For $D_1$ points: Predict the correct label
• For all points: Predict the domain 1 vs. 2
• Find weights that give accurate predictions for $D_1$ and
chance predictions for the domain
Domain Adversarial Training
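The training recipe on this slide is often written as a saddle-point problem. The notation below is introduced here for illustration and does not appear on the slide: a feature extractor $G_f$, a label predictor $G_y$, a domain classifier $G_d$, losses $L_y$ and $L_d$, domain indicators $d_i$, sample counts $n_1$ and $n_2$, and a trade-off weight $\lambda$. It is a sketch of one common formulation, not necessarily the exact objective used in the experiments.

```latex
E(\theta_f, \theta_y, \theta_d)
  = \frac{1}{n_1} \sum_{x_i \in D_1} L_y\bigl(G_y(G_f(x_i; \theta_f); \theta_y),\, y_i\bigr)
  \;-\; \lambda \, \frac{1}{n_1 + n_2} \sum_{x_i \in D_1 \cup D_2}
        L_d\bigl(G_d(G_f(x_i; \theta_f); \theta_d),\, d_i\bigr)

(\hat\theta_f, \hat\theta_y) = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat\theta_d),
\qquad
\hat\theta_d = \arg\max_{\theta_d} E(\hat\theta_f, \hat\theta_y, \theta_d)
```

The first term rewards accurate labels on $D_1$. The second term is minimized by the domain classifier and maximized by the feature extractor, which pushes the learned features toward chance-level domain prediction.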
Dietterich #MLSEV 11
Domain Adversarial Training
Ganin, et al., JMLR 2016
Dietterich #MLSEV 12
Experiments
Dietterich #MLSEV 13
• Method assumes that the class label distributions
are not changing
• The method can be unstable. Works best if you
have at least some labeled data for the target
domain to help choose hyperparameters
Domain-Adversarial Training
Weaknesses
Dietterich #MLSEV 14
Challenge #2: Feature Engineering
Dietterich #MLSEV 15
• In 1980, Quinlan carefully designed
interpretable features with
predictive power. This is still
important today in most
applications
• Claim: Features should include
meta-data definitions
• “Numbers should never travel
alone across the internet” –Mark
Fox
• BigML flatline language
• SQL statements/procedures
• Trifacta rules
Feature Engineering
Example:
Student_Teacher_Ratio(school, time) =
$\dfrac{|\{s \mid \text{registered}(s, \text{school}, \text{time})\}|}{\sum_{t} \text{FTE}(t, \text{school}, \text{time})}$
Dietterich #MLSEV 16
• Allows data consumers to detect when the
meaning of the feature has changed even when
the feature name has not changed
• important for detecting data errors and
debugging classifier failures
Importance of
Feature Meta-Data
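A minimal sketch of how a feature could carry its own definition so that a consumer notices a change in meaning under an unchanged name. The class, the fingerprint scheme, and the second definition are hypothetical; reading the denominator of the slide's example as teacher full-time equivalents (FTE) is also an assumption.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureDefinition:
    name: str
    definition: str  # formula that travels together with the numbers

    def fingerprint(self) -> str:
        """Short hash of the definition text; differs whenever the text differs."""
        return hashlib.sha256(self.definition.encode("utf-8")).hexdigest()[:12]

def student_teacher_ratio(registered_students, teacher_fte):
    """Number of registered students divided by total teacher FTE."""
    return len(registered_students) / sum(teacher_fte)

v1 = FeatureDefinition(
    "Student_Teacher_Ratio",
    "|{s | registered(s, school, time)}| / sum_t FTE(t, school, time)",
)
# Hypothetical later version: same name, but the denominator counts heads.
v2 = FeatureDefinition(
    "Student_Teacher_Ratio",
    "|{s | registered(s, school, time)}| / |{t | teaches(t, school, time)}|",
)

# A consumer compares fingerprints instead of trusting the name alone.
meaning_changed = v1.name == v2.name and v1.fingerprint() != v2.fingerprint()
ratio = student_teacher_ratio(["s1", "s2", "s3", "s4", "s5", "s6"], [1.0, 0.5])
print(meaning_changed, ratio)
```

A hash only detects that the definition text changed. Deciding whether the change matters for a given model still needs a person or a richer definition language.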
Dietterich #MLSEV 17
• No: Deep learning applications still
require careful data preparation
• image normalization, contrast
enhancement, etc.
• Yes: Deep learning can learn
powerful intermediate
representations
• <2012: Manually-designed SIFT
and HoG features for images
combined with support vector
machines or random forests
• >2012: Deep learning produces
much better results
Does Deep Learning Automate
Feature Engineering? Yes and No
[Figure: ImageNet 1000 Classes. Top-5 classification error (%), y-axis 0 to 30, by year 2010-2014; series labeled Before and After.]
Dietterich #MLSEV 18
Challenge #3: Explanation and Interpretability
Dietterich #MLSEV 19
• 1980: Quinlan wanted interpretability because he
expected people to memorize the learned
decision tree
• In practice, we needed to check whether the
learning algorithm got the right answer
• Today: Our highest-performing models (random
forests, boosted trees, deep neural networks) are
not interpretable
• Interpretability and explanation are “hot topics”
in ML research
Interpretability and
Explanation
Dietterich #MLSEV 20
• Claim: Explanations should help the user perform some
task
• BigML has worked hard on visualization tools to provide
interpretability
• At Oregon State, we are developing explanation tools
for reinforcement learning
Explanation and
Interpretability
ML System | User | Task
Predictive Model | ML Engineer | Find errors and holes in data
Recommendation System | End User | Decide whether to follow the recommendation
Predictive Model, RL Model | ML Engineer | Acceptance Testing: Decide whether delivered system is sufficiently accurate
Dietterich #MLSEV 21
Challenge #4: Uncertainty Quantification
Dietterich #MLSEV 22
• 1980: This issue was totally ignored
• Today: Giving calibrated uncertainty estimates is
important
• Calibrated Probabilities:
• When the classifier says “X belongs to class C
with probability 0.94”, then it is correct 94% of the
time
• This is measured using a separate labeled
“calibration set”
• Can use “out of bag” training data in random
forests
Uncertainty Quantification
Dietterich #MLSEV 23
• Some classifiers are typically well-calibrated without post-processing
• Decision Trees
• Random Forests
• Others must be post-processed to achieve good
calibration
• Boosted Trees
• Support Vector Machines
• Deep Neural Networks
Calibration
Dietterich #MLSEV 24
• Sort the predicted probabilities
into bins 0.0-0.1, 0.1-0.2, etc.
• For each bin, measure the
average accuracy on the
calibration data
• Plot the accuracy for each bin
• should lie on the diagonal if
well-calibrated
• Example shows that Naïve
Bayes is generally very
optimistic
Measuring Calibration via a
Reliability Diagram
[Figure: Reliability Diagram (Naïve Bayes; ADULT). Zadrozny & Elkan, 2002]
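The binning procedure on this slide can be sketched as follows. The function name and the eight calibration examples are hypothetical. For a binary problem, the per-bin "accuracy" is computed here as the observed fraction of positives.

```python
def reliability_bins(probs, labels, n_bins=10):
    """Per non-empty bin: (low, high, mean predicted prob, observed frequency, count)."""
    sums = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of probs, positives, count]
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[b][0] += p
        sums[b][1] += y
        sums[b][2] += 1
    rows = []
    for b, (p_sum, pos, n) in enumerate(sums):
        if n:
            rows.append((b / n_bins, (b + 1) / n_bins, p_sum / n, pos / n, n))
    return rows

# Hypothetical calibration set: predicted probability of the positive class, true label.
probs = [0.05, 0.15, 0.15, 0.85, 0.95, 0.95, 0.95, 0.95]
labels = [0, 0, 1, 1, 1, 1, 0, 1]

rows = reliability_bins(probs, labels)
for low, high, mean_pred, observed, n in rows:
    print(f"{low:.1f}-{high:.1f}: predicted {mean_pred:.2f}, observed {observed:.2f} (n={n})")
```

Plotting mean predicted probability against observed frequency for each row gives the reliability diagram. Points below the diagonal indicate overconfident predictions.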
Dietterich #MLSEV 25
• Fit a function to the reliability
diagram
• Often a sigmoid (logistic
regression) function works
well
• Use this to convert the
predicted values (on X axis) to
calibrated values (Y axis)
• Similar techniques can
calibrate Naïve Bayes,
Deep Nets, Boosted Trees,
etc.
Fitting a recalibration
function
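A minimal sketch of sigmoid recalibration, fitted by plain gradient descent on a held-out calibration set. It assumes the classifier's raw probability is first converted to a log-odds score. The data, learning rate, and step count are hypothetical choices made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def fit_sigmoid(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def log_loss(probs, labels):
    eps = 1e-12
    return -sum(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
                for p, y in zip(probs, labels)) / len(labels)

# Hypothetical overconfident classifier: says 0.99 or 0.01 but is right 80% of the time.
raw_probs = [0.99] * 10 + [0.01] * 10
labels = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8

scores = [logit(p) for p in raw_probs]
a, b = fit_sigmoid(scores, labels)
calibrated = [sigmoid(a * s + b) for s in scores]
print(round(calibrated[0], 3), round(calibrated[-1], 3))
print(log_loss(raw_probs, labels), log_loss(calibrated, labels))
```

The fitted function pulls the extreme outputs toward the frequencies actually observed on the calibration set, which lowers the log loss without changing the ranking of the predictions.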
Dietterich #MLSEV 26
• Calibration compares predicted probability and expected accuracy
globally – across the entire calibration data set
• This may be misleading
• A classifier could achieve 95% accuracy and perfect calibration by classifying
95% of the data set perfectly and the remaining 5% completely incorrectly
• This 5% could be a specific customer segment
• Within that segment, the classifier is actually very poorly calibrated
because it outputs a confidence of 0.95 but is correct 0% of the time
• Lesson: Calibration should be done separately for each customer
segment or local group
• Decision trees calibrate separately for each leaf of the tree, so they
usually don’t exhibit this problem
• It is always important to look at model accuracy by customer
segments and other customer features (gender, race, region, age,
etc.)
• Example: Face recognition is less accurate on dark-skinned faces and on women, etc.
Local vs. Global Calibration
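The 95%/5% example on this slide can be checked with a few lines of arithmetic. The segment names and the data set size are hypothetical.

```python
def calibration_gap(confidences, correct):
    """Absolute gap between mean stated confidence and observed accuracy."""
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return abs(mean_conf - accuracy)

# Hypothetical data set of 1000 queries; segment "B" is 5% of it.
segments = ["A"] * 950 + ["B"] * 50
confidences = [0.95] * 1000      # the classifier always reports 0.95
correct = [1] * 950 + [0] * 50   # right on all of A, wrong on all of B

global_gap = calibration_gap(confidences, correct)

gap_by_segment = {}
for seg in sorted(set(segments)):
    idx = [i for i, s in enumerate(segments) if s == seg]
    gap_by_segment[seg] = calibration_gap(
        [confidences[i] for i in idx], [correct[i] for i in idx]
    )

print(global_gap, gap_by_segment)
```

The global gap is essentially zero, while the gap inside segment B is 0.95. That difference is why the slide recommends checking calibration per segment.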
Dietterich #MLSEV 27
Challenge #5: Run-Time Monitoring
Dietterich #MLSEV 28
• Predictive models are only guaranteed to be
accurate if run-time queries are drawn from the same
distribution as the training data
• Open Category Problem: Run-time data may involve
new classes
• New types of objects in computer vision
• New classes of items (books, restaurants) in
recommender systems
• New diseases in medical systems
• New types of fraud in supervised fraud detection
Why Monitor?
Dietterich #MLSEV 29
• Outlier Detection
• Detect whether a new query $x_q$ is an outlier
compared to the training data $x_1, \dots, x_N$
• Change Detection
• Detect whether the data distribution has changed
• Compare the $L$ most recent points $x_{t-L+1}, \dots, x_t$
to the $L$ points before them, $x_{t-2L+1}, \dots, x_{t-L}$. Do
they come from different distributions?
How to Monitor?
Dietterich #MLSEV 30
• Most anomaly detection (AD) papers only evaluate on a few datasets
• Often proprietary or very easy (e.g., KDD 1999)
• ML community needs a large and growing
collection of public anomaly benchmarks
Anomaly Detection
Benchmarking Study
[Emmott, Das, Dietterich, Fern, Wong, 2013; KDD ODD-2013]
[Emmott, Das, Dietterich, Fern, Wong. 2016; arXiv 1503.01158v2]
Dietterich #MLSEV 31
• Density-Based Approaches
• RKDE: Robust Kernel Density Estimation (Kim & Scott, 2008)
• EGMM: Ensemble Gaussian Mixture Model (our group)
• Quantile-Based Methods
• OCSVM: One-class SVM (Schoelkopf, et al., 1999)
• SVDD: Support Vector Data Description (Tax & Duin, 2004)
• Neighbor-Based Methods
• LOF: Local Outlier Factor (Breunig, et al., 2000)
• ABOD: kNN Angle-Based Outlier Detector (Kriegel, et al., 2008)
• Projection-Based Methods
• IFOR: Isolation Forest (Liu, et al., 2008)
• LODA: Lightweight Online Detector of Anomalies (Pevny, 2016)
Algorithms
Dietterich #MLSEV 32
Algorithm Comparison
[Figure: change in metric with respect to the control dataset (y-axis, 0 to 1.2) by algorithm (x-axis); series: logit(AUC) and log(LIFT)]
Based on this
study, BigML
implemented
Isolation Forest
Dietterich #MLSEV 33
• Only make a prediction if the query $x_q$ has a low anomaly score
• Liu, et al. 2018 showed how to set $\tau$ to guarantee detecting new category queries with high probability
Open Category Detection
[Figure: flow diagram. Training examples $(x_i, y_i)$ feed an anomaly detector $A$ and a classifier $f$. A query $x_q$ goes to the anomaly detector: if $A(x_q) > \tau$ (yes), reject; otherwise (no), output $y = f(x_q)$.]
[Liu, Garrepalli, Fern, Dietterich, ICML 2018]
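The reject-or-classify pipeline on this slide can be sketched as below. Everything specific is a stand-in chosen to keep the example self-contained: the anomaly score is the distance to the k-th nearest training point (not Isolation Forest), the classifier is nearest-centroid, and the threshold `tau` is hand-picked rather than set by the method of Liu et al.

```python
import math

def knn_anomaly_score(x, train_points, k=3):
    """Distance from x to its k-th nearest training point (larger = more anomalous)."""
    dists = sorted(math.dist(x, p) for p in train_points)
    return dists[k - 1]

def nearest_centroid(x, centroids):
    """Stand-in classifier f: label of the closest class centroid."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))

def predict_or_reject(x, train_points, centroids, tau, k=3):
    """Reject when the anomaly score exceeds tau; otherwise classify."""
    if knn_anomaly_score(x, train_points, k) > tau:
        return "reject"
    return nearest_centroid(x, centroids)

# Hypothetical training data: two known classes.
train = {
    "a": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "b": [(10, 10), (10, 11), (11, 10), (11, 11)],
}
train_points = [p for pts in train.values() for p in pts]
centroids = {
    label: tuple(sum(c) / len(pts) for c in zip(*pts))
    for label, pts in train.items()
}
tau = 3.0  # hand-picked for this toy data

results = [predict_or_reject(q, train_points, centroids, tau)
           for q in [(0.5, 0.4), (5, 5), (10.4, 10.6)]]
print(results)
```

The query halfway between the two clusters is far from all training data, so it is rejected instead of being forced into one of the known classes.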
Dietterich #MLSEV 34
• “Two Sample” test. $S_a \sim P_a$, $S_b \sim P_b$: is $P_a \neq P_b$?
• Method 1: Kernel two-sample test
• Method 2: Old-vs-New Classifier
• Train a classifier to distinguish between $S_a$ and $S_b$. Can it
do better than random guessing?
• At each time $t$, slide $S_a$ and $S_b$ one step forward in time
(requires online methods)
• An area of active research
Change Detection
[Figure: data stream $x_1, x_2, \dots, x_{100}, x_{101}, \dots, x_{200}$ divided into windows $S_a$ and $S_b$]
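A minimal sketch of the old-vs-new classifier idea (Method 2) for one pair of windows. The one-dimensional nearest-mean classifier, the window size of 200, and the Gaussian data are hypothetical simplifications. A real deployment would slide the windows forward in time and apply a significance test to the accuracy.

```python
import random

def old_vs_new_accuracy(s_a, s_b, seed=0):
    """Held-out accuracy of a 1-D nearest-mean classifier separating s_a from s_b."""
    rng = random.Random(seed)
    data = [(x, 0) for x in s_a] + [(x, 1) for x in s_b]
    rng.shuffle(data)
    half = len(data) // 2
    train, test = data[:half], data[half:]
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs)
    hits = sum(1 for x, y in test
               if min(means, key=lambda c: abs(x - means[c])) == y)
    return hits / len(test)

rng = random.Random(42)
old = [rng.gauss(0.0, 1.0) for _ in range(200)]      # window S_a
same = [rng.gauss(0.0, 1.0) for _ in range(200)]     # S_b, no change
shifted = [rng.gauss(3.0, 1.0) for _ in range(200)]  # S_b, mean shifted

acc_same = old_vs_new_accuracy(old, same)
acc_shifted = old_vs_new_accuracy(old, shifted)
print(acc_same, acc_shifted)
```

When the two windows come from the same distribution, the held-out accuracy stays near chance. A clear excess over 0.5 is evidence that the distribution has changed.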
Dietterich #MLSEV 35
Challenge #6: Evaluation
Dietterich #MLSEV 36
• Standard metrics for evaluating classifiers, such as F1
and AUC, were developed for machine learning research
• Most applications require separate metrics
• Example:
• Financial fraud
• Suppose we have 5 analysts and each analyst can
examine 10 cases per day
• Metric: Expected value of the top 50 alarms
(value@50).
• Incorporates the estimated value of each
candidate fraud alarm
Application-Specific Metrics
are Essential
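One plausible reading of value@k, sketched under stated assumptions: alarms are ranked by the model's expected value (fraud probability times amount at risk), and the metric is the amount actually recovered from the top k on a labeled evaluation set. The tuple layout, the ranking rule, and the sample alarms are hypothetical; the slide does not specify them.

```python
def value_at_k(alarms, k):
    """Realized value of the k alarms with the highest expected value.

    Each alarm is (fraud_probability, amount_at_risk, is_fraud).
    """
    ranked = sorted(alarms, key=lambda a: a[0] * a[1], reverse=True)
    return sum(amount for _, amount, is_fraud in ranked[:k] if is_fraud)

analysts, cases_per_analyst = 5, 10
k = analysts * cases_per_analyst  # 50 cases can be examined per day

# Hypothetical labeled alarms from one day.
alarms = [
    (0.9, 100.0, True),
    (0.8, 1000.0, False),
    (0.2, 5000.0, True),
    (0.5, 50.0, True),
]

value_top2 = value_at_k(alarms, 2)
value_top50 = value_at_k(alarms, k)
print(value_top2, value_top50)
```

Unlike F1 or AUC, this number is tied to the review capacity of the team and to the money at stake, so two models with the same AUC can score very differently.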
Dietterich #MLSEV 37
• Open Category Detection:
• Detect 99% of all open category queries
• Metric: Precision at 99% recall
• Obstacle Detection for Self-Driving cars
• Detect 99.999% of all dangerous obstacles
• Metric: Precision at 99.999% recall
• Cancer Screening:
• Must trade off false alarms versus missed alarms
• Metric: Cost to patient (may vary from one patient to
another)
• AUC is a fairly good metric for this case
More Examples
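Precision at a required recall level can be computed by scanning score thresholds from the top, as in this sketch. The function name and the six scored examples are hypothetical. Ties in score are treated as one threshold.

```python
def precision_at_recall(scores, labels, target_recall):
    """Best precision over all score thresholds whose recall meets the target."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    total_pos = sum(labels)
    best = 0.0
    true_pos = 0
    for i, (score, y) in enumerate(ranked, start=1):
        true_pos += y
        # Only evaluate a threshold at the end of a run of tied scores.
        if i < len(ranked) and ranked[i][0] == score:
            continue
        if true_pos / total_pos >= target_recall:
            best = max(best, true_pos / i)
    return best

# Hypothetical detector scores (higher = more likely positive) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

p_at_99 = precision_at_recall(scores, labels, 0.99)
p_at_60 = precision_at_recall(scores, labels, 0.60)
print(p_at_99, p_at_60)
```

Demanding near-total recall forces the threshold down to the lowest-scored positive, so precision at that point measures how many false alarms the requirement costs.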
Dietterich #MLSEV 38
Summary
Dietterich #MLSEV 39
• Generalization
• Beyond iid: Causal transportability; Domain adaptation
• Feature Engineering
• Very important; Deep learning can discover useful
intermediate features in some cases
• Uncertainty Quantification
• Probability Calibration
• Run-time Monitoring
• Anomaly Detection; Change Point Detection
• Application-Specific Metrics
Frontiers of Machine
Learning and Applications
Dietterich 40

MLSEV Virtual. State of the Art in ML

  • 2. Dietterich #MLSEV 2 State of the Art in Machine Learning Tom Dietterich Chief Scientist, BigML, Inc
  • 3. 1980: First Machine Learning Workshop. Held at Carnegie-Mellon University. Organizers: Jaime Carbonell, Tom Mitchell, Ryszard Michalski. Attendees: ~30. Topics: exact learning; compression; supervised learning with noise-free labels.
  • 4. Outline: Six Challenges for ML. Generalization; Feature Engineering; Explanation and Uncertainty; Uncertainty Quantification; Run-Time Monitoring; Application-Specific Metrics.
  • 6. Decision Tree Method: ID3. Ross Quinlan introduced ID3, a decision tree learning algorithm. Goal: compress chess endgame tables into simple decision rules. Ken Thompson had reverse-enumerated the winning positions for certain chess endgames, producing a large table of (board position, outcome) pairs. ID3 was applied to compress these into a more understandable representation. Notes: no generalization, noise free; interpretability was important. [Figure: "Win in 10" (Breda, 2006); ID3]
  • 7. Today: Generalization is the Key. Generalization for iid data: assume training and run-time data are drawn from the same distribution; strong theoretical guarantees. Generalization across domains: causal transportability; domain-adversarial training.
  • 8. Causal Transportability (Pearl & Bareinboim, 2011). Example: predicting lung cancer. T: lung cancer. C: chest pain. A: patient is taking aspirin. K: patient is a smoker (not observed). S: the distribution of A may change between training and deployment (change of hospital). Goal: create a predictive model that does not depend on S. Such a model is guaranteed to generalize to the new hospital (assuming this causal model is correct).
  • 9. Graph Surgery Technique (Subbaswamy et al., 2018). Generate all models that can make $T$ independent of $S$. Evaluate each model on validation data. Keep the best model. It is guaranteed to transport across hospitals provided that the causal diagram is correct. The technique encourages thinking ahead about possible changes at deployment time.
  • 10. Domain Adversarial Training. Given: training data points from two or more domains $D_1$, $D_2$; $D_1$ points are labeled pairs $(x_i, y_i)$; $D_2$ points are unlabeled $x_i$. Training: for $D_1$ points, predict the correct label; for all points, predict the domain (1 vs. 2). Find weights that give accurate label predictions for $D_1$ and chance-level predictions for the domain.
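The training goal on this slide can be written as a saddle-point problem. The following is a simplified sketch in the spirit of Ganin et al. (JMLR 2016), not a transcription of their exact objective: the symbols $G_f$ (feature extractor), $G_y$ (label predictor), $G_d$ (domain classifier), the losses $L_y$, $L_d$, the domain indicator $d_i$, and the trade-off weight $\lambda$ are notation assumed here, and the normalization of the domain term is simplified.

```latex
\begin{aligned}
E(\theta_f, \theta_y, \theta_d)
  &= \frac{1}{|D_1|} \sum_{(x_i, y_i) \in D_1}
       L_y\big(G_y(G_f(x_i; \theta_f); \theta_y),\, y_i\big)
   \;-\; \lambda \, \frac{1}{|D_1 \cup D_2|} \sum_{x_i \in D_1 \cup D_2}
       L_d\big(G_d(G_f(x_i; \theta_f); \theta_d),\, d_i\big) \\
(\hat\theta_f, \hat\theta_y)
  &= \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat\theta_d) \\
\hat\theta_d
  &= \arg\max_{\theta_d} E(\hat\theta_f, \hat\theta_y, \theta_d)
\end{aligned}
```

The domain classifier parameters $\theta_d$ maximize $E$, which means they minimize the domain loss, while the shared features $\theta_f$ minimize $E$, which pushes the domain loss up. At the saddle point the features support accurate labels on $D_1$ while leaving the domain classifier near chance.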
  • 11. Domain Adversarial Training (Ganin, et al., JMLR 2016)
  • 13. Domain-Adversarial Training Weaknesses. The method assumes that the class label distributions are not changing. The method can be unstable; it works best if you have at least some labeled data for the target domain to help choose hyperparameters.
  • 14. Challenge #2: Feature Engineering
  • 15. Feature Engineering. In 1980, Quinlan carefully designed interpretable features with predictive power. This is still important today in most applications. Claim: features should include meta-data definitions, e.g. expressed in the BigML flatline language, SQL statements/procedures, or Trifacta rules. “Numbers should never travel alone across the internet” (Mark Fox). Example: $\text{Student\_Teacher\_Ratio}(\text{school}, \text{time}) = |\{s \mid \text{registered}(s, \text{school}, \text{time})\}| \,/\, \sum_{t} \text{FTE}(t, \text{school}, \text{time})$
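As a minimal sketch of a number that does not "travel alone", the snippet below attaches a definition and units to the Student_Teacher_Ratio example. The `FeatureValue` class, its field names, and the reading of the denominator as a sum of per-teacher full-time-equivalent (FTE) values are illustrative assumptions, not part of BigML, SQL, or Trifacta.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureValue:
    """A feature value bundled with the meta-data that defines it (hypothetical type)."""
    name: str
    value: float
    definition: str  # human-readable definition that travels with the number
    units: str


def student_teacher_ratio(num_registered, teacher_ftes, school, time):
    """Registered students divided by the summed teacher FTE (assumed denominator)."""
    total_fte = sum(teacher_ftes)
    if total_fte <= 0:
        raise ValueError("total teacher FTE must be positive")
    return FeatureValue(
        name="Student_Teacher_Ratio",
        value=num_registered / total_fte,
        definition=(
            f"count of students registered at {school} during {time} "
            f"divided by the sum of teacher FTE at {school} during {time}"
        ),
        units="students per full-time-equivalent teacher",
    )


ratio = student_teacher_ratio(300, [1.0, 0.5, 1.0], "Example School", "2020")
print(ratio.value, "-", ratio.definition)
```

A consumer can compare the `definition` string (or a hash of it) across data deliveries to notice that the meaning of a feature changed even though its name did not.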
  • 16. Importance of Feature Meta-Data. Meta-data allows data consumers to detect when the meaning of a feature has changed even when the feature name has not changed. This is important for detecting data errors and debugging classifier failures.
  • 17. Does Deep Learning Automate Feature Engineering? Yes and No. No: deep learning applications still require careful data preparation (image normalization, contrast enhancement, etc.). Yes: deep learning can learn powerful intermediate representations. Before 2012: manually-designed SIFT and HoG features for images combined with support vector machines or random forests. After 2012: deep learning produces much better results. [Figure: ImageNet (1000 classes) top-5 classification error (%) by year, 2010 to 2014, before vs. after]
  • 19. Interpretability and Explanation. 1980: Quinlan wanted interpretability because he expected people to memorize the learned decision tree. In practice, we needed to check whether the learning algorithm got the right answer. Today: our highest-performing models (random forests, boosted trees, deep neural networks) are not interpretable. Interpretability and explanation are “hot topics” in ML research.
  • 20. Explanation and Interpretability. Claim: explanations should help the user perform some task. BigML has worked hard on visualization tools to provide interpretability. At Oregon State, we are developing explanation tools for reinforcement learning. Table (ML system / user / task): Predictive Model / ML Engineer / find errors and holes in data. Recommendation System / End User / decide whether to follow the recommendation. Predictive Model, RL Model / ML Engineer / acceptance testing: decide whether the delivered system is sufficiently accurate.
  • 22. Uncertainty Quantification. 1980: this issue was totally ignored. Today: giving calibrated uncertainty estimates is important. Calibrated probabilities: when the classifier says “X belongs to class C with probability 0.94”, it is correct 94% of the time. This is measured using a separate labeled “calibration set”. Random forests can use the “out of bag” training data for this.
  • 23. Calibration. Some classifiers are always well-calibrated: decision trees, random forests. Others must be post-processed to achieve good calibration: boosted trees, support vector machines, deep neural networks.
  • 24. Measuring Calibration via a Reliability Diagram. Sort the predicted probabilities into bins 0.0-0.1, 0.1-0.2, etc. For each bin, measure the average accuracy on the calibration data. Plot the accuracy for each bin; the points should lie on the diagonal if the classifier is well-calibrated. The example shows that Naïve Bayes is generally very optimistic. [Figure: reliability diagram (Naïve Bayes; ADULT), Zadrozny & Elkan, 2002]
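The binning procedure described above can be sketched in a few lines of plain Python. This is an illustrative version for a binary problem, where "accuracy" in a bin is taken to be the observed fraction of positive labels; the function name `reliability_bins` and the toy data are assumptions made for the example.

```python
def reliability_bins(probs, labels, n_bins=10):
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed positive fraction, count)
    tuple per non-empty bin, ordered from the lowest bin to the highest.
    """
    sums = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        sums[b] += p
        positives[b] += y
        counts[b] += 1
    return [
        (sums[b] / counts[b], positives[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


# Toy calibration set: the 0.95 predictions are right only 3 times out of 4.
rows = reliability_bins([0.05, 0.05, 0.95, 0.95, 0.95, 0.95], [0, 0, 1, 1, 1, 0])
for mean_p, frac_pos, count in rows:
    print(f"predicted {mean_p:.2f}  observed {frac_pos:.2f}  n={count}")
```

Plotting the first tuple element against the second gives the reliability diagram; points below the diagonal indicate over-confident predictions.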
  • 25. Fitting a recalibration function. Fit a function to the reliability diagram; often a sigmoid (logistic regression) function works well. Use this function to convert the predicted values (on the X axis) to calibrated values (on the Y axis). Similar techniques can calibrate Naïve Bayes, deep nets, boosted trees, etc.
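One common way to fit a sigmoid recalibration function is logistic regression on the raw scores, often called Platt scaling. The sketch below fits it to individual calibration examples (rather than to the binned diagram) with plain gradient descent; the learning rate, step count, and toy data are assumptions chosen for illustration.

```python
import math


def fit_sigmoid(scores, labels, lr=1.0, steps=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the mean log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def recalibrate(score, a, b):
    """Map a raw score to a calibrated probability using the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Over-confident toy classifier: it says 0.99 but is right 70% of the time,
# and says 0.01 when the positive rate is actually 30%.
scores = [0.99] * 10 + [0.01] * 10
labels = [1] * 7 + [0] * 3 + [1] * 3 + [0] * 7
a, b = fit_sigmoid(scores, labels)
print(round(recalibrate(0.99, a, b), 2), round(recalibrate(0.01, a, b), 2))
```

In practice the sigmoid would be fit on a held-out calibration set, not on the data used to train the classifier.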
  • 26. Local vs. Global Calibration. Calibration compares predicted probability and expected accuracy globally, across the entire calibration data set. This may be misleading. A classifier could achieve 95% accuracy and perfect calibration by classifying 95% of the data set perfectly and the remaining 5% completely incorrectly. This 5% could be a specific customer segment. Within that segment, the classifier is actually very poorly calibrated because it outputs a confidence of 0.95 but is correct 0% of the time. Lesson: calibration should be done separately for each customer segment or local group. Decision trees calibrate separately for each leaf of the tree, so they usually don’t exhibit this problem. It is always important to look at model accuracy by customer segments and other customer features (gender, race, region, age, etc.). Example: face recognition is less accurate on dark skin and on women, etc.
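The 95% / 5% scenario above is easy to reproduce. The sketch below builds a toy data set in which every prediction carries confidence 0.95, and then breaks accuracy down by segment; the segment names and record layout are made up for the illustration.

```python
from collections import defaultdict


def accuracy_by_segment(records):
    """records: iterable of (segment, predicted_label, true_label) tuples."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for segment, predicted, actual in records:
        totals[segment] += 1
        hits[segment] += int(predicted == actual)
    return {segment: hits[segment] / totals[segment] for segment in totals}


# 95 cases from one segment are all classified correctly;
# 5 cases from another segment are all classified incorrectly.
records = [("majority", 1, 1)] * 95 + [("minority", 1, 0)] * 5

overall = sum(p == a for _, p, a in records) / len(records)
per_segment = accuracy_by_segment(records)
print(overall)      # 0.95
print(per_segment)  # {'majority': 1.0, 'minority': 0.0}
```

Globally, a stated confidence of 0.95 matches the 0.95 accuracy, yet within the "minority" segment the same confidence corresponds to 0% accuracy.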
  • 28. Why Monitor? Predictive models are only guaranteed to be accurate if run-time queries are drawn from the same distribution as the training data. Open Category Problem: run-time data may involve new classes, such as new types of objects in computer vision, new classes of items (books, restaurants) in recommender systems, new diseases in medical systems, and new types of fraud in supervised fraud detection.
  • 29. How to Monitor? Outlier detection: detect whether a new query $x_q$ is an outlier compared to the training data $x_1, \ldots, x_N$. Change detection: detect whether the data distribution has changed. Compare the $L$ most recent points $x_{t-L+1}, \ldots, x_t$ to the $L$ points before them, $x_{t-2L+1}, \ldots, x_{t-L}$. Do they come from different distributions?
  • 30. Anomaly Detection Benchmarking Study. Most anomaly detection (AD) papers only evaluate on a few datasets, which are often proprietary or very easy (e.g., KDD 1999). The ML community needs a large and growing collection of public anomaly benchmarks. [Emmott, Das, Dietterich, Fern, Wong, 2013; KDD ODD-2013] [Emmott, Das, Dietterich, Fern, Wong, 2016; arXiv 1503.01158v2]
  • 31. Algorithms. Density-based approaches: RKDE, Robust Kernel Density Estimation (Kim & Scott, 2008); EGMM, Ensemble Gaussian Mixture Model (our group). Quantile-based methods: OCSVM, One-class SVM (Schoelkopf, et al., 1999); SVDD, Support Vector Data Description (Tax & Duin, 2004). Neighbor-based methods: LOF, Local Outlier Factor (Breunig, et al., 2000); ABOD, kNN Angle-Based Outlier Detector (Kriegel, et al., 2008). Projection-based methods: IFOR, Isolation Forest (Liu, et al., 2008); LODA, Lightweight Online Detector of Anomalies (Pevny, 2016).
  • 32. Algorithm Comparison. [Figure: change in metric with respect to the control dataset, per algorithm, for logit(AUC) and log(LIFT)] Based on this study, BigML implemented Isolation Forest.
  • 33. Open Category Detection. Only make a prediction if the query $x_q$ has a low anomaly score. Liu, et al. 2018 showed how to set $\tau$ to guarantee detecting new category queries with high probability. [Diagram: the classifier $f$ is trained on the training examples $(x_i, y_i)$; the anomaly detector computes $A(x_q)$ for the query $x_q$; if $A(x_q) > \tau$, reject; otherwise output $y = f(x_q)$] [Liu, Garrepalli, Fern, Dietterich, ICML 2018]
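The reject-or-predict pipeline on this slide can be sketched as follows. The anomaly score (distance to the nearest training example) and the classifier (label of that nearest example) are deliberately simple stand-ins, and the threshold `tau` is hand-picked for the toy data; this does not implement the threshold-setting procedure of Liu et al. (2018).

```python
import math


def nearest(train, x):
    """Return (distance, label) of the training example closest to x."""
    return min((math.dist(xi, x), yi) for xi, yi in train)


def predict_or_reject(train, x_q, tau):
    """Return a label only when the anomaly score A(x_q) is at most tau.

    A(x_q) is the distance to the nearest training point (an assumed,
    simplistic anomaly detector). None means the query was rejected.
    """
    score, label = nearest(train, x_q)
    if score > tau:
        return None  # reject: the query looks unlike anything seen in training
    return label     # otherwise y = f(x_q), here a 1-nearest-neighbor classifier


train = [((0.0, 0.0), "cat"), ((1.0, 1.0), "dog")]
print(predict_or_reject(train, (0.1, 0.0), tau=0.5))    # close to a known example
print(predict_or_reject(train, (10.0, 10.0), tau=0.5))  # far from all training data
```

Swapping in a stronger detector such as Isolation Forest changes only how the score is computed; the control flow stays the same.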
  • 34. Change Detection. “Two sample” test: given $S_a \sim P_a$ and $S_b \sim P_b$, is $P_a \neq P_b$? Method 1: kernel two-sample test. Method 2: old-vs-new classifier. Train a classifier to distinguish between $S_a$ and $S_b$. Can it do better than random guessing? At each time $t$, slide $S_a$ and $S_b$ one step forward in time (requires online methods). This is an area of active research. [Figure: windows $S_a$ and $S_b$ over the sequence $x_1, x_2, \ldots, x_{100}, x_{101}, \ldots, x_{200}$]
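As one possible instance of a kernel two-sample test, the sketch below computes a (biased) squared maximum mean discrepancy with a Gaussian kernel on 1-D data and estimates a p-value by permutation. The bandwidth, number of permutations, window size, and synthetic data are all assumptions made for the illustration; it is a batch test and does not address the online sliding-window setting.

```python
import math
import random


def rbf(x, y, bandwidth):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd2(sa, sb, bandwidth=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    def mean_k(u, v):
        return sum(rbf(a, b, bandwidth) for a in u for b in v) / (len(u) * len(v))
    return mean_k(sa, sa) + mean_k(sb, sb) - 2.0 * mean_k(sa, sb)


def permutation_p_value(sa, sb, n_perm=200, bandwidth=1.0, seed=0):
    """Fraction of random re-splits whose MMD is at least the observed one."""
    rng = random.Random(seed)
    observed = mmd2(sa, sb, bandwidth)
    pooled = list(sa) + list(sb)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mmd2(pooled[:len(sa)], pooled[len(sa):], bandwidth) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


data_rng = random.Random(1)
old_window = [data_rng.gauss(0.0, 1.0) for _ in range(30)]  # plays the role of S_a
new_window = [data_rng.gauss(3.0, 1.0) for _ in range(30)]  # S_b, mean shifted by 3
p = permutation_p_value(old_window, new_window)
print("p-value:", p)  # a small value suggests the distribution has changed
```

The old-vs-new classifier method replaces `mmd2` with the held-out accuracy of a classifier trained to separate the two windows; the permutation logic can stay the same.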
  • 36. Application-Specific Metrics are Essential. Standard metrics for evaluating classifiers, such as F1 and AUC, were developed for machine learning research. Most applications require separate metrics. Example: financial fraud. Suppose we have 5 analysts and each analyst can examine 10 cases per day. Metric: expected value of the top 50 alarms (value@50), which incorporates the estimated value of each candidate fraud alarm.
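The slide does not spell out how value@50 is computed, so the following is one plausible reading, offered as an assumption: rank the cases by the model's fraud score, keep as many as the analysts can examine, and add up the value of the cases that turn out to be fraud. The function name `value_at_k` and the record layout are hypothetical.

```python
def value_at_k(cases, k):
    """Sum the value of true frauds among the k highest-scored cases.

    cases: list of (model_score, is_fraud, value) tuples.
    """
    ranked = sorted(cases, key=lambda case: case[0], reverse=True)
    return sum(value for _, is_fraud, value in ranked[:k] if is_fraud)


analysts, cases_per_analyst = 5, 10
capacity = analysts * cases_per_analyst  # 50 alarms can be examined per day

cases = [
    (0.9, True, 100.0),
    (0.8, False, 500.0),
    (0.7, True, 50.0),
    (0.1, True, 1000.0),  # high-value fraud that the model ranks last
]
print(value_at_k(cases, 2))         # 100.0
print(value_at_k(cases, capacity))  # 1150.0, since all four cases fit in the budget
```

Unlike AUC, this metric rewards a model for placing high-value frauds inside the daily review budget and ignores the ordering of everything below the cutoff.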
  • 37. More Examples. Open category detection: detect 99% of all open category queries; metric: precision at 99% recall. Obstacle detection for self-driving cars: detect 99.999% of all dangerous obstacles; metric: precision at 99.999% recall. Cancer screening: must trade off false alarms versus missed alarms; metric: cost to patient (may vary from one patient to another); AUC is a fairly good metric for this case.
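Precision at a fixed recall level can be computed by walking down the ranked list until enough positives have been found. The sketch below is a minimal version that assumes binary labels (1 for the class to detect) and ignores tied scores; the function name and toy data are illustrative.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the first rank cutoff whose recall reaches target_recall."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    for n_flagged, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives / total_positives >= target_recall:
            return true_positives / n_flagged
    return true_positives / len(ranked)  # not reached when target_recall <= 1


scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 1]
# Reaching 99% recall requires flagging all five items to catch the last positive.
print(precision_at_recall(scores, labels, 0.99))  # 0.6
```

At very high recall targets such as 99.999%, the estimate depends on a handful of the lowest-scored positives, so a large labeled evaluation set is needed for it to be meaningful.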
  • 39. Frontiers of Machine Learning and Applications. Generalization (beyond iid: causal transportability; domain adaptation). Feature Engineering (very important; deep learning can discover useful intermediate features in some cases). Uncertainty Quantification (probability calibration). Run-time Monitoring (anomaly detection; change point detection). Application-Specific Metrics.