DEEP VS. DIVERSE
ARCHITECTURES
By Colleen M. Farrelly
SCOPE OF PROBLEM
•The No Free Lunch Theorem suggests that no individual machine learning model
will perform best across all types of data and datasets.
• Social science/behavioral datasets present a particular challenge, as data often contains main
effects and interaction effects, which can be linear or nonlinear with respect to an outcome of
interest.
• In addition, social science datasets often contain outliers and group overlap among
classification outcomes, where someone may have all the risk factors for dropping out or drug
use but does not exhibit the predicted behavior.
•Several machine learning frameworks have nice theoretical properties, including
convergence theorems and universal approximation guarantees, that may be
particularly adept at modeling social science outcomes.
• Superlearners and subsembles have been proven to improve ensemble performance to a level
at least as good as the best model in the ensemble.
• Neural networks with one hidden layer have universal approximation properties, which
guarantee that random mappings to a wide enough layer will come arbitrarily close to a desired
error level for any given function.
• One caveat to this universal approximation is the size needed to obtain these guarantees may be larger than is practical or
possible in a model.
• Deep learning attempts to rectify this limitation by adding additional layers to the neural network, where each layer
reduces model error beyond the previous layers’ capabilities.
NEURAL NETWORK GENERAL
OVERVIEW
Image sources: colah.github.io, www.alz.org
•A neural network is a model
based on processing
complex, nonlinear
information the way the
human brain does via a series
of feature mappings.
Arrows denote mapping
functions, which take
one topological space
to another
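The mapping view above can be made concrete with a minimal forward pass, sketched here in NumPy purely for illustration (the layer sizes and activations are arbitrary choices, not the networks used later in the deck):

```python
import numpy as np

rng = np.random.default_rng(5)

# One input vector passed through two feature mappings (layers):
# each arrow in the diagram is an affine map followed by a nonlinearity.
x = rng.normal(size=4)

W1, b1 = rng.normal(size=(4, 6)), rng.normal(size=6)
W2, b2 = rng.normal(size=(6, 1)), rng.normal(size=1)

h = np.tanh(x @ W1 + b1)                 # first mapping: R^4 -> R^6
out = 1 / (1 + np.exp(-(h @ W2 + b2)))   # second mapping: R^6 -> (0, 1)
print(out)
```

Each mapping takes one space to another; training a conventional network means iteratively adjusting the `W` and `b` parameters.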
EXTREME LEARNING MACHINES
AND UNIVERSAL APPROXIMATION
•These are a type of shallow, wide neural network.
•This formulation reduces the network to a penalized linear algebra
problem rather than iterative training, making it much faster to solve.
•Because it is based on random mappings, it can be shown to converge to
correct classification/regression via the Universal Approximation Theorem
(likely a result of adequate coverage of the underlying data manifold).
•However, the width of network required to reach an arbitrary error level
may be computationally infeasible.
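A minimal sketch of the extreme learning machine idea, in NumPy for illustration: the hidden layer is random and fixed, and only the output weights are fit, via a single ridge-penalized linear-algebra solve. The width, penalty `lam`, and toy data here are illustrative assumptions, not values from the deck:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: 200 samples, 3 features.
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=200)

# Random, untrained hidden layer: project inputs to a wide feature space.
width = 500
W = rng.normal(size=(3, width))
b = rng.normal(size=width)
H = np.tanh(X @ W + b)          # hidden activations (fixed, never trained)

# Only the output weights are fit, via ridge-penalized least squares --
# a single linear-algebra solve instead of iterative backpropagation.
lam = 1e-2
beta = np.linalg.solve(H.T @ H + lam * np.eye(width), H.T @ y)

pred = H @ beta
print(round(float(np.mean((pred - y) ** 2)), 4))  # training MSE
```

The width needed for a given error level is exactly the practical limitation discussed above: the solve is fast, but `width` may have to grow impractically large.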
DEEP LEARNING
•Deep learning attempts to solve the wide
layer problem by adding depth layers in
neural networks, which can be more
effective and computationally feasible than
extreme learning machines for some
problems.
• This framework is like sifting data with multiple
sifters to distill finer and finer pieces of the data.
•These are computationally intensive and
require architecture design and tuning for
each problem.
• Feed-forward networks are particularly popular, as
they can be easily built, tuned, and trained.
• Feed-forward networks also have relations to the
Universal Approximation Theorem, providing a
means to exploit these results without requiring
prohibitively wide single layers.
SUPERLEARNERS
•This model is a weighted aggregation
of multiple types of models.
• This is analogous to a small town election.
• Different people have different views of the
politics and care about different issues.
•Different modeling methods capture
different pieces of the data variance
and vote accordingly.
• This leverages algorithm strengths while
minimizing weaknesses for each model (kind
of like an extension of bagging to multi-
algorithm ensembles).
• Diversity allows the full ensemble to better
explore the geometry underlying the data.
•This combines multiple models while
avoiding multiple testing issues.
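The voting idea can be sketched as a tiny two-model ensemble in NumPy. This is an illustrative simplification: a real superlearner (e.g., the R SuperLearner package) uses cross-validation and constrained optimization to choose the weights, whereas here the weights are fit on a single held-out fold and then clipped to be nonnegative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data with one linear and one nonlinear effect.
X = rng.normal(size=(300, 2))
y = X[:, 0] + np.sin(3 * X[:, 1]) + 0.2 * rng.normal(size=300)

train, val = slice(0, 200), slice(200, 300)

# Base learner 1: ordinary least squares (captures the linear part).
A = np.c_[np.ones(300), X]
w_ols = np.linalg.lstsq(A[train], y[train], rcond=None)[0]
p1 = A @ w_ols

# Base learner 2: k-nearest-neighbor regression (captures the nonlinear part).
def knn_predict(Xq, Xtr, ytr, k=5):
    d = ((Xq[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d, axis=1)[:, :k]
    return ytr[idx].mean(axis=1)

p2 = knn_predict(X, X[train], y[train])

# Meta-step: weight the base models' "votes" on a held-out fold.
P_val = np.c_[p1[val], p2[val]]
w = np.linalg.lstsq(P_val, y[val], rcond=None)[0]
w = np.clip(w, 0, None)
w = w / w.sum()

ensemble = np.c_[p1, p2] @ w
mse = lambda p: float(np.mean((p[val] - y[val]) ** 2))
print(w, mse(p1), mse(p2), mse(ensemble))
```

Each model captures a different piece of the data variance; the weights record how much each "voter" is trusted.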
THEORY AND PRACTICE
•Superlearners are a type of ensemble of machine learning models,
typically using a set of classifiers or regression models, including linear
models, tree models, and ensemble models like boosting or bagging.
• Superlearners also have some theoretical guarantees about convergence and least
upper bounds on model error relative to the algorithms within the superlearner framework.
• They also have the ability to rank variables by importance and provide model fits for
each component.
•Deep architectures can be designed as feed-forward data processing
networks, in which functional nodes through which data passes add
information to the dataset regarding optimal partitioning and variable
pairing.
• Recent attempts to create feed-forward deep networks employing random forest or
SVM functions at each mapping show promise as an alternative to the typical neural
network formulation of deep learning.
• It stands to reason that feed-forward deep networks based on other machine learning
algorithms or combinations of algorithms may enjoy some of these benefits of deep learning.
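A hedged sketch of such a feed-forward machine learning network, using ridge regression and k-NN as stand-ins for the random forest / MARS / boosting nodes described in the deck: each layer fits its models and appends their predictions to the feature set passed to the next layer. (Train rows serve as their own k-NN neighbors here; a careful pipeline would use out-of-fold predictions.)

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(float)

def ridge_fit_predict(Xtr, ytr, Xq, lam=1e-1):
    A = np.c_[np.ones(len(Xtr)), Xtr]
    Aq = np.c_[np.ones(len(Xq)), Xq]
    w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ ytr)
    return Aq @ w

def knn_fit_predict(Xtr, ytr, Xq, k=7):
    d = ((Xq[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return ytr[np.argsort(d, axis=1)[:, :k]].mean(axis=1)

train, test = slice(0, 200), slice(200, 300)

# Each "layer" fits simple learners on the current features and appends
# their predictions as new columns for the next layer to use.
F = X.copy()
for layer in [(ridge_fit_predict, knn_fit_predict), (ridge_fit_predict,)]:
    preds = [f(F[train], y[train], F) for f in layer]
    F = np.c_[F, np.column_stack(preds)]

final = F[:, -1]                       # last layer's single output node
acc = float(np.mean((final[test] > 0.5) == (y[test] > 0.5)))
print(acc)
```

The structure mirrors the mixed deep model above: earlier layers' outputs become inputs, so later layers can combine them with the raw predictors.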
EXPERIMENTAL SET-UP
•Algorithm frameworks tested:
1. Superlearner with random forest, random
ferns, KNN regression, MARS regression,
conditional inference trees, and boosted
regression.
2. Deep feed-forward machine learning model
(mixed deep model) with first hidden layer
of 2 random forest models, a conditional
inference tree model, and a random ferns
model; with second hidden layer of MARS
regression and conditional inference trees;
and a third hidden layer of boosted
regression.
3. Optimally tuned deep feed-forward neural
network model (13-5-3-1 configuration).
4. Deep feed-forward neural network model
with the same hidden layer structure as the
mixed deep model (Model 2).
5. KNN models, including a k=5 regression
model and a deep k=5 model with a
10-10-5 hidden layer configuration.
•Simulation design:
1. Outcome as yes/no for simplicity of
design (logistic regression problem)
2. 4 true predictors, 9 noise predictors
3. Predictor relationships
1. Purely linear terms (ideal neural network set-
up)
2. Purely nonlinear terms (ideal machine
learning set-up)
3. Mix of linear and nonlinear terms (more likely
in real-world data)
4. Gaussian noise level
1. Low
2. High (more likely in real-world data)
5. Addition of outliers (fraction ~5-10%)
to high noise conditions (mimic
group overlap)
6. Sample sizes of 500, 1000, 2500,
5000, 10000 to test convergence
properties for each condition and
algorithm
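The simulation design above might be generated roughly as follows (NumPy sketch; the coefficients, noise scale, and 7% flip rate are illustrative assumptions, not the exact values used in the study):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# 4 true predictors and 9 pure-noise predictors, as in the design above.
X = rng.normal(size=(n, 13))

# Mixed condition: two linear terms plus two nonlinear terms.
signal = 1.0 * X[:, 0] - 0.8 * X[:, 1] + np.sin(2 * X[:, 2]) + X[:, 3] ** 2

noise = rng.normal(scale=2.0, size=n)      # "high noise" condition

# Outlier fraction ~5-10%: flip the signal for a random subset to
# mimic group overlap (risk factors present, behavior absent).
flip = rng.random(n) < 0.07
signal[flip] *= -1

# Yes/no outcome via a logistic link.
p = 1 / (1 + np.exp(-(signal + noise)))
y = (rng.random(n) < p).astype(int)
print(y.mean())
```

Varying `n` over 500 to 10,000 and toggling the nonlinear terms, noise scale, and flip fraction reproduces the grid of conditions listed above.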
LINEAR RESULTS
•Deep neural networks show strong performance
(linear relationship models show universal
approximation convergence at low sample sizes with
low noise).
•Superlearners seem to perform better than deep
models for machine learning ensembles.
•Deep architectures enhance the performance of KNN
models, particularly at low sample sizes, but
superlearners win out.
NONLINEAR RESULTS
•Superlearners dominate performance accuracy at
smaller sample sizes, and machine learning deep
models are competitive at these sample sizes.
•Tuned deep neural networks catch up to this
performance at large sample sizes, particularly with
noise and no outliers.
•Superlearner architectures show performance gains in
KNN regression models across all conditions.
MIXED RESULTS
•Superlearners retain their competitive advantage up until
very large sample sizes, suggesting that deep neural
networks struggle with a mix of linear and nonlinear terms
in a classification/regression model.
•Machine-learning-based deep architectures are
competitive at small sample sizes compared to deep
neural networks when no outliers are present.
•KNN superlearners retain a large advantage, particularly at
low noise with few outliers.
PREDICTING BAR PASSAGE
•Data includes 188 Concord Law
students for whom BAR data exists.
•22 predictors, including admissions
factors and law school grades,
used.
•Mixed deep model, superlearner
model, and tuned deep neural
network model were compared to
assess performance on real-world
data exhibiting linear and nonlinear
relationships with noise and group
overlap.
•70% of data was used to train, with
30% held out as a test set to assess
accuracy.

Algorithm                       Accuracy
Deep Machine Learning Network   84.2%
Superlearner Model              100.0%
Tuned Deep Neural Network       68.4%
•Deep neural networks struggle with
the small sample size; using
machine learning map functions
dramatically improves accuracy.
• Sample size requirements for
convergence are a noted limitation of
neural networks in general.
• Previous results suggest performance
depends on choice of hidden layer
activation functions (maps).
•Superlearner yields perfect
prediction on the held-out test set.
PREDICTING RETENTION BY
ADVISING
•Data includes 27666 students in 2016
and retention/graduation status at the
end of each term.
•10 predictors—academic,
demographic, and advising factors—
were used.
•Mixed deep model, superlearner
model, and tuned deep neural network
model were compared to assess
performance on real-world data
exhibiting linear and nonlinear
relationships with noise and group
overlap.
•70% of data was used to train, with 30%
held out as a test set to assess
accuracy.
Algorithm                       Accuracy
Deep Machine Learning Network   73.2%
Superlearner Model              74.1%
Tuned Deep Neural Network       74.4%
•Deep neural networks and deep
machine learning models seem to
provide a good processing sequence
to improve model fits iteratively.
• Examining the deep machine learning
model, we see that later layers do weight
prior models as fairly important
predictors, and we see evidence that
these previous layer predictions combine
with other factors in the dataset in these
later layers.
• This suggests that a deep approach can iteratively refine model fits.
PREDICTING ADMISSIONS
•Data involved 905,612 leads from
2016 and various admission
factors.
• Because of low enrollment counts
(~24000), stratified sampling was
used to enrich the training set for all
models.
• Training set contained ~20% of
observations, with ~10% of those
being enrolled students.
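The enrichment step can be sketched as simple stratified subsampling (NumPy; the ~2.6% positive rate is simulated to mimic the ~24,000 enrollments out of 905,612 leads):

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated leads: ~2.6% positives, like the enrollment data above.
n = 100_000
y = (rng.random(n) < 0.026).astype(int)

pos = np.flatnonzero(y == 1)
neg = np.flatnonzero(y == 0)

# Build a training set of ~20% of observations with ~10% positives.
n_train = n // 5
n_pos = n_train // 10
take_pos = rng.choice(pos, size=min(n_pos, len(pos)), replace=False)
take_neg = rng.choice(neg, size=n_train - len(take_pos), replace=False)
train_idx = np.concatenate([take_pos, take_neg])

print(len(train_idx), y[train_idx].mean())
```

Oversampling the rare class this way gives every model enough positive examples to learn from without training on all 900k+ rows.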
•Superlearner/deep models give
very similar model fit specs
(accuracy, AUC, FNR, FPR), and
some individual models (MARS,
random forest, boosted
regression, conditional trees) gave
very good model fit, as well.
•This suggests convergence of most models tested.
•Runtime analysis shows the advantage of
some models over others, with conditional
trees/MARS models showing low runtimes.
•Deep NNs have a runtime advantage over deep ML
models and superlearners, mostly as a result
of the random forest runtimes.
•A tree/MARS superlearner gave similar
performance in a shorter amount of time than
the deep NN (~2 minutes).
Algorithm                       Accuracy  AUC   FNR   FPR   Time (Minutes)
Deep Machine Learning Network   98.0%     0.95  0.08  0.02  22
Superlearner Model              98.2%     0.96  0.08  0.01  15
Fast Superlearner Model         98.0%     0.95  0.08  0.02  2
Tuned Deep Neural Network       98.0%     0.95  0.08  0.02  8
CONCLUSIONS
•Deep architectures can provide gain above individual models, particularly at
lower sample sizes, suggesting deep feed-forward approaches are
efficacious at improving predictive capabilities.
• This suggests that deep architectures can improve individual models that work well on a
particular problem.
• However, there is evidence that the topology of mappings between layers that use these more
complex machine learning functions can detract from predictive capability and the universal
approximation property.
•Deep architectures with a variety of algorithms in each layer provide gains
above individual models and achieve good performance at low sample sizes
under real-world conditions.
•However, superlearners provide more robust models with no architecture
design or tuning needed; with group overlap and/or a combination of linear
and nonlinear relationships, they are the best models to use, even at sample
sizes where deep architecture begins to converge.
• Superlearners yield interpretable models and, hence, insight into important relationships
between predictors and an outcome.
SELECTED REFERENCES
• Aliper, A., Plis, S., Artemov, A., Ulloa, A., Mamoshina, P., & Zhavoronkov, A. (2016). Deep learning applications for predicting
pharmacological properties of drugs and drug repurposing using transcriptomic data. Molecular pharmaceutics, 13(7), 2524-2530.
• Altman, N. S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3),
175-185.
• Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
• Dekker, G., Pechenizkiy, M., & Vleeshouwers, J. (2009, July). Predicting students drop out: A case study. In Educational Data Mining
2009.
• Devroye, L. (1978). The uniform convergence of nearest neighbor regression function estimators and their application in
optimization. IEEE Transactions on Information Theory, 24(2), 142-151.
• Friedman, J. H. (1991). Multivariate adaptive regression splines. The annals of statistics, 1-67.
• Friedman, J. H., & Meulman, J. J. (2003). Multiple additive regression trees with application in epidemiology. Statistics in medicine,
22(9), 1365-1381.
• Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural networks,
2(5), 359-366.
• Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of
Computational and Graphical statistics, 15(3), 651-674.
• Huang, G. B., Chen, L., & Siew, C. K. (2006). Universal approximation using incremental constructive feedforward networks with
random hidden nodes. IEEE Trans. Neural Networks, 17(4), 879-892.
• Huang, G. B., Wang, D. H., & Lan, Y. (2011). Extreme learning machines: a survey. International Journal of Machine Learning and
Cybernetics, 2(2), 107-122.
• Huberty, C. J., & Lowman, L. L. (2000). Group overlap as a basis for effect size. Educational and Psychological Measurement, 60(4),
543-563.
• Kang, B., & Choo, H. (2016). A deep-learning-based emergency alert system. ICT Express, 2(2), 67-70.
• Lian, H. (2011). Convergence of functional k-nearest neighbor regression estimate with functional responses. Electronic Journal of
Statistics, 5, 31-40.
• Osborne, J. W., & Overbay, A. (2004). The power of outliers (and why researchers should always check for them). Practical
assessment, research & evaluation, 9(6), 1-12.
• Ozuysal, M., Calonder, M., Lepetit, V., & Fua, P. (2010). Fast keypoint recognition using random ferns. IEEE transactions on pattern
analysis and machine intelligence, 32(3), 448-461.
• Pirracchio, R., Petersen, M. L., Carone, M., Rigon, M. R., Chevret, S., & van der Laan, M. J. (2015). Mortality prediction in intensive
care units with the Super ICU Learner Algorithm (SICULA): a population-based study. The Lancet Respiratory Medicine, 3(1), 42-52.
• Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural networks, 61, 85-117.
Applications of Forman-Ricci Curvature.pptx
 
Geometry for Social Good.pptx
Geometry for Social Good.pptxGeometry for Social Good.pptx
Geometry for Social Good.pptx
 
Topology for Time Series.pptx
Topology for Time Series.pptxTopology for Time Series.pptx
Topology for Time Series.pptx
 
Time Series Applications AMLD.pptx
Time Series Applications AMLD.pptxTime Series Applications AMLD.pptx
Time Series Applications AMLD.pptx
 
An introduction to quantum machine learning.pptx
An introduction to quantum machine learning.pptxAn introduction to quantum machine learning.pptx
An introduction to quantum machine learning.pptx
 
An introduction to time series data with R.pptx
An introduction to time series data with R.pptxAn introduction to time series data with R.pptx
An introduction to time series data with R.pptx
 
NLP: Challenges and Opportunities in Underserved Areas
NLP: Challenges and Opportunities in Underserved AreasNLP: Challenges and Opportunities in Underserved Areas
NLP: Challenges and Opportunities in Underserved Areas
 
Geometry, Data, and One Path Into Data Science.pptx
Geometry, Data, and One Path Into Data Science.pptxGeometry, Data, and One Path Into Data Science.pptx
Geometry, Data, and One Path Into Data Science.pptx
 
Topological Data Analysis.pptx
Topological Data Analysis.pptxTopological Data Analysis.pptx
Topological Data Analysis.pptx
 
Transforming Text Data to Matrix Data via Embeddings.pptx
Transforming Text Data to Matrix Data via Embeddings.pptxTransforming Text Data to Matrix Data via Embeddings.pptx
Transforming Text Data to Matrix Data via Embeddings.pptx
 
Natural Language Processing in the Wild.pptx
Natural Language Processing in the Wild.pptxNatural Language Processing in the Wild.pptx
Natural Language Processing in the Wild.pptx
 
SAS Global 2021 Introduction to Natural Language Processing
SAS Global 2021 Introduction to Natural Language Processing SAS Global 2021 Introduction to Natural Language Processing
SAS Global 2021 Introduction to Natural Language Processing
 
2021 American Mathematical Society Data Science Talk
2021 American Mathematical Society Data Science Talk2021 American Mathematical Society Data Science Talk
2021 American Mathematical Society Data Science Talk
 

Dernier

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 

Dernier (20)

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 

EXTREME LEARNING MACHINES AND UNIVERSAL APPROXIMATION
•Because extreme learning machines are based on random mappings, they are shown to converge to the correct classification/regression function via the Universal Approximation Theorem (likely a result of adequate coverage of the underlying data manifold).
•However, the width of the network required to reach an arbitrary error level may be computationally infeasible.
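To make the "penalized linear algebra problem" concrete, here is a minimal numpy sketch of an extreme learning machine for regression. The hidden-layer size, tanh activation, ridge penalty, and toy data are illustrative choices, not details taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y depends nonlinearly on the inputs.
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

# 1. Random mapping to a wide hidden layer (these weights are never trained).
n_hidden = 200
W = rng.normal(size=(X.shape[1], n_hidden))
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)          # random nonlinear features

# 2. Ridge-penalized least squares for the output weights: a single
#    linear-algebra solve replaces iterative backpropagation.
lam = 1e-2
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)

y_hat = H @ beta
print(np.mean((y - y_hat) ** 2))  # training MSE
```

The only "training" is the closed-form solve in step 2, which is why this formulation is so much faster than iterative fitting.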
DEEP LEARNING
•Deep learning attempts to solve the wide-layer problem by adding depth (more layers) to neural networks, which can be more effective and computationally feasible than extreme learning machines for some problems.
• This framework is like sifting data with multiple sifters to distill finer and finer pieces of the data.
•Deep networks are computationally intensive and require architecture design and tuning for each problem.
• Feed-forward networks are particularly popular, as they can be easily built, tuned, and trained.
• Feed-forward networks also connect to the Universal Approximation Theorem, providing a means to exploit these results without requiring an impractically wide single hidden layer.
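The layered "sifting" can be sketched as a small feed-forward network with two hidden layers, trained by plain gradient descent on a toy nonlinear classification problem. The 2-16-8-1 architecture, learning rate, and data are illustrative, not the configurations tested in the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy nonlinear classification: the label is an XOR-like quadrant pattern.
X = rng.normal(size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A 2-16-8-1 feed-forward network: modest depth instead of extreme width.
W1 = rng.normal(scale=0.5, size=(2, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 8)); b2 = np.zeros(8)
W3 = rng.normal(scale=0.5, size=(8, 1));  b3 = np.zeros(1)

lr = 0.5
for _ in range(3000):
    # Forward pass: each layer refines the previous representation.
    h1 = np.tanh(X @ W1 + b1)
    h2 = np.tanh(h1 @ W2 + b2)
    p = sigmoid(h2 @ W3 + b3).ravel()
    # Backward pass for mean cross-entropy loss, layer by layer.
    d3 = (p - y)[:, None] / len(y)
    d2 = (d3 @ W3.T) * (1 - h2 ** 2)
    d1 = (d2 @ W2.T) * (1 - h1 ** 2)
    W3 -= lr * (h2.T @ d3); b3 -= lr * d3.sum(0)
    W2 -= lr * (h1.T @ d2); b2 -= lr * d2.sum(0)
    W1 -= lr * (X.T @ d1);  b1 -= lr * d1.sum(0)

# Training accuracy after a final forward pass.
h1 = np.tanh(X @ W1 + b1)
h2 = np.tanh(h1 @ W2 + b2)
p = sigmoid(h2 @ W3 + b3).ravel()
acc = np.mean((p > 0.5) == y)
print(acc)
```

A single linear model cannot separate this quadrant pattern; the two hidden layers build up the needed nonlinear boundary, illustrating why depth can substitute for extreme width.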
SUPERLEARNERS
•This model is a weighted aggregation of multiple types of models.
• This is analogous to a small-town election: different people have different views of the politics and care about different issues.
•Different modeling methods capture different pieces of the data variance and vote accordingly.
• This leverages each algorithm's strengths while minimizing its weaknesses (an extension of bagging to multi-algorithm ensembles).
• Diversity allows the full ensemble to better explore the geometry underlying the data.
•This combines multiple models while avoiding multiple-testing issues.
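The weighted-vote idea can be sketched as follows: three deliberately different base learners produce out-of-fold predictions, and a least-squares meta-learner weights them. All learners here are toy stand-ins, and real superlearner implementations typically use more folds and constrain the weights to be non-negative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data with a mixed linear + nonlinear signal.
x = rng.uniform(-2, 2, size=300)
y = 0.5 * x + np.sin(2 * x) + rng.normal(scale=0.2, size=300)

# Three deliberately different "base learners".
def fit_linear(xtr, ytr):
    a, b = np.polyfit(xtr, ytr, 1)
    return lambda xs: a * xs + b

def fit_cubic(xtr, ytr):
    c = np.polyfit(xtr, ytr, 3)
    return lambda xs: np.polyval(c, xs)

def fit_knn(xtr, ytr, k=7):
    def predict(xs):
        d = np.abs(xs[:, None] - xtr[None, :])
        idx = np.argsort(d, axis=1)[:, :k]
        return ytr[idx].mean(axis=1)
    return predict

# Out-of-fold predictions from each base learner (2-fold for brevity).
Z = np.zeros((len(x), 3))
folds = np.arange(len(x)) % 2
for f in (0, 1):
    tr, te = folds != f, folds == f
    for j, fit in enumerate((fit_linear, fit_cubic, fit_knn)):
        Z[te, j] = fit(x[tr], y[tr])(x[te])

# Meta-learner: least-squares weights on the out-of-fold predictions.
w, *_ = np.linalg.lstsq(Z, y, rcond=None)
blend = Z @ w

for name, col in zip(("linear", "cubic", "knn", "blend"),
                     (Z[:, 0], Z[:, 1], Z[:, 2], blend)):
    print(name, round(np.mean((y - col) ** 2), 3))
```

Because each single model is itself one possible weighting, the fitted blend can do no worse than the best base learner on the predictions it was fit to, which is the intuition behind the oracle guarantee cited above.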
THEORY AND PRACTICE
•Superlearners are ensembles of machine learning models, typically built from a set of classifiers or regression models, including linear models, tree models, and ensemble models like boosting or bagging.
• Superlearners have theoretical guarantees about convergence and least upper bounds on model error relative to the algorithms within the superlearner framework.
• They can also rank variables by importance and provide model fits for each component.
•Deep architectures can be designed as feed-forward data-processing networks, in which the functional nodes through which data passes add information to the dataset regarding optimal partitioning and variable pairing.
• Recent attempts to create feed-forward deep networks employing random forest or SVM functions at each mapping show promise as an alternative to the typical neural network formulation of deep learning.
• It stands to reason that feed-forward deep networks based on other machine learning algorithms or combinations of algorithms may enjoy some of these benefits of deep learning.
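One way to read "functional nodes that add information to the dataset" is layers of machine learning models whose predictions are appended as new feature columns for the next layer. A toy sketch with a KNN node and a logistic-regression node (simple stand-ins for the random forest/SVM nodes mentioned above; fitting and predicting on the same data is a simplification):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + np.sin(3 * X[:, 1]) + rng.normal(scale=0.3, size=300) > 0).astype(float)

def fit_knn(Xtr, ytr, k=9):
    def predict(Xs):
        d = ((Xs[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
        return ytr[np.argsort(d, axis=1)[:, :k]].mean(1)
    return predict

def fit_logit(Xtr, ytr, steps=800, lr=0.1):
    A = np.c_[np.ones(len(Xtr)), Xtr]
    w = np.zeros(A.shape[1])
    for _ in range(steps):           # plain gradient descent
        p = 1 / (1 + np.exp(-A @ w))
        w -= lr * A.T @ (p - ytr) / len(ytr)
    return lambda Xs: 1 / (1 + np.exp(-np.c_[np.ones(len(Xs)), Xs] @ w))

# Each "layer" fits several learners and appends their predictions as
# new columns, so later layers can build on earlier layers' outputs.
layers = [(fit_knn, fit_logit), (fit_logit,)]
Z = X
for layer in layers:
    preds = [fit(Z, y)(Z) for fit in layer]
    Z = np.c_[Z, np.column_stack(preds)]

final = preds[-1]   # output of the last layer's (single) model
print(round(np.mean((final > 0.5) == y), 3))
```

The second-layer model sees both the raw predictors and the first layer's predictions, which is the data flow the mixed deep model in the experiments relies on.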
EXPERIMENTAL SET-UP
•Algorithm frameworks tested:
1. Superlearner with random forest, random ferns, KNN regression, MARS regression, conditional inference trees, and boosted regression.
2. Deep feed-forward machine learning model (mixed deep model): first hidden layer of 2 random forest models, a conditional inference tree model, and a random ferns model; second hidden layer of MARS regression and conditional inference trees; third hidden layer of boosted regression.
3. Optimally tuned deep feed-forward neural network model (13-5-3-1 configuration).
4. Deep feed-forward neural network model with the same hidden layer structure as the mixed deep model (Model 2).
5. KNN models, including a k=5 regression model, a deep k=5 model with a 10-10-5 hidden layer configuration, and a KNN superlearner.
•Simulation design:
1. Outcome as yes/no for simplicity of design (logistic regression problem).
2. 4 true predictors, 9 noise predictors.
3. Predictor relationships:
 1. Purely linear terms (ideal neural network set-up)
 2. Purely nonlinear terms (ideal machine learning set-up)
 3. Mix of linear and nonlinear terms (more likely in real-world data)
4. Gaussian noise level:
 1. Low
 2. High (more likely in real-world data)
5. Addition of outliers (~5-10% fraction) to high-noise conditions (mimics group overlap).
6. Sample sizes of 500, 1000, 2500, 5000, and 10000 to test convergence properties for each condition and algorithm.
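A data-generating function in the spirit of this simulation design might look like the following; the exact functional forms, coefficients, and noise scales are illustrative, as the slides do not specify them:

```python
import numpy as np

def simulate(n, relationship="mixed", noise="low", outlier_frac=0.0, seed=0):
    """Binary outcome from 4 true + 9 noise predictors, loosely following
    the simulation design above (functional forms here are illustrative)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 13))               # only columns 0-3 matter
    linear = X[:, 0] - 0.8 * X[:, 1]
    nonlinear = np.sin(2 * X[:, 2]) + X[:, 3] ** 2 - 1
    signal = {"linear": linear + X[:, 2] - X[:, 3],
              "nonlinear": nonlinear + np.tanh(X[:, 0] * X[:, 1]),
              "mixed": linear + nonlinear}[relationship]
    # Gaussian noise, low or high.
    signal = signal + rng.normal(scale={"low": 0.5, "high": 2.0}[noise], size=n)
    y = (signal > 0).astype(int)
    # Outliers mimic group overlap: flip labels for a random fraction.
    flip = rng.random(n) < outlier_frac
    y[flip] = 1 - y[flip]
    return X, y

X, y = simulate(1000, relationship="mixed", noise="high", outlier_frac=0.075)
print(X.shape, y.mean())
```

Sweeping `relationship`, `noise`, `outlier_frac`, and `n` over the grid above reproduces the structure of the experimental conditions.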
LINEAR RESULTS
•Deep neural networks show strong performance (linear-relationship models show universal approximation convergence at low sample sizes with low noise).
•Superlearners seem to perform better than deep models for machine learning ensembles.
•Deep architectures enhance the performance of KNN models, particularly at low sample sizes, but superlearners win out.
NONLINEAR RESULTS
•Superlearners dominate performance accuracy at smaller sample sizes, and machine learning deep models are competitive at these sample sizes.
•Tuned deep neural networks catch up to this performance at large sample sizes, particularly with noise and no outliers.
•Superlearner architectures show performance gains in KNN regression models across all conditions.
MIXED RESULTS
•Superlearners retain their competitive advantage up until very large sample sizes, suggesting that deep neural networks struggle with a mix of linear and nonlinear terms in a classification/regression model.
•Machine-learning-based deep architectures are competitive with deep neural networks at small sample sizes when no outliers are present.
•KNN superlearners retain a large advantage, particularly at low noise with few outliers.
PREDICTING BAR PASSAGE
•Data includes 188 Concord Law students for whom bar exam data exist.
•22 predictors, including admissions factors and law school grades, were used.
•Mixed deep model, superlearner model, and tuned deep neural network model were compared to assess performance on real-world data exhibiting linear and nonlinear relationships with noise and group overlap.
•70% of the data was used to train, with 30% held out as a test set to assess accuracy.

Algorithm                        Accuracy
Deep Machine Learning Network    84.2%
Superlearner Model               100.0%
Tuned Deep Neural Network        68.4%

•Deep neural networks struggle with the small sample size; using machine learning map functions dramatically improves accuracy.
• Sample size requirements for convergence are a noted limitation of neural networks in general.
• Previous results suggest performance depends on the choice of hidden layer activation functions (maps).
•The superlearner yields perfect prediction on this sample.
PREDICTING RETENTION BY ADVISING
•Data includes 27,666 students in 2016 and retention/graduation status at the end of each term.
•10 predictors (academic, demographic, and advising factors) were used.
•Mixed deep model, superlearner model, and tuned deep neural network model were compared to assess performance on real-world data exhibiting linear and nonlinear relationships with noise and group overlap.
•70% of the data was used to train, with 30% held out as a test set to assess accuracy.

Algorithm                        Accuracy
Deep Machine Learning Network    73.2%
Superlearner Model               74.1%
Tuned Deep Neural Network        74.4%

•Deep neural networks and deep machine learning models seem to provide a good processing sequence to improve model fits iteratively.
• Examining the deep machine learning model, we see that later layers weight prior models as fairly important predictors, and these previous-layer predictions combine with other factors in the dataset in later layers.
• This suggests that a deep approach can iteratively add information beyond what the individual models capture.
PREDICTING ADMISSIONS
•Data involved 905,612 leads from 2016 and various admissions factors.
• Because of low enrollment counts (~24,000), stratified sampling was used to enrich the training set for all models.
• The training set contained ~20% of observations, with ~10% of those being enrolled students.
•Superlearner/deep models give very similar model fit statistics (accuracy, AUC, FNR, FPR), and some individual models (MARS, random forest, boosted regression, conditional trees) gave very good model fit as well.
• This suggests convergence of most models tested.
•Runtime analysis shows the advantage of some models over others, with conditional tree/MARS models showing low runtimes.
•Deep neural networks have a runtime advantage over deep machine learning models and superlearners, mostly as a result of the random forest runtimes.
•A tree/MARS superlearner gave similar performance in a shorter amount of time than the deep neural network (~2 minutes).

Algorithm                        Accuracy  AUC   FNR   FPR   Time (Minutes)
Deep Machine Learning Network    98.0%     0.95  0.08  0.02  22
Superlearner Model               98.2%     0.96  0.08  0.01  15
Fast Superlearner Model          98.0%     0.95  0.08  0.02  2
Tuned Deep Neural Network        98.0%     0.95  0.08  0.02  8
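The enrichment step described above can be sketched as follows; the helper name and fractions mirror the stated ~20% training set with ~10% positives, applied here to synthetic labels at a scaled-down size:

```python
import numpy as np

def stratified_training_set(y, train_frac=0.2, pos_frac=0.1, seed=0):
    """Build an enriched training index: ~train_frac of all rows, with
    ~pos_frac of the sampled rows drawn from the rare positive class.
    (Illustrative sketch of the enrichment step, not the original code.)"""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    n_train = int(train_frac * len(y))
    n_pos = min(len(pos), int(pos_frac * n_train))
    idx = np.concatenate([rng.choice(pos, n_pos, replace=False),
                          rng.choice(neg, n_train - n_pos, replace=False)])
    rng.shuffle(idx)
    return idx

# Rare outcome (~2.6% positive), echoing the enrollment rate in the leads data.
rng = np.random.default_rng(1)
y = (rng.random(50000) < 0.026).astype(int)
idx = stratified_training_set(y)
print(len(idx), y[idx].mean())
```

Oversampling the positives in the training set this way gives each model enough enrolled students to learn from, while the untouched holdout keeps evaluation honest.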
CONCLUSIONS
•Deep architectures can provide gains above individual models, particularly at lower sample sizes, suggesting deep feed-forward approaches are efficacious at improving predictive capabilities.
• This suggests that deep architectures can improve individual models that work well on a particular problem.
• However, there is evidence that the topology of mappings between layers using these more complex machine learning functions detracts from the predictive capabilities and the universal approximation property.
•Deep architectures with a variety of algorithms in each layer provide gains above individual models and achieve good performance at low sample sizes under real-world conditions.
•However, superlearners provide more robust models with no architecture design or tuning needed; with group overlap and/or a combination of linear and nonlinear relationships, they are the best models to use, even at sample sizes where deep architectures begin to converge.
• Superlearners yield interpretable models and, hence, insight into important relationships between predictors and an outcome.
REFERENCES
• Aliper, A., Plis, S., Artemov, A., Ulloa, A., Mamoshina, P., & Zhavoronkov, A. (2016). Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Molecular Pharmaceutics, 13(7), 2524-2530.
• Altman, N. S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3), 175-185.
• Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
• Dekker, G., Pechenizkiy, M., & Vleeshouwers, J. (2009). Predicting students drop out: A case study. In Educational Data Mining 2009.
• Devroye, L. (1978). The uniform convergence of nearest neighbor regression function estimators and their application in optimization. IEEE Transactions on Information Theory, 24(2), 142-151.
• Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 1-67.
• Friedman, J. H., & Meulman, J. J. (2003). Multiple additive regression trees with application in epidemiology. Statistics in Medicine, 22(9), 1365-1381.
• Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359-366.
• Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651-674.
• Huang, G. B., Chen, L., & Siew, C. K. (2006). Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Transactions on Neural Networks, 17(4), 879-892.
• Huang, G. B., Wang, D. H., & Lan, Y. (2011). Extreme learning machines: a survey. International Journal of Machine Learning and Cybernetics, 2(2), 107-122.
• Huberty, C. J., & Lowman, L. L. (2000). Group overlap as a basis for effect size. Educational and Psychological Measurement, 60(4), 543-563.
• Kang, B., & Choo, H. (2016). A deep-learning-based emergency alert system. ICT Express, 2(2), 67-70.
• Lian, H. (2011). Convergence of functional k-nearest neighbor regression estimate with functional responses. Electronic Journal of Statistics, 5, 31-40.
• Osborne, J. W., & Overbay, A. (2004). The power of outliers (and why researchers should always check for them). Practical Assessment, Research & Evaluation, 9(6), 1-12.
• Ozuysal, M., Calonder, M., Lepetit, V., & Fua, P. (2010). Fast keypoint recognition using random ferns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 448-461.
• Pirracchio, R., Petersen, M. L., Carone, M., Rigon, M. R., Chevret, S., & van der Laan, M. J. (2015). Mortality prediction in intensive care units with the Super ICU Learner Algorithm (SICULA): a population-based study. The Lancet Respiratory Medicine, 3(1), 42-52.
• Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85-117.

Editor's notes

1. Computationally expensive in traditional algorithms and rooted in topological maps. Cannot handle a large number of variables relative to the number of observations. Cannot handle non-independent data. Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359-366.
2. Random mappings reduce the MLP to a linear system of equations. Huang, G. B., Wang, D. H., & Lan, Y. (2011). Extreme learning machines: a survey. International Journal of Machine Learning and Cybernetics, 2(2), 107-122.
3. Computationally expensive neural network extension. Still suffers from singularities, which hinder performance. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
4. Bagging of different base models (same bootstrap or different bootstraps). van der Laan, M. J., Polley, E. C., & Hubbard, A. E. (2007). Super learner. Statistical Applications in Genetics and Molecular Biology, 6(1).