SlideShare une entreprise Scribd logo
1  sur  16
Télécharger pour lire hors ligne
L1-based compression of random forest model 
Arnaud Joly, François Schnitlzer, Pierre Geurts, Louis Wehenkel 
a.joly@ulg.ac.be - www.ajoly.org 
Departement of EE and CS & GIGA-Research
High dimensional supervised learning applications 
3D Image 
segmentation 
Genomics Electrical grid 
x=original image 
y=segmented image 
x=DNA sequence 
y=phenotype 
x=system state 
y=stability 
From 105 to 109 dimensions.
Tree based ensemble methods 
From a dataset of input-output pairs f(xi ; yi )gni 
=1  X  Y, we 
approximate f : X ! Y by learning an ensemble of M decision trees. 
Label 3 
Yes 
Yes 
No 
No 
... 
Label 1 
Label 2 
The estimator ^f of f is obtained by averaging the predictions of the 
ensemble of trees.
High model complexity ! Large memory requirement 
The complexity of tree based methods is measured by the number of 
internal nodes and increases with 
I the ensemble size M; 
I the number of samples n in the dataset. 
The variance of individual trees increases with the dimension p of the 
original feature space ! M(p) should increase with p to yield near optimal 
accuracy. 
Complexity grows as nM(p) ! may require huge amount of storage. 
Memory limitation will be an issue in high dimensional problems.
L1-based compression of random forest model (I) 
We first learn an ensemble of M extremely randomized trees (Geurts, et al., 
2006) . . . 
Example 
1,1 
1,2 1,3 
1,4 1,5 
1,6 1,7 
2,1 
2,2 2,3 
2,4 2,5 2,6 2,7 
3,1 
3,2 3,3 
3,4 3,5 
. . . and associate to each node an indicator function 1m;l (x) which is equal 
to 1 if the sample (x; y) reaches the l -th node of the m-th tree, 0 otherwise.
L1-based compression of random forest model (II) 
The node indicator functions 1m;l (x) may be used to lift the input space X 
towards its induced feature space Z 
z(x) = (11;1(x); : : : ; 11;N1(x); : : : ; 1M;1(x); : : : ; 1M;NM(x)) : 
Example for one sample xs 
1,1 
1,2 1,3 
1,4 1,5 
1,6 1,7 
2,1 
2,2 2,3 
2,4 2,5 2,6 2,7 
3,1 
3,2 3,3 
3,4 3,5
L1-based compression of random forest model (III) 
A variable selection method (regularization with L1-norm) is applied on the 
induced space Z to compress the tree ensemble using the solution of
j (t) 
q 
j=0 = arg min
Xn 
i=1 
0 
@yi
0  
Xq 
j=1 
1 
A
j zj (xi ) 
2 
s.t. 
Xq 
j=1 
j
j j  t: 
Pruning: a test node is deleted if all its descendants (including the test 
node itself) correspond to
j (t) = 0.
Overall assessment on 3 datasets 
Datasets Error Complexity 
ET rET ET rET ET/rET 
Friedman1 0:19587 0:18593 29900 885 34 
Two-norm 0:04177 0:06707 4878 540 9 
SEFTi 0:86159 0:84131 39436 2055 19 
ET Extra trees; 
rET L1-based compression of ET. 
Table: Parameters of the Extra-Tree method: M = 100; K = p; nmin = 1 on 
Friedman1 and Two-norm, nmin = 10 on SEFTi.
An increase of t decreases the error of rET until t = 3 with 
drastic pruning 
(a) Estimated risk (b) Complexity 
Friedman1 : M = 100, K = p = 10 and nmin = 1
Managing complexity in the extra tree method 
Bound M 
Restrict the size M of the tree based ensemble. 
Pre-pruning 
Pre-pruning reduces the complexity of tree based methods by imposing a 
condition to split a node e.g. 
I minimum number of samples nmin in order to split, 
I minimum decrease of an impurity measure, 
I . . .

Contenu connexe

Tendances

Speaker Recognition using Gaussian Mixture Model
Speaker Recognition using Gaussian Mixture Model Speaker Recognition using Gaussian Mixture Model
Speaker Recognition using Gaussian Mixture Model Saurab Dulal
 
FUAT – A Fuzzy Clustering Analysis Tool
FUAT – A Fuzzy Clustering Analysis ToolFUAT – A Fuzzy Clustering Analysis Tool
FUAT – A Fuzzy Clustering Analysis ToolSelman Bozkır
 
Newton Raphson iterative Method
Newton Raphson iterative MethodNewton Raphson iterative Method
Newton Raphson iterative MethodIsaac Yowetu
 
Clustering:k-means, expect-maximization and gaussian mixture model
Clustering:k-means, expect-maximization and gaussian mixture modelClustering:k-means, expect-maximization and gaussian mixture model
Clustering:k-means, expect-maximization and gaussian mixture modeljins0618
 
Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsKen Kuroki
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chaptersChristian Robert
 
Gaussian processing
Gaussian processingGaussian processing
Gaussian processing홍배 김
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015Christian Robert
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forestsChristian Robert
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Toru Fujino
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapterChristian Robert
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationChristian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Considerate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionConsiderate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionMichael Stumpf
 
12. Random Forest
12. Random Forest12. Random Forest
12. Random ForestFAO
 
ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapterChristian Robert
 

Tendances (20)

Speaker Recognition using Gaussian Mixture Model
Speaker Recognition using Gaussian Mixture Model Speaker Recognition using Gaussian Mixture Model
Speaker Recognition using Gaussian Mixture Model
 
MUMS Opening Workshop - An Overview of Reduced-Order Models and Emulators (ED...
MUMS Opening Workshop - An Overview of Reduced-Order Models and Emulators (ED...MUMS Opening Workshop - An Overview of Reduced-Order Models and Emulators (ED...
MUMS Opening Workshop - An Overview of Reduced-Order Models and Emulators (ED...
 
FUAT – A Fuzzy Clustering Analysis Tool
FUAT – A Fuzzy Clustering Analysis ToolFUAT – A Fuzzy Clustering Analysis Tool
FUAT – A Fuzzy Clustering Analysis Tool
 
MUMS Opening Workshop - Quantifying Nonparametric Modeling Uncertainty with B...
MUMS Opening Workshop - Quantifying Nonparametric Modeling Uncertainty with B...MUMS Opening Workshop - Quantifying Nonparametric Modeling Uncertainty with B...
MUMS Opening Workshop - Quantifying Nonparametric Modeling Uncertainty with B...
 
Intractable likelihoods
Intractable likelihoodsIntractable likelihoods
Intractable likelihoods
 
Newton Raphson iterative Method
Newton Raphson iterative MethodNewton Raphson iterative Method
Newton Raphson iterative Method
 
Clustering:k-means, expect-maximization and gaussian mixture model
Clustering:k-means, expect-maximization and gaussian mixture modelClustering:k-means, expect-maximization and gaussian mixture model
Clustering:k-means, expect-maximization and gaussian mixture model
 
Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and Physics
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chapters
 
Gaussian processing
Gaussian processingGaussian processing
Gaussian processing
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forests
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapter
 
Pattern classification
Pattern classificationPattern classification
Pattern classification
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimation
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
Considerate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionConsiderate Approaches to ABC Model Selection
Considerate Approaches to ABC Model Selection
 
12. Random Forest
12. Random Forest12. Random Forest
12. Random Forest
 
ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapter
 

Similaire à L1-based compression of random forest modelSlide

Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Umberto Picchini
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsGilles Louppe
 
Patch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesPatch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesFrank Nielsen
 
Handling missing data with expectation maximization algorithm
Handling missing data with expectation maximization algorithmHandling missing data with expectation maximization algorithm
Handling missing data with expectation maximization algorithmLoc Nguyen
 
Dynamics of structures with uncertainties
Dynamics of structures with uncertaintiesDynamics of structures with uncertainties
Dynamics of structures with uncertaintiesUniversity of Glasgow
 
Jörg Stelzer
Jörg StelzerJörg Stelzer
Jörg Stelzerbutest
 
Habilitation à diriger des recherches
Habilitation à diriger des recherchesHabilitation à diriger des recherches
Habilitation à diriger des recherchesPierre Pudlo
 
Two algorithms to accelerate training of back-propagation neural networks
Two algorithms to accelerate training of back-propagation neural networksTwo algorithms to accelerate training of back-propagation neural networks
Two algorithms to accelerate training of back-propagation neural networksESCOM
 
Large variance and fat tail of damage by natural disaster
Large variance and fat tail of damage by natural disasterLarge variance and fat tail of damage by natural disaster
Large variance and fat tail of damage by natural disasterHang-Hyun Jo
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Frank Nielsen
 
Learning to Reconstruct
Learning to ReconstructLearning to Reconstruct
Learning to ReconstructJonas Adler
 
2012 mdsp pr05 particle filter
2012 mdsp pr05 particle filter2012 mdsp pr05 particle filter
2012 mdsp pr05 particle filternozomuhamada
 

Similaire à L1-based compression of random forest modelSlide (20)

Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
 
Deep Learning Opening Workshop - Statistical and Computational Guarantees of ...
Deep Learning Opening Workshop - Statistical and Computational Guarantees of ...Deep Learning Opening Workshop - Statistical and Computational Guarantees of ...
Deep Learning Opening Workshop - Statistical and Computational Guarantees of ...
 
Patch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective DivergencesPatch Matching with Polynomial Exponential Families and Projective Divergences
Patch Matching with Polynomial Exponential Families and Projective Divergences
 
Handling missing data with expectation maximization algorithm
Handling missing data with expectation maximization algorithmHandling missing data with expectation maximization algorithm
Handling missing data with expectation maximization algorithm
 
Benelearn2016
Benelearn2016Benelearn2016
Benelearn2016
 
Dynamics of structures with uncertainties
Dynamics of structures with uncertaintiesDynamics of structures with uncertainties
Dynamics of structures with uncertainties
 
CoopLoc Technical Presentation
CoopLoc Technical PresentationCoopLoc Technical Presentation
CoopLoc Technical Presentation
 
Jörg Stelzer
Jörg StelzerJörg Stelzer
Jörg Stelzer
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
 
Habilitation à diriger des recherches
Habilitation à diriger des recherchesHabilitation à diriger des recherches
Habilitation à diriger des recherches
 
Isolation Forest
Isolation ForestIsolation Forest
Isolation Forest
 
Probability Assignment Help
Probability Assignment HelpProbability Assignment Help
Probability Assignment Help
 
MUMS Opening Workshop - Model Uncertainty in Data Fusion for Remote Sensing -...
MUMS Opening Workshop - Model Uncertainty in Data Fusion for Remote Sensing -...MUMS Opening Workshop - Model Uncertainty in Data Fusion for Remote Sensing -...
MUMS Opening Workshop - Model Uncertainty in Data Fusion for Remote Sensing -...
 
Two algorithms to accelerate training of back-propagation neural networks
Two algorithms to accelerate training of back-propagation neural networksTwo algorithms to accelerate training of back-propagation neural networks
Two algorithms to accelerate training of back-propagation neural networks
 
Large variance and fat tail of damage by natural disaster
Large variance and fat tail of damage by natural disasterLarge variance and fat tail of damage by natural disaster
Large variance and fat tail of damage by natural disaster
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...
 
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
 
Learning to Reconstruct
Learning to ReconstructLearning to Reconstruct
Learning to Reconstruct
 
2012 mdsp pr05 particle filter
2012 mdsp pr05 particle filter2012 mdsp pr05 particle filter
2012 mdsp pr05 particle filter
 

Dernier

Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 

Dernier (20)

Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 

L1-based compression of random forest modelSlide

  • 1. L1-based compression of random forest model Arnaud Joly, François Schnitlzer, Pierre Geurts, Louis Wehenkel a.joly@ulg.ac.be - www.ajoly.org Departement of EE and CS & GIGA-Research
  • 2. High dimensional supervised learning applications 3D Image segmentation Genomics Electrical grid x=original image y=segmented image x=DNA sequence y=phenotype x=system state y=stability From 105 to 109 dimensions.
  • 3. Tree based ensemble methods From a dataset of input-output pairs f(xi ; yi )gni =1 X Y, we approximate f : X ! Y by learning an ensemble of M decision trees. Label 3 Yes Yes No No ... Label 1 Label 2 The estimator ^f of f is obtained by averaging the predictions of the ensemble of trees.
  • 4. High model complexity ! Large memory requirement The complexity of tree based methods is measured by the number of internal nodes and increases with I the ensemble size M; I the number of samples n in the dataset. The variance of individual trees increases with the dimension p of the original feature space ! M(p) should increase with p to yield near optimal accuracy. Complexity grows as nM(p) ! may require huge amount of storage. Memory limitation will be an issue in high dimensional problems.
  • 5. L1-based compression of random forest model (I) We first learn an ensemble of M extremely randomized trees (Geurts, et al., 2006) . . . Example 1,1 1,2 1,3 1,4 1,5 1,6 1,7 2,1 2,2 2,3 2,4 2,5 2,6 2,7 3,1 3,2 3,3 3,4 3,5 . . . and associate to each node an indicator function 1m;l (x) which is equal to 1 if the sample (x; y) reaches the l -th node of the m-th tree, 0 otherwise.
  • 6. L1-based compression of random forest model (II) The node indicator functions 1m;l (x) may be used to lift the input space X towards its induced feature space Z z(x) = (11;1(x); : : : ; 11;N1(x); : : : ; 1M;1(x); : : : ; 1M;NM(x)) : Example for one sample xs 1,1 1,2 1,3 1,4 1,5 1,6 1,7 2,1 2,2 2,3 2,4 2,5 2,6 2,7 3,1 3,2 3,3 3,4 3,5
  • 7. L1-based compression of random forest model (III) A variable selection method (regularization with L1-norm) is applied on the induced space Z to compress the tree ensemble using the solution of
  • 8. j (t) q j=0 = arg min
  • 9. Xn i=1 0 @yi
  • 10. 0 Xq j=1 1 A
  • 11. j zj (xi ) 2 s.t. Xq j=1 j
  • 12. j j t: Pruning: a test node is deleted if all its descendants (including the test node itself) correspond to
  • 13. j (t) = 0.
  • 14. Overall assessment on 3 datasets Datasets Error Complexity ET rET ET rET ET/rET Friedman1 0:19587 0:18593 29900 885 34 Two-norm 0:04177 0:06707 4878 540 9 SEFTi 0:86159 0:84131 39436 2055 19 ET Extra trees; rET L1-based compression of ET. Table: Parameters of the Extra-Tree method: M = 100; K = p; nmin = 1 on Friedman1 and Two-norm, nmin = 10 on SEFTi.
  • 15. An increase of t decreases the error of rET until t = 3 with drastic pruning (a) Estimated risk (b) Complexity Friedman1 : M = 100, K = p = 10 and nmin = 1
  • 16. Managing complexity in the extra tree method Bound M Restrict the size M of the tree based ensemble. Pre-pruning Pre-pruning reduces the complexity of tree based methods by imposing a condition to split a node e.g. I minimum number of samples nmin in order to split, I minimum decrease of an impurity measure, I . . .
  • 17. The accuracy and complexity of an rET model does not depend on nmin, for nmin small enough (nmin 10 ) (c) Estimated risk (d) Complexity Friedman1 : M = 100, K = p = 10 and t = t cv
  • 18. After variance reduction has stabilized (M ' 100), further increasing M keeps enhancing the accuracy of the rET model without increasing complexity (e) Estimated risk (f) Complexity Friedman1 : nmin = 10, K = p = 10 and t = t cv
  • 19. Conclusion perspectives 1. Drastic pruning while preserving accuracy. 2. Strong compressibility of the tree ensemble suggests that it could be possible to design novel algorithms suited to very high dimensional input space. 3. Future research will target similar compression ratio without using the complete set of node indicator functions of the forest model.
  • 20. L1-based compression of random forest model Arnaud Joly, François Schnitlzer, Pierre Geurts, Louis Wehenkel a.joly@ulg.ac.be - www.ajoly.org Departement of EE and CS GIGA-Research
  • 22. Overall assessment on 3 datasets Datasets Error Complexity ET rET Lasso ET rET Lasso Friedman1 0:19587 0:18593 0:282441 29900 885 4 Two-norm 0:04177 0:06707 0:033500 4878 540 20 SEFTi 0:86159 0:84131 0:988031 39436 2055 14 ET Extra trees; rET L1-based compression of ET. Table: Overall assessment (parameters of the Extra-Tree method: M = 100; K = p; nmin = 1 on Friedman1 and Two-norm, nmin = 10 on SEFTi).