
Automated Machine Learning via Sequential Uniform Designs


  1. Automated Machine Learning via Sequential Uniform Designs
     Dr. Aijun Zhang, The University of Hong Kong
     (Joint work with Zebin Yang (HKU) and Ji Zhu (Michigan))
     October 2018
  2. Outline of the presentation
     1 Introduction to AutoML: Hyperparameter Optimization; Review of Existing Methods; Proposed Approach to AutoML
     2 SeqUD-based Hyperparameter Optimization: Sequential Uniform Design; SeqUDHO Meta-algorithm
     3 Numerical Experiments: Simulation Study; AutoML Experiments
  3. What is AutoML (Automated Machine Learning)?
     AutoML performs automated ML model selection and hyperparameter configuration with the aim of maximizing ML prediction accuracy.
     It also targets progressive automation of data preprocessing, feature extraction/transformation, postprocessing and interpretation.
  4. Growing Interest in AutoML
     With the ultimate goal of making ML algorithms easy to use without expert knowledge, several off-the-shelf AutoML packages have appeared:
     - Auto-WEKA 2.0: simultaneous selection of the ML algorithm and its hyperparameters on WEKA (Kotthoff et al., JMLR 2017)
     - auto-sklearn: AutoML for Python scikit-learn (Feurer et al., NIPS 2015)
     - H2O AutoML: automated model selection and ensembling for H2O
     - AutoKeras: automated neural architecture search (Jin et al., 2018)
     - Google Cloud AutoML (beta): Translation, NLP, and Vision (2018)
     A recent Forbes article claims that AutoML is set to become the future of artificial intelligence.
  5. Hyperparameter Optimization
     Hyperparameter optimization, a.k.a. (hyper)parameter tuning, plays a central role in AutoML pipelines.
     - Hyperparameters can be continuous, integer-valued or categorical, e.g. regularization parameters, kernel bandwidths, tree depth, learning rate, batch size, number of layers, type of activation; a small encoding sketch follows this slide.
     - Hyperparameter optimization is combinatorial in nature and is therefore a challenging problem subject to the curse of dimensionality.
     - The tunability of ML hyperparameters is not well understood (Probst et al., 2018); the evidence is mostly empirical.
     - Robustness and reproducibility of a hyperparameter configuration depend not only on the specific algorithm but also on the specific dataset.
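As a minimal sketch (not from the talk), the code below shows how a mixed hyperparameter space with continuous, integer and categorical parameters can be encoded in the unit hypercube, the representation that design-based tuning works in. The parameter names and ranges are illustrative assumptions.

```python
import numpy as np

# Illustrative mixed search space: (type, spec...) per hyperparameter.
SPACE = {
    "learning_rate": ("continuous", 1e-4, 1.0, "log"),        # continuous, log scale
    "max_depth":     ("integer", 2, 12),                      # integer-valued
    "activation":    ("categorical", ["relu", "tanh", "sigmoid"]),  # categorical
}

def decode(u):
    """Map a point u in [0, 1]^d back to native hyperparameter values."""
    params = {}
    for ui, (name, spec) in zip(u, SPACE.items()):
        kind = spec[0]
        if kind == "continuous":
            lo, hi, scale = spec[1], spec[2], spec[3]
            if scale == "log":
                params[name] = 10 ** (np.log10(lo) + ui * (np.log10(hi) - np.log10(lo)))
            else:
                params[name] = lo + ui * (hi - lo)
        elif kind == "integer":
            lo, hi = spec[1], spec[2]
            params[name] = int(round(lo + ui * (hi - lo)))
        else:  # categorical: split [0, 1] into equal bins
            levels = spec[1]
            params[name] = levels[min(int(ui * len(levels)), len(levels) - 1)]
    return params

print(decode([0.5, 0.25, 0.9]))
```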
  6. (Figure-only slide; no recoverable text content.)
  7. Hyperparameter Optimization: Existing Methods
     - Grid search: exhaustive search over grid combinations (most popular)
     - Random search: random sampling (Bergstra and Bengio, 2012)
     - Bayesian optimization: sequentially sampling one point at a time by maximizing the expected improvement (Jones et al., 1998)
       - GP-EI: surface modeled by a Gaussian process (Snoek et al., 2012)
       - SMAC: surface modeled by a random forest (Hutter et al., 2011)
       - TPE: Tree-structured Parzen Estimator (Bergstra et al., 2011)
     - Genetic algorithm: Goldberg & Holland (Machine Learning, 1988)
     - Reinforcement learning: DNN architecture search (Zoph and Le, 2016)
  8. Grid Search vs. Random Search
     (Figure-only slide comparing the two sampling layouts; no recoverable text content.)
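A minimal sketch of the comparison behind this slide, using an illustrative objective (not from the talk): with the same budget of 25 evaluations, a 5 x 5 grid explores only 5 distinct values per dimension, whereas random search explores 25, which matters when one dimension dominates the response.

```python
import numpy as np

rng = np.random.default_rng(0)

def cv_score(x, y):                      # stand-in for a cross-validation score
    return -(x - 0.3) ** 2 - 0.1 * (y - 0.7) ** 2

# Grid search: 5 x 5 lattice over [0, 1]^2.
g = np.linspace(0, 1, 5)
grid_pts = np.array([(x, y) for x in g for y in g])

# Random search: 25 i.i.d. uniform points.
rand_pts = rng.uniform(size=(25, 2))

for name, pts in [("grid", grid_pts), ("random", rand_pts)]:
    scores = [cv_score(x, y) for x, y in pts]
    print(name, "best:", max(scores))
```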
  9. Bayesian Optimization
     E.g. the acquisition function used by GP-EI (Snoek et al., 2012):
     $\alpha_{EI}(x) = \int_{y^*}^{\infty} (y - y^*)\, p_{GP}(y \mid x)\, dy = \sigma(x)\,\big[ z^*(x)\,\Phi(z^*(x)) + \phi(z^*(x)) \big]$,
     where $y^*$ is the observed maximum, $(\mu(x), \sigma^2(x))$ are the GP-predicted (posterior) mean and variance, and $z^*(x) = (\mu(x) - y^*)/\sigma(x)$.
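The closed-form expression above is easy to evaluate numerically. The sketch below computes it with SciPy for a few candidate points; the posterior means, standard deviations and incumbent value are made-up numbers for illustration.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_best):
    """GP-EI acquisition for maximization: sigma * (z*Phi(z) + phi(z)),
    with z = (mu - y_best) / sigma, as on the slide."""
    sigma = np.maximum(sigma, 1e-12)        # guard against zero predictive variance
    z = (mu - y_best) / sigma
    return sigma * (z * norm.cdf(z) + norm.pdf(z))

# Example: three candidates with GP posterior mean/std and current best 0.85.
mu = np.array([0.80, 0.86, 0.84])
sd = np.array([0.05, 0.02, 0.10])
print(expected_improvement(mu, sd, 0.85))   # evaluate the candidate with largest EI next
```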
  10. Proposed Approach to AutoML
      We reformulate AutoML as a kind of Computer Experiment (CompExp).
      Connections between AutoML and CompExp: (a) the blackbox response surface can be complex; (b) the experiment is expensive to run.
  11. Proposed Approach to AutoML
      Within the CompExp framework, we propose a SeqUDHO meta-algorithm that performs hyperparameter optimization for each candidate ML algorithm.
      Key innovation: Sequential Uniform Design with augmented runs.
      - In a simulation study, the proposed SeqUDHO meta-algorithm is shown to outperform existing methods.
      - Numerical experiments on real-world datasets demonstrate that SeqUDHO has superior performance for the SVM, XGBoost and CNN algorithms.
  12. Outline of the presentation
      1 Introduction to AutoML: Hyperparameter Optimization; Review of Existing Methods; Proposed Approach to AutoML
      2 SeqUD-based Hyperparameter Optimization: Sequential Uniform Design; SeqUDHO Meta-algorithm
      3 Numerical Experiments: Simulation Study; AutoML Experiments
  13. SNTO Method for Global Optimization
      Fang and Wang (1990) proposed the SNTO method, which uses NT-nets for global/blackbox optimization; see Fang and Wang (1994, Chapter 3).
      However, SNTO does not utilize existing runs in the subdomain. This motivates us to develop an augmented uniform design method.
  14. Augmented Uniform Design
      Given an initial design $D_1$ with $n_1$ runs, find an augmented design $D_2^*$ with $n_2$ runs such that the combined design is as uniform as possible, i.e.
      $D_2^* = \arg\min_{D_2} \phi\!\left(\begin{bmatrix} D_1 \\ D_2 \end{bmatrix}\right)$,
      where $\phi(D)$ is a uniformity criterion, e.g. the centered L2-discrepancy (CD2).
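To make the criterion concrete, here is a small sketch that evaluates the centered L2-discrepancy (Hickernell's formula) and then picks augmented runs greedily from a random candidate pool. The greedy pool search is only an illustration of the criterion, not the UniDOE search algorithm used in the talk.

```python
import numpy as np

def cd2(X):
    """Centered L2-discrepancy of a design X with rows in [0, 1]^d;
    smaller values mean a more uniform design."""
    n, d = X.shape
    a = np.abs(X - 0.5)
    term1 = (13.0 / 12.0) ** d
    term2 = (2.0 / n) * np.sum(np.prod(1 + 0.5 * a - 0.5 * a ** 2, axis=1))
    diff = np.abs(X[:, None, :] - X[None, :, :])            # pairwise |x_ik - x_jk|
    prod = np.prod(1 + 0.5 * a[:, None, :] + 0.5 * a[None, :, :] - 0.5 * diff, axis=2)
    term3 = np.sum(prod) / n ** 2
    return np.sqrt(term1 - term2 + term3)

# Toy augmentation: given existing runs D1, greedily add points from a random
# candidate pool so that the combined design keeps CD2 small.
rng = np.random.default_rng(1)
D1 = rng.uniform(size=(8, 2))
pool = rng.uniform(size=(200, 2))
D2 = []
for _ in range(8):
    best = min(pool, key=lambda p: cd2(np.vstack([D1] + D2 + [p])))
    D2.append(best[None, :])
print("CD2 of combined design:", cd2(np.vstack([D1] + D2)))
```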
  15. Real-time SeqUD Construction
      The R package UniDOE (Zhang et al., 2018) performs stochastic search of uniform designs (left panel: stochastic/adaptive TA algorithm); https://CRAN.R-project.org/package=UniDOE
      It supports real-time construction of sequential uniform designs (SeqUD) with augmented runs, and R:UniDOE is used for our AutoML implementation.
  16. SeqUDHO Meta-algorithm
      1 Define the search space by converting parameters to the unit hypercube. Set Tmax (total runs), J (multi-shooting number) and k = 1 (current stage).
      2 Generate the design D with T = n1 UD runs. Evaluate CV(θ) and fit GP(θ).
      3 While T ≤ Tmax:
        - Set k = k + 1. From D and GP-predicted QMC samples, find the top-J centers {θ*_kj, j = 1, ..., J} with little overlap among their sub-spaces.
        - For j = 1, ..., J:
          - Zoom into the subspace centered at θ*_kj with level doubling;
          - Generate n_kj augmented runs in the subspace;
          - If T + n_kj > Tmax, break;
          - Evaluate CV(θ) for the n_kj runs and set T = T + n_kj.
        - Update the SeqUD design D with the T runs, and refit GP(θ).
      4 Output the optimal θ* from all T evaluated runs.
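The following is a deliberately simplified Python sketch of the sequential-zooming idea, not the authors' implementation: it assumes J = 1 zooming center, no GP meta-model, and a scrambled Sobol sequence standing in for the UniDOE-constructed uniform designs with augmented runs. `objective` plays the role of CV(θ) on the unit hypercube.

```python
import numpy as np
from scipy.stats import qmc

def sequd_sketch(objective, d=2, n_per_stage=16, t_max=80, seed=0):
    """Schematic sequential zooming: evaluate a space-filling batch,
    then halve the search domain around the incumbent best and repeat."""
    lower, upper = np.zeros(d), np.ones(d)
    X, y, t = [], [], 0
    while t + n_per_stage <= t_max:
        # space-filling runs inside the current (sub)domain
        U = qmc.Sobol(d, scramble=True, seed=seed + t).random(n_per_stage)
        pts = lower + U * (upper - lower)
        scores = [objective(p) for p in pts]
        X.extend(pts); y.extend(scores); t += n_per_stage
        # zoom: shrink the domain around the best point found so far
        center = X[int(np.argmax(y))]
        width = (upper - lower) / 4.0
        lower = np.clip(center - width, 0.0, 1.0)
        upper = np.clip(center + width, 0.0, 1.0)
    best = int(np.argmax(y))
    return X[best], y[best]

theta, score = sequd_sketch(lambda p: -np.sum((p - 0.37) ** 2))
print(theta, score)
```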
  17. Outline of the presentation
      1 Introduction to AutoML: Hyperparameter Optimization; Review of Existing Methods; Proposed Approach to AutoML
      2 SeqUD-based Hyperparameter Optimization: Sequential Uniform Design; SeqUDHO Meta-algorithm
      3 Numerical Experiments: Simulation Study; AutoML Experiments
  18. Simulation Study
      To check the effectiveness of hyperparameter optimization, we consider two kinds of complex surfaces as ground truth (shown graphically on the slide).
  19. Competitor Methods
      Five existing methods are compared (a usage sketch of one follows this slide):
      - Grid search: still the most popular today due to its simplicity
      - Random search: Bergstra and Bengio (JMLR 2012)
      - GP-EI (Snoek et al., NIPS 2012), based on GitHub: spearmint
      - SMAC (Hutter et al., 2011), based on GitHub: SMAC3
      - TPE (Bergstra et al., NIPS 2011), based on GitHub: hyperopt
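For concreteness, here is a minimal usage sketch of one competitor, TPE via the hyperopt package (Bergstra et al., 2011). The quadratic objective is an illustrative stand-in for the cross-validated loss of an ML algorithm; it is not from the talk.

```python
from hyperopt import fmin, tpe, hp, Trials

# Two log-scaled hyperparameters, e.g. SVM cost C and kernel parameter gamma.
space = {
    "C":     hp.loguniform("C", -6, 6),       # values between exp(-6) and exp(6)
    "gamma": hp.loguniform("gamma", -6, 6),
}

def objective(params):
    # Stand-in loss; in practice return 1 - mean 5-fold CV accuracy.
    return (params["C"] - 1.0) ** 2 + (params["gamma"] - 0.1) ** 2

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=100, trials=trials)
print(best)
```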
  20. Comparative Results
      (Figure-only slide; panels: (a) Cliff-shaped function, (b) Octopus-shaped function.)
  21. Sampling Points for the Cliff-shaped Function
      Figure: An example of evaluation trajectories on the Cliff-shaped function. Panels: (c) SeqUDHO, (d) GP-EI, (e) SMAC, (f) TPE, (g) Rand, (h) Grid.
  22. Sampling Points for the Octopus-shaped Function
      Figure: An example of evaluation trajectories on the Octopus-shaped function. Panels: (a) SeqUDHO, (b) GP-EI, (c) SMAC, (d) TPE, (e) Rand, (f) Grid.
  23. AutoML Experiments
      Six real classification datasets from the UCI machine learning repository:

      Table: Description of Datasets
      Abb.   | Dataset              | nfeatures | ndata | prop
      MBP    | molec-biol-promoter  | 58        | 106   | 0.49
      Breast | breast-cancer        | 10        | 286   | 0.69
      IonS   | ionosphere           | 34        | 350   | 0.3
      ConVot | congressional-voting | 17        | 434   | 0.59
      Credit | credit-approval      | 16        | 690   | 0.43
      MamG   | mammographic         | 6         | 960   | 0.56
  24. Testing Algorithm: SVM
      SVM (support vector machine) with 2 hyperparameters: kernel width in [10^-16, 10^6] and regularization strength in [10^-6, 10^16].
      Parameter tuning results for SVM, 5-fold CV accuracy (%):

      Dataset | Rand  | TPE   | GP-EI | SMAC  | SeqUDHO
      Breast  | 73.85 | 74.06 | 73.78 | 74.16 | 74.72
      ConVot  | 62.97 | 62.99 | 62.99 | 62.83 | 62.99
      Credit  | 86.13 | 86.29 | 86.38 | 86.03 | 86.52
      IonS    | 95.13 | 95.41 | 95.73 | 95.73 | 95.73
      MamG    | 83.83 | 83.92 | 83.56 | 84.00 | 84.00
      MBP     | 83.49 | 83.96 | 83.96 | 83.96 | 83.96
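A minimal sketch of the objective being tuned here: 5-fold CV accuracy of an RBF-kernel SVM as a function of its two hyperparameters, both searched on a log scale. The scikit-learn breast_cancer dataset is only a convenient stand-in, not the UCI breast-cancer dataset in the table above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

def cv_accuracy(log10_gamma, log10_C):
    """5-fold CV accuracy as a function of log10(kernel width) and log10(C)."""
    model = SVC(kernel="rbf", gamma=10.0 ** log10_gamma, C=10.0 ** log10_C)
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

# Any optimizer (grid, random, TPE, SeqUD, ...) simply calls cv_accuracy.
print(cv_accuracy(-3, 2))
```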
  25. Testing Algorithm: XGBoost
      XGBoost (extreme gradient boosting) with 10 hyperparameters: 1 binary (choice of base model), 2 integer (maximum tree depth, number of estimators) and 7 continuous (learning rate, min sample weights, min loss reduction, ratio of samples in trees, ratio of variables in trees, L2 regularization and L1 regularization).
      Parameter tuning results for XGBoost, 5-fold CV accuracy (%):

      Dataset | Rand  | TPE   | GP-EI | SMAC  | SeqUDHO
      Breast  | 75.77 | 76.18 | 76.22 | 76.22 | 76.18
      ConVot  | 63.17 | 63.38 | 63.22 | 63.01 | 63.54
      Credit  | 88.06 | 88.28 | 88.55 | 88.50 | 88.65
      IonS    | 93.53 | 93.96 | 94.02 | 94.08 | 94.22
      MamG    | 82.97 | 83.02 | 83.14 | 82.90 | 82.90
      MBP     | 89.43 | 90.28 | 89.62 | 89.62 | 90.48
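One plausible mapping of the ten slide hyperparameters onto the xgboost scikit-learn interface is sketched below; the correspondence between the slide's names and the xgboost arguments is my reading, and no ranges are asserted.

```python
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

def make_model(params):
    return XGBClassifier(
        booster=params["booster"],                     # binary: "gbtree" or "gblinear"
        max_depth=params["max_depth"],                 # integer: maximum tree depth
        n_estimators=params["n_estimators"],           # integer: number of estimators
        learning_rate=params["learning_rate"],         # continuous
        min_child_weight=params["min_child_weight"],   # "min sample weights"
        gamma=params["gamma"],                         # "min loss reduction"
        subsample=params["subsample"],                 # ratio of samples in trees
        colsample_bytree=params["colsample_bytree"],   # ratio of variables in trees
        reg_lambda=params["reg_lambda"],               # L2 regularization
        reg_alpha=params["reg_alpha"],                 # L1 regularization
    )

def cv_accuracy(params, X, y):
    """5-fold CV accuracy for one hyperparameter configuration."""
    return cross_val_score(make_model(params), X, y, cv=5, scoring="accuracy").mean()
```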
  26. Testing Algorithm: CNN
      CNN (convolutional neural network) with three layers; each layer is tuned by its number of filters and kernel size. Global hyperparameters include the choice of optimizer, batch size, learning rate and L2 penalty.
      MNIST data split: 8000 samples for training, 2000 samples for validation, and 50000 samples for testing. Here, the AutoML target is to maximize the validation accuracy.
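A hedged Keras sketch of such a tunable CNN: the per-layer filters and kernel sizes plus the optimizer, learning rate and L2 penalty are exposed as arguments. The exact architecture (padding, pooling, output head) is an assumption, not the model from the talk.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_cnn(filters, kernel_sizes, optimizer_name, learning_rate, l2_penalty):
    """Three conv layers with tunable filters/kernel sizes; global optimizer,
    learning rate and L2 penalty are also hyperparameters."""
    reg = regularizers.l2(l2_penalty)
    model = keras.Sequential([keras.Input(shape=(28, 28, 1))])
    for f, k in zip(filters, kernel_sizes):
        model.add(layers.Conv2D(f, k, activation="relu", padding="same",
                                kernel_regularizer=reg))
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(10, activation="softmax", kernel_regularizer=reg))
    opt = {"adam": keras.optimizers.Adam,
           "sgd": keras.optimizers.SGD}[optimizer_name](learning_rate=learning_rate)
    model.compile(optimizer=opt, loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_cnn([32, 64, 64], [3, 3, 3], "adam", 1e-3, 1e-4)
```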
  27. Testing Algorithm: CNN
      Hyperparameter settings and optimization results are shown in a table on the slide (not recoverable here).
      The best CNN model selected by SeqUDHO is tested on the 50K held-out samples, with a testing accuracy of 98.05%.
  28. AutoML Demonstration
      Finally, we demonstrate how to use SeqUDHO for AutoML in practice (a schematic sketch follows this slide).
      - Consider mixture.example (R:ElemStatLearn) and seven benchmark datasets from the UCI ML repository, all with binary responses.
      - Consider three candidate ML algorithms (SVM, Random Forest, XGBoost), each with its own hyperparameter settings.
      Example of AutoML output by SeqUDHO: (shown on the slide.)
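A schematic sketch of this demonstration, under explicit assumptions: `tune` stands for any hyperparameter optimizer (SeqUDHO in the talk) and is left abstract, and the candidate search spaces shown are illustrative, not those used in the experiments.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

# Candidate algorithms with illustrative hyperparameter ranges.
CANDIDATES = {
    "SVM":          (SVC,                    {"C": (1e-6, 1e6), "gamma": (1e-6, 1e6)}),
    "RandomForest": (RandomForestClassifier, {"n_estimators": (50, 500), "max_depth": (2, 20)}),
    "XGBoost":      (XGBClassifier,          {"learning_rate": (1e-3, 0.3), "max_depth": (2, 12)}),
}

def automl(X, y, tune, budget=100):
    """Tune each candidate with the same budget and report the best one.
    `tune(estimator, space, X, y, budget)` returns (best_params, best_cv_score)."""
    results = {}
    for name, (estimator, space) in CANDIDATES.items():
        results[name] = tune(estimator, space, X, y, budget)
    winner = max(results, key=lambda k: results[k][1])
    return winner, results[winner]
```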
  29. Future Work
      1 Run simulation studies for high-dimensional blackbox optimization; analyze the strengths and weaknesses of SeqUDHO and other Bayesian methods.
      2 Improve the Gaussian process meta-modeling (with nugget effect) through sequential approximation for non-stationary surfaces.
      3 Investigate DNN architecture search with SeqUD, and compare with genetic programming and reinforcement learning.
      4 Investigate automated procedures for feature engineering, including variable selection and transformation.
      5 Develop an AutoML R/Python package implementing the SeqUDHO meta-algorithm.
  30. References
      1. Bergstra, J., Bardenet, R., Bengio, Y. and Kegl, B. (2011). Algorithms for hyper-parameter optimization. In NIPS, 2546-2554.
      2. Bergstra, J. and Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281-305.
      3. Fang, K.T. and Wang, Y. (1990). A sequential number-theoretic method for optimization and its applications in statistics. In Lecture Notes in Contemporary Mathematics, Science Press.
      4. Fang, K.T. and Wang, Y. (1994). Number-theoretic Methods in Statistics. CRC Press.
      5. Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M. and Hutter, F. (2015). Efficient and robust automated machine learning. In NIPS, 2962-2970.
      6. Goldberg, D.E. and Holland, J.H. (1988). Genetic algorithms and machine learning. Machine Learning, 3(2), 95-99.
      7. Huang, C.M., Lee, Y.J., Lin, D.K. and Huang, S.Y. (2007). Model selection for support vector machines via uniform design. CSDA, 52(1), 335-346.
      8. Hutter, F., Hoos, H.H. and Leyton-Brown, K. (2011). Sequential model-based optimization for general algorithm configuration. In International Conference on Learning and Intelligent Optimization, 507-523. Springer, Berlin, Heidelberg.
  31. References
      9. Jin, H., Song, Q. and Hu, X. (2018). Efficient neural architecture search with network morphism. arXiv preprint arXiv:1806.10282.
      10. Jones, D.R., Schonlau, M. and Welch, W.J. (1998). Efficient global optimization of expensive black-box functions. Journal of Global Optimization, 13(4), 455-492.
      11. Kotthoff, L., Thornton, C., Hoos, H.H., Hutter, F. and Leyton-Brown, K. (2017). Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA. Journal of Machine Learning Research, 18(1), 826-830.
      12. Probst, P., Bischl, B. and Boulesteix, A.L. (2018). Tunability: Importance of hyperparameters of machine learning algorithms. arXiv:1802.09596.
      13. Snoek, J., Larochelle, H. and Adams, R.P. (2012). Practical Bayesian optimization of machine learning algorithms. In NIPS, 2951-2959.
      14. Zhang, A., Li, H., Quan, S. and Yang, Z. (2018). UniDOE: uniform design of experiments. R package version 1.0.2. https://CRAN.R-project.org/package=UniDOE
      15. Zoph, B. and Le, Q.V. (2016). Neural architecture search with reinforcement learning. arXiv:1611.01578.
  32. Thank You!
      Q&A, or email ajzhang@hku.hk
