COSMOS_IDETC_2014_Souma

Concurrent Surrogate Model Selection (COSMOS)
based on
Predictive Estimation of Model Fidelity
Souma Chowdhury#, Ali Mehmani*, and Achille Messac#
* Syracuse University, Department of Mechanical and Aerospace Engineering
# Mississippi State University, Bagley College of Engineering
The ASME International Design Engineering Technical Conferences (IDETC)
August 17 – 20, 2014, Buffalo, NY

Surrogate Modeling
Surrogate models are commonly used to provide a tractable and inexpensive
approximation of the actual system behavior, either
▪ as an alternative to expensive computational simulations (e.g., CFD), or
▪ where no physical model exists, as with experiment-derived data (e.g., testing
of new metallic alloys).
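Model type: Kriging, RBF, SVR, . . .
Kernel/basis function: Linear, Exponential, Gaussian, Cubic, Multiquadric, . . .
Hyper-parameter value: correlation parameter, shape parameter, . . .
Example: RBF model with a multiquadric basis function
$f(x) = \sum_{i=1}^{n} w_i \, \psi(\| x - x_i \|)$
$\psi(r) = (r^2 + c^2)^{1/2}$, with $r = \| x - x_i \|$ and $c_{lower} < c < c_{upper}$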
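To make the RBF form concrete, here is a minimal Python sketch of a multiquadric RBF surrogate. The training data, the value c = 0.5, and the function names are illustrative assumptions, not taken from the slides; the weights are obtained by plain interpolation (solving a linear system), which is only one possible way to train such a model.

```python
import numpy as np

def multiquadric(r, c):
    """Multiquadric basis function: psi(r) = sqrt(r^2 + c^2)."""
    return np.sqrt(r**2 + c**2)

def fit_rbf(X, y, c):
    """Solve Psi w = y for the weights, where Psi[j, i] = psi(||x_j - x_i||)."""
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.linalg.solve(multiquadric(r, c), y)

def predict_rbf(X_train, w, c, X_new):
    """Evaluate f(x) = sum_i w_i * psi(||x - x_i||) at each row of X_new."""
    r = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=-1)
    return multiquadric(r, c) @ w

# Illustrative data only (assumption): 15 random points of a smooth 2D function.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(15, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2

w = fit_rbf(X, y, c=0.5)
residual = np.max(np.abs(predict_rbf(X, w, 0.5, X) - y))
print(residual)  # an interpolant reproduces the training data up to round-off
```

Changing c changes the shape of the surrogate between the training points, which is why the hyper-parameter value is treated as its own selection level.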

Outline
• Background and Literature
• Research Objectives
• COSMOS Framework
• Predictive Estimation of Model Fidelity (PEMF)
• Numerical Experiments: Results
• Concluding Remarks

Surrogate Model Selection
▪ Intuitive model selection (experience-based selection)
Model selection based on an understanding of the data characteristics
and/or the application constraints.
• Developing general guidelines is likely impractical because of problem diversity.
• Only a few candidate surrogates are generally considered.
• In MDO problems, characteristics of disciplinary phenomena may not be evident.
▪ Automated model selection
Model selection based on quantitative decision-making techniques.
Automated selection can be performed at three levels: model type,
kernel/basis function, and hyper-parameter value.

Automated Model or Kernel Selection
▪ Error measures are used to select the model type and basis functions*
$F^* = \arg\min_{F \in \mathbf{F}} \varepsilon(F)$
where $F^*$ is the best surrogate model, $\varepsilon(F)$ is the surrogate model error, and $\mathbf{F}$ is the set of candidate surrogates.
▪ Popular error measures used for model selection include: (i) split sample,
(ii) cross-validation, (iii) bootstrapping, (iv) Schwarz’s Bayesian information
criterion (BIC), and (v) Akaike’s information criterion (AIC)
Method Model Type Selection Kernel Type Selection
Holena et al., 2011 ✓
Jin et al., 2001 ✓
Gano et al., 2006 ✓
Chen et al., 2004 ✓
Viana et al., 2009 ✓ ✓
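The selection rule F* = argmin ε(F) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the candidate set (three RBF kernels with fixed, arbitrarily chosen widths), the test function, and all names are hypothetical, and leave-one-out cross-validation stands in for whichever error measure a given method uses.

```python
import numpy as np

# Hypothetical candidate set F: three RBF kernels with fixed hyper-parameters.
CANDIDATES = {
    "linear": lambda r: r,
    "gaussian": lambda r: np.exp(-((3.0 * r) ** 2)),
    "multiquadric": lambda r: np.sqrt(r**2 + 0.3**2),
}

def pairwise_dist(A, B):
    """Euclidean distances between the rows of A and the rows of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def loo_cv_rmse(kernel, X, y):
    """Leave-one-out cross-validation RMSE of an RBF interpolant."""
    errors = []
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        w = np.linalg.solve(kernel(pairwise_dist(X[keep], X[keep])), y[keep])
        pred = kernel(pairwise_dist(X[i:i + 1], X[keep])) @ w
        errors.append(pred[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

def select_model(X, y, candidates=CANDIDATES):
    """Return the candidate with the smallest error, plus all error values."""
    eps = {name: loo_cv_rmse(kernel, X, y) for name, kernel in candidates.items()}
    return min(eps, key=eps.get), eps

# Illustrative data only (assumption).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(20, 2))
y = np.sin(4.0 * X[:, 0]) * np.cos(2.0 * X[:, 1])

best, eps = select_model(X, y)
print(best, eps)
```

Which kernel wins depends on the data and on the fixed widths, so the printed result should not be read as a general ranking.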

Hyper-parameter Optimization
To mitigate the possibility of constructing a suboptimal surrogate model for a
given kernel function, one must perform hyper-parameter optimization.
• Martin et al. (AIAAJ, 2005) used MLE and cross-validation methods to find the optimum
hyper-parameter value for the Gaussian correlation function in Kriging.
• Mongillo et al. (SIAM, 2011) used MLE and leave-one-out cross-validation methods to select
an optimal shape parameter in a Gaussian RBF.
• Gorissen et al. (JMLR, 2009) used the leave-one-out cross-validation and AIC error measures
in the SUMO Toolbox to select the hyper-parameter value(s) through a genetic algorithm.
[Figure: Branin-Hoo function, RBF multiquadric model with different HP values; RMSE vs. shape parameter, σ]
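The effect of the shape parameter can be explored with a simple scan. The sketch below is an assumption-laden illustration: it uses a coarse grid search with leave-one-out cross-validation, not MLE or a genetic algorithm as in the cited works, and the sample size, random seed, grid range, and input scaling to the unit square are arbitrary choices.

```python
import numpy as np

def branin_hoo(X):
    """Branin-Hoo test function on x1 in [-5, 10], x2 in [0, 15]."""
    x1, x2 = X[:, 0], X[:, 1]
    return ((x2 - 5.1 * x1**2 / (4 * np.pi**2) + 5 * x1 / np.pi - 6) ** 2
            + 10 * (1 - 1 / (8 * np.pi)) * np.cos(x1) + 10)

def dist(A, B):
    """Euclidean distances between the rows of A and the rows of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def loo_rmse(X, y, c):
    """Leave-one-out RMSE of a multiquadric RBF interpolant with shape parameter c."""
    errs = []
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        w = np.linalg.solve(np.sqrt(dist(X[keep], X[keep]) ** 2 + c**2), y[keep])
        pred = np.sqrt(dist(X[i:i + 1], X[keep]) ** 2 + c**2) @ w
        errs.append(pred[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errs))))

rng = np.random.default_rng(2)
U = rng.uniform(0.0, 1.0, size=(30, 2))                      # inputs scaled to [0, 1]^2
y = branin_hoo(U * np.array([15.0, 15.0]) + np.array([-5.0, 0.0]))

grid = np.linspace(0.05, 0.5, 10)                            # hypothetical search range
scores = [loo_rmse(U, y, c) for c in grid]
c_best = float(grid[int(np.argmin(scores))])
print(c_best, min(scores))
```

A grid scan costs one full cross-validation per candidate value, which is one reason the cited works use an optimizer instead.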

Research Objectives
▪ The original PEMF-based surrogate model selection method performed
selection at all three levels based on the median and maximum error.
▪ Only models with similar numbers of kernel choices, and kernels with a single
hyper-parameter, were considered.
▪ The objective of this research is to advance the PEMF-based COSMOS:
1. By introducing additional selection criteria: (i) the variance of the surrogate error and
(ii) the predicted error at a greater number of sample points.
2. By modifying the optimization formulation to allow competition among surrogates with
differing numbers of candidate kernels, and kernels with differing numbers of HPs.
3. By testing the COSMOS framework with a comprehensive set of model types and
constituent kernel types: 16 surrogate-kernel combinations with 0 to 2 HPs.
PEMF: Predictive Estimation of Model Fidelity (Mehmani et al., AIAA Scitech 2014)

COSMOS Framework
Pareto Filter
Generally, any two selection criteria, based on user preference, could be
considered simultaneously.
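A Pareto filter over two minimized criteria can be sketched as below. The function name and the five (median error, maximum error) pairs are hypothetical values chosen for illustration; they are not results from this work.

```python
import numpy as np

def pareto_filter(points):
    """Return indices of non-dominated rows when every column is minimized."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        # A row dominates p if it is no worse in every criterion and better in at least one.
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical (median error, maximum error) pairs for five candidate surrogates.
criteria = [(0.05, 0.30), (0.08, 0.20), (0.10, 0.25), (0.12, 0.10), (0.05, 0.35)]
front = pareto_filter(criteria)
print(front)  # [0, 1, 3]
```

Candidates 2 and 4 are dropped because candidates 1 and 0, respectively, are at least as good in both criteria and strictly better in one.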

COSMOS: MATLAB-based GUI
COSMOS MATLAB-based GUI: Courtesy of Ali Mehmani

COSMOS: Optimization Formulation
▪ Separate MINLPs are run in parallel for
each HP class (defined by #HPs involved)
▪ All hyper-parameters (CHP) are scaled to the
range 0 to 1.
▪ The candidate model-kernel combinations
are integer-coded.
▪ A single integer variable (TSK) now identifies
the model-kernel type.
▪ NSGA-II is used to solve the MINLP
problems.
[Figure labels: Hyper-parameter values; Candidate model-kernel combinations; Branin-Hoo function]
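The encoding described above (one integer for the model-kernel type, hyper-parameters scaled to the range 0 to 1, one problem per HP class) can be sketched as a decoding step. The candidate pool and the hyper-parameter bounds below are hypothetical placeholders, not the 16 combinations used in this work, and the NSGA-II search itself is not shown.

```python
# Hypothetical candidate pool, grouped into HP classes by number of hyper-parameters.
# Each entry: (model, kernel, [(lower, upper) bound for each hyper-parameter]).
HP_CLASSES = {
    0: [("RBF", "linear", []), ("RBF", "cubic", [])],
    1: [("RBF", "multiquadric", [(1.0, 3.0)]), ("Kriging", "gaussian", [(0.1, 20.0)])],
}

def decode(hp_class, t_sk, c_hp):
    """Map an integer code t_sk and scaled hyper-parameters c_hp in [0, 1] to a candidate."""
    model, kernel, bounds = HP_CLASSES[hp_class][t_sk]
    if len(c_hp) != len(bounds):
        raise ValueError("number of scaled hyper-parameters must match the HP class")
    if any(not 0.0 <= c <= 1.0 for c in c_hp):
        raise ValueError("scaled hyper-parameters must lie in [0, 1]")
    # Rescale each hyper-parameter from [0, 1] to its own bounds.
    values = [lo + c * (hi - lo) for c, (lo, hi) in zip(c_hp, bounds)]
    return model, kernel, values

print(decode(1, 0, [0.5]))  # ('RBF', 'multiquadric', [2.0])
print(decode(0, 1, []))     # ('RBF', 'cubic', [])
```

Grouping by HP class keeps the number of continuous variables fixed within each problem, so the optimizer only has to handle one integer variable plus a fixed-length vector.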

Surrogate Model Candidates

Predictive Estimation of Model Fidelity (PEMF)
The PEMF method is derived from the hypothesis that the accuracy of
approximation models is related to the amount of data resources
leveraged to train the model.
▪ PEMF can be perceived as a novel sequential implementation of k-fold
cross-validation, with carefully constructed error measures.
▪ The PEMF method analyzes the variation of the model error distribution
with increasing number of training points.
▪ The PEMF method is a model-independent approach for surrogate error
quantification, and does not require any additional test points.
▪ The PEMF method has been shown to be 1-2 orders of magnitude more
accurate in error quantification compared to leave-one-out cross-validation.
Mehmani et al., AIAA SDM 2013, AIAA Scitech 2014, and Aviation, SMO 2014
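The idea of tracking the error distribution as the number of training points grows can be illustrated with a simplified sketch. This is not the PEMF algorithm itself: it uses plain random splits and ignores PEMF's inside/outside region partition, and the stand-in surrogate (a linear-kernel RBF interpolant), the data, and the set sizes are all assumptions.

```python
import numpy as np

def dist(A, B):
    """Euclidean distances between the rows of A and the rows of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def fit_predict(X_tr, y_tr, X_te):
    """Stand-in surrogate: RBF interpolant with the linear kernel psi(r) = r."""
    w = np.linalg.solve(dist(X_tr, X_tr), y_tr)
    return dist(X_te, X_tr) @ w

def median_errors(X, y, n_train, n_splits, rng):
    """Median relative absolute error on held-out points, for several random splits."""
    meds = []
    for _ in range(n_splits):
        idx = rng.permutation(len(X))
        tr, te = idx[:n_train], idx[n_train:]
        pred = fit_predict(X[tr], y[tr], X[te])
        meds.append(np.median(np.abs((y[te] - pred) / y[te])))
    return np.array(meds)

# Illustrative data only (assumption); y stays away from zero so relative errors are defined.
rng = np.random.default_rng(3)
X = rng.uniform(0.0, 1.0, size=(30, 2))
y = 2.0 + np.sin(4.0 * X[:, 0]) * X[:, 1]

# One error sample set per training-set size: intermediate surrogates on growing subsets.
history = {n: median_errors(X, y, n, n_splits=10, rng=rng) for n in (10, 15, 20, 25)}
for n, meds in history.items():
    print(n, float(np.median(meds)))
```

Each entry of `history` is a sample of median errors at one training-set size, which is the kind of data a distribution fit and trend regression would then consume.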

PEMF: Approach
[Figure panels: Median Error; Maximum Error; the final surrogate is marked in each panel]
Using a lognormal distribution at every iteration:
Variation of the modal value of the median/maximum error is represented by
$E = a n^{b}$ or $E = a e^{b n}$
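The two ingredients named here, the mode of a fitted lognormal distribution and a regression of that mode against the number of training points, can be sketched as follows. The power-law form E = a n^b is fitted by least squares in log-log space, which is one possible fitting choice and an assumption here; the numbers are synthetic and for illustration only.

```python
import numpy as np

def lognormal_mode(errors):
    """Mode of a lognormal fitted to positive errors: exp(mu - sigma^2)."""
    logs = np.log(np.asarray(errors, dtype=float))
    return float(np.exp(logs.mean() - logs.var()))

def fit_power_law(n, E):
    """Least-squares fit of E = a * n**b in log-log space; returns (a, b)."""
    b, log_a = np.polyfit(np.log(n), np.log(E), 1)
    return float(np.exp(log_a)), float(b)

# Synthetic modal errors at four training-set sizes (assumption, not data from this work).
n_train = np.array([10.0, 15.0, 20.0, 25.0])
E_mode = 2.0 * n_train ** -0.8

a, b = fit_power_law(n_train, E_mode)
E_final = a * 30.0 ** b            # extrapolated error when all 30 points are used
E_extra = a * (1.2 * 30.0) ** b    # extrapolated error at 20% more sample points
print(a, b, E_final, E_extra)
```

The exponential form E = a e^{bn} can be fitted the same way by regressing log E against n instead of log n.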

PEMF Input-Output
Inputs: model type, kernel type, HP values, and I/O training data
Output: error metrics
All error values are expressed in terms of relative absolute errors.
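As a small worked example of the error metric, the relative absolute error of each test point is |f - f_hat| / |f|, and the median and maximum are taken over the test points. The three values below are made up for illustration.

```python
import numpy as np

def relative_absolute_errors(f_actual, f_surrogate):
    """Pointwise relative absolute error |f - f_hat| / |f| (f must be nonzero)."""
    f_actual = np.asarray(f_actual, dtype=float)
    f_surrogate = np.asarray(f_surrogate, dtype=float)
    return np.abs((f_actual - f_surrogate) / f_actual)

# Made-up actual and surrogate responses at three test points.
rae = relative_absolute_errors([2.0, 4.0, 5.0], [1.0, 5.0, 5.5])
median_error = float(np.median(rae))
max_error = float(rae.max())
print(rae, median_error, max_error)  # [0.5 0.25 0.1] 0.25 0.5
```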

Numerical Experiments
                   Problem Settings           Optimization Settings
Problem            Dimension   Sample Size    Population Size   Max. Generations
Branin Hoo         2           30             20                50
Hartman-6          6           60             40                50
Dixon & Price      30          60             30                30
Neumaier & Perm    20          50             30                30
Airfoil Design     4           30             20                50
▪ Case 1: Minimize modal value of the median error and modal value of
the maximum error.
▪ Case 2: Minimize modal value of the median error and variance of the
median error.
▪ Case 3: Minimize modal value of the median error and the expected
modal value of the median error at 20% more sample points.

COSMOS Results: Benchmark Problems
[Figure: Pareto plots for Branin Hoo (2D) and Hartman-6 (6D). Panels: Med vs. Max, Med vs. Std-dev, Med vs. Med-extra. x-axis: mode of median error; y-axes: mode of maximum error, standard deviation of median error, and mode of median error at 20% more samples. Legend: HP-0, HP-1, HP-2, Pareto.]

COSMOS Results: Benchmark Problems
[Figure: Pareto plots for Dixon & Price (30D) and Neumaier Perm (20D). Legend: HP-0, HP-1, HP-2, Pareto.]

COSMOS Results: Airfoil Problem
$\frac{C_L}{C_D} = f(x_1, x_2, x_3, \alpha)$
High-fidelity samples generated by a Fluent CFD simulation
[Figure: Pareto plots of the selection criteria for the airfoil problem. Legend: HP-0, HP-1, HP-2, Pareto.]

COSMOS Results: Summary
▪ Widely different sets of surrogate models were selected as the
optimum set in the five different problems.
▪ A diverse set of surrogate-kernel combinations is often observed
to provide important trade-offs.
The Best Trade-off Models

Concluding Remarks
▪ A new framework, COSMOS, was developed to select surrogate models
based on criteria driven by user preference (e.g., median or max error).
▪ COSMOS can identify optimal model-kernel combinations from a large
pool of candidates, by using
1. The model-independent error measures given by PEMF, and
2. A novel MINLP formulation.
▪ On applying COSMOS to a suite of benchmark test problems, we found:
1. The same surrogate-kernel combination can yield a noticeable spread of best trade-offs
(at different HP values);
2. Diverse surrogate models often constitute the set of best trade-off models.
▪ These initial tests readily exhibit the need for such frameworks for
automated selection of globally competitive surrogates.
▪ Future research directions: Application to more complex practical
problems, and a smarter a priori sorting of the model-kernel candidates.

Questions and Comments
Thank you

COSMOS: MATLAB-based GUI
To contribute models or test problems, or to try out COSMOS,
please contact chowdhury@bagley.msstate.edu

Surrogate-Kernel Combinations

Predictive Estimation of Model Fidelity (PEMF)
Inside and outside sets: $X = X_{in} + X_{out}$
Training set: $X^{t}_{TR} = X_{out} + \{\beta_k\}$, where $\{\beta_k\}$ is the kth subset of inside-region sample points
Test set: $X^{t}_{TE} = X - \{X^{t}_{TR}\}$
Median of RAEs: $\varepsilon = \mathrm{med}\left( \left| \frac{f_i - \hat{f}_i}{f_i} \right| \right), \quad i = 1, \ldots, \#\{X_{TE}\}$, where $f_i$ is the actual model and $\hat{f}_i$ is the intermediate surrogate model
A chi-square, $\chi^2$, goodness-of-fit criterion: $\chi^2 = \sum_{i=1}^{m} \frac{(o_i - t_i)^2}{t_i}$
$F(n_t) = a_0 (n_t)^{-a_1}$ OR $F(n_t) = a_0 e^{-a_1 n_t}$
[Figure: Branin-Hoo function (RBF). Mode of median error (Mo_med) vs. number of training points (t1 to t4) over iterations It. 1 to It. 4, with the predicted value; median error and mean error vs. number of training points.]
• Model Based Systems Design
• Integrative Modeling and Design of Wind Farms
• Energy-Sustainable Smart Buildings
• Reconfigurable Unmanned Aerial Vehicles (UAV)
We randomly divide the set of sample points into intermediate sets of
1. Training points and
2. Test points

Comparing Computational Time
The one-step method requires about 1/7th of the time to
search for optimal models.
