Data Driven Process Optimization Using Real-Coded Genetic Algorithms (lecture slides by Prof. Chyi-Tsong Chen)
1. Development of Data Driven Techniques for Process Optimization Using Real-Coded Genetic Algorithms
陳奇中
Chyi-Tsong Chen
ctchen@fcu.edu.tw
逢甲大學化工系
Dept. of Chem. Eng., Feng Chia Univ.
2. Outline
Introduction - evolution in biology
What is genetic algorithm (GA)?
Optimization using RCGA (Real-coded GA)
A. Single Objective (Global optimal)
B. Multi-objective (Pareto front)
Data Driven Techniques Using RCGA
A. Single objective
B. Multi-objective
Application to the optimal design of MOCVD
processes
Conclusions
3. Introduction:
Evolution in biology
IMG from http://www.geo.au.dk/besoegsservice/foredrag/evolution/
4. Evolution in biology - I
Organisms produce a number of offspring similar
to themselves but can have variations due to:
(a) Sexual reproduction
Parents → offspring
IMG from http://www.tulane.edu/~wiser/protozoology/notes/images/ciliate.gif
Ref. :http://www.cas.mcmaster.ca/~cs777/presentations/3_GO_Olesya_Genetic_Algorithms.pdf
5. Evolution in biology - I
Organisms produce a number of offspring similar
to themselves but can have variations due to:
(b) Mutations (Random changes in the DNA sequence)
Before After
IMG from http://offers.genetree.com/landing/images/mutation.png
IMG from http://www.tulane.edu/~wiser/protozoology/notes/images/ciliate.gif
Ref. :http://www.cas.mcmaster.ca/~cs777/presentations/3_GO_Olesya_Genetic_Algorithms.pdf
6. Evolution in biology - II
Some offspring survive, and produce next
generations, and some don’t:
Ugobe Inc. Pleo
http://www.ugobe.com/Home.aspx
Ref. :http://www.cas.mcmaster.ca/~cs777/presentations/3_GO_Olesya_Genetic_Algorithms.pdf
7. What is genetic algorithm (GA)?
GA is a particular class of evolutionary algorithm
Initially developed by Prof. John Holland
"Adaptation in Natural and Artificial Systems", University of Michigan Press, 1975
Based on Darwin’s theory of evolution
“Natural Selection” & “Survival of the fittest”
(Natural selection: the fittest survive, the unfit are eliminated)
Imitate the mechanism of biological evolution
- Reproduction
- Crossover
- Mutation
8. Advantages of GA
GA can be regarded as a search method from
multiple directions – reproduction, crossover, mutation
Provide efficient techniques to search for optimal solutions of
optimization problems that
- are discontinuous
- are highly nonlinear
- are stochastic
- have unreliable or undefined derivatives
Provide solutions for highly complex search space
Have superior performance over traditional optimization
techniques, e.g., the gradient descent method.
9. Traditional GA
All variables of interest must be encoded as binary
digits (genes) forming a string (chromosome).
Gene – a single encoding of part of the solution space.
Chromosome – a string of genes that represent a solution.
Example: a gene is a single bit (e.g., 1); a chromosome is a bit string (e.g., 1 1 0 1 0).
IMG from http://static.howstuffworks.com/gif/cell-dna.jpg
10. Real-coded GA (RCGA)
All genes in a chromosome are real numbers:
- suitable for most systems
- genes keep their real values directly during genetic operations
- chromosomes are shorter than in the binary-coded GA, so the
operations can be performed easily
Example: a gene is a real number (e.g., 1.1); a chromosome is a real vector (e.g., 1.1 0.1 15 10 0.12).
IMG from http://static.howstuffworks.com/gif/cell-dna.jpg
11. Notations of RCGA (Chen et al., 2008)
Θ = [θ₁ θ₂ ⋯ θ_m] is a solution set (chromosome) of the
optimization problem.
θ_i is called a gene, i ∈ m̄, where m̄ = {1, 2, …, m}.
The admissible parameter space for Θ is defined as
Ω_Θ = { Θ ∈ ℝ^m | θ_{i,min} ≤ θ_i ≤ θ_{i,max}, i = 1, 2, …, m }
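This notation maps naturally onto arrays. A minimal sketch (function and variable names are my own, not from the slides): each chromosome is a real vector of length m, and a population of N chromosomes is an N×m matrix sampled uniformly from Ω_Θ.

```python
import numpy as np

def init_population(n_pop, lower, upper, rng):
    """Sample n_pop chromosomes uniformly from the box Omega_Theta."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Each row is one chromosome Theta = [theta_1 ... theta_m].
    return rng.uniform(lower, upper, size=(n_pop, lower.size))

pop = init_population(100, [-3.0, -3.0], [3.0, 3.0], np.random.default_rng(0))
```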
12. Procedure of RCGA (Chen et al., 2008)
Reproduction (tournament selection)
Discard the Pr × N chromosomes with the maximum objective values
and add copies of the Pr × N chromosomes with the minimum objective values.
Example (Pr = 0.5, N = 4): sorted by objective value, the population has
objectives [0.1, 0.2, 0.3, 0.4]; the two worst chromosomes (0.3, 0.4) are
discarded and the two best (0.1, 0.2) are duplicated, so the new population
has objectives [0.1, 0.2, 0.1, 0.2].
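The reproduction step above can be sketched directly (a minimal illustration, not the authors' code): sort by objective value, then overwrite the worst Pr·N rows with copies of the best Pr·N rows.

```python
import numpy as np

def reproduction(pop, obj, pr):
    """Replace the pr*N worst chromosomes with copies of the pr*N best."""
    order = np.argsort(obj)          # ascending: best (smallest) first
    pop, obj = pop[order], obj[order]
    k = int(pr * len(pop))
    if k > 0:
        pop[-k:] = pop[:k]           # worst k rows overwritten by best k
        obj[-k:] = obj[:k]
    return pop, obj

pop = np.array([[3.0], [1.0], [4.0], [2.0]])
obj = np.array([0.3, 0.1, 0.4, 0.2])
new_pop, new_obj = reproduction(pop, obj, 0.5)
# new objectives: [0.1, 0.2, 0.1, 0.2], as in the slide's example
```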
13. Procedure of RCGA (Chen et al., 2008)
Crossover
Divide the chromosomes into N/2 pairs, which serve as parents.
Suppose that Θ1 and Θ2 are the parents of a given pair.
Example (N = 4): after sorting, the population is divided into two pairs,
(Θ2, Θ1) and (Θ4, Θ3).
14. Procedure of RCGA (Chen et al., 2008)
Crossover
For each pair, generate a random number c ∈ [0, 1]; the pair undergoes
crossover when c > Pc.
If obj(Θ1) < obj(Θ2):
  Θ1 ← Θ1 + r (Θ1 − Θ2)
  Θ2 ← Θ2 + r (Θ1 − Θ2)
else:
  Θ1 ← Θ1 + r (Θ2 − Θ1)
  Θ2 ← Θ2 + r (Θ2 − Θ1)
with the controlled step size
  r = | obj(Θ1) − obj(Θ2) | / ( max(obj(Θ)) − min(obj(Θ)) )
so both offspring step along the direction of the better parent.
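A sketch of this direction-based crossover for one pair (an illustration under my reading of the slide: r is the normalized absolute objective difference, and the branch on obj(Θ1) < obj(Θ2) picks the direction of the better parent):

```python
import numpy as np

def crossover_pair(t1, t2, o1, o2, obj_max, obj_min):
    """Direction-based crossover: both offspring step toward the better parent."""
    r = abs(o1 - o2) / (obj_max - obj_min)    # controlled step size
    d = (t1 - t2) if o1 < o2 else (t2 - t1)   # direction of the better parent
    return t1 + r * d, t2 + r * d

t1, t2 = np.array([0.0, 0.0]), np.array([1.0, 1.0])
c1, c2 = crossover_pair(t1, t2, 0.0, 1.0, 1.0, 0.0)
# r = 1 and the direction is t1 - t2, so c1 = [-1, -1] and c2 = [0, 0]
```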
15. Procedure of RCGA (Chen et al., 2008)
Mutation
Randomly select Pm × N chromosomes in the current population
(example: Pm = 0.5) and perturb each of them:
  Θ ← Θ + s × Φ
where Φ ∈ ℝ^m is a random vector and s (> 0) is the mutation size.
If a mutated chromosome falls outside the search space Ω_Θ, the
chromosome is bounded (clipped) by Ω_Θ.
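The mutation step with bound handling can be sketched as follows (a minimal illustration; the slide does not specify the distribution of Φ, so a Gaussian vector is an assumption of mine):

```python
import numpy as np

def mutate(pop, pm, s, lower, upper, rng):
    """Perturb pm*N randomly chosen chromosomes, clipping back into Omega_Theta."""
    n, m = pop.shape
    idx = rng.choice(n, size=int(pm * n), replace=False)
    phi = rng.standard_normal((len(idx), m))   # random vector Phi (assumed Gaussian)
    pop[idx] = np.clip(pop[idx] + s * phi, lower, upper)
    return pop

pop = np.full((4, 2), 2.9)
pop = mutate(pop, 0.5, 1.0, -3.0, 3.0, np.random.default_rng(1))
```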
16. Procedure of RCGA (Chen et al., 2008)
Step 1. Generate a population of N chromosomes from Ω Θ .
Step 2. Evaluate the corresponding objective function value for each
chromosome in the population.
Step 3. If the pre-specified number of generations, G , is reached, or
max ( obj ( Θ ) ) − min ( obj ( Θ ) ) ≤ ε , then stop.
Step 4. Perform operations of reproduction, crossover, and mutation.
Notice that if the objective function value of an offspring chromosome is
larger than that of its parent chromosome, the parent chromosome is
retained in this generation.
Step 5. Go back to Step 2.
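Steps 1 to 5 can be assembled into one loop. This is a compact sketch of my own (not the authors' code, and with simplified operators): reproduction by sorting, direction-based crossover on adjacent pairs, Gaussian mutation with clipping, and the per-chromosome elitism of Step 4.

```python
import numpy as np

def rcga_minimize(f, lower, upper, n_pop=40, n_gen=60, pr=0.2, pc=0.3,
                  pm=0.3, s=0.3, eps=1e-4, seed=0):
    """Sketch of the RCGA loop (Steps 1-5); a parent beats a worse offspring."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    pop = rng.uniform(lower, upper, (n_pop, lower.size))           # Step 1
    for _ in range(n_gen):
        obj = np.array([f(c) for c in pop])                        # Step 2
        if obj.max() - obj.min() <= eps:                           # Step 3
            break
        order = np.argsort(obj)                                    # reproduction
        pop, obj = pop[order], obj[order]
        k = int(pr * n_pop)
        pop[-k:], obj[-k:] = pop[:k], obj[:k]
        parents, parent_obj = pop.copy(), obj.copy()
        spread = obj.max() - obj.min() + 1e-12
        for i in range(0, n_pop - 1, 2):                           # crossover
            if rng.uniform() > pc:
                a, b = pop[i], pop[i + 1]
                r = abs(obj[i] - obj[i + 1]) / spread
                d = (a - b) if obj[i] < obj[i + 1] else (b - a)
                pop[i], pop[i + 1] = a + r * d, b + r * d
        idx = rng.choice(n_pop, int(pm * n_pop), replace=False)    # mutation
        pop[idx] += s * rng.standard_normal((len(idx), lower.size))
        pop = np.clip(pop, lower, upper)
        new_obj = np.array([f(c) for c in pop])                    # elitism
        worse = new_obj > parent_obj
        pop[worse] = parents[worse]
    obj = np.array([f(c) for c in pop])
    i = int(obj.argmin())
    return pop[i], float(obj[i])

x_best, f_best = rcga_minimize(lambda x: float(np.sum(x ** 2)), [-3, -3], [3, 3])
```

On a simple sphere function the loop converges toward the origin; maximization problems can be handled by minimizing the negated objective.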
17. Methods Comparison
                    The proposed (Chen et al., 2008)       Deb et al., 2000          Chang, 2007
Initial population  Sobol (quasi-random)                   Random                    Random
Reproduction        Tournament selection                   Tournament selection      Tournament selection
Crossover           N/2 pairs by sorting on objective      Random pair;              Random pair;
                    value; direction-based;                simulated binary          direction-based;
                    controlled step size                   crossover (SBX)           random step size
Mutation            Quadratic-decay                        Polynomial-type           Random
18. Global optimization using RCGA:
Single-objective
min_x f(x)                       single-objective function
s.t.
  c(x) ≤ 0, c_eq(x) = 0          nonlinear constraints
  A x ≤ b, A_eq x = b_eq         linear constraints
  x_L ≤ x ≤ x_U                  variable bounds
19. Benchmark test 1:
De Jong function, 1975
max F = 3905.93 − 100 (x1² − x2)² − (1 − x1)²
s.t.
  −3 ≤ xi ≤ 3, i = 1, 2
Global optimal solution:
  x = (1, 1), F = 3905.93
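The function is easy to verify numerically (a sketch; the function name is mine): both penalty terms vanish at (1, 1), leaving the constant 3905.93.

```python
def de_jong(x1, x2):
    """De Jong (1975) test function from the slide; maximized at (1, 1)."""
    return 3905.93 - 100.0 * (x1 ** 2 - x2) ** 2 - (1.0 - x1) ** 2

F = de_jong(1.0, 1.0)  # -> 3905.93, the global maximum
```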
20. Benchmark test 1: De Jong function, 1975
Results (N = 100, Pr = 0.2, Pc = 0.3, Pm = 0.3, ε = 1e-4, runs = 300):
Method              Avg. iteration no.    Avg. time (s)
The proposed        13.7633               0.11814
Deb et al., 2000    15.4733               0.13213
Chang, 2007         15.31333              0.13130
27. Optimization using RCGA:
Multi-objective
min_x f1(x), f2(x), …, f_M(x)    multiple objectives
s.t.
  c(x) ≤ 0, c_eq(x) = 0          nonlinear constraints
  A x ≤ b, A_eq x = b_eq         linear constraints
  x_L ≤ x ≤ x_U                  variable bounds
29. Concept of Pareto-optimal
solutions : non-dominated (Goldberg, 1989)
For points A, B, C, D, E in the (f1, f2) objective space:
- B dominates A; C dominates A
- B and C are non-dominated with respect to each other
- D dominates A and B
- E dominates A, B, and C
- D and E are non-dominated with respect to each other
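The dominance relation itself is a two-line predicate (a sketch for minimization; point values are invented to match the slide's qualitative picture):

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization): a is no worse
    in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Matching the slide: B dominates A, but B and C are mutually non-dominated.
A, B, C = (3.0, 3.0), (1.0, 2.0), (2.0, 1.0)
```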
30. How does multi-objective
optimization work?
Combine the N parents with the N offspring produced by the RCGA,
perform non-dominated sorting of the combined population into fronts
(Front 1, Front 2, Front 3, …), sort the members of each front by
crowding distance, and fill the new population of size N front by front;
the remaining chromosomes are rejected.
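The crowding-distance sorting used above can be sketched as in NSGA-II (my own minimal implementation, not the authors' code): boundary points of a front get infinite distance, interior points accumulate the normalized gap between their neighbors in each objective.

```python
import numpy as np

def crowding_distance(front):
    """Crowding distance of each point in one front (rows = points, cols = objectives)."""
    front = np.asarray(front, dtype=float)
    n, m = front.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(front[:, j])
        span = front[order[-1], j] - front[order[0], j]
        dist[order[0]] = dist[order[-1]] = np.inf   # boundary points are kept
        if span > 0:
            for k in range(1, n - 1):
                dist[order[k]] += (front[order[k + 1], j] -
                                   front[order[k - 1], j]) / span
    return dist

cd = crowding_distance([[0.0, 2.0], [1.0, 1.0], [2.0, 0.0]])
# boundary points get inf; the middle point gets (2-0)/2 + (2-0)/2 = 2
```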
31. How to extend RCGA to multi-
objective optimization problems
Crossover for min J1, J2, …, J_M:
For each objective i, generate a candidate pair
  if J_i(Θ1) < J_i(Θ2):  Θ1^i ← Θ1 + r (Θ1 − Θ2),  Θ2^i ← Θ2 + r (Θ1 − Θ2)
  else:                  Θ1^i ← Θ1 + r (Θ2 − Θ1),  Θ2^i ← Θ2 + r (Θ2 − Θ1)
with the weights
  ω_i = | J_i(Θ1) − J_i(Θ2) | / Σ_{i=1}^{M} | J_i(Θ1) − J_i(Θ2) |
The offspring are the weighted combinations
  Θ1 = Σ_{i=1}^{M} ω_i Θ1^i,   Θ2 = Σ_{i=1}^{M} ω_i Θ2^i
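A sketch of this multi-direction crossover (an illustration under assumptions: the slide does not define the per-objective step size, so r is set to 1 here, and all names are mine):

```python
import numpy as np

def multi_direction_crossover(t1, t2, J1, J2):
    """Blend per-objective direction steps; J1[i], J2[i] are the i-th
    objective values of parents t1, t2. The per-objective step size r
    is assumed equal to 1 for this two-parent illustration."""
    diffs = np.abs(np.asarray(J1) - np.asarray(J2))
    w = diffs / diffs.sum()                 # weights omega_i
    c1 = np.zeros_like(t1, dtype=float)
    c2 = np.zeros_like(t2, dtype=float)
    for i, wi in enumerate(w):
        r = 1.0                             # assumed step size
        d = (t1 - t2) if J1[i] < J2[i] else (t2 - t1)
        c1 += wi * (t1 + r * d)             # candidate Theta_1^i
        c2 += wi * (t2 + r * d)             # candidate Theta_2^i
    return c1, c2

t1, t2 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
c1, c2 = multi_direction_crossover(t1, t2, J1=[0.0, 2.0], J2=[2.0, 0.0])
# the two objectives pull in opposite directions with equal weight,
# so the offspring coincide with the parents here
```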
32. Methods Comparison
                    The proposed               NSGA-II (Deb et al., 2000)
Initial population  Sobol (quasi-random)       Random
Reproduction        Tournament selection       Tournament selection
Crossover           N/2 pairs by sorting       Random pair;
                    on crowding distance;      simulated binary
                    multi-direction based;     crossover (SBX)
                    controlled step size
Mutation            Quadratic-decay            Polynomial-type
33. Benchmark test 1:
FON function
f1(x) = 1 − exp( −Σ_{i=1}^{3} (xi − 1/√3)² )
f2(x) = 1 − exp( −Σ_{i=1}^{3} (xi + 1/√3)² )
s.t.
  −π ≤ xi ≤ π
Optimal solutions:
  x1 = x2 = x3 ∈ [ −1/√3, 1/√3 ]
RCGA parameters: N = 100, Pc = 0.1, Pm = 0.1
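The FON objectives are straightforward to evaluate (a sketch; the function name is mine). At the Pareto-set endpoint x1 = x2 = x3 = 1/√3 the first objective vanishes exactly:

```python
import numpy as np

def fon(x):
    """FON test objectives for a 3-dimensional decision vector x."""
    x = np.asarray(x, dtype=float)
    f1 = 1.0 - np.exp(-np.sum((x - 1.0 / np.sqrt(3.0)) ** 2))
    f2 = 1.0 - np.exp(-np.sum((x + 1.0 / np.sqrt(3.0)) ** 2))
    return f1, f2

f1, f2 = fon(np.full(3, 1.0 / np.sqrt(3.0)))  # f1 -> 0 on this Pareto endpoint
```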
34. Results: Pareto fronts obtained by the proposed method (after 20 iterations) and NSGA-II (after 50 iterations).
35. Benchmark test 2:
KUR function
f1(x) = Σ_{i=1}^{n−1} [ −10 exp( −0.2 √(xi² + x_{i+1}²) ) ]
f2(x) = Σ_{i=1}^{n} [ |xi|^0.8 + 5 sin(xi³) ]
s.t.
  −5 ≤ xi ≤ 5
RCGA parameters: N = 100, Pc = 0.1, Pm = 0.1
36. Results: Pareto fronts obtained by the proposed method (after 60 iterations) and NSGA-II (after 150 iterations).
37. Benchmark test 3:
ZDT6 function
f1(x) = 1 − exp(−4 x1) sin⁶(6π x1)
f2(x) = g(x) ( 1 − √( x1 / g(x) ) )
g(x) = 1 + 10 (n − 1) + Σ_{i=2}^{n} ( xi² − 10 cos(4π xi) )
s.t.
  0 ≤ xi ≤ 1, i = 1, 2, …, 10
RCGA parameters: N = 100, Pc = 0.1, Pm = 0.1
38. Results: Pareto fronts obtained by the proposed method (after 200 iterations) and NSGA-II (after 500 iterations).
39. Data Driven Techniques Using
RCGA
Single-objective process optimization
The process maps inputs x1, x2, …, xn to outputs y1, y2, …, ym.
  min_{xi} f(y)
  s.t. x_{i,min} ≤ xi ≤ x_{i,max}
40. Multi-objective optimization
The process maps inputs x1, x2, …, xn to outputs y1, y2, …, ym.
  min_{xi} f1(y), f2(y), …, f_M(y)
  s.t. x_{i,min} ≤ xi ≤ x_{i,max}, i = 1, 2, …, n
41. Data Driven Flow Chart
1. Initialize settings; runs = 0.
2. Generate a group of designs of experiments.
3. Train a model by a neural network algorithm.
4. Search the optimal design parameters by RCGA.
5. Calculate the objective function value.
6. If the goal is reached, stop; otherwise add the result into the
neural network model, set runs = runs + 1, and return to step 3.
42. Data Driven Flow Chart
RCGA search step: generate N chromosomes

  Θ = [ θ_{1,1} θ_{1,2} ⋯ θ_{1,n}
        θ_{2,1} θ_{2,2} ⋯ θ_{2,n}
          ⋮       ⋮     ⋱   ⋮
        θ_{N,1} θ_{N,2} ⋯ θ_{N,n} ]

and generate the corresponding objective function values

  obj(Θ) = [ obj_1 obj_2 ⋯ obj_N ]^T
43. Data Driven Flow Chart
NN training step: train a feed-forward neural network model until the
MSE index < 1e-3.
44. Data Driven Flow Chart
Single-objective: apply the direction-based RCGA to search the optimal
solution according to the current NN model.
Multi-objective: apply the multi-direction RCGA to search the optimal
Pareto front according to the current NN model.
45. Data Driven Flow Chart
Single-objective: calculate the objective function value of the solution
searched by RCGA.
Multi-objective: pick the first p points from the Pareto front, sorted by
crowding distance, and then generate the corresponding objective
function value(s).
46. Data Driven Flow Chart
Calculate the performance index:
Single-objective:
  MSE = (1/p) Σ_{j=1}^{p} ( ŷ_j − y_j )²
Multi-objective:
  MSE = 1/(M p) Σ_{i=1}^{M} Σ_{j=1}^{p} ( ŷ_{i,j} − y_{i,j} )²
where ŷ is predicted from the NN model and y is calculated from the
process.
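Both indices reduce to a mean of squared prediction errors (a sketch; function names are mine):

```python
import numpy as np

def mse_single(y_hat, y):
    """Single-objective index: mean squared NN prediction error over p points."""
    y_hat, y = np.asarray(y_hat, float), np.asarray(y, float)
    return float(np.mean((y_hat - y) ** 2))

def mse_multi(Y_hat, Y):
    """Multi-objective index: average over M objectives and p points."""
    Y_hat, Y = np.asarray(Y_hat, float), np.asarray(Y, float)
    return float(np.mean((Y_hat - Y) ** 2))

e1 = mse_single([1.0, 2.0], [1.0, 4.0])        # (0 + 4) / 2 = 2.0
e2 = mse_multi([[1.0, 2.0]], [[1.0, 4.0]])     # same data, same index
```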
47. Data Driven Flow Chart
(The complete data driven flow chart, repeated from slide 41.)
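The whole loop can be sketched with simple stand-ins of my own: a quadratic least-squares fit in place of the feed-forward NN, a grid search in place of the RCGA, and an invented one-variable "process". The structure, DOE → train surrogate → optimize surrogate → evaluate process → augment data, is the same as in the flow chart.

```python
import numpy as np

def process(x):
    """Hypothetical 'true' process response (stands in for the real plant)."""
    return (x - 2.0) ** 2

# 1) Initial design of experiments
X = np.array([0.0, 1.0, 3.0, 4.0])
Y = process(X)

for run in range(5):
    # 2) Train a surrogate model (quadratic fit stands in for the NN)
    coef = np.polyfit(X, Y, deg=2)
    # 3) Optimize the surrogate (grid search stands in for RCGA)
    grid = np.linspace(0.0, 4.0, 401)
    x_opt = grid[np.argmin(np.polyval(coef, grid))]
    # 4) Evaluate the real process and check the goal
    y_opt = process(x_opt)
    if y_opt < 1e-6:
        break
    # 5) Augment the data set and retrain
    X, Y = np.append(X, x_opt), np.append(Y, y_opt)
```

Because the surrogate here is exact for a quadratic process, the loop finds the true optimum on the first pass; with a real plant and an NN model, several runs are typically needed, which is what the `runs` counter in the flow chart tracks.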
48. Data Driven test 1: Single-objective
min F = 3 (1 − x1)² exp( −x1² − (x2 + 1)² )
        − 10 ( x1/5 − x1³ − x2⁵ ) exp( −x1² − x2² )
        − (1/3) exp( −(x1 + 1)² − x2² )          (MATLAB peaks function)
s.t.
  −3 ≤ xi ≤ 3, i = 1, 2
Global optimal solution:
  x = (0.2281, −1.6255), F = −6.5511
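The peaks surface is easy to reproduce for checking the reported optimum (a sketch; the function name follows MATLAB's):

```python
import numpy as np

def peaks(x1, x2):
    """MATLAB 'peaks' surface used as the single-objective test function."""
    return (3.0 * (1 - x1) ** 2 * np.exp(-x1 ** 2 - (x2 + 1) ** 2)
            - 10.0 * (x1 / 5.0 - x1 ** 3 - x2 ** 5) * np.exp(-x1 ** 2 - x2 ** 2)
            - (1.0 / 3.0) * np.exp(-(x1 + 1) ** 2 - x2 ** 2))

F = peaks(0.2281, -1.6255)  # close to the global minimum -6.5511
```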
53. Results (after 20 iterations)
Optimal solution: x1 = 0.2282, x2 = −1.6255
Predicted F = −6.5513
cf. global optimal solution x = (0.2281, −1.6255), F = −6.5511
54. Application to the optimal design
of an MOCVD reactor
AIXTRON AIX200/4
The schematic of horizontal MOCVD reactor
(top: 3D view; bottom: 2D side view).
55. Objective function
Susceptor temperature: 600 K ~ 1200 K
Total flow rate: 10000 sccm ~ 15000 sccm
Pressure: 8 kPa ~ 15 kPa
Objective function:
  J = (α/2) (1/GR̄_n)² + (β/2) (δ_n)² + (1/2) (1 − GR̄_f)² + (1/2) (1 − δ_f)²
where
  GR̄ = (1/A) ∫ GR dA,   δ = ∫ |GR − GR̄| / GR̄ dA
57. Susceptor temperature: 600 K ~ 1200 K
Total flow rate: 10000 sccm ~ 15000 sccm
Pressure: 8 kPa ~ 15 kPa
Before: GR̄ = (1/A) ∫ GR dA = 8.65 × 10⁻⁹ m/min,  δ = 0.003504
After:  GR̄ = (1/A) ∫ GR dA = 11.65 × 10⁻⁹ m/min, δ = 0.00220
58. Data Driven Test 2: Multi-objective
CONSTR function
  f1(x) = x1
  f2(x) = (1 + x2) / x1
s.t.
  −20 ≤ xi ≤ 20, i = 1, 2
  g1(x) = x2 + 9 x1 ≥ 6
  g2(x) = −x2 + 9 x1 ≥ 1
Pareto-optimal solutions:
  A: 0.39 ≤ x1 ≤ 0.67 ⇒ x2 = 6 − 9 x1
  B: 0.67 ≤ x1 ≤ 1 ⇒ x2 = 0
RCGA parameters: N = 100, Pc = 0.1, Pm = 0.1
63. Multi-objective design of
Horizontal MOCVD process
AIXTRON AIX200/4
The schematic of horizontal MOCVD reactor
(top: 3D view; bottom: 2D side view).
64. Objective functions
Susceptor temperature: 833 K ~ 1033 K
Total flow rate: 13000 sccm ~ 20000 sccm
Pressure: 10 kPa ~ 100 kPa
Growth of GaAs film on a 3-inch substrate
Objective functions:
  GR̄ = (1/A) ∫ GR dA
  δ = (1/A) ∫ (GR − GR̄)² dA
65. Design of experiments -
Taguchi method
L25(5⁶) orthogonal array
Factors: susceptor temperature T (K), total flow rate U (sccm),
pressure P (kPa); five levels for each factor.
Variable   Level 1   Level 2   Level 3   Level 4   Level 5
T          833       883       933       983       1033
P          10        25        50        75        100
U          13000     14000     16000     18000     20000
75. Optimal Pareto-front
solutions of the MOCVD
[Plot: uniformity index vs. growth rate (nm/min); the Pareto front with labeled designs A, B, C, D.]
76. Case A. the best uniformity
(min. of δ )
Operating conditions:
T= 883 K
P=10 kPa
U= 13000 sccm
Performance:
GR = 3.461 (nm/ min)
δ = 0.00409
77. Case B. the max. growth
rate (min. − GR )
Operating conditions:
T= 957.9659 K
P=10 kPa
U= 20000 sccm
Performance:
GR = 44.346 (nm/ min)
δ = 82.059
78. Case C. min. J = −GR + δ
Operating conditions:
T= 935.82 K
P=18.87 kPa
U= 20000 sccm
Performance:
GR = 34.852 (nm/ min)
δ = 3.245
79. Case D. min. J = −GR + 10δ
Operating conditions:
T= 917.81 K
P=16.23 kPa
U= 20000 sccm
Performance:
GR = 29.401 (nm/ min)
δ = 1.024
80. Conclusions
An efficient global optimization scheme
using a real-coded genetic algorithm
has been proposed.
Effective data driven techniques for
single objective and multi-objective
optimal process design have been
developed.
The proposed schemes have been tested
successfully on the optimal design of
MOCVD processes.