Understanding of Backpropagation
Kang, Min-Guk
[Figure: the example network used throughout. Inputs $X_1$ and $X_2$ feed units 1 and 2 through $w_{in,1}$ and $w_{in,2}$; units 1 and 2 feed the hidden units 3, 4, 5 through $w_{13}, w_{14}, w_{15}, w_{23}, w_{24}, w_{25}$; units 3, 4, 5 feed the output unit 6 through $w_{36}, w_{46}, w_{56}$. Each unit $i$ holds a pre-activation and activation pair $z^{(i)} \mid a^{(i)}$.]
1. What is Backpropagation?
1. Definition
Backpropagation is an algorithm for supervised learning of artificial neural networks using gradient descent.
2. History
The backpropagation algorithm was developed in the 1970s, but in 1986 Rumelhart, Hinton, and Williams
showed experimentally that this method can generate useful internal representations of incoming data in
the hidden layers of neural networks.
3. How to use Backpropagation?
Backpropagation is nothing more than a repeated application of simple chain rules. However, because we usually use
non-linear activation functions, writing out the derivatives takes some care.
(In this presentation I will use the sigmoid function as the activation function.)
2. Preparations
1. Cost function (loss function)
I will use the cost function below ($y_a$ is the value of the hypothesis, i.e. the network output, and $y_t$ is the true value):

$$C = \frac{1}{2}(y_a - y_t)^2$$
2. Derivative of the sigmoid function

$$\frac{dS(z)}{dz} = \frac{1}{(1+e^{-z})^2} \times (-1) \times (-e^{-z}) = S(z)\bigl(1 - S(z)\bigr)$$

$$\left(\because \text{ Sigmoid function } S(z) = \frac{1}{1+e^{-z}}, \qquad \left(\frac{1}{g(z)}\right)' = -\frac{g'(z)}{g(z)^2}\right)$$
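As a quick numerical sanity check (my own sketch, not part of the original slides), the identity $S'(z) = S(z)(1 - S(z))$ can be compared against a finite-difference estimate:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # S'(z) = S(z) * (1 - S(z)), as derived above
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.linspace(-5, 5, 11)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)  # central difference
print(np.max(np.abs(numeric - sigmoid_prime(z))))            # should be very close to zero
```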
3. How to update weights using gradient descent

$$W_{i \to j,\,new} = W_{i \to j,\,old} - \eta\,\frac{\partial C}{\partial W_{i \to j,\,old}} \qquad (\eta \text{ is the learning rate})$$
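As a tiny illustration of this rule (a sketch, not from the slides; `grad` stands for the partial derivative $\partial C / \partial W$ that backpropagation will give us):

```python
def gradient_descent_step(w_old, grad, eta=0.1):
    # W_new = W_old - eta * dC/dW_old
    return w_old - eta * grad
```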
3. Jump to the Backpropagation
1. Derivative relationships between weights
1-1. The weight update depends on derivatives that reside in the previous layers. (The word "previous" here means the layer located to the right, i.e. closer to the output.)
$$C = \frac{1}{2}(y_a - y_t)^2
\;\;\rightarrow\;\;
\frac{\partial C}{\partial W_{2,3}} = (y_a - y_t) \times \frac{\partial y_a}{\partial W_{2,3}}
= (y_a - y_t) \times \frac{\partial}{\partial W_{2,3}}\bigl[\sigma\{z^{(3)}\}\bigr] \qquad (\sigma \text{ is the sigmoid function})$$

$$\frac{\partial C}{\partial W_{2,3}} = (y_a - y_t) \times \sigma\{z^{(3)}\} \times \bigl[1 - \sigma\{z^{(3)}\}\bigr] \times \frac{\partial z^{(3)}}{\partial W_{2,3}}
= (y_a - y_t) \times \sigma\{z^{(3)}\} \times \bigl[1 - \sigma\{z^{(3)}\}\bigr] \times \frac{\partial}{\partial W_{2,3}}\bigl(a^{(2)}\, w_{2,3}\bigr)$$

$$\therefore\;\; \frac{\partial C}{\partial W_{2,3}} = (y_a - y_t)\,\sigma\{z^{(3)}\}\bigl[1 - \sigma\{z^{(3)}\}\bigr]\, a^{(2)}$$
[Figure: a three-unit chain $X_{in} \xrightarrow{w_{in,1}} 1 \xrightarrow{w_{1,2}} 2 \xrightarrow{w_{2,3}} 3$, each unit holding $z^{(i)} \mid a^{(i)}$, with $a^{(3)} = y_a$; feed-forward runs left to right, backpropagation right to left.]
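To make the result concrete, here is a hedged Python sketch of this chain network (all weight values and inputs are my own illustration, not from the slides). It computes $\partial C / \partial W_{2,3}$ with the closed-form expression above and checks it against a finite difference:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_in1, w12, w23):
    # chain network: X_in -> 1 -> 2 -> 3, sigmoid at every unit
    a1 = sigmoid(w_in1 * x)   # z(1) = w_in,1 * X_in
    a2 = sigmoid(w12 * a1)    # z(2) = w_1,2 * a(1)
    a3 = sigmoid(w23 * a2)    # z(3) = w_2,3 * a(2),  a(3) = y_a
    return a1, a2, a3

def cost(y_a, y_t):
    return 0.5 * (y_a - y_t) ** 2

# illustrative values (assumptions, not from the slides)
x, y_t = 0.7, 1.0
w_in1, w12, w23 = 0.5, -0.3, 0.8

a1, a2, a3 = forward(x, w_in1, w12, w23)
# dC/dW_2,3 = (y_a - y_t) * sigma(z3) * [1 - sigma(z3)] * a(2)
grad_analytic = (a3 - y_t) * a3 * (1 - a3) * a2

eps = 1e-6
c_plus = cost(forward(x, w_in1, w12, w23 + eps)[2], y_t)
c_minus = cost(forward(x, w_in1, w12, w23 - eps)[2], y_t)
grad_numeric = (c_plus - c_minus) / (2 * eps)

print(grad_analytic, grad_numeric)  # the two values should agree closely
```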
3. Jump to the Backpropagation
1. Derivative relationships between weights
1-1. The weight update depends on derivatives that reside in the previous layers. (The word "previous" here means the layer located to the right, i.e. closer to the output.)
$$C = \frac{1}{2}(y_a - y_t)^2
\;\;\rightarrow\;\;
\frac{\partial C}{\partial W_{1,2}} = (y_a - y_t) \times \frac{\partial y_a}{\partial W_{1,2}}
= (y_a - y_t) \times \frac{\partial}{\partial W_{1,2}}\bigl[\sigma\{z^{(3)}\}\bigr] \qquad (\sigma \text{ is the sigmoid function})$$

$$\frac{\partial C}{\partial W_{1,2}} = (y_a - y_t) \times \sigma\{z^{(3)}\} \times \bigl[1 - \sigma\{z^{(3)}\}\bigr] \times \frac{\partial z^{(3)}}{\partial W_{1,2}}
= (y_a - y_t) \times \sigma\{z^{(3)}\} \times \bigl[1 - \sigma\{z^{(3)}\}\bigr] \times \frac{\partial}{\partial W_{1,2}}\bigl(a^{(2)}\, w_{2,3}\bigr)$$

$$\frac{\partial C}{\partial W_{1,2}} = (y_a - y_t) \times \sigma\{z^{(3)}\} \times \bigl[1 - \sigma\{z^{(3)}\}\bigr] \times w_{2,3} \times \frac{\partial}{\partial W_{1,2}}\, a^{(2)}
= (y_a - y_t) \times \sigma\{z^{(3)}\} \times \bigl[1 - \sigma\{z^{(3)}\}\bigr] \times w_{2,3} \times \frac{\partial}{\partial W_{1,2}}\, \sigma\{z^{(2)}\}$$

$$\therefore\;\; \frac{\partial C}{\partial W_{1,2}} = (y_a - y_t)\,\sigma\{z^{(3)}\}\bigl[1 - \sigma\{z^{(3)}\}\bigr]\, w_{2,3}\, \sigma\{z^{(2)}\}\bigl[1 - \sigma\{z^{(2)}\}\bigr]\, a^{(1)}$$
3. Jump to the Backpropagation
1. Derivative relationships between weights
1-1. The weight update depends on derivatives that reside in the previous layers. (The word "previous" here means the layer located to the right, i.e. closer to the output.)
$$C = \frac{1}{2}(y_a - y_t)^2
\;\;\rightarrow\;\;
\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t) \times \frac{\partial y_a}{\partial W_{in,1}}
= (y_a - y_t) \times \frac{\partial}{\partial W_{in,1}}\bigl[\sigma\{z^{(3)}\}\bigr] \qquad (\sigma \text{ is the sigmoid function})$$

In the same way, we get the equation below.

$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\,\sigma\{z^{(3)}\}\bigl[1 - \sigma\{z^{(3)}\}\bigr]\, w_{2,3}\, \sigma\{z^{(2)}\}\bigl[1 - \sigma\{z^{(2)}\}\bigr]\, w_{1,2}\, \sigma\{z^{(1)}\}\bigl[1 - \sigma\{z^{(1)}\}\bigr]\, X_{in}$$
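The pattern across these three results is that each gradient reuses the previous one: moving one layer back just multiplies by the next weight and the next sigmoid derivative. Below is a hedged sketch (extending the illustrative chain-network values used earlier, which are my own assumptions) that computes all three gradients this way:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# illustrative values (assumptions, not from the slides)
x, y_t = 0.7, 1.0
w_in1, w12, w23 = 0.5, -0.3, 0.8

# feed forward
a1 = sigmoid(w_in1 * x)
a2 = sigmoid(w12 * a1)
a3 = sigmoid(w23 * a2)                         # y_a

# backpropagation: build each gradient from the previous "upstream" factor
upstream = (a3 - y_t) * a3 * (1 - a3)          # (y_a - y_t) * sigma'(z3)
dC_dw23 = upstream * a2

upstream = upstream * w23 * a2 * (1 - a2)      # multiply by w_2,3 and sigma'(z2)
dC_dw12 = upstream * a1

upstream = upstream * w12 * a1 * (1 - a1)      # multiply by w_1,2 and sigma'(z1)
dC_dwin1 = upstream * x

print(dC_dw23, dC_dw12, dC_dwin1)
```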
3. Jump to the Backpropagation
1. Derivative relationships between weights
1-2. The weight update depends on derivatives that reside along both paths to the output.
Getting this result takes more tedious calculation than the previous case, so here I only write the result.
If you want to see the calculation process, look at the next slide!
$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\, X_{in}\Bigl[\, \sigma\{z^{(2)}\}\bigl[1 - \sigma\{z^{(2)}\}\bigr]\, w_{2,4}\; \sigma\{z^{(1)}\}\bigl[1 - \sigma\{z^{(1)}\}\bigr]\, w_{1,2} \;+\; \sigma\{z^{(3)}\}\bigl[1 - \sigma\{z^{(3)}\}\bigr]\, w_{3,4}\; \sigma\{z^{(1)}\}\bigl[1 - \sigma\{z^{(1)}\}\bigr]\, w_{1,3}\,\Bigr]$$

(The first bracketed term comes from the path $1 \to 2 \to 4$, the second from the path $1 \to 3 \to 4$; these are the edges marked ①②③④ in the slide figure.)
[Figure: a branching network $X_{in} \xrightarrow{w_{in,1}} 1$; unit 1 feeds units 2 and 3 through $w_{1,2}$ and $w_{1,3}$, and units 2 and 3 both feed the output unit 4 through $w_{2,4}$ and $w_{3,4}$. Each unit holds $z^{(i)} \mid a^{(i)}$, the output of unit 4 is $y_a$, and feed-forward runs left to right.]
[Figure: the same branching network as on the previous slide.] The detailed calculation:
$$\frac{\partial C}{\partial W_{in,1}} = \frac{\partial}{\partial W_{in,1}}\,\frac{1}{2}(y_a - y_t)^2
= (y_a - y_t)\,\frac{\partial}{\partial W_{in,1}}\bigl(\sigma\{z^{(2)}\}\, w_{2,4} + \sigma\{z^{(3)}\}\, w_{3,4}\bigr)$$

(Here the output unit 4 is treated as linear, so $y_a = z^{(4)} = a^{(2)} w_{2,4} + a^{(3)} w_{3,4}$, which is why no sigmoid-derivative factor appears for it; below I write $\sigma'\{z\} = \sigma\{z\}\bigl[1 - \sigma\{z\}\bigr]$ for the sigmoid derivative.)

$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\Bigl[\, w_{2,4}\,\frac{\partial}{\partial W_{in,1}}\,\sigma\{z^{(2)}\} + w_{3,4}\,\frac{\partial}{\partial W_{in,1}}\,\sigma\{z^{(3)}\} \,\Bigr]
= (y_a - y_t)\Bigl[\, w_{2,4}\,\sigma'\{z^{(2)}\}\,\frac{\partial}{\partial W_{in,1}}\bigl(\sigma\{z^{(1)}\}\, w_{1,2}\bigr) + w_{3,4}\,\sigma'\{z^{(3)}\}\,\frac{\partial}{\partial W_{in,1}}\bigl(\sigma\{z^{(1)}\}\, w_{1,3}\bigr) \,\Bigr]$$

$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\Bigl[\, w_{2,4}\,\sigma'\{z^{(2)}\}\, w_{1,2}\,\sigma'\{z^{(1)}\}\,\frac{\partial}{\partial W_{in,1}}\bigl(X_{in}\, w_{in,1}\bigr) + w_{3,4}\,\sigma'\{z^{(3)}\}\, w_{1,3}\,\sigma'\{z^{(1)}\}\,\frac{\partial}{\partial W_{in,1}}\bigl(X_{in}\, w_{in,1}\bigr) \,\Bigr]$$

$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\Bigl[\, w_{2,4}\,\sigma'\{z^{(2)}\}\, w_{1,2}\,\sigma'\{z^{(1)}\}\, X_{in} + w_{3,4}\,\sigma'\{z^{(3)}\}\, w_{1,3}\,\sigma'\{z^{(1)}\}\, X_{in} \,\Bigr]$$

$$= (y_a - y_t)\, X_{in}\Bigl[\, \sigma\{z^{(2)}\}\bigl[1 - \sigma\{z^{(2)}\}\bigr]\, w_{2,4}\; \sigma\{z^{(1)}\}\bigl[1 - \sigma\{z^{(1)}\}\bigr]\, w_{1,2} + \sigma\{z^{(3)}\}\bigl[1 - \sigma\{z^{(3)}\}\bigr]\, w_{3,4}\; \sigma\{z^{(1)}\}\bigl[1 - \sigma\{z^{(1)}\}\bigr]\, w_{1,3}\,\Bigr]$$
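Here is a hedged Python sketch of the branching network (the weight values and inputs are my own illustration; following the slide, the output unit 4 is taken as linear, so $y_a = z^{(4)}$). It sums the two path contributions and checks the result against a finite difference:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_in1, w12, w13, w24, w34):
    a1 = sigmoid(w_in1 * x)        # unit 1
    a2 = sigmoid(w12 * a1)         # unit 2
    a3 = sigmoid(w13 * a1)         # unit 3
    y_a = w24 * a2 + w34 * a3      # unit 4 taken as linear: y_a = z(4)
    return a1, a2, a3, y_a

# illustrative values (assumptions, not from the slides)
x, y_t = 0.7, 1.0
w_in1, w12, w13, w24, w34 = 0.5, -0.3, 0.8, 0.2, -0.6

a1, a2, a3, y_a = forward(x, w_in1, w12, w13, w24, w34)

# dC/dW_in,1 = (y_a - y_t) * X_in * [ path through unit 2 + path through unit 3 ]
path2 = a2 * (1 - a2) * w24 * a1 * (1 - a1) * w12
path3 = a3 * (1 - a3) * w34 * a1 * (1 - a1) * w13
grad_analytic = (y_a - y_t) * x * (path2 + path3)

eps = 1e-6
c = lambda w: 0.5 * (forward(x, w, w12, w13, w24, w34)[3] - y_t) ** 2
grad_numeric = (c(w_in1 + eps) - c(w_in1 - eps)) / (2 * eps)

print(grad_analytic, grad_numeric)  # should agree closely
```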
3. Jump to the Backpropagation
1. Derivative relationships between weights
1-3. The derivative for a weight does not depend on the derivatives of any of the other weights in the same layer.
This is easy, so I will not explain it here. (Homework!)
[Figure: the six-unit network from the first slide, with the first weight layer $w^{(1)}$ ($w_{13}, w_{14}, w_{15}, w_{23}, w_{24}, w_{25}$) and the second weight layer $w^{(2)}$ ($w_{36}, w_{46}, w_{56}$); the weights within each layer are marked as independent.]
3. Jump to the Backpropagation
2. Application of gradient descent

$$W_{i \to j,\,new} = \underbrace{W_{i \to j,\,old}}_{①} - \underbrace{\eta}_{②}\,\underbrace{\frac{\partial C}{\partial W_{i \to j,\,old}}}_{③} \qquad (\eta \text{ is the learning rate})$$

① At first, we initialize the weights and biases with an initializer → we know it!
② We can control the learning rate → we know it!
③ We can get this value through the equations above → we know it!

Then we can update the weights using the equation above. But isn't it too cumbersome to apply directly?
So we will define the Error Signal for a simpler application.
3. Jump to the Backpropagation
3. Error Signals
1-1 Definition: $\delta_j = \dfrac{\partial C}{\partial z_j}$

1-2 General form of the signals

$$\delta_j = \frac{\partial C}{\partial z_j} = \frac{\partial}{\partial z_j}\,\frac{1}{2}(y_a - y_t)^2 = (y_a - y_t)\,\frac{\partial y_a}{\partial z_j} \qquad \text{------- ①}$$

$$\frac{\partial y_a}{\partial z_j} = \frac{\partial y_a}{\partial a_j}\,\frac{\partial a_j}{\partial z_j} = \frac{\partial y_a}{\partial a_j} \times \sigma'(z_j) \qquad \bigl(\because\; a_j = \sigma(z_j), \text{ so } \tfrac{\partial a_j}{\partial z_j} = \sigma'(z_j) = \sigma(z_j)[1 - \sigma(z_j)]\bigr)$$

Because a neural network consists of multiple units, we account for all of the units $k \in outs(j)$ that unit $j$ feeds into. So,

$$\frac{\partial y_a}{\partial z_j} = \sigma'(z_j) \sum_{k \in outs(j)} \frac{\partial y_a}{\partial z_k}\,\frac{\partial z_k}{\partial a_j}$$

$$\frac{\partial y_a}{\partial z_j} = \sigma'(z_j) \sum_{k \in outs(j)} \frac{\partial y_a}{\partial z_k}\, w_{jk} \qquad (\because\; z_k = w_{jk}\, a_j)$$

By equation ① above and $\delta_k = (y_a - y_t)\,\dfrac{\partial y_a}{\partial z_k}$,

$$\delta_j = (y_a - y_t)\,\sigma'(z_j) \sum_{k \in outs(j)} \frac{\partial y_a}{\partial z_k}\, w_{jk} = (y_a - y_t)\,\sigma'(z_j) \sum_{k \in outs(j)} \frac{\delta_k}{(y_a - y_t)}\, w_{jk}$$

$$\therefore\;\; \delta_j = \sigma'(z_j) \sum_{k \in outs(j)} \delta_k\, w_{jk}\,, \qquad \text{and for the start we define } \delta_{initial} = (y_a - y_t)\,\sigma\{z^{(initial)}\}\bigl[1 - \sigma\{z^{(initial)}\}\bigr]$$
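A minimal sketch of the recursion $\delta_j = \sigma'(z_j)\sum_{k \in outs(j)} \delta_k\, w_{jk}$ (my own illustration; the dictionaries `z`, `outs`, and `w` below are assumed structures, not from the slides):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def error_signal(j, z, outs, w, delta_initial):
    """delta_j = sigma'(z_j) * sum_{k in outs(j)} delta_k * w_{j,k}; the output unit uses delta_initial."""
    if not outs[j]:                                   # j is the output unit
        return delta_initial
    sp = sigmoid(z[j]) * (1.0 - sigmoid(z[j]))        # sigma'(z_j)
    return sp * sum(error_signal(k, z, outs, w, delta_initial) * w[(j, k)] for k in outs[j])
```

For the six-unit network, `outs` could map units 3, 4, 5 to `[6]` and unit 6 to `[]`, with `delta_initial` computed from the output as above.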
3. Jump to the Backpropagation
3. Error Signals
1-3 The general form of the weight update

$$\Bigl(\;※\; W_{3 \to 6,\,new} = W_{3 \to 6,\,old} - \eta\,\frac{\partial C}{\partial W_{3 \to 6,\,old}}\;\Bigr)
\qquad
\Bigl(\;※\; \delta_6 = \delta_{initial} = (y_a - y_t)\,\sigma\{z^{(6)}\}\bigl[1 - \sigma\{z^{(6)}\}\bigr]\;\Bigr)$$

$$\Delta W_{3,6} = -\eta\,(y_a - y_t)\,\sigma\{z^{(6)}\}\bigl[1 - \sigma\{z^{(6)}\}\bigr]\, a^{(3)} = -\eta\,\delta_6\, a^{(3)}$$

$$\Delta W_{4,6} = -\eta\,(y_a - y_t)\,\sigma\{z^{(6)}\}\bigl[1 - \sigma\{z^{(6)}\}\bigr]\, a^{(4)} = -\eta\,\delta_6\, a^{(4)}$$

$$\Delta W_{5,6} = -\eta\,(y_a - y_t)\,\sigma\{z^{(6)}\}\bigl[1 - \sigma\{z^{(6)}\}\bigr]\, a^{(5)} = -\eta\,\delta_6\, a^{(5)}$$

$$\Delta W_{1,3} = -\eta\,(y_a - y_t)\,\sigma\{z^{(6)}\}\bigl[1 - \sigma\{z^{(6)}\}\bigr]\, w_{3,6}\, \sigma\{z^{(3)}\}\bigl[1 - \sigma\{z^{(3)}\}\bigr]\, a^{(1)}
= -\eta\,\Bigl(\sum_{k \in outs(3)} \delta_k\, w_{3,k}\Bigr)\,\sigma'(z_3)\, a^{(1)} = -\eta\,\delta_3\, a^{(1)}$$

(here $outs(3) = \{6\}$, so the sum is just $\delta_6\, w_{3,6}$)

.........

$$\therefore\;\; \Delta W_{i,j} = -\eta\,\delta_j\, a^{(i)}$$

→ We can easily update the weights by using the error signals $\delta$ and the equation $\Delta W_{i,j} = -\eta\,\delta_j\, a^{(i)}$.
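Putting the whole presentation together, here is a hedged end-to-end sketch of one training step for the six-unit network from the first slide (all numeric values, and the choice to feed $X_1$ into unit 1 and $X_2$ into unit 2 through single weights, are my own illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

eta = 0.5
x1, x2, y_t = 0.6, -0.2, 1.0          # illustrative inputs and target (assumptions)

# illustrative weights: w[(i, j)] connects unit (or input) i to unit j
w = {("x1", 1): 0.3, ("x2", 2): -0.4,
     (1, 3): 0.1, (1, 4): 0.5, (1, 5): -0.2,
     (2, 3): 0.7, (2, 4): -0.6, (2, 5): 0.4,
     (3, 6): 0.2, (4, 6): -0.3, (5, 6): 0.8}

# feed forward: every unit applies the sigmoid
a = {1: sigmoid(w[("x1", 1)] * x1), 2: sigmoid(w[("x2", 2)] * x2)}
for j in (3, 4, 5):
    a[j] = sigmoid(w[(1, j)] * a[1] + w[(2, j)] * a[2])
a[6] = sigmoid(w[(3, 6)] * a[3] + w[(4, 6)] * a[4] + w[(5, 6)] * a[5])
y_a = a[6]

# error signals: delta_6 = delta_initial, then delta_j = sigma'(z_j) * sum_k delta_k * w_jk
delta = {6: (y_a - y_t) * a[6] * (1 - a[6])}
for j in (3, 4, 5):
    delta[j] = a[j] * (1 - a[j]) * delta[6] * w[(j, 6)]
for j in (1, 2):
    delta[j] = a[j] * (1 - a[j]) * sum(delta[k] * w[(j, k)] for k in (3, 4, 5))

# weight updates: Delta W_{i,j} = -eta * delta_j * a^(i)
for (i, j) in list(w):
    a_i = {"x1": x1, "x2": x2}.get(i, a.get(i))   # input value or activation of unit i
    w[(i, j)] -= eta * delta[j] * a_i

print("y_a =", y_a)
```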
4. Summary
Although the pictures on the last two slides (not reproduced here) are drawn a bit differently from my description, working through the calculations will show you that they are exactly the same as my explanation.
(Picture source: http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html)