Bayesian Inference :
From the Kalman Filter to Optimization
Bayes Rule
$P(\text{hypothesis} \mid \text{data}) = \dfrac{P(\text{data} \mid \text{hypothesis})\; P(\text{hypothesis})}{P(\text{data})}$
• Bayes rule tells us how to do inference about hypotheses from data.
• Learning and prediction can be seen as forms of inference.
Given information
Estimate hypothesis
Rev'd Thomas Bayes (1702-1761)
Data : Observation, Hypothesis : Model
Countbayesie.com/blog/2016/5/1/a-guide-to-Bayesian-statistics
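A quick worked example of the rule (the numbers here are illustrative, not from the slides): take a hypothesis with prior P(hypothesis) = 0.01, and an observation that occurs with P(data | hypothesis) = 0.9 but also with P(data | not hypothesis) = 0.05. Then

$P(\text{hypothesis} \mid \text{data}) = \dfrac{0.9 \times 0.01}{0.9 \times 0.01 + 0.05 \times 0.99} = \dfrac{0.009}{0.0585} \approx 0.15$

so even a strongly diagnostic observation leaves the hypothesis unlikely when the prior is small; the denominator P(data) is the same Bayesian evidence term that reappears on the later slides.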
3
Contents :
- Learning : Maximum a Posteriori Estimator (MAP)
- Prediction : Kalman Filter and its implementation
- Optimization : Bayesian Optimization and its application
4
Learning :
5
Cost to minimize : cross-entropy error function
$J(\theta) = -\frac{1}{m} \sum \log P(y \mid x;\, \theta)$
Logistic regression
Likelihood
Maximum likelihood estimator (MLE)
$\theta^{*} = \arg\max_{\theta} \log P(y \mid x;\, \theta) = \arg\min_{\theta}\, J(\theta)$
• This approach is very ill-conditioned in nature
→ sensitive to noise and model error
Learning :
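As a concrete sketch of this slide (purely illustrative, not from the presentation): the cross-entropy cost $J(\theta)$ for logistic regression and a plain gradient-descent MLE fit in NumPy. The synthetic data, learning rate, and iteration count are assumptions made for the example.

```python
# Minimal sketch: logistic-regression MLE by minimizing the cross-entropy cost J(theta).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(theta, X, y):
    """J(theta) = -(1/m) * sum( y*log p + (1-y)*log(1-p) ), with p = sigmoid(X theta)."""
    p = sigmoid(X @ theta)
    eps = 1e-12                                      # avoid log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def mle_fit(X, y, lr=0.1, iters=2000):
    """theta* = argmin_theta J(theta), i.e. argmax of the log-likelihood."""
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m    # gradient of the cross-entropy
        theta -= lr * grad
    return theta

# Toy usage with synthetic data (illustrative)
rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 2))]   # bias column + 2 features
true_theta = np.array([-0.5, 2.0, -1.0])
y = (rng.random(100) < sigmoid(X @ true_theta)).astype(float)
theta_mle = mle_fit(X, y)
print("J(theta_MLE) =", cross_entropy(theta_mle, X, y))
```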
6
Regularized Logistic Regression
Now assume that a prior distribution over the parameters exists :
Then we can apply Bayes Rule:
Posterior distribution
over model parameters
Learning :
7
Regularized Logistic Regression
Now assume that a prior distribution over the parameters exists :
Then we can apply Bayes Rule:
Data likelihood for specific parameters
(could be modeled with a Deep Network!)
Learning :
8
Regularized Logistic Regression
Now assume that a prior distribution over the parameters exists :
Then we can apply Bayes Rule:
Prior distribution over parameters
(describes our prior knowledge and/or
our desires for the model)
Learning :
9
Regularized Logistic Regression
Now assume that a prior distribution over the parameters exists :
Then we can apply Bayes Rule:
Bayesian evidence
A powerful method for model selection!
Learning :
10
Regularized Logistic Regression
Now assume that a prior distribution over the parameters exists :
Then we can apply Bayes Rule:
Learning :
As a rule, this integral is intractable :(
(in general it cannot be evaluated analytically)
11
The core idea of the Maximum a Posteriori (MAP) estimator:
$J_{MAP}(\theta) = -\log P(\theta \mid x, y) = -\log P(y \mid x, \theta) - \log P(\theta) + \log P(y)$
$= J_{MLE}(\theta) + \frac{1}{2\sigma_w^{2}} \sum_i \theta_i^{2} + \text{const}$
$\theta^{*}_{MAP} = \arg\max_{\theta}\big(\log P(y \mid x, \theta) + \log P(\theta)\big) = \arg\min_{\theta}\, J_{MAP}(\theta)$
Loss function of the posterior distribution over model parameters,
assuming a Gaussian prior for the weights
Regularized Logistic Regression
Learning :
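To make the connection to weight decay explicit, here is a minimal sketch of the MAP fit under the same illustrative logistic-regression setup as the MLE snippet above; $\sigma_w$, the learning rate, and the iteration count are assumptions, not values from the slides.

```python
# Sketch only: MAP for logistic regression = cross-entropy + L2 penalty from a Gaussian prior.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def map_fit(X, y, sigma_w=1.0, lr=0.1, iters=2000):
    """Minimize J_MAP(theta) = J_MLE(theta) + (1 / (2*sigma_w**2)) * sum(theta**2)."""
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        grad_mle = X.T @ (sigmoid(X @ theta) - y) / m   # gradient of the cross-entropy
        grad_prior = theta / sigma_w ** 2               # gradient of the Gaussian-prior (L2) term
        theta -= lr * (grad_mle + grad_prior)
    return theta
```

Compared with the MLE fit, the only change is the `grad_prior` term: a smaller $\sigma_w$ means a stronger prior and heavier regularization of the weights.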
Variational Inference
True posterior :
Modeled with Deep Neural Network
Let's find a good approximation :
Learning :
Variational Inference
True posterior :
Modeled with Deep Neural Network
Let's find a good approximation :
Learning :
Variational Inference
True posterior :
Let's find a good approximation :
Learning :
Intractable integral :(
Variational Inference
True posterior :
Let's find a good approximation :
Learning :
Variational Inference
True posterior :
Let's find a good approximation :
Explicitly define a distribution family
for the approximation
(e.g. a multivariate Gaussian)
Learning :
Variational Inference
True posterior :
Let's find a good approximation :
Learning :
Variational parameters
(e.g. mean vector, covariance matrix)
Variational Inference
True posterior :
Let's find a good approximation :
Kullback-Leibler divergence
(a measure of dissimilarity between distributions)
Learning :
Speaking mathematically:
Variational Inference
True posterior :
Let's find a good approximation :
Speaking mathematically:
True posterior is unknown :(
Learning :
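Since the slides use the KL divergence as the measure of dissimilarity, a small self-contained sketch (not from the presentation) of its closed form between two multivariate Gaussians may help; this is the case that arises when the variational family q is Gaussian. In practice the true posterior is unknown, so variational inference maximizes the ELBO rather than evaluating this KL against the posterior directly.

```python
# Sketch: closed-form KL divergence KL(q || p) between multivariate Gaussians
# q = N(mu0, S0) and p = N(mu1, S1).  All inputs below are illustrative.
import numpy as np

def kl_gaussians(mu0, S0, mu1, S1):
    k = mu0.shape[0]
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    term_trace  = np.trace(S1_inv @ S0)
    term_maha   = diff @ S1_inv @ diff
    term_logdet = np.log(np.linalg.det(S1) / np.linalg.det(S0))
    return 0.5 * (term_trace + term_maha - k + term_logdet)

# Example: a broad 2-D Gaussian vs. a shifted, narrower one
mu_q, S_q = np.zeros(2), np.eye(2)
mu_p, S_p = np.array([1.0, 0.0]), 0.5 * np.eye(2)
print(kl_gaussians(mu_q, S_q, mu_p, S_p))
```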
20
Prediction : Kalman Filter
Autonomous Mobile Robot Design
Dr. Kostas Alexis (CSE)
Kalman Filter – A Primer
Consider a time-discrete stochastic process (Markov chain)
21
Estimates the state $x_t$ of a discrete-time controlled process that is
governed by the linear stochastic difference equation
$x_t = A\,x_{t-1} + B\,u_t + w_t$
and (linear) measurements of the state
$z_t = C\,x_t + v_t$
with $w_t \sim \mathcal{N}(0, Q)$ and $v_t \sim \mathcal{N}(0, R)$
Prediction : Kalman Filter
22
Bayes Filter Algorithm
For each step, do:
• Apply the motion model
• Apply the sensor model
η : normalizing constant
(a one-step sketch in code follows below)
Prediction :
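A minimal sketch of one step of this algorithm on a discrete (histogram) belief; the transition matrix, likelihood vector, and three-state toy example are illustrative stand-ins for the motion and sensor models on the slide.

```python
# Sketch of one Bayes-filter step on a discrete (histogram) belief.
import numpy as np

def bayes_filter_step(belief, motion, likelihood):
    predicted = motion @ belief          # apply motion model: bel_bar(x_t) = sum_x' p(x_t|u,x') bel(x')
    posterior = likelihood * predicted   # apply sensor model: bel(x_t) ∝ p(z_t|x_t) bel_bar(x_t)
    return posterior / posterior.sum()   # eta: normalizing constant

# 3-state toy example (illustrative numbers)
belief = np.array([1/3, 1/3, 1/3])
motion = np.array([[0.8, 0.1, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.1, 0.1, 0.8]])     # motion[i, j] = p(x_t = i | x_{t-1} = j)
likelihood = np.array([0.9, 0.05, 0.05]) # sensor strongly suggests state 0
print(bayes_filter_step(belief, motion, likelihood))
```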
23
From Bayes Filter to Kalman Filter
For each step, do:
• Apply the motion model
• Apply the sensor model
Prediction :
24
Kalman Filter Algorithm
Prediction :
K_t : Kalman gain, set by the covariance of the state estimate versus the covariance of the measurement noise
• K_t ≈ 1 when the measurement noise covariance is small relative to the state covariance (trust the measurement)
• K_t ≈ 0 when the measurement noise covariance is large, e.g. while passing through a tunnel (trust the prediction)
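Putting the predict/update cycle into code, a minimal sketch follows; the system matrices, noise covariances, and the toy measurement are illustrative assumptions, not values from the slides.

```python
# Sketch of one Kalman-filter cycle for x_t = A x_{t-1} + B u_t + w,  z_t = C x_t + v.
import numpy as np

def kalman_step(mu, Sigma, u, z, A, B, C, Q, R):
    # Predict (motion model)
    mu_pred = A @ mu + B @ u
    Sigma_pred = A @ Sigma @ A.T + Q
    # Update (measurement model)
    S = C @ Sigma_pred @ C.T + R                 # innovation covariance
    K = Sigma_pred @ C.T @ np.linalg.inv(S)      # Kalman gain K_t
    mu_new = mu_pred + K @ (z - C @ mu_pred)
    Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_pred
    return mu_new, Sigma_new

# Toy 1-D position/velocity example (illustrative numbers)
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.zeros((2, 1))
C = np.array([[1.0, 0.0]])                       # only position is measured
Q = 1e-3 * np.eye(2)                             # process-noise covariance
R = np.array([[0.05]])                           # measurement-noise covariance
mu, Sigma = np.zeros(2), np.eye(2)
mu, Sigma = kalman_step(mu, Sigma, np.zeros(1), np.array([0.3]), A, B, C, Q, R)
print(mu)
```

A large R (e.g. GPS degraded in a tunnel) shrinks the gain K toward 0, so the filter leans on the prediction; a small R pushes K toward 1 and the filter follows the measurement.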
25
Implementation of Kalman Filter
Prediction :
GPS-aided IMU :
- The gyro has drift, bias, and alignment error
- GPS, vision, or kinematics can compensate for these inherent errors
“ASSESSMENT OF INTEGRATED GPS/INS FOR THE EX-171 EXTENDED
RANGE GUIDED MUNITION”, AIAA-98-4416
26
• Eq. of error dynamics
Implementation of Kalman Filter
Prediction :
• Measurement Model
state
output
27
[Block diagram: the plant (System) with control input u, disturbance w, and measurement noise v; the Kalman Filter fuses the measured output y with complementary data ρ (GPS, kinematics, vision, etc.) to produce the estimates x_e and y_e, which drive the feedback gain K.]
Implementation of Kalman Filter
Prediction :
LQG (Linear Quadratic Gaussian) controller
= LQR (Linear Quadratic Regulation) + Kalman Filter
28
Implementation of Kalman Filter
Prediction :
Optimal Data Sampling Strategy !
30
Bayesian Optimization
Optimization :
Bayesian Optimization
= Surrogate Model (Gaussian Process)
+ Parameter Exploration or Exploitation (Acquisition Function)
Automatic Gain Tuning
based on
Gaussian Process Global Optimization
(= Bayesian Optimization)
32
Bayesian Optimization
Optimization :
Short Introduction on
Gaussian Processes
Regression, Classification & Optimization
Why GPs ? :
- Provide Closed-Form Predictions !
- Effective for small data problems
- And Explainable !
How Do We Deal With Many Parameters, Little Data ?
1. Regularization
e.g., smoothing, L1 penalty, dropout in neural nets, large K
for K-nearest neighbor
2. Standard Bayesian approach
specify probability of data given weights, P(D|W)
specify weight priors given hyper-parameter α, P(W|α)
find posterior over weights given data, P(W|D, α)
With little data, strong weight prior constrains inference
3. Gaussian processes
place a prior over functions, p(f) directly rather than
over model parameters, p(w)
Functions : Relationship between Input and Output
Distribution of functions that lie within the range of the input X and the output f
→ Prior over functions, no constraints
[Figure: sample functions f(x) drawn from the GP prior]
• GP specifies a prior over functions, f(x)
• Suppose we have a set of observations:
• D = {(x1, y1), (x2, y2), (x3, y3), …, (xn, yn)}
Standard Bayesian approach
• p(f|D) ∝ p(D|f) p(f)
One view of Bayesian inference
• generate samples from the prior
• discard all samples inconsistent with
our data, leaving the samples of
interest (the posterior)
• The Gaussian process allows us to
do this analytically.
Gaussian Process Approach
[Figure: samples from the GP prior and from the posterior after conditioning on D]
• A Bayesian data-modeling technique that accounts for uncertainty
• Bayesian kernel regression machines
Gaussian Process Approach
Procedure to sample
1. Assume that the input X and the function f are distributed as follows
2. Compute the covariance matrix K for a given $X = x_1, \ldots, x_n$
[Figure: inputs X and the corresponding function values f]
Procedure to sample
3. Compute the SVD or Cholesky decomposition of K to get orthogonal basis functions
$K = A S B^{T} = L L^{T}$
4. Compute the sampled functions
$f_i = A S^{1/2} u_i$ or $f_i = L u_i$
$u_i$ : random vector with zero mean and unit variance
L : lower-triangular factor of the Cholesky decomposition of K
[Figure: sampled functions from the prior and from the posterior]
(a NumPy sketch of this procedure follows)
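A short NumPy sketch of this sampling procedure using the Cholesky factor L of K; the RBF kernel, length scale, input grid, and jitter term are illustrative assumptions.

```python
# Sketch: draw sample functions f = L u from a GP prior, with K = L L^T (Cholesky).
import numpy as np

def rbf_kernel(X, length_scale=1.0):
    d2 = (X[:, None] - X[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)

X = np.linspace(-5, 5, 100)                        # 1. inputs x_1 ... x_n
K = rbf_kernel(X) + 1e-8 * np.eye(len(X))          # 2. covariance matrix (jitter for stability)
L = np.linalg.cholesky(K)                          # 3. K = L L^T
samples = [L @ np.random.standard_normal(len(X))   # 4. f_i = L u_i,  u_i ~ N(0, I)
           for _ in range(3)]
```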
A simple PD control example
$J = \theta_1 \big(r(t) - y(t)\big) + \theta_2\, y(t)$
Globally optimal gains θ to get a minimum cost J ?
A simple PD control example
Procedure of Bayesian Optimization
1. GP prior before observing any data
2. GP posterior, after five noisy evaluations
3. The next parameters θnext are chosen
at the maximum of the acquisition function
Repeat until you find a globally optimal θ
(a sketch of this loop follows below)
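A compact sketch of this loop follows. The slides use an entropy-search acquisition; for brevity this sketch substitutes the simpler expected-improvement acquisition, a grid search over θ, and a made-up cost function, so everything except the GP-surrogate-plus-acquisition structure is an illustrative assumption.

```python
# Sketch of the Bayesian-optimization loop: GP surrogate + acquisition function.
import numpy as np
from scipy.stats import norm

def rbf(a, b, ls=0.3):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X_obs, y_obs, X_grid, noise=1e-4):
    K = rbf(X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_s, K_ss = rbf(X_obs, X_grid), rbf(X_grid, X_grid)
    K_inv = np.linalg.inv(K)
    mu = K_s.T @ K_inv @ y_obs
    var = np.diag(K_ss - K_s.T @ K_inv @ K_s)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    z = (best - mu) / sigma                      # we are minimizing the cost J
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def cost(theta):                                 # stand-in for the true cost J(theta)
    return np.sin(3 * theta) + 0.5 * theta ** 2

theta_grid = np.linspace(-2, 2, 400)
X_obs = np.array([-1.5, 0.0, 1.5])               # a few initial noisy evaluations
y_obs = cost(X_obs) + 0.01 * np.random.randn(3)

for _ in range(10):
    mu, sigma = gp_posterior(X_obs, y_obs, theta_grid)   # GP posterior over J
    acq = expected_improvement(mu, sigma, y_obs.min())   # acquisition function
    theta_next = theta_grid[np.argmax(acq)]              # next parameters to evaluate
    X_obs = np.append(X_obs, theta_next)
    y_obs = np.append(y_obs, cost(theta_next) + 0.01 * np.random.randn())

print("best theta found:", X_obs[np.argmin(y_obs)])
```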
Information gain I : mutual information for an observed data point
$I = H(x^{*} \mid D_t) - H\!\left(x^{*} \mid D_t \cup \{x, y\}\right)$
→ Reduction of uncertainty in the location $x^{*}$ by selecting points (x, y) that are expected to cause the largest reduction in the entropy of the distribution $H(x^{*} \mid D_t)$
Acquisition function and Entropy Search