Methods of Optimization in Machine Learning
Presented By: Aayush Srivastava & Divyank Saxena
KnolX Etiquettes
Lack of etiquette and manners is a huge turn off.
Punctuality
Join the session 5 minutes prior to the session start time. We start on time and conclude on time!
Feedback
Make sure to submit constructive feedback for all sessions, as it is very helpful for the presenter.
Silent Mode
Keep your mobile devices in silent mode. Feel free to step out of the session if you need to attend
an urgent call.
Avoid Disturbance
Avoid unwanted chit chat during the session.
Our Agenda
01 What is Optimization in Machine Learning
02 What is Gradient Descent
03 What is Stochastic Gradient Descent
04 What is Minibatch Stochastic Gradient
05 What is Adam optimization
06 Demo
What is Optimization in ML
● Optimization in Machine Learning is a technique used to find the best set of parameters for a given
model to minimize a loss function and improve its performance. It is an essential step in the training
process of a machine learning model.
● The goal of optimization is to find the best weights and biases for the model, so that it can make
accurate predictions.
● Optimization is used in machine learning because models typically have many parameters, and finding
the best values for those parameters can be a challenging task.
● With optimization techniques, the model can automatically search for the best parameters, rather than
relying on manual tuning by the user.
What is Cost Function
● A cost function measures the error between the model's predictions and the actual values across the
whole dataset.
● Minimizing the cost function helps the learning algorithm find the optimal set of parameters, such as
weights and biases, that produce the best predictions.
● The cost function is a measure of how wrong the model is in estimating the relationship between
X (input) and Y (output):
$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
Parameters:
- m is the number of samples.
- The sum runs over the samples, from i = 1 to m.
- Each term is the hypothesis value h_θ(x^(i)) minus the actual value y^(i), squared.
● Let's run through the calculation for best_fit_1.
1. The hypothesis is 0.50. This is the h_θ(x^(i)) part, what we think is the correct value.
2. The actual value for the sample data is 1.00, so we are left with (0.50 - 1.00)^2, which is 0.25.
3. Let's add this result to an array called results and do the same for all three points.
4. results = [0.25, 2.25, 4.00]
5. Finally, we add them all up and multiply by 1/6 (that is, 1/(2m) with m = 3 samples). We get the
cost for best_fit_1 = 1.083.
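The calculation above can be reproduced in a few lines of Python. This is a minimal sketch: only the first pair (hypothesis 0.50, actual value 1.00) comes from the slides. The other two pairs are hypothetical values chosen so that the squared errors match the results array, and the function name `cost` is also an assumption.

```python
def cost(predictions, targets):
    """Return J = 1/(2m) * sum((h - y)^2) and the per-point squared errors."""
    m = len(targets)
    squared_errors = [(h - y) ** 2 for h, y in zip(predictions, targets)]
    return sum(squared_errors) / (2 * m), squared_errors


# Only the first pair (0.50, 1.00) is from the slides; the remaining pairs are
# hypothetical, picked to reproduce the squared errors 2.25 and 4.00.
predictions = [0.50, 0.50, 1.00]
targets = [1.00, 2.00, 3.00]

best_fit_1_cost, results = cost(predictions, targets)
print(results)                    # squared error for each of the three points
print(round(best_fit_1_cost, 3))  # cost for best_fit_1
```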
● Cost:
best_fit_1: 1.083
best_fit_2: 0.083
best_fit_3: 0.25
● A lower cost represents a smaller difference between the predictions and the actual values.
What is Loss Function
● A loss function, also known as an objective function, is a mathematical measure of how well a model's
predictions match the true values.
● A loss function measures the error between a single prediction and the corresponding actual value.
● Loss and cost functions both measure the error in machine learning predictions. Loss functions
measure the error per observation, whilst cost functions measure the error over all observations.
Types:
1. Mean Squared Error (MSE): This loss function measures the average squared difference between the
predicted values and the true values.
2. Mean Absolute Error (MAE): This loss function measures the average absolute difference between the
predicted values and the true values.
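To make the per-observation versus whole-dataset distinction concrete, here is a small sketch of both error measures. The function names and the sample values are hypothetical and chosen only for illustration.

```python
def squared_error(y_pred, y_true):
    """Loss for a single observation (squared difference)."""
    return (y_pred - y_true) ** 2


def absolute_error(y_pred, y_true):
    """Loss for a single observation (absolute difference)."""
    return abs(y_pred - y_true)


def mean_squared_error(preds, trues):
    """Average of the squared per-observation losses."""
    return sum(squared_error(p, t) for p, t in zip(preds, trues)) / len(trues)


def mean_absolute_error(preds, trues):
    """Average of the absolute per-observation losses."""
    return sum(absolute_error(p, t) for p, t in zip(preds, trues)) / len(trues)


# Hypothetical predictions and true values.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mse = mean_squared_error(y_pred, y_true)
mae = mean_absolute_error(y_pred, y_true)
print(mse, mae)
```

Because MSE squares each error, the single large miss (2.0) dominates it more than it dominates MAE.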
What is Gradient Descent
● Gradient, in plain terms, means the slope or slant of a surface. So gradient descent literally means
descending a slope to reach the lowest point on that surface.
● Gradient descent enables a model to learn the gradient, or direction, that the model should take in
order to reduce errors (differences between actual y and predicted y).
● It is an iterative algorithm that tries to find a minimum of a function.
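As an illustration of the iterative idea behind gradient descent, the sketch below minimizes a one-dimensional function. The function f(w) = (w - 3)^2, the starting point and the step count are assumptions made for this example, not values from the slides.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * grad(w)  # move downhill
    return w


# Hypothetical objective: f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(w_min)  # approaches the minimum at w = 3
```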
What is Learning Rate
● Learning Rate:
The learning rate is a hyperparameter in machine learning that determines the step size at which the
optimization algorithm updates the model's parameters. It controls the speed at which the model
learns: too small a value makes training slow, while too large a value can overshoot the minimum.
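The effect of the step size can be seen on a toy problem. This sketch assumes the objective f(w) = w^2 (gradient 2w) and three hand-picked learning rates. None of these values come from the slides.

```python
def run(learning_rate, steps=20, start=1.0):
    """Run gradient descent on f(w) = w^2 and return the final w."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2 * w  # gradient of w^2 is 2w
    return w


for lr in (0.01, 0.4, 1.1):
    print(lr, run(lr))
```

With these assumed values, the smallest rate is still far from the minimum at 0 after 20 steps, the middle rate is essentially there, and the largest rate moves further away on every step.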
Limitation of Gradient Descent
● Gradient descent has some limitations and drawbacks that can affect its performance and efficiency:
● Local Minima: Gradient Descent can get stuck in a local minimum, which may not be the global
minimum, and therefore, the optimization will not produce the best result.
● Vanishing gradient: When training deep neural networks, the gradients can become very small,
leading to the vanishing gradient problem, which can slow down or prevent convergence.
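The local-minimum limitation can be demonstrated with a small non-convex function. In this sketch the function f(w) = w^4 - 3w^2 + w, the two starting points and the learning rate are all assumptions chosen for illustration.

```python
def f(w):
    """Hypothetical non-convex objective with two minima."""
    return w ** 4 - 3 * w ** 2 + w


def grad_f(w):
    """Derivative of f."""
    return 4 * w ** 3 - 6 * w + 1


def descend(start, learning_rate=0.01, steps=1000):
    """Plain gradient descent on f from a given starting point."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad_f(w)
    return w


w_right = descend(2.0)   # ends in the shallower minimum on the right
w_left = descend(-2.0)   # ends in the deeper minimum on the left
print(w_right, f(w_right))
print(w_left, f(w_left))
```

The same algorithm with the same settings reaches different minima depending only on where it starts, and the run started at 2.0 stops at the higher of the two.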
What is Stochastic Gradient Descent
● Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent optimization algorithm that
updates the parameters of a model in a faster, more efficient way.
● “Stochastic” in plain terms means “random”.
● In SGD, at each step, the algorithm calculates the gradient for one observation picked at random,
instead of calculating the gradient for the entire dataset.
● For example, with a dataset that contains 1000 rows, SGD updates the model parameters 1000 times in
one complete pass over the dataset, instead of once as in Gradient Descent.
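The one-update-per-observation behaviour of SGD can be sketched as follows. The dataset (1000 noise-free rows of y = 2x + 1), the linear model and the hyperparameters are assumptions for illustration. Shuffling the rows each epoch is one common way to pick observations at random while still visiting every row once per pass.

```python
import random

rng = random.Random(0)

# Hypothetical dataset: 1000 rows generated from y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in (rng.random() for _ in range(1000))]


def sgd(data, learning_rate=0.1, epochs=5):
    """Fit y = w*x + b with one parameter update per observation."""
    w, b = 0.0, 0.0
    updates = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)              # visit the rows in random order
        for i in order:
            x, y = data[i]
            error = (w * x + b) - y     # prediction error for this one row
            w -= learning_rate * error * x
            b -= learning_rate * error
            updates += 1
    return w, b, updates


w, b, updates = sgd(data, epochs=5)
print(updates // 5)  # parameter updates per pass over the 1000 rows
print(w, b)          # should approach the generating values 2 and 1
```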
● In the left diagram, SGD takes a gradient descent step for each example (1 example per step); in the
right diagram, GD takes 1 step per pass over the entire training set.
● This represents a significant performance improvement when the dataset contains millions of
observations.
Advantages of Stochastic Gradient Descent
● It is easier to fit into memory, because the network processes a single training sample at a time.
● For larger datasets it can converge faster, as it updates the parameters more frequently.
● Due to the frequent updates, the steps taken towards the minimum of the loss function oscillate,
which can help the algorithm get out of local minima of the loss function.
What is Minibatch Stochastic Gradient
● So far we encountered two extremes in the approach to gradient-based learning:
● Gradient Descent uses the full dataset to compute gradients and to update parameters, one pass at a
time. Conversely, Stochastic Gradient Descent processes one training example at a time to make
progress. Each of them has its own drawbacks.
● Gradient descent is not particularly data efficient whenever data is very similar. Stochastic gradient
descent is not particularly computationally efficient since CPUs and GPUs cannot exploit the full
power of vectorization.
● This suggests that there might be something in between, and in fact, that is what we have been using
so far in the examples we discussed.
● Mini-batch Gradient Descent is considered the cross-over between GD and SGD. In this approach,
instead of iterating through the entire dataset or one observation, we split the dataset into
small subsets (batches) and compute the gradients for each batch.
● Steps involved in mini-batch stochastic gradient descent:
1. Pick a mini-batch
2. Feed it to the neural network
3. Calculate the mean gradient of the mini-batch
4. Use the mean gradient calculated in step 3 to update the weights
5. Repeat steps 1–4 for the mini-batches we created
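The five mini-batch steps can be sketched in plain Python. To keep the example self-contained, a linear model y = w*x + b stands in for the neural network. The dataset, batch size of 32 and learning rate are assumptions, not values from the slides.

```python
import random


def minibatch_sgd(data, batch_size=32, learning_rate=0.5, epochs=50, seed=0):
    """Fit y = w*x + b using the mean gradient of each mini-batch."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    updates = 0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]           # 1. pick a mini-batch
            errors = [(w * x + b) - y for x, y in batch]          # 2. forward pass
            grad_w = sum(e * x for e, (x, _) in zip(errors, batch)) / len(batch)
            grad_b = sum(errors) / len(batch)                     # 3. mean gradient
            w -= learning_rate * grad_w                           # 4. update the weights
            b -= learning_rate * grad_b
            updates += 1                                          # 5. repeat per batch
    return w, b, updates


# Hypothetical dataset: 1000 rows generated from y = 2x + 1.
data_rng = random.Random(1)
dataset = [(x, 2.0 * x + 1.0) for x in (data_rng.random() for _ in range(1000))]

w, b, updates = minibatch_sgd(dataset)
print(updates // 50)  # updates per epoch: 31 full batches plus one batch of 8
print(w, b)
```

With 1000 rows and a batch size of 32, each epoch makes 32 updates, which sits between the 1 update of full-batch GD and the 1000 updates of SGD.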
● Minibatch stochastic gradient descent is able to trade off convergence speed and computation
efficiency. A minibatch size of 10 is more efficient than stochastic gradient descent; a minibatch size
of 100 even outperforms GD in terms of runtime.
Advantages and Disadvantages
Advantages of Mini-Batch Gradient Descent:
● Reduces the variance of the parameter updates, which leads to more stable convergence
● Speeds up learning
● Helps to estimate the approximate location of the actual minimum
Disadvantages of Mini-Batch Gradient Descent:
● The loss is computed for each mini-batch, so the total loss needs to be accumulated across all
mini-batches
What is Adam Optimizer
The Adam optimization algorithm is an extension to stochastic gradient descent that has recently
seen broader adoption for deep learning applications in computer vision and natural language
processing.
The method is really efficient when working with large problems involving a lot of data or parameters.
Adam is an adaptive learning rate method, which means it computes individual learning rates for
different parameters. Its name is derived from adaptive moment estimation.
How Adam Optimizer Works
The method computes individual adaptive learning rates for different parameters from estimates of
the first and second moments of the gradients.
The Adam optimizer combines two gradient descent methodologies:
1. Momentum:
This algorithm is used to accelerate the gradient descent algorithm by taking into consideration
the ‘exponentially weighted average’ of the gradients. Using averages makes the algorithm
converge towards the minima at a faster pace.
2. Root Mean Square Propagation (RMSP):
It maintains per-parameter learning rates that are adapted based on the average of recent
magnitudes of the gradients for the weight (e.g. how quickly it is changing). This means the
algorithm does well on online and non-stationary problems (e.g. noisy).
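The combination of the two ideas can be sketched for a single parameter. This is a minimal illustration of the standard Adam update rule, not the code from the demo. The objective f(w) = (w - 3)^2, the learning rate and the step count are assumptions, and beta1 = 0.9, beta2 = 0.999, eps = 1e-8 are the commonly used default values.

```python
import math


def adam_minimize(grad, start, learning_rate=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a one-dimensional function with the Adam update rule."""
    w = start
    m = 0.0  # first moment: exponentially weighted average of gradients (momentum)
    v = 0.0  # second moment: exponentially weighted average of squared gradients (RMSP)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Hypothetical objective: f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = adam_minimize(lambda w: 2 * (w - 3), start=0.0)
print(w_min)  # approaches the minimum at w = 3
```

Dividing by the square root of the second moment is what gives each parameter its own effective step size, while the first moment smooths the direction of travel.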
Benefits of Adam Optimizer
The benefits of using Adam include:
● Straightforward to implement.
● Computationally efficient.
● Low memory requirements.
● Well suited for problems that are large in terms of data and/or parameters.
● Appropriate for problems with very noisy and/or sparse gradients.
● Hyper-parameters have an intuitive interpretation and typically require little tuning.
Demo
Thank You !

Contenu connexe

Tendances

Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
Knoldus Inc.
 
backpropagation in neural networks
backpropagation in neural networksbackpropagation in neural networks
backpropagation in neural networks
Akash Goel
 
Feature Selection in Machine Learning
Feature Selection in Machine LearningFeature Selection in Machine Learning
Feature Selection in Machine Learning
Upekha Vandebona
 
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Simplilearn
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
Databricks
 
Hyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningHyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine Learning
Francesco Casalegno
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
Prof. Neeta Awasthy
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
Mostafa G. M. Mostafa
 
Feature selection
Feature selectionFeature selection
Feature selection
Dong Guo
 
Wrapper feature selection method
Wrapper feature selection methodWrapper feature selection method
Wrapper feature selection method
Amir Razmjou
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
YashwantGahlot1
 
Performance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning Algorithms
Kush Kulshrestha
 
An overview of gradient descent optimization algorithms
An overview of gradient descent optimization algorithms An overview of gradient descent optimization algorithms
An overview of gradient descent optimization algorithms
Hakky St
 
Temporal difference learning
Temporal difference learningTemporal difference learning
Temporal difference learning
Jie-Han Chen
 
Multilayer perceptron
Multilayer perceptronMultilayer perceptron
Multilayer perceptron
omaraldabash
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
Benjamin Bengfort
 
Markov Chain Monte Carlo Methods
Markov Chain Monte Carlo MethodsMarkov Chain Monte Carlo Methods
Markov Chain Monte Carlo Methods
Francesco Casalegno
 
Deep learning presentation
Deep learning presentationDeep learning presentation
Deep learning presentation
Tunde Ajose-Ismail
 
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | EdurekaSupervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
Edureka!
 
Machine Learning Algorithms
Machine Learning AlgorithmsMachine Learning Algorithms
Machine Learning Algorithms
DezyreAcademy
 

Tendances (20)

Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
 
backpropagation in neural networks
backpropagation in neural networksbackpropagation in neural networks
backpropagation in neural networks
 
Feature Selection in Machine Learning
Feature Selection in Machine LearningFeature Selection in Machine Learning
Feature Selection in Machine Learning
 
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Hyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine LearningHyperparameter Optimization for Machine Learning
Hyperparameter Optimization for Machine Learning
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
 
Feature selection
Feature selectionFeature selection
Feature selection
 
Wrapper feature selection method
Wrapper feature selection methodWrapper feature selection method
Wrapper feature selection method
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Performance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning Algorithms
 
An overview of gradient descent optimization algorithms
An overview of gradient descent optimization algorithms An overview of gradient descent optimization algorithms
An overview of gradient descent optimization algorithms
 
Temporal difference learning
Temporal difference learningTemporal difference learning
Temporal difference learning
 
Multilayer perceptron
Multilayer perceptronMultilayer perceptron
Multilayer perceptron
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Markov Chain Monte Carlo Methods
Markov Chain Monte Carlo MethodsMarkov Chain Monte Carlo Methods
Markov Chain Monte Carlo Methods
 
Deep learning presentation
Deep learning presentationDeep learning presentation
Deep learning presentation
 
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | EdurekaSupervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
 
Machine Learning Algorithms
Machine Learning AlgorithmsMachine Learning Algorithms
Machine Learning Algorithms
 

Similaire à Methods of Optimization in Machine Learning

4. OPTIMIZATION NN AND FL.pptx
4. OPTIMIZATION NN AND FL.pptx4. OPTIMIZATION NN AND FL.pptx
4. OPTIMIZATION NN AND FL.pptx
kumarkaushal17
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Universitat Politècnica de Catalunya
 
Regresión
RegresiónRegresión
Regression ppt
Regression pptRegression ppt
Regression ppt
SuyashSingh70
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
MohamedAliHabib3
 
Deep learning crash course
Deep learning crash courseDeep learning crash course
Deep learning crash course
Vishwas N
 
A Novel Methodology to Implement Optimization Algorithms in Machine Learning
A Novel Methodology to Implement Optimization Algorithms in Machine LearningA Novel Methodology to Implement Optimization Algorithms in Machine Learning
A Novel Methodology to Implement Optimization Algorithms in Machine Learning
Venkata Karthik Gullapalli
 
Dnn guidelines
Dnn guidelinesDnn guidelines
Dnn guidelines
Naitik Shukla
 
Feature Scaling and Normalization Feature Scaling and Normalization.pptx
Feature Scaling and Normalization Feature Scaling and Normalization.pptxFeature Scaling and Normalization Feature Scaling and Normalization.pptx
Feature Scaling and Normalization Feature Scaling and Normalization.pptx
Nishant83346
 
Deep learning concepts
Deep learning conceptsDeep learning concepts
Deep learning concepts
Joe li
 
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
Maninda Edirisooriya
 
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
PATHALAMRAJESH
 
Sample_Subjective_Questions_Answers (1).pdf
Sample_Subjective_Questions_Answers (1).pdfSample_Subjective_Questions_Answers (1).pdf
Sample_Subjective_Questions_Answers (1).pdf
AaryanArora10
 
Unit 2-ML.pptx
Unit 2-ML.pptxUnit 2-ML.pptx
Unit 2-ML.pptx
Chitrachitrap
 
A Framework for Scene Recognition Using Convolutional Neural Network as Featu...
A Framework for Scene Recognition Using Convolutional Neural Network as Featu...A Framework for Scene Recognition Using Convolutional Neural Network as Featu...
A Framework for Scene Recognition Using Convolutional Neural Network as Featu...
Tahmid Abtahi
 
Everything You Wanted to Know About Optimization
Everything You Wanted to Know About OptimizationEverything You Wanted to Know About Optimization
Everything You Wanted to Know About Optimization
indico data
 
Random Forest Decision Tree.pptx
Random Forest Decision Tree.pptxRandom Forest Decision Tree.pptx
Random Forest Decision Tree.pptx
Ramakrishna Reddy Bijjam
 
Dimd_m_004 DL.pdf
Dimd_m_004 DL.pdfDimd_m_004 DL.pdf
Dimd_m_004 DL.pdf
juan631
 
Linear programming models - U2.pptx
Linear programming models - U2.pptxLinear programming models - U2.pptx
Linear programming models - U2.pptx
MariaBurgos55
 
Paper review: Learned Optimizers that Scale and Generalize.
Paper review: Learned Optimizers that Scale and Generalize.Paper review: Learned Optimizers that Scale and Generalize.
Paper review: Learned Optimizers that Scale and Generalize.
Wuhyun Rico Shin
 

Similaire à Methods of Optimization in Machine Learning (20)

4. OPTIMIZATION NN AND FL.pptx
4. OPTIMIZATION NN AND FL.pptx4. OPTIMIZATION NN AND FL.pptx
4. OPTIMIZATION NN AND FL.pptx
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
Regresión
RegresiónRegresión
Regresión
 
Regression ppt
Regression pptRegression ppt
Regression ppt
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
Deep learning crash course
Deep learning crash courseDeep learning crash course
Deep learning crash course
 
A Novel Methodology to Implement Optimization Algorithms in Machine Learning
A Novel Methodology to Implement Optimization Algorithms in Machine LearningA Novel Methodology to Implement Optimization Algorithms in Machine Learning
A Novel Methodology to Implement Optimization Algorithms in Machine Learning
 
Dnn guidelines
Dnn guidelinesDnn guidelines
Dnn guidelines
 
Feature Scaling and Normalization Feature Scaling and Normalization.pptx
Feature Scaling and Normalization Feature Scaling and Normalization.pptxFeature Scaling and Normalization Feature Scaling and Normalization.pptx
Feature Scaling and Normalization Feature Scaling and Normalization.pptx
 
Deep learning concepts
Deep learning conceptsDeep learning concepts
Deep learning concepts
 
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
 
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
 
Sample_Subjective_Questions_Answers (1).pdf
Sample_Subjective_Questions_Answers (1).pdfSample_Subjective_Questions_Answers (1).pdf
Sample_Subjective_Questions_Answers (1).pdf
 
Unit 2-ML.pptx
Unit 2-ML.pptxUnit 2-ML.pptx
Unit 2-ML.pptx
 
A Framework for Scene Recognition Using Convolutional Neural Network as Featu...
A Framework for Scene Recognition Using Convolutional Neural Network as Featu...A Framework for Scene Recognition Using Convolutional Neural Network as Featu...
A Framework for Scene Recognition Using Convolutional Neural Network as Featu...
 
Everything You Wanted to Know About Optimization
Everything You Wanted to Know About OptimizationEverything You Wanted to Know About Optimization
Everything You Wanted to Know About Optimization
 
Random Forest Decision Tree.pptx
Random Forest Decision Tree.pptxRandom Forest Decision Tree.pptx
Random Forest Decision Tree.pptx
 
Dimd_m_004 DL.pdf
Dimd_m_004 DL.pdfDimd_m_004 DL.pdf
Dimd_m_004 DL.pdf
 
Linear programming models - U2.pptx
Linear programming models - U2.pptxLinear programming models - U2.pptx
Linear programming models - U2.pptx
 
Paper review: Learned Optimizers that Scale and Generalize.
Paper review: Learned Optimizers that Scale and Generalize.Paper review: Learned Optimizers that Scale and Generalize.
Paper review: Learned Optimizers that Scale and Generalize.
 

Plus de Knoldus Inc.

Terratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructureTerratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructure
Knoldus Inc.
 
Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)
Knoldus Inc.
 
Secure practices with dot net services.pptx
Secure practices with dot net services.pptxSecure practices with dot net services.pptx
Secure practices with dot net services.pptx
Knoldus Inc.
 
Distributed Cache with dot microservices
Distributed Cache with dot microservicesDistributed Cache with dot microservices
Distributed Cache with dot microservices
Knoldus Inc.
 
Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)
Knoldus Inc.
 
Using InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in JmeterUsing InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in Jmeter
Knoldus Inc.
 
Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)
Knoldus Inc.
 
Stakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) PresentationStakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) Presentation
Knoldus Inc.
 
Introduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) PresentationIntroduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) Presentation
Knoldus Inc.
 
Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)
Knoldus Inc.
 
Exploring Terramate DevOps (Presentation)
Exploring Terramate DevOps (Presentation)Exploring Terramate DevOps (Presentation)
Exploring Terramate DevOps (Presentation)
Knoldus Inc.
 
Clean Code in Test Automation Differentiating Between the Good and the Bad
Clean Code in Test Automation  Differentiating Between the Good and the BadClean Code in Test Automation  Differentiating Between the Good and the Bad
Clean Code in Test Automation Differentiating Between the Good and the Bad
Knoldus Inc.
 
Integrating AI Capabilities in Test Automation
Integrating AI Capabilities in Test AutomationIntegrating AI Capabilities in Test Automation
Integrating AI Capabilities in Test Automation
Knoldus Inc.
 
State Management with NGXS in Angular.pptx
State Management with NGXS in Angular.pptxState Management with NGXS in Angular.pptx
State Management with NGXS in Angular.pptx
Knoldus Inc.
 
Authentication in Svelte using cookies.pptx
Authentication in Svelte using cookies.pptxAuthentication in Svelte using cookies.pptx
Authentication in Svelte using cookies.pptx
Knoldus Inc.
 
OAuth2 Implementation Presentation (Java)
OAuth2 Implementation Presentation (Java)OAuth2 Implementation Presentation (Java)
OAuth2 Implementation Presentation (Java)
Knoldus Inc.
 
Supply chain security with Kubeclarity.pptx
Supply chain security with Kubeclarity.pptxSupply chain security with Kubeclarity.pptx
Supply chain security with Kubeclarity.pptx
Knoldus Inc.
 
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML Parsing
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML ParsingMastering Web Scraping with JSoup Unlocking the Secrets of HTML Parsing
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML Parsing
Knoldus Inc.
 
Akka gRPC Essentials A Hands-On Introduction
Akka gRPC Essentials A Hands-On IntroductionAkka gRPC Essentials A Hands-On Introduction
Akka gRPC Essentials A Hands-On Introduction
Knoldus Inc.
 
Entity Core with Core Microservices.pptx
Entity Core with Core Microservices.pptxEntity Core with Core Microservices.pptx
Entity Core with Core Microservices.pptx
Knoldus Inc.
 

Plus de Knoldus Inc. (20)

Terratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructureTerratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructure
 
Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)
 
Secure practices with dot net services.pptx
Secure practices with dot net services.pptxSecure practices with dot net services.pptx
Secure practices with dot net services.pptx
 
Distributed Cache with dot microservices
Distributed Cache with dot microservicesDistributed Cache with dot microservices
Distributed Cache with dot microservices
 
Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)
 
Using InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in JmeterUsing InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in Jmeter
 
Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)
 
Stakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) PresentationStakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) Presentation
 
Introduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) PresentationIntroduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) Presentation
 
Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)
 
Exploring Terramate DevOps (Presentation)
Exploring Terramate DevOps (Presentation)Exploring Terramate DevOps (Presentation)
Exploring Terramate DevOps (Presentation)
 
Clean Code in Test Automation Differentiating Between the Good and the Bad
Clean Code in Test Automation  Differentiating Between the Good and the BadClean Code in Test Automation  Differentiating Between the Good and the Bad
Clean Code in Test Automation Differentiating Between the Good and the Bad
 
Integrating AI Capabilities in Test Automation
Integrating AI Capabilities in Test AutomationIntegrating AI Capabilities in Test Automation
Integrating AI Capabilities in Test Automation
 
State Management with NGXS in Angular.pptx
State Management with NGXS in Angular.pptxState Management with NGXS in Angular.pptx
State Management with NGXS in Angular.pptx
 
Authentication in Svelte using cookies.pptx
Authentication in Svelte using cookies.pptxAuthentication in Svelte using cookies.pptx
Authentication in Svelte using cookies.pptx
 
OAuth2 Implementation Presentation (Java)
OAuth2 Implementation Presentation (Java)OAuth2 Implementation Presentation (Java)
OAuth2 Implementation Presentation (Java)
 
Supply chain security with Kubeclarity.pptx
Supply chain security with Kubeclarity.pptxSupply chain security with Kubeclarity.pptx
Supply chain security with Kubeclarity.pptx
 
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML Parsing
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML ParsingMastering Web Scraping with JSoup Unlocking the Secrets of HTML Parsing
Mastering Web Scraping with JSoup Unlocking the Secrets of HTML Parsing
 
Akka gRPC Essentials A Hands-On Introduction
Akka gRPC Essentials A Hands-On IntroductionAkka gRPC Essentials A Hands-On Introduction
Akka gRPC Essentials A Hands-On Introduction
 
Entity Core with Core Microservices.pptx
Entity Core with Core Microservices.pptxEntity Core with Core Microservices.pptx
Entity Core with Core Microservices.pptx
 

Methods of Optimization in Machine Learning

  • 1. Presented By: Aayush Srivastava & Divyank Saxena Methods of Optimization in Machine Learning
  • 2. KnolX Etiquettes. Lack of etiquette and manners is a huge turn off. Punctuality: Join the session 5 minutes prior to the session start time. We start on time and conclude on time! Feedback: Make sure to submit constructive feedback for all sessions, as it is very helpful for the presenter. Silent Mode: Keep your mobile devices in silent mode; feel free to move out of the session in case you need to attend an urgent call. Avoid Disturbance: Avoid unwanted chit chat during the session.
  • 3. Our Agenda 01 What is Optimization in Machine Learning 02 What is Gradient Descent 03 What is Stochastic Gradient Descent 04 What is Minibatch Stochastic Gradient 05 What is Adam optimization 06 Demo
  • 4. What is Optimization in ML ● Optimization in Machine Learning is a technique used to find the best set of parameters for a given model to minimize a loss function and improve its performance. It is an essential step in the training process of a machine learning model. ● The goal of optimization is to find the best weights and biases for the model, so that it can make accurate predictions. ● Optimization is used in machine learning because models typically have many parameters, and finding the best values for those parameters can be a challenging task. ● With optimization techniques, the model can automatically search for the best parameters, rather than relying on manual tuning by the user.
  • 5. What is Cost Function ● A cost function measures the error between predictions and their actual values across the whole dataset. ● Minimizing the cost function helps the learning algorithm find the optimal set of parameters, such as weights and biases, that produce the best predictions. ● The cost function is a measure of how wrong the model is in estimating the relationship between X (input) and Y (output). Parameters: - m is the number of samples. - The sum runs from i = 1 to m. - Each term is the hypothesis value h(x) minus the actual value y, squared.
  • 6. What is Cost Function ● Let's run through the calculation for best_fit_1. 1. The hypothesis is 0.50. This is the h_θ(x^(i)) part, the value we think is correct. 2. The actual value for the sample data is 1.00, so we are left with (0.50 - 1.00)^2, which is 0.25. 3. Add this result to an array called results and do the same for all three points. 4. results = [0.25, 2.25, 4.00] 5. Finally, add them all up and multiply by ⅙ (that is, 1/(2m) with m = 3). The cost for best_fit_1 is 1.083.
  • 7. What is Cost Function ● COST: best_fit_1: 1.083 best_fit_2: 0.083 best_fit_3: 0.25 ● A lower cost represents a smaller difference between the predictions and the actual values.
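
The best_fit_1 calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' demo code: the function name `cost` is hypothetical, and the prediction and target values are assumptions chosen only so that the squared errors match the slide's results array [0.25, 2.25, 4.00].

```python
def cost(predictions, targets):
    """Cost as described on the slides: (1 / (2m)) * sum((h - y)^2)."""
    m = len(targets)
    squared_errors = [(h - y) ** 2 for h, y in zip(predictions, targets)]
    return sum(squared_errors) / (2 * m)

# Assumed sample data: picked so the squared errors are [0.25, 2.25, 4.00].
predictions = [0.5, 1.0, 1.5]
targets = [1.0, 2.5, 3.5]

best_fit_1_cost = cost(predictions, targets)
print(round(best_fit_1_cost, 3))  # 1.083
```

With three samples, the factor 1/(2m) is the ⅙ used in step 5.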
  • 8. What is Loss Function ● A loss function, also known as an objective function, is a mathematical measure of how well a model is able to make predictions that match the true values. ● A loss function measures the error between a single prediction and the corresponding actual value. ● Loss and cost functions are both ways of measuring the error in machine learning predictions. Loss functions measure the error per observation, whilst cost functions measure the error over all observations. Types: 1. Mean Squared Error (MSE): measures the average squared difference between the predicted values and the true values. 2. Mean Absolute Error (MAE): measures the average absolute difference between the predicted values and the true values.
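
To make the two error measures concrete, here is a small sketch in plain Python. The function names `mse` and `mae` and the sample values are illustrative assumptions, not part of the original slides.

```python
def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Assumed toy values; the per-sample errors are 1, 0, -2, -1.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

print(mse(y_true, y_pred))  # 1.5
print(mae(y_true, y_pred))  # 1.0
```

Because MSE squares each error, the single error of 2 contributes more than the two errors of 1 combined, which is why MSE is more sensitive to outliers than MAE.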
  • 9. What is Gradient Descent ● Gradient, in plain terms, means the slope or slant of a surface. So gradient descent literally means descending a slope to reach the lowest point on that surface. ● Gradient descent enables a model to learn the gradient, or direction, that it should take in order to reduce errors (differences between actual y and predicted y). ● The algorithm tries to find a minimum of a function iteratively.
  • 10. What is Learning Rate ● The learning rate is a hyperparameter in machine learning that determines the step size at which the optimization algorithm updates the model's parameters. It is used to control the speed at which the model learns.
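
The interaction between gradient descent and the learning rate can be illustrated with a one-parameter sketch. The objective f(w) = (w - 3)^2, the helper name `gradient_descent`, and the two learning rates are assumptions chosen for illustration only.

```python
def gradient_descent(grad, w0, learning_rate, steps):
    """Repeatedly step against the gradient: w <- w - learning_rate * grad(w)."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

def grad_f(w):
    # Gradient of the assumed objective f(w) = (w - 3)^2, whose minimum is at w = 3.
    return 2.0 * (w - 3.0)

# A moderate learning rate moves steadily toward the minimum.
w_good = gradient_descent(grad_f, w0=0.0, learning_rate=0.1, steps=50)

# A learning rate that is too large overshoots further on every step and diverges.
w_bad = gradient_descent(grad_f, w0=0.0, learning_rate=1.1, steps=50)

print(w_good, w_bad)
```

On this objective each update multiplies the distance to the minimum by (1 - 2 * learning_rate), so 0.1 shrinks it by a factor of 0.8 per step while 1.1 grows it by a factor of 1.2 in magnitude.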
  • 11. Limitations of Gradient Descent ● Gradient descent has some limitations and drawbacks that can affect its performance and efficiency. ● Local minima: Gradient descent can get stuck in a local minimum, which may not be the global minimum, so the optimization will not produce the best result. ● Vanishing gradient: When training deep neural networks, the gradients can become very small, leading to the vanishing gradient problem, which can slow down or prevent convergence.
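
The local-minimum limitation can be seen on a small non-convex example. The function f(w) = w^4 - 3w^2 + w, the starting points, and the step settings below are assumptions for illustration; this function has a global minimum near w = -1.30 and a shallower local minimum near w = 1.13.

```python
def f(w):
    # Assumed non-convex objective with two minima.
    return w ** 4 - 3.0 * w ** 2 + w

def grad_f(w):
    return 4.0 * w ** 3 - 6.0 * w + 1.0

def gradient_descent(w0, learning_rate=0.01, steps=1000):
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad_f(w)
    return w

w_from_left = gradient_descent(w0=-2.0)   # settles in the global minimum
w_from_right = gradient_descent(w0=2.0)   # gets stuck in the local minimum

print(w_from_left, f(w_from_left))
print(w_from_right, f(w_from_right))
```

Both runs converge, but the run started at w = 2.0 ends with a higher function value, because plain gradient descent only follows the local slope.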
  • 12. What is Stochastic Gradient Descent ● Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent optimization algorithm that updates the parameters of a model in a more efficient and faster way. ● "Stochastic", in plain terms, means "random". ● In SGD, at each step, the algorithm calculates the gradient for one observation picked at random, instead of calculating the gradient for the entire dataset. ● For example, given a dataset that contains 1000 rows, SGD updates the model parameters 1000 times in one complete pass over the dataset, instead of once as in Gradient Descent.
  • 13. What is Stochastic Gradient Descent ● In the left diagram we have SGD, which takes one gradient descent step per training example; in the right diagram we have GD, which takes one step per pass over the entire training set. ● This represents a significant performance improvement when the dataset contains millions of observations.
  • 14. Advantages of Stochastic Gradient Descent ● It is easier to fit into memory, because only a single training sample is processed by the network at a time. ● For larger datasets it can converge faster, as it updates the parameters more frequently. ● Because of the frequent updates, the steps taken towards the minimum of the loss function oscillate, which can help the optimizer escape local minima of the loss function.
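
The per-example update described above can be sketched for a one-parameter linear model y = w * x. Everything here is an assumption made for illustration: the function name `sgd_linear`, the noise-free toy data with true w = 2, and the choice to visit the examples in a freshly shuffled order each epoch (a common way to pick observations at random).

```python
import random

def sgd_linear(xs, ys, learning_rate=0.1, epochs=20, seed=0):
    """Fit y = w * x with one parameter update per training example."""
    rng = random.Random(seed)
    w = 0.0
    updates = 0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # random order of single observations
        for i in indices:
            # Gradient of the per-example squared error (w * x - y)^2.
            grad = 2.0 * xs[i] * (w * xs[i] - ys[i])
            w -= learning_rate * grad
            updates += 1
    return w, updates

# Assumed toy data generated from y = 2x.
xs = [i / 10 for i in range(1, 11)]
ys = [2.0 * x for x in xs]

w, updates = sgd_linear(xs, ys)
print(w, updates)
```

With 10 rows and 20 epochs the model is updated 200 times, mirroring the slide's point that SGD performs one update per row rather than one per pass over the dataset.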
  • 15. What is Minibatch Stochastic Gradient ● So far we have encountered two extremes in the approach to gradient-based learning. ● Gradient Descent uses the full dataset to compute gradients and to update parameters, one pass at a time. Conversely, Stochastic Gradient Descent processes one training example at a time to make progress. Each has its own drawbacks. ● Gradient descent is not particularly data efficient whenever data is very similar. Stochastic gradient descent is not particularly computationally efficient, since CPUs and GPUs cannot exploit the full power of vectorization. ● This suggests that there might be something in between, and in fact, that is what we have been using so far in the examples we discussed.
  • 16. What is Minibatch Stochastic Gradient ● Mini-batch gradient descent is considered to be the cross-over between GD and SGD. In this approach, instead of iterating through the entire dataset or one observation at a time, we split the dataset into small subsets (batches) and compute the gradients for each batch. ● Steps involved in mini-batch stochastic gradient descent: 1. Pick a mini-batch. 2. Feed it to the neural network. 3. Calculate the mean gradient of the mini-batch. 4. Use the mean gradient calculated in step 3 to update the weights. 5. Repeat steps 1–4 for each of the mini-batches we created.
  • 17. What is Minibatch Stochastic Gradient ● Minibatch stochastic gradient descent is able to trade off convergence speed and computational efficiency. A minibatch size of 10 is more efficient than stochastic gradient descent; a minibatch size of 100 even outperforms GD in terms of runtime.
  • 18. Advantages and Disadvantages. Advantages of Mini-Batch Gradient Descent: ● Reduces the variance of the parameter updates and hence leads to more stable convergence. ● Speeds up learning. ● Helps to estimate the approximate location of the actual minimum. Disadvantages of Mini-Batch Gradient Descent: ● The loss is computed for each mini-batch, so the total loss needs to be accumulated across all mini-batches.
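
The mini-batch steps (pick a batch, compute its mean gradient, update the weights, repeat) can be sketched as follows. As before, this is an illustrative assumption rather than the presenters' demo: a one-parameter model y = w * x stands in for the neural network, and the function name `minibatch_sgd`, the toy data, and the batch size of 4 are all hypothetical.

```python
import random

def minibatch_sgd(xs, ys, batch_size=4, learning_rate=0.1, epochs=50, seed=0):
    """Fit y = w * x using the mean gradient of each mini-batch."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]        # step 1: pick a mini-batch
            grads = [2.0 * xs[i] * (w * xs[i] - ys[i])       # step 2: evaluate the model
                     for i in batch]
            mean_grad = sum(grads) / len(grads)              # step 3: mean gradient
            w -= learning_rate * mean_grad                   # step 4: update the weight
    return w                                                 # step 5 is the loop itself

# Assumed toy data generated from y = 2x.
xs = [i / 10 for i in range(1, 13)]
ys = [2.0 * x for x in xs]

w_minibatch = minibatch_sgd(xs, ys)
print(w_minibatch)
```

Averaging the gradient over a batch is what reduces the variance of each update; in a real framework the per-batch computation would also be vectorized, which is where the runtime benefit comes from.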
  • 19. What is Adam Optimizer ● The Adam optimization algorithm is an extension to stochastic gradient descent that has seen broad adoption for deep learning applications in computer vision and natural language processing. ● The method is efficient when working with large problems involving a lot of data or parameters. ● Adam is an adaptive learning rate method, which means it computes individual learning rates for different parameters. ● Its name is derived from adaptive moment estimation.
  • 20. How the Adam Optimizer Works ● The method computes individual adaptive learning rates for different parameters from estimates of the first and second moments of the gradients. ● The Adam optimizer combines two gradient descent methodologies: 1. Momentum: This algorithm accelerates gradient descent by taking into consideration the exponentially weighted average of the gradients. Using averages makes the algorithm converge towards the minimum at a faster pace. 2. Root Mean Square Propagation (RMSProp): It maintains per-parameter learning rates that are adapted based on the average of recent magnitudes of the gradients for the weight (e.g. how quickly it is changing). This means the algorithm does well on online and non-stationary problems (e.g. noisy ones).
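
The combination of the momentum term (first moment) and the RMSProp-style scaling (second moment) can be sketched for a single parameter. This follows the standard Adam update with its commonly used default decay rates (0.9 and 0.999); the function name `adam_minimize`, the objective f(w) = (w - 3)^2, the learning rate of 0.1, and the step count are assumptions for illustration.

```python
import math

def adam_minimize(grad, w0, learning_rate=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a one-parameter function with the Adam update rule."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first moment: averaged gradient (momentum)
        v = beta2 * v + (1 - beta2) * g * g    # second moment: averaged squared gradient
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Assumed objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_adam = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_adam)
```

Dividing by the square root of the second moment is what gives each parameter its own effective step size; with many parameters, `m` and `v` would simply be kept per parameter.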
  • 21. Benefits of Adam Optimizer. Attractive benefits of using Adam include: ● Straightforward to implement. ● Computationally efficient. ● Low memory requirements. ● Well suited for problems that are large in terms of data and/or parameters. ● Appropriate for problems with very noisy and/or sparse gradients. ● Hyper-parameters have an intuitive interpretation and typically require little tuning.
  • 22. Demo
  • 23. Thank You!