17. Decision Tree
(If/then/else rules. Handles noncontiguous data. Can also be used for regression.)
Used in pattern recognition.
Used in option pricing in finance & in identifying disease & risk trends.
Robust to errors.
Can handle missing values nicely.
Can handle both categorical and numerical variables.
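The bullets above can be made concrete with a short, hedged sketch using scikit-learn's DecisionTreeClassifier (the toy data below is hypothetical):

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: two numeric features, binary label.
# The label follows the first feature, so the tree only needs a
# single if/then/else split to separate the classes.
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 0, 1]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)
preds = clf.predict([[1, 1], [0, 0]])
```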
Random Forest
(Finds the best split among random subsets of features. Can also be used for regression.)
Used in industrial applications.
Used for both classification and regression analysis tasks.
Builds a forest from a number of decision trees, each randomized by sampling the data & features.
Runs efficiently on large databases.
Has high classification accuracy.
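As a hedged sketch of the ensemble idea, scikit-learn's RandomForestClassifier on a synthetic dataset (all sizes and parameters below are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical synthetic data standing in for a "large database".
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# 100 randomized trees vote; each tree sees a bootstrap sample of the
# rows and a random subset of the features at every split.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X, y)
train_acc = rf.score(X, y)  # accuracy on the training data
```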
Support Vector Machine
(Maximum-margin classifier. A fundamental data science algorithm.)
Used in business applications – such as comparing the relative performance of stocks over
a period of time.
Classifies data sets into different classes through a hyperplane.
Separates the classes & maximizes the margin between them.
Requires clean, well-prepared data for accurate & efficient results.
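A minimal sketch of the maximum-margin idea with scikit-learn's SVC (hypothetical 1-D toy data; the "hyperplane" here is just a point on the number line):

```python
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated classes on a line.
X = [[-2.0], [-1.0], [1.0], [2.0]]
y = [0, 0, 1, 1]

# A linear-kernel SVM places the separating boundary so that the
# margin to the nearest points of each class is maximized.
svm = SVC(kernel="linear", C=1.0)
svm.fit(X, y)
preds = svm.predict([[-3.0], [3.0]])
```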
18. Principal Component Analysis (PCA)
(Distils the feature space into the components that describe the greatest variance.)
Used in gene expression analysis & stock market predictions.
Used in pattern classification tasks that ignore class labels.
It is a dimensionality reduction algorithm.
Used for speeding up model training by reducing the number of features.
Used for making compelling visualizations of complex datasets.
It identifies patterns in data by exploiting correlations between variables.
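A hedged sketch of the variance-distilling idea with scikit-learn's PCA (the correlated 3-D data below is hypothetical):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical correlated 3-D data: the second column is almost a
# multiple of the first, so most variance lies along one direction.
base = rng.normal(size=(200, 1))
X = np.hstack([base,
               2.0 * base + 0.05 * rng.normal(size=(200, 1)),
               0.05 * rng.normal(size=(200, 1))])

pca = PCA(n_components=2)
Z = pca.fit_transform(X)                # reduced 2-D representation
ratio = pca.explained_variance_ratio_   # variance captured per component
```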
KMeans Clustering
(Groups similar data points around centroids.)
Used in grouping images into different categories & detecting different activity types in
motion sensors.
Used for monitoring whether tracked data points change between different groups over
time.
It works by categorizing unstructured data into a number of different groups.
Each data point contains a collection of features.
The algorithm classifies unstructured data & categorizes it based on specific features.
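A minimal sketch of centroid-based grouping with scikit-learn's KMeans (the unlabeled points below are hypothetical):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled points forming two obvious groups.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])

# K-Means assigns each point to the nearest of k centroids,
# then moves the centroids to the mean of their assigned points.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X)
```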
Naive Bayes Classifier
(Updates knowledge step by step with new information.)
Based on Bayes' theorem of probability.
Used in document classification, spam filters, sentiment analysis, etc.
The algorithm is used for ranking pages, indexing relevancy scores & classifying data
categorically.
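A hedged spam-filter sketch with scikit-learn's MultinomialNB (the mini corpus below is made up for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical mini spam-filter corpus.
docs = ["win free money now", "free prize win",
        "meeting at noon", "lunch meeting tomorrow"]
labels = ["spam", "spam", "ham", "ham"]

# Word counts become features; Bayes' theorem combines the per-word
# evidence into a class probability for each document.
vec = CountVectorizer()
X = vec.fit_transform(docs)
nb = MultinomialNB().fit(X, labels)
pred = nb.predict(vec.transform(["free money prize"]))[0]
```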
20. NAME | DESCRIPTION | ADVANTAGES | DISADVANTAGES

LINEAR REGRESSION
The "best fit" line through all data points. Predictions are numerical.
Advantages: Easy to understand – we clearly see what the biggest drivers of the model are.
Disadvantages:
● Sometimes too simple to capture complex relationships between variables.
● Tendency for the model to "overfit".

LOGISTIC REGRESSION
The adaptation of linear regression to problems of classification (e.g., yes/no questions, groups, etc.).
Advantages: Also easy to understand.
Disadvantages:
● Sometimes too simple to capture complex relationships between variables.
● Tendency for the model to "overfit".

DECISION TREE
A graph that uses a branching method to match all possible outcomes of a decision.
Advantages: Easy to understand and implement.
Disadvantages:
● Not often used on its own for prediction because it's also often too simple and not powerful enough for complex data.
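To illustrate the first two rows, a hedged sketch contrasting the numeric output of linear regression with the yes/no output of logistic regression (toy data below is hypothetical):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.arange(10).reshape(-1, 1)
y_numeric = 3.0 * X.ravel() + 2.0         # regression target: a number
y_classes = (X.ravel() >= 5).astype(int)  # classification target: yes/no

lin = LinearRegression().fit(X, y_numeric)    # fits the "best fit" line
log = LogisticRegression().fit(X, y_classes)  # fits a yes/no boundary

slope = lin.coef_[0]         # recovered driver of the linear model
cls = log.predict([[9]])[0]  # predicted class for a new point
```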
21. NAME | DESCRIPTION | ADVANTAGES | DISADVANTAGES

RANDOM FOREST
Takes the average of many decision trees, each of which is made with a sample of the data. Each tree is weaker than a full decision tree, but by combining them we get better overall performance.
Advantages: A sort of "wisdom of the crowd". Tends to result in very high quality models. Fast to train.
Disadvantages:
● Can be slow to output predictions relative to other algorithms.
● Not easy to understand predictions.

GRADIENT BOOSTING
Uses even weaker decision trees that are increasingly focused on "hard" examples.
Advantages: High-performing.
Disadvantages:
● A small change in the feature set or training set can create radical changes in the model.
● Not easy to understand predictions.

NEURAL NETWORKS
Mimics the behavior of the brain. Neural networks are interconnected neurons that pass messages to each other. Deep learning uses several layers of neural networks put one after the other.
Advantages: Can handle extremely complex tasks – no other algorithm comes close in image recognition.
Disadvantages:
● Very, very slow to train, because they have so many layers. Require a lot of power.
● Almost impossible to understand predictions.
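A hedged sketch of the boosting row with scikit-learn's GradientBoostingClassifier (synthetic data and parameters are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical synthetic data.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Each shallow tree is fit to the errors of the ensemble so far,
# so later trees concentrate on the "hard" examples.
gb = GradientBoostingClassifier(n_estimators=100, random_state=0)
gb.fit(X, y)
train_acc = gb.score(X, y)  # accuracy on the training data
```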
34. CNN & RNN
• CNNs are feedforward neural networks which take in fixed-size inputs & give fixed-size outputs.
• For example, image feature classification & video processing.
• RNNs use internal memory.
• RNNs are versatile.
• They use time-series information for giving outputs.
• For example, language processing tasks & text and speech analysis.
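The fixed-input vs. internal-memory contrast can be sketched in plain NumPy (the filter and recurrent weights below are hypothetical):

```python
import numpy as np

# Feedforward, CNN-style: a fixed-size filter slides over a fixed-size
# input; each output depends only on a local window, with no memory.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
kernel = np.array([0.25, 0.5, 0.25])
conv_out = np.convolve(x, kernel, mode="valid")

# Recurrent, RNN-style: a hidden state h carries information from one
# time step to the next, so the order of the inputs matters.
h = 0.0
states = []
for x_t in x:
    h = np.tanh(0.5 * h + 0.3 * x_t)  # hypothetical weights
    states.append(h)
```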
Deep Belief Networks
• Used in the fields of image recognition, video sequence recognition & motion-capture
data.
• Composed of multiple layers of graphical models having directed & undirected
edges.
• A DBN does not use any labels.
• DBNs are generative models.
• A DBN fine-tunes the entire input in sequence as the model is trained.
35. Boltzmann Machine
• These are two-layer neural networks which make stochastic decisions.
• A Boltzmann machine does not discriminate between neurons.
• It learns the distribution of the data from the input & makes inferences on unseen data.
• It is a generative model – it does not expect input, it rather creates it.
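The restricted (two-layer) variant of the Boltzmann machine is available in scikit-learn as BernoulliRBM; a hedged sketch on hypothetical binary data:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(50, 6)).astype(float)  # hypothetical binary data

# Visible layer (6 units) and hidden layer (3 units); training makes
# stochastic updates that learn the distribution of the input data.
rbm = BernoulliRBM(n_components=3, learning_rate=0.05, n_iter=20,
                   random_state=0)
rbm.fit(X)
H = rbm.transform(X)  # hidden-unit activation probabilities
```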
Generative Adversarial Networks
• GANs are used for generating new data.
• A GAN comprises two parts, a discriminator and a generator.
• The generator is like a reverse CNN: it takes a small amount of data & upscales it to
generate new samples.
• The discriminator takes this output & predicts whether it belongs to the real dataset.
• GANs have been used to generate paintings.
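A minimal, hedged sketch of the generator/discriminator game on 1-D data, with hand-derived gradient ascent steps (the data distribution, weights and learning rate are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z, theta):
    # Affine map from noise to a sample; theta = (scale, shift).
    return theta[0] * z + theta[1]

def discriminator(x, w):
    # Logistic unit returning P(x is real); w = (weight, bias).
    return 1.0 / (1.0 + np.exp(-(w[0] * x + w[1])))

theta = np.array([1.0, 0.0])  # generator parameters
w = np.array([0.5, 0.0])      # discriminator parameters
lr = 0.05

for _ in range(500):
    z = rng.normal(size=64)
    real = rng.normal(4.0, 1.0, size=64)  # "real" data ~ N(4, 1)
    fake = generator(z, theta)

    # Discriminator ascends log D(real) + log(1 - D(fake)).
    d_real, d_fake = discriminator(real, w), discriminator(fake, w)
    grad_w0 = np.mean((1 - d_real) * real) + np.mean(-d_fake * fake)
    grad_w1 = np.mean(1 - d_real) + np.mean(-d_fake)
    w = w + lr * np.array([grad_w0, grad_w1])

    # Generator ascends log D(fake): the "non-saturating" objective.
    d_fake = discriminator(generator(z, theta), w)
    grad_scale = np.mean((1 - d_fake) * w[0] * z)
    grad_shift = np.mean((1 - d_fake) * w[0])
    theta = theta + lr * np.array([grad_scale, grad_shift])
```

After training, the generator's shift parameter has moved toward the real data's mean, showing the adversarial pressure in action.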