2. Machine Learning Paradigms: Machine learning systems, supervised
and unsupervised learning, inductive learning, deductive learning,
clustering, support vector machines, case-based reasoning and
learning.
Artificial Neural Networks: Artificial neural networks, single-layer
feedforward networks, multi-layer feedforward networks, radial basis
function networks, design issues of artificial neural networks and
recurrent networks.
PPT BY MADHAV MISHRA 2
3. Machine learning (ML) is a type of artificial intelligence (AI) that allows software
applications to become more accurate at predicting outcomes without being
explicitly programmed to do so.
Machine learning algorithms use historical data as input to predict new output
values.
Recommendation engines are a common use case for machine learning.
Other popular uses include fraud detection, spam filtering, malware threat
detection, business process automation (BPA) and predictive maintenance.
4. Types of machine learning
Classical machine learning is often categorized by how an algorithm learns to
become more accurate in its predictions.
There are four basic approaches:
supervised learning, unsupervised learning, semi-supervised learning and
reinforcement learning.
The type of algorithm a data scientist chooses to use depends on what type of data
they want to predict.
Supervised learning. In this type of machine learning, data scientists supply
algorithms with labeled training data and define the variables they want the
algorithm to assess for correlations. Both the input and the output of the
algorithm are specified.
Unsupervised learning. This type of machine learning involves algorithms that
train on unlabeled data. The algorithm scans through data sets looking for any
meaningful connection. The training data is fixed in advance, but the groupings or
associations the algorithm outputs are discovered from the data rather than given as labels.
5. Semi-supervised learning. This approach to machine learning
involves a mix of the two preceding types. Data scientists may feed
an algorithm a small amount of labeled training data alongside
unlabeled data, and the model is free to explore the data on its own
and develop its own understanding of the data set.
Reinforcement learning. Reinforcement learning is typically used to
teach a machine to complete a multi-step process for which there are
clearly defined rules. Data scientists program an algorithm to
complete a task and give it positive or negative cues as it works out
how to complete a task. But for the most part, the algorithm decides
on its own what steps to take along the way.
6. How supervised machine learning works
Supervised machine learning requires the data
scientist to train the algorithm with both
labeled inputs and desired outputs.
Supervised learning algorithms are good for the
following tasks:
Binary classification. Dividing data into two
categories.
Multi-class classification. Choosing between
more than two types of answers.
Regression modeling. Predicting continuous
values.
Ensembling. Combining the predictions of
multiple machine learning models to produce a
more accurate prediction.
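As a minimal sketch of binary classification from labeled inputs and desired outputs, the example below uses a 1-nearest-neighbour rule. The data values and the names `training_data` and `predict` are made up for illustration; they are not from the slides.

```python
# Labeled training data: (hours_studied, hours_slept) -> passed (1) or failed (0).
# The numbers are invented for illustration only.
training_data = [
    ((1.0, 4.0), 0),
    ((2.0, 5.0), 0),
    ((3.0, 4.5), 0),
    ((6.0, 7.0), 1),
    ((7.0, 8.0), 1),
    ((8.0, 6.5), 1),
]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def predict(features):
    """Return the label of the closest labeled example (1-nearest neighbour)."""
    _, label = min(training_data,
                   key=lambda pair: squared_distance(pair[0], features))
    return label

print(predict((1.5, 4.0)))  # 0
print(predict((7.5, 7.0)))  # 1
```

Any supervised algorithm could stand in for the nearest-neighbour rule here; the point is only that both inputs and desired outputs are supplied.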
8. HOW UNSUPERVISED MACHINE LEARNING WORKS?
Unsupervised machine learning algorithms do
not require data to be labeled.
They sift through unlabeled data to look for
patterns that can be used to group data points
into subsets.
Some deep learning models, such as autoencoders,
are trained this way, although many neural networks
are trained with labeled data.
Unsupervised learning algorithms are good for
the following tasks:
Clustering. Splitting the data set into groups
based on similarity.
Anomaly detection. Identifying unusual data
points in a data set.
Association mining. Identifying sets of items in
a data set that frequently occur together.
Dimensionality Reduction. Reducing the
number of variables in a data set.
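Clustering can be sketched with a plain k-means loop. This is one common clustering method, chosen here as an assumption; the points and starting centroids are invented for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Unlabeled points; no class information is given to the algorithm.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(clusters)
```

The two groups are found from similarity alone, without labels.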
10. HOW SEMI-SUPERVISED LEARNING WORKS?
Semi-supervised learning works by data scientists feeding
a small amount of labeled training data to an algorithm.
From this, the algorithm learns the structure of the
data set, which it can then apply to new, unlabeled data.
The performance of algorithms typically improves when
they train on labeled data sets.
But labeling data can be time-consuming and expensive.
Semi-supervised learning strikes a middle ground
between the performance of supervised learning and the
efficiency of unsupervised learning.
Some areas where semi-supervised learning is used
include:
Machine translation. Teaching algorithms to translate
language based on less than a full dictionary of words.
Fraud detection. Identifying cases of fraud when you only
have a few positive examples.
Labeling data. Algorithms trained on small data sets can
learn to apply data labels to larger sets automatically.
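The labeling idea can be sketched as a simple self-training (pseudo-labeling) step. This is an assumed technique picked for illustration, with invented one-dimensional data: a model fitted on two labeled points labels the rest, then is refitted on everything.

```python
labeled = [(1.0, "low"), (9.0, "high")]        # small labeled set
unlabeled = [1.5, 2.0, 2.5, 7.5, 8.0, 8.5]     # larger unlabeled set

def class_means(examples):
    """Fit a nearest-mean model: one mean value per label."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(x, means):
    """Assign the label whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))

# Step 1: fit on the labeled data only.
means = class_means(labeled)
# Step 2: let the model label the unlabeled data (pseudo-labels).
pseudo_labeled = [(x, classify(x, means)) for x in unlabeled]
# Step 3: refit on labeled plus pseudo-labeled data.
means = class_means(labeled + pseudo_labeled)
print(means)
```

Real systems usually keep only confident pseudo-labels; that filtering is omitted here to keep the sketch short.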
11. HOW REINFORCEMENT LEARNING WORKS?
Reinforcement learning works by programming an
algorithm with a distinct goal and a prescribed set
of rules for accomplishing that goal.
Data scientists also program the algorithm to seek
positive rewards -- which it receives when it
performs an action that is beneficial toward the
ultimate goal -- and avoid punishments -- which it
receives when it performs an action that gets it
farther away from its ultimate goal.
Reinforcement learning is often used in areas like:
Robotics. Robots can learn to perform tasks in the
physical world using this technique.
Video gameplay. Reinforcement learning has been
used to teach bots to play a number of video
games.
Resource management. Given finite resources and
a defined goal, reinforcement learning can help
enterprises plan how to allocate resources.
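The reward and punishment loop can be sketched with tabular Q-learning, one standard reinforcement learning method (an assumption; the slides do not name an algorithm). The corridor environment, reward values and hyperparameters below are invented for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

N_STATES = 5            # corridor cells 0..4; the goal is cell 4
ACTIONS = (-1, +1)      # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

# Q[state][action_index] estimates the long-term reward of taking an action.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """Environment rules: move, stay inside the corridor, return a reward."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else -0.04  # reward or small punishment
    return next_state, reward

for _ in range(500):                      # episodes
    state = 0
    for _ in range(200):                  # cap on steps per episode
        if random.random() < EPSILON:     # explore occasionally
            a = random.randrange(2)
        else:                             # otherwise act greedily
            a = max(range(2), key=lambda i: Q[state][i])
        next_state, reward = step(state, ACTIONS[a])
        Q[state][a] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][a])
        state = next_state
        if state == N_STATES - 1:
            break

policy = ["left" if q[0] > q[1] else "right" for q in Q[:-1]]
print(policy)
```

The algorithm is never told to walk right; it works that out from the rewards it receives.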
13. Today, machine learning is used in a wide range of applications. Perhaps
one of the most well-known examples of machine learning in action is
the recommendation engine that powers Facebook's News Feed.
Facebook uses machine learning to personalize how each member's feed
is delivered. If a member frequently stops to read a particular group's
posts, the recommendation engine will start to show more of that group's
activity earlier in the feed.
Behind the scenes, the engine is attempting to reinforce known patterns
in the member's online behavior. Should the member change patterns
and fail to read posts from that group in the coming weeks, the News
Feed will adjust accordingly.
14. In addition to recommendation engines, other uses for machine learning include the
following:
Customer relationship management -- CRM software can use machine learning
models to analyze email and prompt sales team members to respond to the most
important messages first. More advanced systems can even recommend potentially
effective responses.
Business intelligence -- BI and analytics vendors use machine learning in their
software to identify potentially important data points, patterns of data points and
anomalies.
Human resource information systems -- HRIS systems can use machine learning
models to filter through applications and identify the best candidates for an open
position.
Self-driving cars -- Machine learning algorithms can even make it possible for a semi-
autonomous car to recognize a partially visible object and alert the driver.
Virtual assistants -- Smart assistants typically combine supervised and unsupervised
machine learning models to interpret natural speech and supply context.
15. CHOOSING THE RIGHT MACHINE LEARNING MODEL
The process of choosing the right machine learning
model to solve a problem can be time-consuming if
not approached strategically.
Step 1: Align the problem with potential data
inputs that should be considered for the solution.
This step requires help from data scientists and
experts who have a deep understanding of the
problem.
Step 2: Collect data, format it and label the data if
necessary. This step is typically led by data
scientists, with help from data wranglers.
Step 3: Choose which algorithm(s) to use and test to
see how well they perform. This step is usually
carried out by data scientists.
Step 4: Continue to fine-tune outputs until they
reach an acceptable level of accuracy. This step is
usually carried out by data scientists with
feedback from experts who have a deep
understanding of the problem.
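Step 3 can be sketched as fitting each candidate on training data and scoring it on held-out data. The data set and the two candidate models (`train_majority`, `train_threshold`) are hypothetical, chosen only to keep the example small.

```python
# Invented 1-D data set: (feature, label).
data = [(1.0, 0), (1.2, 0), (2.0, 0), (2.4, 0), (3.0, 0),
        (6.0, 1), (6.5, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
train, test = data[::2], data[1::2]   # simple alternating hold-out split

def train_majority(train):
    """Baseline: always predict the most common training label."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def train_threshold(train):
    """Predict 1 above the midpoint between the two class means."""
    zeros = [x for x, y in train if y == 0]
    ones = [x for x, y in train if y == 1]
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0

def accuracy(model, examples):
    return sum(model(x) == y for x, y in examples) / len(examples)

candidates = {"majority": train_majority, "threshold": train_threshold}
scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Scoring on data the model has not seen is what makes the comparison between candidates meaningful.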
16. Inductive Learning is where we are given examples of a function in the form of
data (x) and the output of the function (f(x)).
The goal of inductive learning is to learn the function for new data (x).
So What is Inductive Learning?
From the perspective of inductive learning, we are given input samples (x) and
output samples (f(x)) and the problem is to estimate the function (f).
Specifically, the problem is to generalize from the samples and the mapping to be
useful to estimate the output for new samples in the future.
In practice it is almost always too hard to recover the function exactly, so we look
for very good approximations of the function.
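As a small illustration of estimating f from samples, the sketch below fits a straight line by least squares. The choice of a linear hypothesis and the sample values are assumptions made for the example.

```python
def fit_line(samples):
    """Least-squares fit of h(x) = slope * x + intercept to (x, f(x)) samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in samples)
             / sum((x - mean_x) ** 2 for x, _ in samples))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Noisy observations of an unknown function (invented values).
samples = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]
h = fit_line(samples)

# Use the learned approximation on a new input that was not in the samples.
print(h(4.0))
```

The fitted `h` is an approximation of f, not f itself, which is the situation the slide describes.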
17. SOME PRACTICAL EXAMPLES OF INDUCTION ARE?
Credit risk assessment.
The x are the properties of the customer.
The f(x) is whether credit is approved or not.
Disease diagnosis.
The x are the properties of the patient.
The f(x) is the disease they suffer from.
Face recognition.
The x are bitmaps of people's faces.
The f(x) is the name assigned to the face.
Automatic steering.
The x are bitmap images from a camera in front of
the car.
The f(x) is the degree the steering wheel should be
turned.
18. WHEN SHOULD YOU USE INDUCTIVE LEARNING?
There are problems where inductive learning is not a
good idea. It is important to know when to use and when not to
use supervised machine learning.
4 problems where inductive learning might be a good
idea:
Problems where there is no human expert. If people do
not know the answer they cannot write a program to
solve it. These are areas of true discovery.
Humans can perform the task but no one can describe
how to do it. There are problems where humans can do
things that computers cannot do, or cannot do well. Examples
include riding a bike or driving a car.
Problems where the desired function changes
frequently. Humans could describe it and they could
write a program to do it, but the problem changes too
often. It is not cost effective. Examples include the
stock market.
Problems where each user needs a custom function. It
is not cost effective to write a custom program for each
user. An example is recommendations of movies or books
on Netflix or Amazon.
19. A FRAMEWORK FOR STUDYING INDUCTIVE LEARNING
Terminology used in machine learning:
Training example: a sample from x including its
output from the target function
Target function: the mapping function f from x to
f(x)
Hypothesis: approximation of f, a candidate
function.
Concept: A boolean target function, positive
examples and negative examples for the 1/0 class
values.
Classifier: Learning program outputs a classifier
that can be used to classify.
Learner: Process that creates the classifier.
Hypothesis space: set of possible approximations of
f that the algorithm can create.
Version space: subset of the hypothesis space that
is consistent with the observed data.
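The terms hypothesis space and version space can be made concrete with a toy example. The hypothesis family (integer threshold rules) and the training examples below are invented for illustration.

```python
# Training examples: (x, label) pairs for a Boolean concept.
examples = [(2, 0), (3, 0), (7, 1), (9, 1)]

# Hypothesis space: threshold rules "label is 1 when x >= t" for t = 0..10.
hypothesis_space = range(11)

def hypothesis(t):
    """Return the candidate function for threshold t."""
    return lambda x: 1 if x >= t else 0

def consistent(t):
    """True if the hypothesis agrees with every training example."""
    h = hypothesis(t)
    return all(h(x) == label for x, label in examples)

# Version space: the hypotheses that are consistent with the observed data.
version_space = [t for t in hypothesis_space if consistent(t)]
print(version_space)  # [4, 5, 6, 7]
```

More training examples can only shrink the version space, since each one rules out the hypotheses that disagree with it.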
33. Acts as a neural network.
Modeled off the human brain.
Does not follow explicit instructions.
Is trained instead of programmed.
37. •Within a neuron, signals travel as electrical pulses
•Between neurons, communication is through chemical
neurotransmitters
•If the inputs to a neuron are greater than its threshold,
the neuron fires, sending an electrical pulse toward other
neurons
39. •Inputs and outputs are 0 (no) or 1 (yes)
•Initially, weights are random
•Provide training input
•Compare output of neural network to desired output
–If same, reinforce patterns
–If different, adjust weights
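The training loop above can be sketched with the classic perceptron learning rule on the logical AND function. The rule, learning rate and seed are assumptions for illustration; in this version, weights are simply left unchanged when the output already matches.

```python
import random

random.seed(1)  # fixed seed so the random start is repeatable

# Truth table for logical AND: inputs and desired outputs are 0 or 1.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [random.uniform(-0.5, 0.5) for _ in range(2)]  # initially random
bias = random.uniform(-0.5, 0.5)
learning_rate = 0.1

def output(x):
    """Fire (1) if the weighted sum of inputs plus bias is positive."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

for epoch in range(1000):
    mistakes = 0
    for x, desired in examples:
        error = desired - output(x)      # 0 when output equals desired output
        if error != 0:                   # different: adjust weights
            mistakes += 1
            for i in range(2):
                weights[i] += learning_rate * error * x[i]
            bias += learning_rate * error
    if mistakes == 0:                    # every example is now correct
        break

print([output(x) for x, _ in examples])  # [0, 0, 0, 1]
```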
46. Weights are numeric values which are multiplied with inputs. In backpropagation, they
are modified to reduce the loss. In simple words, weights are values learned by the
neural network. They self-adjust depending on the difference between predicted outputs
and target outputs.
An activation function is a mathematical formula which helps the neuron to switch
ON/OFF.
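A single neuron can be sketched as a weighted sum passed through an activation function. The step and sigmoid functions are two common choices; the input, weight and bias values are invented for illustration.

```python
import math

def step(z):
    """Hard ON/OFF switch: 1 when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def sigmoid(z):
    """Smooth switch: squashes the weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    """Multiply inputs by weights, add the bias, apply the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

inputs = [1.0, 0.5]
weights = [0.4, -0.6]   # illustrative values, not learned
bias = 0.1

print(neuron(inputs, weights, bias, step))     # 1
print(neuron(inputs, weights, bias, sigmoid))  # about 0.55
```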
48. ● Input layer represents dimensions of the input vector.
● Hidden layer represents the intermediary nodes that divide the input
space into regions with (soft) boundaries. It takes in a set of weighted
inputs and produces output through an activation function.
● Output layer represents the output of the neural network.
50. The perceptron model, proposed by Frank Rosenblatt and later analyzed by Minsky and Papert, is one of the simplest and oldest models of a neuron. It is the
smallest unit of a neural network that does certain computations to detect features or patterns in
the input data. It accepts weighted inputs and applies the activation function to obtain the output as the final
result. The perceptron is also known as a TLU (threshold logic unit).
The perceptron is a supervised learning algorithm that classifies data into two categories; thus it is a binary
classifier.
51. Advantages of Perceptron
Perceptrons can implement Logic Gates like AND, OR, or NAND
Disadvantages of Perceptron
Perceptrons can only learn linearly separable problems, such as the Boolean AND problem. They do not work for problems
that are not linearly separable, such as the Boolean XOR problem.
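The AND versus XOR contrast can be illustrated with a brute-force search over a coarse grid of weights. The grid and function names are assumptions; an empty search result on a grid is an illustration rather than a proof, although for XOR no single perceptron exists at all.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND = [0, 0, 0, 1]
XOR = [0, 1, 1, 0]

def perceptron(x, w1, w2, b):
    """Single threshold unit with two inputs."""
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0

def find_weights(targets):
    """Search a coarse grid of weights and biases for a perceptron that
    reproduces the truth table; return None if the grid contains none."""
    grid = [i / 2 for i in range(-8, 9)]   # -4.0, -3.5, ..., 4.0
    for w1, w2, b in product(grid, repeat=3):
        if all(perceptron(x, w1, w2, b) == t for x, t in zip(INPUTS, targets)):
            return w1, w2, b
    return None

print(find_weights(AND))  # some (w1, w2, b) that implements AND
print(find_weights(XOR))  # None
```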
53. The simplest form of neural network, in which input data travels in one direction only, passing through artificial
neural nodes and exiting through output nodes. Input and output layers are always present; hidden layers may or may not be.
Based on this, these networks can be further classified as single-layered or
multi-layered feed-forward neural networks.
The number of layers depends on the complexity of the function. In the simple form described here, there is uni-directional forward propagation but no
backward propagation, and the weights are static. Inputs are multiplied by weights and fed to an activation function,
typically a classifying (step) activation function.
For example: the neuron is activated if the weighted sum is above the threshold (usually 0), and it produces 1 as
output. The neuron is not activated if the weighted sum is below the threshold, and its output is taken as -1. These networks are fairly
simple to maintain and are equipped to deal with data that contains a lot of noise.
54. ADVANTAGES OF FEED FORWARD NEURAL NETWORKS
1. Less complex, easy to design & maintain
2. Fast [one-way propagation]
3. Able to handle noisy data
Disadvantages of Feed Forward Neural Networks:
1. In the simple form described above, cannot be used for deep learning [due to absence of dense layers and back propagation]
56. An entry point towards complex neural nets, where input data travels through various layers of artificial
neurons. Every node is connected to all neurons in the next layer, which makes it a fully connected
neural network. Input and output layers are present along with one or more hidden layers, i.e. at least three
layers in total. It has bi-directional propagation, i.e. forward propagation and backward propagation.
Inputs are multiplied by weights and fed to the activation function, and in backpropagation the weights are modified
to reduce the loss. In simple words, weights are values learned by the network. They self-adjust
depending on the difference between predicted outputs and target outputs.
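Forward and backward propagation can be sketched with a tiny multi-layer perceptron trained on XOR. The network size, learning rate, seed and squared-error loss are all assumptions for illustration, and the run is only expected to lower the loss, not guaranteed to fit XOR perfectly.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR
H = 3  # number of hidden neurons (an arbitrary choice)

# w_hidden[j] = [weight from x0, weight from x1, bias]; w_out = H weights + bias.
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(H)]
w_out = [random.uniform(-1, 1) for _ in range(H + 1)]

def forward(x):
    """Forward propagation: input -> hidden layer -> output."""
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    out = sigmoid(sum(w_out[j] * hidden[j] for j in range(H)) + w_out[H])
    return hidden, out

def mean_squared_error():
    return sum((forward(x)[1] - t) ** 2 for x, t in DATA) / len(DATA)

initial_loss = mean_squared_error()
rate = 0.5
for _ in range(5000):
    for x, target in DATA:
        hidden, out = forward(x)
        # Backward propagation: error signal at the output (half squared error).
        delta_out = (out - target) * out * (1 - out)
        for j in range(H):
            # Error signal passed back to hidden neuron j (uses the old w_out[j]).
            delta_h = delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
            w_out[j] -= rate * delta_out * hidden[j]
            w_hidden[j][0] -= rate * delta_h * x[0]
            w_hidden[j][1] -= rate * delta_h * x[1]
            w_hidden[j][2] -= rate * delta_h
        w_out[H] -= rate * delta_out
final_loss = mean_squared_error()

print(initial_loss, final_loss)
```

Each weight moves a small step against its error signal, which is the self-adjustment the slide describes.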
57. ADVANTAGES OF MULTI-LAYER PERCEPTRON
1. Used for deep learning [due to the presence of dense fully connected layers and back propagation]
Disadvantages of Multi-Layer Perceptron:
1. Comparatively complex to design and maintain
2. Comparatively slow (depends on number of hidden layers)
58. D. RADIAL BASIS FUNCTION NEURAL NETWORKS
59. Radial Basis function
A radial basis function network classifies an input by how close it is to stored prototypes of each class.
If we have two classes, i.e. class A and class B, and the new input to be classified is closer to the class A
prototypes than to the class B prototypes, it is tagged or classified as class A.
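The prototype idea can be sketched with Gaussian radial basis functions: each prototype responds more strongly the closer the input is, and the class with the strongest total response wins. The prototypes, the width and the unweighted sum are assumptions for illustration; a trained RBF network would also learn output weights.

```python
import math

def rbf(x, prototype, width=1.0):
    """Gaussian radial basis function: 1 at the prototype, falling toward 0
    as the input moves away from it."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, prototype))
    return math.exp(-dist_sq / (2 * width ** 2))

# Hypothetical prototypes for the two classes.
prototypes = {
    "A": [(1.0, 1.0), (1.5, 2.0)],
    "B": [(6.0, 6.0), (7.0, 5.5)],
}

def classify(x):
    """Pick the class whose prototypes respond most strongly to x."""
    scores = {label: sum(rbf(x, p) for p in protos)
              for label, protos in prototypes.items()}
    return max(scores, key=scores.get)

print(classify((2.0, 1.5)))  # A
print(classify((6.5, 6.0)))  # B
```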
61. Application: Power Restoration
a. Power Cut P1 needs to be restored first
b. Power Cut P3 needs to be restored next, as it impacts more houses
c. Power Cut P2 should be fixed last as it impacts only one house
63. RECURRENT NEURAL NETWORKS
The first layer is typically a feed forward layer, followed by a recurrent
layer in which some information from the previous time-step is remembered by
a memory function. Forward propagation is implemented in this case. The network stores information
required for future use. If the prediction is wrong, the weights are adjusted in small steps,
scaled by the learning rate.
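The memory from the previous time-step can be sketched with a single recurrent unit. The tanh activation and the weight values are assumptions for illustration; only forward propagation is shown.

```python
import math

def rnn_forward(sequence, w_in=0.8, w_rec=0.5, bias=0.0):
    """Run one recurrent unit over a sequence. The hidden state from the
    previous time-step is fed back in alongside the current input."""
    hidden = 0.0
    states = []
    for x in sequence:
        hidden = math.tanh(w_in * x + w_rec * hidden + bias)
        states.append(hidden)
    return states

# The last two inputs are identical in both sequences, yet the final states
# differ because the first sequence's early input is still remembered.
print(rnn_forward([1, 0, 0]))
print(rnn_forward([0, 0, 0]))
```

With these weights the memory of the first input fades at each step, which hints at why plain recurrent units struggle with long sequences.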
Applications of Recurrent Neural Networks
● Text processing like auto suggest, grammar checks, etc.
● Text to speech processing
● Image tagger
● Sentiment Analysis
● Translation
65. Arguably, the best-known disadvantage of neural networks is their “black box” nature.
Simply put, you don’t know how or why your NN came up with a certain output.
For example, when you put an image of a cat into a neural network and it predicts it to be a car, it is very
hard to understand what caused it to arrive at this prediction.
This is why a lot of banks don’t use neural networks to predict whether a person is creditworthy — they
need to explain to their customers why they didn't get the loan, otherwise the person may feel unfairly
treated.
67. Although there are libraries like Keras that make the development of neural networks fairly simple,
sometimes you need more control over the details of the algorithm, like when you're trying to solve a
difficult problem with machine learning that no one has ever done before.
In that case, you might use TensorFlow, which provides more opportunities, but it is also more
complicated and the development takes much longer (depending on what you want to build).
Then a practical question arises for any company: Is it really worth it for expensive engineers to spend
weeks developing something that may be solved much faster with a simpler algorithm?
68. 3. AMOUNT OF DATA
Neural networks usually require much more data than traditional machine learning
algorithms, as in at least thousands if not millions of labeled samples. This isn’t an
easy problem to deal with and many machine learning problems can be solved well
with less data if you use other algorithms.
Although there are some cases where neural networks do well with little data, most of
the time they don’t. In this case, a simple algorithm like naive Bayes, which deals much
better with little data, would be the appropriate choice.
69. 4. COMPUTATIONALLY EXPENSIVE
Usually, neural networks are also more computationally expensive than traditional algorithms.
State of the art deep learning algorithms, which realize successful training of really deep neural
networks, can take several weeks to train completely from scratch.
By contrast, most traditional machine learning algorithms take much less time to train, ranging from a
few minutes to a few hours or days.
The amount of computational power needed for a neural network depends heavily on the size of your
data, but also on the depth and complexity of your network.
For example, a neural network with one layer and 50 neurons will be much faster than a random forest
with 1,000 trees. By comparison, a neural network with 50 layers will be much slower than a random
forest with only 10 trees.
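How depth drives cost can be roughly illustrated by counting trainable parameters in a fully connected network. Parameter count is only a proxy for compute, and the input size of 20 features and the single output are assumptions; the sketch does not measure the random forest side of the comparison.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network whose layers have
    the given sizes (input size first, output size last)."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

N_INPUTS = 20  # assumed number of input features

# One hidden layer of 50 neurons versus 50 hidden layers of 50 neurons each.
shallow = dense_parameter_count([N_INPUTS, 50, 1])
deep = dense_parameter_count([N_INPUTS] + [50] * 50 + [1])

print(shallow)  # 1101
print(deep)     # 126051
```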