2. • Perceptron
• Simple Perceptron for Pattern Classification
• Perceptron: Algorithm
• Perceptron: Example
• Perceptron: Matlab code for the example
• Assignment
3. • Perceptrons had perhaps the most far-reaching impact of any of
the early neural networks.
• A number of different types of Perceptrons have been used and
described by various researchers.
• The original perceptrons had three layers of neurons – sensory
units, associator units and a response unit – forming an
approximate model of a retina.
• Under suitable assumptions, its iterative learning procedure can be proved to converge to the correct weights, i.e., the weights that allow the net to produce the correct output value for each of the training input patterns.
4. The Perceptron Rule
• A bipolar threshold function with a reasonably small value of the threshold θ is used.
• A learning rate a is used to control the rate of learning.
• When an error occurs, the weight change over a link is a times the product of the input and target values (a one-step sketch follows this list).
• The learning is carried out through a multi-pass treatment: the weights are adjusted by repeated training until convergence is achieved.
• The method always converges for linearly separable problems.
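As a minimal sketch of a single application of the rule in MATLAB (the values here are illustrative, not taken from the example that follows):

a = 1;               % learning rate
x = [1 0];  t = -1;  % one training pair: input pattern and target
w = [0 0];  b = 0;   % current weights and bias
w = w + a * t * x;   % weight change = a * input * target -> w = [-1 0]
b = b + a * t;       % bias change  = a * target          -> b = -1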
5. • The architecture of a simple perceptron for performing a single classification is shown in the figure.
• The goal of the net is to classify each input pattern as belonging, or not belonging, to a particular class.
• Belonging is signified by the output unit giving a response of +1; not belonging is indicated by a response of -1.
• An output of zero means the net is undecided.
The activation function is the bipolar threshold
f(x) = 1 if x > θ
f(x) = 0 if -θ <= x <= θ
f(x) = -1 if x < -θ
[Figure: architecture of the simple perceptron — input units X1, X2, ..., Xn connected to a single output unit Y through weights w1, w2, ..., wn, plus a bias unit with constant input 1 and weight b.]
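This activation can be written directly in MATLAB (a minimal sketch; the function name activate is ours, saved as activate.m):

function y = activate(yin, theta)
% Bipolar threshold with an undecided band of width 2*theta.
    if yin > theta
        y = 1;
    elseif yin < -theta
        y = -1;
    else
        y = 0;
    end
end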
6. 1. Initialize the weights, bias and the threshold θ. Also set the learning rate a such that 0 < a <= 1.
2. While the stopping condition is false, do the following steps.
3. For each training pair s:t, do steps 4 to 7.
4. Set the activations of the input units: xi = si.
5. Compute the response of the output unit: yin = b + x1*w1 + ... + xn*wn, y = f(yin).
6. Update the weights and bias if an error occurred for this pattern:
If y is not equal to t then
wi(new) = wi(old) + a*xi*t for i = 1 to n
b(new) = b(old) + a*t
end if
7. Test the stopping condition: if no weight changed during the pass in step 3, stop; else continue.
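The whole procedure fits in a short MATLAB function (a minimal sketch of steps 1 to 7; the name trainPerceptron and its argument names are ours, saved as trainPerceptron.m):

function [w, b, epochs] = trainPerceptron(s, t, a, theta)
% s: n-by-m matrix of training inputs; t: n-by-1 vector of bipolar targets;
% a: learning rate; theta: threshold of the bipolar activation.
    [n, m] = size(s);
    w = zeros(1, m);  b = 0;        % step 1: initialize weights and bias
    epochs = 0;  changed = true;
    while changed                   % step 2: repeat until nothing changes
        changed = false;
        for i = 1:n                 % step 3: for each training pair s:t
            x = s(i, :);            % step 4: set the input activations
            yin = b + x * w';       % step 5: net input ...
            if yin > theta          % ... and thresholded response
                y = 1;
            elseif yin < -theta
                y = -1;
            else
                y = 0;
            end
            if y ~= t(i)            % step 6: update weights and bias on error
                w = w + a * t(i) * x;
                b = b + a * t(i);
                changed = true;
            end
        end
        epochs = epochs + 1;        % step 7: stop when no weight changed
    end
end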
7. • Let's consider the following training data:
• We initialize the weights to w1 = 0, w2 = 0 and b = 0. Also we set a = 1 and θ = 0.2.
• The following table shows the sequence in which the net is presented with the inputs one by one and checked against the required target.
yin = b + x1*w1 + x2*w2
Inputs    Target
x1  x2    t
 1   1     1
 1   0    -1
 0   1    -1
 0   0    -1
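Assuming the hypothetical trainPerceptron sketch given earlier, this example can be reproduced as follows:

s = [1 1; 1 0; 0 1; 0 0];   % inputs x1, x2
t = [1; -1; -1; -1];        % bipolar targets
[w, b, epochs] = trainPerceptron(s, t, 1, 0.2);
% On convergence the values should match the table below: w = [2 3], b = -4.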
9. Epoch 10 — the final epoch, in which every pattern is classified correctly and no weight changes:

Input   Bias  Net Input  Output  Target  Weights
x1  x2  1     yin        y       t       w1  w2  b
 1   1  1      1          1       1       2   3  -4
 1   0  1     -2         -1      -1       2   3  -4
 0   1  1     -1         -1      -1       2   3  -4
 0   0  1     -4         -1      -1       2   3  -4
Thus the positive response is given by all the points such that
2*x1 + 3*x2 - 4 > 0.2
and the negative response is given by all the points such that
2*x1 + 3*x2 - 4 < -0.2
wi(new) = wi(old) + a*xi*t;  b(new) = b(old) + a*t
y = 1 if yin > θ;  y = 0 if -θ <= yin <= θ;  y = -1 if yin < -θ
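The converged net can be checked in a few lines (a quick sketch; the values in the comments come from the table above):

w = [2 3];  b = -4;  theta = 0.2;
s = [1 1; 1 0; 0 1; 0 0];
yin = s * w' + b                     % net inputs: [1; -2; -1; -4]
y = (yin > theta) - (yin < -theta)   % bipolar outputs: [1; -1; -1; -1]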
10. Learning Rate
The learning rate typically lies in the range [0.001, 1.0]. Smaller learning rates require more training epochs, given the smaller changes made to the weights on each update, whereas larger learning rates produce rapid changes and require fewer training epochs, at the risk of overshooting a good set of weights (a small experiment follows this slide).
Activation Function. It says that above a certain threshold the neuron is activated, and below the said threshold the neuron is deactivated.
Bias. A value (or a vector) that is added to the product of inputs and weights. The bias is used to shift the result of the activation function towards the positive or negative side.
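The effect of the learning rate can be observed directly with the hypothetical trainPerceptron sketch from earlier (the particular rates tried here are illustrative):

s = [1 1; 1 0; 0 1; 0 0];  t = [1; -1; -1; -1];
for a = [0.1 0.5 1.0]
    [~, ~, epochs] = trainPerceptron(s, t, a, 0.2);
    fprintf('learning rate %.1f -> %d epochs\n', a, epochs);
end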
11. • We note that the output of the Perceptron is 1 (positive) if the net input yin is greater than θ.
• Also, the output is -1 (negative) if the net input yin is less than -θ.
• We also know that, in this case,
yin = b + x1*w1 + x2*w2
• Thus we have
2*x1 + 3*x2 - 4 > 0.2 (+ve output)
2*x1 + 3*x2 - 4 < -0.2 (-ve output)
• Replacing the inequalities with equalities and simplifying gives the following two equations:
2*x1 + 3*x2 = 4.2
2*x1 + 3*x2 = 3.8
12. • These two equations are equations of straight lines in the x1,x2-plane.
• These lines geometrically separate the input training data into two classes.
• Hence they act as decision boundaries.
• The two lines have the same slope but different intercepts, so essentially we have two parallel straight lines acting as decision boundaries for our input data (a plotting sketch follows this list).
• The parameter θ determines the separation between the two lines (and hence the width of the indecisive region).
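A plotting sketch of the two boundary lines and the four training points (the axis range is illustrative):

x1 = linspace(-0.5, 2, 100);
plot(x1, (4.2 - 2*x1)/3, 'b-');  hold on   % 2*x1 + 3*x2 = 4.2
plot(x1, (3.8 - 2*x1)/3, 'r-')             % 2*x1 + 3*x2 = 3.8
plot(1, 1, 'k+')                           % the target +1 point
plot([1 0 0], [0 1 0], 'ko')               % the target -1 points
xlabel('x1');  ylabel('x2');
legend('2x1 + 3x2 = 4.2', '2x1 + 3x2 = 3.8', 'class +1', 'class -1');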
14. • The following slides contain the Matlab source code for the simple Perceptron studied in class.
• The second assignment can also be found in these slides.
15. Assignment#2
Note: Please submit your assignment (both in hard- and soft-copy form).
Please carry out the following exercises by making the appropriate modifications to the above code.
Exercise 1: Consider the following training data for the Perceptron used above (i.e., there is one more input neuron now):

Inputs            target
x1  x2  x3  bias  t
 1   1   1   1     1
 1   1   0   1    -1
 1   0   1   1    -1
 0   1   1   1    -1

Verify that the iterations converge after 26 epochs for a learning rate equal to 1.0 and theta equal to 0.1, and that the converged set of weights and bias is w1 = 2, w2 = 3, w3 = 4 and b = -8.
Exercise 2: Plot the changes in the separating lines as they occur in Exercise 1.
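As a starting point for Exercise 1, the data can be set up as below, reusing the hypothetical trainPerceptron sketch (that sketch handles the bias internally, so the bias column is omitted from s):

s = [1 1 1; 1 1 0; 1 0 1; 0 1 1];   % inputs x1, x2, x3
t = [1; -1; -1; -1];
[w, b, epochs] = trainPerceptron(s, t, 1.0, 0.1);
% Per the exercise, expect epochs = 26, w = [2 3 4], b = -8.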
16. Assignment#3
Question 1:
Using the above code, find the weights required to perform the following classification:
Vectors (1,1,1,1) and (-1,1,-1,-1) are members of the class (and therefore have target value 1);
Vectors (1,1,1,-1) and (1,-1,-1,1) are not members of the class (and have target value -1).
Use a learning rate of 1 and starting weights of 0. Using each of the training vectors as input, test the response of the net. Display the number of epochs taken to reach convergence. Also plot the maximum error in each epoch.
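A setup sketch for Question 1, again reusing the hypothetical trainPerceptron (the question does not specify a threshold, so θ = 0 is assumed here):

s = [1 1 1 1; -1 1 -1 -1; 1 1 1 -1; 1 -1 -1 1];   % the four training vectors
t = [1; 1; -1; -1];
[w, b, epochs] = trainPerceptron(s, t, 1, 0);
yin = s * w' + b;    % test the response of the net on each training vector
y = sign(yin)        % with theta = 0, the response is the sign of yin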