2. Lecture Content
• Non-linearly separable functions: XOR gate implementation
– MLP data transformation
– Mapping implementation
– Graphical solution
3. Non linear problems
• XOR problem
• To separate the positive from the negative examples we must either draw two lines (i.e., we need two straight-line equations) or use a nonlinear region that captures only one class.
[Figure: the +ve and –ve XOR patterns plotted in the (x1, x2) plane; no single straight line separates the two classes.]
4. Non linear problems
• To implement the nonlinearity we insert one or more extra layers of nodes between the input layer and the output layer (hidden layers).
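As a minimal sketch of such an architecture (a hypothetical 2-2-1 layout with random, untrained weights; numpy and sigmoid units are assumptions, not from the lecture):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Hypothetical 2-2-1 network: 2 inputs -> 2 hidden nodes -> 1 output.
    rng = np.random.default_rng(0)
    W_hidden = rng.normal(size=(2, 2))   # input -> hidden weights (untrained)
    b_hidden = rng.normal(size=2)
    w_out = rng.normal(size=2)           # hidden -> output weights (untrained)
    b_out = rng.normal()

    def forward(x):
        h = sigmoid(W_hidden @ x + b_hidden)   # the extra (hidden) layer of nodes
        return sigmoid(w_out @ h + b_out)      # output layer

    print(forward(np.array([0.0, 1.0])))       # some value in (0, 1); not trained yet

The hidden activations h are the recoding of the input that the following slides exploit.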
6. MLP data transformation and mapping implementation
• Need for hidden units:
• With one layer of enough hidden units, the input can be recoded (memorized); such a network is a multilayer perceptron (MLP).
• This recoding allows any problem to be mapped/represented (e.g., x², x³, etc.).
• Question: how can the weights of the hidden units be trained?
• Answer: learning algorithms, e.g., back propagation.
• 'Back propagation' refers to propagating the error for the weight adaptation of the layers, starting from the last layer and working back to the first, i.e., the weights of the last layer are updated before those of the previous layers.
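A minimal numpy sketch of this idea, training a small MLP on XOR; the 2-2-1 architecture, learning rate, and epoch count are illustrative assumptions. Note how the output-layer error term is computed first and then propagated back to the hidden layer:

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    t = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

    W1 = rng.normal(size=(2, 2)); b1 = np.zeros(2)    # input -> hidden
    W2 = rng.normal(size=(2, 1)); b2 = np.zeros(1)    # hidden -> output

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 0.5
    for epoch in range(20000):
        # Forward pass
        h = sigmoid(X @ W1 + b1)                  # hidden activations
        y = sigmoid(h @ W2 + b2)                  # network output
        # Backward pass: the output-layer error term is computed FIRST ...
        delta2 = (y - t) * y * (1 - y)
        # ... and only then propagated back to the hidden layer
        delta1 = (delta2 @ W2.T) * h * (1 - h)
        # Weight adaptation: last layer before the previous layers
        W2 -= lr * h.T @ delta2; b2 -= lr * delta2.sum(axis=0)
        W1 -= lr * X.T @ delta1; b1 -= lr * delta1.sum(axis=0)

    # Should print approximately [0, 1, 1, 0]; convergence depends on the
    # random initialization, so rerun with another seed if it stalls.
    print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))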
7. Learning Non Linearly Separable Functions
• Back propagation tries to transform the training patterns to make them almost linearly separable, so that a linear network can do the final classification.
• In other words, if we need more than one straight line to separate the +ve and –ve patterns, we solve the problem in two phases (see the sketch after this list):
– Phase 1: we represent each straight line with a single perceptron and use it to classify/map the training patterns (its outputs).
– Phase 2: these outputs are transformed patterns that are now linearly separable and can be classified by an additional perceptron, giving the final result.
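A minimal sketch of the two phases for the XOR problem; the two lines (x1 + x2 = 0.5 and x1 + x2 = 1.5) and the output-perceptron weights are one hand-picked choice, not necessarily the lecture's:

    def step(v):                      # hard-limit activation
        return 1 if v >= 0 else 0

    # Phase 1: one perceptron per straight line (assumed lines for XOR)
    def y1(x1, x2): return step(x1 + x2 - 0.5)   # perceptron for g1(x) = 0
    def y2(x1, x2): return step(x1 + x2 - 1.5)   # perceptron for g2(x) = 0

    # Phase 2: one more perceptron on the transformed (now separable) patterns
    def xor(x1, x2):
        a, b = y1(x1, x2), y2(x1, x2)
        return step(a - 2 * b - 0.5)  # assumed output-layer weights

    for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(p, '->', xor(*p))       # prints 0, 1, 1, 0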
8. Multi-layer Networks and Perceptrons
• Multi-layer networks have one or more layers of hidden units.
• With two (possibly very large) hidden layers, it is possible to implement any function.
• Networks without a hidden layer are called perceptrons.
• Perceptrons are very limited in what they can represent, but this makes their learning problem much simpler.
9. Solving Non Linearly Separable Functions
• Example: the XOR problem
• Phase 1: we draw two arbitrary lines.
• We find the line equations g1(x) = 0 and g2(x) = 0 using arbitrary intersections with the axes (the yellow points p1, p2, p3, p4 in the figure below).
• We assume the +ve and –ve directions for each line.
• We classify the given patterns as +ve/–ve with respect to both g1(x) and g2(x).
• Phase 2: we transform the patterns we have.
• Patterns that are +ve or –ve with respect to both g1(x) and g2(x) (similar signs) belong to class B; patterns with different signs belong to class A.
• We find the line equation g(y) = 0 using arbitrary intersections with the new axes.
[Figure: the four XOR patterns in the (x1, x2) plane, labeled A and B, with the two arbitrary lines g1(x) = 0 and g2(x) = 0 drawn through the axis points p1, p2, p3, p4 and their +ve directions marked.]
x1 x2 XOR Class
0 0 0 B
0 1 1 A
1 0 1 A
1 1 0 B
11. Solving Non Linearly Separable Functions
• Assume x1 > p1(1) is the positive direction for g1(x).
• Assume x1 > p3(1) is the positive direction for g2(x).
• Classifying the given patterns with respect to g1(x) and g2(x):
• We represent the +ve and –ve values as outputs of a step function, i.e., y1 = f(g1(x)) = 0 if g1(x) is –ve and 1 otherwise, and y2 = f(g2(x)) = 0 if g2(x) is –ve and 1 otherwise (see the table below).
• We now have only three distinct patterns, and they are linearly separable; the transformation got rid of the extra pattern causing the problem (two patterns now coincide).
x1 x2 g1(x) g2(x) y1 y2 Class
0 0 -ve -ve 0 0 B
0 1 +ve -ve 1 0 A
1 0 +ve -ve 1 0 A
1 1 +ve +ve 1 1 B
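The table can be checked with a short script; the concrete line equations below (g1(x) = x1 + x2 - 0.5, g2(x) = x1 + x2 - 1.5) are an assumed choice that reproduces the sign pattern above:

    def f(g):                              # step function from the slide
        return 0 if g < 0 else 1

    print('x1 x2 g1  g2  y1 y2')
    for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        g1 = x1 + x2 - 0.5                 # assumed g1(x)
        g2 = x1 + x2 - 1.5                 # assumed g2(x)
        s1 = '+ve' if g1 >= 0 else '-ve'
        s2 = '+ve' if g2 >= 0 else '-ve'
        print(x1, x2, s1, s2, f(g1), f(g2))

    # The four inputs map to just three distinct (y1, y2) points:
    # (0,0) -> B, (1,0) -> A (twice), (1,1) -> B, which are linearly separable.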
13. Solving Non Linearly Separable Functions
• Example: the linearly non-separable patterns x1 = [3 0], x2 = [5 2], x3 = [1 3], x4 = [2 4], x5 = [1 1], x6 = [3 3] have to be classified into two categories C1 = {x1, x2, x3, x4} and C2 = {x5, x6} using a feed-forward 2-layer neural network.
– Select a suitable number of partitioning straight lines.
– Consequently, design the first stage (hidden layer) of the network with bipolar discrete perceptrons.
– Using this layer, transform the six samples.
– Design the output layer of the network with a bipolar discrete perceptron using the transformed samples.
15. Solving Non Linearly Separable Functions
• Assume x2 < p1(2) is the positive direction for g1(x).
• Assume x1 > p3(1) is the positive direction for g2(x).
• We represent the +ve and –ve values as outputs of a bipolar function, i.e., y1 = f(g1(x)) = -1 if g1(x) is –ve and 1 otherwise, and y2 = f(g2(x)) = -1 if g2(x) is –ve and 1 otherwise.
• Classifying the given patterns with respect to g1(x) and g2(x):

x1 x2 g1(x) g2(x) y1 y2 Class
3  0  +ve   +ve    1   1  B
5  2  +ve   +ve    1   1  B
1  3  -ve   -ve   -1  -1  B
2  4  -ve   -ve   -1  -1  B
1  1  +ve   -ve    1  -1  A
3  3  +ve   -ve    1  -1  A

• The six samples now map to only three distinct points in the (y1, y2) plane (several patterns coincide), and these three points are linearly separable.
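A sketch of the complete two-layer design; the partitioning lines g1(x) = x1 - x2 + 1 and g2(x) = x1 - x2 - 2 and the output-layer discriminant g(y) = y1 - y2 - 1 are assumptions chosen to reproduce the table above, since the lecture's own line equations are only given graphically:

    def bipolar(g):                        # bipolar hard limiter
        return 1 if g >= 0 else -1

    samples = {'x1': (3, 0), 'x2': (5, 2), 'x3': (1, 3),
               'x4': (2, 4), 'x5': (1, 1), 'x6': (3, 3)}

    for name, (a, b) in samples.items():
        # Hidden layer: two bipolar discrete perceptrons (assumed lines)
        y1 = bipolar(a - b + 1)            # g1(x) = x1 - x2 + 1, +ve below the line
        y2 = bipolar(a - b - 2)            # g2(x) = x1 - x2 - 2, +ve to its right
        # Output layer: one bipolar perceptron on the transformed pattern
        out = bipolar(y1 - y2 - 1)         # assumed g(y) = y1 - y2 - 1
        print(name, (a, b), '->', (y1, y2), 'class', 'A' if out == 1 else 'B')

Running this reproduces the table: x1–x4 fall in class B (C1) and x5, x6 in class A (C2).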