5. • Examples of linearly inseparable classes
- Logical XOR (exclusive OR) function
Patterns (bipolar):
    x1   x2    y
    -1   -1   -1
    -1    1    1
     1   -1    1
     1    1   -1
No line can separate these two classes, as can be
seen from the fact that the following linear inequality
system has no solution: adding (1) and (4) gives b < 0,
while adding (2) and (3) gives b >= 0, a contradiction.
[Decision boundary figure: the four patterns plotted in the (x1, x2) plane;
 x: class I (y = 1), o: class II (y = -1).
 No straight line separates the x's from the o's.]
(1)  b - w1 - w2 <  0     [pattern (-1, -1), y = -1]
(2)  b - w1 + w2 >= 0     [pattern (-1, 1), y = 1]
(3)  b + w1 - w2 >= 0     [pattern (1, -1), y = 1]
(4)  b + w1 + w2 <  0     [pattern (1, 1), y = -1]
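The contradiction can also be illustrated by brute force. The sketch below (an illustration, not a proof) scans a grid of candidate biases and weights and finds no perceptron that classifies all four bipolar XOR patterns, assuming the threshold convention o = 1 iff net >= 0:

```python
# Illustrative grid search: no (b, w1, w2) separates the XOR patterns.
patterns = [((-1, -1), -1), ((-1, 1), 1), ((1, -1), 1), ((1, 1), -1)]

def separates(b, w1, w2):
    # Bipolar threshold unit: output +1 if net >= 0, else -1.
    for (x1, x2), y in patterns:
        net = b + w1 * x1 + w2 * x2
        o = 1 if net >= 0 else -1
        if o != y:
            return False
    return True

grid = [i / 2 for i in range(-10, 11)]  # candidate values -5.0 .. 5.0
solutions = [(b, w1, w2) for b in grid for w1 in grid for w2 in grid
             if separates(b, w1, w2)]
print(len(solutions))  # 0
```

The search finds nothing, in agreement with the b < 0 vs. b >= 0 contradiction above.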
6. – XOR can be solved by a more complex network with
hidden units
[Network figure: a 2-2-1 feedforward network computing XOR.
 Hidden unit z1 receives weight 2 from x1 and -2 from x2;
 hidden unit z2 receives -2 from x1 and 2 from x2;
 both hidden units use threshold θ = 1.
 Output unit Y receives weight 2 from each of z1 and z2,
 with threshold θ = 0.]
(x1, x2)    (z1, z2)     Y
(-1, -1)    (-1, -1)    -1
(-1, 1)     (-1, 1)      1
(1, -1)     (1, -1)      1
(1, 1)      (-1, -1)    -1
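The network above can be sketched as follows, assuming bipolar threshold units that output +1 when net >= θ and -1 otherwise:

```python
# Two-layer network for XOR with bipolar threshold units.
def step(net, theta):
    return 1 if net >= theta else -1

def xor_net(x1, x2):
    z1 = step(2 * x1 - 2 * x2, 1)    # fires only for (1, -1)
    z2 = step(-2 * x1 + 2 * x2, 1)   # fires only for (-1, 1)
    return step(2 * z1 + 2 * z2, 0)  # bipolar OR of z1 and z2

for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print((x1, x2), xor_net(x1, x2))  # -1, 1, 1, -1
```

Each hidden unit carves out one of the two positive corners, and the output unit combines them, which is exactly what no single-layer unit could do.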
8. • Perceptron learning algorithm
Step 0. Initialization: wk = 0, k = 1 to n
Step 1. While the stop condition is false, do Steps 2-5
Step 2. For each training sample ij with label class(ij),
do Steps 3-5
Step 3. Compute net = w · ij
Step 4. Compute o = f(net)
Step 5. If o != class(ij):
wk := wk + µ * ij * class(ij), k = 1 to n
Notes:
- Learning occurs only when a sample has o != class(ij)
- There are two nested loops; one completion of the inner loop
(each sample used once) is called an epoch
Stop condition:
- No weight is changed in the current epoch, or
- A pre-determined number of epochs is reached
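The steps above can be sketched in Python. The bipolar AND function (which, unlike XOR, is linearly separable) serves as illustrative training data; µ = 1 and the leading bias input are assumed choices, not values from the text:

```python
# Perceptron learning rule on bipolar AND; each input has a leading 1
# so the bias is learned as w[0].
samples = [((1, -1, -1), -1), ((1, -1, 1), -1),
           ((1, 1, -1), -1), ((1, 1, 1), 1)]

w = [0.0, 0.0, 0.0]            # Step 0: initialize all weights to zero
mu = 1.0                       # assumed learning rate
for epoch in range(100):       # Step 1: loop until an epoch changes nothing
    changed = False
    for ij, cls in samples:    # Step 2: one pass over samples = one epoch
        net = sum(wk * xk for wk, xk in zip(w, ij))   # Step 3
        o = 1 if net >= 0 else -1                     # Step 4
        if o != cls:                                  # Step 5: update on error
            w = [wk + mu * xk * cls for wk, xk in zip(w, ij)]
            changed = True
    if not changed:            # stop condition: no weight changed this epoch
        break

print(w)
```

After convergence every sample is classified correctly, and the loop exits on the first epoch with no weight change.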
12. Notes
Informal justification: consider o = 1 and class(ij) = -1
– To move o toward class(ij), w should change so that net is reduced
– If ij = 1, then ij * class(ij) < 0, so wk is reduced (and ij * w is
reduced)
– If ij = -1, then ij * class(ij) > 0, so wk is increased (and ij * w is
reduced)
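The justification can be checked numerically. In the sketch below (µ = 0.5 and the starting weight are illustrative choices), one update step reduces net toward class(ij) = -1 regardless of the sign of the input component:

```python
# With o = 1 and class(ij) = -1, one update moves net toward the target.
mu = 0.5                           # assumed learning rate
cls = -1
results = []
for xj in (1, -1):                 # a single input component ij
    wj = 2.0 * xj                  # chosen so net = wj * xj = 2 > 0, i.e. o = 1
    net_before = wj * xj
    wj += mu * xj * cls            # the perceptron update for this component
    net_after = wj * xj
    results.append((net_before, net_after))
    print(xj, net_before, "->", net_after)  # net drops from 2.0 to 1.5
```

When xj = 1 the weight itself decreases (2.0 to 1.5); when xj = -1 the weight increases (-2.0 to -1.5); in both cases the product xj * wj, and hence net, is reduced, as the note states.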
23. Error Backpropagation
We want to train a multi-layer feedforward
network by gradient descent to approximate
an unknown function, based on training data
consisting of pairs (x, t). The vector x represents
a pattern of input to the network, and the vector t
the corresponding target (desired output).
Denote the weight from unit j to unit i by wij.
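Under these definitions, a minimal sketch of such training might look as follows; the 2-3-1 architecture, sigmoid units, squared error, learning rate, and iteration count are all illustrative assumptions, not values from the text:

```python
# Gradient-descent training of a small feedforward network on pairs (x, t).
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # input patterns x
T = np.array([[0.], [1.], [1.], [0.]])                  # targets t (XOR)

W1 = rng.normal(0, 1, (2, 3)); b1 = np.zeros(3)   # input -> hidden weights
W2 = rng.normal(0, 1, (3, 1)); b2 = np.zeros(1)   # hidden -> output weights

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Squared error before training, for comparison.
mse0 = float(np.mean((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) - T) ** 2))

lr = 0.5
for _ in range(5000):
    h = sigmoid(X @ W1 + b1)          # forward pass, hidden layer
    y = sigmoid(h @ W2 + b2)          # forward pass, output layer
    dy = (y - T) * y * (1 - y)        # error term at the output units
    dh = (dy @ W2.T) * h * (1 - h)    # error propagated back to hidden units
    W2 -= lr * h.T @ dy; b2 -= lr * dy.sum(axis=0)   # gradient-descent step
    W1 -= lr * X.T @ dh; b1 -= lr * dh.sum(axis=0)

mse = float(np.mean((y - T) ** 2))
print(mse0, "->", mse)   # squared error drops during training
```

The backward pass computes each layer's error term from the layer above it, which is the layer-by-layer structure the following derivation makes precise.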