# Neural Networks

Sagacious IT Solution
15 Jul 2018

• 1. Applications of AI (Neural Network) By - Suraj Awal
• 2. Artificial Neural Network (ANN) - A neural network is the network of neurons found in the brain - An ANN is a network of artificial neurons that approximately represents parts of the brain - Artificial neurons approximate brain neurons using a physical device or a mathematical model
• 3. Human brain and ANN - In ANN, the brain is modelled as: a) Neuron → soma b) Input → dendrite c) Output → axon d) Weight → synapse - Neurons encode their activation or output as a series of electrical pulses - The soma processes incoming activations and converts them into output activations - Dendrites receive activation from other neurons - The axon transmits output activation to other neurons - The junction between an axon and a dendrite is called a synapse
• 4. Basic Terminologies Weighting Factor (w) - The value given to each input to determine its strength - The neuron computes the weighted sum of its inputs and compares it with the threshold - If the sum is less than the threshold, the neuron outputs -1; otherwise, the neuron activates and outputs +1 Threshold - The minimum value of the weighted input sum required to activate the node
• 5. Basic Terminologies Activation Function (f) - The function that performs a mathematical operation on the signal output - Sign function, Sigmoid function, Step function, Linear function - Sign function : f = +1 if x >= t, else -1 - Step function : f = +1 if x >= t, else 0 - Sigmoid function : f = 1 / (1 + e^(-x)) - Linear function : f = x
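The four activation functions listed above can be written as a minimal Python sketch. The function names and the default threshold of 0 are assumptions made for illustration, not part of the slides.

```python
import math

def step(x, t=0.0):
    """Step function: 1 if x reaches the threshold t, otherwise 0."""
    return 1 if x >= t else 0

def sign(x, t=0.0):
    """Sign function: +1 if x reaches the threshold t, otherwise -1."""
    return 1 if x >= t else -1

def sigmoid(x):
    """Sigmoid function: 1 / (1 + e^(-x)), with output in the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):
    """Linear (identity) function: returns the input unchanged."""
    return x

if __name__ == "__main__":
    for value in (-1.0, 0.0, 2.0):
        print(value, step(value), sign(value), sigmoid(value), linear(value))
```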
• 7. Types of Neural Network (Single Layer Feed Forward)
• 8. Types of Neural Network (Multi Layer Feed Forward)
• 9. Types of Neural Network (Feedback or Recurrent Network)
• 10. Applications of Neural Network 1. Financial Modelling - Stock Market Value Prediction 2. Robotics - Automatic adaptable robot 3. Data Analysis 4. Predictive Model - Weather prediction 5. Bioinformatic Application - DNA Sequencing
• 11. Learning Process 1. Supervised Learning: - You have labelled training data - You classify the inputs into one of the labels 2. Unsupervised Learning: - You have unlabelled data - You cluster the inputs into different groups
• 12. Learning Rate - Constant that affects the speed of learning - Affects the performance of the algorithm - High learning rate : trains faster, but accuracy tends to be lower because updates can overshoot - Low learning rate : accuracy tends to be higher, but training is slow - The learning rate should be tuned for the problem at hand after analysing the system
• 13. Activation Function (f) - Performs a mathematical operation on the signal output - Step Function : f(x) = 1 if x >= T ; 0 otherwise - Sign Function : f(x) = 1 if x >= T ; -1 otherwise - Sigmoid Function : f(x) = 1 / (1 + e^(-x)) - Linear Function : f(x) = x
• 14. McCulloch-Pitts Model - Early model of a neural network - Introduced by Warren McCulloch and Walter Pitts in 1943 - Also known as a linear threshold gate - It models a neuron with a set of inputs I1, I2, ..., In and one output Y - It classifies the set of inputs into two different classes, so the output is binary - Mathematically, it is modelled as: sum = I1 * w1 + I2 * w2 + ... + In * wn ; Y = f(sum), where w1, w2, ..., wn are weight values in the range (0, 1) or (-1, 1)
• 17. McCulloch-Pitts Model (Example - AND Gate) Inputs : I1, I2 Output : Y Threshold Function : f(x) = 0 if x < T ; f(x) = 1 if x >= T, where T is the threshold value
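A McCulloch-Pitts AND gate can be sketched as follows. The slides do not give concrete weights or a threshold for this example, so the values w1 = w2 = 0.5 and T = 1 are assumptions chosen so that the weighted sum reaches the threshold only when both inputs are 1.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Assumed (hypothetical) parameters for a two-input AND gate.
AND_WEIGHTS = (0.5, 0.5)
AND_THRESHOLD = 1.0

if __name__ == "__main__":
    for i1 in (0, 1):
        for i2 in (0, 1):
            print(i1, i2, mcculloch_pitts((i1, i2), AND_WEIGHTS, AND_THRESHOLD))
```

Other weight and threshold combinations work equally well, as long as only the input (1, 1) pushes the sum to the threshold.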
• 18. Adaline Network - It can have inputs -1 or +1 - It will produce output as -1 or +1 - It uses a bias input - Training of weight is done using delta rule (least mean squares) - During training, activation function is identity function - After training, activation function is threshold function
• 19. Adaline Network (Algorithm) 1. Initialize weights to small random values and select a learning rate (alpha) 2. For each input vector s with target output t, set the inputs xi to s. 3. Compute the neuron input: y_in = b + summation(xi * wi) 4. Use the delta rule to update the bias and weights: b(new) = b(old) + alpha * (t - y_in) ; wi(new) = wi(old) + alpha * (t - y_in) * xi 5. Repeat from step 2 until the largest weight change across all training samples is less than a specified tolerance value.
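The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names, the `max_epochs` safety cap, the fixed (rather than random) starting values, and the bipolar AND data are assumptions made so the run is reproducible.

```python
def delta_step(b, w, x, t, alpha):
    """One delta-rule update for a single sample; returns (new_b, new_w, y_in)."""
    y_in = b + sum(xi * wi for xi, wi in zip(x, w))
    err = t - y_in
    new_b = b + alpha * err
    new_w = [wi + alpha * err * xi for wi, xi in zip(w, x)]
    return new_b, new_w, y_in

def train_adaline(samples, b, w, alpha, tolerance, max_epochs=100):
    """Repeat passes over the samples until the largest change is below tolerance."""
    for epoch in range(1, max_epochs + 1):
        largest_change = 0.0
        for x, t in samples:
            new_b, new_w, _ = delta_step(b, w, x, t, alpha)
            change = max([abs(new_b - b)] + [abs(nw - ow) for nw, ow in zip(new_w, w)])
            largest_change = max(largest_change, change)
            b, w = new_b, new_w
        if largest_change < tolerance:
            break
    return b, w, epoch

def classify(b, w, x):
    """Threshold activation used after training (threshold at 0 is an assumption)."""
    return 1 if b + sum(xi * wi for xi, wi in zip(x, w)) >= 0 else -1

# Bipolar AND gate: output is +1 only when both inputs are +1.
AND_SAMPLES = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]

b, w, epochs = train_adaline(AND_SAMPLES, b=0.1, w=[0.2, 0.3], alpha=0.1, tolerance=0.1)
print("bias:", b, "weights:", w, "epochs:", epochs)
```

During training the identity activation is used (the error is computed on `y_in` directly), and the threshold is applied only when classifying afterwards.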
• 20. Adaline Network (Realization of AND Gate) Assume: learning rate = 0.1, tolerance = 0.1. Assume a two-input AND gate, which is true only if all the inputs are true. So, the inputs are I1, I2 and Bias (B). Activation Function: Y = 1 if y_in >= 0 ; Y = -1 if y_in < 0. Initialization: Set the weight of each input to a small random value: w1 = 0.2, w2 = 0.3, b = 0.1
• 21. Adaline Network (Realization of AND Gate) First Cycle: # Run 1 Inputs [B, I1, I2] = [1, 1, 1], Target t = 1 Y_in = 0.1 * 1 + 0.2 * 1 + 0.3 * 1 = 0.6 b(new) = 0.1 + 0.1 * (1 - 0.6) = 0.14 w1(new) = 0.2 + 0.1 * (1 - 0.6) * 1 = 0.24 w2(new) = 0.3 + 0.1 * (1 - 0.6) * 1 = 0.34 So, Largest weight change = 0.04
• 22. Adaline Network (Realization of AND Gate) First Cycle: # Run 2 Inputs [B, I1, I2] = [1, 1, -1], Target t = -1 Y_in = 0.04 b(new) = 0.04 w1(new) = 0.14 w2(new) = 0.44 So, Largest weight change = 0.1
• 23. Adaline Network (Realization of AND Gate) First Cycle: # Run 3 Inputs [B, I1, I2] = [1, -1, 1], Target t = -1 Y_in = 0.34 b(new) = -0.09 w1(new) = 0.27 w2(new) = 0.31 So, Largest weight change = 0.13
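• 24. Adaline Network (Realization of AND Gate) First Cycle: # Run 4 Inputs [B, I1, I2] = [1, -1, -1], Target t = -1 Y_in = -0.67 b(new) = -0.27 w1(new) = 0.43 w2(new) = 0.47 So, Largest weight change = 0.16 (Note: these updates correspond to an error term of -1.67; the delta rule with t = -1 and Y_in = -0.67 gives t - Y_in = -0.33, i.e. a weight change of about 0.03.)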
• 25. Adaline Network (Realization of AND Gate) Since the largest weight change in the first cycle is not below the tolerance value, we proceed to the second cycle.
• 26. Adaline Network (Realization of AND Gate) Second Cycle: # Run 1 Inputs [B, I1, I2] = [1, 1, 1], Target t = 1 Y_in = 0.63 b(new) = -0.233 w1(new) = 0.46 w2(new) = 0.5 So, Largest weight change = 0.03
• 27. Adaline Network (Realization of AND Gate) Second Cycle: # Run 2 Inputs [B, I1, I2] = [1, 1, -1], Target t = -1 Y_in = -0.273 b(new) = -0.3 w1(new) = 0.39 w2(new) = 0.57 So, Largest weight change = 0.07
• 28. Adaline Network (Realization of AND Gate) Second Cycle: # Run 3 Inputs [B, I1, I2] = [1, -1, 1], Target t = -1 Y_in = -0.12 b(new) = -0.38 w1(new) = 0.47 w2(new) = 0.48 So, Largest weight change = 0.09
• 29. Adaline Network (Realization of AND Gate) Second Cycle: # Run 4 Inputs [B, I1, I2] = [1, -1, -1], Target t = -1 Y_in = -1.33 b(new) = -0.34 w1(new) = 0.43 w2(new) = 0.44 So, Largest weight change = 0.04
• 30. Adaline Network (Realization of AND Gate) In the second cycle, the largest weight change across all the training samples is less than the tolerance value. So, the solution is: B = -0.34, W1 = 0.43, W2 = 0.44
• 33. Significance of Bias - During training, the weights change - But their only effect is to change the steepness of the curve - The curve will always pass through the origin - What if the problem requires the same curve, but shifted vertically? - In such situations, the bias plays an important role
• 34. Perceptron Network - Single layer Feed forward neural network - Primary use - binary classification - Able to learn any linearly separable function - Activation function : Step function
• 35. Perceptron Network (Algorithm) 1. Initially, random weights in the range [-0.5, 0.5] are assigned to the input variables 2. Inputs are provided and the output is observed. 3. If an error is present, the weights are adjusted using the rule: Wi = Wi + (alpha * Xi * e), where e is the desired output minus the actual output 4. This process is continued until a single epoch has no error for the whole input set.
• 36. Perceptron Network (Realization of AND Gate) Assume: learning rate = 0.1, threshold = 0.2. Assume a two-input AND gate, which is true only if all the inputs are true. So, the inputs are I1, I2. Activation Function: Y = 1 if y_in >= 0 ; Y = 0 if y_in < 0, where y_in = I1 * w1 + I2 * w2 - threshold. Initialization: Set the weight of each input to a small random value: w1 = 0.3, w2 = -0.1
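• 37. Perceptron Network (Realization of AND Gate) Epoch 1:

| I1 | I2 | Y(d) | Y(act) | e | w1 | w2 |
|----|----|------|--------|---|----|----|
| 0 | 0 | 0 | 0 | 0 | 0.3 | -0.1 |
| 0 | 1 | 0 | 0 | 0 | 0.3 | -0.1 |
| 1 | 0 | 0 | 1 | -1 | 0.2 | -0.1 |
| 1 | 1 | 1 | 0 | 1 | 0.3 | 0 |
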
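• 38. Perceptron Network (Realization of AND Gate) Epoch 2:

| I1 | I2 | Y(d) | Y(act) | e | w1 | w2 |
|----|----|------|--------|---|----|----|
| 0 | 0 | 0 | 0 | 0 | 0.3 | 0 |
| 0 | 1 | 0 | 0 | 0 | 0.3 | 0 |
| 1 | 0 | 0 | 1 | -1 | 0.2 | 0 |
| 1 | 1 | 1 | 1 | 0 | 0.2 | 0 |
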
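• 39. Perceptron Network (Realization of AND Gate) Epoch 3:

| I1 | I2 | Y(d) | Y(act) | e | w1 | w2 |
|----|----|------|--------|---|----|----|
| 0 | 0 | 0 | 0 | 0 | 0.2 | 0 |
| 0 | 1 | 0 | 0 | 0 | 0.2 | 0 |
| 1 | 0 | 0 | 1 | -1 | 0.1 | 0 |
| 1 | 1 | 1 | 0 | 1 | 0.2 | 0.1 |
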
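• 40. Perceptron Network (Realization of AND Gate) Epoch 4:

| I1 | I2 | Y(d) | Y(act) | e | w1 | w2 |
|----|----|------|--------|---|----|----|
| 0 | 0 | 0 | 0 | 0 | 0.2 | 0.1 |
| 0 | 1 | 0 | 0 | 0 | 0.2 | 0.1 |
| 1 | 0 | 0 | 1 | -1 | 0.1 | 0.1 |
| 1 | 1 | 1 | 1 | 0 | 0.1 | 0.1 |
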
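• 41. Perceptron Network (Realization of AND Gate) Epoch 5:

| I1 | I2 | Y(d) | Y(act) | e | w1 | w2 |
|----|----|------|--------|---|----|----|
| 0 | 0 | 0 | 0 | 0 | 0.1 | 0.1 |
| 0 | 1 | 0 | 0 | 0 | 0.1 | 0.1 |
| 1 | 0 | 0 | 0 | 0 | 0.1 | 0.1 |
| 1 | 1 | 1 | 1 | 0 | 0.1 | 0.1 |
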
• 42. Perceptron Network (Realization of AND Gate) Epoch 5 produces no errors. So, the desired solution is: W1 = 0.1, W2 = 0.1
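The five epochs above can be reproduced with a short Python sketch. It assumes that the neuron input is y_in = I1 * w1 + I2 * w2 - threshold, which matches the tabulated outputs, and that e is the desired output minus the actual output. `Fraction` is used instead of floats so that comparisons such as 0.2 - 0.2 >= 0 are exact; the function name and the `max_epochs` cap are illustrative.

```python
from fractions import Fraction

def train_perceptron(w1, w2, threshold, alpha, max_epochs=100):
    """Train a two-input perceptron on the AND gate; returns (w1, w2, epochs)."""
    samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for (x1, x2), desired in samples:
            y_in = x1 * w1 + x2 * w2 - threshold
            actual = 1 if y_in >= 0 else 0      # step activation
            e = desired - actual
            if e != 0:
                errors += 1
                w1 += alpha * x1 * e            # Wi = Wi + alpha * Xi * e
                w2 += alpha * x2 * e
        if errors == 0:                         # an epoch with no errors: stop
            return w1, w2, epoch
    return w1, w2, max_epochs

w1, w2, epochs = train_perceptron(
    Fraction(3, 10), Fraction(-1, 10), threshold=Fraction(2, 10), alpha=Fraction(1, 10)
)
print(float(w1), float(w2), epochs)  # prints: 0.1 0.1 5
```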
• 43. Back Propagation Network - Multilayer feed forward neural network - Calculates the gradient of the loss function with respect to all weights in the network - The gradient is passed to the optimization method, which updates the weights to minimize the loss - Two phases : propagation and weight update
• 44. Back Propagation Network (Algorithm) Assumption: A = no. of units in input layer, B = no. of units in hidden layer, C = no. of units in output layer, xi = activation level of units in input layer, hi = activation level of units in hidden layer, oi = activation level of units in output layer, yi = target output for units in output layer, w1ij = weight from input to hidden layer, w2ij = weight from hidden to output layer
• 45. Back Propagation Network (Algorithm) 1. Initialize weights randomly in the range [-0.1, 0.1] 2. Initialize activations of thresholding units (x0 and h0) 3. Choose an input-output pair (xi and yi). Assign activation levels to the input units. 4. Propagate activations from input layer to hidden layer using the activation function: h(j) = 1 / [1 + e^(-Sum(i = 0 to A) [w1ij * xi])], for j = 1, ..., B 5. Propagate activations from hidden layer to output layer using the activation function: o(j) = 1 / [1 + e^(-Sum(i = 0 to B) [w2ij * hi])], for j = 1, ..., C
• 46. Back Propagation Network (Algorithm) 6. Compute errors of units in output layer (d2j): d2j = oj * (1 - oj) * (yj - oj), for j = 1 to C 7. Compute errors of units in hidden layer (d1j): d1j = hj * (1 - hj) * Sum(i = 1 to C) [d2i * w2ji], for j = 1 to B 8. Adjust weights between hidden layer and output layer: w2ij (new) = (alpha * d2j * hi) + w2ij
• 47. Back Propagation Network (Algorithm) 9. Adjust weights between input layer and hidden layer: w1ij (new) = (alpha * d1j * xi) + w1ij 10. Repeat from step 4 until convergence
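Steps 1 to 9 can be sketched in plain Python for a network with one hidden layer. This is a minimal sketch under stated assumptions: the thresholding units x0 and h0 are fixed at 1.0 (the slides do not give their value), the function names are illustrative, and the XOR data, learning rate and epoch count in the usage part are arbitrary choices with no guarantee of convergence.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_weights(n_from, n_to, rng):
    """w[i][j]: weight from unit i (index 0 is the thresholding unit) to unit j."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_to)] for _ in range(n_from + 1)]

def forward(w1, w2, x):
    """Steps 4-5: propagate input -> hidden -> output."""
    xs = [1.0] + list(x)  # x0 = 1.0 (assumed)
    h = [sigmoid(sum(w1[i][j] * xs[i] for i in range(len(xs)))) for j in range(len(w1[0]))]
    hs = [1.0] + h        # h0 = 1.0 (assumed)
    o = [sigmoid(sum(w2[i][j] * hs[i] for i in range(len(hs)))) for j in range(len(w2[0]))]
    return xs, hs, o

def train_step(w1, w2, x, y, alpha):
    """Steps 6-9 for one input-output pair; updates w1 and w2 in place."""
    xs, hs, o = forward(w1, w2, x)
    d2 = [o[j] * (1 - o[j]) * (y[j] - o[j]) for j in range(len(o))]
    # hs[j] for j >= 1 is hidden unit j; w2[j][i] is its weight to output i.
    d1 = [hs[j] * (1 - hs[j]) * sum(d2[i] * w2[j][i] for i in range(len(o)))
          for j in range(1, len(hs))]
    for i in range(len(hs)):
        for j in range(len(o)):
            w2[i][j] += alpha * d2[j] * hs[i]
    for i in range(len(xs)):
        for j in range(len(d1)):
            w1[i][j] += alpha * d1[j] * xs[i]
    return o

if __name__ == "__main__":
    rng = random.Random(0)
    w1 = init_weights(2, 2, rng)   # A = 2 inputs, B = 2 hidden units
    w2 = init_weights(2, 1, rng)   # C = 1 output unit
    xor = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
    for _ in range(2000):
        for x, y in xor:
            train_step(w1, w2, x, y, alpha=0.5)
    for x, y in xor:
        print(x, y, forward(w1, w2, x)[2])
```

The hidden-layer errors are computed before the output weights are changed, following the order of steps 7 and 8.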
• 48. Back Propagation Network (Application) - To design neural network for linearly inseparable functions.
• 49. Hopfield Network - A network with memory - Features of Hopfield network are: a) Distributed representation b) Asynchronous control c) Content addressable memory d) Fault Tolerance
• 50. Hopfield Network (Steps) - Processing units are in one of two states: active or inactive - A positive weight indicates two units tend to activate each other - A negative weight indicates an active unit deactivates a neighbouring unit
• 51. Hopfield Network (Algorithm) 1. A random unit is chosen. 2. If any of its neighbors are active, the unit computes sum of weights on the connections to active neighbours. 3. If the sum is positive, the unit becomes active 4. Otherwise, it becomes inactive 5. Repeat from step 1, until it reaches stable state.
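The update loop above can be sketched as follows. The weight matrix is a hypothetical three-unit example (units 0 and 1 excite each other, and both inhibit unit 2), and the sketch assumes that a unit with no active neighbours has a sum of 0 and therefore becomes inactive. Stability is detected by checking that no unit would change state.

```python
import random

def unit_next_state(weights, state, i):
    """Sum the weights on connections to active neighbours; active (1) if positive."""
    total = sum(weights[i][j] for j in range(len(state)) if j != i and state[j] == 1)
    return 1 if total > 0 else 0

def run_hopfield(weights, state, rng, max_steps=1000):
    """Asynchronously update randomly chosen units until the state is stable."""
    state = list(state)
    for _ in range(max_steps):
        if all(unit_next_state(weights, state, i) == state[i] for i in range(len(state))):
            break                               # stable: no unit would change
        i = rng.randrange(len(state))           # step 1: choose a random unit
        state[i] = unit_next_state(weights, state, i)
    return state

# Hypothetical symmetric weights: w[0][1] = 2 (excitatory), w[0][2] = w[1][2] = -1.
WEIGHTS = [
    [0, 2, -1],
    [2, 0, -1],
    [-1, -1, 0],
]

if __name__ == "__main__":
    print(run_hopfield(WEIGHTS, [1, 1, 1], random.Random(0)))
```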
• 52. Hopfield Network (Practical Applications) 1. Image detection and recognition 2. X-ray Image Enhancement 3. Medical Image Restoration 4. Pattern Recognition 5. Optimization
• 53. Kohonen Network - Unsupervised learning network based on concept of graded learning - Feed forward network with input and output(Kohonen) layers - Every neuron of input layer is connected to every neuron of Kohonen layer - Each connection is associated with some weight. - Also known as self organizing map
• 54. Kohonen Network (Algorithm) 1. Initialize all the weights randomly 2. Initialize the neighbourhood radius and learning rate 3. For each input vector: a) Select an input vector at random b) Find the Kohonen neuron j whose weight vector is closest to the input vector c) Modify the weights of all neurons within the neighbourhood of radius r of the selected neuron using: wj(t+1) = wj(t) + alpha * [x(t) - wj(t)]
• 55. Kohonen Network (Algorithm) d) Update the value of alpha by reducing it gradually e) Reduce neighbourhood radius r gradually
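The Kohonen algorithm above can be sketched in Python. Several choices here are assumptions made for illustration: the Kohonen neurons are arranged on a one-dimensional line (so the neighbourhood is a range of indices), closeness is measured by squared Euclidean distance, alpha is multiplied by 0.9 after each pass, and the radius shrinks by 1 per pass.

```python
import random

def closest_neuron(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((xi - wi) ** 2 for xi, wi in zip(x, weights[j])),
    )

def update(w, x, alpha):
    """wj(t+1) = wj(t) + alpha * [x(t) - wj(t)]"""
    return [wi + alpha * (xi - wi) for wi, xi in zip(w, x)]

def train_som(data, n_neurons, alpha=0.5, radius=2, epochs=20, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):        # inputs in random order
            winner = closest_neuron(weights, x)
            low = max(0, winner - radius)
            high = min(n_neurons, winner + radius + 1)
            for j in range(low, high):               # neighbourhood of the winner
                weights[j] = update(weights[j], x, alpha)
        alpha *= 0.9                                 # reduce alpha gradually
        radius = max(0, radius - 1)                  # reduce radius gradually
    return weights

if __name__ == "__main__":
    points = [[0.1, 0.1], [0.15, 0.2], [0.9, 0.85], [0.8, 0.9]]
    for w in train_som(points, n_neurons=4):
        print(w)
```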
• 56. Kohonen Network (Practical Application) 1. Clustering of genes in Medical field 2. Analysis of multimedia and web based contents 3. Analysis of remote satellite sensed images