Agenda
▪ Why Not Feedforward Networks?
▪ What Is Recurrent Neural Network?
▪ Issues With Recurrent Neural Networks
▪ Vanishing And Exploding Gradient
▪ How To Overcome These Challenges?
▪ Long Short Term Memory Units
▪ LSTM Use-Case
Why Not Feedforward Network?
Let’s begin by understanding a few limitations of feedforward networks
Why Not Feedforward Networks?
A trained feedforward network can be exposed to any random collection of photographs, and the first photograph it is
exposed to will not necessarily alter how it classifies the second.
Seeing a photograph of a dog will not lead the net to perceive an elephant next.
The output at ‘t’ has no relation to the output at ‘t-1’.
Why Not Feedforward Networks?
When you read a book, you understand it based on your understanding of the previous words.
A feedforward net cannot predict the next word in a sentence: the output at ‘t+1’ is computed from the input at ‘t+1’ alone, independent of the previous outputs at ‘t-2’, ‘t-1’ and ‘t’.
How To Overcome This Challenge?
Let’s understand how an RNN solves this problem
How To Overcome This Challenge?
Diagram: the same recurrent cell ‘A’ unrolled over time. The input at ‘t-1’ produces the output at ‘t-1’; information from the input at ‘t-1’ is carried into the cell at ‘t’, so the output at ‘t’ depends on both the input at ‘t’ and that carried information, and likewise for ‘t+1’.
What Is Recurrent Neural Network?
Now is the right time to understand what an RNN is
What Is Recurrent Neural Network?
Suppose your gym trainer has made a schedule for you. The exercises are repeated after every third day.

Recurrent networks are a type of artificial neural network designed to recognize patterns in sequences of data, such
as text, genomes, handwriting, the spoken word, or numerical time-series data emanating from sensors, stock
markets and government agencies.
What Is Recurrent Neural Network?
Predicting the type of exercise using a feedforward net

The schedule: shoulder exercises on the first day, biceps exercises on the second day, cardio exercises on the third day, then the cycle repeats.
A feedforward net would have to predict the exercise from inputs such as the day of the week, the month of the year and the health status, and output one of shoulder exercises, biceps exercises or cardio exercises.
What Is Recurrent Neural Network?
Predicting the type of exercise using a recurrent net

With the same schedule, a recurrent net can predict today’s exercise from yesterday’s: the inputs are which exercise was done yesterday (shoulder, biceps or cardio), and the outputs are today’s shoulder, biceps or cardio exercises.
What Is Recurrent Neural Network?
Predicting the type of exercise using a recurrent net

The information from the prediction at time ‘t-1’ (vector 1) and the new information (vector 2) are combined to produce the prediction at time ‘t’ (vector 3).
What Is Recurrent Neural Network?
The unrolled network: each input x0, x1, x2 feeds its hidden state h0, h1, h2 through the input weight w_i; each hidden state produces its output y0, y1, y2 through the output weight w_y; and the recurrent weight w_R carries each hidden state forward to the next time step.

h(t) = g_h(w_i x(t) + w_R h(t-1) + b_h)
y(t) = g_y(w_y h(t) + b_y)
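To make the recurrence concrete, here is a minimal NumPy sketch of this forward pass (an illustration, not code from the slides; the tanh and softmax choices for g_h and g_y, and all the toy dimensions, are assumptions):

```python
import numpy as np

def rnn_forward(x_seq, w_i, w_R, w_y, b_h, b_y):
    """h(t) = tanh(w_i x(t) + w_R h(t-1) + b_h), y(t) = softmax(w_y h(t) + b_y)."""
    h = np.zeros(w_R.shape[0])                       # initial hidden state h(-1)
    outputs = []
    for x_t in x_seq:
        h = np.tanh(w_i @ x_t + w_R @ h + b_h)       # hidden state update
        logits = w_y @ h + b_y
        outputs.append(np.exp(logits) / np.exp(logits).sum())  # softmax output
    return outputs, h

# Toy sizes: 3-dimensional inputs, 4 hidden units, 2 output classes
rng = np.random.default_rng(0)
w_i, w_R = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))
w_y, b_h, b_y = rng.normal(size=(2, 4)), np.zeros(4), np.zeros(2)
ys, h_last = rnn_forward([rng.normal(size=3) for _ in range(5)], w_i, w_R, w_y, b_h, b_y)
```

The same weights w_i, w_R and w_y are reused at every time step; only the hidden state h changes.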
Training A Recurrent Neural Network
Let’s see how we train a Recurrent Neural Network
Training A Recurrent Neural Network
Recurrent neural nets use the backpropagation algorithm, but it is applied at every time step. It is
commonly known as Backpropagation Through Time (BPTT).
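As a sketch of what BPTT actually computes (an illustrative assumption using a one-unit RNN, not the slides' code), the forward pass stores every hidden state and the backward pass walks the time steps in reverse, accumulating the gradients of the input weight and the recurrent weight:

```python
import numpy as np

def bptt_scalar_rnn(xs, targets, w_i, w_r):
    """BPTT for a toy scalar RNN: h_t = tanh(w_i*x_t + w_r*h_{t-1}),
    loss = sum_t (h_t - target_t)^2."""
    # Forward pass: keep every hidden state for the backward pass
    hs = [0.0]                                    # h_{-1} = 0
    for x in xs:
        hs.append(np.tanh(w_i * x + w_r * hs[-1]))

    # Backward pass: walk the time steps in reverse
    dw_i = dw_r = 0.0
    carry = 0.0                                   # gradient flowing back from step t+1
    for t in reversed(range(len(xs))):
        h_t, h_prev = hs[t + 1], hs[t]
        grad_h = 2.0 * (h_t - targets[t]) + carry
        dpre = grad_h * (1.0 - h_t ** 2)          # through the tanh
        dw_i += dpre * xs[t]
        dw_r += dpre * h_prev
        carry = dpre * w_r                        # hand the gradient to step t-1
    return dw_i, dw_r

dw_i, dw_r = bptt_scalar_rnn([0.5, -1.0, 2.0], [0.1, 0.2, 0.3], w_i=0.4, w_r=0.8)
```

The `carry` term is multiplied by w_r (and by a tanh derivative) once per time step, which is exactly where the vanishing and exploding gradient problems come from.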
Let’s look at the issues that come with backpropagation through time: the vanishing gradient and the exploding gradient.
Vanishing And Exploding Gradient Problem
Let’s understand the issues with Recurrent Neural Networks
Vanishing Gradient
Backpropagation updates each weight as

w = w + Δw, where Δw = η · (∂e/∂w) and e = (Actual Output − Model Output)²

If ∂e/∂w ≪ 1, then Δw ≪ 1 as well, so the weight barely changes and the network learns extremely slowly (the gradient has vanished).
Exploding Gradient
Backpropagation updates each weight as

w = w + Δw, where Δw = η · (∂e/∂w) and e = (Actual Output − Model Output)²

If ∂e/∂w ≫ 1, then Δw ≫ 1, so the weight is updated by a huge amount and training diverges (the gradient has exploded).
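A quick numeric illustration (not from the slides): because the gradient picks up one multiplicative factor per time step, factors slightly below 1 drive it toward zero over long sequences, and factors slightly above 1 blow it up:

```python
steps = 50                      # length of the unrolled sequence
print(0.9 ** steps)             # ~0.005  -> vanishing gradient
print(1.1 ** steps)             # ~117    -> exploding gradient
```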
How To Overcome These Challenges?
Now, let’s understand how we can overcome Vanishing and Exploding Gradient
How To Overcome These Challenges?
Exploding gradients
▪ Truncated BPTT
Instead of starting backpropagation at the last time step, we can backpropagate over a smaller window, such as 10 time steps (we lose the temporal context beyond those 10 steps)
▪ Clip gradients at threshold
Clip the gradient when it goes higher than a threshold (see the sketch after this list)
▪ RMSprop to adjust the learning rate

Vanishing gradients
▪ ReLU activation function
We can use activation functions like ReLU, whose gradient is 1 for positive inputs, so repeated multiplication does not shrink the gradient
▪ RMSprop
An adaptive method that rescales the learning rate per parameter
▪ LSTMs, GRUs
Network architectures that have been specially designed to combat this problem
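As a concrete sketch of the clipping idea (plain NumPy, an illustration rather than the slides' code), the gradient is rescaled whenever its norm exceeds a chosen threshold:

```python
import numpy as np

def clip_gradient(grad, threshold=5.0):
    """Rescale the gradient so its L2 norm never exceeds `threshold`."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)   # same direction, capped magnitude
    return grad

print(clip_gradient(np.array([30.0, 40.0])))   # norm 50 -> rescaled to [3. 4.]
```

In TensorFlow the same effect can be obtained with tf.clip_by_norm.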
Long Short Term Memory Networks
✓ Long Short Term Memory networks – usually just called “LSTMs” – are a special kind of RNN.
✓ They are capable of learning long-term dependencies.
The repeating module in a standard RNN contains a single layer
Long Short Term Memory Networks
Step-1
The first step in the LSTM is to identify the information that is not required and will be thrown away from the cell state. This decision is made by a sigmoid layer called the forget gate layer.

f_t = σ(w_f · [h_{t-1}, x_t] + b_f)

w_f = weight
h_{t-1} = output from the previous time step
x_t = new input
b_f = bias
Long Short Term Memory Networks
Step-2
The next step is to decide what new information we’re going to store in the cell state. This has two parts: a sigmoid layer called the “input gate layer” decides which values will be updated, and a tanh layer creates a vector of new candidate values, c̃_t, that could be added to the state.

i_t = σ(w_i · [h_{t-1}, x_t] + b_i)
c̃_t = tanh(w_c · [h_{t-1}, x_t] + b_c)

In the next step, we’ll combine these two to update the state.
Long Short Term Memory Networks
Step-3
Now we will update the old cell state, c_{t-1}, into the new cell state c_t. First, we multiply the old state c_{t-1} by f_t, forgetting the things we decided to forget earlier. Then we add i_t ∗ c̃_t: the new candidate values, scaled by how much we decided to update each state value.

c_t = f_t ∗ c_{t-1} + i_t ∗ c̃_t
Long Short Term Memory Networks
Step-4
Finally, we run a sigmoid layer that decides what parts of the cell state we’re going to output. Then we put the cell state through tanh (pushing the values to be between −1 and 1) and multiply it by the output of the sigmoid gate, so that we only output the parts we decided to.

o_t = σ(w_o · [h_{t-1}, x_t] + b_o)
h_t = o_t ∗ tanh(c_t)
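Putting the four steps together, here is a minimal NumPy sketch of one LSTM time step (the shapes, naming and random initialization are illustrative assumptions, not the slides' code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, w, b):
    """One LSTM time step; each weight matrix acts on [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(w["f"] @ z + b["f"])        # Step 1: forget gate
    i_t = sigmoid(w["i"] @ z + b["i"])        # Step 2: input gate
    c_tilde = np.tanh(w["c"] @ z + b["c"])    # Step 2: candidate values
    c_t = f_t * c_prev + i_t * c_tilde        # Step 3: new cell state
    o_t = sigmoid(w["o"] @ z + b["o"])        # Step 4: output gate
    h_t = o_t * np.tanh(c_t)                  # Step 4: new hidden state / output
    return h_t, c_t

# Toy sizes: 3-dimensional input, 4 hidden units (so [h, x] has 7 elements)
rng = np.random.default_rng(1)
w = {k: rng.normal(size=(4, 7)) for k in "fico"}
b = {k: np.zeros(4) for k in "fico"}
h, c = np.zeros(4), np.zeros(4)
h, c = lstm_cell(rng.normal(size=3), h, c, w, b)
```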
LSTM Use-Case
Let’s look at a use-case where we will be using TensorFlow
Long Short Term Memory Networks Use-Case
We will feed an LSTM with correct sequences of 3 symbols from the text as inputs and 1 labeled symbol as the target; eventually the neural network will learn to predict the next symbol correctly.

For example, the inputs “had a general” go into an LSTM cell with three inputs and one output, and its prediction is compared against the label “council”.
Long Short Term Memory Networks Use-Case
long ago , the mice had a general council to consider what measures
they could take to outwit their common enemy , the cat . some said
this , and some said that but at last a young mouse got up and said he
had a proposal to make , which he thought would meet the case . you
will all agree , said he , that our chief danger consists in the sly and
treacherous manner in which the enemy approaches us . now , if we
could receive some signal of her approach , we could easily escape from
her . i venture , therefore , to propose that a small bell be procured , and
attached by a ribbon round the neck of the cat . by this means we
should always know when she was about , and could easily retire while
she was in the neighborhood . this proposal met with general applause ,
until an old mouse got up and said that is all very well , but who is to
bell the cat ? the mice looked at one another and nobody spoke . then
the old mouse said it is easy to propose impossible remedies .
How to train the network?
A short story from Aesop’s Fables with 112 unique symbols
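Before training, the text has to be turned into numbers. A minimal sketch of that preprocessing (the variable names and whitespace tokenization are assumptions for illustration):

```python
# `story` holds the full fable text shown above, with punctuation separated by spaces
story = "long ago , the mice had a general council to consider what measures ..."
symbols = story.split()                       # whitespace tokens, so "," and "." count as symbols
vocab = sorted(set(symbols))                  # 112 unique symbols for the full story
symbol_to_id = {s: i for i, s in enumerate(vocab)}
id_to_symbol = {i: s for s, i in symbol_to_id.items()}

# Training pairs: three consecutive symbol ids in, the id of the following symbol out
pairs = [([symbol_to_id[s] for s in symbols[i:i + 3]], symbol_to_id[symbols[i + 3]])
         for i in range(len(symbols) - 3)]
```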
Long Short Term Memory Networks Use-Case
A unique integer value is assigned to each symbol, because LSTM inputs can only understand real numbers.

For example, “had a general” becomes the integers 20, 6, 33, which are fed to the LSTM cell (three inputs, one output). The network outputs a 112-element vector of probabilities (e.g. .01, .02, …, .6, …, .00); the index with the highest probability, 37, is the prediction “council”, which is compared against the label “council” (37).
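One possible way to wire this up in TensorFlow (a hedged sketch using the Keras API; the embedding layer, hidden size and optimizer are assumptions, not the slides' original code):

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 112                 # unique symbols in the story
N_HIDDEN = 512                   # LSTM units (an arbitrary illustrative choice)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),                # integer id -> vector
    tf.keras.layers.LSTM(N_HIDDEN),                           # reads the 3 input symbols
    tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax"),  # 112-element probability vector
])
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# x: shape (num_examples, 3) integer ids, e.g. [20, 6, 33] for "had a general"
# y: shape (num_examples,)   id of the next symbol, e.g. 37 for "council"
# model.fit(x, y, epochs=50)
# pred_id = int(np.argmax(model.predict(np.array([[20, 6, 33]]))))
```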
Session In A Minute
▪ Why Not Feedforward Network?
▪ What Is Recurrent Neural Network?
▪ Vanishing Gradient
▪ Exploding Gradient
▪ LSTMs
▪ LSTM Use-Case

Recurrent Neural Network Tutorial
Copyright © 2017, edureka and/or its affiliates. All rights reserved.