Recurrent Neural Network
ACRRL
Applied Control & Robotics Research Laboratory of Shiraz University
Department of Power and Control Engineering, Shiraz University, Fars, Iran.
Mohammad Sabouri
https://sites.google.com/view/acrrl/
3. Recurrent Neural Network
A Recurrent Neural Network (RNN) is a class of artificial neural network that has memory, in the form of feedback loops, which allows it to better recognize patterns in sequential data.
• The recurrent neural network (RNN) is a neural network model proposed in the 1980s for modelling time series.
• RNNs are an extension of regular artificial neural networks that add connections feeding the hidden layers of the network back into themselves; these are called recurrent connections.
4. Recurrent Neural Network
The structure of the network is similar to that of a feedforward neural network, with the distinction that it allows a recurrent hidden state whose activation at each time step depends on that of the previous time step (cycle).
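To make the recurrence concrete, here is a minimal sketch of a single hidden-state update in Python (the tanh nonlinearity, the bias term, and all names and sizes are illustrative assumptions, not taken from the slides):

    import numpy as np

    def rnn_step(x_t, h_prev, W_hx, W_hh, b):
        # The new hidden state depends on the current input x_t AND on
        # the previous hidden state h_prev -- this is the recurrence.
        return np.tanh(W_hx @ x_t + W_hh @ h_prev + b)

    rng = np.random.default_rng(0)
    W_hx = rng.normal(size=(3, 4))        # hidden_dim x input_dim
    W_hh = rng.normal(size=(3, 3))        # hidden_dim x hidden_dim
    b = np.zeros(3)
    h = np.zeros(3)                       # initial hidden state
    for x_t in rng.normal(size=(5, 4)):   # a toy sequence of 5 inputs
        h = rnn_step(x_t, h, W_hx, W_hh, b)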
5. Recurrent Neural Network
• Recurrent networks, on the other hand, take as their input not just the current input example they see, but also what they have perceived previously in time.
• This gives RNNs an advantage over MLPs, which see only a single input and have no memory: an RNN can take several prior inputs into account and extrapolate from them with improved accuracy. In other words, an RNN takes into consideration what it has learned from prior inputs when classifying the current input.
6. Some Examples of Recurrent Neural Networks
The beauty of recurrent neural networks lies in their diversity of application: RNNs can deal with a great variety of input and output types.
• Sentiment Classification
• Image Captioning
• Language Translation
7. Some Examples of Recurrent Neural Networks
• Sentiment Classification
This can be a task of simply classifying tweets into positive and negative sentiment. So here the input would be a tweet of varying length, while the output is of a fixed type and size.
8. Some Examples of Recurrent Neural Networks
• Image Captioning
Here, let’s say we have an image for which we need a textual description. So we have a single input, the image, and a series or sequence of words as output. Here the image might be of a fixed size, but the output is a description of varying length.
9. Some Examples of Recurrent Neural Networks
• Language Translation
This basically means that we have some text in a particular language, let’s say English, and we wish to translate it into French. Each language has its own semantics and would have varying lengths for the same sentence. So here the inputs as well as the outputs are of varying lengths (the three input/output patterns are sketched below).
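The three applications above correspond to the standard many-to-one, one-to-many, and many-to-many input/output patterns. A small sketch of the shapes involved, with all dimensions chosen purely for illustration:

    import numpy as np

    T_src, T_tgt, input_dim, vocab, n_classes = 7, 5, 50, 1000, 2

    # Many-to-one (sentiment classification): whole sequence in, one label out.
    tweet = np.zeros((T_src, input_dim)); sentiment = np.zeros(n_classes)

    # One-to-many (image captioning): one image vector in, word sequence out.
    image = np.zeros(2048); caption = np.zeros((T_tgt, vocab))

    # Many-to-many (language translation): sequences of different lengths.
    english = np.zeros((T_src, input_dim)); french = np.zeros((T_tgt, vocab))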
10. Recurrent Neural Network
So RNNs can be used for mapping inputs to outputs of varying types and lengths, and they are fairly generalized in their application. Let us look at some of their applications.
11. Where to use an RNN?
• Language Modelling and Generating Text
Given a sequence of words, here we try to predict the likelihood of the next word. This is useful for translation, since the most likely sentence would be the one that is correct (a minimal sketch of this prediction step follows this list).
• Machine Translation
Translating text from one language to another uses one form of RNN or another. All practical present-day systems use some advanced version of an RNN.
• Speech Recognition
Predicting phonetic segments based on input sound waves, thus formulating a word.
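A minimal, hedged sketch of the language-modelling step described above, assuming a softmax output layer over the vocabulary (the names and sizes are illustrative):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def next_word_probs(h_t, W_yh):
        # Turn the hidden state at time t into a probability
        # distribution over the vocabulary for the next word.
        return softmax(W_yh @ h_t)

    rng = np.random.default_rng(1)
    vocab_size, hidden_dim = 10, 3
    p = next_word_probs(rng.normal(size=hidden_dim),
                        rng.normal(size=(vocab_size, hidden_dim)))
    print(p.argmax())   # index of the most likely next word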
12. Where to use an RNN?
• Generating Image Descriptions
A very big use case is to understand what is happening inside an image so that we can produce a good description. This works as a combination of a CNN and an RNN: the CNN does the segmentation, and the RNN then uses the segmented data to generate the description. It’s rudimentary, but the possibilities are limitless.
• Video Tagging
This can be used for video search, where we produce an image description of a video frame by frame.
15. Mathematical Formulation
Recurrent neural networks learn from sequences. A sequence is defined as a list of (xi, yi) pairs, where xi is the input at time i and yi is the desired output. Note that this is a single sequence; the entire data set consists of many sequences.
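For instance, one such sequence, and a data set of sequences, might look like this (toy values, purely illustrative):

    # One sequence: a list of (x_i, y_i) pairs, indexed by time i.
    sequence = [([0.1, 0.3], [1.0]),   # (x_0, y_0)
                ([0.7, 0.2], [0.0]),   # (x_1, y_1)
                ([0.4, 0.9], [1.0])]   # (x_2, y_2)

    # The data set is a collection of many such sequences,
    # possibly of different lengths.
    dataset = [sequence, sequence[:2]]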
16. Mathematical Formulation
In addition to the data in our data set, each time step has another input: the hidden state hi−1 from the previous time step. In this way, the recurrent neural network can maintain some internal context as it progresses forward in the sequence. Thus, to summarize, at time i the recurrent network has:
• Input vector xi (data)
• Output vector yi (data)
• Predicted output vector y^i (computed through forward propagation)
• Hidden state hi
17. Mathematical Formulation
When looking only at a single time step, the recurrent network looks like a simple one-hidden-layer feedforward network. It has an input layer for xi, an output layer for yi, and another input layer for the previous hidden state hi−1. Finally, it has one hidden layer between these. The only unusual thing is that we have two input layers; both input layers are connected to the hidden layer as if they were really just a single layer.
Thus, we have three separate matrices of weights:
• Input-to-hidden weights Whx
• Hidden-to-hidden weights Whh
• Hidden-to-output weights Wyh
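Written out with these matrices, the forward-propagation equations that the next slide refers to are (reconstructed here from the definitions above, with tanh standing in for the generic nonlinearity):

    hi = tanh(Whx xi + Whh hi−1)
    y^i = Wyh hi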
18. Mathematical Formulation
There are several things to note here. First of all, note that the predicted outputs are not subject to the nonlinearity. We may want to predict values outside the range of the nonlinearity, so we do not apply it to the output. For specific use cases of recurrent nets this can be amended, and a nonlinearity specific to the problem can be chosen. Finally, note that these equations are the same as the equations for a single-hidden-layer feedforward network, with the caveat that the input layer is broken into two pieces, xi and hi−1.
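A minimal sketch of forward propagation through a whole sequence under these equations; the weight names follow the slides, while tanh, the sizes, and the random initialization are assumptions for illustration:

    import numpy as np

    def forward(xs, W_hx, W_hh, W_yh):
        # Run the recurrence over a sequence xs of input vectors.
        h = np.zeros(W_hh.shape[0])               # h_{-1}: initial hidden state
        predictions = []
        for x_i in xs:
            h = np.tanh(W_hx @ x_i + W_hh @ h)    # hidden-state update
            predictions.append(W_yh @ h)          # y^_i: no output nonlinearity
        return predictions

    rng = np.random.default_rng(0)
    input_dim, hidden_dim, output_dim = 4, 3, 2
    W_hx = rng.normal(size=(hidden_dim, input_dim))
    W_hh = rng.normal(size=(hidden_dim, hidden_dim))
    W_yh = rng.normal(size=(output_dim, hidden_dim))
    y_hats = forward(rng.normal(size=(6, input_dim)), W_hx, W_hh, W_yh)

Note that, exactly as stated above, the predicted output is a plain linear map of the hidden state, with no nonlinearity applied.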
22. Jordan RNN
Pro: fast to train, because training can be parallelized in time (see the sketch after this list)
Cons:
• The output transforms the hidden state → nonlinear effects, information is distorted
• The output dimension may be too small → information in the hidden states is truncated
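The Jordan architecture feeds the previous output back into the hidden layer, while the Elman architecture (next slide) feeds back the previous hidden state. A hedged sketch of the difference (all names are assumptions); during training, a Jordan network can use the true y from the data as the feedback signal, which is why its time steps decouple and can be processed in parallel:

    import numpy as np

    def elman_step(x_i, h_prev, W_hx, W_hh):
        # Elman: the previous hidden state is fed back.
        return np.tanh(W_hx @ x_i + W_hh @ h_prev)

    def jordan_step(x_i, y_prev, W_hx, W_hy):
        # Jordan: the previous *output* is fed back; during training,
        # y_prev can be taken directly from the data (teacher forcing).
        return np.tanh(W_hx @ x_i + W_hy @ y_prev)

    rng = np.random.default_rng(0)
    W_hx = rng.normal(size=(3, 4))
    W_hh = rng.normal(size=(3, 3))
    W_hy = rng.normal(size=(3, 2))
    h_elman = elman_step(rng.normal(size=4), np.zeros(3), W_hx, W_hh)
    h_jordan = jordan_step(rng.normal(size=4), np.zeros(2), W_hx, W_hy)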
23. Elman RNN
Often referenced as the basic RNN structure and called the “vanilla” RNN
• Must see the complete sequence to be trained
• Cannot be parallelized across time steps
• Has some important training difficulties…
28. RNN Problem
However, conventional RNNs have a few limitations. They are difficult to train and have a very short-term memory, which limits their functionality. To overcome the memory limitation, a newer form of RNN, known as the LSTM or Long Short-Term Memory network, is used. LSTMs extend the memory of RNNs to enable them to perform tasks involving longer-term memory.
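As a hedged illustration of how an LSTM extends the plain RNN, here is one step of a standard LSTM cell (the gate structure follows the common formulation; all variable names, sizes, and the stacked-parameter layout are assumptions):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, c_prev, W, U, b):
        # W, U, b hold stacked parameters for the forget (f), input (i),
        # and output (o) gates and the candidate update (g).
        z = W @ x_t + U @ h_prev + b
        f, i, o, g = np.split(z, 4)
        f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
        c = f * c_prev + i * np.tanh(g)   # long-term cell memory
        h = o * np.tanh(c)                # short-term hidden state
        return h, c

    rng = np.random.default_rng(0)
    input_dim, hidden_dim = 4, 3
    W = rng.normal(size=(4 * hidden_dim, input_dim))
    U = rng.normal(size=(4 * hidden_dim, hidden_dim))
    b = np.zeros(4 * hidden_dim)
    h = c = np.zeros(hidden_dim)
    for x_t in rng.normal(size=(5, input_dim)):
        h, c = lstm_step(x_t, h, c, W, U, b)

The cell state c provides a path along which information can persist over many time steps; this is what gives the LSTM its longer-term memory.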