This document presents a study comparing Long Short-Term Memory (LSTM) architectures for next frame forecasting in satellite image time series. Three models (ConvLSTM, Stack-LSTM and CNN-LSTM) were implemented and evaluated on training loss, training time, and structural similarity between predicted and actual images. The CNN-LSTM architecture was found to provide the best performance, achieving accurate predictions while requiring less processing time than ConvLSTM for higher-resolution images. Overall, the study demonstrates the suitability of deep learning models such as CNN-LSTM for predictive tasks on Earth observation satellite image time series.
1. African Conference on Research in Computer Science and Applied Mathematics
CARI’2020 – Polytech School of Thiès, Senegal, October 2020
Application of LSTM architectures for next frame
forecasting in Sentinel-1 images time series
Waytehad Rose Moskolaï (a,b), Wahabou Abdou (a), Albert Dipanda (a), Kolyang (b)
(a) Computer Science Department, University of Burgundy, 21078 DIJON Cedex, France.
(b) Computer Science Department, University of Maroua, P.O. Box 46 MAROUA, Cameroon.
3. Predictive Analytics
• Definition:
 Part of data mining that allows estimating future trends of events.
 Creates predictions about unknown future events.
 Used in several activity sectors: sales, banking, weather, health, energy, agriculture, Earth observation.
[Diagram: Historical Data + Predictive Algorithms → Model; Model + New Data → Predictions]
Introduction
4. Technologies used
• For models using remote sensing data, classical Machine Learning algorithms are generally used: Random Forest, SVM, Regression, Neural Networks, etc.
• But some limits exist:
 The necessity to first extract features or to linearize the data
 The use of auxiliary data
 Performance is often subject to many physical assumptions
• Recent works use more efficient technologies to achieve better results: Deep Learning (Jason Brownlee, 2018)
[Figure: performance vs. amount of training data; Deep Learning keeps improving with more data, while classical learning plateaus]
5. Deep Learning architectures
• Deep Learning (DL):
 Is a part of Artificial Intelligence and Machine Learning
 Mimics the workings of the human brain
 Allows computers to learn by themselves from examples…
• Several DL architectures are used for prediction in time series:
 RNN: Recurrent Neural Networks, notably the Long Short-Term Memory (LSTM) (Sepp Hochreiter and J. Schmidhuber, 1995)
 CNN: Convolutional Neural Networks, suitable for images
 ConvLSTM: fusion of the CNN and LSTM architectures (SHI Xingjian et al., 2015)
 CNN-LSTM: combination of a CNN architecture followed by an LSTM architecture (CNN + LSTM) (Z. Shen et al., 2019)…
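The difference between the two image-capable architectures can be sketched in Keras. This is a minimal illustrative sketch: the layer counts, filter sizes and the small 16×16 frame size are arbitrary choices for the example, not the exact models evaluated in this study.

```python
# Minimal Keras sketch contrasting ConvLSTM and CNN-LSTM for
# sequence-to-one image prediction. Layer sizes are illustrative
# assumptions, not the models used in this work.
import tensorflow as tf
from tensorflow.keras import layers, models

T, H, W, C = 5, 16, 16, 1  # timestep, height, width, channels (small for demo)

# ConvLSTM: the convolution happens inside the recurrent cell itself.
conv_lstm = models.Sequential([
    layers.Input(shape=(T, H, W, C)),
    layers.ConvLSTM2D(16, (3, 3), padding="same", return_sequences=False),
    layers.Conv2D(C, (3, 3), padding="same", activation="sigmoid"),
])

# CNN-LSTM: a CNN first encodes each frame, then an LSTM models the sequence.
cnn_lstm = models.Sequential([
    layers.Input(shape=(T, H, W, C)),
    layers.TimeDistributed(layers.Conv2D(8, (3, 3), padding="same", activation="relu")),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),
    layers.TimeDistributed(layers.Flatten()),
    layers.LSTM(64),
    layers.Dense(H * W * C, activation="sigmoid"),
    layers.Reshape((H, W, C)),
])
```

Both models map a sequence of T frames to a single predicted frame; in ConvLSTM the recurrence operates on 2D feature maps, while in CNN-LSTM the recurrence operates on flattened frame encodings.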
6. Research question
• In general, determining which algorithms are best suited to a problem is the key to getting the most out of a predictive analytics solution.
• Main research question: which architecture is the most suitable for prediction tasks on satellite image time series?
• Proposed approach: the implementation and comparative study of three architectures widely used for prediction (ConvLSTM, Stack-LSTM and CNN-LSTM), in the context of predicting the next occurrence in a given satellite image time series.
7. Objectives
Let 𝑋_𝑡 be a function from ℝ × ℝ to ℝ, of size (W, H), representing an image at time t. Given a sequence of images 𝑋_(𝑡−𝑛), 𝑋_(𝑡−𝑛+1), …, 𝑋_𝑡, the objectives of this work are:
 The implementation of sequence-to-one models based on the Stack-LSTM, ConvLSTM and CNN-LSTM architectures, for the prediction of the image at time t+1
 The performance evaluation of each model
[Figure: an image time series along the time axis, with the predicted image at time t+1]
Methodology
8. Materials
• Data: 158 Sentinel-1 images (www.earth-explorer.usg.org) of the Wildlife Reserve of Togodo, from September 2016 to May 2019
• Development tools:
 Virtual GPU on Google Colab (https://colab.research.google.com)
 Python (programming language)
 TensorFlow and Keras libraries
 Quantum GIS (for image preprocessing)
9. Data preparation
• Preprocessing: radiometric and geometric corrections, normalization, resizing, clipping, transformation to RGB files…
• Constitution of a training set (about 80%) and a test set (20%)
• Transformation of the training set into the format (samples, timestep, Wx, Hx, features)

 X_train → Y_train
 𝑋1, 𝑋2, 𝑋3, 𝑋4, 𝑋5 → [𝑋6]
 𝑋2, 𝑋3, 𝑋4, 𝑋5, 𝑋6 → [𝑋7]
 𝑋3, 𝑋4, 𝑋5, 𝑋6, 𝑋7 → [𝑋8]
 …
 𝑋𝑡−5, 𝑋𝑡−4, …, 𝑋𝑡−1 → [𝑋𝑡]

 Timestep: number of occurrences in the input sequence
 Samples: number of samples for the training step (t − timestep)
 Features: number of variables to predict (1)
 Wx, Hx: size of the images (64×64 and 128×128)
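The sliding-window construction above can be sketched in a few lines of NumPy (a minimal version; the function and variable names are illustrative, not from the study's code):

```python
# Build (samples, timestep, Wx, Hx, features) tensors from an image
# time series with a sliding window, as in the table above.
import numpy as np

def make_sequences(series: np.ndarray, timestep: int):
    """series: array of shape (t, Wx, Hx). Returns X of shape
    (t - timestep, timestep, Wx, Hx, 1) and Y of shape (t - timestep, Wx, Hx, 1)."""
    t = series.shape[0]
    X = np.stack([series[i:i + timestep] for i in range(t - timestep)])
    Y = series[timestep:]
    # add the trailing "features" axis (one variable to predict)
    return X[..., np.newaxis], Y[..., np.newaxis]

images = np.random.rand(158, 64, 64)   # e.g. 158 frames of 64x64 pixels
X_train, Y_train = make_sequences(images, timestep=5)
print(X_train.shape)  # (153, 5, 64, 64, 1)
print(Y_train.shape)  # (153, 64, 64, 1)
```

With 158 frames and a timestep of 5 this yields 153 samples, each pairing 5 consecutive frames with the frame that follows them.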
11. Training parameters
• Optimization function: Adaptive Moment Estimation (Adam)
• Loss functions: Root Mean Square Error (RMSE) and Mean Absolute Error (MAE)
• Training steps: 100 epochs (the number of times the dataset passes through the neural network)
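The two loss functions compare predicted and real images pixel by pixel; a minimal NumPy sketch (Keras ships equivalents, e.g. the built-in `mae` loss):

```python
# MAE and RMSE between a predicted image and the real one (NumPy sketch).
import numpy as np

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    # Mean Absolute Error: average of |error| over all pixels
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    # Root Mean Square Error: penalizes large pixel errors more than MAE
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

RMSE and MAE agree when all errors are equal, but RMSE grows faster when a few pixels are badly wrong.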
12. Evaluation parameters
• Evolution of the loss during the training step
• Total training time
• Structural SIMilarity (SSIM) values between the predicted image and the real one
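SSIM compares luminance, contrast and structure between two images. A minimal global (single-window) NumPy version is sketched below, assuming pixel values scaled to [0, 1]; evaluations in practice usually use a windowed SSIM such as scikit-image's `structural_similarity` or `tf.image.ssim`.

```python
# Simplified global SSIM between two images in [0, 1] (NumPy sketch).
# Real evaluations typically compute SSIM over local sliding windows.
import numpy as np

def ssim_global(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the SSIM definition
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()            # luminance terms
    vx, vy = x.var(), y.var()              # contrast terms
    cov = ((x - mx) * (y - my)).mean()     # structure term
    return float(((2 * mx * my + c1) * (2 * cov + c2))
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))
```

SSIM is 1.0 when the predicted image is identical to the real one and decreases toward 0 as the structural agreement degrades.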
13. Graphical display
Prediction results based on ConvLSTM, Stack-LSTM and CNN-LSTM (respectively (a), (b), (c)). Timestep = 5, with (64×64) images from the test set.
Results and Discussions
14. Graphical display
Prediction results based on ConvLSTM, Stack-LSTM and CNN-LSTM (respectively (a), (b), (c)). Timestep = 10, with (64×64) images from the test set.
15. Evolution of training loss
Evolution of training loss (MAE) over epochs depending on timestep (varying from 5 to 10). (a) Left: training loss with ConvLSTM; right: training loss with CNN-LSTM. (b) Training loss with Stack-LSTM.
16. Evolution of training loss
Evolution of training loss values over epochs. (a) Left: MAE with (128×128) images; right: MAE with (64×64) images. (b) Left: RMSE with (128×128) images; right: RMSE with (64×64) images.
17. Evaluation criteria
• Due to convolution operations, processing time is significantly higher with the ConvLSTM model than with CNN-LSTM and Stack-LSTM when the resolution of the images increases.
18. Conclusion
• The use of the ConvLSTM architecture for forecasting tasks on Earth observation image time series is not advisable (because of the size of the images and the length of the sequences)
• The use of the CNN-LSTM architecture is recommended
• Predictions with Stack-LSTM models are made pixel by pixel
• In all situations it is necessary to choose good parameters (optimization) to achieve better results
Conclusion and Perspectives
19. Perspectives
• What next?
 Optimize the model based on the CNN-LSTM architecture to improve accuracy
 Use more data
 Test on other areas
 Create a model combining different architectures to achieve better results