In this project work, multi-step deep neural networks are used to forecast power generation and load demand over a short-term time frame. The feature vectors used to predict the target form a sequential time series. A Recurrent Neural Network is combined with a Convolutional Neural Network to obtain a better forecasting model for the Windpark, Solarpark and Loadpark datasets, and the forecasting performance of a Feedforward neural network and a Long Short-Term Memory network is also compared. The work is divided into two parts: in the first approach, the raw dataset is split into train and test sets and no previous time steps are used; in the second approach, the raw dataset is split into train, validation and test sets, and the current plus seven previous time steps are fed into the model.
1. Test Different Neural Network Models for Forecasting of Wind, Solar Power Generation and Energy Usage within C/sells
Presented by:
Tonmoy Ibne Arif
Master's in Electrical and Communication Engineering
3. Motivation
• Power forecasting of renewable energy is an active research field.
• Smart grids require load forecasting.
• Reduce operation cost.
• Better load scheduling.
• Reduce dependency on fossil fuels.
• Test different NN architectures on the datasets.
4. Dataset description
Datasets for the EuropeWindFarm, the German Solar Farm and load data from different nodes.
Wind Dataset
• Day-ahead forecasts for 45 offshore and onshore wind farms.
• Time series of two years of hourly averaged wind power generation.
• Data features: time stamp of measurement, wind speed at different hub heights, air pressure, temperature and power generation.
• The original wind farm identity is masked using normalization.
Fig.1. Typical Wind turbine. [1]
5. Dataset description
Solar Dataset
• The dataset contains data from 21 photovoltaic facilities in Germany.
• The nominal power ranges from 100 kW to 8500 kW.
• The original solar farm identity is masked using normalization.
Load Dataset
• Load data from 89 different nodes.
• The data contains numerical weather prediction (NWP) data with a three-hour resolution.
• Load is taken for 12 months.
Fig. 2. Hybrid grid system. [2]
6. Data pre-processing
• Cyclical continuous features are converted into two component features, sine and cosine (see the sketch after this list).
• NaN and null values are removed.
• The different blocks of the load data in the h5 file are accessed using different keys.
• Frame mismatches between the weather model and the load model have been dropped.
• Wind farm and solar farm data were normalized.
• Redundancy in the load dataset is eliminated using normalization.
• Features with no information are dropped, for example the forecasting time from the wind park dataset.
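A minimal sketch of the sine/cosine encoding of a cyclical feature, assuming pandas/NumPy; the column name and period are illustrative and not taken from the project code:

```python
import numpy as np
import pandas as pd

def encode_cyclical(df, col, period):
    # Replace a cyclical column (e.g. hour of day) by its sine and cosine components,
    # so that values at the end of the cycle stay numerically close to the start.
    df[f"{col}_sin"] = np.sin(2 * np.pi * df[col] / period)
    df[f"{col}_cos"] = np.cos(2 * np.pi * df[col] / period)
    return df.drop(columns=[col])

# Hypothetical example: hour of day extracted from the measurement time stamp
df = pd.DataFrame({"hour": np.arange(24)})
df = encode_cyclical(df, "hour", period=24)
```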
Fig. 3. Data pre-processing representation. [3]
7. Data split method
• In the first approach, the whole dataset is split into train and test sets.
• In the second approach, the whole dataset is split into train, validation and test sets (a sketch follows below).
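A minimal sketch of the second approach, assuming a chronological (unshuffled) split and, as described in the abstract, input windows built from the current plus seven previous time steps; the split fractions and array names are illustrative:

```python
import numpy as np

def chronological_split(data, train_frac=0.7, val_frac=0.15):
    # Split a time-ordered array into train/validation/test without shuffling,
    # so the test period always lies after the training period.
    n = len(data)
    n_train, n_val = int(n * train_frac), int(n * val_frac)
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

def make_windows(features, target, n_steps=8):
    # Build samples from the current and the seven previous time steps (8 steps total).
    X, y = [], []
    for i in range(n_steps - 1, len(features)):
        X.append(features[i - n_steps + 1:i + 1])
        y.append(target[i])
    return np.array(X), np.array(y)  # X has shape (samples, 8, n_features)
```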
8. Feature selection
• Highly correlated features (threshold 0.8) have been dropped using a correlation matrix; wind speed at 10 m hub height was dropped.
• To reduce the training cost, the best features are selected using mutual information regression from scikit-learn's feature selection module (see the sketch after this list).
• This method measures the dependency between the variables.
• A value of zero means the variables are independent.
• A high value means higher dependency.
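A minimal sketch of mutual-information-based selection with scikit-learn; the synthetic stand-in data and the number of kept features are illustrative:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression

# Synthetic stand-in for a feature matrix and a power-generation target
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 17))            # 17 candidate features (illustrative)
y = 2 * X[:, 0] + rng.normal(size=500)    # target depends mainly on the first feature

selector = SelectKBest(score_func=mutual_info_regression, k=7)  # keep the 7 best features
X_best = selector.fit_transform(X, y)
print(selector.scores_)  # ~0: independent of the target; larger: stronger dependency
```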
Fig. 4. Mutual info regression feature selection method.
9. Feature selection
Wrapper method
• SequentialFeatureSelector from MLxtend, a greedy search algorithm (see the sketch after this list).
• Reduces a D-dimensional feature vector to a K-dimensional one (K < D).
• Reduces the generalization error and improves computational efficiency by removing irrelevant features and noise.
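A minimal sketch of greedy forward selection with MLxtend, assuming a simple linear regression as the wrapped estimator; the estimator choice, synthetic data and K are illustrative, not the project's exact settings:

```python
import numpy as np
from mlxtend.feature_selection import SequentialFeatureSelector as SFS
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data: D = 17 candidate features
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 17))
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=300)

# Greedy forward search from D = 17 down to K = 7 features
sfs = SFS(LinearRegression(), k_features=7, forward=True,
          scoring="neg_mean_squared_error", cv=3)
sfs = sfs.fit(X, y)
print(sfs.k_feature_idx_)  # indices of the selected feature subset
```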
Fig. 5. Performance with 17 features. Fig. 6. Performance with 7 features. Fig. 7. Performance with 14 features.
10. Perceptron
y = f( Σ_i w_i x_i + b )
where,
x_i = the input of the neuron,
w_i = the weight of each connection to the neuron,
b = the bias of the neuron,
f(...) is the activation function of the neuron (see the numerical sketch below).
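A minimal numerical sketch of this weighted sum followed by an activation; the values and the choice of tanh as f are purely illustrative:

```python
import numpy as np

def perceptron_output(x, w, b, f=np.tanh):
    # y = f(sum_i w_i * x_i + b): weighted sum of the inputs plus bias,
    # passed through the activation function f.
    return f(np.dot(w, x) + b)

x = np.array([0.5, -1.2, 3.0])   # inputs x_i
w = np.array([0.4, 0.1, -0.7])   # connection weights w_i
print(perceptron_output(x, w, b=0.2))
```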
Fig.8. Perceptron Learning Algorithm[8]
11. Feedforward neural network
• Initial weights are randomly initialized.
• The first observation of the dataset is fed into the input layer.
• Forward propagation from left to right.
• The error of the prediction is measured.
• Backpropagation from right to left.
• Weights are updated after each batch of 100 observations.
• The training process finishes after 30 epochs (see the sketch below).
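A minimal Keras sketch of such a feedforward network, assuming the Adam/ReLU setting chosen on the next slide, a batch size of 100 and 30 epochs; the layer sizes, input dimension and synthetic data are illustrative:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in for (features, target); shapes are illustrative
X = np.random.rand(1000, 7).astype("float32")
y = np.random.rand(1000, 1).astype("float32")

model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(7,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1),                                   # single forecast value
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mse")

# Forward pass, error measurement and backpropagation happen inside fit();
# weights are updated after every batch of 100 observations, for 30 epochs.
model.fit(X, y, batch_size=100, epochs=30, verbose=0)
```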
Fig. 9. A simple Feedforward neural network[4]
12. ANN hyperparameter tuning
For the ANN, the Adam optimizer with learning rate 0.001 and the ReLU activation function is chosen.
Fig. 10. Optimizer Adam, lr 0.001, activation ReLU. Fig. 11. Optimizer SGD, lr 0.001, activation ReLU. Fig. 12. Optimizer RMSprop, lr 0.001, activation ReLU.
13. Recurrent Neural Network (RNN)
• Stacked LSTM: multiple LSTM layers; here, 4 hidden layers.
• Return sequences = true: one output per input time step rather than one output time step for all input time steps, which also yields a 3D array.
• Each input requires 3-dimensional data.
• Each layer provides a sequence output.
• The final layer outputs a single value as a 2D array (see the sketch after this list).
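A minimal Keras sketch of such a stacked LSTM with four hidden layers and return_sequences=True on all but the last layer, assuming the RMSprop optimizer chosen on slide 16; the unit counts and input shape are illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers

n_steps, n_features = 8, 7   # illustrative: current + 7 previous time steps, 7 features

model = keras.Sequential([
    # Each of the first three layers returns the full sequence (a 3D array),
    # i.e. one output per input time step, so the next LSTM layer can consume it.
    layers.LSTM(50, return_sequences=True, input_shape=(n_steps, n_features)),
    layers.LSTM(50, return_sequences=True),
    layers.LSTM(50, return_sequences=True),
    layers.LSTM(50),             # last LSTM layer returns a 2D array (batch, units)
    layers.Dense(1),             # single forecast value
])
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001), loss="mse")
```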
Fig. 13. Implemented LSTM architecture.
14. Long Short Term Memory(LSTM)
Fig. 14. Typical LSTM architecture.[5] Fig. 15. The repeating module in an LSTM .[5]
15. Long Short Term Memory (LSTM)
• S is the weighted sum of the input from the previous layer, activated with the sigmoid activation function.
• T is the weighted sum of the input from the previous layer, activated with the tanh activation function.
• t: time step.
• X: input.
• h: hidden state, which acts as a memory.
• Length of X: size/dimension of the input.
• Length of h: number of hidden states.
• C: cell state, which acts as a highway for the sequence chain.
• In Keras this is set via state_size / units.
Fig. 16. Animated LSTM architecture.[6]
16. LSTM hyperparameter tuning
Fig. 17. Optimizer Adam, lr 0.001, activation ReLU. Fig. 18. Optimizer SGD, lr 0.001, activation ReLU. Fig. 19. Optimizer RMSprop, lr 0.001, activation ReLU.
For the RNN LSTM, the RMSprop optimizer with learning rate 0.001 and the ReLU activation function is chosen.
17. Recurrent Convolutional Neural Network (RCNN)
• CNN LSTM = CNN + LSTM (see the sketch after this list).
• CNN: efficiently extracts and learns features from sequential time series data.
• Each input requires 3-dimensional data.
• CNN: interprets subsequences of the input.
• Conv1D: extracts features from short fixed-length subsequences.
• Automatically learns the salient features.
• MaxPooling1D with stride size 2: simplifies the feature maps by keeping half of the values, those with the largest signal.
• Flatten: converts a multi-dimensional vector into a single-dimensional vector; the distilled feature maps from the max-pooling layer are flattened into one long vector.
• LSTM: uses the previous layer's output for the decoding process.
• Dense: this layer produces the output prediction.
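A minimal Keras sketch of a CNN-LSTM wired in one common way, with the CNN part applied to subsequences of each input window via TimeDistributed; the subsequence split, filter counts and unit counts are illustrative, not the project's exact configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative shapes: an 8-step window split into 2 subsequences of 4 steps, 7 features each
n_subseq, n_steps, n_features = 2, 4, 7

model = keras.Sequential([
    # Conv1D extracts features from short fixed-length subsequences
    layers.TimeDistributed(layers.Conv1D(64, kernel_size=2, activation="relu"),
                           input_shape=(n_subseq, n_steps, n_features)),
    # MaxPooling1D (pool size 2) keeps the largest value in each window, halving the map
    layers.TimeDistributed(layers.MaxPooling1D(pool_size=2)),
    # Flatten turns the distilled feature maps into one long vector per subsequence
    layers.TimeDistributed(layers.Flatten()),
    # LSTM decodes the sequence of feature vectors produced by the CNN part
    layers.LSTM(50),
    layers.Dense(1),                       # output prediction
])
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001), loss="mse")
```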
Fig. 20. Implemented CNN-LSTM architecture.
18. Recurrent Convolutional Neural Network(RCNN)
Fig. 21. A pooling layer reducing a feature map by taking the largest value. [7]
19. CNN LSTM hyperparameter tuning
For the CNN LSTM, the RMSprop optimizer with learning rate 0.001 and the ReLU activation function is chosen.
Fig. 22. Optimizer Adam, lr 0.001, activation ReLU. Fig. 23. Optimizer SGD, lr 0.001, activation ReLU. Fig. 24. Optimizer RMSprop, lr 0.001, activation ReLU.
20. Results: ANN training on single dataset
Fig. 25. ANN applied on single Solarpark dataset. Fig. 26. ANN applied on single Windpark dataset. Fig. 27. ANN applied on single Loadpark dataset.
21. Results: LSTM forecasting accuracy on single dataset
Fig. 28. RNN LSTM with 1 hour ahead forecast resolution for single Windpark dataset. Fig. 29. RNN LSTM with 1 hour ahead forecast resolution for single Windpark dataset.
22. Results: LSTM training on single dataset
Fig.30. LSTM applied on single Solarpark dataset. Fig.31. LSTM applied on single Windpark dataset. Fig.32. LSTM applied on single Loadpark dataset.
23. Results: CNN-LSTM training on single dataset.
Fig.33. CNN-LSTM applied on single Solarpark dataset. Fig.34. CNN-LSTM applied on single Windpark dataset. Fig.35. CNN-LSTM applied on single Loadpark dataset.
24. Results: NN models forecasting accuracy comparison.
Fig.36. NN models performance on whole Solarpark dataset. Fig.37. Boxplot measurement of NN models on whole Solarpark dataset.
25. Results: NN models forecasting accuracy comparison.
Fig.38. NN models performance on whole Windpark dataset. Fig.39. Boxplot measurement of NN models on whole Windpark dataset.
26. Results: NN models forecasting accuracy comparison.
Fig. 40. NN models performance on whole Loadpark dataset. Fig. 41. Boxplot measurement of NN models on whole Loadpark dataset.
27. Conclusion and Outlook
• The sole purpose of this experiment is to compare different NN architectures for short-term forecasting.
• Data post-processing methods were applied to the resulting data for analysis.
• As the dataset is huge, it took a considerable amount of time to implement and train the different NN models.
Future improvements for this project:
• Implementation of a faster learning algorithm: Auto-LSTM.
• A more robust feature selection algorithm: autoencoder.
• Test more possible neural networks and compare their overall performance.