Synthetic Gradients were introduced in 2016 by Max Jaderberg and other researchers at DeepMind. Instead of waiting for the true gradients to be backpropagated through the whole network, each layer receives its gradients from a small auxiliary model that predicts them. This decouples the layers: it can make deep neural networks much faster to train, and sometimes even perform better. Moreover, synthetic gradients can allow Recurrent Neural Networks to learn long-term patterns in the data.
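To make the idea concrete, here is a minimal PyTorch sketch (all names and shapes are illustrative, not from the paper's code): a small linear model predicts the gradient of the loss with respect to a layer's activations, so the layer can be updated immediately, without waiting for the rest of the network.

```python
import torch
import torch.nn as nn

# A hidden layer and its synthetic gradient model M_i (a plain linear model).
layer = nn.Sequential(nn.Linear(128, 128), nn.ReLU())
sg_model = nn.Linear(128, 128)       # predicts dLoss/dh_i from h_i
nn.init.zeros_(sg_model.weight)      # zero-init: the initial synthetic
nn.init.zeros_(sg_model.bias)        # gradients are zero, hence harmless

x = torch.randn(32, 128)
h = layer(x)                         # forward pass through the layer

# Update the layer right away, using the *predicted* gradient:
synthetic_grad = sg_model(h).detach()
h.backward(gradient=synthetic_grad)  # fills the layer's .grad fields only
```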
Papers:
- Decoupled Neural Interfaces using Synthetic Gradients, Max Jaderberg et al., 2016: https://arxiv.org/abs/1608.05343
- Understanding Synthetic Gradients and Decoupled Neural Interfaces, Wojciech Marian Czarnecki et al., 2017: https://arxiv.org/abs/1703.00522
Blog posts:
- By Max Jaderberg, DeepMind: https://deepmind.com/blog/decoupled-neural-networks-using-synthetic-gradients/
- By iamtrask: https://iamtrask.github.io/2017/03/21/synthetic-gradients/
Implementations:
- DNI-TensorFlow by Andrew Liao: https://github.com/vyraun/DNI-tensorflow
- Jupyter Notebook with TensorFlow by Nitarshan Rajkumar: https://github.com/nitarshan/decoupled-neural-interfaces/
- DNI.PyTorch by Andrew Liao: https://github.com/andrewliao11/dni.pytorch/
The painting on the first slide is by Annie Clavel, a great French artist currently living in Los Angeles. Visit her website: http://www.annieclavel.com/.
Slide 35 (Aurélien Géron, 2017) – Training a Synthetic Gradient Model
[Figure: a hidden layer with parameters θᵢ outputs activations hᵢ; the synthetic gradient model Mᵢ takes hᵢ as input and predicts a gradient δ̂ᵢ; the synthetic gradient model loss is ‖ δ̂ᵢ − δᵢ ‖².]
Slide 36 – Training a Synthetic Gradient Model
[Figure: the same diagram, now with a question mark: to compute the model loss ‖ δ̂ᵢ − δᵢ ‖², Mᵢ needs a target gradient δᵢ. Where does it come from?]
Slide 37 – Training a Synthetic Gradient Model
[Figure: two hidden layers hᵢ and hᵢ₊₁ (parameters θᵢ and θᵢ₊₁), each with its own synthetic gradient model Mᵢ and Mᵢ₊₁. The next model's prediction δ̂ᵢ₊₁ is backpropagated through layer i+1 to produce the target δᵢ for Mᵢ's loss ‖ δ̂ᵢ − δᵢ ‖². A question mark remains: δ̂ᵢ₊₁ is itself only a prediction.]
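In symbols (a reconstruction from the figure, using the slides' notation): the next model's prediction is backpropagated one step to produce the target, and Mᵢ is trained on the squared error:

\[
\hat{\delta}_{i+1} = M_{i+1}(h_{i+1}), \qquad
\delta_i = \hat{\delta}_{i+1}\,\frac{\partial h_{i+1}}{\partial h_i}, \qquad
\mathcal{L}_{M_i} = \left\| \hat{\delta}_i - \delta_i \right\|^2
\]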
Slide 38 – Training a Synthetic Gradient Model
[Figure: the same diagram, question mark removed: every synthetic gradient model is trained this way, each one using the (approximate) gradient backpropagated from the next layer's model.]
Slide 39 – Training a Synthetic Gradient Model
[Figure: the last hidden layer hᵢ feeds the Output Layer (parameters θᵢ₊₁) and the true Loss; there the gradient δᵢ₊₁ is exact, so backpropagating it through the output layer yields an exact target δᵢ for Mᵢ's loss ‖ δ̂ᵢ − δᵢ ‖².]
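Putting slides 35–39 together, here is a hedged PyTorch sketch of one decoupled training step for a hidden layer followed by an output layer (all names, shapes, and hyperparameters are illustrative): the hidden layer updates immediately from its predicted gradient, the output layer provides the exact gradient δᵢ, and Mᵢ is trained to minimize ‖ δ̂ᵢ − δᵢ ‖².

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Two decoupled modules, plus the synthetic gradient model for the first one.
layer1 = nn.Sequential(nn.Linear(128, 64), nn.ReLU())  # hidden layer (θi)
layer2 = nn.Linear(64, 10)                             # output layer (θi+1)
sg1 = nn.Linear(64, 64)                                # Mi: predicts dLoss/dh1
nn.init.zeros_(sg1.weight)
nn.init.zeros_(sg1.bias)

opt1 = torch.optim.SGD(layer1.parameters(), lr=0.01)
opt2 = torch.optim.SGD(layer2.parameters(), lr=0.01)
opt_sg = torch.optim.SGD(sg1.parameters(), lr=0.01)

x = torch.randn(32, 128)
y = torch.randint(0, 10, (32,))

# --- Hidden layer: update immediately, using the predicted gradient ---
h1 = layer1(x)
delta_hat = sg1(h1.detach())              # keeps its graph for Mi's update
opt1.zero_grad()
h1.backward(gradient=delta_hat.detach())  # backprop the prediction into layer1
opt1.step()

# --- Output layer: the true loss provides an exact gradient (slide 39) ---
h1_in = h1.detach().requires_grad_()      # decoupled copy of h1
loss = F.cross_entropy(layer2(h1_in), y)
opt2.zero_grad()
loss.backward()
opt2.step()
delta_true = h1_in.grad                   # exact gradient of the loss wrt h1

# --- Train Mi on the squared error between prediction and target ---
opt_sg.zero_grad()
F.mse_loss(delta_hat, delta_true).backward()
opt_sg.step()
```

In a deeper network, `delta_true` would instead come from backpropagating the next layer's *synthetic* gradient one step (slides 37–38); only the output layer grounds the chain with an exact gradient.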