14. Discussion
1. BN has been so influential that many state-of-the-art systems and their
hyper-parameters have been designed for it, which may not be optimal for
GN-based models.
2. GN is related to LN and IN, two normalization methods that are
particularly successful in training recurrent (RNN/LSTM) or generative (GAN)
models. This suggests studying GN in those areas in future work.
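
The relation between GN, LN, and IN mentioned in item 2 can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes inputs in (N, C, H, W) layout and omits the learnable per-channel scale and shift. The function names and the eps value are illustrative choices. Under these assumptions, GN with a single group should match LN, and GN with one group per channel should match IN.

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """Normalize x of shape (N, C, H, W) over each group of channels.

    Illustrative sketch only: no learnable scale/shift is applied.
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0, "C must be divisible by num_groups"
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    # Statistics are computed per sample and per group.
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(n, c, h, w)

def layer_norm(x, eps=1e-5):
    # Statistics over all of (C, H, W) for each sample.
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    # Statistics over (H, W) for each sample and each channel.
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 6, 4, 4))

# One group covering all channels behaves like LN.
ln_matches = np.allclose(group_norm(x, num_groups=1), layer_norm(x))
# One group per channel behaves like IN.
in_matches = np.allclose(group_norm(x, num_groups=6), instance_norm(x))
print(ln_matches, in_matches)
```

Intermediate group counts (for example 2 or 3 with 6 channels) fall between these two extremes, which is why results obtained with LN and IN are a natural motivation for trying GN in the same settings.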