2. Introduction
This presentation explains prediction uncertainty in deep learning, based on the following paper:
A Survey of Uncertainty in Deep Neural Networks (https://arxiv.org/abs/2107.03342)
I hope it will be useful for researchers and developers working with deep learning.
Target audience: anyone who has wondered where prediction uncertainty in deep learning is actually useful,
and how that uncertainty is quantified.
Prerequisites: basic probability and statistics, textbook-level machine (deep) learning, and familiarity with recent ML topics.
105. Calibration Evaluation Metrics
• Average bin confidence
• Average bin accuracy
• Expected Calibration Error (ECE)
• Static Calibration Error (SCE)
• Adaptive Expected Calibration Error (aECE)
106. Relationship between Confidence and Accuracy
Compute the model's confidence for every data sample and sort the samples by it.
Then split them into M equal-sized bins and compute the following for each bin:
Average bin confidence: the mean confidence within each bin, where each sample's confidence is
$\hat{p} = \max_{k} p(y = k \mid x)$
Average bin accuracy: the mean accuracy (fraction of correct predictions) within each bin
A model is well-calibrated when the average bin confidence matches the average bin accuracy in every bin [37].
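As a rough illustration of how these quantities fit together, below is a minimal NumPy sketch (not taken from the survey; the function and variable names are my own) that bins predictions by confidence and computes the average bin confidence, the average bin accuracy, and the resulting ECE. Setting adaptive=True instead builds equal-count bins over the sorted confidences, which corresponds to the adaptive variant (aECE) listed above.

import numpy as np

def calibration_error(probs, labels, n_bins=10, adaptive=False):
    # probs: (N, K) predicted class probabilities; labels: (N,) true class indices
    confidences = probs.max(axis=1)                 # per-sample confidence p_hat = max_k p(y=k|x)
    correct = (probs.argmax(axis=1) == labels).astype(float)

    if adaptive:
        # aECE: equal-count bins over the sorted confidences
        edges = np.quantile(confidences, np.linspace(0.0, 1.0, n_bins + 1))
    else:
        # ECE: equal-width bins over [0, 1], as in Guo et al. [37]
        edges = np.linspace(0.0, 1.0, n_bins + 1)

    n, error = len(labels), 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if lo == edges[0]:
            in_bin |= (confidences == lo)           # keep samples sitting on the left edge
        if in_bin.any():
            avg_conf = confidences[in_bin].mean()   # average bin confidence
            avg_acc = correct[in_bin].mean()        # average bin accuracy
            error += in_bin.sum() / n * np.abs(avg_acc - avg_conf)
    return error

A well-calibrated model drives |avg_acc − avg_conf| toward zero in every bin, so both ECE and aECE approach zero.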
132. References
[1] J. C. Reinhold, Y. He, S. Han, Y. Chen, D. Gao, J. Lee, J. L. Prince, and A. Carass, “Validating uncertainty in medical image translation,” in 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). IEEE, 2020, pp. 95–98.
[2] T. Nair, D. Precup, D. L. Arnold, and T. Arbel, “Exploring uncertainty measures in deep networks for multiple sclerosis lesion detection and segmentation,” Medical Image Analysis, vol. 59, p. 101557, 2020.
[3] A. Kendall, V. Badrinarayanan, and R. Cipolla, “Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding,” arXiv preprint arXiv:1511.02680, 2015.
[4] A. Sedlmeier et al., “Uncertainty-based out-of-distribution classification in deep reinforcement learning,” arXiv preprint arXiv:2001.00496, 2019.
[5] M. Rußwurm et al., “Model and data uncertainty for satellite time series forecasting with deep recurrent models,” in IGARSS 2020 – 2020 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2020.
[6] J. Gawlikowski, S. Saha, A. Kruspe, and X. X. Zhu, “Out-of-distribution detection in satellite image classification,” in RobustML Workshop at ICLR 2021. ICLR, 2021, pp. 1–5.
[7] J. Zeng, A. Lesnikowski, and J. M. Alvarez, “The relevance of Bayesian layer positioning to model uncertainty in deep Bayesian active learning,” arXiv preprint arXiv:1811.12535, 2018.
[8] L. Baier et al., “Detecting concept drift with neural network model uncertainty,” arXiv preprint arXiv:2107.01873, 2021.
133. References
[9] M. Abdar et al., “A review of uncertainty quantification in deep learning: Techniques, applications and challenges,” Information Fusion, 2021.
[10] A. Malinin and M. Gales, “Predictive uncertainty estimation via prior networks,” in Advances in Neural Information Processing Systems, 2018, pp. 7047–7058.
[11] T. Pearce, F. Leibfried, and A. Brintrup, “Uncertainty in neural networks: Approximately Bayesian ensembling,” in International Conference on Artificial Intelligence and Statistics. PMLR, 2020.
[12] A. Amini, W. Schwarting, A. Soleimany, and D. Rus, “Deep evidential regression,” arXiv preprint arXiv:1910.02600, 2019.
[13] A. Ashukha, A. Lyzhov, D. Molchanov, and D. Vetrov, “Pitfalls of in-domain uncertainty estimation and ensembling in deep learning,” in International Conference on Learning Representations, 2020.
[14] E. Hüllermeier and W. Waegeman, “Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods,” Machine Learning, vol. 110, no. 3, pp. 457–506, 2021.
[15] Y. Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. Dillon, B. Lakshminarayanan, and J. Snoek, “Can you trust your model’s uncertainty? Evaluating predictive uncertainty under dataset shift,” in Advances in Neural Information Processing Systems, 2019, pp. 13991–14002.
[16] D. Hendrycks, M. Mazeika, and T. Dietterich, “Deep anomaly detection with outlier exposure,” in International Conference on Learning Representations, 2019.
134. References
[17] A. Malinin and M. Gales, “Predictive uncertainty estimation via prior networks,” in Advances in Neural Information Processing Systems, 2018, pp. 7047–7058.
[18] M. Sensoy, L. Kaplan, and M. Kandemir, “Evidential deep learning to quantify classification uncertainty,” in Advances in Neural Information Processing Systems, 2018, pp. 3179–3189.
[19] M. Raghu, K. Blumer, R. Sayres, Z. Obermeyer, B. Kleinberg, S. Mullainathan, and J. Kleinberg, “Direct uncertainty prediction for medical second opinions,” in International Conference on Machine Learning. PMLR, 2019, pp. 5281–5290.
[20] T. Ramalho and M. Miranda, “Density estimation in representation space to predict model uncertainty,” in Engineering Dependable and Secure Machine Learning Systems: Third International Workshop, EDSMLS 2020, New York City, NY, USA, February 7, 2020, Revised Selected Papers, vol. 1272. Springer Nature, 2020, p. 84.
[21] S. Liang, Y. Li, and R. Srikant, “Enhancing the reliability of out-of-distribution image detection in neural networks,” in 6th International Conference on Learning Representations, 2018.
[22] Y.-C. Hsu, Y. Shen, H. Jin, and Z. Kira, “Generalized ODIN: Detecting out-of-distribution image without learning from out-of-distribution data,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10951–10960.
[23] L. V. Jospin et al., “Hands-on Bayesian neural networks – a tutorial for deep learning users,” arXiv preprint arXiv:2007.06823, 2020.
[24] B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and scalable predictive uncertainty estimation using deep ensembles,” in Advances in Neural Information Processing Systems, 2017, pp. 6402–6413.
135. References
[25] A. Vyas, N. Jammalamadaka, X. Zhu, D. Das, B. Kaul, and T. L. Willke, “Out-of-distribution detection using an ensemble of self supervised leave-out classifiers,” in Proceedings of the European Conference on Computer Vision, 2018, pp. 550–564.
[26] H. Guo, H. Liu, R. Li, C. Wu, Y. Guo, and M. Xu, “Margin & diversity based ordering ensemble pruning,” Neurocomputing, vol. 275, pp. 237–246, 2018.
[27] A. Malinin, B. Mlodozeniec, and M. Gales, “Ensemble distribution distillation,” in 8th International Conference on Learning Representations, 2020.
[28] J. Lindqvist, A. Olmin, F. Lindsten, and L. Svensson, “A general framework for ensemble distribution distillation,” in 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2020, pp. 1–6.
[29] M. Valdenegro-Toro, “Deep sub-ensembles for fast uncertainty estimation in image classification,” in Bayesian Deep Learning Workshop at Neural Information Processing Systems 2019, 2019.
[30] Y. Wen, D. Tran, and J. Ba, “BatchEnsemble: An alternative approach to efficient ensemble and lifelong learning,” in 8th International Conference on Learning Representations, 2020.
[31] D. Shanmugam et al., “When and why test-time augmentation works,” arXiv preprint arXiv:2011.11156, 2020.
[32] I. Kim, Y. Kim, and S. Kim, “Learning loss for test-time augmentation,” arXiv preprint arXiv:2010.11422, 2020.
136. References
[33] D. Molchanov, A. Lyzhov, Y. Molchanova, A. Ashukha, and D. Vetrov, “Greedy policy search: A simple baseline for learnable test-time augmentation,” arXiv preprint arXiv:2002.09103, vol. 2, no. 7, 2020.
[34] T. Pearce, A. Brintrup, M. Zaki, and A. Neely, “High-quality prediction intervals for deep learning: A distribution-free, ensembled approach,” in International Conference on Machine Learning. PMLR, 2018.
[35] D. Su, Y. Y. Ting, and J. Ansel, “Tight prediction intervals using expanded interval minimization,” arXiv preprint arXiv:1806.11222, 2018.
[36] A. G. Roy, S. Conjeti, N. Navab, C. Wachinger, A. D. N. Initiative et al., “Bayesian QuickNAT: Model uncertainty in deep whole-brain segmentation for structure-wise quality control,” NeuroImage, vol. 195, pp. 11–22, 2019.
[37] C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger, “On calibration of modern neural networks,” in International Conference on Machine Learning. PMLR, 2017, pp. 1321–1330.
[38] H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, “mixup: Beyond empirical risk minimization,” in International Conference on Learning Representations, 2018.
[39] S. Thulasidasan, G. Chennupati, J. A. Bilmes, T. Bhattacharya, and S. Michalak, “On mixup training: Improved calibration and predictive uncertainty for deep neural networks,” in Advances in Neural Information Processing Systems, 2019, pp. 13888–13899.
[40] K. Patel, W. Beluch, D. Zhang, M. Pfeiffer, and B. Yang, “On-manifold adversarial data augmentation improves uncertainty calibration,” in 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021, pp. 8029–8036.