Open Science and Open Data
ImageNet: http://www.image-net.org/
More than 14 million images, with over 20,000 object names (class names)
14,197,122 images, 21841 synsets indexed
Each image is labeled with the name (class name) of the object it shows
http://starpentagon.net/analytics/imagenet_ilsvrc2012_dataset/
Berkeley DeepDrive BDD100k: http://bdd-data.berkeley.edu/
Currently the largest dataset for self-driving AI. Contains over 100,000
videos of over 1,100-hour driving experiences across different times of
the day and weather conditions. The annotated images come from New
York and San Francisco areas.
Publishing trained models and improving them through collective intelligence
https://modelzoo.co/
Advances in machine learning for natural language processing
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina
Toutanova, BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding,
arXiv:1810.04805, 2019.
General-purpose pre-trained models
Tom B. Brown et al., Language Models are
Few-Shot Learners, arXiv:2005.14165, 2020
Generative Pre-trained Transformer 3 (GPT-3)
Development difficulties and surveys of best practices
Bernardi, L., Mavridis, T., & Estevez, P. (2019). 150 successful machine learning
models: 6 lessons learned at Booking.com. In Proceedings of the ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining (pp. 1743–1751).
Summarizes best practices at Booking.com
Vogelsang, A., & Borg, M. (2019). Requirements engineering for machine learning:
Perspectives from data scientists. In Proceedings - 2019 IEEE 27th International
Requirements Engineering Conference Workshops, REW 2019 (pp. 245–251). IEEE.
Summarizes requirements engineering from the perspective of data scientists
Wan, Z., Xia, X., Lo, D., & Murphy, G. C. (2019). How does Machine Learning Change
Software Development Practices? IEEE Transactions on Software Engineering, 1–14.
Summarizes challenges from a software engineering perspective
Amershi, S., Begel, A., Bird, C., Deline, R., Gall, H., Kamar, E., … Zimmermann, T.
(2019). Software Engineering for Machine Learning: A Case Study. 41st ACM/IEEE
International Conference on Software Engineering (ICSE 2019).
Summarizes practices from machine learning projects at Microsoft
DNN vulnerabilities: adversarial examples
Eykholt, K., Evtimov, I., Fernandes, E., Li, B., Rahmati, A., Xiao, C., Song, D.
Robust Physical-World Attacks on Deep Learning Models, CVPR 2018
Carlini, N., & Wagner, D. (2017). Towards Evaluating the Robustness of Neural
Networks. Proceedings - IEEE Symposium on Security and Privacy, 39–57.
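To make the idea of an adversarial example concrete, the sketch below perturbs an input by a small bounded amount so that a classifier's decision flips. It uses the fast gradient sign method on a tiny logistic-regression model, which is a simpler technique than the attacks in the two papers cited above; the weights, input, and step size `eps` are hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(g):
    """Return -1, 0, or 1 according to the sign of g."""
    return (g > 0) - (g < 0)

def fgsm(w, b, x, y, eps):
    """One fast-gradient-sign step that increases the loss for true label y."""
    p = predict(w, b, x)
    # For logistic regression, the gradient of the cross-entropy loss
    # with respect to the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    # Move each feature by at most eps in the direction that raises the loss.
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input (not taken from the cited papers).
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.3, 0.1, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.2)
print(predict(w, b, x) > 0.5)      # original input is classified as class 1
print(predict(w, b, x_adv) > 0.5)  # perturbed input is classified as class 0
```

Each feature moves by at most `eps`, yet the predicted class changes. Attacks on deep networks follow the same pattern but obtain the gradient by backpropagation.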
Clarifying the scope of guarantees
Rahimi, M., & Chechik, M. (2019). Toward Requirements Specification for Machine-Learned Components. In
27th International Requirements Engineering Conference (pp. 241–244).
Various ways to inspect trained models
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why Should I Trust You?": Explaining the
Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD '16) (pp. 1135–1144).
Extracts the inputs that contribute to an output
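A minimal sketch of attributing an output to individual inputs: each feature is replaced in turn by a baseline value, and the resulting change in the model output is taken as that feature's contribution. This is plain occlusion-style attribution, a much simpler stand-in for the local surrogate models of Ribeiro et al.; the `model` function and the baseline of 0.0 are assumptions made for illustration.

```python
def occlusion_attribution(f, x, baseline=0.0):
    """Score each feature by how much f(x) drops when it is set to baseline."""
    base_out = f(x)
    scores = []
    for i in range(len(x)):
        x_masked = list(x)
        x_masked[i] = baseline          # hide feature i
        scores.append(base_out - f(x_masked))
    return scores

def model(x):
    # Hypothetical black-box model; only its outputs are used.
    return 3.0 * x[0] - 1.0 * x[1] + 0.0 * x[2]

scores = occlusion_attribution(model, [1.0, 2.0, 5.0])
print(scores)  # [3.0, -2.0, 0.0]
```

The third feature has the largest value but a score of zero, because the model ignores it; the scores reflect contribution to the output, not input magnitude.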
Extracts the training data that contributes to an output
Pang Wei Koh, Percy Liang, Understanding Black-box Predictions via Influence Functions,
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1885-1894, 2017.
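The question of which training points matter for a prediction can be illustrated by brute force: remove one training point, refit, and see how the prediction changes. Influence functions approximate this kind of effect without retraining; the sketch below does the retraining explicitly on a one-parameter least-squares model. The data and the model form are hypothetical and chosen only to keep the example small.

```python
def fit_slope(points):
    """Least-squares slope a for the model y = a * x (no intercept)."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / sxx

def leave_one_out_influence(train, x_test):
    """Change in the prediction at x_test when each training point is removed."""
    full_pred = fit_slope(train) * x_test
    changes = []
    for i in range(len(train)):
        rest = train[:i] + train[i + 1:]   # training set without point i
        changes.append(fit_slope(rest) * x_test - full_pred)
    return changes

# Hypothetical training set; the third point is an outlier.
train = [(1.0, 1.0), (2.0, 2.0), (3.0, 9.0)]
influence = leave_one_out_influence(train, x_test=1.0)

# Index of the training point whose removal changes the prediction most.
most_influential = max(range(len(train)), key=lambda i: abs(influence[i]))
print(most_influential)  # 2
```

Retraining once per training point is only feasible for tiny models, which is the motivation for approximations such as influence functions.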
An active research area under the name XAI (Explainable AI)
Converting to or extracting an analyzable model
Model extraction with weighted finite automata (WFA)
Takamasa Okudono, Masaki Waga, Taro Sekiyama, Ichiro Hasuo:
Weighted Automata Extraction from Recurrent Neural Networks
via Regression, AAAI 2020
Satoshi Hara, Kohei Hayashi, Making Tree Ensembles
Interpretable: A Bayesian Model Selection Approach,
Proceedings of the Twenty-First International
Conference on Artificial Intelligence and Statistics, PMLR
84:77-85, 2018.
Gives a rough understanding of the model's case distinctions
Root-cause investigation: classifying causes and faults
Nargiz Humbatova, Gunel Jahangirova, Gabriele Bavota, Vincenzo Riccio, Andrea
Stocco, Paolo Tonella, Taxonomy of Real Faults in Deep Learning Systems, ICSE
2020
Md Johirul Islam, Rangeet Pan, Giang Nguyen, Hridesh Rajan, Repairing Deep
Neural Networks: Fix Patterns and Challenges, ICSE 2020
Root-cause investigation: analyzing what the model tends to get wrong by considering semantics
Cynthia C. S. Liem and Annibale Panichella,
Oracle Issues in Machine Learning and Where to
Find Them, 8th International Workshop on
Realizing Artificial Intelligence Synergies in
Software Engineering, 2020