This presentation is intended as a high-level introduction to deep learning and its applications in materials science. The intended audience is materials scientists and engineers.
Disclaimers: the second half of this presentation is intended as a broad overview of deep learning applications in materials science; due to time limitations it is not intended to be comprehensive. As a review of the field, this necessarily includes work that is not my own. If my own name is not included explicitly in the reference at the bottom of a slide, I was not involved in that work.
Any mention of commercial products in this presentation is for information only; it does not imply recommendation or endorsement by NIST.
TMS workshop on machine learning in materials science: Intro to deep learning for materials
1. A brief introduction to deep learning in materials science
Brian DeCost
brian.decost@nist.gov
September 2018
TMS Machine Learning Workshop — Pittsburgh, PA
2.
3. Learning objectives
What is deep learning?
- Intuitively answer: 'what is a deep learning model?'
- Be aware of important parameters and pitfalls
- How can we use deep learning on small datasets?
Deep learning in materials science
- Identify opportunities in materials science
- What innovation is still required?
https://cs231n.github.io/ -- great introductory course
https://fast.ai -- state-of-the-art (2018) practical deep learning
http://deeplearningbook.org -- Theoretical/mathematical perspective
4. What is a deep learning model?
The current (3rd) neural net wave is enabled by data scale and computational capability
Deep learning models build complex functions out of simple pieces
Success attributed to learnable representations for complex data
Grounding visual explanations Hendricks, Hu, Darrell, and Akata (2017) arXiv:1711.06465
Materials scientists have experimented with neural nets
since at least the second neural net wave in the 90s
Bhadeshia, MacKay, and Svensson. "Impact toughness of C–Mn steel arc welds–Bayesian neural network analysis."
Materials Science and Technology 11.10 (1995): 1046-1051.
5. Bayesian neural networks in MSE, in the 90s
Bhadeshia, H. K. D. H. "Neural networks in materials science." ISIJ International 39.10 (1999): 966-979.
[Figures: austenite transformation temperature; toughness vs. O2]
Pioneering work by Bhadeshia, MacKay, and Svensson
6. Deep learning: function composition
The simplest neural net classifier is a zero-layer model
f(x) = σ(Wx + b)
a.k.a. logistic regression, where σ is the sigmoid
σ(z) = 1/(1 + e^(-z))
Add one layer: non-linear neural net
f(x) = σ(W2 σ(W1 x))
f(x) = σ(W4 σ(W3 σ(W2 σ(W1 x))))
Add layers to increase capacity
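As a concrete sketch, the one-hidden-layer composition above in plain numpy (the weights are random for illustration, i.e. untrained):

```python
import numpy as np

def sigmoid(z):
    # logistic activation: sigma(z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))   # first layer: 4 inputs -> 8 hidden units
W2 = rng.normal(size=(1, 8))   # second layer: 8 hidden units -> 1 output

def f(x):
    # one hidden layer: f(x) = sigma(W2 sigma(W1 x))
    return sigmoid(W2 @ sigmoid(W1 @ x))

y = f(np.ones(4))              # a single probability-like output in (0, 1)
```

Adding capacity is literally wrapping another `sigmoid(Wk @ ...)` around the expression.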
7. Deep Learning
LeCun, Bengio, and Hinton Deep Learning 2015
https://dx.doi.org/10.1038/nature14539
Stochastic gradient descent via backpropagation:
-- differentiate a measure of model quality, i.e. the loss (e.g. mean squared error)
-- backpropagation repeatedly applies the chain rule to get parameter gradients, which drive the updates
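Concretely, for the logistic-regression-style model from the previous slide trained with mean squared error, both steps fit in a few lines of numpy (the dataset is synthetic and the learning rate and step count are arbitrary choices for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy data: label is 1 when the feature sum is positive
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))
y = (X.sum(axis=1) > 0).astype(float)

W, b, lr = np.zeros(3), 0.0, 0.5
for _ in range(500):
    i = rng.integers(0, len(X), size=32)        # random minibatch
    p = sigmoid(X[i] @ W + b)                   # forward pass
    # loss L = mean((p - y)^2); chain rule gives
    # dL/dz = 2 (p - y) * p * (1 - p), then dL/dW = X^T dL/dz
    dz = 2.0 * (p - y[i]) * p * (1.0 - p) / len(i)
    W -= lr * (X[i].T @ dz)                     # SGD parameter update
    b -= lr * dz.sum()

acc = ((sigmoid(X @ W + b) > 0.5) == (y > 0.5)).mean()
```

An autograd library (next slide) computes `dz` and friends for you; only the update loop remains your job.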
8. In practice: autograd libraries
f(x) = σ(W4 σ(W3 σ(W2 σ(W1 x))))
Add layers to increase capacity
specifying our 3-layer net takes
just 11 lines of python
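The code from the slide itself is not reproduced here; for a sense of scale, a comparable 3-layer net is about that size even in plain numpy (layer widths are illustrative; an autograd library additionally gives you the gradients for free):

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 1]                        # 3 weight layers
Ws = [rng.normal(size=(n_out, n_in)) * 0.5
      for n_in, n_out in zip(sizes, sizes[1:])]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f(x):
    for W in Ws:                              # f(x) = s(W3 s(W2 s(W1 x)))
        x = sigmoid(W @ x)
    return x

out = f(np.ones(4))
```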
10. Convolutional networks
LeCun, Bengio, and Hinton Deep Learning 2015
https://dx.doi.org/10.1038/nature14539
Just as before:
compose many layers of learnable convolution filters
Convolution -- Activation -- Pooling
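A minimal single-channel version of one such block in numpy (the filter here is hand-set; in a CNN it would be learned):

```python
import numpy as np

def conv2d(img, kernel):
    # valid 2-D cross-correlation of a single-channel image
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    # activation
    return np.maximum(x, 0.0)

def max_pool(x, s=2):
    # non-overlapping s x s max pooling
    H, W = x.shape
    x = x[:H - H % s, :W - W % s]
    return x.reshape(H // s, s, W // s, s).max(axis=(1, 3))

img = np.random.default_rng(0).normal(size=(8, 8))
edge = np.array([[1.0, -1.0]])   # a hand-set stand-in for a learned filter
fmap = max_pool(relu(conv2d(img, edge)))
```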
11. Representation learning in action
Mahendran and Vedaldi, Visualizing deep convolutional neural networks using natural pre-images arXiv:1512.02017
Most parameters of neural net models are not directly interpretable, but there is a lot of
research into indirect methods of model interpretation
13. How can we use CNNs with small materials data?
[Figure: VGG architecture -- five convolutional blocks (64x2, 128x2, 256x3, 512x3, 512x3 channels x layers), three fully-connected layers (4096, 4096, 1000), and a 1000-way output ("dog"), applied here to a block copolymer micrograph]
DeCost Ph.D. Thesis (2016); block copolymer micrograph courtesy of Bongjoon Lee and Prof. Michael Bockstaller
Simple: linear model using fixed CNN features
Transfer learning: use pre-trained model as a good initialization
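The "simple" option treats the pre-trained CNN as a frozen featurizer and fits a linear model on its outputs. A self-contained sketch of that pattern (here `cnn_features` is a random-projection stand-in for a real pre-trained network, and the dataset is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_features(images):
    # stand-in for a frozen pre-trained CNN (e.g. VGG): in practice you
    # would run images through the network and keep a late layer's
    # activations; here a fixed random ReLU projection plays that role
    P = np.random.default_rng(42).normal(size=(images.shape[1], 64))
    return np.maximum(images @ P, 0.0)

# tiny fake dataset: 40 "micrographs" flattened to 100-d, two classes
X = rng.normal(size=(40, 100))
y = (X[:, 0] > 0).astype(float)

# linear model on the fixed features, fit by least squares
F = cnn_features(X)
w, *_ = np.linalg.lstsq(F, y - y.mean(), rcond=None)
pred = (F @ w + y.mean() > 0.5).astype(float)
acc = (pred == y).mean()
```

Only the small linear model is trained, which is why this works even with tens of labeled images.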
14. Hyperparameter tuning
-- learning rate and schedule
-- regularization strength, data augmentation
-- optimization algorithm
-- depth and width
-- ...
Bergstra and Bengio Random Search for Hyper-Parameter Optimization JMLR 2012
Most important: use a good validation set
(deep learning models have very high capacity)
Use random search with log-uniform sampling
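A sketch of that recipe: sample learning rate and regularization log-uniformly (uniform in the exponent, so each decade is equally represented), evaluate each configuration on the validation set, keep the best. The `val_score` function below is a stand-in for actually training and validating a model, and the search ranges are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_config():
    # log-uniform sampling: draw the exponent uniformly
    return {
        "lr": 10 ** rng.uniform(-5, -1),
        "weight_decay": 10 ** rng.uniform(-6, -2),
        "width": int(2 ** rng.integers(5, 10)),   # 32 .. 512
    }

def val_score(cfg):
    # stand-in for "train the model, score on the validation set";
    # peaks near lr = 1e-3 purely for illustration
    return -abs(np.log10(cfg["lr"]) + 3)

trials = [sample_config() for _ in range(50)]
best = max(trials, key=val_score)
```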
15. Some open problems to think about
- How can we train models on small datasets?
- How can we exploit these models to gain physical insight?
- How should we quantify our confidence in their predictions?
- How can we use deep learning with orientation maps?
16. Applications in materials science
Two areas I find compelling:
spatial mapping data
- quantitative microstructure-based models
- automated image acquisition
- dynamic tracking
- feature recognition --> quantitative analysis
atomistic surrogate modeling
- data-driven interatomic potentials
- learnable crystal structure representations
17. Microstructure informatics opportunities and challenges
Literature search
Autonomous microscopy
Microstructure generation
Microstructure dataset exploration
Semantic segmentation
Quantitative processing, structure, properties relationships
Community-curated datasets with properties metadata!
19. Relating processing, structure, properties
DeCost, Francis and Holm "Exploring the microstructure manifold: image texture
representations applied to ultrahigh carbon steel microstructures” Acta Mater.
DOI: 10.1016/j.actamat.2017.05.014
Ultra-high carbon steel
20. Magnification and quench
DeCost, Francis and Holm "Exploring the microstructure manifold: image texture
representations applied to ultrahigh carbon steel microstructures” Acta Mater.
DOI: 10.1016/j.actamat.2017.05.014
Ultra-high carbon steel
21. X-ray scattering data classification
Wang, Boyu, et al. "X-ray scattering image classification using deep learning."
2017 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2017.
Key idea: use convolutional autoencoder to learn features from synthetic X-ray scattering data
Compare reconstructions:
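The reconstruction-comparison idea can be sketched with a linear autoencoder (PCA via the SVD) standing in for the paper's convolutional one: train on typical data, then inputs that reconstruct poorly are flagged as unlike the training distribution. Everything below is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
# training "images": 200 samples living near a 5-d subspace of 64-d
basis = rng.normal(size=(5, 64))
X = rng.normal(size=(200, 5)) @ basis + 0.01 * rng.normal(size=(200, 64))

# encoder/decoder from the top-5 right singular vectors (optimal linear AE)
_, _, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
V = Vt[:5].T                                    # 64-d -> 5-d code

def reconstruct(x):
    xc = x - X.mean(0)
    return (xc @ V) @ V.T + X.mean(0)           # encode, then decode

in_dist = rng.normal(size=5) @ basis            # resembles training data
outlier = rng.normal(size=64)                   # does not
err_in = np.linalg.norm(in_dist - reconstruct(in_dist))
err_out = np.linalg.norm(outlier - reconstruct(outlier))
```

The in-distribution sample reconstructs far better than the outlier, which is the signal the learned features carry.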
22. Fused 2-point statistics and 3D CNN
Predicting stiffness of composite materials
Cecen, Dai, Yabansu, S.R. Kalidindi, and Song
Material structure-property linkages using three-dimensional convolutional neural networks Acta Materialia 2018
3D CNN filters
23. Real-time defect tracking in WS2 with STEM
Maksov et al Deep Learning Analysis of Defect and Phase Evolution
During Electron Beam Induced Transformations in WS2 arXiv:1803.05381
1. Train a U-Net-type model to identify lattice defects in STEM images of WS2
(training labels obtained by taking the difference with an FFT reconstruction)
2. Cluster CNN features to identify defect types
3. Localize and track:
extract defect kinetics!
24. Image-to-image regression and classification
[Figure: PixelNet architecture -- Input -> Conv1 ... Conv5 (representation learning), upsample & concatenate, MLP classifier]
Z = Concat(Upsample(C5(C4(C3(C2(C1(X)))))))
ŷ = SoftMax(h2(h1(Z)))
Bansal et al. "Pixelnet: Representation of the pixels, by the pixels, and for the
pixels.” arXiv preprint arXiv:1702.06506 (2017).
DeCost, Francis, and Holm. "High throughput quantitative metallography for
complex microstructures using deep learning: A case study in ultrahigh carbon steel”
accepted for publication in Microscopy & Microanalysis arXiv:1805.08693
This kind of model has become standard in medical imaging and robotics
What can we use it for in materials science?
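In PixelNet-style models, upsampled feature maps from several layers are stacked into a per-pixel "hypercolumn", which is what the Upsample-and-Concat step builds. A minimal numpy sketch (the three feature maps and their sizes are made up for illustration):

```python
import numpy as np

def upsample(fmap, size):
    # nearest-neighbour upsampling of a (channels, H, W) feature map
    C, H, W = fmap.shape
    return fmap.repeat(size // H, axis=1).repeat(size // W, axis=2)

rng = np.random.default_rng(0)
# pretend conv outputs at decreasing resolution: (channels, H, W)
c1 = rng.normal(size=(8, 32, 32))
c3 = rng.normal(size=(16, 8, 8))
c5 = rng.normal(size=(32, 4, 4))

# Z: every pixel gets the stacked features of all tapped layers,
# so a per-pixel classifier can mix fine and coarse information
Z = np.concatenate([upsample(f, 32) for f in (c1, c3, c5)], axis=0)
```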
25. Ultrahigh carbon steel segmentation dataset
24 annotated micrographs (20 train, 4 val, 6-fold CV)
Blue: ferrite
Cyan: carbide network
Yellow: cementite particles
Green: Widmanstätten lath
DeCost, Francis, and Holm. "High throughput quantitative metallography for
complex microstructures using deep learning: A case study in ultrahigh carbon steel”
accepted for publication in Microscopy & Microanalysis arXiv:1805.08693
26. Particle size distribution dataset
Current image processing workflow:
- manually select threshold in ImageJ
- carefully clean up after watershed segmentation
- compute particle size distributions
We can train a CNN to reproduce the cleaned-up particle labels!
DeCost, Francis, and Holm. "High throughput quantitative metallography for
complex microstructures using deep learning: A case study in ultrahigh carbon steel”
accepted for publication in Microscopy & Microanalysis arXiv:1805.08693
27. UHCS quantitative feature extraction
Proof-of-concept: fast methods for extracting quantitative info:
Run both CNNs to obtain particle sizes from complex images
DeCost, Francis, and Holm. "High throughput quantitative metallography for
complex microstructures using deep learning: A case study in ultrahigh carbon steel”
accepted for publication in Microscopy & Microanalysis arXiv:1805.08693
28. Extracting denuded zone widths
Hecht et al. "Coarsening of inter and intragranular proeutectoid cementite in an
initially pearlitic 2C-4Cr ultrahigh carbon steel." Accepted for publication in Met.
Mater. Trans. A (2017).
Current workflow:
- manually trace interface
- manually draw width samples
DeCost, Francis, and Holm. "High throughput quantitative metallography for
complex microstructures using deep learning: A case study in ultrahigh carbon steel”
accepted for publication in Microscopy & Microanalysis arXiv:1805.08693
29. Quantifying dislocations in STEM images
Li, Field, and Morgan. "Automated defect analysis in electron microscopic images." npj Computational Materials 2018
1. Sliding window detector
2. CNN filter
3. watershed, region analysis
30. Microstructure cluster analysis
Kitahara and Holm. "Microstructure Cluster Analysis with Transfer Learning and Unsupervised Learning." IMMI 2018
Key idea: cluster CNN features from fracture surface images
Additive In-718 Charpy coupons (horizontal vs. vertical build orientation)
31. Graph convolution networks
Graph convolution: locally-weighted sums of atom feature vectors
Duvenaud et al. "Convolutional networks on graphs for learning molecular fingerprints." Advances in neural information processing systems. 2015.
Xie and Grossman, Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties. Phys Rev. Lett 2018
Ahmad, Xie, Maheshwari, Grossman, and Viswanathan
Machine Learning Enabled Computational Screening of Inorganic Solid Electrolytes for Suppression of Dendrite Formation in Lithium Metal Anodes ACS Central Science
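The "locally-weighted sums" can be written as one propagation step H' = σ(Â H W), where Â is the adjacency matrix with self-loops, normalized by degree. The sketch below is a generic GCN-style update for illustration, not the exact update rule of the papers above:

```python
import numpy as np

def graph_conv(A, H, W):
    # one graph-convolution layer: average each atom's own and
    # neighbour feature vectors, mix with a learnable weight matrix W,
    # then apply a ReLU nonlinearity
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    return np.maximum((A_hat / deg) @ H @ W, 0.0)

# toy 4-atom "molecule": a chain 0-1-2-3
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

H = np.eye(4)                                   # one-hot atom features
W = np.random.default_rng(0).normal(size=(4, 8))
H1 = graph_conv(A, H, W)                        # updated atom features
```

Stacking such layers lets information flow over progressively larger neighbourhoods of the molecular or crystal graph.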
32. Neural network potentials
Goal: replace ab initio MD with neural-network MD: much faster, with near-DFT quality
Lots of interesting work, much of which is inspired by Behler and Parrinello
Behler, Jörg, and Michele Parrinello. "Generalized neural-network representation of high-dimensional potential-energy surfaces."
Physical review letters 98.14 (2007): 146401.
Classic: fingerprint local atomic environments --> neural network model
Pun, Batra, Ramprasad, and Mishin Physically-informed artificial neural networks for atomistic modeling of materials arXiv:1808.01696
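The "fingerprint" step maps each atom's local environment to a fixed-length descriptor that a neural network can consume. A sketch of one Behler-Parrinello-style radial symmetry function, G_i = sum_j exp(-eta (r_ij - Rs)^2) fc(r_ij), with a cosine cutoff; all parameter values and the toy geometry are illustrative:

```python
import numpy as np

def cutoff(r, rc):
    # smooth cosine cutoff: 1 at r = 0, exactly 0 for r >= rc
    return np.where(r < rc, 0.5 * (np.cos(np.pi * r / rc) + 1.0), 0.0)

def radial_G(positions, i, eta=1.0, Rs=1.0, rc=3.0):
    # one radial symmetry-function value for atom i
    r = np.linalg.norm(positions - positions[i], axis=1)
    r = np.delete(r, i)                         # exclude the atom itself
    return np.sum(np.exp(-eta * (r - Rs) ** 2) * cutoff(r, rc))

# toy configuration: 4 atoms on a line with unit spacing
pos = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0], [3.0, 0, 0]])
G = np.array([radial_G(pos, i) for i in range(len(pos))])
```

In a real potential, a vector of such functions (varying eta, Rs, plus angular terms) feeds a per-atom network whose outputs sum to the total energy.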
33. Extracting materials process data from the literature
Kim, Huang, Jegelka, and Olivetti
Virtual screening of inorganic materials synthesis parameters with deep learning npj Comp. Mater. Sci. doi:10.1038/s41524-017-0055-6
1: word2vec: featurize words
https://www.tensorflow.org/tutorials/representation/word2vec
2: variational autoencoder: featurize synthesis routes
3: map the processing space!
34. Molecular autoencoder for chemical design
Gómez-Bombarelli et al Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules
ACS Central Science 2018 10.1021/acscentsci.7b00572
1. Variational autoencoder to learn molecular features
2. Perform property optimization on the latent representations!
35. Resources
https://cs231n.github.io/ -- great introductory course
https://fast.ai -- state-of-the-art (2018) practical deep learning
http://deeplearningbook.org -- Theoretical/mathematical perspective
Also: http://colah.github.io/posts/2015-08-Backprop/
Interactive ConvNets in the browser: https://cs.stanford.edu/people/karpathy/convnetjs/
transfer learning: https://medium.com/nanonets/nanonets-how-to-use-deep-learning-when-you-have-limited-data-f68c0b512cab