1. Evaluating Machine Learning Algorithms for Materials Science using the Matbench Protocol
Anubhav Jain
Staff Scientist, Lawrence Berkeley National Laboratory
Deputy Director, Materials Project
materialsproject.org
The Materials Project
Slides (already) uploaded to https://hackingmaterials.lbl.gov
2. Outline of talk
1. A quick introduction to the Materials Project
2. Engaging the community: The MPContribs data platform
3. Benchmarking machine learning algorithms using the Matbench protocol
4. The core of the Materials Project is a free database of calculated materials properties and crystal structures
Free, public resource • www.materialsproject.org
Data on ~150,000 materials, including information on:
• electronic structure
• phonon and thermal properties
• elastic / mechanical properties
• magnetic properties
• ferroelectric properties
• piezoelectric properties
• dielectric properties
Powered by hundreds of millions of CPU-hours invested into high-quality calculations
6. Apps give insight into data
Materials Explorer
Phase Stability Diagrams
Pourbaix Diagrams (Aqueous Stability)
Battery Explorer
7. The code powering the Materials Project is available open source (BSD/MIT licenses)
• just-in-time error correction, fixing your calculations so you don't have to
• "recipes" for common materials science simulation tasks
• making materials science web apps easy
• workflow management software for high-throughput computing
• materials science analysis code: make, transform, and analyze crystals, phase diagrams, and more
• & more … MP team members also contribute to several other non-MP codes, e.g. matminer for machine learning featurization
8. Example: calculation workflows implemented by dozens of collaborators
Phonons, Elasticity, Defects, Magnetism, Band Structures, Stability, Grain Boundaries, Equations of State, X-ray Absorption Spectra, Piezoelectric, Dielectric, Surfaces, & more …
Requirements: VASP license and a big computer
ABINIT planned in the future w/ G.-M. Rignanese
9. Example 2: matminer allows researchers to generate diverse feature sets for machine learning
>60 featurizer classes can generate thousands of potential descriptors that are described in the literature

feat = EwaldEnergy()           # any featurizer class, with its options
y = feat.featurize(input_data) # descriptor values for one input

• compatible with scikit-learn pipelining
• automatically deploys multiprocessing to parallelize over data
• includes citations to methodology papers
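The featurizer call pattern above can be illustrated with a self-contained toy (the `StoichiometryFeaturizer` class below is hypothetical, not a matminer class; real featurizers such as `EwaldEnergy` expose the same `featurize` / `feature_labels` interface):

```python
class StoichiometryFeaturizer:
    """Toy featurizer mirroring the matminer pattern:
    one input object in, a list of numeric descriptors out."""

    def featurize(self, composition):
        # composition: dict of element symbol -> amount, e.g. {"Fe": 2, "O": 3}
        amounts = list(composition.values())
        total = sum(amounts)
        fractions = [a / total for a in amounts]
        return [
            len(composition),   # number of distinct elements
            max(fractions),     # largest atomic fraction
            min(fractions),     # smallest atomic fraction
        ]

    def feature_labels(self):
        # Human-readable names, one per descriptor returned by featurize()
        return ["n_elements", "max_fraction", "min_fraction"]

feat = StoichiometryFeaturizer()
print(feat.featurize({"Fe": 2, "O": 3}))  # → [2, 0.6, 0.4]
```

Because every featurizer shares this interface, many descriptor sets can be concatenated into one feature matrix and fed to any downstream model.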
10. The Materials Project is used heavily by the research community
• >180,000 registered users
• >40,000 new users last year
• ~100 new registrations/day
• ~5,000-10,000 users log on every day
• >2 million records downloaded through the API each day; 1.8 TB of data served per month
11. A large fraction of users are from industry
User breakdown: Student 44%, Academia 36%, Industry 10%, Government 5%, Other 5%
Schrodinger: "Many of our customers are active users of the Materials Project and use MP databases for their projects. Enabling direct access to MP databases from within Schrödinger software is a powerful addition that will be appreciated by our users."
Toyota: "Materials Project is a wonderful project. Please accept my appreciation to you to release it free and easy to access."
Hazen Research: "Amazing and well done data base. I still remember searching Landolt-Börnstein series during my PhD for similar things."
13. How can we use the Materials Project to build a community of materials researchers?
Materials Project now has high visibility (e.g., by search engines). How can we use this platform to help add value to the community of materials researchers?
14. Beyond calculations: MPContribs allows the research community to contribute their own data
• A "materials detail page," containing all the information MP has calculated about a specific material
• Experimental data on a material (either a specific phase, composition, or chemical system)
• "MPContribs" bridges the gap between the two
15. From Google search to your data and your research, via MP
1. Google links to the Materials Project page
2. Materials Project links to your contribution
3. Your data set and paper are linked
16. MPContribs is open for contributions
You can now apply to contribute your data set, and we will work with you to disseminate it via MP.
Designed for:
• smaller data sets (e.g., MBs to GBs); for large data files, see NOMAD or other repositories
• linking to MP compositions
Available via mpcontribs.org
18. MP is now involved in an effort to benchmark various machine learning algorithms
19. Without standardized benchmarks, ML models can be difficult to compare
[Figure: three models trained on three different data sets (one with no structures, no AB2C3 compositions, and 4k samples; another with structures available, 100k samples, and E_above_hull < 0.050 eV), each reporting a different metric: test-set RMSE = 0.05 eV, 5-fold CV MAE = 0.021 eV, validation loss = 0.005. Head-to-head comparison is impossible.]
20. What's needed: an "ImageNet" for materials science
https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/
21. Can we make the same advancements in materials as in computer vision?
One reason computer science / machine learning seems to advance so quickly is that these fields decouple data generation from algorithm development. This allows groups to focus on algorithm development without all the data generation, data cleaning, etc. that is often the majority of an end-to-end data science project.
Clear comparisons also move the field forward and measure progress.
22. The ingredients of the Matbench benchmark
☐ Standard data sets
☐ Standard test splits according to a nested cross-validation procedure
☐ An online leaderboard that encourages reproducible results
23. Matbench includes 13 different ML tasks
Dunn, A.; Wang, Q.; Ganose, A.; Dopp, D.; Jain, A. Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm. npj Comput Mater 2020, 6 (1), 138. https://doi.org/10.1038/s41524-020-00406-3.
24. The tasks encompass a variety of problems
13 ready-to-use ML tasks ranging in training size, target property, inputs, and task type:
• pre-cleaned datasets from the literature and online repositories (such as the Materials Project)
• a wide range of practical solid-state ML tasks
• experimental and computed properties
• standardized error evaluation (nested CV)
25. Browse datasets and tasks with Materials Project MPContribsML
https://ml.materialsproject.org
26. The ingredients of the Matbench benchmark
✓ Standard data sets
☐ Standard test splits according to a nested cross-validation procedure
☐ An online leaderboard that encourages reproducible results
27. Most commonly used test split procedure
• Training/validation is used for model selection
• Test / hold-out is used only for error estimation
(The test set should not inform model selection, i.e., it is the "final answer".)
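The split procedure above can be sketched in a few lines of plain Python (a toy illustration with assumed fold fractions, not the Matbench implementation):

```python
import random

def train_val_test_split(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle sample indices, then carve off a hold-out test set
    (used only for final error estimation) and a validation set
    (used for model selection); the remainder is the training set."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    test = idx[:n_test]                  # "final answer" only
    val = idx[n_test:n_test + n_val]     # model selection
    train = idx[n_test + n_val:]         # model fitting
    return train, val, test

train, val, test = train_val_test_split(100)
print(len(train), len(val), len(test))  # → 60 20 20
```

The key discipline is procedural: nothing about the test indices may feed back into model selection.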
28. Nested CV – like hold-out, but varies the hold-out set
Think of it as N different "universes": we have a different training of the model in each universe and a different hold-out set.
29. Nested CV – like hold-out, but varies the hold-out set
"A nested CV procedure provides an almost unbiased estimate of the true error."
Varma and Simon, Bias in error estimation when using cross-validation for model selection (2006)
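The nested scheme can be sketched in plain Python (a toy illustration under assumed fold counts, not the Matbench implementation): each outer fold acts once as the hold-out "universe," and model selection may only use the inner splits of the corresponding outer training set.

```python
import random

def kfold(indices, n_folds):
    """Partition a list of indices into n_folds disjoint folds."""
    return [indices[i::n_folds] for i in range(n_folds)]

def nested_cv_splits(n_samples, n_outer=5, n_inner=5, seed=0):
    """Yield (train, test, inner_splits) for each outer fold.

    test: the hold-out set for this "universe" (error estimation only).
    inner_splits: (fit, validate) index pairs drawn from train,
    for model selection inside the outer training set.
    """
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    outer = kfold(idx, n_outer)
    for k in range(n_outer):
        test = outer[k]
        train = [i for j, fold in enumerate(outer) if j != k for i in fold]
        inner = kfold(train, n_inner)
        inner_splits = [
            ([i for j, f in enumerate(inner) if j != v for i in f], inner[v])
            for v in range(n_inner)
        ]
        yield train, test, inner_splits
```

Averaging the test errors over the outer folds gives the (almost unbiased) error estimate the quote refers to, since no hold-out sample ever influences model selection.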
30. The ingredients of the Matbench benchmark
✓ Standard data sets
✓ Standard test splits according to a nested cross-validation procedure
☐ An online leaderboard that encourages reproducible results
31. Matbench has an online leaderboard – matbench.materialsproject.org
32. Complete and reproducible results on standardized ML tasks
• Sample-by-sample predictions of all algorithms on all tasks, plus notebooks and scripts for reproduction (.json, .ipynb, .py)
• Aggregate scores across nested CV folds
• Complete model metadata: hyperparameters, required compute, academic references
33. Algorithm comparison across individual tasks OR the complete benchmark
Example: matbench_dielectric. Compare both specialized and general-purpose algorithms across multiple error metrics.
34. Evaluation of ML paradigms drives research and development
Traditional paradigms:
• Traditional models (e.g., RF + MagPie [1] features)
• AutoML inside the "traditional ML" space (Automatminer)
Advancements in deep neural networks:
• Attention networks (e.g., CRABNet [2])
• Optimal descriptor networks (e.g., MODNet [3])
• Crystal graph networks (e.g., CGCNN, MEGNet [4])
1. doi.org/10.1038/npjcompumats.2016.28 2. doi.org/10.1038/s41524-021-00545-1 3. doi.org/10.1038/s41524-021-00552-2 4. doi.org/10.1021/acs.chemmater.9b01294
35. Matbench compares these ML model paradigms
Traditional paradigms:
• Traditional models (e.g., RF + MagPie [1] features): ✓ in Matbench
• AutoML inside the "traditional ML" space (Automatminer): ✓ in Matbench
Advancements in deep neural networks:
• Attention networks (e.g., CRABNet [2]): ✓ in Matbench
• Optimal descriptor networks (e.g., MODNet [3]): ✓ PR in review
• Crystal graph networks (e.g., CGCNN, MEGNet [4]): ✓ CGCNN in Matbench; MEGNet in progress
1. doi.org/10.1038/npjcompumats.2016.28 2. doi.org/10.1038/s41524-021-00545-1 3. doi.org/10.1038/s41524-021-00552-2 4. doi.org/10.1021/acs.chemmater.9b01294
36. Contribute your model to the body of knowledge
Matbench Python package: evaluate an entire benchmark with ~10 lines of code.

$ pip install matbench

from matbench.bench import MatbenchBenchmark

mb = MatbenchBenchmark(autoload=False)
for task in mb.tasks:
    task.load()
    for fold in task.folds:
        train_inputs, train_outputs = task.get_train_and_val_data(fold)
        my_model.train_and_validate(train_inputs, train_outputs)
        test_inputs = task.get_test_data(fold, include_target=False)
        predictions = my_model.predict(test_inputs)
        task.record(fold, predictions)
mb.to_file("my_models_benchmark.json.gz")

Your model needs:
• a function that trains it on the training data
• a function that makes predictions with the trained model
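The `my_model` object in the loop above only needs those two methods; a minimal sketch follows (the `MeanBaseline` class and its mean-predicting behavior are illustrative assumptions, not part of the Matbench package):

```python
import statistics

class MeanBaseline:
    """Toy model exposing the two-method interface the benchmark loop expects."""

    def train_and_validate(self, inputs, outputs):
        # Ignore the inputs entirely; memorize the mean target value.
        self.mean_ = statistics.fmean(outputs)

    def predict(self, inputs):
        # Predict the training mean for every test sample.
        return [self.mean_] * len(inputs)

model = MeanBaseline()
model.train_and_validate(["A", "B", "C"], [1.0, 2.0, 3.0])
print(model.predict(["D", "E"]))  # → [2.0, 2.0]
```

Any real model (a graph network, an AutoML pipeline, a random forest on matminer features) can be wrapped the same way and dropped into the benchmark loop unchanged.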
37. Contribute your model to the body of knowledge
Submit the resulting model file along with your desired model metadata via a GitHub PR.
38. The ingredients of the Matbench benchmark
✓ Standard data sets
✓ Standard test splits according to a nested cross-validation procedure
✓ An online leaderboard that encourages reproducible results
39. Results so far: graph NNs for large data sets, conventional ML for small
Dunn, A.; Wang, Q.; Ganose, A.; Dopp, D.; Jain, A. Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm. npj Comput Mater 2020, 6 (1), 138. https://doi.org/10.1038/s41524-020-00406-3.
40. Overall and upcoming goals for Matbench
• We have introduced a method that allows researchers to evaluate their machine learning models on a standard benchmark, if they so choose.
• The Matbench resource also provides metadata and code examples that allow others to reproduce and use community ML models more easily, as well as discover new ML models.
• In the future, we hope to expand the types of tasks, perform meta-analyses on what kinds of algorithms work best for certain problems, and plot progress on these tasks over time.
41. Concluding thoughts
The Materials Project is a free resource providing data and tools to help perform research and development of new materials.
Even more can be accomplished as a unified community to push forward data dissemination as well as the capabilities of machine learning.
We encourage you to give Matbench a try, and we look forward to seeing your algorithm on the leaderboard!
42. Thank you!
The team:
• Kristin Persson, MP Director
• Patrick Huck, Staff Scientist (MPContribs)
• Alex Dunn, Grad Student (Matbench / matminer)
Slides (already) uploaded to https://hackingmaterials.lbl.gov