Software tools, crystal descriptors, and
machine learning applied to materials design
Anubhav Jain
Energy Technologies Area
Lawrence Berkeley National Laboratory
Berkeley, CA
TMS 2019
Slides (already) posted to hackingmaterials.lbl.gov
2
This talk is centered around open-source software that you
can use to accelerate your own materials design efforts
High-throughput
computing and
simulations
Machine learning
Interpretable
crystal structure
representations
We know that high-throughput DFT is useful for generating
large data sets, e.g., for materials screening
4
>10,000 elastic tensors
M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst, M. Sluiter, C. K. Ande, S. Van Der Zwaag, J. J. Plata, C. Toher, S. Curtarolo, G. Ceder, K. A. Persson, and M. Asta, Sci. Data, 2015, 2, 150009.
>48,000 Seebeck coefficients + cRTA transport
Ricci, Chen, Aydemir, Snyder, Rignanese, Jain, & Hautier, Sci. Data, 2017, 4, 170085.
Atomate’s goal: make
high-throughput easy and
scalable for everyone
A “black-box” view of performing a calculation
6
“something”
Results!
researcher
What is the
GGA-PBE elastic
tensor of GaAs?
Unfortunately, the inside of the “black box”
is usually tedious and “low-level”
7
lots of tedious,
low-level work…
Results!
researcher
What is the
GGA-PBE elastic
tensor of GaAs?
Input file flags
SLURM format
how to fix ZPOTRF?
☐ set up the structure coordinates
☐ write input files, double-check all the flags
☐ copy to supercomputer
☐ submit job to queue
☐ deal with supercomputer headaches
☐ monitor job
☐ fix error jobs, resubmit to queue, wait again
☐ repeat process for subsequent calculations in workflow
☐ parse output files to obtain results
☐ copy and organize results, e.g., into Excel
What would be a better way?
8
“something”
Results!
researcher
What is the
GGA-PBE elastic
tensor of GaAs?
What would be a better way?
9
Results!
researcher
What is the
GGA-PBE elastic
tensor of GaAs?
Workflows to run
☐ band structure
☐ surface energies
☑ elastic tensor
☐ Raman spectrum
☐ QH thermal expansion
Ideally the method should scale to millions of calculations
10
Results!
researcher
Start with all binary
oxides, replace O->S,
run several different
properties
Workflows to run
☑ band structure
☑ surface energies
☑ elastic tensor
☐ Raman spectrum
☐ QH thermal expansion
☐ spin-orbit coupling
Atomate tries to make it easy, automatic, and flexible to
generate data with existing simulation packages
11
Results!
researcher
Run many different
properties of many
different materials!
Each simulation procedure translates high-level instructions
into a series of low-level tasks
12
quickly and automatically translate high-level (minimal)
specifications into well-defined FireWorks workflows
What is the
GGA-PBE elastic
tensor of GaAs?
M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst, et al.,
Charting the complete elastic properties of inorganic crystalline compounds,
Sci. Data. 2 (2015).
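The translation from a high-level request to low-level tasks can be pictured with a toy sketch. It is plain Python and does not use atomate or FireWorks. The function name `expand_elastic_request`, the task names, and the number of deformations are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One low-level step in a workflow (hypothetical structure)."""
    name: str
    parents: list = field(default_factory=list)  # names of tasks that must finish first


def expand_elastic_request(formula, functional="GGA-PBE", n_deformations=6):
    """Expand a one-line request into an ordered list of low-level tasks.

    Assumed recipe (illustrative only): relax the structure, run one
    calculation per deformed cell, then fit the elastic tensor.
    """
    relax = Task(f"{formula}: structure optimization ({functional})")
    deformations = [
        Task(f"{formula}: deformation {i + 1} ({functional})", parents=[relax.name])
        for i in range(n_deformations)
    ]
    analysis = Task(
        f"{formula}: fit elastic tensor",
        parents=[d.name for d in deformations],
    )
    return [relax, *deformations, analysis]


# "What is the GGA-PBE elastic tensor of GaAs?" as a single call
workflow = expand_elastic_request("GaAs")
for task in workflow:
    print(task.name, "<-", task.parents)
```

The point of the design is that the researcher supplies only the material and the property, and the recipe decides the individual calculations and their ordering.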
Atomate contains a library of simulation procedures
13
VASP-based
• band structure
• spin-orbit coupling
• hybrid functional
calcs
• elastic tensor
• piezoelectric tensor
• Raman spectra
• NEB
• GIBBS method
• QH thermal
expansion
• AIMD
• ferroelectric
• surface adsorption
• work functions
• NMR spectra*
• Bader charges*
• Magnetic
orderings*
• SCAN functionals*
Other
• BoltzTraP
• FEFF method
• Q-Chem*
*=added / major
updates in past year
Mathew, K. et al. Atomate: A high-level interface to generate, execute, and analyze
computational materials science workflows, Comput. Mater. Sci. 139 (2017) 140–152.
14
Full operation diagram
[Diagram: structure → workflow (job 1, job 2, job 3, job 4) → database of all workflows → automatically submit + execute → output files + database]
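A minimal sketch of the "submit and execute in order, then store results" idea in the diagram, using only the Python standard library (`graphlib`, Python 3.9+). The dependency layout between the four jobs and the `fake_job` function are assumptions made for illustration. A real deployment would submit jobs to a queue and write to a database rather than to a dict.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each key lists the jobs it must wait for.
dependencies = {
    "job 1": set(),
    "job 2": {"job 1"},
    "job 3": {"job 2"},
    "job 4": {"job 2"},
}


def run_workflow(deps, run_job):
    """Run jobs in dependency order and collect results in a dict ("database")."""
    database = {}
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    while sorter.is_active():
        # get_ready() returns every job whose parents have all finished
        for job in sorter.get_ready():
            database[job] = run_job(job)
            sorter.done(job)
    return database


def fake_job(name):
    """Stand-in for a real simulation; returns a small result record."""
    return {"status": "COMPLETED", "output": f"results of {name}"}


database = run_workflow(dependencies, fake_job)
for job, record in database.items():
    print(job, record["status"])
```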
15
A web-based interface is in progress to give atomate users a
“personal Materials Project” of their own calculations
Atomate now powers the Materials Project
• Online resource of density
functional theory simulation data
for ~85,000 inorganic materials
• Includes band structures, elastic
tensors, piezoelectric tensors,
battery properties and more
• >75,000 registered users
• Free
• www.materialsproject.org
16
Jain et al. Commentary: The Materials Project: A
materials genome approach to accelerating
materials innovation. APL Mater. 1, 11002 (2013).
17
Getting started with atomate
Paper: Mathew, K. et al. Atomate: A high-level interface to generate, execute, and analyze computational materials science workflows. Comput. Mater. Sci. 139, 140–152 (2017).
Docs: hackingmaterials.github.io/atomate
Support: https://groups.google.com/forum/#!forum/atomate
18
Outline
High-throughput
computing and
simulations
Machine learning
Interpretable
crystal structure
representations
• With atomate/FireWorks,
the user must decide which
calculations to perform
– E.g., which materials to
calculate
• Rocketsled is an extension
to FireWorks that lets the
computer decide what the
next best calculation is
based on the results of
previous calculations
• Works for materials design
or any other “inverse
computational problem”
19
Rocketsled uses adaptive design to suggest the best
computations to optimize some metric
20
Given a search domain, Rocketsled uses an optimization
engine to select calculations and submit to supercomputers
Optimization engine includes 4 built-in regressors (e.g., RandomForest,
Gaussian Process) and 5 acquisition functions (e.g., Expected
Improvement). Can bootstrap uncertainty estimates. Or use your own!
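Expected Improvement, one of the acquisition functions named above, can be written in a few lines. The sketch below is plain Python and is not Rocketsled's API. The candidate names and the predicted means and uncertainties are invented, and the metric is assumed to be maximized.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best_so_far):
    """Expected Improvement of a candidate for a metric being maximized."""
    if std <= 0:
        # No uncertainty: improvement is known exactly
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    return (mean - best_so_far) * normal_cdf(z) + std * normal_pdf(z)


def pick_next(candidates, best_so_far):
    """Return the candidate name with the largest Expected Improvement.

    `candidates` maps a name to (predicted mean, predicted std), e.g. from a
    regressor with bootstrapped uncertainty estimates.
    """
    return max(
        candidates,
        key=lambda name: expected_improvement(*candidates[name], best_so_far),
    )


# Invented predictions for three uncomputed materials
predictions = {
    "A": (95.0, 1.0),    # close to the best value, but very certain
    "B": (90.0, 20.0),   # lower mean, large uncertainty
    "C": (80.0, 2.0),    # low mean, small uncertainty
}
next_calc = pick_next(predictions, best_so_far=100.0)
print(next_calc)
```

Candidate B is selected here: its mean is lower than A's, but its large uncertainty leaves a real chance of beating the best value found so far.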
21
Results of using optimization can be dramatic!
In the problem of finding materials with
high K and high G for superhard
materials (7394 possibilities), Rocketsled
finds solutions ~30-60X faster than
randomly computing the space.
Can use pure ML approaches or use
matminer featurizations for materials
science (latter helps give such good
performance)
22
Even after just 200
calculations of the
7394 possibilities,
all solutions are
almost certain to
be found with
Rocketsled.
23
Getting started with rocketsled
Paper: Dunn, A.R., et al. Rocketsled: a software library for optimizing high-throughput computational searches. J. Phys. Mater. https://doi.org/10.1088/2515-7639/ab0c3d
Docs: hackingmaterials.github.io/rocketsled
Support: https://groups.google.com/forum/#!forum/fireworkflows
24
Outline
High-throughput
computing and
simulations
Machine learning
Interpretable
crystal structure
representations
25
What is needed to do machine learning on materials?
How can we represent
chemistry and structure as
vectors?
How do we get
enough output
data for training?
Matminer connects materials data with data mining
algorithms and data visualization libraries
26
Ward, L. et al. Matminer: An open source toolkit for materials data mining. Comput. Mater. Sci. 152, 60–69 (2018).
>60 featurizer classes can
generate thousands of potential
descriptors that are described in
the literature
27
Matminer contains a library of descriptors for various
materials science entities
feat = EwaldEnergy([options])
y = feat.featurize([input_data])
• compatible with scikit-learn pipelining
• automatically deploys multiprocessing to parallelize over data
• includes citations to methodology papers
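The `featurize` pattern above can be illustrated with a toy featurizer written in plain Python. This is not a matminer class. The class name `MeanAtomicNumber`, its two features, and the tiny element table are hypothetical and exist only to show how one object turns a material into a fixed-length numeric vector with labeled columns.

```python
import re

# Atomic numbers for the few elements used in this example
ATOMIC_NUMBER = {"O": 8, "Ti": 22, "Ga": 31, "As": 33}


class MeanAtomicNumber:
    """Toy composition featurizer (hypothetical, not part of matminer)."""

    def featurize(self, formula):
        """Return [mean atomic number, number of distinct elements]."""
        amounts = self._parse(formula)
        total = sum(amounts.values())
        mean_z = sum(ATOMIC_NUMBER[el] * n for el, n in amounts.items()) / total
        return [mean_z, float(len(amounts))]

    def feature_labels(self):
        """Column names matching the order returned by featurize()."""
        return ["mean atomic number", "number of elements"]

    def featurize_many(self, formulas):
        """Featurize a list of formulas into a list of vectors."""
        return [self.featurize(f) for f in formulas]

    @staticmethod
    def _parse(formula):
        """Parse a simple formula such as 'TiO2' into {'Ti': 1, 'O': 2}."""
        amounts = {}
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
            amounts[element] = amounts.get(element, 0) + int(count or 1)
        return amounts


feat = MeanAtomicNumber()
X = feat.featurize_many(["GaAs", "TiO2"])
print(feat.feature_labels())
print(X)
```

Because every featurizer exposes the same small interface, vectors from different descriptors can be concatenated column by column and passed to any standard ML library.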
28
Interactive Jupyter notebooks demonstrate use cases
https://github.com/hackingmaterials/matminer_examples
Many examples available:
• Retrieving data from various databases
• Predicting bulk / shear modulus
• Predicting formation energies:
• from composition alone
• with Voronoi-based structure features
included
• with Coulomb matrix and Orbital Field
matrix descriptors (reproducing
previous studies in the literature)
• Making interactive visualizations
• Creating an ML pipeline
29
Getting started with matminer
Paper: Ward et al. Matminer: An open source toolkit for materials data mining. Computational Materials Science, 152, 60–69 (2018).
Docs: hackingmaterials.github.io/matminer
Support: https://groups.google.com/forum/#!forum/matminer
30
Outline
High-throughput
computing and
simulations
Machine learning
Interpretable
crystal structure
representations
31
Typically several steps of machine learning are performed by
a human researcher – can these be automated?
Descriptors developed and
chosen by a researcher
ML model developed
and chosen by a
researcher
Why can’t we just give the computer some raw input data
(compositions, crystal structures) and output properties and get
back an ML model?
32
Automatminer develops an ML model automatically given
raw data (structures or compositions plus output properties)
Featurizer
MagPie
SOAP
Sine Coulomb Matrix
+ many, many more
• Missing value
imputation
• Scaling
• One-hot
encoding
• PCA-based
• Correlation
• Relief-based
(MultiSURF)
Uses genetic
algorithms to find
the best machine
learning model +
hyperparameters
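A few of the intermediate steps listed above (missing value imputation, scaling, and correlation-based feature reduction) can be sketched with the standard library alone. This is not automatminer code. The function names, the 0.95 threshold, and the small feature table are assumptions for illustration, and the final model search with genetic algorithms is not shown.

```python
import math
import statistics


def impute_missing(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in column]


def standard_scale(column):
    """Shift to zero mean and unit (population) standard deviation.

    Assumes the column is not constant, otherwise the division fails.
    """
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std for v in column]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length columns."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return cov / norm


def drop_correlated(features, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is below threshold."""
    kept = {}
    for name, column in features.items():
        if all(abs(pearson(column, other)) < threshold for other in kept.values()):
            kept[name] = column
    return kept


# Invented feature table: one column has a gap, one is nearly redundant
raw = {
    "mean_Z": [32.0, None, 12.0, 20.0],
    "mean_Z_doubled": [64.0, 44.0, 24.0, 40.0],
    "n_elements": [2.0, 3.0, 2.0, 2.0],
}

cleaned = {name: standard_scale(impute_missing(col)) for name, col in raw.items()}
reduced = drop_correlated(cleaned)
print(list(reduced))
```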
33
We are benchmarking automatminer vs current state of the
art against 11 problems intended to be a standard test set
Dataset                          Target(s)                                        Samples
Elastic Tensor                   KVRH (GPa), GVRH (GPa)                           10,987
Dielectric Tensor                Refractive index                                 4,765
JARVIS 2D                        Exfoliation energy (meV/atom)                    636
Materials Project phonons        Highest LO Phonon Frequency (Last PhDOS peak)    1,265
Materials Project (stable)       Band gap (eV), Is metallic? (classification)     106,113
Perovskites                      Formation energy (eV/atom)                       18,928
Experimental Band Gaps           Is metallic? (classification)                    6,354
Experimental Metallic Glasses    Glass forms? (classification)                    7,190
Materials Project (all)          Formation energy (eV/atom)                       132,752
34
Usually, automatminer does very well
Usually, automatminer outperforms both
state-of-the-art graph based models AND
human-generated models!
But …
35
Graph-based approaches work better in some problems
Hypothesis – automatminer
approaches are better for smaller
data sets, graph-based
approaches are better for larger
data sets
Unfortunately, it can be difficult
to train some of the graph
models on large data sets,
particularly without GPUs, so
the results are not in yet!
36
Getting started with automatminer
Paper: In preparation …
Docs: hackingmaterials.github.io/automatminer
Support: https://groups.google.com/forum/#!forum/matminer
37
Outline
High-throughput
computing and
simulations
Machine learning
Interpretable
crystal structure
representations
38
Robocrystallographer is a computer program that can
analyze crystal structures and describe them as text
39
Example of fully automated robocrystallographer output
40
Example of fully automated robocrystallographer output
GaAs is zincblende structured and crystallizes in
the cubic F4̅3m space group. Ga3+ is bonded to
four equivalent As3– atoms to form corner-
sharing GaAs4 tetrahedra. All Ga–As bond lengths
are 2.49 Å. As3– is bonded in a tetrahedral
geometry to four equivalent Ga3+ atoms.
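One way to picture how structural facts become sentences like those above is a simple template fill. The sketch below is not robocrystallographer's implementation. The `describe` function, the dictionary layout, and the hand-entered GaAs facts are hypothetical, and a real tool would derive these facts from the crystal structure itself.

```python
NUMBER_WORDS = {2: "two", 3: "three", 4: "four", 6: "six"}


def describe(structure):
    """Turn a dict of structural facts into a short text description."""
    sentences = [
        f"{structure['formula']} is {structure['prototype']} structured and "
        f"crystallizes in the {structure['crystal_system']} "
        f"{structure['space_group']} space group."
    ]
    for site in structure["sites"]:
        count = NUMBER_WORDS[site["n_neighbors"]]
        sentences.append(
            f"{site['species']} is bonded in a {site['geometry']} geometry to "
            f"{count} equivalent {site['neighbor']} atoms."
        )
    return " ".join(sentences)


# Facts entered by hand for illustration (space group written in ASCII)
gaas = {
    "formula": "GaAs",
    "prototype": "zincblende",
    "crystal_system": "cubic",
    "space_group": "F-43m",
    "sites": [
        {
            "species": "As3-",
            "geometry": "tetrahedral",
            "n_neighbors": 4,
            "neighbor": "Ga3+",
        }
    ],
}

text = describe(gaas)
print(text)
```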
41
Example of fully automated robocrystallographer output
42
Example of fully automated robocrystallographer output
BiOCuSe is parent of FeAs superconductors
structured and crystallizes in the tetragonal
P4/nmm space group. The structure is two-
dimensional and consists of one BiO sheet
oriented in the (0, 0, 1) direction and one CuSe
sheet oriented in the (0, 0, 1) direction. In the BiO
sheet, Bi3+ is bonded in a 4-coordinate geometry
to four equivalent O2– atoms. All Bi–O bond
lengths are 2.35 Å. O2– is bonded in a tetrahedral
geometry to four equivalent Bi3+ atoms. In the
CuSe sheet, Cu1+ is bonded to four equivalent
Se2– atoms to form a mixture of edge and corner-
sharing CuSe4 tetrahedra. All Cu–Se bond lengths
are 2.52 Å. Se2– is bonded in a 4-coordinate
geometry to four equivalent Cu1+ atoms.
43
Robocrystallographer is integrated into the Materials Project
Click the
robot icon for
Robocrys
Click the
speaker icon
to have it talk
to you.
TiO2 is Rutile structured and crystallizes in the
tetragonal P4_2/mnm space group. The
structure is three-dimensional. Ti4+ is bonded
to six equivalent O2- atoms to form a mixture of
corner and edge-sharing TiO6 octahedra. The
corner-sharing octahedral tilt angles are 49°.
There are four shorter (1.96 Å) and two longer
(2.00 Å) Ti–O bond lengths. O2- is bonded in a
distorted trigonal planar geometry to three
equivalent Ti4+ atoms.
44
Getting started with robocrystallographer
Paper: Submitted - waiting for referee report!!
Docs: hackingmaterials.github.io/robocrystallographer
Support: Alex Ganose, aganose@lbl.gov
45
Conclusion: hopefully you’ve found something interesting
or useful for your own work!
High-throughput
computing and
simulations
Machine learning
Interpretable
crystal structure
representations
46
Acknowledgements
• Lead developers:
– Atomate: Kiran Mathew
– Rocketsled: Alex Dunn
– Matminer: Logan Ward
– Automatminer: Alex Dunn
– Robocrystallographer: Alex Ganose
• And the dozens of other developers who have contributed to
these packages or reported issues!
• Funding: U.S. Department of Energy, Basic Energy Sciences,
Early Career Award
• Additional funding from the DOE-funded Materials Project

IRJET- Review of Existing Methods in K-Means Clustering AlgorithmIRJET- Review of Existing Methods in K-Means Clustering Algorithm
IRJET- Review of Existing Methods in K-Means Clustering Algorithm
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...
 
Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learning
 
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
 
A Machine Learning Framework for Materials Knowledge Systems
A Machine Learning Framework for Materials Knowledge SystemsA Machine Learning Framework for Materials Knowledge Systems
A Machine Learning Framework for Materials Knowledge Systems
 
The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...
 
Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...
 
Prediction of Critical Temperature of Superconductors using Tree Based Method...
Prediction of Critical Temperature of Superconductors using Tree Based Method...Prediction of Critical Temperature of Superconductors using Tree Based Method...
Prediction of Critical Temperature of Superconductors using Tree Based Method...
 
Many Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersMany Task Applications for Grids and Supercomputers
Many Task Applications for Grids and Supercomputers
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores 
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and ApplicationsData Mining to Discovery for Inorganic Solids: Software Tools and Applications
Data Mining to Discovery for Inorganic Solids: Software Tools and Applications
 

Plus de Anubhav Jain

Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...Anubhav Jain
 
Applications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignApplications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignAnubhav Jain
 
An AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesisAn AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesisAnubhav Jain
 
Best practices for DuraMat software dissemination
Best practices for DuraMat software disseminationBest practices for DuraMat software dissemination
Best practices for DuraMat software disseminationAnubhav Jain
 
Best practices for DuraMat software dissemination
Best practices for DuraMat software disseminationBest practices for DuraMat software dissemination
Best practices for DuraMat software disseminationAnubhav Jain
 
Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...Anubhav Jain
 
Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Anubhav Jain
 
Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...Anubhav Jain
 
Machine Learning for Catalyst Design
Machine Learning for Catalyst DesignMachine Learning for Catalyst Design
Machine Learning for Catalyst DesignAnubhav Jain
 
Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...Anubhav Jain
 
Accelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine LearningAccelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine LearningAnubhav Jain
 
DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …Anubhav Jain
 
The Materials Project
The Materials ProjectThe Materials Project
The Materials ProjectAnubhav Jain
 
Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...Anubhav Jain
 
Discovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectDiscovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectAnubhav Jain
 
The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...Anubhav Jain
 
Machine Learning Platform for Catalyst Design
Machine Learning Platform for Catalyst DesignMachine Learning Platform for Catalyst Design
Machine Learning Platform for Catalyst DesignAnubhav Jain
 
Applications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignApplications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignAnubhav Jain
 
Assessing Factors Underpinning PV Degradation through Data Analysis
Assessing Factors Underpinning PV Degradation through Data AnalysisAssessing Factors Underpinning PV Degradation through Data Analysis
Assessing Factors Underpinning PV Degradation through Data AnalysisAnubhav Jain
 
Extracting and Making Use of Materials Data from Millions of Journal Articles...
Extracting and Making Use of Materials Data from Millions of Journal Articles...Extracting and Making Use of Materials Data from Millions of Journal Articles...
Extracting and Making Use of Materials Data from Millions of Journal Articles...Anubhav Jain
 

Plus de Anubhav Jain (20)

Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...
 
Applications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignApplications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and Design
 
An AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesisAn AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesis
 
Best practices for DuraMat software dissemination
Best practices for DuraMat software disseminationBest practices for DuraMat software dissemination
Best practices for DuraMat software dissemination
 
Best practices for DuraMat software dissemination
Best practices for DuraMat software disseminationBest practices for DuraMat software dissemination
Best practices for DuraMat software dissemination
 
Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...
 
Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...
 
Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...
 
Machine Learning for Catalyst Design
Machine Learning for Catalyst DesignMachine Learning for Catalyst Design
Machine Learning for Catalyst Design
 
Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...
 
Accelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine LearningAccelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine Learning
 
DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …
 
The Materials Project
The Materials ProjectThe Materials Project
The Materials Project
 
Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...
 
Discovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectDiscovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials Project
 
The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...
 
Machine Learning Platform for Catalyst Design
Machine Learning Platform for Catalyst DesignMachine Learning Platform for Catalyst Design
Machine Learning Platform for Catalyst Design
 
Applications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignApplications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials Design
 
Assessing Factors Underpinning PV Degradation through Data Analysis
Assessing Factors Underpinning PV Degradation through Data AnalysisAssessing Factors Underpinning PV Degradation through Data Analysis
Assessing Factors Underpinning PV Degradation through Data Analysis
 
Extracting and Making Use of Materials Data from Millions of Journal Articles...
Extracting and Making Use of Materials Data from Millions of Journal Articles...Extracting and Making Use of Materials Data from Millions of Journal Articles...
Extracting and Making Use of Materials Data from Millions of Journal Articles...
 

Dernier

Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPirithiRaju
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxJorenAcuavera1
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...D. B. S. College Kanpur
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingNetHelix
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPirithiRaju
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)itwameryclare
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptArshadWarsi13
 
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfBUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfWildaNurAmalia2
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubaikojalkojal131
 

Dernier (20)

Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdf
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfBUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 

Software tools, crystal descriptors, and machine learning applied to materials design

  • 1. Software tools, crystal descriptors, and machine learning applied to materials design Anubhav Jain Energy Technologies Area Lawrence Berkeley National Laboratory Berkeley, CA TMS 2019 Slides (already) posted to hackingmaterials.lbl.gov
  • 2. This talk is centered around open-source software that you can use to accelerate your own materials design efforts: high-throughput computing and simulations, machine learning, and interpretable crystal structure representations.
  • 4. We know that high-throughput DFT is useful for generating large data sets, e.g., for materials screening. Examples: >10,000 elastic tensors (M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst, M. Sluiter, C. K. Ande, S. Van Der Zwaag, J. J. Plata, C. Toher, S. Curtarolo, G. Ceder, K. A. Persson, and M. Asta, Sci. Data, 2015, 2, 150009) and >48,000 Seebeck coefficients + cRTA transport (Ricci, Chen, Aydemir, Snyder, Rignanese, Jain, & Hautier, Sci. Data, 2017, 4, 170085).
  • 5. Atomate’s goal: make high-throughput easy and scalable for everyone.
  • 6. A “black-box” view of performing a calculation: the researcher asks “What is the GGA-PBE elastic tensor of GaAs?”, “something” happens, and results come out.
  • 7. Unfortunately, the inside of the “black box” is usually tedious and “low-level” (input file flags, SLURM format, how to fix ZPOTRF?). Getting from the question “What is the GGA-PBE elastic tensor of GaAs?” to results means: set up the structure coordinates; write input files and double-check all the flags; copy to supercomputer; submit job to queue; deal with supercomputer headaches; monitor job; fix error jobs, resubmit to queue, wait again; repeat process for subsequent calculations in workflow; parse output files to obtain results; copy and organize results, e.g., into Excel.
  • 8. What would be a better way?
  • 9. A better way: the researcher asks “What is the GGA-PBE elastic tensor of GaAs?” and picks from a list of workflows to run: band structure, surface energies, elastic tensor (selected), Raman spectrum, QH thermal expansion.
  • 10. Ideally the method should scale to millions of calculations. Example request: start with all binary oxides, replace O->S, run several different properties. Workflows to run: band structure (selected), surface energies (selected), elastic tensor (selected), Raman spectrum, QH thermal expansion, spin-orbit coupling.
  • 11. Atomate tries to make it easy, automatic, and flexible to generate data with existing simulation packages: run many different properties of many different materials!
  • 12. Each simulation procedure translates high-level instructions into a series of low-level tasks: atomate quickly and automatically translates high-level (minimal) specifications, such as “What is the GGA-PBE elastic tensor of GaAs?”, into well-defined FireWorks workflows. M. De Jong, W. Chen, T. Angsten, A. Jain, R. Notestine, A. Gamst, et al., Charting the complete elastic properties of inorganic crystalline compounds, Sci. Data 2 (2015).
  • 13. Atomate contains a library of simulation procedures. VASP-based: band structure, spin-orbit coupling, hybrid functional calcs, elastic tensor, piezoelectric tensor, Raman spectra, NEB, GIBBS method, QH thermal expansion, AIMD, ferroelectric, surface adsorption, work functions, NMR spectra*, Bader charges*, magnetic orderings*, SCAN functionals*. Other: BoltzTraP, FEFF method, Q-Chem*. (* = added / major updates in past year.) Mathew, K. et al. Atomate: A high-level interface to generate, execute, and analyze computational materials science workflows, Comput. Mater. Sci. 139 (2017) 140–152.
  • 14. Full operation diagram: structure → workflow (job 1, job 2, job 3, job 4) → database of all workflows → automatically submit + execute → output files + database.
  • 15. A web-based interface is in progress to give atomate users a “personal Materials Project” of their own calculations.
  • 16. Atomate now powers the Materials Project: an online resource of density functional theory simulation data for ~85,000 inorganic materials; includes band structures, elastic tensors, piezoelectric tensors, battery properties, and more; >75,000 registered users; free; www.materialsproject.org. Jain et al. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
  • 17. Getting started with atomate. Paper: Mathew, K. et al. Atomate: A high-level interface to generate, execute, and analyze computational materials science workflows. Comput. Mater. Sci. 139, 140–152 (2017). Docs: hackingmaterials.github.io/atomate. Support: https://groups.google.com/forum/#!forum/atomate
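Slide 12's idea, a minimal request expanded into a dependency-ordered series of tasks, can be illustrated with a small sketch in plain Python. This does not use atomate or FireWorks: the function `elastic_workflow`, the task names, and the number of deformations are hypothetical placeholders, and the ordering comes from the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def elastic_workflow(formula, n_deformations=3):
    """Expand a one-line request into tasks mapped to their prerequisites.

    Task names are illustrative placeholders, not atomate's own.
    """
    optimize = f"{formula}: optimize structure"
    deformations = [
        f"{formula}: deformation {i}" for i in range(1, n_deformations + 1)
    ]
    fit = f"{formula}: fit elastic tensor"

    # Each key is a task; each value is the set of tasks it must wait for.
    tasks = {optimize: set()}
    for name in deformations:
        tasks[name] = {optimize}
    tasks[fit] = set(deformations)
    return tasks


def execution_order(tasks):
    """Return one valid order in which the tasks could be run."""
    return list(TopologicalSorter(tasks).static_order())


workflow = elastic_workflow("GaAs")
order = execution_order(workflow)
for step in order:
    print(step)
```

The point of the sketch is only the shape of the problem: the researcher supplies a formula, and the dependency graph (rather than a hand-written job script) decides what runs when.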
  • 19. Rocketsled uses adaptive design to suggest the best computations to optimize some metric. With atomate/FireWorks, the user must decide which calculations to perform (e.g., which materials to calculate). Rocketsled is an extension to FireWorks that lets the computer decide what the next best calculation is based on the results of previous calculations. It works for materials design or any other “inverse computational problem”.
  • 20. Given a search domain, Rocketsled uses an optimization engine to select calculations and submit them to supercomputers. The optimization engine includes 4 built-in regressors (e.g., RandomForest, Gaussian Process) and 5 acquisition functions (e.g., Expected Improvement), and can bootstrap uncertainty estimates. Or use your own!
  • 21. Results of using optimization can be dramatic! In the problem of finding materials with high bulk modulus K and high shear modulus G for superhard materials (7394 possibilities), Rocketsled finds solutions ~30-60X faster than randomly computing the space. It can use pure ML approaches or matminer featurizations for materials science (the latter helps give such good performance).
  • 22. Even after just 200 calculations of the 7394 possibilities, all solutions are almost certain to be found with Rocketsled.
  • 23. Getting started with rocketsled • Paper: Dunn, A.R., et al. Rocketsled: a software library for optimizing high-throughput computational searches. J. Phys. Mater. https://doi.org/10.1088/2515-7639/ab0c3d • Docs: hackingmaterials.github.io/rocketsled • Support: https://groups.google.com/forum/#!forum/fireworkflows
  • 25. What is needed to do machine learning on materials? • How can we represent chemistry and structure as vectors? • How do we get enough output data for training?
  • 26. Matminer connects materials data with data mining algorithms and data visualization libraries • Reference: Ward, L. et al. Matminer: An open source toolkit for materials data mining. Comput. Mater. Sci. 152, 60–69 (2018).
  • 27. Matminer contains a library of descriptors for various materials science entities • >60 featurizer classes can generate thousands of potential descriptors that are described in the literature • Usage pattern: feat = EwaldEnergy([options]); y = feat.featurize([input_data]) • Featurizers are compatible with scikit-learn pipelining • They automatically deploy multiprocessing to parallelize over data • They include citations to methodology papers
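To make the "construct a featurizer, then call featurize" pattern concrete, here is a toy composition featurizer written with the standard library only. It is not matminer code: the class name, the three descriptors, and the four-element lookup table are assumptions chosen for illustration, and real matminer featurizers draw on far larger property tables. The point is only that a material is turned into a fixed-length numeric vector with matching labels.

```python
import re

# Small lookup table (illustrative subset): atomic number and Pauling electronegativity.
ELEMENT_DATA = {
    "O": (8, 3.44),
    "Ti": (22, 1.54),
    "Ga": (31, 1.81),
    "As": (33, 2.18),
}


def parse_formula(formula):
    """Split a simple formula such as 'TiO2' into {element: amount}."""
    amounts = {}
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(count) if count else 1.0)
    return amounts


class ToyCompositionFeaturizer:
    """Hypothetical featurizer exposing featurize() and feature_labels()."""

    def feature_labels(self):
        return ["mean_Z", "mean_electronegativity", "range_electronegativity"]

    def featurize(self, formula):
        amounts = parse_formula(formula)
        total = sum(amounts.values())
        fractions = {el: n / total for el, n in amounts.items()}
        # Composition-weighted averages of the tabulated element properties.
        mean_z = sum(f * ELEMENT_DATA[el][0] for el, f in fractions.items())
        mean_en = sum(f * ELEMENT_DATA[el][1] for el, f in fractions.items())
        en_values = [ELEMENT_DATA[el][1] for el in fractions]
        return [mean_z, mean_en, max(en_values) - min(en_values)]


feat = ToyCompositionFeaturizer()
gaas_vector = feat.featurize("GaAs")
tio2_vector = feat.featurize("TiO2")
```

Each call returns one row of a feature matrix, so a list of formulas becomes a table that any regression or classification library can consume.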
  • 28. Interactive Jupyter notebooks demonstrate use cases • https://github.com/hackingmaterials/matminer_examples • Many examples available: • Retrieving data from various databases • Predicting bulk / shear modulus • Predicting formation energies: – from composition alone – with Voronoi-based structure features included – with Coulomb matrix and Orbital Field matrix descriptors (reproducing previous studies in the literature) • Making interactive visualizations • Creating an ML pipeline
  • 29. Getting started with matminer • Paper: Ward et al. Matminer: An open source toolkit for materials data mining. Computational Materials Science, 152, 60–69 (2018). • Docs: hackingmaterials.github.io/matminer • Support: https://groups.google.com/forum/#!forum/matminer
  • 31. Typically, several steps of machine learning are performed by a human researcher. Can these be automated? • Descriptors are developed and chosen by a researcher • The ML model is developed and chosen by a researcher • Why can’t we just give the computer some raw input data (compositions, crystal structures) and output properties, and get back an ML model?
  • 32. Automatminer develops an ML model automatically given raw data (structures or compositions plus output properties) • Featurization: MagPie, SOAP, Sine Coulomb Matrix, + many, many more • Data cleaning: missing value imputation, scaling, one-hot encoding • Feature reduction: PCA-based, correlation, Relief-based (MultiSURF) • Model selection: uses genetic algorithms to find the best machine learning model + hyperparameters
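Two of the preprocessing steps named on this slide, missing value imputation and correlation-based feature reduction, are simple enough to sketch directly. The code below is a standard-library illustration under stated assumptions, not automatminer's implementation: mean imputation and a 0.95 Pearson threshold are choices made for the example, and the small feature table is invented.

```python
import math


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def pearson(a, b):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        return 0.0
    return cov / math.sqrt(va * vb)


def drop_correlated(rows, labels, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept."""
    columns = [list(col) for col in zip(*rows)]
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    reduced = [[r[j] for j in kept] for r in rows]
    return reduced, [labels[j] for j in kept]


# Hypothetical feature table: column "b" is 2 * "a", so it carries no new information.
labels = ["a", "b", "c"]
raw = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, None],
    [3.0, 6.0, 1.0],
    [4.0, 8.0, 4.0],
]
clean = impute_mean(raw)
reduced, kept_labels = drop_correlated(clean, labels)
```

Here the missing entry in column "c" is filled with the column mean, and column "b" is dropped because it duplicates "a". The reduced table is what a downstream model search would then train on.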
  • 33. We are benchmarking automatminer against the current state of the art on 11 problems intended to be a standard test set • Datasets (target(s); samples): • Elastic Tensor: KVRH (GPa), GVRH (GPa); 10,987 • Dielectric Tensor: refractive index; 4,765 • JARVIS 2D: exfoliation energy (meV/atom); 636 • Materials Project phonons: highest LO phonon frequency (last PhDOS peak); 1,265 • Materials Project (stable): band gap (eV), is metallic? (classification); 106,113 • Perovskites: formation energy (eV/atom); 18,928 • Experimental Band Gaps: is metallic? (classification); 6,354 • Experimental Metallic Glasses: glass forms? (classification); 7,190 • Materials Project (all): formation energy (eV/atom); 132,752
  • 34. Usually, automatminer does very well • Usually, automatminer outperforms both state-of-the-art graph-based models AND human-generated models! But …
  • 35. Graph-based approaches work better in some problems • Hypothesis: automatminer approaches are better for smaller data sets, graph-based approaches are better for larger data sets • Unfortunately, it can be difficult to train some of the graph models on large data sets, particularly without GPUs, so the results are not in yet!
  • 36. Getting started with automatminer • Paper: In preparation … • Docs: hackingmaterials.github.io/automatminer • Support: https://groups.google.com/forum/#!forum/matminer
  • 38. Robocrystallographer is a computer program that can analyze crystal structures and describe them as text
  • 39. Example of fully automated robocrystallographer output
  • 40. Example of fully automated robocrystallographer output • GaAs is zincblende structured and crystallizes in the cubic F-43m space group. Ga3+ is bonded to four equivalent As3– atoms to form corner-sharing GaAs4 tetrahedra. All Ga–As bond lengths are 2.49 Å. As3– is bonded in a tetrahedral geometry to four equivalent Ga3+ atoms.
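The final step of such a program, turning already-computed structural facts into sentences, can be illustrated with a small template function. This is a toy sketch and not robocrystallographer's method or API: the dictionary keys and the function name are hypothetical, and the hard part (symmetry analysis, bonding, and polyhedra detection) is assumed to have been done upstream. It reproduces the first two sentences of the GaAs description above, with the space group written as F-43m and plain hyphens in the charges.

```python
NUMBER_WORDS = {2: "two", 3: "three", 4: "four", 6: "six", 8: "eight"}


def describe_structure(info):
    """Turn a dictionary of precomputed structural facts into sentences."""
    sentences = [
        f"{info['formula']} is {info['prototype']} structured and crystallizes "
        f"in the {info['crystal_system']} {info['space_group']} space group."
    ]
    for site in info["sites"]:
        # Spell out small coordination numbers; fall back to digits otherwise.
        count = NUMBER_WORDS.get(site["n_neighbors"], str(site["n_neighbors"]))
        sentence = (
            f"{site['species']} is bonded to {count} equivalent "
            f"{site['neighbor']} atoms"
        )
        if "polyhedra" in site:
            sentence += f" to form {site['polyhedra']}"
        sentences.append(sentence + ".")
    return " ".join(sentences)


# Hypothetical input: facts that a structure-analysis step would supply.
gaas = {
    "formula": "GaAs",
    "prototype": "zincblende",
    "crystal_system": "cubic",
    "space_group": "F-43m",
    "sites": [
        {
            "species": "Ga3+",
            "n_neighbors": 4,
            "neighbor": "As3-",
            "polyhedra": "corner-sharing GaAs4 tetrahedra",
        },
    ],
}
description = describe_structure(gaas)
```

Separating the analysis (which produces the dictionary) from the wording (which consumes it) is one way to keep the generated text consistent across many structures.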
  • 41. Example of fully automated robocrystallographer output
  • 42. Example of fully automated robocrystallographer output • BiOCuSe is parent of FeAs superconductors structured and crystallizes in the tetragonal P4/nmm space group. The structure is two-dimensional and consists of one BiO sheet oriented in the (0, 0, 1) direction and one CuSe sheet oriented in the (0, 0, 1) direction. In the BiO sheet, Bi3+ is bonded in a 4-coordinate geometry to four equivalent O2– atoms. All Bi–O bond lengths are 2.35 Å. O2– is bonded in a tetrahedral geometry to four equivalent Bi3+ atoms. In the CuSe sheet, Cu1+ is bonded to four equivalent Se2– atoms to form a mixture of edge and corner-sharing CuSe4 tetrahedra. All Cu–Se bond lengths are 2.52 Å. Se2– is bonded in a 4-coordinate geometry to four equivalent Cu1+ atoms.
  • 43. Robocrystallographer is integrated into the Materials Project • Click the robot icon for Robocrys • Click the speaker icon to have it talk to you • Example output: TiO2 is Rutile structured and crystallizes in the tetragonal P4_2/mnm space group. The structure is three-dimensional. Ti4+ is bonded to six equivalent O2- atoms to form a mixture of corner and edge-sharing TiO6 octahedra. The corner-sharing octahedral tilt angles are 49°. There are four shorter (1.96 Å) and two longer (2.00 Å) Ti–O bond lengths. O2- is bonded in a distorted trigonal planar geometry to three equivalent Ti4+ atoms.
  • 44. Getting started with robocrystallographer • Paper: Submitted - waiting for referee report!! • Docs: hackingmaterials.github.io/robocrystallographer • Support: Alex Ganose, aganose@lbl.gov
  • 45. Conclusion: hopefully you’ve found something interesting or useful for your own work! • High-throughput computing and simulations • Machine learning • Interpretable crystal structure representations
  • 46. Acknowledgements • Lead developers: – Atomate: Kiran Mathew – Rocketsled: Alex Dunn – Matminer: Logan Ward – Automatminer: Alex Dunn – Robocrystallographer: Alex Ganose • And the dozens of other developers who have contributed to these packages or reported issues! • Funding: U.S. Department of Energy, Basic Energy Sciences, Early Career Award • Additional funding from the DOE-funded Materials Project • Slides (already) posted to hackingmaterials.lbl.gov