Materials design is a grand challenge of materials science, and the main approach to it is still intuition-based. This approach is costly in both time and money: running the experiments and characterizing the products takes months to years. Any model that can be applied at the very first stage of materials design to narrow the search space is therefore a helpful tool for synthetic chemists. An automated search over the entire chemical space for materials with human-defined target properties, i.e. inverse materials design, is likewise highly desirable.
At the same time, de novo design is not an entirely new task in the development of organic molecules with target properties. Many generative approaches are used there alongside screening of libraries of existing molecules, whether to search for drugs for a particular target or to generate new molecules from a very simple initial structure.
Here we present a new approach for generating materials with desired properties. We use an autoencoder neural network architecture to encode a material's composition and crystal structure as a vector in a latent space. Any Quantitative Structure-Property Relationship (QSPR) model built on this vector can then be interpreted as a function on the latent space and used to predict properties of existing materials as well as hypothetical ones. For predicting quantities such as energies or charges, this approach has accuracy comparable to classical computational methods such as DFT, while requiring significantly less computational time.
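The idea of treating a QSPR model as a function on the latent space can be illustrated with a minimal sketch. It assumes a hypothetical linear property model f(z) = w·z + b over a 3-dimensional latent vector; the weights, the dimension, the step size, and the function names are illustrative and not taken from the work presented here. Gradient descent on the squared error (f(z) - target)^2 moves a latent vector toward a requested property value; in a full workflow a decoder would then map that vector back to a composition and crystal structure.

```python
# Hypothetical linear QSPR model on a latent space.
# The weights and bias are made up for illustration only.
W = [0.5, -1.0, 2.0]
B = 0.1


def predict_property(z):
    """Property predicted for latent vector z: f(z) = w . z + b."""
    return sum(w_i * z_i for w_i, z_i in zip(W, z)) + B


def search_latent(z_start, target, lr=0.05, steps=200):
    """Gradient descent on (f(z) - target)^2 with respect to z."""
    z = list(z_start)
    for _ in range(steps):
        error = predict_property(z) - target
        # d/dz_i (f(z) - target)^2 = 2 * error * w_i
        z = [z_i - lr * 2.0 * error * w_i for z_i, w_i in zip(z, W)]
    return z


# Start from the origin of the latent space and ask for a property value of 3.0.
z_found = search_latent([0.0, 0.0, 0.0], target=3.0)
```

With a nonlinear QSPR model the same loop would need gradients from an automatic differentiation library, and the search could end in a local minimum, so the linear case here is only the simplest instance of the idea.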
The proposed method was tested on generating superhard materials, but it can easily be extended to any target property, provided that a database of materials properties is available for training.
1. In silico design of new functional materials
Vadim Korolev, Artem Mitrofanov, Artem Eliseev, Boris Sattarov, Valery Tkachenko
Science Data Software, LLC
Lomonosov Moscow State University
13. Estimator, or why do we need one more database?
• Experimental check:
  • Composition
  • Methods
• DFT calculator:
  • xyz
  • Method
  • Basis set
• Machine learning model:
  • Vectorization
  • Method
  • Training dataset
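One way to read this slide is that every stored property value should carry a description of how it was obtained. The sketch below models that as a small data structure. The class and field names are hypothetical, derived only from the bullet labels above, and the example values are placeholders rather than data from the presentation.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class ExperimentalCheck:
    composition: str
    methods: Tuple[str, ...]


@dataclass(frozen=True)
class DFTCalculator:
    xyz: str        # geometry used for the calculation, e.g. a path to an xyz file
    method: str
    basis_set: str


@dataclass(frozen=True)
class MLModel:
    vectorization: str
    method: str
    training_dataset: str


# An estimator is any of the three provenance types named on the slide.
Estimator = Union[ExperimentalCheck, DFTCalculator, MLModel]


@dataclass(frozen=True)
class PropertyRecord:
    property_name: str
    value: float
    unit: str
    estimator: Estimator


def estimator_kind(record: PropertyRecord) -> str:
    """Name of the provenance type attached to a record."""
    return type(record.estimator).__name__


# Placeholder record: none of these values come from the presentation.
record = PropertyRecord(
    property_name="example_property",
    value=1.0,
    unit="arbitrary",
    estimator=DFTCalculator(xyz="structure.xyz", method="placeholder-functional",
                            basis_set="placeholder-basis"),
)
```

Keeping the estimator as a separate typed field makes it possible to filter a database by provenance, for example to train only on experimentally checked values.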
18. Decoder
• VAE decoder
• XRD to xyz
| Crystal system | Total | MAE, Å (raw spectra) | MAE, Å (decoded spectra) | Good (raw spectra) | Good (decoded spectra) |
|---|---|---|---|---|---|
| hexagonal | 414 | a,b: 2.1; c: 8.0 | a,b: 1.7 (22%); c: 9.7 (77%) | 175 | 134 |
| tetragonal | 530 | a,b: 30; c: 30 | a,b: 2.4 (30%); c: 7.6 (68%) | 196 | 121 |
| cubic | 238 | a,b,c: 5 | a,b,c: 1.5 (16%) | 184 | 163 |
| all | 1233 | a,b: 2.1; c: 31 | 2.4 (24%) | | |
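The mean absolute errors (MAE) reported on this slide compare predicted lattice parameters with reference values. The sketch below shows how such a number, and a count of "good" predictions, could be computed. The slide does not define what counts as "good", so the absolute tolerance used here is an assumption, and the lattice parameters are made-up values for illustration only.

```python
def mean_absolute_error(predicted, reference):
    """Average of |predicted - reference| over all structures."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def count_good(predicted, reference, tolerance):
    """Number of predictions within an absolute tolerance (assumed criterion)."""
    return sum(1 for p, r in zip(predicted, reference) if abs(p - r) <= tolerance)


# Made-up lattice parameter c (in Å) for four structures.
c_reference = [5.0, 6.0, 7.5, 9.0]
c_predicted = [5.5, 5.0, 7.5, 11.0]

mae_c = mean_absolute_error(c_predicted, c_reference)          # (0.5 + 1.0 + 0.0 + 2.0) / 4
good_c = count_good(c_predicted, c_reference, tolerance=1.0)   # errors of 0.5, 1.0, 0.0 pass
```

For hexagonal and tetragonal cells the same functions would be applied separately to the a,b and c parameters, matching the way the errors are split on the slide.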