Band gap prediction of Janus transition metal chalcogenides based on a deep-learning atomic feature representation method

孙涛; 袁健美

doi:10.7498/aps.72.20221374

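Summary

With the development of artificial intelligence, machine learning is increasingly used in computational materials science. The first requirement for applying machine learning to tasks such as material property prediction is an effective representation of material features. This paper adopts an atomic feature representation method, studies a low-dimensional, dense, distributed atomic feature vector, and applies it to band gap prediction. A Transformer encoder serves as the model structure; it takes the atom types and atom counts in each chemical formula as input and is trained on a large number of material chemical formulas to extract features of the elements that appear in training. The method is then used to predict the band gaps of two-dimensional Janus transition metal chalcogenides MXY (M is a transition metal; X and Y are different chalcogen elements). The atomic feature vectors obtained by deep learning give a smaller mean absolute prediction error than the traditional Magpie and Atom2Vec methods. Visualization analysis and numerical experiments on material property prediction show that the deep-learning-based atomic feature representation proposed here characterizes material features effectively and can be applied to band gap prediction.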
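To make the step from a chemical formula to a material feature vector concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a material vector is the count-weighted mean of its atomic vectors, a pooling rule the abstract does not specify. The names `parse_formula`, `material_vector`, and `ATOM_VECTORS` are hypothetical, and the 8-dimensional embeddings are random placeholders standing in for vectors that a trained Transformer encoder would supply.

```python
import random
import re


def parse_formula(formula):
    """Split a simple formula such as 'MoSSe' or 'WS2' into element counts."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def material_vector(formula, atom_vectors):
    """Count-weighted mean of atomic vectors (an assumed pooling rule)."""
    counts = parse_formula(formula)
    total = sum(counts.values())
    dim = len(next(iter(atom_vectors.values())))
    pooled = [0.0] * dim
    for symbol, n in counts.items():
        for i, value in enumerate(atom_vectors[symbol]):
            pooled[i] += n / total * value
    return pooled


# Placeholder embeddings: random numbers standing in for learned atomic vectors.
_rng = random.Random(0)
ATOM_VECTORS = {
    element: [_rng.uniform(-1.0, 1.0) for _ in range(8)]
    for element in ["Mo", "W", "S", "Se", "Te"]
}

# Feature vector for a Janus MXY compound with M = Mo, X = S, Y = Se.
features = material_vector("MoSSe", ATOM_VECTORS)
```

The parser handles only flat formulas without parentheses, which is enough for MXY compounds. Other pooling choices, such as concatenating the M, X, and Y vectors, would fit the same interface.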

Abstract

With the development of artificial intelligence, machine learning (ML) is increasingly used in computational materials science. Applying ML to material property prediction first requires an effective representation of material features. In this paper, an atomic feature representation method is used to study a low-dimensional, dense, distributed atomic feature vector, which is applied to band gap prediction in material design. A Transformer encoder is used as the model structure; it takes the types and numbers of atoms in each chemical formula as input and is trained on a large number of chemical formulas to extract features of the elements that appear in training. Cluster analysis of the atomic feature vectors of the main group elements shows that the element features can distinguish element categories. Principal component analysis (PCA) of the same vectors shows that their projection onto the first principal component reflects the number of outermost electrons of each element. These results illustrate the effectiveness of the atomic feature vectors extracted by the Transformer model. The atomic feature representation is then used to represent material features. Three ML methods, random forest (RF), kernel ridge regression (KRR), and support vector regression (SVR), are used to predict the band gaps of two-dimensional Janus transition metal chalcogenides MXY (M is a transition metal; X and Y are different chalcogen elements). The hyperparameters of the ML models are determined by a parameter search. To obtain stable results, the ML models are evaluated by 5-fold cross-validation.
The results from the three ML models show that the mean absolute error of predictions using the deep-learning-based atomic feature vectors is smaller than that obtained with the traditional Magpie method and the Atom2Vec method. With the atomic feature vectors proposed in this paper, the KRR model predicts more accurately than it does with Magpie or Atom2Vec features. This suggests that the proposed atomic feature vector has a certain correlation among its features and is a low-dimensional, dense, distributed feature vector. Visualization analysis and numerical experiments on material property prediction show that the proposed deep-learning-based atomic feature representation characterizes material features effectively and can be applied to band gap prediction.
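The evaluation protocol described above (a hyperparameter search scored by 5-fold cross-validated mean absolute error) can be sketched for one of the three models, kernel ridge regression. This is an illustrative sketch under stated assumptions, not the authors' code: it uses only NumPy with a closed-form RBF-kernel solution, the data are synthetic stand-ins for material feature vectors and band gaps, and the grid values and the names `rbf_kernel`, `krr_fit_predict`, and `cv_mae` are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def krr_fit_predict(X_tr, y_tr, X_te, alpha, gamma):
    """Closed-form kernel ridge regression: solve (K + alpha*I) c = y, then predict."""
    K = rbf_kernel(X_tr, X_tr, gamma)
    coef = np.linalg.solve(K + alpha * np.eye(len(X_tr)), y_tr)
    return rbf_kernel(X_te, X_tr, gamma) @ coef


def cv_mae(X, y, alpha, gamma, n_splits=5, seed=0):
    """Mean absolute error averaged over n_splits shuffled folds."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, n_splits)
    errors = []
    for k in range(n_splits):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_splits) if j != k])
        pred = krr_fit_predict(X[train], y[train], X[test], alpha, gamma)
        errors.append(np.mean(np.abs(pred - y[test])))
    return float(np.mean(errors))


# Synthetic stand-in data: 100 "materials" with 8-dimensional feature vectors.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 8))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.normal(size=100)

# Grid search: score every (alpha, gamma) pair by its 5-fold MAE, keep the best.
grid = [(a, g) for a in (1e-3, 1e-2, 1e-1, 1.0) for g in (0.01, 0.1, 1.0)]
scores = {params: cv_mae(X, y, *params) for params in grid}
best_params = min(scores, key=scores.get)
best_mae = scores[best_params]
```

Because the same folds are used both to choose the hyperparameters and to report the error, `best_mae` is somewhat optimistic; a nested cross-validation or a held-out test set would give a cleaner estimate. The same loop applies to RF or SVR by swapping the fit-and-predict function.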
