The development of first-principles methods represents a high point of computational materials science and molecular modeling, and first-principles software packages embody the theories and algorithms accumulated in this field. In this paper, we report our recent progress in refactoring the first-principles package BSTATE. The refactoring had three goals: lowering the barrier to entry, extending the scope of application, and adapting the package to current mainstream computer hardware. To these ends, we migrated the build system from Makefile to CMake, which provides a GUI and can configure many math libraries automatically; we added support for the Libxc library, which includes a large number of density functionals; and we updated the interfaces to support GPUs and therefore heterogeneous computing systems. After refactoring, BSTATE can be built with either the Makefile or the CMake system, and the Fourier transform libraries (FFTW2, FFTW3, and cuFFTW), the math libraries (Intel MKL and OpenBLAS), and the density functional library (Libxc) can be assigned automatically or manually. Integrating FFTW3 slightly improves computational efficiency on Intel's Many Integrated Core (MIC) architecture, and integrating cuFFTW provides initial support for the graphics processing unit (GPU) architecture. With Libxc, BSTATE can use hundreds of density functionals; the use of various functionals is demonstrated by calculating the density of states of GaAs. Beyond the integration of these libraries, we also investigated the parallel performance of BSTATE. The Fourier transforms and the solution of the eigenvalue equations account for the largest share of the computational cost. Using the Tuning and Analysis Utilities (TAU) tool, we found that the tasks are well distributed on modern HPC clusters, which implies that the refactoring did not affect the parallel efficiency of the original BSTATE package. In a subsequent benchmark on graphene fragments, the refactored BSTATE package showed the best performance: its FFTW3 & Libxc version is about 0 to 17% faster than the FFTW2 version.
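To give a sense of what a single density functional evaluates, the following minimal sketch computes the Dirac (Slater) local-density exchange for a spin-unpolarized uniform electron gas in Hartree atomic units. It is a pure-Python illustration only: it does not call Libxc and is not taken from BSTATE, and the function name `lda_exchange` is hypothetical.

```python
import math


def lda_exchange(n):
    """Dirac/Slater LDA exchange for electron density n (atomic units).

    Returns (e_x, v_x):
      e_x = -(3/4) * (3/pi)**(1/3) * n**(1/3)  exchange energy per electron
      v_x = (4/3) * e_x                        exchange potential
    Assumes a spin-unpolarized density; n must be non-negative.
    """
    if n < 0:
        raise ValueError("electron density must be non-negative")
    c_x = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)
    e_x = -c_x * n ** (1.0 / 3.0)
    v_x = (4.0 / 3.0) * e_x
    return e_x, v_x


if __name__ == "__main__":
    # Evaluate the functional on a few sample densities.
    for density in (0.01, 0.1, 1.0):
        e_x, v_x = lda_exchange(density)
        print(f"n = {density:5.2f}  e_x = {e_x:.6f}  v_x = {v_x:.6f}")
```

A functional library such as Libxc packages many such formulas behind one interface, which is why linking it gives a code access to a large set of functionals without reimplementing each one.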
Keywords:
- first-principles computational software
- density-functional theory
- Beijing Simulation Tool for Atom TEchnology
- software refactoring
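Table 1. Comparison of the BSTATE build system before and after refactoring.

| Item | Before refactoring | After refactoring |
| --- | --- | --- |
| Build system | GNU Make | CMake |
| Graphical interface (GUI) | Not supported | Supported |
| Cross-platform | Supported by manually editing the Makefile | Native support |
| Math libraries | Manual configuration | Automatic configuration |
| External libraries | Manual configuration | Automatic configuration supported |
| Heterogeneous support | No | Yes |
| Advanced build options | Manual configuration | Configurable via GUI |
| Multithreaded compilation | Not supported | Supported |
| Barrier to entry for users | High | Low |

Table 2. Benchmarks of BSTATE with and without Libxc.

| Item | Single core (s) | Multi-core (s) |
| --- | --- | --- |
| BSTATE | 42.4 | 22.6 |
| BSTATE+Libxc | 43.6 | 23.4 |
| Performance ratio | 0.97 | 0.97 |

* Test machine: AMD A10 PRO-7800B R7 (4 cores); GaAs system.

Table 3. Benchmarks of the FFTW2 version and the FFTW3+Libxc version.

| Item | CPU platform (s) | MIC platform (s) |
| --- | --- | --- |
| FFTW2 | 1181 | 1717 |
| FFTW3+Libxc | 1179 | 1593 |
| Performance ratio | 1.00 | 1.08 |

* CPU platform: Intel Xeon E7-4830 v3 (56 cores); graphene system.
* MIC platform: Intel Xeon Phi 7210 (64 cores); graphene system.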
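The performance ratios in Tables 2 and 3 are consistent with dividing the wall time of the original version by the wall time of the modified version, so a value above 1 means the modified version is faster. The short sketch below reproduces them from the tabulated timings; the helper name `performance_ratio` and the dictionary layout are assumptions made for illustration.

```python
def performance_ratio(t_original, t_modified):
    """Ratio of original to modified wall time (>1 means the modified build is faster)."""
    return t_original / t_modified


# Wall times in seconds, copied from Tables 2 and 3: (original, modified).
timings = {
    "Table 2, single core": (42.4, 43.6),
    "Table 2, multi-core": (22.6, 23.4),
    "Table 3, CPU platform": (1181.0, 1179.0),
    "Table 3, MIC platform": (1717.0, 1593.0),
}

if __name__ == "__main__":
    for label, (t_orig, t_mod) in timings.items():
        print(f"{label}: {performance_ratio(t_orig, t_mod):.2f}")
```

Read this way, adding Libxc costs roughly 3% on the GaAs test, while the FFTW3+Libxc build is about 8% faster than the FFTW2 build on the MIC platform for the graphene test.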