Numerical solution of three-dimensional time-dependent Schrödinger equation based on graphic processing unit acceleration

Tang Fu-Ming, Liu Kai, Yang Yi, Tu Qian, Wang Feng, Wang Zhe, Liao Qing
    Theoretical studies of the interaction of intense laser fields with atoms and molecules rely heavily on numerical solutions of the time-dependent Schrödinger equation (TDSE). Solving the three-dimensional TDSE is not a simple task: an analytical solution cannot be obtained, so the equation has to be solved numerically, and parallel methods are needed to shorten the computing time. In this paper, in the context of strong-field ionization, the three-dimensional TDSE of the hydrogen atom is solved in parallel, taking above-threshold ionization of hydrogen in a linearly polarized infrared laser field as an example. In spherical polar coordinates, the TDSE is discretized with the split-operator Fourier transform method, which yields the photoelectron continuum-state wave function in the length gauge. In graphics processing unit (GPU) accelerated applications, the sequential portion of the workload runs on the central processing unit (CPU), which is optimized for single-threaded performance, while the compute-intensive portion runs in parallel on thousands of GPU cores. The multi-threaded structure of the GPU lets it exploit fine-grained parallelism and accelerate the whole algorithm. Two accelerated computing modes, CPU parallel and GPU parallel, are adopted, and their parallel performance is discussed. The calculation error is within an acceptable range, and the results are consistent with the known physics of above-threshold ionization, which verifies the correctness of the program.
    To obtain a reliable speedup ratio, a series of computational experiments is carried out. They show that, with accuracy preserved, GPU parallel computing achieves a speedup of up to about 60 relative to the CPU. Solving the three-dimensional TDSE with GPU acceleration can therefore significantly shorten the computation time. This work provides useful guidance for solving the three-dimensional TDSE quickly on GPUs.
      Corresponding author: Liao Qing, liaoqing@wit.edu.cn
    • Funds: Project supported by the National Natural Science Foundation of China (Grant Nos. 11674257, 11604248, 11874019, 11947096) and the Program for Distinguished Middle-aged and Young Innovative Research Team in Higher Education of Hubei Province, China (Grant No. T201806)
    Fig. 1.  Flowchart of data transmission.

    Fig. 2.  Speedup ratio as a function of angular quantum number.

    Fig. 3.  Speedup ratio as a function of the number of radial grid points.

    Fig. 4.  Speedup ratio as a function of matrix size.

    Fig. 5.  Speedup ratio as a function of matrix size.

    Fig. 6.  Photoelectron final-state momentum distributions of the hydrogen atom: (a) CPU result; (b) GPU result.

    Table 1.  TDSE algorithm steps.

    Algorithm   $\varPhi(t + \Delta t) = \mathrm{e}^{-\mathrm{i}H(t)\Delta t}\varPhi(t)$
     Input: $f_l(r_i, t)$
     Output: $f_l(r_i, t)$
     1. for n do
     2.   for l do
     3.     $f_l(r_i, t) = \mathrm{ifft}\left(\mathrm{diag}\left(\mathrm{e}^{-\mathrm{i}\frac{\Delta t}{2}\frac{k^2}{2}}\right) \cdot \mathrm{fft}\left(f_l(r_i, t)\right)\right)$
     4.   end for
     5.   for i and l do
     6.     $f_l(r_i, t) = \mathrm{e}^{-\mathrm{i}\frac{\Delta t}{2}\left[\frac{l(l + 1)}{2 r_i^2} - \frac{1}{r_i}\right]} \cdot f_l(r_i, t)$
     7.   end for
     8.   for i and j do
     9.     $\varPhi(r_i, x_j, t) = \sum\limits_{l = 0}^{L} f_l(r_i, t) P_l(x_j)$
     10.  end for
     11.  for i and j do
     12.    $\varPhi(r_i, x_j, t) = \mathrm{e}^{\mathrm{i}\Delta t E(n) r_i x_j} \cdot \varPhi(r_i, x_j, t)$
     13.  end for
     14.  for i and l do
     15.    $f_l(r_i, t) = \sum\limits_{j = 1}^{L + 1} w_j P_l(x_j) \varPhi(r_i, x_j, t)$
     16.  end for
     17.  for i and l do
     18.    $f_l(r_i, t) = \mathrm{e}^{-\mathrm{i}\frac{\Delta t}{2}\left[\frac{l(l + 1)}{2 r_i^2} - \frac{1}{r_i}\right]} \cdot f_l(r_i, t)$
     19.  end for
     20.  for l do
     21.    $f_l(r_i, t) = \mathrm{ifft}\left(\mathrm{diag}\left(\mathrm{e}^{-\mathrm{i}\frac{\Delta t}{2}\frac{k^2}{2}}\right) \cdot \mathrm{fft}\left(f_l(r_i, t)\right)\right)$
     22.  end for
     23. end for
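
    The steps of Table 1 amount to a symmetric factorization of the one-step propagator, $U_T U_V U_E U_V U_T$, where $U_T$ is the kinetic half step applied in Fourier space (steps 3 and 21), $U_V$ is the centrifugal-plus-Coulomb half step (steps 6 and 18), and $U_E$ is the full-step field factor applied on the angular grid (steps 9 to 15). The sketch below is one possible NumPy rendering of a single time step on the CPU; it is not the authors' GPU code. The names `make_grids` and `tdse_step`, the grid sizes, the time step and the field value are all hypothetical. It also assumes that $P_l$ denotes Legendre polynomials normalized so that $\sum_j w_j P_l(x_j) P_{l'}(x_j) = \delta_{ll'}$ on the Gauss-Legendre nodes, which makes steps 9 and 15 exact inverses of each other, and it uses a plain periodic FFT in $r$ with no absorbing boundary.

```python
import numpy as np
from numpy.polynomial import legendre


def make_grids(n_r=256, l_max=7, dr=0.2):
    """Build the radial, momentum and angular grids (sizes are hypothetical)."""
    r = dr * (np.arange(n_r) + 1.0)               # r_i > 0 avoids the 1/r singularity
    k = 2.0 * np.pi * np.fft.fftfreq(n_r, d=dr)   # wavenumbers matching np.fft.fft
    x, w = legendre.leggauss(l_max + 1)           # Gauss-Legendre nodes x_j and weights w_j
    l = np.arange(l_max + 1)
    # Assumption: orthonormal polynomials, P[l, j] = sqrt((2l+1)/2) * P_l(x_j)
    P = legendre.legvander(x, l_max).T * np.sqrt((2 * l + 1) / 2.0)[:, None]
    return r, k, x, w, l, P


def tdse_step(f, E_n, dt, r, k, x, w, l, P):
    """Advance f[l, i] = f_l(r_i, t) by one time step, following Table 1."""
    kin = np.exp(-1j * (dt / 2) * k**2 / 2)
    v_l = l[:, None] * (l[:, None] + 1) / (2 * r[None, :] ** 2) - 1 / r[None, :]
    pot = np.exp(-1j * (dt / 2) * v_l)

    f = np.fft.ifft(kin * np.fft.fft(f, axis=1), axis=1)          # step 3
    f = pot * f                                                    # step 6
    phi = P.T @ f                                                  # step 9: sum over l
    phi = np.exp(1j * dt * E_n * x[:, None] * r[None, :]) * phi    # step 12, sign as in Table 1
    f = (P * w[None, :]) @ phi                                     # step 15: quadrature over j
    f = pot * f                                                    # step 18
    f = np.fft.ifft(kin * np.fft.fft(f, axis=1), axis=1)          # step 21
    return f


# Usage: start from the hydrogen 1s reduced radial function in the l = 0 channel.
r, k, x, w, l, P = make_grids()
f0 = np.zeros((l.size, r.size), dtype=complex)
f0[0] = 2.0 * r * np.exp(-r)
f0 /= np.sqrt(np.sum(np.abs(f0) ** 2))            # normalize on the discrete grid

f = f0.copy()
for n in range(20):                               # outer loop "for n do"
    f = tdse_step(f, E_n=0.05, dt=0.05, r=r, k=k, x=x, w=w, l=l, P=P)
norm_after = np.sum(np.abs(f) ** 2)
```

    Every factor in this sketch is a unit-modulus phase or a unitary transform, so the discrete norm should stay at 1 up to rounding, which is a convenient sanity check. The FFTs and the two small matrix products act independently on each row or column, which is the kind of fine-grained parallelism the abstract refers to.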

    Table 2.  Computation time of CPU and GPU for different angular quantum numbers.

    Angular quantum number L    Computation time/s
                                CPU          GPU
    4                           2164.309     159.368
    9                           4120.602     164.418
    19                          7922.537     205.440
    39                          17682.308    378.104
    79                          36774.347    757.198
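
    The speedup ratio plotted against angular quantum number is simply the CPU time divided by the GPU time. As a small worked example, the snippet below recomputes it from the Table 2 entries; the variable names are illustrative only.

```python
# (L, CPU seconds, GPU seconds), copied from Table 2
table2 = [
    (4, 2164.309, 159.368),
    (9, 4120.602, 164.418),
    (19, 7922.537, 205.440),
    (39, 17682.308, 378.104),
    (79, 36774.347, 757.198),
]

# Speedup ratio = CPU time / GPU time for each angular quantum number
speedups = {L: cpu / gpu for L, cpu, gpu in table2}

for L, s in speedups.items():
    print(f"L = {L:2d}: speedup = {s:.1f}")
```

    For these entries the ratio grows with L and levels off just below 50 at L = 79, since the GPU time barely changes for small L while the CPU time roughly doubles from row to row.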

    Table 3.  Computation time of CPU and GPU for different numbers of radial grid points.

    Radial grid points R    Computation time/s
                            CPU          GPU
    $2^{12}$                1118.348     148.302
    $2^{13}$                1871.128     154.614
    $2^{14}$                3846.120     160.763
    $2^{15}$                7922.537     205.440
    $2^{16}$                16862.467    354.554

    Table 4.  Computation time of CPU and GPU for different matrix sizes.

    Matrix size            Computation time/s
                           CPU          GPU
    5 × $2^{12}$           199.158      149.895
    10 × $2^{13}$          965.276      166.039
    20 × $2^{14}$          3846.120     160.763
    40 × $2^{15}$          17682.308    378.104
    80 × $2^{16}$          74761.695    1524.669

    Table 5.  Computation time of CPU and GPU for different matrix sizes.

    Matrix size            Computation time/s
                           CPU           GPU
    5 × $2^{12}$           437.584       315.448
    10 × $2^{13}$          2075.667      463.183
    20 × $2^{14}$          9252.539      629.088
    40 × $2^{15}$          40617.723     814.985
    80 × $2^{16}$          182135.643    3024.669
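
    The figure of "up to about 60" quoted in the abstract can be checked against the Table 5 entries. The snippet below is a minimal sketch with illustrative variable names: it computes the speedup ratio for each matrix size and picks out the largest one.

```python
# ((rows, log2 of columns), CPU seconds, GPU seconds), copied from Table 5
table5 = [
    ((5, 12), 437.584, 315.448),
    ((10, 13), 2075.667, 463.183),
    ((20, 14), 9252.539, 629.088),
    ((40, 15), 40617.723, 814.985),
    ((80, 16), 182135.643, 3024.669),
]

# (number of matrix elements, speedup ratio) for each row
results = [(rows * 2**p, cpu / gpu) for (rows, p), cpu, gpu in table5]

# The largest speedup and the matrix size at which it occurs
best_size, best_speedup = max(results, key=lambda item: item[1])
print(f"largest speedup {best_speedup:.1f} at {best_size} matrix elements")
```

    With these numbers the largest ratio occurs for the 80 × $2^{16}$ matrix and is slightly above 60, while the smallest matrix gains less than a factor of 2, so the benefit of the GPU depends strongly on problem size.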
Publication history
  • Received:  2020-05-11
  • Revised:  2020-07-10
  • Available online:  2020-11-26
  • Published:  2020-12-05
