Theoretical studies of the interaction between intense laser fields and atoms or molecules rely heavily on numerical solutions of the time-dependent Schrödinger equation (TDSE). The three-dimensional TDSE cannot be solved analytically for these problems, so it must be solved numerically, and parallel methods are needed to keep the computing time manageable. In this paper, the three-dimensional TDSE of the hydrogen atom is solved in parallel in the context of strong-field ionization, taking the above-threshold ionization of hydrogen in a linearly polarized infrared laser field as an example. In spherical polar coordinates, the TDSE is discretized by the split-operator Fourier-transform method, which yields the photoelectron continuum wave function in the length gauge. In graphics processing unit (GPU) accelerated applications, the sequential portion of the workload runs on the central processing unit (CPU), which is optimized for single-threaded performance, while the compute-intensive portion runs in parallel on thousands of GPU cores. The GPU exploits fine-grained parallelism through its multi-threaded architecture to accelerate the whole algorithm. Two acceleration modes, CPU parallel and GPU parallel, are adopted, and their parallel performance is discussed.
The calculated results agree with the known physical laws of above-threshold ionization within an acceptable error, which verifies the correctness of the program. To obtain a reliable speedup ratio, many different numerical experiments are carried out. They show that, with accuracy maintained, GPU parallel computing achieves a speedup of up to about 60 times relative to the CPU. GPU acceleration can therefore significantly shorten the time needed to solve the three-dimensional TDSE numerically. This work provides useful guidance for solving the three-dimensional TDSE rapidly on GPUs.
Keywords:
- three-dimensional time-dependent Schrödinger equation
- strong-field ionization
- parallel computing
[1] Corkum P 1993 Phys. Rev. Lett. 71 1994
[2] Liu K L, Luo S Q, Li M, Li Y, Feng Y D, Du B J, Zhou Y M, Lu P X, Barth I 2019 Phys. Rev. Lett. 122 053202
[3] Lewenstein M, Balcou P, Ivanov M Y, L’Huillier A, Corkum P B 1994 Phys. Rev. A 49 2117
[4] Zhang X F, Zhu X S, Wang D, Li L, Liu X, Liao Q, Lan P F, Lu P X 2019 Phys. Rev. A 99 013414
[5] Gaarde M B, Tate J L, Schafer K J 2008 J. Phys. B 41 132001
[6] Liu K, Qin M Y, Li Q G, Liao Q 2018 Opt. Quantum Electron. 50 364
[7] Liao Q, Li Y, Qin M Y, Lu P X 2017 Phys. Rev. A 96 063408
[8] Liu K, Wang F, Wang Z, Qin M Y, Liao Q 2019 J. Opt. Soc. Am. B 36 2624
[9] Muller H G 1999 Laser Phys. 9 138
[10] Bauer D, Koval P 2006 Comput. Phys. Commun. 174 396
[11] Madsen L B, Nikolopoulos L A A, Kjeldsen T K, Fernández J 2007 Phys. Rev. A 76 063407
[12] Keldysh L V 1964 Sov. Phys. JETP 20 1307
[13] Faisal F H M 1973 J. Phys. B: At. Mol. Opt. Phys. 6 L89
[14] Reiss H R 1980 Phys. Rev. A 22 1786
[15] Gallagher T F 1988 Phys. Rev. Lett. 61 2304
[16] Corkum P B, Burnett N H, Brunel F 1989 Phys. Rev. Lett. 62 1259
[17] Xiao X R, Wang M X, Li M, Geng J W, Liu Y Q, Peng L Y 2016 Acta Phys. Sin. 65 220203 (in Chinese)
[18] Gainullin I 2017 Comput. Phys. Commun. 72 210
[19] Liu Q, Liu F, Hou C 2020 Procedia Computer Sci. 171 312
[20] Penfold T J 2017 Phys. Chem. Chem. Phys. 19 19601
[21] Broin C Ó 2015 Ph.D. Dissertation (Dublin: Dublin City University)
[22] Broin C Ó, Nikolopoulos L A A 2014 Comput. Phys. Commun. 185 1791
[23] Feit M D, Fleck J A, Steiger A 1982 J. Comput. Phys. 47 412
[24] Kjeldsen T K 2007 Ph.D. Dissertation (Aarhus: University of Aarhus)
Table 1. TDSE algorithm steps.
Algorithm: $\varPhi(t + \Delta t) = \mathrm{e}^{-\mathrm{i}H(t)\Delta t}\,\varPhi(t)$
Input: $f_l(r_i, t)$
Output: $f_l(r_i, t)$
1. for n do
2.   for l do
3.     $f_l(r_i, t) = \mathrm{ifft}\left( \mathrm{diag}\left( \mathrm{e}^{-\mathrm{i}\frac{\Delta t}{2}\frac{k^2}{2}} \right) \cdot \mathrm{fft}\left( f_l(r_i, t) \right) \right)$
4.   end for
5.   for i and l do
6.     $f_l(r_i, t) = \mathrm{e}^{-\mathrm{i}\frac{\Delta t}{2}\left[ \frac{l(l+1)}{2 r_i^2} - \frac{1}{r_i} \right]} \cdot f_l(r_i, t)$
7.   end for
8.   for i and j do
9.     $\varPhi(r_i, x_j, t) = \sum\limits_{l=0}^{L} f_l(r_i, t)\, P_l(x_j)$
10.   end for
11.   for i and j do
12.     $\left| \varPhi(r_i, x_j, t) \right\rangle = \mathrm{e}^{\mathrm{i}\Delta t E(n) r_i x_j} \cdot \left| \varPhi(r_i, x_j, t) \right\rangle$
13.   end for
14.   for i and j do
15.     $f_l(r_i, t) = \sum\limits_{j=1}^{L+1} w_j\, P_l(x_j)\, \varPhi(r_i, x_j, t)$
16.   end for
17.   for i and l do
18.     $f_l(r_i, t) = \mathrm{e}^{-\mathrm{i}\frac{\Delta t}{2}\left[ \frac{l(l+1)}{2 r_i^2} - \frac{1}{r_i} \right]} \cdot f_l(r_i, t)$
19.   end for
20.   for l do
21.     $f_l(r_i, t) = \mathrm{ifft}\left( \mathrm{diag}\left( \mathrm{e}^{-\mathrm{i}\frac{\Delta t}{2}\frac{k^2}{2}} \right) \cdot \mathrm{fft}\left( f_l(r_i, t) \right) \right)$
22.   end for
23. end for
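The steps of Table 1 can be sketched as a small CPU reference in pure Python. This is a minimal illustration under stated assumptions, not the authors' GPU code: the function names, the uniform radial grid $r_i = (i + 1/2)\,\Delta r$, the hand-written radix-2 FFT, and the orthonormal Legendre normalization $\sqrt{(2l+1)/2}\,P_l$ are all choices made here so that the example is self-contained. The sign of the laser phase is copied from line 12 of the table as printed.

```python
import cmath
import math


def fft(a, inverse=False):
    """Unnormalized recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2], inverse), fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + tw, even[k] - tw
    return out


def kinetic_half_step(fl, dr, dt):
    """Lines 3 and 21: multiply by exp(-i (dt/2) k^2/2) in k space."""
    n = len(fl)
    spec = fft(fl)
    for m in range(n):
        k = 2 * math.pi * (m if m < n // 2 else m - n) / (n * dr)
        spec[m] *= cmath.exp(-1j * (dt / 2) * k * k / 2)
    return [v / n for v in fft(spec, inverse=True)]


def gauss_legendre(n):
    """Nodes x_j and weights w_j of n-point Gauss-Legendre quadrature (Newton)."""
    nodes, weights = [], []
    for i in range(n):
        x = math.cos(math.pi * (i + 0.75) / (n + 0.5))
        for _ in range(50):
            p0, p1 = 1.0, x
            for k in range(2, n + 1):
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = n * (x * p1 - p0) / (x * x - 1.0)
            x -= p1 / dp
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def legendre_row(lmax, x):
    """Assumed normalization: sqrt((2l+1)/2) P_l(x) for l = 0..lmax."""
    p = [1.0, x]
    for l in range(2, lmax + 1):
        p.append(((2 * l - 1) * x * p[-1] - (l - 1) * p[-2]) / l)
    return [math.sqrt((2 * l + 1) / 2) * p[l] for l in range(lmax + 1)]


def tdse_step(f, r, dr, dt, E):
    """Advance f[l][i] = f_l(r_i) by one time step dt at field strength E."""
    lmax = len(f) - 1
    x, w = gauss_legendre(lmax + 1)
    P = [legendre_row(lmax, xj) for xj in x]  # P[j][l]

    def radial_half(g):  # lines 6 and 18: centrifugal + Coulomb phase
        return [[cmath.exp(-1j * (dt / 2) * (l * (l + 1) / (2 * ri * ri) - 1 / ri)) * g[l][i]
                 for i, ri in enumerate(r)] for l in range(lmax + 1)]

    f = radial_half([kinetic_half_step(fl, dr, dt) for fl in f])
    for i, ri in enumerate(r):
        # line 9 (synthesis) followed by line 12 (laser phase, sign as printed)
        phi = [cmath.exp(1j * dt * E * ri * xj) * sum(f[l][i] * P[j][l] for l in range(lmax + 1))
               for j, xj in enumerate(x)]
        # line 15: back to partial waves by Gauss-Legendre quadrature
        for l in range(lmax + 1):
            f[l][i] = sum(w[j] * P[j][l] * phi[j] for j in range(lmax + 1))
    f = radial_half(f)
    return [kinetic_half_step(fl, dr, dt) for fl in f]


def norm(f, dr):
    """Discrete norm: sum over l and i of |f_l(r_i)|^2 times dr."""
    return sum(abs(v) ** 2 for fl in f for v in fl) * dr
```

Every loop body above is independent across its indices ($l$ for the FFT steps, $(i, l)$ or $(i, j)$ for the phase multiplications and Legendre sums), which is presumably the fine-grained parallelism that a GPU implementation maps onto threads. With the normalization assumed here each sub-step is unitary, so `norm` should be conserved to round-off, which makes a convenient sanity check.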
Table 2. Computation time of CPU and GPU under different angular quantum numbers.

| Angular quantum number $L$ | CPU time/s | GPU time/s |
|---|---|---|
| 4 | 2164.309 | 159.368 |
| 9 | 4120.602 | 164.418 |
| 19 | 7922.537 | 205.440 |
| 39 | 17682.308 | 378.104 |
| 79 | 36774.347 | 757.198 |
Table 3. Computation time of CPU and GPU under different radial grid points.

| Number of radial grid points $R$ | CPU time/s | GPU time/s |
|---|---|---|
| $2^{12}$ | 1118.348 | 148.302 |
| $2^{13}$ | 1871.128 | 154.614 |
| $2^{14}$ | 3846.120 | 160.763 |
| $2^{15}$ | 7922.537 | 205.440 |
| $2^{16}$ | 16862.467 | 354.554 |
Table 4. Computation time of CPU and GPU under different matrix sizes.

| Matrix size | CPU time/s | GPU time/s |
|---|---|---|
| $5 \times 2^{12}$ | 199.158 | 149.895 |
| $10 \times 2^{13}$ | 965.276 | 166.039 |
| $20 \times 2^{14}$ | 3846.120 | 160.763 |
| $40 \times 2^{15}$ | 17682.308 | 378.104 |
| $80 \times 2^{16}$ | 74761.695 | 1524.669 |
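Table 5. Computation time of CPU and GPU under different matrix sizes.

| Matrix size | CPU time/s | GPU time/s |
|---|---|---|
| $5 \times 2^{12}$ | 437.584 | 315.448 |
| $10 \times 2^{13}$ | 2075.667 | 463.183 |
| $20 \times 2^{14}$ | 9252.539 | 629.088 |
| $40 \times 2^{15}$ | 40617.723 | 814.985 |
| $80 \times 2^{16}$ | 182135.643 | 3024.669 |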
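The speedup quoted in the abstract can be checked directly from the timings in Tables 2 to 5. The short script below is only a bookkeeping sketch: the dictionary layout and function names are made up for illustration, the numbers are transcribed from the tables, and grid sizes such as "212" are read as $2^{12}$.

```python
# (case label, CPU seconds, GPU seconds), transcribed from Tables 2-5
TIMINGS = {
    "Table 2": [("L=4", 2164.309, 159.368), ("L=9", 4120.602, 164.418),
                ("L=19", 7922.537, 205.440), ("L=39", 17682.308, 378.104),
                ("L=79", 36774.347, 757.198)],
    "Table 3": [("R=2^12", 1118.348, 148.302), ("R=2^13", 1871.128, 154.614),
                ("R=2^14", 3846.120, 160.763), ("R=2^15", 7922.537, 205.440),
                ("R=2^16", 16862.467, 354.554)],
    "Table 4": [("5x2^12", 199.158, 149.895), ("10x2^13", 965.276, 166.039),
                ("20x2^14", 3846.120, 160.763), ("40x2^15", 17682.308, 378.104),
                ("80x2^16", 74761.695, 1524.669)],
    "Table 5": [("5x2^12", 437.584, 315.448), ("10x2^13", 2075.667, 463.183),
                ("20x2^14", 9252.539, 629.088), ("40x2^15", 40617.723, 814.985),
                ("80x2^16", 182135.643, 3024.669)],
}


def speedups(timings):
    """Map (table, case) to the ratio CPU time / GPU time."""
    return {(table, label): cpu / gpu
            for table, rows in timings.items()
            for label, cpu, gpu in rows}


def best(timings):
    """Return the (table, case) key with the largest speedup and its value."""
    ratios = speedups(timings)
    key = max(ratios, key=ratios.get)
    return key, ratios[key]


if __name__ == "__main__":
    for (table, label), ratio in speedups(TIMINGS).items():
        print(f"{table} {label:>8}: {ratio:6.2f}x")
    print("largest speedup:", best(TIMINGS))
```

The largest ratio comes from the $80 \times 2^{16}$ case of Table 5, at roughly 60, consistent with the "up to about 60 times" figure in the abstract. For the smallest matrices the ratio is only slightly above 1, so the GPU advantage in these tables grows with problem size.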