Rapid analysis model and extrapolation method of neural network in spectral diagnostic

TIAN Wenjing, YANG Zongyu, XU Min, LONG Ting, HE Xiaoxue, KE Rui, YANG Shuosu, YU Deliang, SHI Zhongbing, GAO Zhe


  • Real-time measurement and feedback control of key plasma parameters are critical for the operation of future fusion reactors, and ion temperature is a vital control target because it enters the triple product for fusion ignition. However, plasma diagnostics often require complex data analysis. A widely used method of obtaining the ion temperature $ {T}_{{\mathrm{i}}} $ from charge exchange recombination spectroscopy (CXRS) spectra is iterative spectral fitting, which is time-consuming and requires frequent intervention by human experts. The traditional method therefore cannot meet the demand for real-time $ {T}_{{\mathrm{i}}} $ measurement. A neural network (NN), which can learn the underlying relationship between the measured spectra and $ {T}_{{\mathrm{i}}} $, is a promising approach to this problem. In fact, the NN approach has been widely adopted in the field of magnetically confined plasma. A previous study on JET achieved satisfactory accuracy in inferring $ {T}_{{\mathrm{i}}} $ from CXRS spectra, compared with traditional fitting results. Recently, disruption prediction has made great progress with the help of deep NNs. However, these studies were conducted on devices in steady operation, where the data distribution of the training set is similar to that of the test set. This is not the case for a newly built tokamak such as HL-3, nor for future fusion reactors such as ITER. For new devices, there will be a period in which the plasma parameters rise from low to high ranges. It is therefore crucial to investigate the ability of an NN model to extrapolate from training data in the low-parameter range.
A traditional neural network (TNN)-based model is proposed to accelerate the analysis of CXRS spectral data, with a focus on investigating the ability of the model to extrapolate to a much higher $ {T}_{{\mathrm{i}}} $ range.
The dataset contains about 122000 spectra, together with the corresponding $ {T}_{{\mathrm{i}}} $ inferred from the offline iterative fitting process. The results demonstrate that the TNN-based model infers $ {T}_{{\mathrm{i}}} $ accurately, as indicated by a coefficient of determination (R²) of 0.92, and reduces the inference time for a single spectrum to less than 1 ms, which is 100–1000 times faster than traditional spectral fitting methods. However, the performance of the data-driven neural network model is limited by insufficient data and an imbalanced data distribution, which further degrade its ability to extrapolate. Generally, data with higher $ {T}_{{\mathrm{i}}} $ account for a small portion of the total dataset. In our study, only about 5% of the spectra correspond to $ {T}_{{\mathrm{i}}} > 2\;\mathrm{keV} $ (ranging from 2 to 4 keV). However, these spectra reflect the temperature of the central plasma, which is more important for assessing plasma performance. To overcome this limitation, this study synthesizes high-temperature data based on experimental data from discharges with $ {T}_{{\mathrm{i}}} $ in the low-temperature range. By incorporating 5% synthetic data into a training set that otherwise consists only of data with $ {T}_{{\mathrm{i}}} < 2\;\mathrm{keV} $, the extrapolation capability of the model is extended to the whole range of $ {T}_{{\mathrm{i}}} < 4\;\mathrm{keV} $. The average relative error (ARE) of the model within the training data in the range of $ 3\;\mathrm{keV} < {T}_{{\mathrm{i}}} < 4\;\mathrm{keV} $ decreases from 35% to below 15%, a reduction of approximately 60% relative to the ARE before adding synthetic data.
This approach demonstrates the feasibility of using synthetic data to enhance the performance of artificial intelligence algorithms in the field of magnetic confinement fusion. These findings provide valuable ideas for developing real-time ion temperature measurement and feedback control for future high-parameter fusion devices. Furthermore, the study lays a foundation for investigating high-performance cross-device applications, such as machine learning-based disruption prediction and tearing mode control.
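The abstract does not spell out how the high-temperature spectra are synthesized, so the following is only a minimal illustration and not the authors' procedure. It relies on the standard thermal Doppler relation, in which the Gaussian width of a line at rest wavelength $ \lambda_0 $ scales as $ \sigma_\lambda = \lambda_0 \sqrt{{T}_{{\mathrm{i}}} / (m c^2)} $ with $ {T}_{{\mathrm{i}}} $ in energy units. The rest wavelength (the C VI line at 529.05 nm), the carbon-12 mass, the wavelength grid, and all function names are assumptions introduced here; instrumental broadening and the other components of a measured spectrum are ignored.

```python
import numpy as np

# Assumptions (not from the paper): C VI line at 529.05 nm, carbon-12 ions.
LAMBDA0_NM = 529.05
MC2_EV = 12.0 * 931.494e6  # rest energy of carbon-12 in eV


def doppler_sigma_nm(ti_ev):
    """Gaussian width (nm) of a thermally Doppler-broadened line at Ti (eV)."""
    return LAMBDA0_NM * np.sqrt(ti_ev / MC2_EV)


def ti_from_sigma_ev(sigma_nm):
    """Invert the Doppler relation: ion temperature (eV) from a width (nm)."""
    return MC2_EV * (sigma_nm / LAMBDA0_NM) ** 2


def synthetic_line(wavelength_nm, ti_ev, amplitude=1.0, noise_std=0.0, rng=None):
    """Single Gaussian line at temperature ti_ev, with optional additive noise."""
    sigma = doppler_sigma_nm(ti_ev)
    line = amplitude * np.exp(-0.5 * ((wavelength_nm - LAMBDA0_NM) / sigma) ** 2)
    if noise_std > 0:
        rng = np.random.default_rng() if rng is None else rng
        line = line + rng.normal(0.0, noise_std, size=line.shape)
    return line


# Hypothetical wavelength grid; a 3 keV line is wider than any line below 2 keV.
wavelength = np.linspace(527.0, 531.0, 512)
spectrum_3kev = synthetic_line(wavelength, 3000.0)
```

Because the width grows as the square root of $ {T}_{{\mathrm{i}}} $, a spectrum at 4 keV is twice as wide as one at 1 keV, which is the kind of shape change a model trained only below 2 keV never sees.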
  • Figure 1.  The CXRS diagnostic system and measured spectra in HL-2A: (a) diagram of the CXRS diagnostic system; (b) spectra and their decomposition results at ion temperatures of 3.1 and 1.4 keV.

    Figure 2.  Distribution of ion temperature in the whole dataset.

    Figure 3.  Architecture of the CNN-based model; solid arrows denote residual connections.

    Figure 4.  Scatter plot of the labels against the outputs of the CNN-based model; the black dashed line marks where the horizontal and vertical coordinates are equal.

    Figure 5.  Labels compared with the ion temperature output by the CNN-based model: (a) $ {T}_{{\mathrm{i}}} $ profiles at different times; (b) time evolution of $ {T}_{{\mathrm{i}}} $ at different radial positions.

    Figure 6.  The spectrum and the corresponding IG curve at 810 ms of shot No. 34339. The IG curve of the neural network with respect to the input has an approximate “M” shape whose position coincides closely with that of the Gaussian peak, indicating that the network behaves as a Gaussian-peak detector.

    Figure 7.  Performance of the CNN-based model trained on a dataset with $ {T}_{{\mathrm{i}}} $ in 0–2 keV when extrapolated to a test set in a higher $ {T}_{{\mathrm{i}}} $ range.

    Figure 8.  Flow chart of the synthesis process, along with examples of synthetic spectra.

    Figure 9.  Performance of the CNN-based model trained on a dataset with $ {T}_{{\mathrm{i}}} $ in 0–2 keV on data with higher $ {T}_{{\mathrm{i}}} $.

    Figure 10.  Comparison of the mean relative error (MRE) in various $ {T}_{{\mathrm{i}}} $ ranges for CNN-based models trained with and without synthetic data.
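The caption of Figure 6 refers to an IG curve without expanding the abbreviation. Assuming IG stands for integrated gradients, a common attribution method, the sketch below shows how such a curve could be computed for a single spectrum. The toy model, its weights, the all-zero baseline, and the function names are hypothetical stand-ins and have nothing to do with the network used in the paper.

```python
import numpy as np


def toy_model(x, w):
    """Hypothetical scalar model: tanh of a weighted sum of spectral channels."""
    return np.tanh(np.dot(w, x))


def toy_grad(x, w):
    """Analytic gradient of toy_model with respect to the input x."""
    return (1.0 - np.tanh(np.dot(w, x)) ** 2) * w


def integrated_gradients(x, baseline, w, steps=200):
    """Attribution per channel: (x - baseline) times the path-averaged gradient.

    The path integral from baseline to x is approximated by a midpoint
    Riemann sum with `steps` points.
    """
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.array([toy_grad(baseline + a * (x - baseline), w) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)


rng = np.random.default_rng(0)
w = rng.normal(size=64) * 0.05                         # hypothetical weights
x = np.exp(-0.5 * ((np.arange(64) - 32) / 5.0) ** 2)   # Gaussian-shaped input
baseline = np.zeros_like(x)                            # assumed baseline
attributions = integrated_gradients(x, baseline, w)
```

A useful sanity check is the completeness property: the attributions should sum to the difference between the model output at the input and at the baseline, up to the error of the numerical integration.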

    Table 1.  Configuration of the CNN layers.

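    Table 2.  Hyper-parameters of the training process of the CNN-based model, with descriptions.

    Hyper-parameter    Value              Description
    Gaussian noise     μ = 0, σ = 0.01    Gaussian noise added to the training set for data augmentation
    Batch size         128                Number of samples per batch in stochastic gradient descent
    Early stopping     20                 Training is considered complete if performance no longer improves after a given number of epochs
    Dropout            0.2                Fraction of neuron outputs in the fully connected layers randomly set to zero during training, to reduce overfitting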
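To make the training hyper-parameters listed in Table 2 concrete, the following sketch shows one way they could enter a training loop. It is not the authors' code: the value 20 is read here as an early-stopping patience in epochs, the dropout is written in the common "inverted" form that rescales the kept activations, and all names are introduced for illustration.

```python
import numpy as np

NOISE_MEAN, NOISE_STD = 0.0, 0.01  # Gaussian noise row of Table 2
BATCH_SIZE = 128                   # batch size row of Table 2
PATIENCE = 20                      # assumed meaning of the early-stopping value
DROPOUT_RATE = 0.2                 # dropout row of Table 2


def augment(spectra, rng):
    """Add zero-mean Gaussian noise to a batch of spectra."""
    return spectra + rng.normal(NOISE_MEAN, NOISE_STD, size=spectra.shape)


def iter_batches(spectra, labels, rng):
    """Yield shuffled, noise-augmented mini-batches covering the dataset once."""
    order = rng.permutation(len(spectra))
    for start in range(0, len(order), BATCH_SIZE):
        idx = order[start:start + BATCH_SIZE]
        yield augment(spectra[idx], rng), labels[idx]


def dropout(activations, rng, rate=DROPOUT_RATE):
    """Inverted dropout (an assumption): zero a fraction, rescale the rest."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)


class EarlyStopping:
    """Signal a stop once the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=PATIENCE):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, `iter_batches` would be called once per epoch and `EarlyStopping.step` once after each validation pass; dropout would be applied only during training, not at inference.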
    Table 3.  Fitting performance and speed of the model.

    Training set              Test set/keV    $ {R}^{2} $    MRE    Inference time/
    Without synthetic data    0–2             0.92           14%    0.59
    Without synthetic data    0–4             0.86           15%
    With synthetic data       0–4             0.93           13%
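The two metrics reported in Table 3 can be computed as in the sketch below, which uses the usual definitions of the coefficient of determination and the mean relative error; the paper's exact formulas are not given in this section, so this is an assumption. The per-range helper mirrors the kind of comparison shown in Figure 10, and the function names and bin edges are hypothetical.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(y_true, y_pred):
    """Mean of |prediction - label| / |label| over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_pred - y_true) / np.abs(y_true))


def binned_mre(y_true, y_pred, edges):
    """MRE per label range [lo, hi); empty ranges are skipped."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (y_true >= lo) & (y_true < hi)
        if mask.any():
            result[(lo, hi)] = mean_relative_error(y_true[mask], y_pred[mask])
    return result


# Relative reduction quoted in the abstract: from 35% to (at most) 15%.
relative_reduction = (0.35 - 0.15) / 0.35
```

With the figures from the abstract, `relative_reduction` is about 0.57; since the error falls to below 15%, the actual reduction is somewhat larger, consistent with the stated "approximately 60%".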
