Convolutional neural network method for identifying Z boson jets

Li Jing; Sun Hao

doi:10.7498/aps.70.20201557

Abstract (translated from the Chinese)

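In high-energy physics, the jet tagging task is to pick out specific signals of interest from the background. Such signals are of great importance for discovering new particles or new processes at the Large Hadron Collider. The energy deposits produced in the calorimeter can be regarded as a photograph of the jet, and analyzing data of this kind is a typical visual recognition task in machine learning. Based on jet images, this paper explores the use of convolutional neural networks (CNNs) to identify Z boson jets against a quantum chromodynamics background, and compares them with the traditional boosted decision tree (BDT) method. With the inputs used in this paper, three related performance metrics show that the CNN brings roughly a 1.5-fold improvement over the BDT. In addition, the best and worst jet images and the confusion matrix are used to illustrate what the CNN has learned through training and its overall tagging ability.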

Keywords: Z boson decay, quarks, gluons, neural networks

Abstract

The jet tagging task in high-energy physics is to distinguish signals of interest from the background, which is of great importance for the discovery of new particles or new processes at the Large Hadron Collider. The energy deposition in the calorimeter can be viewed as a kind of picture, so tagging jets initiated by different processes becomes a classic image classification task in computer vision. We use jet images, built from high-dimensional low-level information (the energy-momentum four-vectors), as the input to explore the potential of convolutional neural networks (CNNs). Four models of different depths are designed to make the best use of the underlying features of jet images. A traditional multivariate method, the boosted decision tree (BDT), is used as a baseline for judging the performance of the networks. We introduce four observables into the BDTs: the mass and the transverse momentum of the fat jet, the distance between the leading and subleading subjets, and N-subjettiness. Three BDTs are built with different numbers of trees, with the intention of giving them different classification abilities. After training and testing, the results show that CNN 3 is the neatest and most efficient network among the designs based on stacked convolutional layers. Deepening the model improves the performance to a certain extent, but not indefinitely. The performances of all BDTs are almost the same, possibly because of the small number of input observables. The performance metrics show that the CNNs outperform the BDTs: the background rejection increases by up to 150% at 50% signal efficiency.
In addition, after inspecting the best and the worst samples, we summarize the characteristics of jets initiated by different processes: jets from Z boson decays tend to be concentrated in the center of the jet image or to have a clearly distinguishable substructure, whereas the substructures of jets from general quantum chromodynamics processes take more random forms and are not limited to two subjets. Finally, the confusion matrix of CNN 3 indicates that the network is somewhat conservative. Finding a balance between conservative and radical behavior is a goal of our future work.
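As an illustration of the jet-image idea described in the abstract, the following minimal sketch bins the constituents of a jet into a pixel grid in the ($ \eta $, $ \phi $) plane, using the summed transverse momentum as the pixel intensity. The function name `jet_image`, the 25 x 25 grid, the image half-width of 1.0 and the normalization are illustrative assumptions; this page does not state the preprocessing the authors actually used.

```python
import math


def jet_image(constituents, n_pixels=25, half_width=1.0):
    """Bin (eta, phi, pT) constituents into an n_pixels x n_pixels grid.

    The grid size, half-width and normalization are illustrative assumptions.
    """
    total_pt = sum(pt for _, _, pt in constituents)
    # Center the image on the pT-weighted centroid of the jet.
    eta0 = sum(eta * pt for eta, _, pt in constituents) / total_pt
    # phi is periodic, so average unit vectors instead of raw angles.
    phi0 = math.atan2(
        sum(pt * math.sin(phi) for _, phi, pt in constituents),
        sum(pt * math.cos(phi) for _, phi, pt in constituents),
    )
    pixel = 2.0 * half_width / n_pixels
    image = [[0.0] * n_pixels for _ in range(n_pixels)]
    for eta, phi, pt in constituents:
        d_eta = eta - eta0
        # Wrap the azimuthal difference back into the principal range.
        d_phi = math.atan2(math.sin(phi - phi0), math.cos(phi - phi0))
        col = math.floor((d_eta + half_width) / pixel)
        row = math.floor((d_phi + half_width) / pixel)
        if 0 <= row < n_pixels and 0 <= col < n_pixels:
            image[row][col] += pt  # pixel intensity = summed pT
    # Normalize so that the pixel intensities sum to one.
    norm = sum(sum(r) for r in image)
    if norm > 0.0:
        image = [[v / norm for v in r] for r in image]
    return image


# Toy jet with two constituents given as (eta, phi, pT in GeV).
toy_jet = [(0.1, 0.2, 30.0), (-0.1, -0.2, 10.0)]
image = jet_image(toy_jet)
```

A grid built this way can be fed to a CNN as a single-channel image; constituents that fall outside the chosen window are simply dropped in this sketch.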

Authors and contacts

Li Jing, Sun Hao

School of Physics, Dalian University of Technology, Dalian 116024, China

Corresponding author: Sun Hao, haosun@dlut.edu.cn

Funds: Project supported by the National Natural Science Foundation of China (Grant Nos. 11675033, 12075043)

Fig. 1. (a) Average signal jet image; (b) average background jet image. The horizontal axis is the pseudorapidity $ \eta $ and the vertical axis is the azimuthal angle $ \phi $.

Fig. 2. Architecture of the CNN 3. This figure was generated by adapting the code from https://github.com/gwding/draw_convnet.

Fig. 3. (a) Mass distribution of fat jets; (b) transverse momentum distribution of fat jets; (c) distribution of distance between leading and subleading subjets; (d) distribution of N-subjettiness $ {\tau }_{21} $.
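For reference, N-subjettiness is usually defined as in the block below. This is the standard definition from the literature, not a formula reproduced from this page. Here $k$ runs over the constituents of the fat jet, $\Delta R_{j,k}$ is the distance between constituent $k$ and candidate subjet axis $j$, and $R_0$ is the jet radius parameter. Writing the distance in terms of pseudorapidity is an assumption (rapidity is also common), and the axis choice and the value of $R_0$ used by the authors are not given here.

```latex
\tau_N = \frac{1}{d_0} \sum_k p_{T,k} \,
         \min\left\{ \Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k} \right\},
\qquad
d_0 = \sum_k p_{T,k} \, R_0,

\Delta R_{j,k} = \sqrt{(\eta_j - \eta_k)^2 + (\phi_j - \phi_k)^2},
\qquad
\tau_{21} = \frac{\tau_2}{\tau_1}.
```

A small $\tau_{21}$ means the jet is described much better by two subjets than by one, which is the pattern expected for a hadronically decaying Z boson.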

Fig. 4. ROC curves of different models.
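A ROC curve of this kind can be built by scanning a threshold over the classifier output and recording the signal and background efficiencies at each step. The sketch below is a plain-Python illustration under that assumption; the helper names `roc_points` and `auc` and the toy scores are hypothetical, and the axis convention used in the paper's figure is not stated on this page.

```python
def roc_points(signal_scores, background_scores):
    """Return (signal_efficiency, background_efficiency) pairs, one per threshold."""
    thresholds = sorted(set(signal_scores) | set(background_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing is selected
    for t in thresholds:
        eff_s = sum(s >= t for s in signal_scores) / len(signal_scores)
        eff_b = sum(b >= t for b in background_scores) / len(background_scores)
        points.append((eff_s, eff_b))
    return points


def auc(points):
    """Trapezoidal area under signal efficiency versus background efficiency."""
    ordered = sorted(points, key=lambda p: (p[1], p[0]))
    area = 0.0
    for (s0, b0), (s1, b1) in zip(ordered, ordered[1:]):
        area += 0.5 * (s0 + s1) * (b1 - b0)
    return area


# Toy classifier outputs (hypothetical numbers, not taken from the paper).
toy_signal = [0.9, 0.8, 0.4]
toy_background = [0.7, 0.3, 0.2]
toy_auc = auc(roc_points(toy_signal, toy_background))
```

The area computed this way equals the fraction of signal-background pairs in which the signal sample receives the higher score, so a value of 0.5 corresponds to random guessing and 1.0 to perfect separation.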

Fig. 5. Distribution of the output of the CNN 3 signal neuron for signal (orange) and background (blue) samples.

Fig. 6. The best and the worst signal jet images.

Fig. 7. The best and the worst background jet images.

Fig. 8. Confusion matrix of the CNN 3 on the test set. The true label is on the vertical axis, and the predicted label is on the horizontal axis.
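The layout described in the caption (true label along the rows, predicted label along the columns) can be sketched as follows. The function names, the label encoding (0 for background, 1 for signal) and the toy labels are assumptions made for illustration and are not taken from the paper.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes=2):
    """Count samples with rows indexed by true class and columns by predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for truth, prediction in zip(true_labels, predicted_labels):
        matrix[truth][prediction] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy labels: 0 = background, 1 = signal (assumed encoding).
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
```

Comparing the two off-diagonal entries shows whether a classifier errs more often by missing signal or by accepting background, which is the kind of imbalance the abstract refers to when it calls CNN 3 conservative.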

Table 1. Metrics used to evaluate the performance of the different models.

Model    AUC      ACC      R50
CNN 1    0.8754   0.8150   39.1103
CNN 2    0.8688   0.8252   53.3583
CNN 3    0.8980   0.8324   80.6715
CNN 4    0.8993   0.8328   79.9350
BDT 1    0.8955   0.8337   32.5351
BDT 2    0.8963   0.8342   32.8072
BDT 3    0.8969   0.8346   33.0144
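The sketch below shows one way to compute a background rejection at fixed signal efficiency, and then checks the size of the CNN gain using the R50 values in Table 1. It assumes that R50 denotes the inverse background efficiency at the threshold that keeps 50% of the signal, a common convention in jet tagging that this page does not spell out. The function name and the toy scores are hypothetical.

```python
import math


def rejection_at_signal_efficiency(signal_scores, background_scores, target_eff=0.5):
    """Return 1 / (background efficiency) at the threshold keeping target_eff of signal."""
    ranked = sorted(signal_scores, reverse=True)
    n_keep = max(1, math.ceil(target_eff * len(ranked)))
    threshold = ranked[n_keep - 1]  # lowest score among the kept signal samples
    n_bkg_pass = sum(1 for b in background_scores if b >= threshold)
    if n_bkg_pass == 0:
        return float("inf")  # no background survives the cut
    return len(background_scores) / n_bkg_pass


# Toy classifier outputs (hypothetical numbers, not taken from the paper).
toy_signal = [0.9, 0.8, 0.7, 0.6]
toy_background = [0.85, 0.3, 0.2, 0.1, 0.05, 0.4, 0.5, 0.55]
toy_r50 = rejection_at_signal_efficiency(toy_signal, toy_background)

# R50 values copied from Table 1.
r50_cnn3 = 80.6715
r50_bdt3 = 33.0144
ratio = r50_cnn3 / r50_bdt3    # roughly 2.44
relative_gain = ratio - 1.0    # roughly 1.44, an increase of about 144%
```

With the Table 1 numbers, CNN 3 rejects about 2.4 times as much background as BDT 3, an increase of roughly 144%, which is consistent with the gain of up to 150% quoted in the abstract. The AUC and ACC columns, by contrast, are nearly identical for CNN 3, CNN 4 and the BDTs.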
