Evaluation of Large Language Models in the Full Process of Battery Research and Development and Inorganic Solid Electrolyte Materials Database

Siyuan Wu (吴思远), Hong Li (李泓)
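  • The emergence of large language models (LLMs) has greatly advanced scientific research. Language models represented by ChatGPT and reasoning models represented by DeepSeek R1 have brought significant changes to the research paradigm. Although these models are general-purpose, they show strong generalization in the battery field, especially in solid-state battery research. In this study, we systematically screened 5,309,268 articles published in key journals in 2024 and earlier and extracted 124,021 battery-related papers. We also searched all patent applications and granted patents of the European Patent Office and the United States Patent and Trademark Office from 2024 and earlier, 17,559,750 in total, and selected 125,716 battery-related patents. Using these papers and patents, we ran extensive experiments on the models' knowledge base, in-context learning, instruction following, and structured output capabilities. Through multi-dimensional evaluation and analysis, we found that current LLMs achieve roughly graduate-student-level accuracy in tasks such as information classification and data extraction, and that they also show strong capabilities in content summarization and trend analysis. We also found that the models may, in rare cases, produce numerical hallucinations, and that their engineering application to massive battery-domain data still has room for optimization. Based on the characteristics of the models and these test results, we used the model to extract data on inorganic solid electrolyte materials, including 5,970 ionic conductivity entries, 387 diffusion coefficient entries, and 3,094 migration barrier entries, plus more than 1,000 entries of chemical, electrochemical, and mechanical data, covering nearly all physical, chemical, and electrochemical properties relevant to inorganic solid electrolytes. This also indicates that the application of LLMs in scientific research has moved from assisting research to actively promoting it.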
    The emergence of large language models has significantly advanced scientific research. Representative models such as ChatGPT and DeepSeek R1 have brought notable changes to the research paradigm. Although these models are general-purpose, they have demonstrated strong generalization capabilities in the battery field, particularly in solid-state battery research. In this study, we systematically screened 5,309,268 articles from key journals up to 2024 and extracted 124,021 battery-related papers. We also searched 17,559,750 patent applications and granted patents from the European Patent Office and the United States Patent and Trademark Office up to 2024, from which we selected 125,716 battery-related patents. Using this collection of literature and patents, we conducted numerous experiments to evaluate the knowledge base, in-context learning, instruction-following, and structured output capabilities of language models. Through multi-dimensional model evaluation and analysis, we found the following. First, the model screened literature on inorganic solid-state electrolytes with high accuracy, comparable to that of a doctoral student in the field. Based on 10,604 data entries, the model also recognized literature on in-situ polymerization/solidification technology well, although its accuracy for this emerging technology was slightly lower than for solid-state electrolytes and would require further fine-tuning to improve. Second, in tests on 10,604 data entries, the model achieved reliable accuracy in extracting ionic conductivity data for inorganic materials. Third, using solid-state lithium battery patents filed over the past 20 years by four companies in South Korea and Japan, the model proved effective at analyzing historical patent trends and at comparative analysis. The personalized literature reports that the model generated from the latest publications were also highly accurate. Fourth, by applying an iteration strategy, we enabled DeepSeek to engage in self-thinking and thereby give more comprehensive responses. These results indicate that language models possess strong capabilities in content summarization and trend analysis. However, we also observed that the model may occasionally produce numerical hallucinations, and that its engineering application to vast amounts of battery-related data still has room for optimization. Based on the characteristics of the model and the above test results, we used the DeepSeek V3-0324 model to extract data on inorganic solid electrolyte materials, including 5,970 entries of ionic conductivity, 387 entries of diffusion coefficients, and 3,094 entries of migration barriers. The database also includes over 1,000 entries of chemical, electrochemical, and mechanical data, covering nearly all physical, chemical, and electrochemical properties associated with inorganic solid electrolytes. This signifies that the application of large language models in scientific research has moved from assisting research to actively advancing it. The datasets presented in this paper can be accessed at https://cmpdc.iphy.ac.cn/literature/SSE.html (DOI: https://doi.org/10.57760/sciencedb.j00213.00172).
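
The abstract names structured output and occasional numerical hallucination as two things to watch when extracting property data with a language model. The sketch below shows one possible safeguard and is not the authors' pipeline: it parses a model reply as JSON and accepts the extracted number only if the same value appears in the source passage. The record schema (`material`, `property`, `value`, `unit`), the function names, and the sample passage are all assumptions made for illustration.

```python
import json
import math
import re

# Hypothetical record schema; the paper does not specify its output format.
REQUIRED_KEYS = {"material", "property", "value", "unit"}

# Matches plain decimals and scientific notation such as
# "1.2 × 10^-2", "1.2x10-2", or "1.2e-2". A sketch, not a full number parser.
_NUMBER = re.compile(
    r"(?P<mant>\d+(?:\.\d+)?)"
    r"(?:\s*[x×]\s*10\s*\^?\s*(?P<exp10>[-−+]?\d+)|[eE](?P<expe>[-+]?\d+))?"
)


def parse_record(raw: str) -> dict:
    """Parse one model reply and check it against the assumed schema."""
    record = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(record)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    value = record["value"]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("value must be a number")
    return record


def numbers_in_text(text: str) -> list[float]:
    """Return every number found in the passage as a float."""
    values = []
    for match in _NUMBER.finditer(text):
        mantissa = float(match.group("mant"))
        exponent = match.group("exp10") or match.group("expe")
        if exponent is not None:
            # Normalise the Unicode minus sign before converting.
            mantissa *= 10 ** int(exponent.replace("−", "-"))
        values.append(mantissa)
    return values


def is_supported(value: float, passage: str, rel_tol: float = 1e-6) -> bool:
    """True if the extracted value literally occurs in the source passage."""
    return any(
        math.isclose(value, found, rel_tol=rel_tol)
        for found in numbers_in_text(passage)
    )


if __name__ == "__main__":
    # Illustrative passage and reply, not taken from the database.
    passage = "The sample shows an ionic conductivity of 1.2 × 10^-2 S/cm at 27 °C."
    reply = ('{"material": "Li10GeP2S12", "property": "ionic_conductivity", '
             '"value": 0.012, "unit": "S/cm"}')
    record = parse_record(reply)
    print(record["material"], is_supported(record["value"], passage))
```

A check like this only confirms that the number occurs somewhere in the passage; it cannot tell whether the number belongs to the right material or property, so flagged and unflagged records would both still need spot checks.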

Publication history
  • Published online: 2025-06-23
