2025年秋季《FM meets AI》3+X讨论班面向北京大学数学科学学院本科生开设。

时间:周五下午15:00-17:00。

⼈⼯智能技术特别是近年来以ChatGPT为代表的大语言模型已经成为新⼀轮科技⾰命和产业变⾰的核⼼驱动⼒,对经济发展和社会进步产⽣了极其深刻的影响。有别于传统软件,⼈⼯智能软件的开发往往依赖于⼤规模的数据统计,训练出的模型也不像传统的程序⼀样具有清晰的行为逻辑,难以被理解和验证,因此其安全性和可靠性很难从理论上得到保障。近年来,利⽤形式化⽅法研究⼈⼯智能系统的安全可信问题已成为学术界和⼯业界的热点问题之⼀,通过严格的推理和⾃动化的数学证明,为人工智能系统提供安全性、可靠性的保证,成为该领域主要的研究⽬标。另一方面,虽然近年来形式化方法研究取得了显著进展,在芯片设计研制、操作系统微内核验证、编译器验证、区块链智能合约验证等方面均有大量成功应用,但仍然面临成本高昂和可扩展性差的巨大挑战,利用人工智能技术特别是大模型技术提升逻辑推理和形式化验证的自动化程度,降低验证成本,已经成为形式化方法领域的研究潮流,有望为形式化方法的发展带来深刻变革。

⼈⼯智能与形式化⽅法的结合,将为人工智能系统的安全可信问题提供新的理论基础与技术⽀持,为⼈⼯智能在安全攸关领域实现⼤规模应⽤提供可信保障,为未来⼈与智能系统的安全协作奠定基础。 本学期“FM meets AI”讨论班将重点关注大模型技术对形式化方法的赋能(LLM for FM)方面的问题,对相关研究进行探讨。

时间/地点 报告内容 报告人
2025.10.10(勾股-216) 大语言模型赋能形式化方法 孙猛
2025.10.31(盈不足-109)   苏亭(华东师大)
2025.11.14(武义-312) LeanDojo: Theorem Proving with Retrieval-Augmented Language Models 邬程灿
2025.11.21(武义-312)    
2025.12.5(武义-312)    
2025.12.12(武义-312)    
2025.12.19(武义-312)    

建议报告论文:

[1] Francesco Fuggitti, Tathagata Chakraborti. NL2LTL - a Python Package for Converting Natural Language (NL) Instructions to Linear Temporal Logic (LTL) Formulas. Proceedings of AAAI 2023, pages 16428-16430, 2023.

[2] Zhi Ma, Cheng Wen, Bin Yu, Jie Su. Integrating ensemble learning and Large Language Models for efficient formal verification of IP-based aerospace systems. Information Fusion, vol. 125: 103466, 2026.

[3] Mengyan Zhao, Ran Tao, Yanhong Huang, Jianqi Shi, Shengchao Qin, Yang Yang. NL2CTL: Automatic Generation of Formal Requirements Specifications via Large Language Models. Proceedings of ICFEM 2024, LNCS 15394, pages 1-17, Springer, 2024.

[4] Madeline Endres, Sarah Fakhoury, Saikat Chakraborty, Shuvendu K. Lahiri. Can Large Language Models Transform Natural Language Intent into Formal Method Postconditions? Proc. ACM Softw. Eng. 1(FSE): 1889-1912, 2024.

[5] Adharsh Kamath, Nausheen Mohammed, Aditya Senthilnathan, Saikat Chakraborty, Pantazis Deligiannis, Shuvendu K. Lahiri, Akash Lal, Aseem Rastogi, Subhajit Roy, Rahul Sharma. Leveraging LLMs for Program Verification. Proceedings of FMCAD 2024, pages 107-118, IEEE, 2024.

[6] Chang Liu, Xiwei Wu, Yuan Feng, Qinxiang Cao, Junchi Yan. Towards General Loop Invariant Generation: A Benchmark of Programs with Memory Manipulation. NeurIPS 2024.

[7] Emily First, Markus N. Rabe, Talia Ringer, Yuriy Brun. Baldur: Whole-Proof Generation and Repair with Large Language Models. Proceedings of ESEC/SIGSOFT FSE 2023, pages 1229-1241, ACM, 2023.

[8] Muhammad A. A. Pirzada, Giles Reger, Ahmed Bhayat, Lucas C. Cordeiro. LLM-Generated Invariants for Bounded Model Checking Without Loop Unrolling. Proceedings of ASE 2024, pages 1395-1407, ACM, 2024.

[9] Guangyuan Wu, Weining Cao, Yuan Yao, Hengfeng Wei, Taolue Chen, Xiaoxing Ma. LLM Meets Bounded Model Checking: Neuro-symbolic Loop Invariant Inference. Proceedings of ASE 2024, pages 406-417, ACM, 2024.

[10] Haoze Wu, Clark W. Barrett, Nina Narodytska. Lemur: Integrating Large Language Models in Automated Program Verification. ICLR 2024.

[11] Yixuan Li, Julian Parsert, Elizabeth Polgreen. Guiding Enumerative Program Synthesis with Large Language Models. Proceedings of CAV 2024 (Part 2), LNCS 14682, pages 280-301, Springer, 2024.

[12] Cheng Wen, Jialun Cao, Jie Su, Zhiwu Xu, Shengchao Qin, Mengda He, Haokun Li, Shing-Chi Cheung, Cong Tian. Enchanting Program Specification Synthesis by Large Language Models Using Static Analysis and Program Verification. Proceedings of CAV 2024 (Part 2), LNCS 14682, pages 302-328, 2024.

[13] Yiwen Sun, Xianyin Zhang, Shiyu Huang, Shaowei Cai, Bing-Zhen Zhang, Ke Wei. AutoSAT: Automatically Optimize SAT Solvers via Large Language Models. CoRR abs/2402.10705 (2024)

[14] Ye Liu, Yue Xue, Daoyuan Wu, Yuqiang Sun, Yi Li, Miaolei Shi, Yang Liu. PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation. Accepted by NDSS Symposium 2025. CoRR abs/2405.02580 (2024)

[15] Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan J. Prenger, Animashree Anandkumar. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models. NeurIPS 2023.

[16] Haonan Li, Yu Hao, Yizhuo Zhai, Zhiyun Qian. Assisting Static Analysis with Large Language Models: A ChatGPT Experiment. Proceedings of ESEC/SIGSOFT FSE 2023, pages 2107-2111, 2023.