2025年春季《FM meets AI》3+X讨论班面向北京大学数学科学学院本科生开设。

时间:周五下午15:00-17:00。

地点 :智华楼流形-217

⼈⼯智能技术特别是近年来以ChatGPT为代表的大语言模型已经成为新⼀轮科技⾰命和产业变⾰的核⼼驱动⼒,对经济发展和社会进步产⽣了极其深刻的影响。有别于传统软件,⼈⼯智能软件的开发往往依赖于⼤规模的数据统计,训练出的模型也不像传统的程序⼀样具有清晰的行为逻辑,难以被理解和验证,因此其安全性和可靠性很难从理论上得到保障。近年来,利⽤形式化⽅法研究⼈⼯智能系统的安全可信问题已成为学术界和⼯业界的热点问题之⼀,通过严格的推理和⾃动化的数学证明,为人工智能系统提供安全性、可靠性的保证,成为该领域主要的研究⽬标。另一方面,虽然近年来形式化方法研究取得了显著进展,在芯片设计研制、操作系统微内核验证、编译器验证、区块链智能合约验证等方面均有大量成功应用,但仍然面临成本高昂和可扩展性差的巨大挑战,利用人工智能技术特别是大模型技术提升逻辑推理和形式化验证的自动化程度,降低验证成本,已经成为形式化方法领域的研究潮流,有望为形式化方法的发展带来深刻变革。

⼈⼯智能与形式化⽅法的结合,将为人工智能系统的安全可信问题提供新的理论基础与技术⽀持,为⼈⼯智能在安全攸关领域实现⼤规模应⽤提供可信保障,为未来⼈与智能系统的安全协作奠定基础。 本学期“FM meets AI”讨论班将重点关注大模型与形式化方法的交叉融合,包括LLM for FM和FM for LLM两方面的问题,探讨形式化方法和大模型技术的双向赋能。

时间 报告内容 报告人
2025.4.11 [1.1] From Informal to Formal - Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs 雷斐然
2025.4.18 [2.1] Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks 王秭如
2025.4.25 大模型代码生成与代码推理能力研究(邀请报告) 曹嘉伦
2025.5.9 [1.9] nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models 岳关璋
2025.5.23 [1.4] Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis 高梓涵
2025.5.30 [1.3] Lemur: Integrating Large Language Models in Automated Program Verification 邬程灿
2025.5.30 [2.2] Formally Specifying the High-Level Behavior of LLM-Based Agents 张知辛

1. Large Language Models for Formal Verification:

建议报告论文:

[1.1] Jialun Cao, Yaojie Lu, Meiziniu Li, Haoyang Ma, Haokun Li, Mengda He, Cheng Wen, Le Sun, Hongyu Zhang, Shengchao Qin, Shing-Chi Cheung, Cong Tian. From Informal to Formal - Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs. CoRR abs/2501.16207 (2025)

[1.2] Cheng Wen, Jialun Cao, Jie Su, Zhiwu Xu, Shengchao Qin, Mengda He, Haokun Li, Shing-Chi Cheung, Cong Tian. Enchanting Program Specification Synthesis by Large Language Models Using Static Analysis and Program Verification. Proceedings of CAV 2024, pages 302-328, LNCS 14682, 2024.

[1.3] Haoze Wu, Clark W. Barrett, Nina Narodytska. Lemur: Integrating Large Language Models in Automated Program Verification. ICLR 2024.

[1.4] Kai Chen, Chunwei Wang, Kuo Yang, Jianhua Han, Lanqing HONG, Fei Mi, Hang Xu, Zhengying Liu, Wenyong Huang, Zhenguo Li, Dit-Yan Yeung, Lifeng Shang. Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis, ICLR 2024.

[1.5] Yiwen Sun, Xianyin Zhang, Shiyu Huang, Shaowei Cai, Bing-Zhen Zhang, Ke Wei. AutoSAT: Automatically Optimize SAT Solvers via Large Language Models. CoRR abs/2402.10705 (2024)

[1.6] Ye Liu, Yue Xue, Daoyuan Wu, Yuqiang Sun, Yi Li, Miaolei Shi, Yang Liu. PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation. Accepted by NDSS Symposium 2025. CoRR abs/2405.02580 (2024)

[1.7] Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan J. Prenger, Animashree Anandkumar. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models. NeurIPS 2023.

[1.8] Haonan Li, Yu Hao, Yizhuo Zhai, Zhiyun Qian. Assisting Static Analysis with Large Language Models: A ChatGPT Experiment. Proceedings of ESEC/SIGSOFT FSE 2023, pages 2107-2111, 2023.

[1.9] Matthias Cosler, Christopher Hahn, Daniel Mendoza, Frederik Schmitt, Caroline Trippel. nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models. Proceedings of CAV 2023, LNCS 13965, pages 383-396, Springer, 2023.

2. Formal Methods for Large Language Models:

建议报告论文:

[2.1]Abhinav Rao, Atharva Naik, Sachin Vashistha, Somak Aditya, Monojit Choudhury. Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks. LREC/COLING 2024: 16802-16830.

[2.2]Maxwell Crouse, Ibrahim Abdelaziz, Kinjal Basu, Soham Dan, Sadhana Kumaravel, Achille Fokoue, Pavan Kapanipathi, Luis A. Lastras. Formally Specifying the High-Level Behavior of LLM-Based Agents. CoRR abs/2310.08535 (2023)