School of Mathematical Sciences, Peking University


Seminar

Machine Learning Lab PhD Student Forum Series (Session 22): Recent Progress on the Theory of Robust MDPs

Speaker: Wenhao Yang (PKU)

Time: 2022-01-05 15:10-16:10

Venue: Room 1513, Science Building No. 1, Peking University & Tencent Meeting 761 4699 1810

Abstract: Robust Markov decision processes (MDPs) were proposed to address the sensitivity of value estimation in MDPs to estimation errors: the transition probability is allowed to take values in an uncertainty set. In recent years, many works have proposed computationally efficient algorithms for solving robust MDPs and obtaining near-optimal robust policies and value functions. However, the statistical performance of the optimal robust policy and value function has been less studied. In this talk, we will introduce the basic theory and algorithms of robust MDPs and address two questions: (a) how many samples are sufficient to guarantee the accuracy of the robust estimators; and (b) whether it is possible to make statistical inferences from the robust estimators.
