
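Basic Information
Advisor name: Chao Yu (于超)
Position: Advisor, Beijing Zhongguancun Academy (北京中关村学院)
Main research areas: reinforcement learning algorithms and their applications, with a focus on multi-agent reinforcement learning and game theory; large-scale reinforcement learning training systems for large models; embodied intelligence, chiefly embodied large models, embodied reasoning, and world models; and next-generation human-computer interaction built on large models.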
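Biography: Chao Yu received his Ph.D. from Tsinghua University, where he was advised by Professor Yu Wang of the Department of Electronic Engineering and Assistant Professor Yi Wu of the Institute for Interdisciplinary Information Sciences. His research centers on decision intelligence based on reinforcement learning, with a focus on next-generation human-computer interaction agents, embodied intelligence, and UAV decision-making and control. His honors include Tsinghua University Outstanding Doctoral Graduate, the Tsinghua University Outstanding Doctoral Dissertation award, the 2024 China Agents and Multi-Agent Systems Outstanding Doctoral Dissertation Nomination Award (five recipients), and the National Scholarship. As a postdoctoral researcher he was selected for Tsinghua University's "Shuimu Scholar" program and the Department of Electronic Engineering's "Chuanxin Future Scholar" (传信未来学者) program, and received the Zhang Keqian (张克潜) named postdoctoral fellowship. He is principal investigator of a National Natural Science Foundation of China Young Scientists project, a China Postdoctoral Science Foundation Special Funding project, a China Postdoctoral Science Foundation General Funding project, and several industry-sponsored projects. He has published more than 30 papers in leading international conferences and journals, with more than 3,300 Google Scholar citations in total. His representative work, the multi-agent reinforcement learning algorithm MAPPO, currently has more than 1,600 Google Scholar citations.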
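Personal Experience
Education:
2019-2023 Tsinghua University, Department of Electronic Engineering, Ph.D.
2016-2019 Tsinghua University, Department of Mechanical Engineering, M.S.
2012-2016 Beijing Institute of Technology, School of Automation, B.S.
Professional experience:
2025-present Beijing Zhongguancun Academy, Researcher
2023-2025 Tsinghua University, Department of Electronic Engineering, Postdoctoral Researcher
2024-2025 Nanyang Technological University, School of Data and Computer Science (数据与计算机学院), Visiting Scholar
Academic service:
Reviewer for ICLR, ICML, NeurIPS, ICRA, IROS, RAL, TPAMI, and others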
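Research
Research projects:
2025-2027 National Natural Science Foundation of China Young Scientists project, principal investigator
2024-2025 China Postdoctoral Science Foundation Special Funding project, principal investigator
2024-2025 China Postdoctoral Science Foundation General Funding project, principal investigator
2025-2026 Tsinghua University-EFORT Research Center for Embodied Perception and Computing project, principal investigator
2023-2025 National special project for an unnamed government department, participant
2025-2027 Huawei industry-sponsored project, participant
2021-2025 Toyota industry-sponsored project, participant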
Selected recent publications:
• Chao Yu*, Akash Velu*, Eugene Vinitsky, Jiaxuan Gao, Yu Wang, Alexandre Bayen, Yi Wu. The Surprising Effectiveness of PPO in Cooperative Multi-agent Games. In Advances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2022.
• Chao Yu, Zuxin Liu, Xin-Jun Liu, Fugui Xie, Yi Yang, Qi Wei, Fei Qiao. DS-SLAM: A semantic visual SLAM towards dynamic environments. In International Conference on Intelligent Robots and Systems (IROS), 2018.
• Chao Yu, Xinyi Yang, Jiaxuan Gao, Jiayu Chen, Yunfei Li, Jijia Liu, Yunfei Xiang, Ruixin Huang, Huazhong Yang, Yi Wu, Yu Wang. Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-time Multi-robot Cooperative Exploration. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023.
• Chao Yu*, Xinyi Yang*, Jiaxuan Gao*, Huazhong Yang, Yu Wang, Yi Wu. Learning Efficient Multi-agent Cooperative Visual Exploration. In European Conference on Computer Vision (ECCV), 2022.
• Chao Yu*, Jiaxuan Gao*, Weilin Liu, Botian Xu, Hao Tang, Jiaqi Yang, Yu Wang, Yi Wu. Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased. In International Conference on Learning Representations (ICLR), 2023.
• Zhenggang Tang*, Chao Yu*, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu. Discovering Diverse Multi-agent Strategic Behavior Via Reward Randomization. In International Conference on Learning Representations (ICLR), 2021.
• Jijia Liu*, Chao Yu*, Jiaxuan Gao*, Yuqing Xie, Qingmin Liao, Yi Wu, Yu Wang. LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024.
• Zelai Xu, Chao Yu+, Yancheng Liang, Yi Wu, Yu Wang+. Learning Global Nash Equilibrium in Team Competitive Games with Generalized Fictitious Cross-Play. In Journal of Machine Learning Research (JMLR), 26, MIT Press and Microtome Publishing, 2025.
• Zelai Xu, Chao Yu, Fei Fang, Yu Wang, Yi Wu. Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game. In International Conference on Machine Learning (ICML), 2024.
• Wei Fu, Chao Yu, Zelai Xu, Jiaqi Yang, Yi Wu. Revisiting Some Common Practices in Cooperative Multi-agent Reinforcement Learning. In International Conference on Machine Learning (ICML), 2022.
• Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu+, Yi Wu+. Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study. In International Conference on Machine Learning (ICML), 2024.
• Botian Xu, Feng Gao, Chao Yu+, Ruize Zhang, Yi Wu, Yu Wang+. OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control. In IEEE Robotics and Automation Letters (RAL), 2024.
• Xinyi Yang*, Yuxiang Yang*, Chao Yu+, Jiayu Chen, Jincheng Yu, Haibing Ren, Huazhong Yang, Yu Wang+. Active Neural Topological Mapping for Multi-Agent Exploration. In IEEE Robotics and Automation Letters (RAL), 2024.
• Chao Yu, Qixin Tan, Hong Lu, Jiaxuan Gao, Xinting Yang, Yu Wang+, Yi Wu+, Eugene Vinitsky+. Few-shot In-context Preference Learning. arXiv preprint, https://arxiv.org/pdf/2410.17233, 2024.
• Jiayu Chen*, Chao Yu*+, Yuqing Xie, Feng Gao, Yinuo Chen, Shu'ang Yu, Wenhao Tang, Shilong Ji, Mo Mu, Yi Wu, Huazhong Yang, Yu Wang+. What Matters in Learning A Zero-Shot Sim-to-Real RL Policy for Quadrotor Control? A Comprehensive Study. arXiv preprint, https://arxiv.org/abs/2412.11764, 2024.
• Jiayu Chen*, Chao Yu*+, Guosheng Li, Wenhao Tang, Xinyi Yang, Botian Xu, Huazhong Yang, Yu Wang+. Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning. arXiv preprint, https://arxiv.org/pdf/2409.15866, 2024.
• Feng Gao, Chao Yu+, Yu Wang, Yi Wu+. Neural Internal Model Control: Learning a Robust Control Policy via Predictive Error Feedback. arXiv preprint, https://arxiv.org/abs/2411.13079, 2024.
• Zelai Xu, Chao Yu+, Ruize Zhang, Huining Yuan, Xiangmin Yi, Shilong Ji, Chuqi Wang, Wenhao Tang, Yu Wang+. VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play. arXiv preprint, https://arxiv.org/abs/2502.01932, 2025.
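Honors and awards:
• Tsinghua University Shuimu Scholar
• Tsinghua University Department of Electronic Engineering Chuanxin Future Scholar (传信未来学者)
• 2024 China Agents and Multi-Agent Systems Outstanding Doctoral Dissertation Nomination Award
• Tsinghua University Outstanding Doctoral Graduate
• Tsinghua University Outstanding Doctoral Dissertation
• Tsinghua University Outstanding Master's Thesis
• National Scholarship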