Chao Yu (于超), Advisor, Beijing Zhongguancun Academy

2025-03-28

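Basic Information

Advisor name: Chao Yu (于超)

Position: Advisor, Beijing Zhongguancun Academy

Main research areas: reinforcement learning algorithms and their applications. Focus areas include multi-agent reinforcement learning and game theory; large-scale reinforcement learning training systems in the era of large models; embodied intelligence, mainly embodied large models, embodied reasoning, and world models; and next-generation human-computer interaction based on large models.

Biography: Chao Yu received a Ph.D. from Tsinghua University, advised by Professor Yu Wang of the Department of Electronic Engineering and Assistant Professor Yi Wu of the Institute for Interdisciplinary Information Sciences. Yu's research centers on decision intelligence based on reinforcement learning, with a focus on next-generation human-computer interaction agents, embodied intelligence, and UAV decision-making and control. Honors include Tsinghua University Outstanding Doctoral Graduate, Tsinghua University Outstanding Doctoral Dissertation, a nomination for the 2024 China Agents and Multi-Agent Systems Outstanding Doctoral Dissertation Award (5 recipients), and the National Scholarship. As a postdoctoral researcher, Yu was selected for Tsinghua University's "Shuimu Scholar" program and the Department of Electronic Engineering's "Chuanxin Future Scholar" (传信未来学者) program, and received the Zhang Keqian named postdoctoral fellowship. Yu has served as principal investigator on a National Natural Science Foundation Young Scientists Fund project, a Postdoctoral Science Foundation Special Funding project, a Postdoctoral Science Foundation General Program project, and several industry-sponsored projects. Yu has published more than 30 papers in leading international conferences and journals, with over 3,300 Google Scholar citations in total; the representative work, the multi-agent reinforcement learning algorithm MAPPO, currently has over 1,600 Google Scholar citations.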

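Personal Experience

Education:

2019-2023    Ph.D., Department of Electronic Engineering, Tsinghua University

2016-2019    M.S., Department of Mechanical Engineering, Tsinghua University

2012-2016    B.S., School of Automation, Beijing Institute of Technology

Professional experience:

2025-present    Researcher, Beijing Zhongguancun Academy

2023-2025    Postdoctoral Researcher, Department of Electronic Engineering, Tsinghua University

2024-2025    Visiting Scholar, College of Computing and Data Science, Nanyang Technological University

Professional service:

Reviewer for ICLR, ICML, NeurIPS, ICRA, IROS, RAL, TPAMI, and other venues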

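Research

Research projects:

2025-2027    Natural Science Foundation Young Scientists Fund project, principal investigator

2024-2025    Postdoctoral Science Foundation Special Funding project, principal investigator

2024-2025    Postdoctoral Science Foundation General Program project, principal investigator

2025-2026    Tsinghua University-EFORT (埃夫特) Embodied Perception and Computing Research Center project, principal investigator

2023-2025    National special project for an unnamed ministry, participant

2025-2027    Huawei industry-sponsored project, participant

2021-2025    Toyota industry-sponsored project, participant

Selected recent publications: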

• Chao Yu*, Akash Velu*, Eugene Vinitsky, Jiaxuan Gao, Yu Wang, Alexandre Bayen, Yi Wu. The Surprising Effectiveness of PPO in Cooperative Multi-agent Games. In Advances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2022.

• Chao Yu, Zuxin Liu, Xin-Jun Liu, Fugui Xie, Yi Yang, Qi Wei, Fei Qiao. DS-SLAM: A semantic visual SLAM towards dynamic environments. In International Conference on Intelligent Robots and Systems (IROS), 2018.

• Chao Yu, Xinyi Yang, Jiaxuan Gao, Jiayu Chen, Yunfei Li, Jijia Liu, Yunfei Xiang, Ruixin Huang, Huazhong Yang, Yi Wu, Yu Wang. Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-time Multi-robot Cooperative Exploration. In International Conference on Autonomous Agents and Multi-agent Systems (AAMAS), 2023.

• Chao Yu*, Xinyi Yang*, Jiaxuan Gao*, Huazhong Yang, Yu Wang, Yi Wu. Learning Efficient Multi-agent Cooperative Visual Exploration. In European Conference on Computer Vision (ECCV), 2022.

• Chao Yu*, Jiaxuan Gao*, Weilin Liu, Botian Xu, Hao Tang, Jiaqi Yang, Yu Wang, Yi Wu. Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased. In International Conference on Learning Representations (ICLR), 2023.

• Zhenggang Tang*, Chao Yu*, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu. Discovering Diverse Multi-agent Strategic Behavior Via Reward Randomization. In International Conference on Learning Representations (ICLR), 2021.

• Jijia Liu*, Chao Yu*, Jiaxuan Gao*, Yuqing Xie, Qingmin Liao, Yi Wu, Yu Wang. LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024.

• Zelai Xu, Chao Yu+, Yancheng Liang, Yi Wu, Yu Wang+. Learning Global Nash Equilibrium in Team Competitive Games with Generalized Fictitious Cross-Play. In Journal of Machine Learning Research (JMLR), 26, MIT Press and Microtome Publishing, 2025.

• Zelai Xu, Chao Yu, Fei Fang, Yu Wang, Yi Wu. Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game. In International Conference on Machine Learning (ICML), 2024.

• Wei Fu, Chao Yu, Zelai Xu, Jiaqi Yang, Yi Wu. Revisiting Some Common Practices in Cooperative Multi-agent Reinforcement Learning. In International Conference on Machine Learning (ICML), 2022.

• Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu+, Yi Wu+. Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study. In International Conference on Machine Learning (ICML), 2024.

• Botian Xu, Feng Gao, Chao Yu+, Ruize Zhang, Yi Wu, Yu Wang+. OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control. In IEEE Robotics and Automation Letters (RAL), 2024.

• Xinyi Yang*, Yuxiang Yang*, Chao Yu+, Jiayu Chen, Jincheng Yu, Haibing Ren, Huazhong Yang, Yu Wang+. Active Neural Topological Mapping for Multi-Agent Exploration. In IEEE Robotics and Automation Letters (RAL), 2024.

• Chao Yu, Qixin Tan, Hong Lu, Jiaxuan Gao, Xinting Yang, Yu Wang+, Yi Wu+, Eugene Vinitsky+. Few-shot In-context Preference Learning. arXiv preprint, https://arxiv.org/pdf/2410.17233, 2024.

• Jiayu Chen*, Chao Yu*+, Yuqing Xie, Feng Gao, Yinuo Chen, Shu'ang Yu, Wenhao Tang, Shilong Ji, Mo Mu, Yi Wu, Huazhong Yang, Yu Wang+. What Matters in Learning A Zero-Shot Sim-to-Real RL Policy for Quadrotor Control? A Comprehensive Study. arXiv preprint, https://arxiv.org/abs/2412.11764, 2024.

• Jiayu Chen*, Chao Yu*+, Guosheng Li, Wenhao Tang, Xinyi Yang, Botian Xu, Huazhong Yang, Yu Wang+. Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning. arXiv preprint, https://arxiv.org/pdf/2409.15866, 2024.

• Feng Gao, Chao Yu+, Yu Wang, Yi Wu+. Neural Internal Model Control: Learning a Robust Control Policy via Predictive Error Feedback. arXiv preprint, https://arxiv.org/abs/2411.13079, 2024.

• Zelai Xu, Chao Yu+, Ruize Zhang, Huining Yuan, Xiangmin Yi, Shilong Ji, Chuqi Wang, Wenhao Tang, Yu Wang+. VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play. arXiv preprint, https://arxiv.org/abs/2502.01932, 2025.

Honors and awards:

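• Tsinghua University Shuimu Scholar

• Chuanxin Future Scholar (传信未来学者), Department of Electronic Engineering, Tsinghua University

• Nomination, 2024 China Agents and Multi-Agent Systems Outstanding Doctoral Dissertation Award

• Tsinghua University Outstanding Doctoral Graduate

• Tsinghua University Outstanding Doctoral Dissertation

• Tsinghua University Outstanding Master's Thesis

• National Scholarship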
