I am a final-year Ph.D. student in Computer Science at Nanjing University, advised by Prof. Tong Lu. Later this year, I will be joining NVIDIA Research as a Research Scientist.
My research focuses on Vision-Language Models (VLMs) and Autonomous Driving Perception. I have published 10+ papers (Google Scholar) at top international AI conferences such as CVPR, ICCV, and ECCV.
Currently, I am working on frontier Vision-Language Models at NVIDIA, collaborating with Zhiding Yu, Guilin Liu, and other outstanding researchers on Project Eagle. Eagle is contributing to NVIDIA Cosmos Nemotron and NVIDIA Isaac GR00T N1.
Previously, I focused on autonomous driving, where I developed BEVFormer, a widely recognized method in the field. I have also won three consecutive championships in autonomous driving challenges at CVPR 2022, 2023, and 2024.
I am also the first Ph.D. student from a Chinese institution to be awarded the NVIDIA PhD Fellowship.
🔥 News
- 2025-03: Eagle2 has been adopted by the NVIDIA GEAR Team to develop the robotic foundation model GR00T N1.
- 2025-01: We present the frontier VLM Eagle2; the model weights have been released on Hugging Face.
- 2024-11: BEVFormer was accepted to T-PAMI.
- 2024-06: At CVPR 2024, our Team-NVIDIA won the outstanding champion and Innovation Award of the End-to-End Driving at Scale Challenge. Our team at NJU-ImagineLab won the True outstanding champion of Driving with Language.
- 2024-02: BEV-Planner and DCNv4 were accepted to CVPR 2024.
- 2023-11: VCD was accepted to NeurIPS 2023, and a BEV survey was accepted to T-PAMI.
- 2023-10: InternImage was selected as one of the CVPR 2023 Top-10 Influential Papers.
- 2023-07: FB-BEV was accepted to ICCV 2023.
- 2023-06: Our team won both the outstanding champion and Innovation Award of the Occupancy Prediction Challenge (15,000 + 5,000 USD bonus) at CVPR 2023.
- 2023-01: BEVFormer was selected as one of the ECCV 2022 Top-10 Influential Papers and one of the 100 most cited AI papers of 2022.
- 2022-11: We released InternImage, setting a new record of 65.4 box mAP on COCO test-dev.
- 2022-06: Our team won the Waymo 2022 3D Camera-Only Detection Task (15,000 USD bonus).
📝 Publications and Reports

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models
Zhiqi Li, Guo Chen, Shilong Liu, Shihao Wang, Vibashan VS, Yishen Ji, Shiyi Lan, Hao Zhang, Yilin Zhao, Subhashree Radhakrishnan, Nadine Chang, Karan Sapra, Amala Sanjay Deshmukh, Tuomas Rintamaki, Matthieu Le, Ilia Karmanov, Lukas Voegtle, Philipp Fischer, De-An Huang, Timo Roman, Tong Lu, Jose M Alvarez, Bryan Catanzaro, Jan Kautz, Andrew Tao, Guilin Liu, Zhiding Yu
- This work focuses on developing open-source vision-language models by emphasizing data strategy in post-training, resulting in the performant Eagle2 models that achieve state-of-the-art results across various multimodal benchmarks.
🎖 Honors and Awards
- 2024-06: True outstanding champion of Driving with Language
- 2024-06: Outstanding champion and Innovation Award of End-to-End Driving at Scale Challenge
- 2023-12: NVIDIA Graduate Fellowship (Top-10 globally, first awardee from a Chinese institution)
- 2023-12: Distinguished Student of Nanjing University (南京大学学生年度人物, Top-10)
- 2023-11: National Scholarship
- 2023-06: Occupancy Prediction Challenge, 1st Place and Innovation Award (15,000 + 5,000 USD Bonus)
- 2022-06: Waymo 2022 3D Camera-Only Detection Task, 1st Place (15,000 USD Bonus)
📖 Education
- 20.09 - present, Nanjing University, China
- 16.09 - 20.06, Nanjing University, China