I am a final-year Ph.D. student in Computer Science at Nanjing University, advised by Prof. Tong Lu. Later this year, I will be joining NVIDIA Research as a Research Scientist.

My research focuses on Vision-Language Models (VLMs) and Autonomous Driving Perception. I have published 10+ papers (Google Scholar) at top international AI conferences such as CVPR, ICCV, and ECCV.

Currently, I am working on frontier Vision-Language Models at NVIDIA, collaborating with Zhiding Yu, Guilin Liu, and other outstanding researchers on Project Eagle. Eagle is contributing to NVIDIA Cosmos Nemotron and NVIDIA Isaac GR00T N1.

Previously, I focused on autonomous driving, where I developed BEVFormer, a widely recognized method in the field. I have also won three consecutive championships in autonomous driving challenges at CVPR 2022, 2023, and 2024.

I am also the first Ph.D. student from a Chinese institution to be awarded the NVIDIA PhD Fellowship.

🔥 News

📝 Publications and Reports

arXiv

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models

Zhiqi Li, Guo Chen, Shilong Liu, Shihao Wang, Vibashan VS, Yishen Ji, Shiyi Lan, Hao Zhang, Yilin Zhao, Subhashree Radhakrishnan, Nadine Chang, Karan Sapra, Amala Sanjay Deshmukh, Tuomas Rintamaki, Matthieu Le, Ilia Karmanov, Lukas Voegtle, Philipp Fischer, De-An Huang, Timo Roman, Tong Lu, Jose M Alvarez, Bryan Catanzaro, Jan Kautz, Andrew Tao, Guilin Liu, Zhiding Yu

PDF code

  • This work develops open-source vision-language models with an emphasis on post-training data strategy. The resulting Eagle 2 models achieve state-of-the-art results across various multimodal benchmarks.

🎖 Honors and Awards