About
I am a Research Scientist at NVIDIA Research, based in Singapore. Previously, I earned my Ph.D. in Computer Science from Nanjing University, advised by Prof. Tong Lu.
Currently, I am working on frontier VLM/VLA models at NVIDIA. Previously we developed Project Eagle. Eagle has contributed to NVIDIA Nemotron VLMs and NVIDIA Isaac GR00T N1.
During my Ph.D., I focused on autonomous driving, where I developed BEVFormer, a widely recognized method in the field. I have also won three consecutive championships in autonomous driving challenges at CVPR 2022, 2023, and 2024.
I am also the first Ph.D. student from a Chinese institution to be awarded the NVIDIA PhD Fellowship.
Hiring interns. We are hiring interns to work on VLM/VLA models for NVIDIA flagship projects. Contact zhiqil@nvidia.com — December 2025
Recent News
We release Vesta, a generalist embodied-reasoning foundation model that consolidates localization, spatial reasoning, navigation, and long-horizon planning into a single system. With curated embodied data and a multimodal memory harness, it surpasses the strongest specialist baselines by over 20% on average and improves memory-heavy real-robot tasks by 38.3%.
We release LocateAnything, a unified vision-language model for fast, high-quality visual grounding. Its Parallel Box Decoding predicts whole bounding boxes in a single pass — up to 2.5× higher throughput — across object detection, GUI grounding, and document/OCR localization. The model is available on Hugging Face.
Eagle2 has been adopted by the NVIDIA GEAR Team to develop the robotic foundation model GR00T N1.
We present Eagle2, a frontier VLM; the model weights have been released on Hugging Face.
BEVFormer was accepted at T-PAMI.
At CVPR 2024, our Team-NVIDIA won the outstanding champion and Innovation Award of the End-to-End Driving at Scale Challenge. Our team at NJU-ImagineLab won the True outstanding champion of Driving with Language.
BEV-Planner and DCNv4 were accepted at CVPR 2024.
VCD was accepted at NeurIPS 2023, and a BEV survey was accepted at T-PAMI.
InternImage was selected as one of the CVPR 2023 Top-10 Influential Papers.
FB-BEV was accepted at ICCV 2023.
Our team won both the outstanding champion and Innovation Award of the Occupancy Prediction Challenge (15,000 + 5,000 USD Bonus) at CVPR 2023.
BEVFormer was selected as one of the ECCV 2022 Top-10 Influential Papers and one of the 100 most cited AI papers in 2022.
We release InternImage, setting a new record of 65.4 box mAP on COCO test-dev.
Our team won first place in the Waymo 2022 3D Camera-Only Detection Task (15,000 USD Bonus).
Selected Publications
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
CVPR 2023 · Top-10 Influential Paper
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
ECCV 2022 · Top-10 Influential Paper
Honors & Awards
True outstanding champion of Driving with Language
Outstanding champion and Innovation Award of the End-to-End Driving at Scale Challenge
NVIDIA Graduate Fellowship — Top-10 globally, first awarded student from a Chinese institution
Distinguished Student of Nanjing University (南京大学学生年度人物, Top-10)
National Scholarship
Occupancy Prediction Challenge — 1st Place and Innovation Award (15,000 + 5,000 USD Bonus)
Waymo 2022 3D Camera-Only Detection Task — 1st Place (15,000 USD Bonus)
Education
Ph.D., Computer Science · Nanjing University, China
B.Sc. · Nanjing University, China