I am a fourth-year (2017-) Computer Science Ph.D. student at The University of Texas at Austin, supervised by Prof. Philipp Krähenbühl. I obtained my bachelor's degree from the School of Computer Science at Fudan University, advised by Prof. Wei Zhang and Prof. Xiangyang Xue.
I have interned with Dr. Yichen Wei at Microsoft Research Asia, Tyler Zhu and Dr. Kevin Murphy at Google Research, and Dr. Vladlen Koltun at Intel Labs. I am a recipient of the Facebook Fellowship in 2021.
My research focuses on object-level visual recognition, including object detection, 3D perception, pose estimation, and tracking.

CV / Google Scholar / GitHub / LinkedIn
Last updated April 2021
Computer vision is fractured into datasets, each covering a narrow visual domain. My research aims to remove these artificial barriers and make computer vision work in the wild. Toward this goal, my research focuses on two components: a unified point-based object representation, and a framework that automatically unifies the taxonomies of multiple datasets. I developed a point-based detection framework, CenterNet, that unifies many object-based recognition tasks, including object detection, human pose estimation, tracking, and 3D detection. The CenterNet framework forms the basis of a family of detection models and is already widely used. In our recent work UniDet, we explored how to learn object-based recognition from multiple data sources. It merges taxonomies across different datasets using a purely visual distance metric, and our entry won the ECCV 2020 Robust Vision Challenge. Going forward, I want to unify all object-based computer vision tasks in a single model by jointly training on multiple datasets with different task annotations, so that there is one computer vision model instead of a zoo of domain-specific ones.
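To make the point-based representation concrete, here is a minimal NumPy sketch of its core step: each object is a peak in a center heatmap, so detection reduces to finding local maxima. The function name, parameters, and the top-k selection below are illustrative assumptions for this sketch, not CenterNet's actual API; the 3x3 max-comparison mirrors the max-pooling trick CenterNet uses in place of non-maximum suppression.

```python
import numpy as np

def extract_centers(heatmap, k=5):
    """Toy illustration (not CenterNet's API): return the k strongest
    local-maximum peaks of a single-class center heatmap as (y, x, score)."""
    h, w = heatmap.shape
    # Pad with -inf so border pixels have a full 3x3 neighborhood.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Max over each pixel's 3x3 neighborhood, built from 9 shifted views.
    neighborhood = np.max(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    # A pixel is a peak iff it equals its own neighborhood maximum;
    # this replaces box-based non-maximum suppression.
    peaks = heatmap * (heatmap >= neighborhood)
    # Keep the k highest-scoring peaks.
    ys, xs = np.unravel_index(np.argsort(peaks.ravel())[::-1][:k], peaks.shape)
    return [(int(y), int(x), float(peaks[y, x])) for y, x in zip(ys, xs)]
```

In a full detector, each surviving peak would then index regression maps (box size, offset, depth, or pose keypoints), which is what lets one representation serve detection, pose estimation, tracking, and 3D detection.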