Research in my group focuses on two intimately connected threads: Robotics and Embodied AI. We investigate the synergistic relations between perception and action in embodied agents and build intelligent algorithms that give rise to general-purpose robot autonomy.

In Robotics, we develop methods and mechanisms that enable autonomous robots to reason about the real world through their senses, to flexibly perform a wide range of tasks, and to adaptively learn new tasks. To deploy general-purpose robot autonomy in the wild, we have to deal with the variability and uncertainty of unstructured environments. We address this challenge by closing the perception-action loop using robot perception and learning techniques.

In Embodied AI, we build computational frameworks of embodied agents. In these frameworks, perception arises from an embodied agent’s active, situated, and skillful interactions in the open world, and its ability to make sense of the world through the lens of perception, in turn, facilitates intelligent behaviors.

Our work draws on theories and methods from robotics, machine learning, and computer vision, along with inspiration from human cognition, neuroscience, and philosophy, to solve open problems at the forefront of Robotics and AI. We are always looking for talented members to join our group.

Talks and Tutorials

You can learn more about my recent research from my talks and tutorials.

  • Visual Imitation Learning: Generalization, Perceptual Grounding, and Abstraction. RSS’20 Workshop on Advances & Challenges in Imitation Learning for Robotics, July 2020. (workshop, slides)

  • Building General-Purpose Robot Autonomy: A Progressive Roadmap. Samsung Forum, June 2020. (video, slides)

  • Learning Keypoint Representations for Robot Manipulation. IROS’19 Workshop on Learning Representations for Planning and Control, November 2019. (workshop, slides)

  • Learning How-To Knowledge from the Web. IROS’19 Workshop on the Applications of Knowledge Representation and Semantic Technologies in Robotics, November 2019. (workshop, slides)

  • Closing the Perception-Action Loop: Towards General-Purpose Robot Autonomy. Stanford Ph.D. Defense, August 2019. (dissertation, slides)

Open-Source Software & Data

I devote effort to making scientific research more reproducible and making knowledge accessible to a broader population. Open-sourcing research software and datasets is one of my key practices. You can find open-source code and data from my research on the Publications page or on my GitHub. I highlight some public resources below:

  • RoboTurk: large-scale crowdsourced teleoperation dataset for robotic imitation learning

  • SURREAL: distributed reinforcement learning framework and robot manipulation benchmark

  • AI2-THOR: open-source interactive environments for embodied AI

  • Visual Genome: visual knowledge base that connects structured image concepts to language

Media Coverage (Selected)

Tech Xplore, November 21, 2018
In the future, RoboTurk could become a key resource in the field of robotics, aiding the development of more advanced and better performing robots.

Stanford News, October 26, 2018
With a smartphone and a browser, people worldwide will be able to interact with a robot to speed the process of teaching robots how to do basic tasks.

NVIDIA, April 3, 2018
Robots learning to do things by watching how humans do it? That’s the future.

Digital Trends, February 19, 2018
Robots are getting better at dealing with the complexity of the real world, but they still need a helping hand when taking their first tentative steps outside of easily defined lab conditions.

MIT Technology Review, February 16, 2018
A new digital training ground that replicates an average home lets AI learn how to do simple chores like slicing apples, making beds, or carrying drinks in a low-stakes environment.

IEEE Spectrum, February 15, 2018
AI2-THOR, an interactive simulation based on home environments, can prepare AI for real-world challenges.

MIT Technology Review, January 26, 2016
A new database will gauge progress in artificial intelligence, as computers try to grasp what’s going on in scenes shown in photographs.