Dian Chen

I am a second-year PhD student in CS at UT Austin, advised by Prof. Philipp Krähenbühl.

Previously, I studied at UC Berkeley, majoring in Computer Science and Applied Mathematics, where I worked with Dr. Pulkit Agrawal, Deepak Pathak, Prof. Sergey Levine, Prof. Pieter Abbeel, and Prof. Jitendra Malik as a research assistant in the Berkeley Artificial Intelligence Research (BAIR) Lab.

Email  /  GitHub  /  Scholar


My research interests lie in robotics, computer vision, and machine learning, including reinforcement learning.

Learning Instance Segmentation by Interaction
Deepak Pathak*, Fred Shentu*, Dian Chen*, Pulkit Agrawal*, Trevor Darrell, Sergey Levine, Jitendra Malik (*equal contribution)
Robotics Vision Workshop, Conference on Computer Vision and Pattern Recognition (CVPR), 2018
website / arxiv

We present a robotic system that learns to segment its visual observations into individual objects by experimenting with its environment in a completely self-supervised manner. Our system is on par with the state-of-the-art instance segmentation algorithm trained with strong supervision.

Zero-Shot Visual Imitation
Deepak Pathak*, Parsa Mahmoudieh*, Michael Luo*, Pulkit Agrawal*, Dian Chen, Fred Shentu, Evan Shelhamer, Jitendra Malik, Alexei Efros, Trevor Darrell (*equal contribution)
(Oral Presentation) International Conference on Learning Representations (ICLR), 2018
website / arxiv

We present a novel skill policy architecture and dynamics consistency loss, which extend visual imitation to more complex environments while improving robustness. Experimental results are shown on a robot knot-tying task and a first-person visual navigation task.

Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation
Ashvin Nair*, Dian Chen*, Pulkit Agrawal*, Phillip Isola, Jitendra Malik, Pieter Abbeel, Sergey Levine (*equal contribution)
IEEE International Conference on Robotics and Automation (ICRA), 2017
website / arxiv

We present a system in which a robot takes as input a sequence of images of a human manipulating a rope from an initial to a goal configuration, and outputs a sequence of actions that can reproduce the human demonstration, using only monocular images as input.

CS395T - Deep Learning Seminar - Fall 2019
Teaching Assistant
CS342 - Neural Networks - Fall 2018
Teaching Assistant