
Reza Mahjourian

Researcher in robotics
and computer vision

About

I'm a PhD student in the Computer Science Department at UT Austin. I'm interested in learning algorithms, especially in the context of robotics and computer vision.

I recently completed a Student Researcher Program at Google Brain Robotics in Mountain View, CA, lasting about 19 months.

I got my Master's in Computer Science from the University of Florida, working mostly on theoretical computer science and approximation algorithms. I got my Bachelor's degree in Computer Engineering from Sharif University of Technology.

Before starting graduate school, I worked as a software engineer. Examples of my work include: lead software developer for a startup, a light-weight web framework open-sourced in 2002, and library functions contributed to TensorFlow.



My PhD talk on sample-efficient learning of robot table tennis in a virtual reality environment.
High-resolution videos are available on the project website.

News

Research


Computer Vision

In computer vision, my focus has been on unsupervised and self-supervised learning to extract information from readily-available sources of data. In particular, I have worked on applying deep learning and geometry to estimate scene depth and camera motion just from analyzing the movement of pixels in raw single-view videos.



Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
AAAI, 2019
Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
This paper presents a refined unsupervised depth and motion prediction model that is capable of predicting depth and motion of dynamic objects in addition to the motion of the camera, all from raw single-view (monocular) video. In addition, if multiple frames are available at inference time, a refinement process produces more accurate depth and motion estimates.
Future Semantic Segmentation Using 3D Structure
ECCV 3D Reconstruction meets Semantics Workshop, 2018
Suhani Vora, Reza Mahjourian, Soeren Pirk, Anelia Angelova
Given a stream of monocular video frames sparsely labelled with semantic segmentation maps, this method estimates the 3D structure of the scene and uses that to predict the semantic segmentation of future frames.
Unsupervised Learning of Depth and Egomotion from Monocular Video Using 3D Geometric Constraints
CVPR, 2018
Reza Mahjourian, Martin Wicke, Anelia Angelova
This paper applies deep learning and geometry to estimate scene depth and camera motion just from analyzing the movement of pixels in raw single-view videos. The neural network estimates 3D point clouds for each frame and the camera motion between adjacent frames. Transforming the point clouds based on the estimated camera motion and aligning them in 3D provides the supervisory signal for learning both depth and camera motion without ground truth.
Geometry-Based Next Frame Prediction from Monocular Video
IEEE Intelligent Vehicles, 2017
Reza Mahjourian, Martin Wicke, Anelia Angelova
A recurrent neural network with convolutional LSTM cells is trained to predict depth from a sequence of monocular video frames. The memory in the LSTM cells allows the network to retain information from earlier frames in the sequence. The depth prediction along with the camera trajectory is then used to compute a prediction for the next frame.
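
The point-cloud step described in the CVPR 2018 entry above can be illustrated with a small sketch. It is not the paper's implementation: the pinhole intrinsics (fx, fy, cx, cy), the function names, and the toy depth map are all assumptions made for illustration, and the paper's 3D alignment is replaced here by a plain mean distance between corresponding points.

```python
import math

def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (list of rows) to a list of 3D points with a pinhole model."""
    return [
        ((u - cx) * z / fx, (v - cy) * z / fy, z)
        for v, row in enumerate(depth)
        for u, z in enumerate(row)
    ]

def transform(points, rotation, translation):
    """Apply a rigid motion: rotation is a 3x3 nested list, translation a 3-tuple."""
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]

def mean_distance(cloud_a, cloud_b):
    """Mean Euclidean distance between corresponding points (illustrative residual)."""
    return sum(math.dist(p, q) for p, q in zip(cloud_a, cloud_b)) / len(cloud_a)

# Toy data (hypothetical): a flat surface 2 units in front of the camera.
depth = [[2.0] * 3 for _ in range(2)]
cloud = backproject(depth, fx=100.0, fy=100.0, cx=1.0, cy=0.5)

# Estimated camera motion (hypothetical): no rotation, shift of -0.5 along z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
moved = transform(cloud, identity, (0.0, 0.0, -0.5))

# If depth and motion estimates were consistent with the next frame's cloud,
# a residual like this one would be small; here it only measures the shift.
residual = mean_distance(cloud, moved)
```

In a learning setup, a residual of this kind computed between the transformed cloud of one frame and the estimated cloud of the adjacent frame would act as the training signal, so no ground-truth depth is needed.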




Robotics and Reinforcement Learning

In robotics, my focus has been on developing approaches that are sample-efficient enough that learning algorithms can be used to solve complex robotic tasks. My work has explored active learning for robotics. I have worked on applying hierarchical learning to robotics in setups where the low-level control problems are solved using optimal control and model-free learning is used only for high-level behaviors. Our recent work on learning robot table tennis is one such approach: it trains zero-shot striking skills based on dynamics models trained from observing human games in a virtual reality environment, and applies model-free reinforcement learning selectively to discover novel game-play strategies.

In studying reinforcement learning, I have worked on understanding the properties of learning algorithms and problem domains that contribute to the success or failure of learning approaches. I have studied the impact of domain properties like ergodicity and stochasticity on reinforcement learning with self-play. I have also worked on meta-learning to discover effective feature sets for reinforcement learning.


Hierarchical Policy Design for Sample-Efficient Learning of Robot Table Tennis Through Self-Play
arXiv preprint, 2018
Reza Mahjourian, Risto Miikkulainen, Nevena Lazic, Sergey Levine, Navdeep Jaitly
This work studies sample-efficient learning of complex policies in the context of robot table tennis. Human demonstrations in a virtual reality environment are used to train dynamics models for the game objects, which together with an analytic paddle controller allow any robot anatomy to play table tennis without training episodes. Self-play is used to train cooperative and adversarial game-play strategies on top of model-based striking skills trained from human demonstrations. Further experiments demonstrate that more flexible variants of the policy can discover new strikes not demonstrated by humans and achieve higher performance at the expense of lower sample-efficiency. The high sample-efficiency demonstrated in the evaluations shows that the proposed method is suitable for learning directly on physical robots without transfer of models or policies from simulation.
Task Planning with Guided Policy Search
Preprint, 2016
Reza Mahjourian, Risto Miikkulainen
Discovering suitable cost functions allows Guided Policy Search (GPS) to solve tasks that require planning for intermediate goals. As the animation in the video shows, direct optimization may lead to local optima.
Neuroevolutionary Planning for Robotic Control
PhD Proposal, 2016
Reza Mahjourian, Risto Miikkulainen
In this work, an evolutionary strategy is applied to discover robotic controllers for an object manipulation task. For simple control tasks, controllers with precise behavior are learned. However, when the task is complex enough that it requires strategy and planning, finding solutions becomes hard. This work proposes a new evolutionary method to discover and complete subtasks leading to completion of an original objective.
Robotic Control Through Neuroevolution
BEACON, 2014
Reza Mahjourian, Risto Miikkulainen
This work studies the impact of neural network architecture on efficiency of neuroevolution (NEAT) on object manipulation tasks using the Atlas robot.
An Evolutionary Feature Discovery Method for Reinforcement Learning
GECCO submission, 2013
Reza Mahjourian, Peter Stone
This work presents a meta-learning approach for generating and evaluating candidate feature sets for reinforcement learning with linear function approximators (Gradient-Descent Sarsa(λ)).
Studying Impact of Domain Ergodicity and Stochasticity on Reinforcement Learning with Self-Play
Preprint, 2011
Reza Mahjourian, Prateek Maheshwari, Risto Miikkulainen
This work studies hypotheses on why reinforcement learning worked so well for backgammon in TD-Gammon. Does backgammon have particular properties that make it easier for reinforcement learning and self-play to work? Can these properties be exploited to design better general learning algorithms? Follow-up experiments show domain stochasticity to have a strong impact on reinforcement learning with self-play.
Optimizing Selection of Training Samples for Robotics Learning Problems
Preprint, 2011
Reza Mahjourian, Peter Stone
This work uses an ensemble of neural networks and selects training samples by prioritizing the data points on which the networks in the ensemble disagree most in their predictions (highest variance).
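
The ensemble-based sample selection described in the last entry above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method from the preprint: the function name, the scalar-output models, and the three toy predictors standing in for trained networks are all hypothetical.

```python
from statistics import pvariance

def select_by_disagreement(candidates, ensemble, k):
    """Return the k candidates whose ensemble predictions have the highest variance."""
    scored = [
        (pvariance([model(x) for model in ensemble]), x)
        for x in candidates
    ]
    # Highest disagreement first; sort on the variance only.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [x for _, x in scored[:k]]

# Hypothetical stand-ins for trained networks: three linear predictors that
# agree at x = 0 and diverge as |x| grows.
ensemble = [lambda x: 1.0 * x, lambda x: 1.2 * x, lambda x: 0.8 * x]
candidates = [0.0, 0.5, -2.0, 3.0, 1.0]

picked = select_by_disagreement(candidates, ensemble, k=2)
```

With these toy predictors the variance grows with the square of x, so the two inputs farthest from zero (3.0 and -2.0) are selected for labeling.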




Theoretical Computer Science

An Approximation Algorithm for Conflict-Aware Broadcast Scheduling in Wireless Ad Hoc Networks
The ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), 2008
Reza Mahjourian, Feng Chen, Ravi Tiwari, My Thai, Hongqiang Zhai, Yuguang Fang
This paper introduces a constant-factor approximation algorithm for minimum-latency conflict-aware broadcast scheduling in wireless networks and proves its correctness. A constant-factor approximation algorithm is a polynomial-time algorithm for an NP-hard problem whose output is guaranteed to be within a constant multiple of the optimal solution.
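
The approximation guarantee described above can be stated compactly for a minimization problem such as broadcast latency. The symbols are assumptions introduced here for illustration: ALG(I) is the latency of the schedule the algorithm produces on instance I, OPT(I) is the minimum achievable latency, and c >= 1 is a constant that does not depend on I. The snippet uses `\text`, so it assumes the amsmath package.

```latex
\[
  \mathrm{ALG}(I) \;\le\; c \cdot \mathrm{OPT}(I)
  \qquad \text{for every instance } I .
\]
```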




Software Engineering

An Architectural Style for Data-Driven Systems
International Conference on Software Reuse (ICSR), 2008
Reza Mahjourian
This paper describes the design of XPage, a light-weight web application framework, which was published as open-source software in 2002 and deployed in six data management apps by the author. It is designed specifically for data management applications and allows the developer to specify each application page at a very high level by specifying the data sources and attributes that it retrieves or modifies.
Software Connector Classification and Selection for Data-Intensive Systems
International Workshop on Incorporating COTS Software into Software Systems, 2007
Chris A. Mattmann, David Woollard, Nenad Medvidovic, Reza Mahjourian
This work explores the role of software connectors in systems specifically designed for distributing large volumes of data.