__Education__

**Master of Science in Computer Science 2014 - 2016**

Courant Institute of Mathematical Sciences, New York University

CGPA: 3.975 / 4.000

**Bachelor of Technology in Computer Science and Engineering 2010 - 2014**

Indian Institute of Technology, Delhi

CGPA: 8.982 / 10.000

__Publications__

Corinna Cortes, Prasoon Goyal, Vitaly Kuznetsov and Mehryar Mohri. "**Kernel Extraction via Voted Risk Minimization.**" *Twenty-Ninth Annual Conference on Neural Information Processing Systems (NIPS) Workshop, 2015. [pdf]*

We proposed a kernel selection algorithm that takes a set of families of kernel functions and learns a classifier that is resistant to overfitting by penalizing kernels according to their local Rademacher complexities. We derived strong theoretical guarantees for our formulation and showed that the learned classifier often outperforms other algorithms while being much sparser.

Happy Mittal, Prasoon Goyal, Vibhav Gogate and Parag Singla. "**New Rules for Domain Independent Lifted MAP Inference.**" *Twenty-Eighth Annual Conference on Neural Information Processing Systems (NIPS), 2014. [pdf]*

In this work, we proposed two new rules for lifted maximum a posteriori (MAP) inference over Markov logic networks (MLNs) that make inference on a large class of MLNs independent of the domain size of the variables. We also proposed algorithms to apply these rules optimally, achieving maximum speed-up where fully domain-independent inference is not achievable. We obtained significant speed-ups over the state of the art on several datasets.

Cijo Jose, Prasoon Goyal, Parv Aggrwal, and Manik Varma. "**Local Deep Kernel Learning for Efficient Non-linear SVM Prediction.**" *Proceedings of the 30th International Conference on Machine Learning (ICML-13), pp. 486-494, 2013. [pdf | code]*

We proposed a sparse feature mapping formulation for non-linear SVMs by learning computationally deep local features. The key idea is to represent the attributes of the feature space as nodes in a tree such that, for every point in the space, exactly one root-to-leaf path is active while the attributes corresponding to all other nodes are zero. We achieved orders-of-magnitude speed-ups over RBF-SVM with an acceptable loss in accuracy, and significant gains in both speed and accuracy over other existing methods.

__Presentations & Talks__

Corinna Cortes, Prasoon Goyal, Vitaly Kuznetsov and Mehryar Mohri. "**Kernel Extraction via Voted Risk Minimization.**" Spotlight Talk and Poster Presentation, Machine Learning Symposium 2016, New York Academy of Sciences. [Download poster]

Prasoon Goyal, Sahil Goel and Kshitiz Sethia. "**Text Summarization for Wikipedia Articles.**" Poster Presentation, Spring 2015 Computer Science Showcase, New York University. [Download poster]

Happy Mittal, Prasoon Goyal, Vibhav Gogate and Parag Singla. "**New Rules for Domain Independent Lifted MAP Inference.**" Poster Presentation, NIPS 2014, Montreal, Canada. [Download poster]

Cijo Jose, Prasoon Goyal, Parv Aggrwal, and Manik Varma. "**Local Deep Kernel Learning for Efficient Non-linear SVM Prediction.**" Poster Presentation, Open House 2013, Indian Institute of Technology Delhi. [Download poster]

__Professional Experience__

**Carnegie Mellon University**

*Research Scholar, Oct 2016 - Present*

I am working on developing algorithms for video analysis, using Bayesian non-parametric methods and deep learning.

*Software Engineering Intern in Research, Summer 2016*

Worked on improving the transliteration system for the Android keyboard, in terms of both accuracy and model size, with a focus on English-to-Hindi transliteration. The task was modelled as a composition of Weighted Finite State Transducers (WFSTs): a transliteration FST (T), trained on pair language data, transduces English sequences into candidate Hindi sequences, and an n-gram language model FST (G), trained on a monolingual Hindi corpus, filters out low-probability Hindi sequences. My first task was to improve the accuracy of the transliteration FST T by removing the simplifying assumptions used in the baseline model. My second task was to reduce the sizes of T and G by removing low-probability paths. Overall, I achieved a 5% relative reduction in error while reducing model sizes by a factor of 20.
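The compose-then-prune idea can be illustrated with a toy sketch (hypothetical data and weights, not the production WFST pipeline): candidate generation plays the role of T, a crude unigram score stands in for the n-gram FST G, weights are negative log probabilities so composition adds them, and pruning drops paths far from the best one.

```python
# Toy sketch: score transliteration candidates by "composing" a candidate
# model T with a language model G in the tropical semiring (weight = -log
# prob, composition = addition), then prune low-probability paths.

# T: candidate Hindi transliterations for an English word, with -log prob weights.
T = {
    "namaste": [("नमस्ते", 0.4), ("नमस्ते्", 2.5), ("नामस्ते", 1.8)],
}

# G: unigram language-model weights (-log prob) for Hindi strings — a crude
# stand-in for a real n-gram FST trained on a monolingual corpus.
G = {"नमस्ते": 0.2, "नामस्ते": 1.5}

def best_candidates(word, beam=2.0):
    """Compose T and G for `word`; drop paths more than `beam` worse than the best."""
    scored = []
    for cand, t_w in T.get(word, []):
        g_w = G.get(cand, 10.0)  # back-off penalty for strings unseen by G
        scored.append((cand, t_w + g_w))
    scored.sort(key=lambda x: x[1])
    best = scored[0][1]
    return [(c, w) for c, w in scored if w - best <= beam]  # path pruning
```

In the real system pruning is what shrinks the models: paths whose total weight is far above the best path contribute almost no probability mass and can be removed with little accuracy cost.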

**NVIDIA Autonomous Driving Team** [Report]

*Machine Learning Intern, Spring 2016*

The team is building an end-to-end neural network for autonomous driving. My first task was to study the sensitivity of the training pipeline to noise in the training labels. Because the data were collected by human drivers, if a driver was driving a few centimeters to the left of the center of the lane, the ground-truth steering angle recorded from the driver would be to go straight, whereas we would want the network to steer right to return to the center of the lane. Such noise can, in general, be detrimental to the network's performance. To study its effect, we artificially injected Gaussian noise into the training labels and measured the autonomy of a network trained on the noisy data. We concluded that autonomy does not drop as long as the mean and standard deviation of the noise stay below a threshold; we then estimated the mean and standard deviation of the actual noise in the training data and found both to be below the calculated threshold. My second task was to experiment with memory-based networks to handle cases such as maneuvering around parked cars and driving in heavy traffic, which require more than the current frame to choose the correct driving action. I laid the groundwork for memory-based network training by modifying the data processing pipeline and trained some baseline models by the end of my internship.
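The noise-injection step itself is simple; a minimal sketch with synthetic steering angles (not the internship pipeline or its actual noise parameters) looks like this:

```python
# Minimal sketch: inject Gaussian noise into steering-angle labels to study
# how sensitive training is to label noise of a given mean and std.
import random

random.seed(0)

def inject_label_noise(angles, mean=0.0, std=0.5):
    """Return a copy of the steering angles with Gaussian noise added."""
    return [a + random.gauss(mean, std) for a in angles]

clean = [0.0, 1.5, -2.0, 0.3]      # ground-truth steering angles (degrees)
noisy = inject_label_noise(clean)  # labels a network would then train on
```

Sweeping `mean` and `std` over a grid and measuring autonomy for each noisy training set is what yields the threshold described above.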

**Center for Data Science, New York University**

*Grader for graduate-level probabilistic graphical models course*

**Inference and Representation**: Fall 2015. Course Instructor: Prof. David Sontag

**Center for Data Science, New York University**

*Grader for graduate-level introductory machine learning course*

**Computational Statistics and Machine Learning**: Spring 2015. Course Instructor: Prof. David Rosenberg

**Center for Data Science, New York University**

*Grader for graduate-level probabilistic graphical models course*

**Inference and Representation**: Fall 2014. Course Instructor: Prof. David Sontag

**Tower Research Capital India Pvt. Ltd.**

*Summer Intern, Summer 2013*

The company develops investment strategies and algorithms for stock market trading. My project involved optimizing an existing market trading protocol, using techniques such as sniffing data packets to understand the software's behavior and replicating that behavior on mirrored data.

__Academic Experience__

**Research Projects**

**B.Tech. Thesis: Approximation Algorithms for Deadline-TSP and the Graph Cover Problem**

*Advisor: Prof. Naveen Garg (IIT Delhi) July 2013 - May 2014*

The deadline traveling salesman problem (deadline-TSP) is an NP-hard graph-theoretic problem, a variant of the classic traveling salesman problem in which each node must be reached by a deadline. We worked on developing a constant-factor polynomial-time approximation algorithm for trees. We also worked on the graph cover problem, in which all vertices of a graph must be covered by trees or stars while minimizing an objective function. The techniques explored included LP relaxations, primal-dual algorithms, and local search.

**Course Projects**

**Accelerating Genetic Algorithms Using Constant Memory in GPUs** [Report]

*Course: Graphics Processing Units (GPUs) Sept 2015 - Dec 2015*

GPUs have several kinds of memory, one of which is constant memory: a very small (typically 64 KB), read-only memory shared across all threads. We proposed and implemented a genetic algorithm for GPUs that uses constant memory to store the "elite" population, thereby speeding up convergence. We showed performance improvements over existing algorithms on an instance of the Traveling Salesman Problem and on Ackley's function.
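The elitism idea can be sketched on the CPU (a toy 1-D Ackley benchmark, not the GPU implementation): a small pool of the best individuals, the role played by constant memory on the device, is carried unchanged into each generation so good solutions are never lost.

```python
# Toy sketch of an elitist genetic algorithm: the small, read-only "elite"
# pool mirrors what constant memory holds on the GPU.
import math
import random

random.seed(1)

def ackley_1d(x):
    """1-D Ackley test function: global minimum of 0 at x = 0."""
    return (-20 * math.exp(-0.2 * abs(x))
            - math.exp(math.cos(2 * math.pi * x)) + 20 + math.e)

def evolve(pop_size=50, elite_size=5, generations=100):
    pop = [random.uniform(-5, 5) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=ackley_1d)
        elite = pop[:elite_size]                 # small elite pool, kept as-is
        children = [random.choice(elite) + random.gauss(0, 0.1)
                    for _ in range(pop_size - elite_size)]
        pop = elite + children                   # elitism: best never discarded
    return min(pop, key=ackley_1d)
```

On the GPU, broadcasting this small read-only pool through constant memory lets every thread read the elites cheaply, which is what made the faster convergence essentially free.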

**Text Summarization for Wikipedia Articles** [Report]

*Course: Natural Language Processing Sept 2014 - Dec 2014*

Proposed an unsupervised graph-based greedy algorithm and a supervised machine learning algorithm for extractive summarization. In the graph-based approach, sentences were represented as nodes in a graph, with edge weights determined by word similarity computed from word embeddings. In the ML-based approach, we used tf-idf to generate sentence-level features, and Latent Dirichlet Allocation (LDA) for paragraph-level and document-level features.
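A minimal sketch of the graph-based greedy idea (using Jaccard word overlap as a stand-in for word-embedding similarity): sentences are nodes, pairwise similarities are edge weights, and the sentence most similar to all the others is extracted first.

```python
# Greedy graph-based extractive summarization sketch. A sentence's score is
# its weighted degree: the sum of its similarities to every other sentence.

def similarity(s1, s2):
    """Jaccard overlap between two sentences' word sets (embedding stand-in)."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / len(w1 | w2)

def summarize(sentences, k=1):
    scores = [sum(similarity(s, t) for t in sentences if t is not s)
              for s in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return [sentences[i] for i in sorted(ranked[:k])]  # keep original order
```

With real embeddings the `similarity` function would instead compare averaged word vectors, but the graph construction and greedy selection are unchanged.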

**Sparse Support Vector Machine** [Report]

*Course: Foundations of Machine Learning Sept 2014 - Dec 2014*

Proposed an alternative SVM formulation that seeks a maximum-margin hyperplane with low empirical error while enforcing sparsity on the weight vector. We proved that the proposed formulation admits theoretical guarantees similar to those of the standard SVM formulation. Experiments showed that it performed comparably to the conventional SVM formulation in prediction accuracy while yielding a much sparser weight vector.
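One common way to enforce weight-vector sparsity in an SVM-style objective, shown here as an illustrative sketch rather than the report's exact formulation, is to add an L1 penalty to the hinge loss and optimize by subgradient descent:

```python
# Illustrative sketch: hinge loss + L1 penalty, optimized by subgradient
# descent. The L1 term continually shrinks weights toward zero, so features
# that do not help the margin end up with (near-)zero weight.

def train_sparse_svm(X, y, lam=0.1, lr=0.05, epochs=200):
    """X: list of feature lists; y: labels in {-1, +1}."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            for j in range(d):
                g_hinge = -yi * xi[j] if margin < 1 else 0.0   # hinge subgradient
                g_l1 = lam * (1 if w[j] > 0 else -1 if w[j] < 0 else 0)
                w[j] -= lr * (g_hinge + g_l1)
    return w
```

The trade-off controlled by `lam` is exactly the one the project measured: larger values give sparser weight vectors at some cost in accuracy.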

**MATLAB for Machine Learning** [Report]

*Course: Numerical Optimization Sept 2014 - Dec 2014*

Studied various optimization algorithms implemented in MATLAB and compared their performance on important optimization problems arising in machine learning, in terms of running time and the numerical stability of the solution.

**Speeding Up Training on Large Datasets**

*Course: Fundamentals of Machine Learning Aug 2013 - Dec 2013*

Worked on developing a generic method, independent of the training algorithm, for finding the most informative points in a large dataset. The idea was to learn a distribution over the training instances in which each instance's weight is proportional to its information content: repeatedly pick a small subset of the dataset, learn a classifier using the training algorithm, and update the distribution based on the learned classifier, using ideas from active learning.
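A rough sketch of that loop (hypothetical details: a nearest-centroid rule stands in for the arbitrary training algorithm, and the upweighting rule is illustrative): sample a subset from the current distribution, train on it, and upweight the points the resulting classifier gets wrong, on the intuition that hard points are the informative ones.

```python
# Sketch of learning a distribution over training points by repeated
# subset-train-reweight rounds (1-D data, binary labels in {-1, +1}).
import random

random.seed(2)

def informative_weights(X, y, rounds=20, subset=4):
    n = len(X)
    w = [1.0 / n] * n
    for _ in range(rounds):
        idx = random.choices(range(n), weights=w, k=subset)
        # "Train": per-class centroids computed from the sampled subset.
        pos = [X[i] for i in idx if y[i] == 1] or [0.0]
        neg = [X[i] for i in idx if y[i] == -1] or [0.0]
        cp, cn = sum(pos) / len(pos), sum(neg) / len(neg)
        for i in range(n):
            pred = 1 if abs(X[i] - cp) < abs(X[i] - cn) else -1
            if pred != y[i]:
                w[i] *= 1.5              # upweight misclassified (hard) points
        s = sum(w)
        w = [wi / s for wi in w]         # renormalize to a distribution
    return w
```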

**Unsupervised Unusual Activity Detection**

*Course: Computer Vision Jan 2013 - May 2013*

This work aimed at detecting unusual activity in a video clip. Events in the training set are represented as STIP (space-time interest point) features, which are used to learn a dictionary via dynamic sparse auto-encoders. A new event is flagged as unusual if it cannot be reconstructed as a sparse linear combination of the events in the dictionary; the dictionary is updated online as new events are seen. We were able to detect unusual activity in simple videos.

**Stock Market Prediction using Machine Learning Techniques**

*Course: Machine Learning and Optimization for Market Applications Aug 2012 - Dec 2012*

In this project, we tried to develop algorithms that can predict trends in the stock market by identifying non-random patterns. We used ideas from recurrent reinforcement learning, together with standard stock market indicators, to develop a model that performed well on synthetic market data.

**Cricket Fever**

*Course: Database Systems Jan 2012 - May 2012*

This project involved developing an end-to-end website about cricket. The data were downloaded from Yahoo! Cricket using a web crawler written in Python. HTML was used for the front end, PHP for back-end scripting, and PostgreSQL for database management.

**Term papers**

**Probabilistic Databases**

*Course: Database Management Systems Jan 2012 - May 2012*

[Download pdf]

**Quantum Computing**

*Course: Digital Electronics July 2011 - Dec 2011*

[Download pdf]

__Technical Skills__

**Programming Languages:**

C, C++, Java, MATLAB, Python, Standard ML, SQL, Prolog, Visual Basic, Ada, Scheme

**Platforms:**

Linux, Windows

**Other Software & Skills:**

LaTeX, Eclipse, Android Programming