About

Nayan Singhal

I am a first-year graduate student in the Department of Computer Science at The University of Texas at Austin.

In the past, I worked as a Software Engineer at Microsoft India R&D. Before that, I was a student researcher with the Department of Bioengineering at Stanford University under Prof. Kwabena Boahen, where I helped achieve real-time manipulation of an articulated robot with three or more degrees of freedom and a kinematic tree or loop structure, using spatial vectors (6-dimensional vectors).

I majored in Information Technology at the Indian Institute of Information Technology, Allahabad. My research interests are in Machine Learning, Natural Language Processing, Robotics, and Artificial Intelligence.

I am enthusiastic and adventurous, and I love travelling. My hobbies are painting, photography, and playing the flute.

Projects

Object Instance Segmentation

Reference Cartoon Colorization

Yearbook Photo Dating

Robot Dynamics Engine

Neural Simulator

Natural Language QA System

MultiWord Expression Extraction

Diagnosis of ECG Signals

Object Tracking

Object Instance Segmentation

Advisor:

Prof. Kristen Grauman, UT Austin, USA

Background

Convolutional Neural Networks have gained significant attention from the vision community due to their state-of-the-art performance on the object classification task. When it comes to segmentation, however, feed-forward CNNs are not sufficient: the abstract, object-level information present in the higher layers of a CNN is not enough on its own, and the pixel-level information present in the lower layers is equally important. SharpMask [9] proposed a refinement module that refines the coarse mask produced by a CNN by successively integrating information from earlier layers in a top-down manner. We explore the possibility of duplicating this module and using features learned in the top-down pass of the first unit in the bottom-up pass of the second unit.

Approach and Results

We develop our approach on top of SharpMask. The idea is as follows: if the information combined in SharpMask's top-down pass is fed back into a second bottom-up pass, the coarse mask produced by that pass will be more accurate to begin with, and refining it again in a second top-down pass should yield an even sharper mask.
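
The two-pass data flow can be sketched schematically. The block below is a toy NumPy illustration, not the actual SharpMask code: the shapes, block counts, and fusion rule are all hypothetical stand-ins for convolutional layers.

```python
import numpy as np

def bottom_up(x, extra=None):
    """Coarse feature stack; optionally fuse features from a previous
    top-down pass via element-wise addition."""
    feats, f = [], x
    for i in range(3):                    # three toy "conv blocks"
        f = np.tanh(0.5 * f)              # stand-in for conv + nonlinearity
        if extra is not None:
            f = f + extra[i]              # reuse top-down features
        feats.append(f)
    return feats

def top_down(feats):
    """Refine the coarse mask by walking back through earlier features."""
    m = feats[-1]
    for f in reversed(feats[:-1]):
        m = 0.5 * (m + f)                 # stand-in for a refinement module
    return m

x = np.random.rand(8, 8)                  # a toy "image"
feats1 = bottom_up(x)                     # unit 1: bottom-up pass
mask1 = top_down(feats1)                  # unit 1: top-down refinement
feats2 = bottom_up(x, extra=[mask1] * 3)  # unit 2 reuses top-down features
mask2 = top_down(feats2)                  # final, further-refined mask
```

The point of the sketch is the data flow: features produced by the first top-down pass re-enter the second bottom-up pass before the final refinement.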

Through quantitative and qualitative analysis, we have shown that this performs better than the current state-of-the-art method, SharpMask. One drawback we found is a decrease in testing speed. [Report] [Poster]

Reference Constrained Cartoon Colorization

Advisor:

Prof. Philipp Krähenbühl, UT Austin, USA

Background

Recent work on colorization using deep networks has achieved impressive results on natural images. The colorization task in these works is underconstrained: a grayscale image of a t-shirt could be any of several colors, which results in ambiguous colorization. In this work, we look at the cartoon domain, where the color ambiguity is much higher than in natural images. Pixel gradients are also much lower: most regions are uniformly colored, with abrupt changes only at the edges. We explore this colorization task with the additional constraint of consistency with a reference image.

Approach and Results

For the reference image, we choose natural images from live-action adaptations of cartoons rather than other cartoons. The motivation behind this decision is two-fold. (1) Using cartoon references would reduce this problem to a version of the colorization-by-superpixel-matching task of R.K.Gupta[2012], whereas we would like to incorporate learning into our system to allow for better generalization. (2) We would like to test the generalizability of our system to natural images as well: given a cartoon of a cat and a natural cat image with different colors and textures, does the system impart those colors and textures onto the cartoon image? We build a dataset from 15 cartoons and their live-action adaptations for this purpose. We train several variations of a deep net to predict the A and B channels of an image in LAB color space given the L channel and a reference image.

Model Architecture: The input and reference branches have the same architecture as R.Zhang[2016]. Information from each branch can propagate to the other via bridging connections at each conv block. The information is combined using element-wise addition, after a transformation to account for the input-to-reference or reference-to-input domain change. Color information from the reference image influences the final color prediction for the input image, while boundary information from the input branch influences which regions of color information are important.
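
A minimal sketch of the bridging idea, using scalar stand-ins for the conv blocks and domain transformations (all names, shapes, and gains here are hypothetical, not the actual architecture):

```python
import numpy as np

def conv_block(x, w):
    return np.maximum(0.0, w * x)         # stand-in for a conv + ReLU block

def bridged_block(h_in, h_ref, w_in=1.0, w_ref=1.0, t_ref2in=0.5, t_in2ref=0.5):
    """One 'conv block' per branch, bridged by element-wise addition."""
    f_in = conv_block(h_in, w_in)
    f_ref = conv_block(h_ref, w_ref)
    # transform (here just a scalar gain) then fuse by element-wise addition
    return f_in + t_ref2in * f_ref, f_ref + t_in2ref * f_in

h_in = np.random.rand(4, 4)    # features from the grayscale input branch
h_ref = np.random.rand(4, 4)   # features from the reference branch
for _ in range(3):             # a few stacked bridged blocks
    h_in, h_ref = bridged_block(h_in, h_ref)
```

Each branch keeps its own feature stream; the addition lets reference color evidence flow into the input stream and boundary evidence flow back.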

Our experiments show that we are indeed able to bring some color information from the reference image into the colorization of the input image, but only to a degree. We used element-wise addition as the means of information transfer between the two branches of our network; this may have been an oversimplification given the behavior we wished the network to learn. [Report]

Yearbook Photo Dating

Classification of images using deep networks has achieved impressive results on most categories. We trained both AlexNet and VGG classification models to date photographs using the Yearbook dataset, which contains frontal-facing American high school yearbook photos and the years they were taken. From the classification and regression analysis on AlexNet and VGG, we conclude that, despite the change of architecture, regression and classification each behave in their own particular way, and that for this task regression might be better in the long run than classification. [Report]

Robot Dynamics Engine with Spiking Neurons

Advisor:

Samir Menon, PhD Student, Stanford, USA
Prof. Kwabena Boahen, Stanford, USA

Developed a system that achieves real-time manipulation of an articulated robot with three or more degrees of freedom and a kinematic tree or loop structure, using spiking silicon neurons. Modeled a framework that simplifies the process of expressing and analyzing the dynamics of a simple rigid-body system. It calculates the joint accelerations using the Articulated-Body and Newton-Euler algorithms, formulated with spatial vectors (6-dimensional vectors), and integrates the joint accelerations using Heun's integrator to compute the joint positions and velocities. [Report]
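
Heun's integrator itself is a short predictor-corrector scheme. The sketch below integrates a toy pendulum-like joint (hypothetical dynamics, standing in for the Articulated-Body output) rather than the engine's actual equations:

```python
import math

def ddq(q, dq):
    """Toy joint acceleration: damped pendulum, standing in for the
    Articulated-Body algorithm's output."""
    return -9.81 * math.sin(q) - 0.1 * dq

def heun_step(q, dq, h):
    # predictor: explicit Euler step
    a0 = ddq(q, dq)
    q_p, dq_p = q + h * dq, dq + h * a0
    # corrector: average the slopes at both ends of the step
    a1 = ddq(q_p, dq_p)
    q_n = q + 0.5 * h * (dq + dq_p)
    dq_n = dq + 0.5 * h * (a0 + a1)
    return q_n, dq_n

q, dq = 0.3, 0.0            # initial joint position and velocity
for _ in range(1000):       # integrate 1 s at h = 1 ms
    q, dq = heun_step(q, dq, 0.001)
```

Averaging the end-point slopes is what makes Heun's method second-order, which helps keep the total energy well-behaved over long simulations.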

Total energy (kinetic + potential) of the PUMA robot is conserved by the dynamics engine built with spatial vectors.

The PR2 robot drawing a figure 8 using the dynamics engine.

Neural Simulator

Advisor:

Samir Menon, PhD Student, Stanford, USA
Prof. Kwabena Boahen, Stanford, USA

Developed a C++ neural simulator for simulating large-scale neural systems. It uses the leaky integrate-and-fire (LIF) neuron model to encode current changes in a neuron's soma due to dendritic input by calculating the neuron's spike rate; linear decoding is then used to estimate the magnitude that was encoded by this nonlinear process while operating in a noisy environment.
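
The LIF encoding step can be sketched in a few lines of plain Python (toy parameters, not the simulator's actual C++ implementation or settings):

```python
def lif_rate(current, dt=1e-4, tau=0.02, v_th=1.0, v_reset=0.0, t_total=1.0):
    """Firing rate (Hz) of an LIF neuron driven by a constant input current."""
    steps = int(t_total / dt)
    v, spikes = 0.0, 0
    for _ in range(steps):
        v += (dt / tau) * (current - v)   # leaky integration toward `current`
        if v >= v_th:                     # threshold crossing: emit a spike
            spikes += 1
            v = v_reset                   # reset the membrane voltage
    return spikes / t_total

r_high = lif_rate(1.5)   # suprathreshold drive: nonzero firing rate
r_low = lif_rate(0.5)    # subthreshold drive: the neuron never fires
```

The spike rate is a nonlinear function of the input current; a linear decoder fit over many such neurons can recover the encoded magnitude.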

Natural Language Question Answering System

Advisor:

Prof. Sudip Sanyal, IIIT-A, India

Background

The basic architecture of a Question Answering system (QAS) based on Natural Language Processing subsumes question analysis and answer extraction. Extensive research has already been done in this field, comprising keyword matching, rule-based matching, the semantic web, ontologies, semantic reformulation, and template-based approaches. But questions phrased in complex forms often do not return the correct answer, and most systems do not focus on "why"-type questions or cannot answer non-factoid questions. In this work, we try to extract the answer to a question from a corpus even when the questions are allowed to be unconstrained.

Approach and Results

Developed a system that relates words logically and provides an admissible answer to the user's query. It analyzes a question to reduce it to its canonical form, expressed as a dependency tree generated using the Stanford Parser. Next, it categorizes the expected answer type using rules over the dependency tree, and then extracts the answer by searching the generated state graph with certain heuristics.
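
The answer-type categorization step can be illustrated with a toy rule table (these cue-word rules are hypothetical stand-ins; the actual system derives its rules from Stanford Parser dependency trees, not surface strings):

```python
# Ordered (cue, expected-answer-type) rules; first match wins.
RULES = [
    ("who", "PERSON"),
    ("where", "LOCATION"),
    ("when", "TIME"),
    ("why", "REASON"),
    ("how many", "NUMBER"),
]

def answer_type(question):
    """Return the expected answer type for a question string."""
    q = question.lower()
    for cue, etype in RULES:
        if cue in q:
            return etype
    return "DEFINITION"   # fallback for unconstrained questions
```

A robust analyzer keys the rules off the dependency structure rather than keywords, which is what lets it assign the same type to complex rephrasings of a question.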

The system achieves an MRR of 0.599. The question analyzer gives the same expected answer type even when a question is rephrased in complex forms, and the answer analyzer extracts answers distributed across multiple sentences, not necessarily occurring together in the paragraph or stated directly in the text. [Report]

MultiWord Expression Extraction

Advisor:

Prof. Sudip Sanyal, IIIT-A, India

Developed an unsupervised, automated system for large-scale extraction of multiword expressions from a large corpus. The core idea is to explore the extent to which word alignment can be used to distinguish idiomatic expressions from literal ones. The system is based on linguistic and statistical filtering using Pointwise Mutual Information.
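
The PMI filter at the core of the statistical stage can be sketched on a toy corpus (the real system runs on a large corpus with linguistic filtering and alignment evidence on top):

```python
import math
from collections import Counter

# Toy corpus; in practice this would be millions of tokens.
tokens = "kick the bucket and kick the ball and kick the bucket again".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
N = len(tokens)

def pmi(w1, w2):
    """Pointwise Mutual Information of an adjacent word pair."""
    p_xy = bigrams[(w1, w2)] / (N - 1)        # joint bigram probability
    p_x, p_y = unigrams[w1] / N, unigrams[w2] / N
    return math.log2(p_xy / (p_x * p_y))

score = pmi("kick", "the")   # strongly collocated pair scores high
```

Candidate pairs with PMI above a threshold survive to the linguistic filters; weakly associated pairs are discarded.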

A sample run of the algorithm is shown in the video: multiword expressions are extracted from the corpus and then verified against Google.

Segmentation of ECG Beat and Diagnosis of some Cardiac Problems

Advisor:

Prof. Sudip Sanyal, IIIT-A, India

Background

An ECG reflects the state of the heart and is hence a pointer to a person's health. Properly analyzed, it can provide information about various heart diseases. However, since the ECG is a non-stationary signal, irregularities may not be periodic and may show up at different intervals. Clinical observation of an ECG can therefore take long hours and be very tedious, and visual analysis cannot be fully relied upon. This calls for computer-based techniques for ECG analysis. Various contributions have been made in the literature on ECG beat detection and classification, most of which use frequency- or time-domain representations of the signal. The major problem is the vast variation in ECG signal morphologies, and time constraints must be considered as well. Our basic objective is thus to devise a simple method with low computational cost that does not compromise efficiency.

Approach and Results

Developed a semi-automatic system that analyzes the electrical impulses obtained from human ECG tests and diagnoses some specific cardiac problems, including normal sinus rhythm, arrhythmia, atrial fibrillation, supraventricular arrhythmia, and S-T change, using an SVM (Support Vector Machine). The diagnosis is based on analyzing the signals with a Gaussian mixture model and on features extracted through correlation coefficients, retrieved after segmenting each ECG beat.
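
The correlation-coefficient feature can be illustrated on synthetic beats (toy signals, not real ECG data; in the real pipeline such features feed the GMM/SVM stage):

```python
import numpy as np

np.random.seed(0)

def corr_feature(beat, template):
    """Pearson correlation of a segmented beat against a template beat."""
    return float(np.corrcoef(beat, template)[0, 1])

t = np.linspace(0, 1, 100)
template = np.exp(-((t - 0.5) ** 2) / 0.002)     # idealized QRS-like bump
normal = template + 0.05 * np.random.randn(100)  # same morphology, some noise
abnormal = np.exp(-((t - 0.3) ** 2) / 0.01)      # shifted, widened beat

c_normal = corr_feature(normal, template)     # close to 1
c_abnormal = corr_feature(abnormal, template) # much lower
```

A beat that matches the template morphology scores near 1, while a morphologically different beat scores low, giving the classifier a compact, shape-aware feature.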

Object Tracking under Cluttered Environment

Developed an object tracking system that detects and tracks a person or object in a video using OpenCV. Formulated an algorithm consisting of a sequence of steps: foreground detection, connected-component analysis, basic tracking, people grouping, and estimation of group size.
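
The first two stages of the pipeline can be sketched without OpenCV; the frame sizes, blob positions, and threshold below are toy values:

```python
import numpy as np

def foreground_mask(frame, background, thresh=30):
    """Foreground detection by simple background subtraction."""
    return np.abs(frame.astype(int) - background.astype(int)) > thresh

def connected_components(mask):
    """4-connected component labelling by iterative flood fill."""
    labels = np.zeros(mask.shape, dtype=int)
    n = 0
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i, j] and labels[i, j] == 0:
                n += 1                       # start a new component
                stack = [(i, j)]
                while stack:
                    y, x = stack.pop()
                    if (0 <= y < mask.shape[0] and 0 <= x < mask.shape[1]
                            and mask[y, x] and labels[y, x] == 0):
                        labels[y, x] = n
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, n

bg = np.zeros((6, 8), dtype=np.uint8)   # static background
frame = bg.copy()
frame[1:3, 1:3] = 200                   # one moving blob
frame[4:6, 5:7] = 200                   # a second, separate blob
labels, n = connected_components(foreground_mask(frame, bg))
```

Each labelled component becomes a track candidate; the later grouping stages merge nearby components into people groups.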

Resume

Contact Me