My main interests are in complexity theory, pseudorandomness and data mining (specifically, matrix rank minimization related
problems).
More generally, I like probablity and combinatorics related stuff.
Theory Stuff
Pseudorandom Generators for Polynomial Threshold Functions
Raghu Meka and David Zuckerman
ABSTRACT arXiv
We study the natural question of constructing pseudorandom generators (PRGs) for low-degree polynomial threshold functions (PTFs). We give a PRG with seed-length log n/eps^{O(d)} fooling degree d PTFs with error at most eps. Previously, no nontrivial constructions were known even for quadratic threshold functions and constant error eps. For the class of degree 1 threshold functions or halfspaces, we construct PRGs with much better dependence on the error parameter eps and obtain the following results.
1) A PRG with seed length O(log n log(1/eps)) for eps > 1/poly(n).
2) A PRG with seed length O(log n) for eps > 1/poly(log n).
Previously, only PRGs with seed length O(log n log^2(1/eps)/eps^2) were known for halfspaces. We also obtain PRGs with similar seed lengths for fooling halfspaces over the n-dimensional unit sphere.
The main theme of our constructions and analysis is the use of invariance principles to construct pseudorandom generators. We also introduce the notion of monotone read-once branching programs, which is key to improving the dependence on the error rate eps for halfspaces. These techniques may be of independent interest.
Bounding the Sensitivity of Polynomial Threshold Functions
Prahladh Harsha, Adam Klivans and Raghu Meka
ABSTRACT arXiv
We give the first non-trivial upper bounds on the average sensitivity and noise sensitivity of polynomial threshold functions. More specifically, for a Boolean function f on n variables equal to the sign of a real, multivariate polynomial of total degree d we prove
1) The average sensitivity of f is at most O(n^{1-1/(4d+3)}) (we also give a simple combinatorial proof of the bound O(n^{1-1/2^d}).
2) The noise sensitivity of f with noise rate \delta is at most O(\delta^{1/(4d+6)}).
Previously, only bounds for the linear case were known. Along the way we show new structural theorems about random restrictions of polynomial threshold functions obtained via hypercontractivity. These structural results may be of independent interest as they provide a generic template for transforming problems related to polynomial threshold functions defined on the Boolean hypercube to polynomial threshold functions defined in Gaussian space.
Small-Bias Spaces for Group Products
Random 2009. Raghu Meka and David Zuckerman
ABSTRACT BibTex PDF Slides
Small-bias, or $\epsilon$-biased, spaces have found many applications in complexity theory, coding theory, and derandomization. We generalize the notion of small-bias spaces to the setting of group products. Besides being natural, our extension captures some of the difficulties in constructing pseudorandom generators for constant-width branching programs - a longstanding open problem. We provide an efficient deterministic construction of small-bias spaces for solvable groups. Our construction exploits the fact that solvable groups have nontrivial normal subgroups that are abelian and builds on the construction of Azar et al.~\cite{azarmotwani} for abelian groups. For arbitrary finite groups, we give an efficient deterministic construction achieving constant bias. We also construct pseudorandom generators fooling linear functions mod $p$ for primes $p$.
Datamining Stuff
Guaranteed Rank Minimization via Singular Value Projection
Raghu Meka, Prateek Jain and Inderjit Dhillon.
ABSTRACT arXiv Code
Minimizing the rank of a matrix subject to affine constraints is a fundamental problem with many important applications in machine learning and statistics. In this paper we propose a simple and fast algorithm SVP (Singular Value Projection) for rank minimization with affine constraints (ARMP) and show that SVP recovers the minimum rank solution for affine constraints that satisfy the "restricted isometry property" and show robustness of our method to noise. Our results improve upon a recent breakthrough by Recht, Fazel and Parillo (RFP07) and Lee and Bresler (LB09) in
three significant ways:
1) our method (SVP) is significantly simpler to analyze and easier to implement,
2) we give recovery guarantees under strictly weaker isometry assumptions
3) we give geometric convergence guarantees for SVP even in presense of noise and, as demonstrated empirically, SVP is significantly faster on real-world and
synthetic problems.
In addition, we address the practically important problem of low-rank matrix completion (MCP), which can be seen as a special case of ARMP. We empirically demonstrate that our algorithm recovers low-rank incoherent matrices from an almost optimal number of uniformly sampled entries. We make partial progress towards proving exact recovery and provide some intuition for the strong performance of SVP applied to matrix completion by showing a more restricted isometry property. Our algorithm outperforms existing methods, such as those of \cite{RFP07,CR08,CT09,CCS08,KOM09,LB09}, for ARMP and the matrix-completion problem by an order of magnitude and is also significantly more robust to noise.
Matrix Completion from Power-Law Distributed Samples
To appear in Nips 2009. Raghu Meka, Prateek Jain and Inderjit Dhillon.
ABSTRACT PDF
The low-rank matrix completion problem is a fundamental problem with many important applications. Recently, \cite{candesrecht},\cite{montanari} and \cite{candestao} obtained the first non-trivial theoretical results for the problem assuming that the observed entries are sampled uniformly at random. Unfortunately, most real-world datasets do not satisfy this assumption, but instead exhibit power-law distributed samples. In this paper, we propose a graph theoretic approach to matrix completion that solves the problem for more realistic sampling models. Our method is easier to analyze than previous methods with the analysis reducing to computing the threshold for {\sl complete cascades} in random graphs, a problem of independent interest. By analyzing the graph theoretic problem, we show that our method achieves exact recovery when the observed entries are sampled from the Chung-Lu-Vu model, which can generate power-law distributed graphs. We also hypothesize that our algorithm solves the matrix completion problem from an optimal number of entries for the popular preferential attachment model and provide strong empirical evidence for the claim. Furthermore, our method is easier to implement and is substantially faster than existing methods. We demonstrate the effectiveness of our method on examples when the low-rank matrix is sampled according to the prevalent random graph models for complex networks and also on the Netflix challenge dataset.
Rank Minimization via Online Learning
ICML 2008. Raghu Meka, Prateek Jain, Constantine Caramanis and Inderjit Dhillon
ABSTRACT BibTex PDF
Minimum rank problems arise frequently in machine learning applications and are notoriously difficult to solve due to the non-convex nature of the rank objective. In this paper, we present the first online learning approach for the problem of rank minimization of matrices over polyhedral sets. In particular, we present two online learning algorithms for rank minimization - our first algorithm is a multiplicative update method based on a generalized experts framework, while our second algorithm is a novel application of the online convex programming framework \cite{zinkevich}. In the latter, we flip the role of the decision maker by making the decision maker search over the constraint space instead of feasible points, as is usually the case in online convex programming. A salient feature of our online learning approach is that it allows us to give provable approximation guarantees for the rank minimization problem over polyhedral sets. We demonstrate the effectiveness of our methods on synthetic examples, and on the real-life application of low-rank kernel learning.
Simultaneous Unsupervised Learning of Disparate Clusterings
SDM 2008. Prateek Jain, Raghu Meka and Inderjit Dhillon.
Journal Version: Statistical Analysis and Data Mining, Volume 1, Issue 3.
ABSTRACT BibTex PDF Best Paper Runner-up
Most clustering algorithms produce a single clustering for a given data set even when the data can be clustered naturally in multiple ways. In this paper, we address the difficult problem of uncovering disparate clusterings from the data in a {\em totally unsupervised manner}. We propose two new approaches for this problem. In the first approach we aim to find good clusterings of the data that are also {\em decorrelated} with one another. To this end, we give a new and tractable characterization of decorrelation between clusterings, and present an objective function to capture it. We provide an iterative ``decorrelated'' $k$-means type algorithm to minimize this objective function. In the second approach, we model the data as a sum of mixtures and associate each mixture with a clustering. This approach leads us to the problem of learning a convolution of mixture distributions. Though the latter problem can be formulated as one of factorial learning \cite{zoubin,hinton,mcvq}, the existing formulations and methods do not perform well on many real high-dimensional data sets. We propose a new regularized factorial learning framework that is more suitable for capturing the notion of disparate clusterings in modern, high-dimensional data sets. The resulting algorithm does well in uncovering multiple clusterings, and is much improved over existing methods.