For Spring 2019, the Theory Seminar will meet on Fridays from 2:00-3:00pm in GDC 4.304 unless otherwise noted. This schedule is being updated throughout the semester.

02/01/2019 - Lijie Chen, MIT: "Non-deterministic Quasi-Polynomial Time is Average-case Hard for ACC Circuits"
02/08/2019 - Aline Ene, Boston University: "Submodular Maximization with Nearly-optimal Approximation and Adaptivity"
02/15/2019 - William Hoza, University of Texas at Austin: "Simple Optimal Hitting Sets for Small-Success RL"
03/01/2019 - Ankur Moitra, MIT: "Learning Restricted Boltzmann Machines"
03/11/2019 - Rong Ge, Duke University: "Learning Two-Layer Neural Networks with Symmetric Inputs" *Note: Rm 6.302 on Monday, March 11th
03/15/2019 - Arkadev Chattopadhyay, Tata Institute of Fundamental Research: "The Log-Approximate-Rank Conjecture is False"
03/25/2019 - Pasin Manurangsi, Berkeley: "On The Complexity of Closest Pair" *Note: Rm 6.302 on Monday, March 25th at 10:30am
03/29/2019 - Seth Pettie, University of Michigan: "Triangle Enumeration in the Congest Model"
04/05/2019 - Satish Rao, Berkeley: "Thoughts on "Linear Time'' solutions for linear systems and maximum flow"
04/12/2019 - Ilias Diakonikolas, University of Southern California: "Algorithmic Questions in High-Dimensional Robust Statistics"
04/19/2019 - David Wu, University of Virginia: "New Constructions of Reusable Designated-Verifier NIZKs"
04/22/2019 - Michal Moshkovitz, UCSD: "On Bounded-Memory Learning" *Note: Rm 6.302 on Monday, April 22nd
04/26/2019 - Ryan O'Donnell, Carnegie Mellon University: "X-Ramanujan Graphs"
05/03/2019 - Alexander Golovnev, Harvard: "Circuit Depth Reductions"
05/08/2019 - Rasmus Kyng, Harvard: "A Numerical Analysis Approach to Convex Optimization" *Note: Rm 4.516 on Wednesday, May 8th at 11am
05/13/2019 - Omer Reingold, Stanford: "A Complexity-Theoretic Perspective on Algorithmic Fairness" *Note: Rm 4.302, Monday, May 13th at 11am

Lijie Chen, MIT
Non-deterministic Quasi-Polynomial Time is Average-case Hard for ACC Circuits

Following the seminal work of [Williams, J. ACM 2014], in a recent breakthrough, [Murray and Williams, STOC 2018] proved that NQP (non-deterministic quasi-polynomial time) does not have polynomial-size ACC^0 circuits.

We strengthen the above lower bound to an average case one, by proving that for all constants c, there is a language in non-deterministic quasi-polynomial time, which is not (1/2+1/log^c n)-approximable by polynomial-size ACC^0 circuits. In fact, our lower bound holds for a larger circuit class: 2^(log^a n)-size ACC^0 circuits with a layer of threshold gates at the bottom, for all constants a. Our work also improves the average-case lower bound for NEXP against polynomial-size ACC^0 circuits by [Chen, Oliveira and Santhanam, LATIN 2018].

Alina Ene, Boston University
Submodular Maximization with Nearly-optimal Approximation and Adaptivity

In this talk, we present recent progress on understanding the tradeoff between the approximation guarantee and adaptivity for submodular maximization. The adaptivity of an algorithm is the number of sequential rounds of queries it makes to the evaluation oracle of the function, where in every round the algorithm is allowed to make polynomially-many parallel queries. Adaptivity is an important consideration in settings where the objective function is estimated using samples and in applications where adaptivity is the main running time bottleneck.

This talk is based on joint work with Huy Nguyen (Northeastern) and Adrian Vladu (Boston University).

William Hoza, University of Texas at Austin
Simple Optimal Hitting Sets for Small-Success RL

We give a simple explicit hitting set generator for read-once branching programs of width w and length r with known variable order. When r = w, our generator has seed length O(log^2 r + log(1/eps)). When r = polylog w, our generator has optimal seed length O(log w + log(1/eps)). For intermediate values of r, our generator's seed length smoothly interpolates between these two extremes.

Our generator's seed length improves on recent work by Braverman, Cohen, and Garg (STOC '18). In addition, our generator and its analysis are dramatically simpler than the work by Braverman et al. Our generator's seed length improves on all the classic generators for space-bounded computation (Nisan Combinatorica '92; Impagliazzo, Nisan, and Wigderson STOC '94; Nisan and Zuckerman JCSS '96) when eps is small.

As a corollary of our construction, we show that every RL algorithm that uses r random bits can be simulated by an NL algorithm that uses only O(r/log^c n) nondeterministic bits, where c is an arbitrarily large constant. Finally, we show that any RL algorithm with small success probability eps can be simulated deterministically in space O(log^{3/2} n + log n log log(1/eps)). This improves on work by Saks and Zhou (JCSS '99), who gave an algorithm that runs in space O(log^{3/2} n + sqrt(log n) log(1/eps)).

This is joint work with David Zuckerman.

Ankur Moitra, MIT
Learning Restricted Boltzmann Machines

Graphical models are a rich language for describing high-dimensional distributions in terms of their dependence structure. While there are algorithms with provable guarantees for learning undirected graphical models in a variety of settings, there has been much less progress in the important scenario when there are latent variables, which is the focus of our work.

In particular, we study Restricted Boltzmann Machines (RBMs) which are a popular model with wide-ranging applications. We gave a simple greedy algorithm for learning ferromagnetic RBMs. Our analysis is based on tools from mathematical physics that were developed to show the concavity of magnetization. Conversely we show that even for a constant number of latent variables with constant degree, without ferromagneticity the problem is as hard as sparse parity with noise. This hardness result is based on a sharp and surprising characterization of the representational power of bounded degree RBMs. 

Based on joint work with Guy Bresler and Frederic Koehler

Rong Ge, Duke University
Learning Two-Layer Neural Networks with Symmetric Inputs

**Note** Rm 6.302

Deep learning has been extremely successful in practice. However, existing guarantees for learning neural networks are limited even when the network has only two layers - they require strong assumptions either on the input distribution or on the norm of the weight vectors. In this talk we give a new algorithm that is guaranteed to learn a two-layer neural network under much milder assumptions on the input distribution. Our algorithms works whenever the input distribution is symmetric - which means two inputs x and -x have the same probability. 

Based on joint work with Rohith Kuditipudi, Zhize Li and Xiang Wang

Arkadev Chattopadhyay, Tata Institute of Fundamental Research in Mumbai
The Log-Approximate-Rank Conjecture is False

We construct a simple and total XOR function F on 2n variables that has only O(n) 
spectral norm, O(n^2) approximate rank and O(n^{2.5}) approximate nonnegative rank. We
show it has polynomially large randomized bounded-error communication complexity of
Omega(sqrt(n)). This yields the first exponential gap between the logarithm of the
approximate rank and randomized communication complexity for total functions. Thus, F
witnesses a refutation of the Log-Approximate-Rank Conjecture which was posed by
Lee and Shraibman (2007) as a very natural analogue for randomized communication of the
still unresolved Log-Rank Conjecture for deterministic communication. The best known
previous gap for any total function between the two measures was a recent 4th-power
separation by Göös, Jayram, Pitassi and Watson (2017).

Additionally, our function F refutes Grolmusz's Conjecture (1997) and a variant of the
Log-Approximate-Nonnegative-Rank Conjecture, suggested by Kol, Moran, Shpilka and
Yehudayoff (2014), both of which are implied by the LARC. The complement of F has
exponentially large approximate nonnegative rank. This answers a question of Lee (2012)
and Kol et al. (2014), showing that approximate nonnegative rank can be exponentially
larger than approximate rank. The function F also falsifies a conjecture about parity
measures of Boolean functions made by Tsang, Wong, Xie and Zhang (2013). The latter
conjecture implied the Log-Rank Conjecture for XOR functions.

Remarkably, after our manuscript was published in the public domain, two groups of
researchers, Anshu-Boddu-Touchette (2018) and Sinha-de-Wolf (2018), showed independently
that the function F even refutes the Quantum-Log-Approximate-Rank Conjecture.

(This is joint work with Nikhil Mande and Suhail Sherif. Manuscript available at: )

*Note: Rm 6.302 on Monday, March 25th at 10:30am
Pasin Manurangsi, University of Berkeley
On The Complexity of Closest Pair

Given a set of n points in R^d, the (monochromatic) Closest Pair problem asks to find a pair of distinct points in the set that are closest in the l_p-metric. Closest Pair is a fundamental problem in Computational Geometry that has relations to many other problems such as nearest neighbor search and Euclidean minimum spanning tree. Clearly, the problem can be solved in O(n^2*d) time, and whether this can be significantly improved for moderate dimension (i.e. d=poly(log n) ) was raised as an open question in several recent works (Abboud-Rubinstein-Williams [FOCS'17], Williams [SODA'18], David-Karthik-Laekhanukit [SoCG'18]). 

In this talk, I will provide a simple reduction that yields a negative answer to the question, assuming the Strong Exponential Time Hypothesis (SETH). I will also discuss how to extend this lower bound to the approximate version of the problem, although the running time lower bound there is not yet tight. At the heart of our proofs is the construction of a dense bipartite graph with low contact dimension; this construction is inspired by the construction of locally dense codes introduced by Dumer-Miccancio-Sudan [IEEE Trans. Inf. Theory'03].

Based on joint work with Karthik C.S.

Seth Pettie, University of Michigan
Triangle Enumeration in the Congest Model

In this talk I'll survey the complexity of subgraph enumeration problems in distributed models of computation such as CONGEST and the Congested Clique, and present a new algorithm for Triangle Enumeration in CONGEST.  At a high level, the algorithm can be viewed as a simulation of a (very straightforward) Congested Clique algorithm.  In order to perform this type of simulation effectively, we introduce a fast algorithm for quickly computing a type of expander decomposition in distributed networks.  We show that the edge set E can be partitioned in n^{1-\delta} time into three parts Em, Es, Er  satisfying the following properties:

Em : each connected component induced by Em has polylog(n) mixing time,

Es : has arboricity n^\delta,

Er : has size at most |E|/6.  

Based on this 3-way decomposition, we show that Triangle Enumeration can be solved in O(sqrt(n)) time in the CONGEST model, improving a previous O(n^{3/4}) bound of Izumi and LeGall.

Satish Rao, Berkeley
Thoughts on "Linear Time'' solutions for linear systems and maximum flow

Now, not so recent, progress on the  linear system solving was a remarkable breakthrough.  The application of the enabling ideas to the maximum flow problem, and other problems, including linear time spanning trees has also been very exciting.

I will provide my own view, largely from a spectator's standpoint, about why and how some of the ideas behind these results. In particular, I will review the remarkable result of Kelner, Orecchia, Sidford and Zheyuan on linear time solvers, and the result of Sherman on linear time approximations for the maximum flow problem.

Ilias Diakonikolas, University of Southern California

Fitting a model to a collection of observations is one of the quintessential questions in statistics. The standard assumption is that the data was generated by a model of a given type (e.g., a mixture model). This simplifying assumption is at best only approximately valid, as real datasets are typically exposed to some source of contamination. Hence, any estimator designed for a particular model must also be robust in the presence of corrupted data. This is the prototypical goal in robust statistics, a field that took shape in the 1960s with the pioneering works of Tukey and Huber. Until recently, even for the basic problem of robustly estimating the mean of a high-dimensional dataset, all known robust estimators were hard to compute. Moreover, the quality of the common heuristics degrades badly as the dimension increases. 

In this talk, we will survey the recent progress in algorithmic high-dimensional robust statistics. We will describe the first computationally efficient algorithms for robust mean and covariance estimation and the main insights behind them. We will also present practical applications of these estimators to exploratory data analysis and adversarial machine learning. Finally, we will discuss new directions and opportunities for future work. 

The talk will be based on a number of joint works with (various subsets of) G. Kamath, D. Kane, J. Li, A. Moitra, and A. Stewart.

Michal Moshkovitz, University of San Diego
On Bounded-Memory Learning

One can learn any hypothesis class H with O(log |H|) labeled examples. Alas, learning with so few examples requires saving the examples in memory, and this requires |X|^(O(log|H|)) memory states, where X is the set of all labeled examples. This motivates the question of how many labeled examples are needed in case the memory is bounded. One might wonder whether a general combinatorial condition exists for (un)learnability with bounded memory. In this talk we give a combinatorial condition for learnability with bounded memory and a combinatorial condition for unlearnability with bounded memory. 

The talk is based on joint works with Dana Moshkovitz and Naftali Tishby.

David Wu, University of Virginia
New Constructions of Reusable Designated-Verifier NIZKs

Non-interactive zero-knowledge arguments (NIZKs) are important cryptographic primitives that has received extensive study. Today, we have many types of constructions relying on different abstractions and from a broad range of assumptions like factoring, pairing-based assumptions, and lattice-based assumptions. In this work, we study whether there are generic frameworks for building NIZKs.

We specifically consider the setting of designated-verifier NIZKs (DV-NIZKs), where a trusted setup generates a common reference string together with a secret key for the verifier. We seek reusable schemes where the verifier can reuse the secret key to verify many different proofs and soundness should hold against provers who can learn whether various proofs are accepted or rejected. In this talk, I will describe a generic approach for constructing reusable designated-verifier NIZKs based on Sigma protocols and any single-key attribute-based encryption (ABE) scheme satisfying a weak function-hiding property. Our framework can be instantiated from most algebraic assumptions known to imply public-key encryption, including the CDH, LWE, and LPN assumptions. Notably, our work gives the first reusable DV-NIZK construction from LPN.

Finally, I describe an extension of our framework to obtain the stronger notion of malicious-designated-verifier NIZKs where the only trusted setup consists of a common random string, and there is a separate untrusted setup where the verifier chooses its own public/secret key needed to generate/verify proofs, respectively. Our generic framework yields new constructions of malicious DV-NIZKs from the same assumptions as above.

Joint work with Alex Lombardi, Willy Quach, Ron D. Rothblum, and Daniel Wichs

Ryan O’Donnell, CMU
X-Ramanujan Graphs

Let X be the infinite graph partially pictured below, the "free product graph" C_4 * C_4 * C_4.  Let FinQuo(X) be the set of all finite graphs G that covered by X (i.e., that 'locally resemble' X; i.e., that consist of C_4's joined 3-to-a-vertex).  A known Alon-Boppana-type theorem implies that every G in FinQuo(X) has second eigenvalue at least sqrt(5sqrt(5)+11) - o(1), where that magic number is the spectral radius of X.  We may then ask the "Ramanujan question": are there infinitely many G in FinQuo(X) with second eigenvalue at most sqrt(5sqrt(5)+11)?  When X is the infinite d-regular tree, this is the well known Ramanujan graph problem resolved by Marcus, Spielman, and Srivastava [MSS].

We show a positive answer to the problem for a wide variety of infinite X (not necessarily trees).  Techniques involve a new infinite graph product, a new graph polynomial (generalizing the matching polynomial), "freelike walks", Heaps of Pieces, and of course [MSS]'s Interlacing Polynomials Method.

Alexander Golovnev, Harvard
Circuit Depth Reductions

The best known size lower bounds against unrestricted circuits have remained around 3n for several decades. Moreover, the only known technique for proving lower bounds in this model, gate elimination, is inherently limited to proving lower bounds of less than 5n. In this work, we propose a non-gate-elimination approach for obtaining circuit lower bounds, via certain depth-three lower bounds. We prove that every (unbounded-depth) circuit of size s can be expressed as an OR of 2^{s/3.9} 16-CNFs. For DeMorgan formulas, the best known size lower bounds have been stuck at around n^{3-o(1)} for decades. Under a plausible hypothesis about probabilistic polynomials, we show that n^{4-eps}-size DeMorgan formulas have  2^{n^{1-Omega(eps)}}-size depth-3 circuits which are approximate sums of n^{1-Omega(eps)}-degree polynomials over F_2. While these structural results do not immediately lead to new lower bounds, they do suggest new avenues of attack on these longstanding lower bound problems.
Joint work with Alexander Kulikov and Ryan Williams.

Rasmus Kyng, Harvard
A Numerical Analysis Approach to Convex Optimization
Note: 11am in GDC 4.516

In the current optimization literature, for most convex problems, it is significantly faster to obtain O(1)-approximate solutions as compared to high accuracy (1+1/poly(n))-approximate solutions. This gap is reflected in the differences between interior point methods vs. (accelerated) gradient descent for regression problems, and between exact vs. approximate undirected max-flow. One major exception is  L2-regression, where an algorithm for computing an approximate solution can be converted into a high-accuracy solution via iterative refinement. Iterative refinement is a fundamental building block of algorithms in numerical analysis but is largely limited to L2 problems.

In this talk, I will present generalizations of iterative refinement to p-norms. This leads to algorithms that produce high accuracy solutions by crudely solving only a polylogarithmic number of residual problems. I will also discuss several results that build on this new approach to high-accuracy algorithms, including p-norm regression using m^{1/3} linear system solves, and p-norm flow in undirected unweighted graphs in almost-linear time.

The talk will be based on joint works with Deeksha Adil, Richard Peng, Sushant Sachdeva, and Di Wang.

Omer Reingold, Stanford
A Complexity-Theoretic Perspective on Algorithmic Fairness
Note: 11am in GDC 4.302

A prominent concern, in the age of machine learning and data analysis, is that left to their own devices, algorithms will propagate - even amplify - existing biases. Common definitions of fairness are group-based, typically requiring that a given statistic be equal across a few demographic groups, socially identified as deserving protection. Such definitions tend to be easy to study and satisfy but tend to provide exceedingly weak protection from unfair discrimination. This motivated the introduction of definitions that aim at individual-level protections. Such protection is much stronger and more nuanced but harder to satisfy.

We will discuss a recent sequence of results, where protection is provided to a large collection of populations, definedcomplexity-theoretically. This gives a surprising approach to a question that has been debated by a variety of scientific communities – which population should be protected? Our approach suggests protecting every group that can be identified given the data and our computational limitations, which in a sense is the best we can hope to do. We will discus this approach in different contexts as well as its relation to pseudorandomness.

Based on joint works with Cynthia Dwork, Úrsula Hébert-Johnson, Michael P. Kim, Guy Rothblum and Gal Yona.