Spring 2021 Theory Seminars will meet virtually this semester, Fridays from 1:30-2:30 pm. This schedule will be updated throughout the semester.

01/22/2021 - Richard Peng, Georgia Institute of Technology: Solving Sparse Linear Systems Faster than Matrix Multiplication

02/05/2021 - Rahul Ilango, Massachusetts Institute of Technology: Towards Hardness for the Minimum Circuit Size Problem

02/12/2021 - Mark Bun, Boston University: An Equivalence between Private Classification and Online Prediction

03/05/2021 - Rocco Servedio, Columbia University: Some Recent Results on Trace Reconstruction

03/12/2021 - Daniel Kane, University of California San Diego: Point Location and Active Learning

03/26/2021 - Muthu Venkitasubramaniam, University of Rochester: Average-case Complexity Through the Lens of Interactive Puzzles

04/02/2021 - Andy Drucker, University of Chicago: An Improved Exponential-Time Approximation Algorithm for Fully-Alternating Games Against Nature

04/09/2021 - Mark Braverman, Princeton: Optimal Tiling of the Euclidean Space Using Symmetric Bodies

04/16/2021 - Matthew Fahrbach, Google: Edge-Weighted Online Bipartite Matching

04/23/2021 - Gil Cohen, Tel Aviv University: Tree Codes and Interactive Coding Schemes

04/30/2021 - Eric Price, UT: Instance-Optimal Compressed Sensing via Conditional Resampling

05/07/2021 - Rahul Santhanam, Oxford: An Unconditional Polynomial-Time Pseudorandom Generator and its Consequences

**01/22/2021Richard Peng, Georgia Institute of Technology
Solving Sparse Linear Systems Faster than Matrix Multiplication**

Can linear systems be solved faster than matrix multiplication? While there has been remarkable progress for the special cases of graph structured linear systems, in the general setting, the bit complexity of solving an n-by-n linear system Ax=b is n^\omega, where \omega<2.372864 is the matrix multiplication exponent. Improving on this has been an open problem even for sparse linear systems with poly(n) condition number.

We present an algorithm that solves linear systems in sparse matrices asymptotically faster than matrix multiplication for any \omega>2. This speedup holds for any input matrix A with o(n^{\omega -1}/\log(\kappa(A))) non-zeros, where \kappa(A) is the condition number of A. For poly(n)-conditioned matrices with O(n) nonzeros, and the current value of \omega, the bit complexity of our algorithm to solve to within any 1/poly(n) error is O(n^{2.331645}).

Our algorithm can be viewed as an efficient randomized implementation of the block Krylov method via recursive low displacement rank factorizations. It is inspired by the algorithm of [Eberly-Giesbrecht-Giorgi-Storjohann-Villard ISSAC `06 `07] for inverting matrices over finite fields. In our analysis of numerical stability, we develop matrix anti-concentration techniques to bound the smallest eigenvalue and the smallest gap in eigenvalues of semi-random matrices.

Joint work with Santosh Vempala, manuscript at https://arxiv.org/abs/2007.10254.

**02/05/2021Rahul Ilango, Massachusetts Institute of Technology
Towards Hardness for the Minimum Circuit Size Problem**

While understanding the computational complexity of the Minimum Circuit Size Problem (MCSP) is strongly motivated by connections to areas such as learning theory, cryptography, and average-case complexity, we still know relatively little about the problem. In particular, it is a longstanding open problem to base the intractability of MCSP on standard worst-case assumptions.

In this talk, I will survey a series of recent works that prove hardness results for natural variants of MCSP. This includes NP-hardness for an oracle-version, a multi-output version, and a constant-depth formula version. I'll discuss the last result in the most detail, and I'll highlight an application that uses similar techniques to establish the existence of functions with a 2^{\Omega_d(n)} additive gap between their depth-d and depth-(d+1) formula complexity.

**02/12/2021Mark Bun, Boston University
An Equivalence between Private Classification and Online Prediction**

Differential privacy enables rich statistical analyses on data while provably protecting individual-level privacy. The last 15 years of research has shown that, at least in principle, a number of fundamental statistical tasks are compatible with differential privacy. However, privacy-preserving analyses often require additional complexity over their non-private counterparts, for instance, in terms of the number of data samples one needs to collect in order to get accurate results. In fact, some infinite concept classes that are "easy" to learn in standard computational learning theory become impossible to learn under differential privacy using any finite number of samples.

Alon, Livni, Malliaris, and Moran recently showed that finite Littlestone dimension is a necessary condition for a concept class to be privately learnable, and conjectured that it is sufficient as well. Here, Littlestone dimension characterizes learnability in the mistake-bound model of online learning. We prove their conjecture, establishing a qualitative equivalence between private learnability and online learnability. Our result goes by way of a new connection between privacy and algorithmic stability for learning.

Joint work with Roi Livni and Shay Moran.

**03/05/2021Rocco Servedio, Columbia University
Some Recent Results on Trace Reconstruction**

I will give a broad overview of the "trace reconstruction problem", which is the problem of reconstructing a string from deletion-plagued copies of it. The general problem is largely open with an exponential gap between known upper and lower bounds; the main focus of the talk will be on some recent results in settings where efficient trace reconstruction is known to be possible.

**03/12/2021Daniel Kane, University of California San Diego
Point Location and Active Learning**

In the point location one is given a hyperplane arrangement and an unknown point. By making linear queries about that point one wants to determine which cell of the hyperplane arrangement it lies in. This problem has an unexpected connection to the problem in machine learning of actively learning a halfspace. We discuss these problems and their relationship and provide a new and nearly optimal algorithm for solving them.

**03/26/2021Muthu Venkitasubramaniam, University of Rochester
Average-case Complexity Through the Lens of Interactive Puzzles**

Consider the following two fundamental open problems in complexity theory: (a) Does a hard-on-average language in $\NP$ imply the existence of one-way functions?, or (b) Does a hard-on-average language in NP imply a hard-on-average problem in TFNP (i.e., the class of total NP search problem)? Our main result is that the answer to (at least) one of these questions is yes.

Both one-way functions and problems in TFNP can be interpreted as promise-true distributional NP search problems---namely, distributional search problems where the sampler only samples true statements. As a direct corollary of the above result, we thus get that the existence of a hard-on-average distributional NP search problem implies a hard-on-average promise-true distributional NP search problem. In other words, ”It is no easier to find witnesses (a.k.a. proofs) for efficiently-sampled statements (theorems) that are guaranteed to be true.”

This result follows from a more general study of interactive puzzles---a generalization of average-case hardness in NP—and in particular, a novel round-collapse theorem for computationally-sound protocols, analogous to Babai-Moran's celebrated round-collapse theorem for information-theoretically sound protocols.

Joint work with Rafael Pass.

**04/02/2021Andy Drucker, University of Chicago
An Improved Exponential-Time Approximation Algorithm for Fully-Alternating Games Against Nature**

"Games against Nature" [Papadimitriou '85] are two-player games of perfect information, in which one player's moves are made randomly. Estimating the value of such games (i.e., winning probability under optimal play by the strategic player) is an important goal in the study of decision-making under uncertainty. The problem's PSPACE-completeness does not rule out non-trivial algorithms, a goal fulfilled in certain cases by Littman, Majercik, and Pitassi [LMP '01].

We study the case in which the players strictly alternate with binary moves, for which [LMP '01] does not give a general speedup. We give a randomized algorithm to approximate the value of such games (and to produce near-optimal play) . Our algorithm achieves exponential savings over brute-force, making 2^{(1-delta)n} queries to the input game's lookup table for some absolute constant delta > 0, and certifies a lower bound on the game value with exponentially-small expected additive error. (On the downside, delta is tiny and the algorithm uses exponential space.)

Our algorithm is recursive, and bootstraps a "base-case" algorithm for fixed-size inputs. The base-case algorithm and the form of recursive composition used are interesting and, we feel, likely to find uses beyond the present work.

**04/09/2021Mark Braverman, Princeton
Optimal Tiling of the Euclidean Space Using Symmetric Bodies**

What is the least surface area of a symmetric body B whose Z^n translations tile R^n? Since any such body must have volume 1, the isoperimetric inequality implies that its surface area must be at least \Omega(\sqrt{n}). Remarkably, Kindler et al. showed that for general bodies B this is tight, i.e. that there is a tiling body of R^n whose surface area is O(\sqrt{n}). In contrast, the surface area of the "trivial" tiling by cubes [0,1]^n is 2n.

In theoretical computer science, the tiling problem is intimately to the study of parallel repetition theorems (which are an important component in PCPs), and more specifically in the question of whether a ``strong version'' of the parallel repetition theorem holds. Raz showed, using the odd cycle game, that strong parallel repetition fails in general, and subsequently these ideas were used to construct non-trivial tilings of R^n.

Motivated by the study of a symmetric parallel repetition, we consider the symmetric variant of the tiling problem in R^n. We show that any symmetric body that tiles $\mathbb{R}^n$ must have surface area at least \Omega(n/\sqrt{\log n}), and that this bound is tight, i.e.\ that there is a symmetric tiling body of R^n with surface area O(n/\sqrt{\log n}). We also give matching bounds for the value of the symmetric parallel repetition of Raz's odd cycle game.

Based on joint work with Dor Minzer (MIT).

**04/16/2021Matthew Fahrbach, Google
Edge-Weighted Online Bipartite Matching**

Online bipartite matching is one of the most fundamental problems in the online algorithms literature. Karp, Vazirani, and Vazirani (STOC 1990) introduced an elegant algorithm for the unweighted bipartite matching that achieves an optimal competitive ratio of 1-1/e. Aggarwal et al. (SODA 2011) later generalized their algorithm and analysis to the vertex-weighted case. Little is known, however, about the most general edge-weighted problem aside from the trivial 1/2-competitive greedy algorithm. In this paper, we present the first online algorithm that breaks the long-standing 1/2 barrier and achieves a competitive ratio of at least 0.5086. In light of the hardness result of Kapralov, Post, and Vondrák (SODA 2013) that restricts beating a 1/2 competitive ratio for the more general problem of monotone submodular welfare maximization, our result can be seen as strong evidence that edge-weighted bipartite matching is strictly easier than submodular welfare maximization in the online setting.

The main ingredient in our online matching algorithm is a novel subroutine called online correlated selection (OCS), which takes a sequence of pairs of vertices as input and selects one vertex from each pair. Instead of using a fresh random bit to choose a vertex from each pair, the OCS negatively correlates decisions across different pairs and provides a quantitative measure on the level of correlation. We believe our OCS technique is of independent interest and will find further applications in other online optimization problems.

**04/23/2021Gil Cohen, Tel Aviv University
Tree Codes and Interactive Coding Schemes**

Interactive coding is the study of conversing over imperfect communication channels. In this talk I will survey key results in this 30 year old field, focusing on the related combinatorial problem of constructing tree codes.

**04/30/2021Eric Price, UT
Instance-Optimal Compressed Sensing via Conditional Resampling**

Given a distribution P of images, how many random linear measurements are required for approximate recovery? Classical compressed sensing gives an upper bound for approximately sparse P, but actual distributions may have more or different structure than sparsity. We instead give an instance-optimal bound: when P is (approximately) known, then conditional resampling, where we output a sample of P(x | y), is within constant factors of the best possible recovery algorithm.

When applied to state-of-the-art deep generative models, we find that conditional resampling produces less washed-out, more realistic results than prior methods. It also has nice fairness properties, including "proportional representation": the representation of any group in the output matches the input distribution.

**05/07/2021Rahul Santhanam, Oxford
An Unconditional Polynomial-Time Pseudorandom Generator and its Consequences**

The existence of pseudo-random generators (PRGs) secure against general polynomial-time adversaries is one of the central questions in cryptography and pseudorandomness. For each $\epsilon > 0$, we give an unconditional construction of a PRG G from n^{\epsilon} bits to n bits that is infinitely often secure against probabilistic linear-time adversaries, such that G is computable in probabilistic polynomial time with one bit of advice. As a consequence, we show that there are infinitely many n for which there is a prime p_n of length n with a succinct representation of size n^{\epsilon} from which p_n can be feasibly recovered.

Our broader motivation is the study of pseudodeterministic algorithms, pioneered by Gat and Goldwasser. A pseudodeterministic algorithm for a search problem is one that outputs a fixed answer to the search problem with high probability. We show a close correspondence between pseudodeterministic algorithms for randomized search problems and long-standing open problems about the structure of probabilistic time, such as the existence of complete problems and of tight hierarchies.

Joint work with Zhenjian Lu and Igor Oliveira.