Fall 2020 Theory Seminars will meet virtually this year, Fridays from 2:30-3:30 pm. This schedule will be updated throughout the semester.

08/28/2020 - Michal Moshkovitz, UC San Diego: Unexpected Effects of Online K-means Clustering
09/04/2020 - Greg Plaxton, UT Austin: On the Existence of Three-Dimensional Stable Matchings with Cyclic Preferences
09/11/2020 - Ben Lee Volk , UT Austin: A Lower bound on Determinantal Complexity
09/25/2020 - Alexander Sherstov, UCLA: An Optimal Separation of Randomized and Quantum Query Complexity
10/02/2020 - Eshan Chattopadhyay, Cornell: Pseudorandom Generators from Fourier Tail Bounds
10/09/2020 - Sumegha Garg, Harvard: The Coin Problem with Applications to Data Streams
*All meetings after 10/16/2020 will begin at 2:00 PM.*
10/16/2020 - Shalev Ben-David, University of Waterloo: Forecasting Algorithms, Minimax Theorems, and Randomized Lower Bounds 
10/23/2020 - Anup Rao, University of Washington: Anti-concentration and the Gap-Hamming problem
10/30/2020 - Seth Pettie, University of Michigan: Planar Distance Oracles
11/13/2020 - Rafael Pass, Cornell University: On One-Way Functions and Kolmogorov Complexity
11/20/2020 - Xi Chen, Columbia University: An Improved Analysis of the Quadtree for High Dimensional EMD
12/04/2020 - Andrea Lincoln, UC Berkeley: New Techniques for Proving Fine-Grained Average-Case Hardness
12/11/2020 - Zhao Song, Princeton & Institute for Advanced Study: Optimization: From Linear Programming to Deep Learning

Michal Moshkovitz, UC San Diego
Unexpected Effects of Online K-means Clustering

Offline k-means clustering was studied extensively and algorithms with a constant approximation are available. However, online clustering is still uncharted. New factors come into play: the ordering of the dataset and whether the number of points, n, is known in advance or not. Their exact effects are unknown. In this paper we focus on the online setting where the decisions are irreversible: after a point arrives the algorithm needs to decide whether to take the point as a center or not, and this decision is final. How many centers are needed and sufficient to achieve constant approximation in this setting? We show upper and lower bounds for all the different cases. These bounds are exactly the same up to a constant, thus achieving optimal bounds. For example, for k-means cost with constant k>1 and random order, Θ(logn) centers are enough to achieve a constant approximation, while the mere apriori knowledge of n reduces the number of centers to a constant. These bounds hold for any distance function that obeys a triangle-type inequality.

Greg Plaxton, UT Austin
On the Existence of Three-Dimensional Stable Matchings with Cyclic Preferences

In the most basic version of the stable marriage problem, we are given an equal number of men and women, each of whom has complete, strict preferences over the agents of the opposite sex, and we seek a perfect matching (i.e., a partition of the agents into disjoint man-woman pairs) that is "stable" in the sense that no man and woman prefer each other to their matches.  In a famous result, Gale and Shapley presented an algorithm that computes a stable matching for any given stable marriage instance.

Knuth asked whether stable matchings are guaranteed to exist when there are three types of agents, say men, women, and dogs. In this setting, we seek to partition the agents into a stable collection of man-woman-dog families. Several variants of Knuth's question -- associated with different restrictions on the agent preferences -- have been addressed in the literature. In this talk, we study the special case where the men care only about the women, the women care only about the dogs, and the dogs care only about the men.  Our main theorem disproves some published conjectures.

Joint work with Chi-Kit Lam.

Ben Lee Volk, UT Austin
A lower bound on determinantal complexity

This talk is about algebraic complexity, which is a field that studies the complexity of computing algebraic problems using basic arithmetic operations. In the first part of the talk I will give a brief introduction to this area. The nature of questions arising in the study of polynomials as complexity theoretic objects suggests that techniques from algebraic geometry could be useful. As an example, I will then describe a recent result establishing a lower bound on the well studied notion of determinantal complexity, which is closely related to circuit lower bounds. The determinantal complexity of a polynomial f is the minimal integer m such that there exists an mxm matrix M of linear functions such that f(x)=Det(M(x)).
We give an explicit n-variate polynomial of degree n whose determinantal complexity is at least 1.5n-3. This is the strongest lower bound known as a function of the number of variables.
(Based on a joint work with Mrinal Kumar)

Alexander Sherstov, UCLA
An Optimal Separation of Randomized and Quantum Query Complexity

We prove that for every decision tree, the absolute values of the Fourier coefficients of given order $\ell\geq1$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}},$ where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant. This bound is essentially tight and settles a conjecture due to Tal (arxiv 2019; FOCS 2020). The bounds prior to our work degraded rapidly with $\ell,$ becoming trivial already at $\ell=\sqrt{d}.$

As an application, we obtain, for any positive integer $k,$ a partial function on $n$ bits that has bounded-error quantum query complexity at most ceiling(k/2) and randomized query complexity $\tilde{\Omega}(n^{1-1/k}).$ This separation of bounded-error quantum versus randomized query complexity is best possible, by the results of Aaronson and Ambainis (STOC 2015). Prior to our work, the best known separation was polynomially weaker: $O(1)$ versus $n^{2/3-\epsilon}$ for any $\epsilon>0$ (Tal, FOCS 2020).

No familiarity with quantum computing will be assumed. Joint work with Andrey Storozhenko (UCLA) and Pei Wu (UCLA): https://arxiv.org/abs/2008.10223.

Eshan Chattopadhyay, Cornell
Pseudorandom Generators from Fourier Tail Bounds

A central goal in complexity theory is to construct unconditional pseudorandom generators (PRGs) for various computational models. In this talk, I will survey a recent line of work that develops a new method for constructing PRGs via exploiting Fourier tail bounds of Boolean function classes. This gives a unified way of constructing PRGs for many well-studied classes of functions including small-depth circuits (AC0), small-width branching programs, low-sensitivity functions, and low-degree polynomials over F_2.  I will highlight open problems around this new framework that will imply application of this method in constructing PRGs for classes of functions (such as AC0 with PARITY gates) that have been beyond reach for decades.

Based on joint works with Jason Gaitonde, Pooya Hatami, Kaave Hosseini, Chin Ho Lee, Shachar Lovett, Abhishek Shetty, Avishay Tal, and David Zuckerman.

Sumegha Garg,  Harvard
The Coin Problem with Applications to Data Streams

Consider the problem of computing the majority of a stream of n i.i.d. uniformly random bits. This problem, known as the coin problem, is central to a number of counting problems in different data stream models. We show that any streaming algorithm for solving this problem with large constant advantage (over the uniform distribution) must use Ω(log n) bits of space. Previously, it was known that computing the majority on every input with a constant probability takes Ω(log n) space. We extend our lower bound to proving tight lower bounds for solving multiple, randomly interleaved copies of the coin problem, as well as for solving the OR of multiple copies of a variant of the coin problem. Our proofs involve new measures of information complexity that are well-suited for data streams.

We use these lower bounds to obtain a number of new results for data streams. In each case there is an underlying d-dimensional vector x with additive updates to its coordinates given in a stream of length m. The input streams arising from our coin lower bound have nice distributional properties, and consequently for many problems for which we only had lower bounds in general turnstile streams, we now obtain the same lower bounds in more natural models, such as the bounded deletion model, in which ||x||_2 never drops by a constant fraction of what it was earlier, or in the random order model, in which the updates are ordered randomly. 

Based on joint work with Mark Braverman and David P. Woodruff.

10/16/2020 *TIME CHANGE: Meeting will begin at 2:00 PM.*
Shalev Ben-David, University of Waterloo
Forecasting Algorithms, Minimax Theorems, and Randomized Lower Bounds

I will present a new approach to randomized lower bounds, particularly in the setting where we wish to give a fine-grained analysis of randomized algorithms that achieve small bias. The approach is as follows: instead of considering ordinary randomized algorithms which give an output in {0,1} and may err, we switch models to look at "forecasting" randomized algorithms which output a confidence in [0,1] for whether they think the answer is 1. When scored by a proper scoring rule, the performance of the best forecasting algorithm is closely related to the bias of the best (ordinary) randomized algorithm, but is more amenable to analysis.

As an application, I'll present a new minimax theorem for randomized algorithms, which can be viewed as a strengthening of Yao's minimax theorem. Yao's minimax theorem guarantees the existence of a hard distribution for a function f such that solving f against this distribution (to a desired error level) is as hard as solving f in the worst case (to that same error level). However, the hard distribution provided by Yao's theorem depends on the chosen error level. Our minimax theorem removes this dependence, giving a distribution which certifies the hardness of f against all bias levels at once. In recent work, we used this minimax theorem to give a tight composition theorem for randomized query complexity.

Based on joint work with Eric Blais.

10/23/2020  *TIME CHANGE: Meeting will begin at 2:00 PM.*
Anup Rao, University of Washington
Anti-concentration and the Gap-Hamming problem

We prove new lower bounds on the well known Gap-Hamming problem in communication complexity. Our main technical result is an anti-concentration bound for the inner product of two independent random vectors. We show that if A, B are arbitrary  subsets of the cube {±1}^n with |A| · |B| ≥ 2^(1.01n), and X ∈ A and Y ∈ B are sampled independently and uniformly, then the inner product <X,Y> must be anti-concentrated: it takes on any fixed value with probability at most O(1/sqrt(n)). In fact, the following stronger claim holds: for any integer k, |Pr[<X,Y>=k] - Pr[<X,Y> = k+4]| is at most O(1/n). Remarkably, this last claim is false if 4 is replaced with an integer that is not divisible by 4. I will explain why this happens in my talk.

This is joint work with Amir Yehudayoff.

Seth Pettie, University of Washington
Planar Distance Oracles

We consider the problem of preprocessing a weighted planar graph in order to answer exact distance and shortest path queries. As in the recent algorithms of Cohen-Addad et al. (2017), Gawrychowski et al. (2018), and Charalampopoulos et al. (2019), our algorithm is based on solving the point-location problem in weighted planar Voronoi diagrams.

We give a new, efficient method to solve point location using persistent data structures, which leads to a new time-space tradeoff for the problem. At the extremes of this tradeoff, the data structure:

* occupies $n^{1+o(1)}$ space and answers distance queries in $\log^{2+o(1)} n$ time, or

* occupies $n\log^{2+o(1)} n$ space and answers distance queries in$n^{o(1)}$ time.

Joint work with Yaowei Long, to appear in SODA 2021. arXiv:2007.08585.

Rafael Pass, Cornell University
On One-Way Functions and Kolmogorov Complexity

We prove the equivalence of two fundamental problems in the theory of computing. For every polynomial t(n)>2n, the following are equivalent:

  • Cryptographic one-way functions exists; 
  • The t-time bounded Kolmogorov Complexity problem is mildly hard-on-average.

In doing so, we present the first natural, and well-studied, computational problem characterizing the feasibility of the central private-key primitives and protocols in Cryptograpy. 

Joint work with Yanyi Liu: https://eccc.weizmann.ac.il/report/2020/052/.

Xi Chen, Columbia University
An Improved Analysis of the Quadtree for High Dimensional EMD

The Earth Mover's Distance (EMD) between two multi-sets A,B in R^d of size s is the min-cost of bipartite matchings between points in A and B, where the cost is measured by distance between points. We study a class of divide-and-conquer algorithms based on a quadtree data structure. Our analysis improves on the O(min{log s, log d} log s)-approximation of Andoni et al. (2008) and Backurs et al. (2020), showing that matchings from quadtrees can be used to achieve an O(log s)-approximation of EMD. As further applications, we give new linear sketches (and streaming algorithms) for the approximation of EMD. The main conceptual contribution is an analytical framework for studying quadtrees which goes beyond the worst-case distortion of randomized tree embeddings.

Joint work with Rajesh Jayaram, Amit Levi and Erik Waingarten.

Andrea Lincoln, UC Berkeley
New Techniques for Proving Fine-Grained Average-Case Hardness

In this talk I will cover a new technique for worst-case to average-case reductions. There are two primary concepts introduced in this talk: "factored" problems and a framework for worst-case to average-case fine-grained (WCtoACFG) self reductions.
We will define new versions of OV, kSUM and zero-k-clique that are both worst-case and averagecase fine-grained hard assuming the core hypotheses of fine-grained complexity. We then use these as a basis for fine-grained hardness and average-case hardness of other problems. Our hard factored problems are also simple enough that we can reduce them to many other problems, e.g. to edit distance, k-LCS and versions of Max-Flow. We further consider counting variants of the factored problems and give WCtoACFG reductions for them for a natural distribution.

To show hardness for these factored problems we formalize the framework of [Boix-Adsera et al. 2019]  that was used to give a WCtoACFG reduction for counting k-cliques. We define an explicit property of problems such that if a problem has that property one can use the framework on the problem to get a WCtoACFG self reduction.

In total these factored problems and the framework together give tight fine-grained average-case hardness for various problems including the counting variant of regular expression matching. Based on joint work with Mina Dalirrooyfard and Virginia Vassilevska Williams. 

Zhao Song, Princeton & Institute for Advanced Study
Optimization: From Linear Programming to Deep Learning

Many real-life problems can be solved by optimization methods including both linear programming and deep learning. The running time of optimization algorithms usually consists of two components, the number of iterations and the cost per iteration. Over the past several decades, researchers have been putting tremendous efforts on reducing the number of iterations. Recently, we found new angles and proposed novel techniques to speed up the cost per iteration instead. In this talk, we first show how to use those ideas to speed up linear programming to n^omega + n^{2+1/18} time. Further, we show our ideas can be applied to the field of deep learning theory, and can be used to train neural networks more efficiently in practice.

This talk is based on joint works with Jan van den Brand, Shunhua Jiang, Binghui Peng, Omri Weinstein, Hengjie Zhang.