Abstracts
The New Complexity Landscape around Circuit Minimization
Eric Allender
The Minimum Circuit Size Problem (MCSP) has been a central problem in the study of computational complexity theory for a long time, and promises to remain a focus of attention for many years to come. This problem can be said to capture the question of whether it is difficult to prove lower bounds; it appears to be hard, but is it really? This talk will touch on the rich connections between MCSP and other topics in theoretical computer science, and will survey some of the remarkable recent progress that has been made.
slides
Spectral graph modification for improved spectral clustering
Yiannis Koutis
Spectral clustering algorithms provide approximate solutions to hard optimization problems that formulate graph partitioning in terms of graph conductance. It is well understood that the quality of these approximate solutions is negatively affected by a possibly significant gap between the conductance and the second eigenvalue of the graph. In this paper we show that for any graph G, there exists a spectral maximizer graph H which is cut-similar to G, but has eigenvalues that are near the theoretical limit implied by the cut structure of G. Applying spectral clustering to H then has the potential to produce improved cuts that, by cut similarity, also exist in G. This leads to the second contribution of this work: we describe a practical spectral modification algorithm that raises the eigenvalues of the input graph while preserving its cuts. Combined with spectral clustering on the modified graph, this yields demonstrably improved cuts.
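For context, the baseline the abstract builds on can be sketched in a few lines: standard spectral bipartitioning thresholds the second eigenvector of the normalized Laplacian. This is a minimal illustration of that baseline, not the paper's spectral modification algorithm; the example graph is mine.

```python
import numpy as np

def spectral_bipartition(A):
    """Split a graph into two parts by thresholding the second eigenvector
    of its normalized Laplacian (standard spectral clustering baseline)."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)               # ascending eigenvalues
    fiedler = eigvecs[:, 1]                            # 2nd-smallest eigenvalue's vector
    return fiedler >= np.median(fiedler)               # threshold into two clusters

# Two triangles joined by a single edge: the sparse cut is that edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
part = spectral_bipartition(A)   # separates {0,1,2} from {3,4,5}
```

The paper's point is that when the second eigenvalue sits far below the bound implied by the cut structure, this eigenvector can mislead; running the same routine on a cut-similar spectral maximizer H avoids that failure mode.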
Fast, cheap, but in control: sublinear-time algorithms for approximate computations
Dana Ron
When we refer to efficient algorithms, we usually mean polynomial-time algorithms.
However, there are many scenarios in which the input is very large, and we seek even more efficient algorithms whose running time is sublinear in the size of the input. Such algorithms do not even read the entire input, but rather sample random parts of the input and are required to output approximately good answers with high success probability.
In this talk I will try to give a flavor of research in the area of sublinear-time algorithms.
slides
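A toy instance of the sampling paradigm described above (my illustration, not an example from the talk): estimating the fraction of 1s in a huge binary array to within additive error eps by reading only O(log(1/delta)/eps^2) random positions, as guaranteed by the Hoeffding bound.

```python
import math
import random

def estimate_fraction(arr, eps=0.05, delta=0.01):
    """Estimate the fraction of 1s in arr to within +/- eps, with failure
    probability at most delta, by reading only O(log(1/delta)/eps^2)
    random positions (Hoeffding bound) -- sublinear in len(arr)."""
    m = math.ceil(math.log(2 / delta) / (2 * eps ** 2))  # sample size, ~1060 here
    hits = sum(arr[random.randrange(len(arr))] for _ in range(m))
    return hits / m

random.seed(0)
big = [1] * 300_000 + [0] * 700_000   # true fraction of 1s: 0.3
est = estimate_fraction(big)          # reads ~1060 of the 1,000,000 entries
```

The sample size depends only on the accuracy parameters, not on the input length, which is the defining feature of sublinear-time algorithms.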
Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem
Vaggos Chatziafratis
Understanding the representational power of Deep Neural Networks
(DNNs) and how their structural properties (e.g., depth, width, type
of activation
unit) affect the functions they can compute, has been an important
yet challenging question in deep learning and approximation theory.
In a seminal paper, Telgarsky highlighted the benefits of depth by
presenting a family of functions (based on simple triangular waves)
for which DNNs achieve zero classification error, whereas shallow
networks with fewer than exponentially many nodes incur constant
error. Even though Telgarsky's work reveals the limitations of
shallow neural networks, it does not explain why these functions
are difficult to represent; indeed, he poses as a tantalizing
open question the characterization of those functions that cannot be
well-approximated by smaller depths. In this work, we point to a new
connection between DNNs expressivity and Sharkovsky's Theorem from
dynamical systems, that enables us to characterize the depth-width
trade-offs of ReLU networks for representing functions based on the
presence of a generalized notion of fixed points, called periodic
points (a fixed point is a point of period 1). Motivated by our
observation that the triangle waves used in Telgarsky's work contain
points of period 3 - a period that is special in that it implies
chaotic behaviour based on the celebrated result by Li-Yorke - we
proceed to give general lower bounds for the width needed to
represent periodic functions as a function of the depth. Technically, the crux of our approach is based on an eigenvalue analysis of the dynamical systems associated with such functions.
(Hidden) Zero-Sum Games: The Final (Non-Convex Non-Concave) Frontier
*A story by Poincaré, Bendixson, Hamilton and von Neumann*
Manolis Vlatakis
A fundamental question of game theory is how agents participating in
a game will behave over time. Will their dynamics equilibrate? If not, can we
make any other predictions about their long-term behavior? Recently, this
problem has attracted renewed interest motivated by the advent of
Generative Adversarial Networks (GANs) and their numerous
applications. This AI architecture is effectively a zero-sum game between
two NNs (Generator & Discriminator) that are trying to outcompete each
other; critically, they do not belong to the standard class of
convex-concave games where the minmax theorem applies. In this
work, motivated by the indirect nature of the competition in GANs, where
players control the parameters of a neural network while the actual
competition happens between the distributions that the generator and
discriminator capture, we study a wide class of non-convex non-concave
min-max games that generalizes over standard bilinear zero-sum games. In
this class, players control the inputs of a smooth function whose output is
being applied to a bilinear zero-sum game. We establish theoretically that,
depending on the specific instance of the problem, gradient-descent-ascent
dynamics can exhibit a variety of behaviors antithetical to convergence to
the game theoretically meaningful min-max solution. Specifically, different
forms of recurrent behavior (including periodicity and Poincaré recurrence)
are possible as well as convergence to spurious (non-min-max) equilibria
for a positive measure of initial conditions. At the technical level, our
analysis combines tools from optimization theory, game theory and dynamical
systems.
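The simplest instance of the phenomenon is a standard classroom example rather than anything specific to this paper: simultaneous gradient-descent-ascent on the bilinear game f(x, y) = xy spirals away from the unique min-max equilibrium (0, 0) instead of converging to it.

```python
import math

# f(x, y) = x * y: x minimizes, y maximizes; the unique min-max point is (0, 0).
eta = 0.1
x, y = 1.0, 1.0
radii = []
for _ in range(200):
    gx, gy = y, x                        # df/dx = y, df/dy = x
    x, y = x - eta * gx, y + eta * gy    # simultaneous GDA step
    radii.append(math.hypot(x, y))
# Each step multiplies the distance to (0, 0) by sqrt(1 + eta^2),
# so the iterates spiral outward rather than converging.
```

A short calculation confirms the comment: the new squared radius is (x - eta*y)^2 + (y + eta*x)^2 = (1 + eta^2)(x^2 + y^2), so divergence holds for every step size eta > 0.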
Computational Social Choice and Computational Complexity: BFFs?
Lane A. Hemaspaandra
We discuss the connection between computational complexity and computational social choice (aka comsoc, which for this talk will mostly be the study of computational aspects of election systems: who wins, how elections can be manipulatively attacked, etc.). We stress the work so far on, and urge continued focus on, two less-recognized aspects of this connection. Firstly, this is very much a two-way street: Everyone knows complexity classification is used in comsoc, but we also highlight benefits to complexity that have arisen from its use in comsoc. Secondly, less-known complexity tools often can be very productively used in comsoc.
This is an updated version of a talk presented in the AAAI 2018 Senior Member Track.
On the Complexity of Decomposable Randomized Encodings, or: How Friendly Can a Garbling-Friendly PRF be?
Marshall Ball
Garbling schemes, also known as decomposable randomized encodings (DRE), have found many applications in cryptography. However, despite a large body of work on constructing such schemes, very little is known about their limitations.
We initiate a systematic study of the DRE complexity of Boolean functions, obtaining the following main results:
- [Near-quadratic lower bounds.] We use a classical lower bound technique of Neciporuk [Dokl. Akad. Nauk SSSR '66] to show an Ω(n²/log n) lower bound on the size of any DRE for many explicit Boolean functions. For some natural functions, we obtain a corresponding upper bound, thus settling their DRE complexity up to polylogarithmic factors. Prior to our work, no superlinear lower bounds were known, even for non-explicit functions.
- [Garbling-friendly PRFs.] We show that any exponentially secure PRF has Ω(n²/log n) DRE size, and present a plausible candidate for a garbling-optimal PRF that nearly meets this bound. This candidate establishes a barrier for super-quadratic DRE lower bounds via natural proof techniques. In contrast, we show a candidate for a weak PRF with near-exponential security and linear DRE size.
Our results establish several qualitative separations, including near-quadratic separations between computational and information-theoretic DRE size of Boolean functions, and between DRE size of weak vs. strong PRFs.
Joint work with Justin Holmgren, Yuval Ishai, Tianren Liu, and Tal Malkin.
Secure Multi-Party Computation for Federated Learning
Antigoni Polychroniadou
Federated Learning is a technique that enables a large number of users to jointly learn a shared machine learning model, managed by a centralized server, while the training data remains on user devices. In this way, federated learning reduces data privacy risks.
However, privacy concerns still exist since it is possible to leak information about the training data set from the trained model's weights or parameters. Most federated learning systems use the technique of differential privacy to add noise to the weights so that it is harder to reverse-engineer the individual data sets. Differential privacy reduces the risk but does not eliminate leakage from the data. The combination of differential privacy and cryptography can eliminate leakage.
Unlike prior works which combine differential privacy and cryptography, we propose a new mechanism that protects against any attempt to undermine differential privacy by any collusion of parties.
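The differential-privacy step described above is typically realized with the Gaussian mechanism: each user's model update is clipped to a fixed norm and noise calibrated to that clipping bound is added. The sketch below shows generic DP-SGD-style aggregation, not the paper's protocol; the clipping bound C and the noise multiplier are illustrative choices.

```python
import numpy as np

def private_aggregate(updates, C=1.0, noise_multiplier=1.1, rng=None):
    """Clip each user's update to L2 norm at most C, sum the clipped
    updates, and add Gaussian noise scaled to C (the Gaussian mechanism).
    Bounding each user's contribution is what calibrates the noise."""
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = [u * min(1.0, C / max(np.linalg.norm(u), 1e-12)) for u in updates]
    total = np.sum(clipped, axis=0)
    # sigma = noise_multiplier * C: noise masks any single user's contribution
    return total + rng.normal(0.0, noise_multiplier * C, size=total.shape)

rng = np.random.default_rng(42)
updates = [rng.normal(size=5) for _ in range(10)]   # mock per-user updates
noisy_sum = private_aggregate(updates, rng=rng)
```

As the abstract notes, this noise only reduces leakage from the released aggregate; combining it with cryptographic aggregation is what hides the individual updates themselves.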
Robust ML: Progress, challenges, and a new perspective
Dimitris Tsipras
Recent progress has made the deployment of machine learning (ML) systems in the real world an imminent possibility.
But are our current ML systems really up to the task?
In this talk, I will discuss the pervasive brittleness of existing ML tools and offer a new perspective on how it arises. I will then describe a conceptual framework that aims to deliver models that are more reliable, and robust to adversarial manipulation. Finally, I will outline how this framework constitutes a new learning paradigm, how it differs from the classic perspective, and what benefits it provides, beyond robustness itself.
Private Identity Testing for High Dimensional Distributions
Lydia Zakynthinou
As hypothesis tests are applied to sensitive data in a myriad of settings, an urgent need arises for tests that respect the privacy of their participants' data. Differential privacy is a mathematically rigorous technique that has become the standard for ensuring privacy in machine learning models and has been adopted in several real-world applications, although constructing sample-efficient private algorithms has been a challenge, especially for high-dimensional data.
In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: d-dimensional Gaussians with known covariance and product distributions over {-1,+1}^d. Our testers have improved sample complexity compared to those derived from previous techniques, and are the first testers whose sample complexity matches the order-optimal non-private sample complexity in many parameter regimes. We construct two types of testers, exhibiting tradeoffs between sample complexity and computational complexity. In this talk, I will describe the computationally inefficient and efficient testers for product distributions, sketching the two different techniques behind these algorithms.
This is joint work with Clément Canonne, Gautam Kamath, Audra McMillan, and Jonathan Ullman.
Grinding the Space: Learning to Classify Against Strategic Agents
Chara Podimata
We study the problem of online learning in strategic classification settings from the perspective of the learner, who repeatedly faces myopically rational strategic agents. We model this interplay as a repeated Stackelberg game: at each timestep, the learner first deploys a high-dimensional linear classifier; an agent then observes the classifier and, given his real feature vector and his underlying utility function, best-responds with a (potentially altered) feature vector. We measure the performance of the learner in terms of Stackelberg regret for her 0-1 loss function.
Surprisingly, we prove that in strategic settings like the one considered in this paper there exist worst-case scenarios, where any sequence of actions providing sublinear external regret might result in linear Stackelberg regret and vice versa. We then provide the Grinder Algorithm, an adaptive discretization algorithm, potentially of independent interest in the online learning community, and prove a data-dependent upper bound on its Stackelberg regret given oracle access, while being computationally efficient. We also provide a nearly matching lower bound for the problem of strategic classification. We complement our theoretical analysis with simulation results, which suggest that our algorithm outperforms the benchmarks, even given access only to approximation oracles. Our results advance the known state-of-the-art results in the growing literature of online learning from revealed preferences, which has so far focused on smoother utility and loss functions from the perspective of the agents and the learner, respectively.
This is joint work with Yiling Chen (Harvard) and Yang Liu (UC Santa Cruz).
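The agent's side of the interaction can be illustrated with a minimal best response (an illustrative utility model, not the paper's exact setup): an agent who gains 1 from a positive classification and pays a cost proportional to the distance moved will cross the deployed hyperplane whenever the required movement is cheap enough.

```python
import numpy as np

def best_response(x, w, b, cost_per_unit=0.5, margin=1e-6):
    """Illustrative agent utility: 1 if classified positive by
    sign(w.x + b), minus cost_per_unit * distance moved. The cheapest
    way to become positive is to move straight toward the hyperplane;
    the agent does so iff the gain exceeds the movement cost."""
    score = w @ x + b
    if score >= 0:
        return x                               # already positive: stay put
    dist = -score / np.linalg.norm(w)          # distance to the boundary
    if cost_per_unit * dist <= 1.0:            # worth gaming the classifier
        return x + (dist + margin) * w / np.linalg.norm(w)
    return x                                   # too expensive: report truthfully

w, b = np.array([1.0, 1.0]), -2.0
x_near = np.array([0.5, 0.5])    # boundary ~0.71 away: cheap to manipulate
x_far = np.array([-3.0, -3.0])   # boundary ~5.66 away: manipulation too costly
```

The learner only ever sees the (possibly altered) reported vectors, which is what makes the Stackelberg regret objective harder than the standard external regret benchmark.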