The Georgia Tech ACO Student Seminar is run by students in the ACO program.
The purpose of this seminar is to keep students updated with current research, and to practice presenting skills for those who will give a talk in a conference. Any topic related (but not restricted) to algorithms, combinatorics, and optimization is very welcome. You can share research results, present a work in progress or just share something of general interest. There will also be occasional talks by ACO faculty.
The seminar currently meets on Fridays from 1pm to 2pm at Skiles 254, unless noted below. For more information, including the online meeting links, please refer to the announcement sent by the organizers (or ask the organizers for the meeting link if you miss the announcement).
Subscribe to the ACO Student Seminar mailing list to receive regular announcements of talks at this seminar.
Abstract:
Designing mechanisms to maximize revenue is a fundamental problem in mathematical economics, with applications such as online ad auctions and spectrum auctions. Unfortunately, optimal auctions for selling multiple items can be unreasonably complex and computationally intractable. In this talk, we consider a revenue-maximizing seller with n items facing a single unit-demand buyer. Our work shows that simple mechanisms can achieve almost optimal revenue. We approach the tradeoffs of simplicity formally through the lens of computation and menu size. Our main result is a mechanism that achieves a (1 − ε)-approximation to the optimal revenue in time quasi-polynomial in n and has quasi-polynomial (symmetric) menu complexity.
Joint work with Pravesh Kothari, Ariel Schvartzman, Sahil Singla, and Matt Weinberg.
Abstract: Queueing delays are ubiquitous in many domains, including computer systems, service systems, communication networks, supply chains, and transportation. Queueing and scheduling theory provide a rigorous basis for understanding how to reduce delays with scheduling, including evaluating policy performance and guiding policy design. Unfortunately, state-of-the-art theory fails to address many practical concerns. For example, scheduling theory seldom treats nontrivial preemption limitations, and there is very little theory for scheduling in multiserver queues. We present two new, broadly applicable tools that greatly expand the reach of scheduling theory, using each to solve multiple open problems. The first tool, called “SOAP”, is a new unifying theory of scheduling in single-server queues, specifically the M/G/1 model. SOAP characterizes the delay distribution of a broad space of policies, most of which have never been analyzed before. Such policies include the Gittins index policy, which minimizes mean delay in low-information settings, and many policies with preemption limitations. The second tool, called “WINE”, is a new queueing identity that complements Little’s law. WINE enables a new method of analyzing complex queueing systems by relating them to simpler systems. This results in the first delay bounds for SRPT (shortest remaining processing time) and the Gittins index policy in multiserver queues, specifically the M/G/k model.
Abstract: The k-cap (or k-winners-take-all) process on a graph works as follows: in each iteration, exactly k vertices of the graph are in the cap (i.e., winners); the winners of the next round are the vertices with the highest total degree to the current winners, with ties broken randomly. This natural process is a simple model of firing activity in the brain. We study its convergence on geometric random graphs, revealing rather surprising behavior.
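As a concrete illustration, here is a minimal Python sketch of the k-cap dynamics described above, run on an arbitrary undirected graph given as an adjacency list; the geometric random graph setting and all quantitative claims are from the talk, not this code, and the tiny cycle example is purely hypothetical.

```python
import random

def k_cap_step(adj, cap, k):
    """One round: the next cap is the k vertices with the largest number of
    neighbors in the current cap, with ties broken uniformly at random."""
    cap_set = set(cap)
    scores = {v: sum(u in cap_set for u in adj[v]) for v in adj}
    order = list(adj)
    random.shuffle(order)                                # random tie-breaking ...
    order.sort(key=lambda v: scores[v], reverse=True)    # ... via a stable sort
    return order[:k]

def run_k_cap(adj, k, rounds=50, seed=0):
    random.seed(seed)
    cap = random.sample(list(adj), k)                    # arbitrary initial winners
    for _ in range(rounds):
        cap = k_cap_step(adj, cap, k)
    return cap

# hypothetical toy example: a 6-cycle with k = 2
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(run_k_cap(cycle, k=2))
```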
Abstract: One exciting new development of the past decade is the evolution of the sum-of-squares method for algorithm design for high-dimensional statistical estimation. This paradigm can be viewed as a principled approach to generating and analyzing semidefinite programming relaxations for statistical estimation problems by thinking of the duals as proofs of statistical identifiability -- i.e., proof that the input data uniquely identifies the unknown target parameters. In this sequence of three lectures, I will give an overview of the sum-of-squares method for statistical estimation. Specifically, I will discuss how strengthening (via semidefinite certificates) of basic analytic properties of probability distributions such as subgaussian tails, hypercontractive moments, and anti-concentration yield new algorithms for problems such as learning spherical and non-spherical Gaussian mixture models and basic tasks in algorithmic robust statistics.
Abstract: In lifelong learning, the tasks to be learned arrive sequentially over time in arbitrary order. During training, knowledge from previous tasks can be captured and transferred to subsequent ones to improve sample efficiency. In this talk, I will discuss the recent work on a provably efficient lifelong learning algorithm. Considering the setting where all tasks share a small number of common features, our algorithm’s sample complexity improves significantly on existing bounds and matches a lower bound for the lifelong learning algorithms.
Abstract: Given an n-by-d matrix A and a vector b of size n, the p-norm regression problem asks for a vector x that minimizes \sum_{i=1}^n |a_i^T x - b_i|^p, where a_i is the i'th row of A. The study of the p=2 and p=1 cases dates back to Legendre, Gauss, and Laplace. Other values of p have also been used in statistics and machine learning as robust estimators for decades. In this talk, we present the following improvements in the running time of solving p-norm regression problems. For the case 1<p<2, we improve the running time from O(nnz(A)+d^{0.5p/(p-1)+1}+d^{\omega}) to O(nnz(A)+d^4), where nnz(A) is the number of nonzero entries of A and \omega is the matrix multiplication exponent. The former bound is due to Bubeck-Cohen-Lee-Li; note that as p goes to one, 0.5p/(p-1) goes to infinity. For 1<p<\infty, we improve the running time from O(n^{\omega}) to O(n^{\theta}) for sufficiently sparse matrices, where \theta<\omega. The former bound is due to Adil-Kyng-Peng-Sachdeva. The talk is based on joint work with Richard Peng and Santosh Vempala.
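For intuition, the following hedged Python sketch spells out the objective from the abstract and runs the classical IRLS (iteratively reweighted least squares) heuristic for it; this is a textbook baseline, not the algorithms from the talk, and the random instance at the end is only illustrative.

```python
import numpy as np

def p_norm_objective(A, b, x, p):
    """The objective from the abstract: sum_i |a_i^T x - b_i|^p."""
    return np.sum(np.abs(A @ x - b) ** p)

def irls_p_norm(A, b, p, iters=100, eps=1e-8):
    """Classical IRLS heuristic: repeatedly solve a weighted least-squares
    problem with weights |residual|^(p-2)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]      # least-squares warm start
    for _ in range(iters):
        r = A @ x - b
        w = (np.abs(r) + eps) ** (p - 2)          # reweighting for |r|^p
        Aw = A * w[:, None]
        x = np.linalg.solve(A.T @ Aw, Aw.T @ b)   # weighted normal equations
    return x

# illustrative random instance
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
b = A @ np.ones(5) + 0.1 * rng.normal(size=100)
x = irls_p_norm(A, b, p=1.5)
print(p_norm_objective(A, b, x, p=1.5))
```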
Abstract:
In this talk, I will introduce some recent advances in designing stochastic primal-dual methods for bilinear saddle point problems of the form min_x max_y y^T A x under different geometries of x and y. These problems are prominent in economics, linear programming, machine learning, and reinforcement learning. Specifically, our methods apply to Markov decision processes (MDPs), linear regression, and computational geometry tasks.
In our work, we propose a variance-reduced framework for solving convex-concave saddle-point problems, given a gradient estimator satisfying some local properties. Further, we show how to design such gradient estimators for bilinear objectives under different geometries, including simplex (l_1), Euclidean ball (l_2), and box (l_inf) domains. For a matrix A with dimension n, nnz nonzero entries, and accuracy epsilon, our proposed variance-reduced primal-dual methods obtain a runtime complexity of nnz+\sqrt{nnz*n}/epsilon, improving over exact gradient methods and fully stochastic methods in the accuracy and/or sparse regime (when epsilon < n/nnz). For finite-sum saddle-point problems sum_{k=1}^K f_k(x,y) where each f_k is 1-smooth, we show how to obtain an epsilon-optimal saddle point within a gradient query complexity of K+\sqrt{K}/epsilon.
Moreover, we also provide a class of coordinate methods for solving bilinear saddle-point problems. These algorithms use either O(1)-sparse gradient estimators to obtain improved sublinear complexity over fully stochastic methods, or their variance-reduced counterparts for improved nearly-linear complexity, for sparse and numerically sparse instances A.
This talk is based on several joint works with Yair Carmon, Aaron Sidford, and Kevin Tian, with links to the papers below:
Variance Reduction for Matrix Games
Coordinate Methods for Matrix Games
Efficiently Solving MDPs using Stochastic Mirror Descent
Abstract: We present a mixed atomic matrix norm that, when used as regularization in optimization problems, promotes low-rank matrices with sparse factors. We show that in convex lifted formulations of sparse phase retrieval and sparse principal component analysis (PCA), this norm provides near-optimal sample complexity and error rate guarantees. Since statistically optimal sparse PCA is widely believed to be NP-hard, this leaves open questions about how practical it is to compute and optimize this atomic norm. Motivated by convex duality analysis, we present a heuristic algorithm in the case of sparse phase retrieval and show that it empirically matches existing state-of-the-art algorithms.
Abstract: Semidefinite programming is a useful type of convex optimization, which has applications in both graph theory and industrial engineering. Many semidefinite programs exhibit a kind of structured sparsity, which we can hope to exploit to reduce the memory requirements of solving such semidefinite programs. We will discuss an interesting relaxation of such sparse semidefinite programs, and a measurement of how well this relaxation approximates a true semidefinite program. We'll also discuss how these approximations relate to graph theory and the theory of sum-of-squares and nonnegative polynomials. This talk will not assume any background on semidefinite programming.
Abstract: We consider a service system consisting of parallel single server queues of infinite capacity. Work of different classes arrives as correlated Gaussian processes with known drifts but unknown covariances, and it is deterministically routed to the different queues according to some routing matrix. In this setting we show that, under some conditions, the covariance matrix of the arrival processes can be directly recovered from the large deviations behavior of the queue lengths. Also, we show that in some cases this covariance matrix cannot be directly recovered this way, as there is an inherent loss of information produced by the dynamics of the queues. Finally, we show how this can be used to quickly learn an optimal routing matrix with respect to some utility function.
Abstract:
The minimum linear ordering problem (MLOP) asks to minimize the aggregated cost of a set function f with respect to some ordering \sigma of the base set. That is, MLOP asks to find a permutation \sigma that minimizes the sum \sum_{i = 1}^{|E|} f(\{e \in E : \sigma(e) \le i\}). Many instances of MLOP have been studied in the literature, for example, minimum linear arrangement (MLA) and minimum sum vertex cover (MSVC). We will cover how graphic matroid MLOP, i.e., where f is taken to be the rank function of a graphic matroid, is NP-hard. This is achieved through a series of reductions beginning with MSVC. During these reductions, we will introduce a new problem, minimum latency vertex cover (MLVC), which we will also show has a 4/3-approximation. Finally, using the theory of principal partitions, we will show that MLOP with a monotone submodular function f : E \to \mathbb{R}^+ has a 2 - (1 + \ell_f)/(1 + |E|) approximation, where \ell_f = f(E)/(\max_{x \in E} f(\{x\})). As a corollary, we obtain a 2 - (1 + r(E))/(1 + |E|) approximation for matroid MLOP, where r is the rank function of the matroid. We will end with some interesting open questions.
Joint work with Majid Farhadi, Swati Gupta, Shengding Sun, and Prasad Tetali.
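To make the MLOP objective concrete, here is a small brute-force Python illustration in the graphic matroid case, where f is the rank function (the size of a spanning forest of the chosen prefix of edges); it is purely for intuition and does not reflect the reductions or approximation algorithms in the talk. The tiny instance is hypothetical.

```python
from itertools import permutations

def graphic_rank(edge_set, n):
    """Rank of an edge set in the graphic matroid: size of a spanning forest."""
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    rank = 0
    for u, v in edge_set:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            rank += 1
    return rank

def mlop_cost(order, n):
    """Sum over prefixes of the rank of the first i edges in the ordering."""
    return sum(graphic_rank(order[:i], n) for i in range(1, len(order) + 1))

# hypothetical toy instance: a triangle with a pendant edge, solved by brute force
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
best = min(permutations(edges), key=lambda o: mlop_cost(o, 4))
print(best, mlop_cost(best, 4))
```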
Abstract:
We design an O~(m)-time randomized algorithm for the l2-norm flow diffusion problem, a recently proposed diffusion model based on network flow with demonstrated graph clustering related applications both in theory and in practice. Examples include finding locally-biased low conductance cuts. Using a known connection between the optimal dual solution of the flow diffusion problem and the local cut structure, our algorithm gives an alternative approach for finding such cuts in nearly linear time.
From a technical point of view, our algorithm contributes a novel way of dealing with inequality constraints in graph optimization problems. It adapts the high-level algorithmic framework of nearly linear time Laplacian system solvers, but requires several new tools: vertex elimination under constraints, a new family of graph ultra-sparsifiers, and accelerated proximal gradient methods with inexact proximal mapping computation.
Joint work with Richard Peng and Di Wang.
Abstract:
We show that if $G$ is a 2-connected simple subcubic graph on $n$ vertices with $n_2$ vertices of degree 2, then $G$ has a TSP walk (spanning closed walk) of length at most $\frac{5n}{4} + \frac{n_2}{4}-1$, confirming a conjecture of Dvořák, Král', and Mohar. This upper bound is best possible; there are infinitely many subcubic (respectively, cubic) graphs whose minimum TSP walks have lengths $\frac{5n}{4}+\frac{n_2}{4}-1$ (respectively, $\frac{5n}{4} - 2$). Our proof implies a polynomial-time algorithm for finding such a TSP walk. In particular, we obtain a $\frac{5}{4}$-approximation algorithm for the graphic TSP on simple cubic graphs, improving on the previously best known approximation ratio of $\frac{9}{7}$.
Joint work with Michael Wigal and Xingxing Yu.
Abstract: TBA
Abstract: RNA is a biological molecule with many roles including information transfer and regulation of gene expression. This work examines a model for RNA secondary structure originally proposed by Hower and Heitsch in 2011 with the goal of exploring branching properties. Under this model, secondary structures are represented by plane trees. Plane trees are rooted, ordered trees, and they are counted by the Catalan numbers. Under the model, each plane tree is assigned an energy based on an underlying thermodynamic model, and a Gibbs distribution is defined from these energies. Two primary approaches are used to explore this distribution: Markov chain Monte Carlo sampling, and analysis of limiting distributions. We show how to define a Markov chain with the specified Gibbs distribution as its stationary distribution, and we prove that this chain is rapidly mixing. We also explore some combinatorial properties of plane trees and examine the distributions of these properties as the number of edges in the tree goes to infinity. A main result is a theorem relating the limiting distribution of these properties under different sets of thermodynamic parameters.
Abstract: We show that the popular \(k\)-means clustering algorithm (Lloyd's heuristic), used for a variety of scientific data, can result in outcomes that are unfavorable to subgroups of data (e.g., demographic groups). Such biased clusterings can have deleterious implications for human-centric applications such as resource allocation. We present a fair \(k\)-means objective and algorithm to choose cluster centers that provide equitable costs for different groups. The algorithm, Fair-Lloyd, is a modification of Lloyd's heuristic for \(k\)-means, inheriting its simplicity, efficiency, and stability. In comparison with standard Lloyd's, we find that on benchmark datasets, Fair-Lloyd exhibits unbiased performance by ensuring that all groups have equal costs in the output \(k\)-clustering, while incurring a negligible increase in running time, thus making it a viable fair option wherever \(k\)-means is currently used.
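For reference, below is a minimal Python sketch of the standard Lloyd's heuristic that Fair-Lloyd modifies; the fair center-update step (which equalizes average cost across groups) is described in the talk and is not reproduced here.

```python
import numpy as np

def lloyd_kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's heuristic: alternate nearest-center assignment and
    mean-based center updates (Fair-Lloyd changes the update step)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # update step: each center becomes the mean of its cluster
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels
```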
Abstract: We give an algorithm for computing exact maximum flows on graphs with \(m\) edges and integer capacities in the range \([1, U]\) in \(\tilde{O}(m^{3/2-1/328} \log U)\) time. For sparse graphs with polynomially bounded integer capacities, this is the first improvement over the \(\tilde{O}(m^{3/2} \log U)\) time bound from [Goldberg-Rao JACM '98]. Our algorithm revolves around dynamically maintaining the augmenting electrical flows at the core of the interior point method based algorithm from [Madry JACM '16]. This entails designing data structures that, in limited settings, return edges with large electric energy in a graph undergoing resistance updates.
Abstract: It is an old adage that correlation does not imply causation. To obtain guarantees that controlling one variable will produce meaningful changes in another, we usually need to perform a randomized experiment. Causal theory asks what kinds of inferences can be drawn from observational data without experiment, but where we make some assumptions about the ways that the observed variables and hidden variables depend on each other. We give some connections between causal theory and polynomial optimization, including a quantitative bound showing that as long as the observed variable to be controlled does not depend strongly on the hidden variables, then it is possible to conclude that the true causal distribution is close to the observed distribution. This bound is in terms of the mutual information between the controlled variable and any confounding variables.
Abstract: There are many problems in real life that involve aggregating evaluations from people, such as hiring, peer grading, and conference peer review. In this talk, I describe three types of biases that may arise in such problems, and propose methods to mitigate them. (1) We consider miscalibration, that is, different people have different calibration scales. We propose randomized algorithms that provably extract useful information under arbitrary miscalibration. (2) We consider the bias induced by the outcome experienced by people. For example, student ratings in teaching evaluation are affected by the grading leniency of the instructors. We propose an adaptive algorithm that debiases people's ratings under very mild assumptions on the biases. (3) Estimation bias arises when algorithms yield different performance on different subgroups of the population. We analyze the statistical bias (defined as the expected value of the estimate minus the true value) when using the maximum-likelihood estimator on pairwise comparison data, and then propose a simple modification of the estimator to reduce the bias.
Bio: Jingyan Wang is a final-year PhD student at Carnegie Mellon University advised by Nihar Shah. Her research interests lie in modeling and reducing biases in decision making problems such as peer grading and peer review, using tools from statistics and machine learning. She is the recipient of the Best Student Paper Award at AAMAS 2019. She received a B.S. in Electrical Engineering and Computer Sciences with a minor in Mathematics from the University of California, Berkeley in 2015.
Abstract: A conjecture of Alon, Krivelevich, and Sudakov from 1999 states that for any graph \(H\),
there is a constant \(c_H>0\) such that if \(G\) is \(H\)-free of maximum degree \(\Delta\), then \(\chi(G) \leq c_H \Delta / \log\Delta\).
It follows from work by Davies et al. in 2020 that this conjecture holds for \(H\) bipartite (specifically \(H = K_{t, t}\)), with the constant \(c_H = t + o(1)\).
We improve this constant to \(1+o(1)\) so it does not depend on \(H\), and extend our result to the DP-coloring (also known as correspondence coloring) case
introduced by Dvořák and Postle. That is, we show for every bipartite graph \(B\),
if \(G\) is \(B\)-free with maximum degree \(\Delta\), then \(\chi_{DP}(G) \leq (1+o(1))\Delta/\log(\Delta)\).
This is joint work with Anton Bernshteyn and James Anderson.
Abstract: Local spectral expansion is a method of analyzing random walks over set systems, e.g., on spanning trees, independent sets, etc., by analyzing smaller “local” random walks. This method was recently used in a number of breakthroughs in sampling problems. I will give a basic introduction to this method and survey some recent results. The main focus of this talk will be proving Oppenheim’s Theorem of Downward Trickling of Expansion (https://arxiv.org/abs/1709.04431), which is a particular method of proving “local spectral expansion”. No prior knowledge will be assumed.
Abstract: We give a polynomial-time algorithm for the problem of robustly estimating a mixture of \(k\) arbitrary Gaussians in \(R^d\), for any fixed \(k\), in the presence of a constant fraction of arbitrary corruptions. This resolves the main open problem in several previous works on algorithmic robust statistics, which addressed the special cases of robustly estimating (a) a single Gaussian, (b) a mixture of TV-distance separated Gaussians, and (c) a uniform mixture of two Gaussians. Our main tools are an efficient partial clustering algorithm that relies on the sum-of-squares method, and a novel tensor decomposition algorithm that allows errors in both Frobenius norm and low-rank terms.
Abstract:
When a fault occurs in an electric distribution network, consumers downstream of the fault will lose power. We develop and analyze a distributed approach in which switches in the network attempt to close in a specified order upon detection of a fault, automatically restoring power. The problem of finding a switch reconnection ordering that optimizes reliability metrics is a special case of the well-known Min Sum Set Cover problem (MSSC), and we show it is NP-hard.
We generalize the kernel-based rounding approaches of Bansal et al. to give tight approximation guarantees for MSSC on c-uniform hypergraphs for all c. For all instances of MSSC, our methods have a strictly better approximation ratio guarantee than the best possible methods for general MSSC.
Finally, we consider optimizing multiple metrics simultaneously using local search methods that also reconfigure the system's base tree, ensuring fairness in outage times and reducing energy loss. We evaluate the performance of our reconfiguration methods and computationally validate our approach on synthetic networks.
Abstract: Optimization and machine learning algorithms often use real-world data that has been generated through complex socio-economic and behavioral processes. This data, however, is noisy, and naturally encodes difficult-to-quantify systemic biases. In this work, we model and address bias in the secretary problem, which has applications in hiring. We assume that utilities of candidates are scaled by unknown bias factors, perhaps depending on demographic information, and show that bias-agnostic algorithms are suboptimal in terms of utility and fairness. We propose bias-aware algorithms that achieve certain notions of fairness, while achieving order-optimal competitive ratios in several settings.
Abstract: We present a new rounding framework and improve the approximation bounds for min sum vertex cover and generalized min sum set cover, also known as multiple intents re-ranking problem. These classical combinatorial optimization problems find applications in scheduling, speeding up semidefinite-program solvers, and query-results diversification, among others.
Our algorithm is based on transforming the LP solution by a suitable kernel and applying randomized rounding. It also gives a linear-programming based 4-approximation algorithm for min sum set cover, which is best possible due to Feige, Lovász, and Tetali. As part of the analysis of our randomized algorithm we derive an inequality on the lower tail of a sum of independent Bernoulli random variables, which may be of independent interest.
Joint work with Nikhil Bansal, Jatin Batra, and Prasad Tetali. [arXiv:2007.09172]
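For context, the classical greedy algorithm is the textbook 4-approximation for min sum set cover mentioned above; the kernel-based LP rounding from the talk is different and not shown. A minimal sketch, on a hypothetical toy instance:

```python
def greedy_mssc(sets, universe):
    """Return an ordering of the sets; the cost of element e is the (1-based)
    position of the first set in the ordering that covers it."""
    uncovered = set(universe)
    remaining = dict(enumerate(sets))
    order = []
    while uncovered and remaining:
        # pick the set covering the most still-uncovered elements
        i, s = max(remaining.items(), key=lambda kv: len(kv[1] & uncovered))
        order.append(i)
        uncovered -= s
        del remaining[i]
    return order

# hypothetical toy instance
print(greedy_mssc([{1, 2, 3}, {3, 4}, {4, 5}], universe={1, 2, 3, 4, 5}))
```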
Abstract: I describe my ongoing work using tools from computational and combinatorial algebraic geometry to classify minimal problems and identify which can be solved efficiently.
I will not assume any background in algebraic geometry or computer vision.
Structure-from-motion algorithms reconstruct a 3D scene from many images, often by matching features (such as point and lines) between the images.
Matchings lead to constraints, resulting in a nonlinear system of polynomial equations that recovers the 3D geometry.
Since many matches are outliers, these methods are used in an iterative framework for robust estimation called RANSAC (RANdom SAmple Consensus), whose efficiency hinges on using a small number of correspondences in each iteration.
As a result, there is a big focus on constructing polynomial solvers for these "minimal problems" that run as fast as possible.
Our work classifies these problems in cases of practical interest (calibrated cameras, complete and partial visibility).
Moreover, we identify candidates for practical use, as quantified by "algebraic complexity measures" (degree, Galois group).
Joint work with Anton Leykin, Kathlén Kohn, and Tomas Pajdla (arXiv:1903.10008, arXiv:2003.05015), and ongoing work with Viktor Korotynskiy, Tomas Pajdla, and Margaret Regan.
Abstract: In many optimization problems, a feasible solution induces a multi-dimensional cost vector. For example, in load-balancing a schedule induces a load vector across the machines. In k-clustering, opening k facilities induces an assignment cost vector across the clients. In this paper we consider the following minimum norm optimization problem: given an arbitrary monotone, symmetric norm, find a solution which minimizes the norm of the induced cost vector. This generalizes many fundamental NP-hard problems. We give a general framework to tackle the minimum norm problem, and illustrate its efficacy in load balancing and, time permitting, in the clustering setting.
Abstract: Semidefinite programming is a powerful optimization tool, which involves optimizing linear functions on a slice of the positive semidefinite matrices. Locally PSD matrices are a natural relaxation of the PSD matrices which can be useful in reducing the space required for semidefinite optimization. We use the theory of hyperbolic polynomials to give precise quantitative bounds on the quality of the approximation resulting from optimizing over the locally-psd cone instead of the PSD cone.
I will present some upper and lower bounds for the \(\ell_2\) norm of the negative eigenvalues of any locally PSD matrix with unit Frobenius norm. The focus will be on the variety of techniques and connections to other areas of research.
Abstract: A sunflower with \(p\) petals consists of \(p\) sets whose pairwise intersections are all the same set. The goal of the sunflower problem is to find the smallest \(r = r(p,k)\) such that every family of at least \(r^k\) \(k\)-element sets must contain a sunflower with \(p\) petals. Major breakthroughs within the last year by Alweiss-Lovett-Wu-Zhang and others show that \(r = O(p \log(pk))\) suffices. In this talk, after reviewing the history and significance of the Sunflower Problem, I will present our improvement to \(r = O(p \log k)\), which we obtained during the 2020 REU at Georgia Tech. As time permits, I will elaborate on key lemmas and techniques used in recent improvements.
Based on joint work with Suchakree Chueluecha (Lehigh University) and Lutz Warnke (Georgia Tech), see https://arxiv.org/abs/2009.09327
Abstract: The crossing number of a graph is the minimum number of crossings over all drawings of the graph in the plane. Let \(\kappa(n, m)\) be the minimum crossing number of graphs with \(n\) vertices and (at least) \(m\) edges. Erdős and Guy conjectured, and Pach, Spencer and Tóth proved, that for any \(m = m(n)\) satisfying \(n \ll m \ll n^2\), the quantity \(\lim_{n \to \infty} \frac{\kappa(n,m) n^2}{m^3}\) exists and is positive. The \(k\)-planar crossing number of a graph is the minimum crossing number obtained when we partition the edges of the graph into \(k\) subgraphs and draw them in \(k\) planes. Using designs and a probabilistic algorithm, we show that the guaranteed factor of improvement \(\alpha_k\) between the \(k\)-planar and regular crossing number is \(\frac{1}{k^2} (1 + o(1))\), while if we restrict our attention to biplanar graphs, this constant is \(\beta_k = \frac{1}{k^2}\) exactly. The lower bound proofs require the existence of a midrange crossing constant. Motivated by this, we show that the midrange crossing constant exists for all graph classes (including bipartite graphs) that satisfy certain mild conditions. The regular midrange crossing constant was shown to be at most \(\frac{8}{9\pi^2}\); we present a probabilistic construction that also shows this bound.
Abstract:
For computationally intensive mixed-integer nonlinear optimization problems, a major challenge is to verify/guarantee the quality of any feasible solution under mild assumptions in a tractable fashion. In this talk, we focus on tackling this challenge by constructing tight relaxations and designing approximation algorithms for two different mixed-integer nonlinear optimization problems.
In the first part, we focus on the (row) sparse principal component analysis (rsPCA) problem.
Solving rsPCA is the problem of finding the top-r principal components of a covariance matrix such that all these principal components are linear combinations of a subset of k variables. The rsPCA problem is a widely used dimensionality reduction tool with an additional sparsity constraint to enhance its interpretability. We propose (a) a convex integer programming relaxation of rsPCA that gives upper (dual) bounds for rsPCA, and (b) a new local search algorithm for finding primal feasible solutions for rsPCA. We also show that, in the worst case, the dual bounds provided by the convex IP are within an affine function of the global optimal value. We demonstrate our techniques applied to large-scale covariance matrices.
In the second part, we focus on improving the execution speed of compute-intensive numerical code. Such code, especially of the variety encountered in deep neural network inference and training, is often written using nested for-loops. One of the main bottlenecks that significantly influences the execution speed of nested for-loops is the so-called memory latency. Iteration space tiling is a common memory management technique used to deal with memory latency. We study the problem of automatically optimizing the implementation of these nested loops by formulating the iteration space tiling problem as an integer geometric programming (IGP) problem. We show how to design an efficient approximation algorithm for this problem and how to use the so-called "non-uniform tiling" technique to improve the execution speed.
The first part of the talk is joint work with Santanu S. Dey, Rahul Mazumder, and Marco Molinaro; the second part is joint work with Ofer Dekel.
Abstract: Various notions of dimension are important throughout mathematics, and for graphs the so-called Prague dimension was introduced by Nešetřil, Pultr and Rödl in the 1970s.
Proving a conjecture of Füredi and Kantor, we show that the Prague dimension of the binomial random graph is typically of order \(n/\log n\) for constant edge-probabilities.
The main new proof ingredient is a Pippenger-Spencer type edge-coloring result for random hypergraphs with large uniformities, i.e., edges of size \(O(\log n)\).
Based on joint work with He Guo and Lutz Warnke.
Abstract: We consider the problem of learning an optimal reserve price in repeated auctions against non-myopic bidders, who may bid strategically in order to gain in future rounds even if the single-round auctions are truthful. Previous algorithms, e.g., empirical pricing, do not provide non-trivial regret bounds in this setting in general. We introduce algorithms that obtain a small regret against non-myopic bidders either when the market is large, i.e., no single bidder appears in more than a small constant fraction of the rounds, or when the bidders are impatient, i.e., they discount future utility by some factor mildly bounded away from one. Our approach carefully controls what information is revealed to each bidder, and builds on techniques from differentially private online learning as well as the recent line of works on jointly differentially private algorithms.
Abstract: We give an efficient algorithm for robustly clustering a mixture of two arbitrary Gaussians, a central open problem in the theory of computationally efficient robust estimation, assuming only that the means of the component Gaussians are well-separated or their covariances are well-separated. Our algorithm and analysis extend naturally to robustly clustering mixtures of well-separated logconcave distributions. The mean separation required is close to the smallest possible to guarantee that most of the measure of the component Gaussians can be separated by some hyperplane (for covariances, it is the same condition in the second degree polynomial kernel). Our main tools are a new identifiability criterion based on isotropic position, and a corresponding Sum-of-Squares convex programming relaxation.
Abstract: In the \(k\)-cut problem, we are given an edge-weighted graph and want to find the least-weight set of edges whose deletion breaks the graph into \(k\) connected components. Algorithms due to Karger-Stein and Thorup showed how to find such a minimum \(k\)-cut in time approximately \(O(n^{2k-2})\). The best lower bounds come from conjectures about the solvability of the \(k\)-clique problem and a reduction from \(k\)-clique to \(k\)-cut, and show that solving \(k\)-cut is likely to require time \(\Omega(n^k)\). Our recent results have given special-purpose algorithms that solve the problem in time \(n^{1.98k + O(1)}\), and ones that have better performance for special classes of graphs (e.g., for small integer weights). In this work, we resolve the problem for general graphs, by showing that for any fixed \(k \geq 2\), the Karger-Stein algorithm outputs any fixed minimum \(k\)-cut with probability at least \(\widehat{O}(n^{-k})\), where \(\widehat{O}(\cdot)\) hides a \(2^{O(\ln \ln n)^2}\) factor. This also gives an extremal bound of \(\widehat{O}(n^k)\) on the number of minimum \(k\)-cuts in an \(n\)-vertex graph and an algorithm to compute a minimum \(k\)-cut in similar runtime. Both are tight up to \(\widehat{O}(1)\) factors. The first main ingredient in our result is a fine-grained analysis of how the graph shrinks---and how the average degree evolves---under the Karger-Stein process. The second ingredient is an extremal result bounding the number of cuts of size at most \((2-\delta) OPT/k\), using the Sunflower lemma.
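To make the contraction process concrete, here is a hedged Python sketch of the basic random-contraction step underlying the Karger-Stein algorithm: contract uniformly random non-loop edges until k super-vertices remain. This is the plain, unweighted contraction heuristic, not the recursive Karger-Stein variant or the refined analysis from the talk; the small two-triangle example is hypothetical.

```python
import random

def random_contraction_k_cut(edges, n, k, seed=None):
    """Contract uniformly random non-loop edges (via rejection sampling) until
    k super-vertices remain; return the number of edges crossing the k-cut.
    Assumes a connected graph with unit edge weights."""
    rng = random.Random(seed)
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    remaining = n
    while remaining > k:
        u, v = rng.choice(edges)
        ru, rv = find(u), find(v)
        if ru != rv:                  # skip self-loops of already-merged pairs
            parent[ru] = rv
            remaining -= 1
    return sum(1 for u, v in edges if find(u) != find(v))

# hypothetical toy instance: two triangles joined by a single edge, k = 2
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(random_contraction_k_cut(edges, n=6, k=2, seed=1))
```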
Abstract: Motivated by the Dikin walk, we develop aspects of an interior-point theory for sampling in high dimensions. Specifically, we introduce symmetric and strong self-concordance. These properties imply that the corresponding Dikin walk mixes in \(\tilde{O}(n\bar{\nu})\) steps from a warm start in a convex body in \(\mathbb{R}^{n}\) using a strongly self-concordant barrier with symmetric self-concordance parameter \(\bar{\nu}\). For many natural barriers, \(\bar{\nu}\) is roughly bounded by \(\nu\), the standard self-concordance parameter. We show that this property and strong self-concordance hold for the Lee-Sidford barrier. As a consequence, we obtain the first walk to mix in \(\tilde{O}(n^{2})\) steps for an arbitrary polytope in \(\mathbb{R}^{n}\). Strong self-concordance for other barriers leads to an interesting (and unexpected) connection --- for the universal and entropic barriers, it is implied by the KLS conjecture.
Abstract: Optimization and machine learning algorithms often use real-world data that has been generated through complex socio-economic and behavioral processes. This data, however, is noisy, and naturally encodes difficult-to-quantify systemic biases. In this work, we model and address bias in the secretary problem, which has applications in hiring. We assume that utilities of candidates are scaled by unknown bias factors, perhaps depending on demographic information, and show that bias-agnostic algorithms are suboptimal in terms of utility and fairness. We propose bias-aware algorithms that achieve certain notions of fairness, while achieving order-optimal competitive ratios in several settings.
March 20: Spring Break
April 17: Dr. Lutz Warnke, Math, Georgia Tech
Abstract: In this talk we introduce some machine learning problems in the setting of undirected graphical models, also known as spin systems. We take proper colorings as a representative example of a hard-constraint graphical model. The classic problem of sampling a proper coloring of a given graph uniformly at random has been well studied. Here we consider two inverse problems: Given random colorings of an unknown graph G, can we recover the underlying graph G exactly? If we are also given a candidate graph H, can we tell if G=H? The former problem is known as structure learning in the machine learning field and the latter is called identity testing. We show the complexity of these problems in different ranges of parameters and compare these results with the corresponding decision and sampling problems. Finally, we give some results on the analogous problems for the Ising model, a typical soft-constraint model. Based on joint work with Ivona Bezakova, Antonio Blanca, Daniel Stefankovic and Eric Vigoda.
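As background, the classic sampling problem the abstract starts from can be illustrated with Glauber dynamics for proper colorings; the structure-learning and identity-testing algorithms from the talk are not reproduced here, and this sketch assumes the number of colors q exceeds the maximum degree.

```python
import random

def glauber_colorings(adj, q, steps=10_000, seed=0):
    """Glauber dynamics: repeatedly pick a random vertex and recolor it
    uniformly among colors not used by its neighbors (assumes q > max degree)."""
    rng = random.Random(seed)
    vertices = list(adj)
    color = {}
    for v in vertices:                                   # greedy proper start
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(q) if c not in used)
    for _ in range(steps):
        v = rng.choice(vertices)
        forbidden = {color[u] for u in adj[v]}
        color[v] = rng.choice([c for c in range(q) if c not in forbidden])
    return color

# hypothetical toy instance: a triangle with q = 4 colors
print(glauber_colorings({0: [1, 2], 1: [0, 2], 2: [0, 1]}, q=4))
```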
Abstract: For the first time in 2020, the US Census Bureau will apply a differentially private algorithm before publicly releasing decennial census data. Recently, the Bureau publicly released their code and end-to-end tests on the 1940 census data at various privacy levels. We will outline the DP algorithm (which is still being developed) and discuss the accuracy of these end-to-end tests. In particular, we focus on the bias and variance of the reported population counts. Finally, we discuss the choices the Bureau has yet to make that will affect the balance between privacy and accuracy. This talk is based on joint work with Abraham Flaxman.
Abstract: Graphs, which in their simplest forms are vertices connected by edges, are widely used in high-performance computing, machine learning, and network science. This talk will use recent progress on two well-studied algorithmic problems on static and dynamic graphs, max-flows and dynamic matchings, to discuss a methodology for designing faster algorithms for large graphs. This approach is motivated by a fundamental phenomenon in data structures: the advantages of offline data structures over online ones.
I will start by describing how work on max-flows led to a focus on finding short paths in residual graphs, and how investigating more global notions of progress in residual graphs led to a more sophisticated and general understanding of iterative methods and preconditioning. I will then discuss a similar phenomenon in dynamic graphs, where maintaining a large matching seems to require the online detection of short augmenting paths, but can once again be circumvented through the offline construction of smaller equivalent graphs.
Abstract: In this talk we introduce a refined alteration approach for constructing \(H\)-free graphs: we show that removing all edges in \(H\)-copies of the binomial random graph does not significantly change the independence number (for suitable edge-probabilities); previous alteration approaches of Erdős and Krivelevich remove only a subset of these edges. We present two applications to online graph Ramsey games of recent interest, deriving new bounds for Ramsey, Paper, Scissors games and online Ramsey numbers (each time extending recent results of Fox–He–Wigderson and Conlon–Fox–Grinshpun–He).
Based on joint work with Lutz Warnke.
Abstract: In the past decade, the catalog of algorithms available to combinatorial optimizers has been substantially extended to settings which allow submodular objective functions. One significant recent result was a tight (1-1/e)-approximation for maximizing a non-negative monotone submodular function subject to a matroid constraint. These algorithmic developments were happening concurrently with research that found a wealth of new applications for submodular optimization in machine learning, recommender systems, and algorithmic game theory.
The related supermodular maximization models also offer an abundance of applications, but they appeared to be highly intractable even under simple cardinality constraints and even when the function has a nice structure. For example, the densest subgraph problem - suspected to be highly intractable - can be expressed as a monotonic supermodular function which has a particularly nice form. Namely, the objective can be expressed by a quadratic form \(x^T A x\) where \(A\) is a non-negative, symmetric, 0-diagonal matrix. On the other hand, when the entries \(A(u,v)\) form a metric, it has been shown that the associated maximization problems have constant factor approximations. Inspired by this, we introduce a parameterized class of non-negative functions called meta-submodular functions that can be approximately maximized within a constant factor. This class includes metric diversity, monotone submodular and other objectives appearing in the machine learning and optimization literature. A general meta-submodular function is neither submodular nor supermodular and so its multi-linear extension does not have the nice convexity/concavity properties which hold for submodular functions. They do, however, have an intrinsic one-sided smoothness property which is essential for our algorithms. This smoothness property might be of independent interest.
Abstract: Expander decomposition has been a central tool in designing graph algorithms in many fields (including fast centralized algorithms, approximation algorithms and property testing) for decades. Recently, we found that it also gives many impressive applications in dynamic graph algorithms and distributed graph algorithms. In this talk, I will survey these recent results based on expander decomposition, explain the key components for using this technique, and give some toy examples on how to apply these components.
October 18: ARC colloquium
Abstract: We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-order Langevin dynamics for sampling from distributions with log-concave and smooth densities. The higher-order dynamics allow for more flexible discretization schemes, and we develop a specific method that combines splitting with more accurate integration. For a broad class of \(d\)-dimensional distributions arising from generalized linear models, we prove that the resulting third-order algorithm produces samples from a distribution that is at most \(\varepsilon\) in Wasserstein distance from the target distribution in \(O(d^{1/3}/ \varepsilon^{2/3})\) steps. This result requires only Lipschitz conditions on the gradient. For general strongly convex potentials with \(\alpha\)-th order smoothness, we prove that the mixing time scales as \(O (d^{1/3} / \varepsilon^{2/3} + d^{1/2} / \varepsilon^{1 / (\alpha - 1)} )\). Our high-order Langevin diffusion reduces the problem of log-concave sampling to numerical integration along a fixed deterministic path, which makes it possible for further improvements in high-dimensional MCMC problems. Joint work with Yi-An Ma, Martin J. Wainwright, Peter L. Bartlett and Michael I. Jordan.
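For orientation, here is the classical first-order (unadjusted) Langevin step, shown only as the baseline that the higher-order dynamics in the abstract refine; the third-order splitting and integration scheme from the talk is not reproduced here.

```python
import numpy as np

def ula_sample(grad_log_density, x0, step=1e-2, iters=10_000, seed=0):
    """Unadjusted Langevin algorithm: x <- x + step * grad log p(x) + sqrt(2*step) * noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x + step * grad_log_density(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)
    return x

# illustrative example: an approximate sample from a standard Gaussian, grad log p(x) = -x
print(ula_sample(lambda x: -x, x0=np.zeros(2)))
```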
Abstract: For an \(n\times n\) matrix \(A_n\), the \(r\to p\) operator norm is defined as \(\|A_n\|_{r \to p}= \sup_{\|x\|_r\leq 1 } \|A_n x\|_p\) for \(r,p\geq 1\). The \(r\to p\) operator norm puts a huge number of important quantities of interest in diverse disciplines under a single unified framework. The application of this norm spans a broad spectrum of areas including data-dimensionality reduction in machine learning, finding oblivious routing schemes in transportation network, and matrix condition number estimation.
In this talk, we will consider the \(r\to p\) norm of a class of symmetric random matrices with nonnegative entries, which includes the adjacency matrices of the Erd\H{o}s-R\'enyi random graphs and matrices with sub-Gaussian entries. For \(1< p \leq r < \infty\), we establish the asymptotic normality of the appropriately centered and scaled \(\|A_n\|_{r \to p}\), as \(n\to\infty\). The special case \(r=p=2\), which corresponds to the largest singular value of matrices, was proved in a seminal paper by Füredi and Komlós (1981). Of independent interest, we further obtain a sharp \(\ell_\infty\)-approximation for the maximizer vector. The results also hold for sparse matrices and further the \(\ell_\infty\)-approximation for the maximizer vector also holds for a broad class of deterministic sequence of matrices with certain asymptotic `expansion' properties.
This is based on a joint work with Souvik Dhara (MIT) and Kavita Ramanan (Brown U.).
Abstract: In this talk, we provide the details of our faster width-dependent algorithm for mixed packing-covering LPs. Mixed packing-covering LPs are fundamental to combinatorial optimization in computer science and operations research. Our algorithm finds a \((1+\varepsilon)\)-approximate solution in time \(O(Nw/ \varepsilon)\), where \(N\) is the number of nonzero entries in the constraint matrix, and \(w\) is the maximum number of nonzeros in any constraint. This algorithm is faster than Nesterov's smoothing algorithm, which requires \(O(N\sqrt{n}w/ \varepsilon)\) time, where \(n\) is the dimension of the problem. Our work utilizes the framework of area convexity introduced in [Sherman-FOCS’17] to obtain the best dependence on \(\varepsilon\) while breaking the infamous \(\ell_{\infty}\) barrier to eliminate the factor of \(\sqrt{n}\). The current best width-independent algorithm for this problem runs in time \(O(N/\varepsilon^2)\) [Young-arXiv-14] and hence has worse running time dependence on \(\varepsilon\). Many real-life instances of mixed packing-covering problems exhibit small width, and for such cases our algorithm can report higher-precision results when compared to width-independent algorithms. As a special case of our result, we report a \((1+\varepsilon)\)-approximation algorithm for the densest subgraph problem which runs in time \(O(md/ \varepsilon)\), where \(m\) is the number of edges in the graph and \(d\) is the maximum graph degree.
Abstract: Fictitious play is one of the simplest and most natural dynamics for two-player zero-sum games. Originally proposed by Brown in 1949, the fictitious play dynamic has each player simultaneously best-respond to the distribution of historical plays of their opponent. In 1951, Robinson showed that fictitious play converges to the Nash Equilibrium, albeit at an exponentially-slow rate, and in 1959, Karlin conjectured that the true convergence rate of fictitious play after k iterations is \(O(k^{-1/2})\), a rate which is achieved by similar algorithms and is consistent with empirical observations. Somewhat surprisingly, Daskalakis and Pan disproved a version of this conjecture in 2014, showing that an exponentially-slow rate can occur, although their result relied on adversarial tie-breaking. In this talk, we show that Karlin’s conjecture holds if ties are broken lexicographically and the game matrix is diagonal. We also show a matching lower bound under this tie-breaking assumption. This is joint work with Jake Abernethy and Andre Wibisono.
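A short Python sketch of the fictitious-play dynamic itself (with lexicographic tie-breaking, the setting analyzed in the talk) may help fix ideas; the matching-pennies example is only illustrative, and none of the convergence-rate claims above are verified by this code.

```python
import numpy as np

def fictitious_play(A, iters=1000):
    """Both players best-respond to the opponent's empirical play; the row
    player maximizes x^T A y, the column player minimizes it. np.argmax /
    np.argmin take the smallest index on ties, i.e. lexicographic tie-breaking."""
    m, n = A.shape
    row_counts = np.zeros(m)
    col_counts = np.zeros(n)
    row_counts[0] += 1                     # arbitrary initial plays
    col_counts[0] += 1
    for _ in range(iters):
        i = np.argmax(A @ col_counts)      # row best response
        j = np.argmin(row_counts @ A)      # column best response
        row_counts[i] += 1
        col_counts[j] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# illustrative example: matching pennies; empirical strategies approach (1/2, 1/2)
print(fictitious_play(np.array([[1.0, -1.0], [-1.0, 1.0]])))
```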
November 29: Thanksgiving Break
Abstract: In this talk we will prove a conjecture of Talagrand, which is a fractional version of the “expectation-threshold” conjecture of Kalai and Kahn. This easily implies various difficult results in probabilistic combinatorics, e.g. thresholds for perfect hypergraph matchings (Johansson-Kahn-Vu) and bounded-degree spanning trees (Montgomery). Our approach builds on recent breakthrough work of Alweiss, Lovett, Wu, and Zhang on the Erdős-Rado “Sunflower Conjecture.”
This is joint work with Keith Frankston, Jeff Kahn, and Bhargav Narayanan.
Abstract: In current convex optimization literature, there are significant gaps between algorithms that produce high-accuracy \((1+1/poly(n))\)-approximate solutions and algorithms that produce \(O(1)\)-approximate solutions for symmetrized special cases. This gap is reflected in the differences between interior point methods and (accelerated) gradient descent for regression problems, and between exact and approximate undirected max-flow.
In this talk, I will discuss generalizations of a fundamental building block in numerical analysis, preconditioned iterative methods, to convex functions that include p-norms. This leads to algorithms that converge to high accuracy solutions by crudely solving a sequence of symmetric residual problems. I will then briefly describe several recent and ongoing projects, including p-norm regression using \(m^{1/3}\) linear system solves, \(p\)-norm flow in undirected unweighted graphs in almost-linear time, and further improvements to the dependence on \(p\) in the runtime.
Abstract: We present a new general and simple method for rounding semidefinite programs, based on Brownian motion. Our approach is inspired by recent results in algorithmic discrepancy theory. We develop and present tools for analyzing our new rounding algorithms, utilizing mathematical machinery from the theory of Brownian motion, complex analysis, and partial differential equations. We apply our method to several classical problems, including Max-Cut, Max-DiCut, and Max-2-SAT, and derive new algorithms that are competitive with the best known results. In particular, we show that the basic algorithm achieves a 0.861-approximation for Max-Cut and a natural variant of the algorithm achieves a 0.878-approximation, matching the famous Goemans-Williamson algorithm up to the first three decimal digits.
This is joint work with Abbas-Zadeh, Nikhil Bansal, Guru Guruganesh, Sasho Nikolov and Roy Schwartz.
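The following is a rough, discretized Python simulation of the Brownian rounding idea for Max-Cut, assuming unit vectors v_i from a Max-Cut SDP solution are given: each vertex's fractional value follows x_i(t) = <v_i, B(t)> and is frozen once it hits ±1. It only illustrates the mechanism; the exact step rule, the variants, and the approximation guarantees quoted above are not captured by this sketch, and the toy input is hypothetical.

```python
import numpy as np

def brownian_round(sdp_vectors, step=1e-3, seed=0):
    """Discretized sticky Brownian rounding: x_i(t) = <v_i, B(t)> until |x_i| = 1."""
    rng = np.random.default_rng(seed)
    V = np.asarray(sdp_vectors)                  # one unit vector per vertex, shape (n, d)
    n, d = V.shape
    x = np.zeros(n)                              # fractional assignments start at 0
    frozen = np.zeros(n, dtype=bool)
    while not frozen.all():
        dB = rng.normal(scale=np.sqrt(step), size=d)   # Brownian increment
        x[~frozen] += V[~frozen] @ dB
        hit = np.abs(x) >= 1.0
        x[hit] = np.sign(x[hit])                 # stick once the boundary is hit
        frozen |= hit
    return x > 0                                 # the two sides of the cut

# hypothetical toy input: three unit vectors in the plane
print(brownian_round(np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])))
```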
Abstract: Powerful AI systems, which are driven by machine learning, are increasingly controlling various aspects of modern society: from social interactions (e.g., Facebook, Twitter, Google, YouTube), economics (e.g., Uber, Airbnb, Banking), learning (e.g., Wikipedia, MOOCs), governance (Judgements, Policing, Voting), to autonomous vehicles and weapons. These systems have a tremendous potential to change our lives for the better, but, via the ability to mimic and nudge human behavior, they also have the potential to be discriminatory, reinforce societal prejudices, and polarize opinions. Moreover, recent studies have demonstrated that these systems can be quite brittle and generally lack the required robustness to be deployed in various civil/military situations. The reason is that considerations such as fairness, robustness, stability, explainability, and accountability have largely been an afterthought in the development of AI systems. In this talk, I will discuss the opportunities that lie ahead in a principled and thoughtful development of AI systems.
Bio: Nisheeth Vishnoi is a Professor of Computer Science at Yale University. He received a B.Tech in Computer Science and Engineering from IIT Bombay in 1999 and a Ph.D. in Algorithms, Combinatorics and Optimization from Georgia Tech in 2004. His research spans several areas of theoretical computer science: from approximability of NP-hard problems, to combinatorial, convex and non-convex optimization, to tackling algorithmic questions involving dynamical systems, stochastic processes and polynomials. He is also broadly interested in understanding and addressing some of the key questions that arise in nature and society from the viewpoint of theoretical computer science. Here, his current focus is on natural algorithms, emergence of intelligence, and questions at the interface of AI, ethics, and society. He was the recipient of the Best Paper Award at FOCS in 2005, the IBM Research Pat Goldberg Memorial Award in 2006, the Indian National Science Academy Young Scientist Award in 2011, and the IIT Bombay Young Alumni Achievers Award in 2016.
Abstract: Machine learning is increasingly popular in transportation, with applications ranging from traffic engineering to travel demand forecasting and pavement material modeling, to name just a few. Researchers often find that machine learning achieves higher predictive accuracy compared to traditional methods. However, many machine-learning methods are often viewed as “black-box” models, lacking interpretability for decision making. As a result, increased attention is being devoted to the interpretability of machine-learning results.
In this talk, I introduce the application of machine learning to study travel behavior, covering both mode prediction and behavioral interpretation. I first discuss the key differences between machine learning and logit models in modeling travel mode choice, focusing on model development, evaluation, and interpretation. Next, I apply the existing machine-learning interpretation tools and also propose two new model-agnostic interpretation tools to examine behavioral heterogeneity. Lastly, I show the potential of using machine learning as an exploratory tool to tune the utility functions of logit models.
I illustrate these ideas by examining stated-preference travel survey data for a new mobility-on-demand transit system that integrates fixed-route buses and on-demand shuttles. The results show that the best-performing machine-learning classifier results in higher predictive accuracy than logit models as well as comparable behavioral outputs. In addition, results obtained from model-agnostic interpretation tools show that certain machine-learning models (e.g. boosting trees) can readily account for individual heterogeneity and generate valuable behavioral insights on different population segments. Moreover, I show that interpretable machine learning can be applied to tune the utility functions of logit models (e.g. specifying nonlinearities) and to enhance their model performance. In turn, these findings can be used to inform the design of new mobility services and transportation policies.
Bio: Xilei Zhao is a Postdoctoral Fellow in the School of Industrial and Systems Engineering at the Georgia Institute of Technology. Dr. Zhao received her B.E. in Civil Engineering from Southeast University, China, in 2013, her MSEs in Civil Engineering and Applied Mathematics and Statistics from Johns Hopkins University (JHU) in 2016 and 2017, respectively, and her Ph.D. in Civil Engineering from JHU in 2017. She specializes in applying data science (data analytics, machine learning, optimization, and simulation), complex systems modeling, and risk assessment to tackle challenging problems in transportation and resilience.
Abstract:
Correlation Clustering is an elegant model that captures fundamental graph cut problems such as Minimum s-t Cut, Multiway Cut, and Multicut, extensively studied in combinatorial optimization.
Here, we are given a graph with edges labeled + or - and the goal is to produce a clustering that agrees with the labels as much as possible: + edges within clusters and - edges across clusters.
The classical approach towards Correlation Clustering (and other graph cut problems) is to optimize a global objective, e.g., minimizing the total number of disagreements or maximizing the total number of agreements.
We depart from this and study local objectives: minimizing the maximum number of disagreements for edges incident on a single node, and the analogous max min agreements objective.
This naturally gives rise to a family of basic min-max graph cut problems.
A prototypical representative is Min-Max s-t Cut: find an s-t cut minimizing the largest number of cut edges incident on any node.
In this talk we will give a short introduction to Correlation Clustering and discuss the following results:
(1) an \(O(\sqrt{n})\)-approximation for the problem of minimizing the maximum total weight of disagreement edges incident on any node (thus providing the first known approximation for the above family of min-max graph cut problems)
(2) a remarkably simple 7-approximation for minimizing local disagreements in complete graphs (improving upon the previous best known approximation of 48)
(3) a \((1/(2+\epsilon))\)-approximation for maximizing the minimum total weight of agreement edges incident on any node, hence improving upon the \((1/(4+\epsilon))\)-approximation that follows from the study of approximate pure Nash equilibria in cut and party affiliation games.
Joint work with Moses Charikar and Neha Gupta.
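A small Python helper makes the local (min-max) disagreement objective described above concrete for a given clustering of a +/- labeled graph; it is only meant to pin down the objective, not to implement any of the approximation algorithms, and the toy instance is hypothetical.

```python
def max_local_disagreement(edges, cluster):
    """edges: list of (u, v, sign) with sign '+' or '-';
    cluster: dict mapping each node to its cluster id.
    Returns the maximum, over nodes, of the number of incident disagreements."""
    disagreements = {v: 0 for v in cluster}
    for u, v, sign in edges:
        same = (cluster[u] == cluster[v])
        # a '+' edge across clusters or a '-' edge within a cluster disagrees
        if (sign == '+' and not same) or (sign == '-' and same):
            disagreements[u] += 1
            disagreements[v] += 1
    return max(disagreements.values())

# hypothetical toy instance
edges = [(0, 1, '+'), (1, 2, '-'), (0, 2, '+')]
print(max_local_disagreement(edges, {0: 0, 1: 0, 2: 1}))
```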
Abstract: In an optimal design problem, we are given a set of linear experiments \(v_1,...,v_n \in R^d\) and \(k \ge d\), and our goal is to select a set or multiset \(S \subseteq [n]\) of size \(k\) such that \(\Phi((\sum_{i \in S} v_i v_i^T )^{-1})\) is minimized. When \(\Phi(M) = det(M)^{1/d}\), the problem is known as the D-optimal design problem, and when \(\Phi(M) = tr(M)\), it is known as the A-optimal design problem. One of the most common heuristics used in practice to solve these problems is the local search heuristic, also known as Fedorov's exchange method. This is due to its simplicity and its empirical performance. However, despite its wide usage, no theoretical bound has been proven for this algorithm. In this paper, we bridge this gap and prove approximation guarantees for the local search algorithms for D-optimal design and A-optimal design problems. We show that the local search algorithms are asymptotically optimal when \(\frac{k}{d}\) is large. In addition, we also prove similar approximation guarantees for the greedy algorithms for D-optimal design and A-optimal design problems when \(k/d\) is large.
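Below is a hedged Python sketch of the local-search (exchange) heuristic for the D-optimal objective discussed above: repeatedly swap a chosen experiment for an unchosen one whenever the swap increases det(\sum_{i \in S} v_i v_i^T). This is a naive implementation for illustration only, not the analyzed algorithm or its greedy counterpart, and the random instance is hypothetical.

```python
import numpy as np

def d_optimal_local_search(V, k, tol=1e-12):
    """Repeatedly swap one chosen index for an unchosen one while the swap
    increases log det(sum_{i in S} v_i v_i^T); stop at a local optimum."""
    n, d = V.shape
    S = list(range(k))                                  # arbitrary initial design
    def logdet(idx):
        sign, val = np.linalg.slogdet(V[idx].T @ V[idx])
        return val if sign > 0 else -np.inf
    best = logdet(S)
    improved = True
    while improved:
        improved = False
        for out in list(S):
            for inc in range(n):
                if inc in S:
                    continue
                cand = [x for x in S if x != out] + [inc]
                val = logdet(cand)
                if val > best + tol:                    # strictly improving swap
                    S, best, improved = cand, val, True
                    break
            if improved:
                break
    return S

# illustrative random instance
V = np.random.default_rng(0).normal(size=(30, 4))
print(sorted(d_optimal_local_search(V, k=8)))
```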
Abstract:
In network routing, users often trade off different objectives in selecting their best route. An example is transportation networks, where, due to the uncertainty of travel times, drivers may trade off the average travel time of a route against its variance, or trade off time and cost, such as the cost paid in tolls.
We wish to understand the effect of two conflicting criteria in route selection, by studying the resulting traffic assignment (equilibrium) in the network. We investigate two perspectives of this topic: (1) How does the equilibrium cost of a risk-averse population compare to that of a risk-neutral population? (i.e., how much longer do we spend in traffic due to being risk-averse) (2) How does the equilibrium cost of a heterogeneous population compare to that of a comparable homogeneous user population?
We provide characterizations for both of the questions above.
Based on joint work with Richard Cole, Thanasis Lianeas and Nicolas Stier-Moses.
At the end I will mention current work of my research group on algorithms and mechanism design for power systems.
Biography:
Evdokia Nikolova is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Texas at Austin, where she is a member of the Wireless Networking & Communications Group. Previously she was an Assistant Professor in Computer Science and Engineering at Texas A&M University. She graduated with a BA in Applied Mathematics with Economics from Harvard University, an MS in Mathematics from Cambridge University, U.K., and a Ph.D. in Computer Science from MIT.
Nikolova's research aims to improve the design and efficiency of complex systems (such as networks and electronic markets), by integrating stochastic, dynamic and economic analysis. Her recent work examines how human risk aversion transforms traditional computational models and solutions. One of her algorithms has been adapted in the MIT CarTel project for traffic-aware routing. She currently focuses on developing algorithms for risk mitigation in networks, with applications to transportation and energy. She is a recipient of an NSF CAREER award and a Google Faculty Research Award. Her research group has been recognized with a best student paper award and a best paper award runner-up. She currently serves on the editorial board of the journal Mathematics of Operations Research.
Abstract: We study the dynamic graph connectivity problem in the massively parallel computation model. We give a data structure for maintaining a dynamic undirected graph that handles batches of updates and connectivity queries in constant rounds, as long as the queries fit on a single machine. This assumption corresponds to the gradual buildup of databases over time from sources such as log files and user interactions. Our techniques combine a distributed data structure for Euler Tour (ET) trees, a structural theorem about rapidly contracting graphs via sampling \(n^{\epsilon}\) random neighbors, as well as modifications to sketching-based dynamic connectivity data structures.
Joint work with David Durfee, Janardhan Kulkarni, Richard Peng and Xiaorui Sun.
Abstract: As a generalization of many classic problems in combinatorial optimization, submodular optimization has found a wide range of applications in machine learning (e.g., in feature engineering and active learning). For many large-scale optimization problems, we are often concerned with the adaptivity complexity of an algorithm, which quantifies the number of sequential rounds where polynomially-many independent function evaluations can be executed in parallel. While low adaptivity is ideal, it is not sufficient for a (distributed) algorithm to be efficient, since in many practical applications of submodular optimization function evaluations are expensive and their number can become prohibitively large. Motivated by such applications, we study the adaptivity and query complexity of adaptive submodular optimization.
Our main result is a distributed algorithm for maximizing a monotone submodular function with cardinality constraint \(k\) that achieves a \((1-1/e-\varepsilon)\)-approximation in expectation. Furthermore, this algorithm runs in \(O(\log(n))\) adaptive rounds and makes \(O(n)\) calls to the function evaluation oracle in expectation. All three of these guarantees are optimal, and the query complexity is substantially less than in previous works. Finally, to show the generality of our simple algorithm and techniques, we extend our results to the submodular cover problem.
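For contrast with the low-adaptivity algorithm above, here is a minimal sketch (mine, not the speaker's algorithm) of the classical fully sequential greedy, which also attains a \((1-1/e)\)-approximation for monotone submodular maximization under a cardinality constraint but needs \(k\) adaptive rounds and \(O(nk)\) oracle calls; the coverage function in the demo is hypothetical.

```python
def greedy_submodular(ground_set, f, k):
    """Classical sequential greedy: k adaptive rounds, O(n k) calls to the
    monotone submodular value oracle f (set -> float)."""
    S = set()
    for _ in range(k):
        gains = {e: f(S | {e}) - f(S) for e in ground_set if e not in S}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        S.add(best)
    return S

# hypothetical demo: a coverage function over small sets
sets = {1: {1, 2, 3}, 2: {3, 4}, 3: {4, 5, 6}, 4: {1, 6}}
cover = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0
print(greedy_submodular(sets.keys(), cover, k=2))   # picks {1, 3}, covering all 6 elements
```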
Joint work with Vahab Mirrokni and Morteza Zadimoghaddam (arXiv:1807.07889).
Abstract: Given a finite partially ordered set (poset) of width \(w\), Dilworth's theorem guarantees a partition into \(w\) chains and shows that no partition into fewer chains exists. First-Fit is an online algorithm for creating a chain partition of a poset. Given a linear ordering of the points of the poset, \(v_1, \cdots, v_n\), First-Fit assigns the point \(v_i\) to the first available chain in the chain partition of the points \(v_1, \cdots, v_{i-1}\). It is a known fact that First-Fit has unbounded performance over the family of finite posets of width 2. We give a complete characterization of the family of finite posets in which First-Fit performs with subexponential efficiency in terms of width. We will also review what is currently known on the family of posets in which First-Fit performs with polynomial efficiency in terms of width. Joint work with Kevin Milans.
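A minimal sketch (not from the paper) of First-Fit chain partitioning, assuming the poset is given by a comparability predicate leq: since a chain must be totally ordered, each new point goes to the first existing chain all of whose points are comparable to it. The divisibility example is purely illustrative.

```python
def first_fit_chains(points, leq):
    """points: elements in the presented (online) order.
    leq(a, b): True iff a <= b in the poset.
    Returns a list of chains (lists of mutually comparable points)."""
    comparable = lambda a, b: leq(a, b) or leq(b, a)
    chains = []
    for v in points:
        for chain in chains:                      # first chain that stays a chain
            if all(comparable(v, u) for u in chain):
                chain.append(v)
                break
        else:
            chains.append([v])                    # no chain fits: open a new one
    return chains

# hypothetical example: the divisibility poset on {2, ..., 10}
leq = lambda a, b: b % a == 0
print(first_fit_chains(range(2, 11), leq))
# [[2, 4, 8], [3, 6], [5, 10], [7], [9]]
```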
Abstract: Queueing systems are studied in various asymptotic regimes because they are hard to study in general. One popular regime of study is the heavy-traffic regime, when the system is loaded very close to its capacity. Heavy-traffic behavior of queueing systems is traditionally studied using fluid and diffusion limits. In this talk, I will present a recently developed method called the 'Drift Method', which is much simpler, and is based on studying the drift of certain test functions. In addition to exactly characterizing the heavy-traffic behavior, the drift method can be used to obtain lower and upper bounds for all loads. In this talk, I will present the drift method, and its successful application in the context of data center networks to resolve a decade-old conjecture. I will also talk about ongoing work and some open problems.
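As a textbook illustration of the idea (not a result from the talk): for a discrete-time single-server queue with \(q(t+1) = [q(t)+a(t)-s(t)]^{+} = q(t)+a(t)-s(t)+u(t)\), where \(u(t)\) is the unused service and \(q(t+1)u(t)=0\), one has \(q(t+1)^2 = (q+a-s)^2 - u^2\), and setting the steady-state drift of the test function \(V(q)=q^2\) to zero gives
\[ 0 \;=\; \mathbb{E}\big[(q+a-s)^2 - u^2 - q^2\big] \;=\; 2\,\mathbb{E}[q(a-s)] + \mathbb{E}[(a-s)^2] - \mathbb{E}[u^2]. \]
If the arrival and service variables are independent of the queue length, this yields \(\mathbb{E}[q] \le \frac{\mathbb{E}[(a-s)^2]}{2(\mathbb{E}[s]-\mathbb{E}[a])}\), and the neglected \(\mathbb{E}[u^2]\) term becomes negligible in heavy traffic (under mild assumptions), which is the sense in which such drift arguments pin down heavy-traffic behavior.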
Bio: Siva Theja Maguluri is an Assistant Professor in the School of Industrial and Systems Engineering at Georgia Tech. Before that, he was a Research Staff Member in the Mathematical Sciences Department at IBM T. J. Watson Research Center. He obtained his Ph.D. in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign, where he worked on resource allocation algorithms for cloud computing and wireless networks. Earlier, he received an MS in ECE and an MS in Applied Math from UIUC and a B.Tech in Electrical Engineering from IIT Madras. His research interests include Stochastic Processes, Optimization, Cloud Computing, Data Centers, Resource Allocation and Scheduling Algorithms, Networks, and Game Theory. The current talk is based on a paper that received the Best Publication in Applied Probability Award, presented by the INFORMS Applied Probability Society.
Abstract: At the heart of most algorithms today there is an optimization engine trying to learn online and provide the best decision (e.g., rankings of objects) at any time, given the partial information observed thus far. It often becomes difficult to find near-optimal solutions to many problems due to their inherent combinatorial structure, which leads to certain computational bottlenecks. Submodularity is a discrete analogue of convexity and is a key property often exploited in tackling combinatorial optimization problems. In the first part of the talk, we will focus on computational bottlenecks that involve submodular functions: (a) convex function minimization over submodular base polytopes (e.g., the permutahedron) and (b) movement along a line inside submodular base polytopes. We give a conceptually simple and strongly polynomial algorithm, Inc-Fix, for the former, which is useful in computing Bregman projections in first-order projection-based methods like online mirror descent. For the latter, we will bound the iterations of the discrete Newton method, which gives a running time improvement of at least \(n^6\) over the state of the art. This is joint work with Michel Goemans and Patrick Jaillet. In the second part of the talk, we will consider the dual problem of (a), i.e., minimization of composite convex and submodular objectives. We will resolve Bach's conjecture from 2015 about the running time of a popular variant of Kelley's cutting plane method for minimizing these composite objectives. This is joint work with Madeleine Udell and Song Zhou.
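As background (not part of the talk's new results), linear optimization over a submodular base polytope is solved by Edmonds' greedy algorithm, which projection-based first-order methods repeatedly call as a subroutine. Below is a minimal sketch; the cardinality-based function whose base polytope is the permutahedron is used purely as a hypothetical example.

```python
def edmonds_greedy(weights, f):
    """Maximize <w, x> over the base polytope B(f) of a submodular f.
    Sort elements by decreasing weight; x[v] = f(prefix + v) - f(prefix)."""
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    x, prefix = [0.0] * len(weights), []
    for v in order:
        x[v] = f(prefix + [v]) - f(prefix)
        prefix.append(v)
    return x

# hypothetical example: f(S) = n + (n-1) + ... + (n-|S|+1), whose base
# polytope is the permutahedron with vertices the permutations of (1, ..., n)
n = 4
f = lambda S: sum(n - i for i in range(len(S)))
print(edmonds_greedy([0.3, 0.9, 0.1, 0.5], f))   # [2.0, 4.0, 1.0, 3.0]
```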
Abstract: We investigate whether standard dimensionality reduction techniques inadvertently produce data representations with different fidelity for two different populations. We show that on several real-world datasets, PCA has higher reconstruction error on population A than on B (for example, women versus men, or lower- versus higher-educated individuals). This can happen even when the dataset has a similar number of samples from A and B. This motivates our study of dimensionality reduction techniques which maintain similar fidelity for A and B. We give an efficient algorithm for finding a projection which is nearly-optimal with respect to this measure, and evaluate it on several datasets. This is joint work with Uthaipon Tantipongpipat, Jamie Morgenstern, Mohit Singh, and Santosh Vempala.
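A minimal sketch (with synthetic data, not the paper's datasets or method) of the measurement in question: fit PCA on the pooled data, then compare the average reconstruction error of the two groups separately.

```python
import numpy as np

def group_reconstruction_errors(A, B, d):
    """Fit rank-d PCA on the pooled (mean-centered) data, then report the
    average squared reconstruction error separately for groups A and B."""
    X = np.vstack([A, B])
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    P = Vt[:d].T @ Vt[:d]                      # projector onto the top-d subspace
    err = lambda G: np.mean(np.sum(((G - mu) - (G - mu) @ P) ** 2, axis=1))
    return err(A), err(B)

# synthetic example: the pooled top direction fits group A much better than B
rng = np.random.default_rng(1)
A = rng.standard_normal((300, 5)) * np.array([3, 1, 1, 1, 1])
B = rng.standard_normal((100, 5)) * np.array([1, 3, 1, 1, 1])
print(group_reconstruction_errors(A, B, d=1))   # roughly (4, 12): B is represented worse
```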
Abstract: We study the fundamental problem of high-dimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial time algorithms for this problem with dimension-independent error guarantees for several families of structured distributions.
In this work, we give the first nearly-linear time algorithms for high-dimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and sub-gaussian tails, and (ii) unknown bounded covariance. Given \(N\) samples in \(R^d\), an \(\epsilon\)-fraction of which may be arbitrarily corrupted, our algorithms run in time \(\tilde{O}(Nd)/\mathrm{poly}(\epsilon)\) and approximate the true mean within the information-theoretically optimal error, up to constant factors. Previous robust algorithms with comparable error guarantees have running times \(\tilde{\Omega}(N d^2)\).
Our algorithms rely on a natural family of SDPs parameterized by our current guess \(\nu\) for the unknown mean \(\mu^\star\). We give a win-win analysis establishing the following: either a near-optimal solution to the primal SDP yields a good candidate for \(\mu^\star\) -- independent of our current guess \(\nu\) -- or the dual SDP yields a new guess \(\nu'\) whose distance from \(\mu^\star\) is smaller by a constant factor. We exploit the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time. Our approach is quite general, and we believe it can also be applied to obtain nearly-linear time algorithms for other high-dimensional robust learning problems.
This is joint work with Ilias Diakonikolas and Rong Ge.
Abstract: In this talk, I will present a popular distributed method, namely, the distributed consensus-based gradient (DCG) method, for solving optimal learning problems over a network of agents. Such problems arise in many applications, such as finding optimal parameters over a large dataset distributed among a network of processors or seeking an optimal policy for coverage control problems in robotic networks. The focus is to present our recent results, where we study the performance of DCG when the agents are only allowed to exchange their quantized values due to their finite communication bandwidth. In particular, we develop a novel quantization method, which we refer to as adaptive quantization. The main idea of our approach is to quantize the nodes' estimates based on the progress of the algorithm, which helps to eliminate the quantization errors. Under adaptive quantization, we then derive bounds on the convergence rates of the proposed method as a function of the bandwidths and the underlying network topology, for both convex and strongly convex objective functions. Our results suggest that under adaptive quantization, the rates of convergence of DCG with and without quantization are the same, except for a factor which captures the number of quantization bits. To the best of the authors' knowledge, the results in this paper improve upon any existing results for DCG under quantization.
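A minimal sketch of the basic DCG iteration, with a naive fixed-resolution quantizer standing in for the talk's adaptive scheme; the mixing matrix, step size, and objective are illustrative choices of mine, and with a constant step size the iterates only reach a neighborhood of the optimum.

```python
import numpy as np

def dcg_quantized(grads, W, T=200, step=0.1, resolution=0.01):
    """Distributed consensus-based gradient sketch: each agent averages its
    neighbors' (quantized) iterates via the mixing matrix W, then takes a
    local gradient step. grads[i](x) returns agent i's gradient at x."""
    n = W.shape[0]
    x = np.zeros(n)                              # one scalar decision variable per agent
    quantize = lambda z: np.round(z / resolution) * resolution
    for _ in range(T):
        x_hat = quantize(x)                      # agents exchange quantized values only
        x = W @ x_hat - step * np.array([grads[i](x[i]) for i in range(n)])
    return x

# illustrative example: 3 agents minimizing sum_i (x - c_i)^2 / 2
c = [1.0, 2.0, 6.0]
grads = [lambda x, ci=ci: x - ci for ci in c]
W = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])                # doubly stochastic mixing matrix
print(dcg_quantized(grads, W))                   # entries near the minimizer mean(c) = 3
```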
This is based on joint work with Siva Theja Maguluri and Justin Romberg.
Bio: Thinh T. Doan is a TRIAD postdoctoral fellow at the Georgia Institute of Technology, joint between the School of Industrial and Systems Engineering and the School of Electrical and Computer Engineering (ECE). He was born in Vietnam, where he received his Bachelor's degree in Automatic Control from Hanoi University of Science and Technology in 2008. He obtained his Master's and Ph.D. degrees, both in ECE, from the University of Oklahoma in 2013 and the University of Illinois at Urbana-Champaign in 2018, respectively. His research interests lie at the intersection of control theory, optimization, distributed algorithms, and applied probability, with the main applications in machine learning, reinforcement learning, power networks, and multi-agent systems.
Abstract: We introduce a simple method for proving lower bounds for the size of the smallest percolating set in a certain graph bootstrap process. We apply this method to determine the sizes of the smallest percolating sets in multidimensional tori and multidimensional grids (in particular hypercubes). The former answers a question of Morrison and Noel, and the latter provides an alternative and simpler proof for one of their main results. This is based on joint work with Lianna Hambardzumyan and Hamed Hatami.
Abstract: Often the popular press talks about the power of quantum computing coming from its ability to perform several computations simultaneously. We’ve already had a similar capability from probabilistic machines. This talk will explore the relationship between quantum and randomized computation, how they are similar and how they differ, and why quantum can work exponentially faster on some but far from all computational problems. We’ll talk about some open problems in quantum complexity that can help shed light on the similarities and differences between randomness and “quantumness”.
This talk will not assume any previous knowledge of quantum information or quantum computing.
Abstract: Consider a linear combination of independent identically distributed random variables \(X_1, \ldots, X_n\) with fixed weights \(a_1, \ldots, a_n\). If the random variables are continuous, the sum is almost surely non-zero. However, for discrete random variables an exact cancellation may occur with a positive probability. This probability depends on the arithmetic nature of the sequence \(a_1, \ldots, a_n\). We will discuss how to measure the relevant arithmetic properties and how to evaluate the probability of exact and approximate cancellation.
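For a concrete instance of the phenomenon (a standard fact, not specific to the talk): with all weights \(a_i = 1\) and Rademacher signs \(X_i \in \{\pm 1\}\), exact cancellation has probability
\[ \mathbb{P}\Big(\sum_{i=1}^{n} X_i = 0\Big) \;=\; \binom{n}{n/2} 2^{-n} \;=\; \Theta\big(n^{-1/2}\big) \quad (n \text{ even}), \]
and the Erdős–Littlewood–Offord theorem shows that for any nonzero weights \(a_1, \ldots, a_n\) the probability that the sum hits any single value is at most \(O(n^{-1/2})\); richer arithmetic structure in the weights (for example, all \(a_i\) distinct) forces this probability to be much smaller.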
Abstract: In this talk we introduce two different random graph models that produce sparse graphs with overlapping community structure and discuss community detection in each context. The Random Overlapping Community (ROC) model produces a sparse graph by constructing many Erdős–Rényi random graphs (communities) on small randomly selected subsets of vertices. By varying the size and density of these communities, ROC graphs can be tuned to exhibit a wide range of normalized closed-walk count vectors, including those of hypercubes. This is joint work with Santosh Vempala. In the second half of the talk, we introduce the Community Configuration Model (CCM), a variant of the configuration model in which half-edges are assigned colors and pair according to a matching rule on the colors. The model is a generalization of models in the statistical physics literature and is a natural finite analog for classes of graphexes. We describe a hypothesis testing algorithm that determines whether a graph came from a community configuration model or a traditional configuration model. This is joint work with Christian Borgs, Jennifer Chayes, Souvik Dhara, and Subhabrata Sen.
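A minimal sketch of sampling from a ROC-style graph (the parameter choices are mine, not the paper's exact parameterization): choose m random vertex subsets of size s and place an Erdos-Renyi graph with edge probability q inside each.

```python
import random
import itertools

def sample_roc_graph(n, m, s, q, seed=0):
    """Random Overlapping Community sketch: the union of m Erdos-Renyi
    communities G(s, q) placed on random s-subsets of the n vertices."""
    rng = random.Random(seed)
    edges = set()
    for _ in range(m):
        community = rng.sample(range(n), s)
        for u, v in itertools.combinations(community, 2):
            if rng.random() < q:
                edges.add((min(u, v), max(u, v)))
    return edges

# hypothetical parameters giving a sparse graph with overlapping communities
E = sample_roc_graph(n=1000, m=400, s=10, q=0.3)
print(len(E), "edges")
```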
Abstract: Studying random samples drawn from large,
complex sets is one way to begin to learn about typical properties and
behaviors. However, it is important that the samples examined are
random enough: studying samples that are unexpectedly correlated or
drawn from the wrong distribution can produce misleading conclusions.
Sampling processes using Markov chains have been utilized in physics,
chemistry, and computer science, among other fields, but they are often
applied without careful analysis of their reliability. Making sure
widely-used sampling processes produce reliably representative samples
is a main focus of my research, and in this talk I'll touch on two
specific applications from statistical physics and combinatorics.
I'll also discuss work applying these same Markov chain processes used
for sampling in a novel way to address research questions in
programmable matter and swarm robotics, where a main goal is to
understand how simple computational elements can accomplish complicated
system-level goals. In a constrained setting, we've answered this
question by showing that groups of abstract particles executing our
simple processes (which are derived from Markov chains) can provably
accomplish remarkable global objectives. In the long run, one goal is
to understand the minimum computational abilities elements need in
order to exhibit complex global behavior, with an eye towards
developing systems where individual components are as simple as
possible.
This talk includes joint work with Marta Andrés Arroyo, Joshua J.
Daymude, Daniel I. Goldman, David A. Levin, Shengkai Li, Dana Randall,
Andréa Richa, William Savoie, Alexandre Stauffer, and Ross Warkentin.
Abstract: Stochastic programming is concerned with
decision making under uncertainty, seeking an optimal policy with
respect to a set of possible future scenarios. While the value of
Stochastic Programming is obvious to many practitioners, in reality
uncertainty in decision making is oftentimes neglected.
For deterministic optimisation problems, a coherent chain of
modelling and solving exists. Employing standard modelling languages
and solvers for stochastic programs is, however, difficult. First, they have (with exceptions) no native support for formulating Stochastic Programs. Secondly, solving stochastic programs with standard solvers
(e.g. MIP solvers) is often computationally intractable.
David will be talking about his research that aims to make Stochastic Programming more accessible. First, he will discuss modelling deterministic and stochastic programs in the Constraint Programming language MiniZinc - a modelling paradigm that retains the structure of a problem much more strongly than MIP formulations. Secondly, he will present decomposition algorithms he has been working on to solve combinatorial Stochastic Programs.
Bio:
David is a PhD student at Monash University in Melbourne. Before
joining Monash, he completed a Bachelors degree in Systems Engineering
in Switzerland, studied Robotics at University of Bristol and worked in
various engineering roles. As part of his PhD, David is working on a solver system for stochastic combinatorial optimisation problems that are modelled in the Constraint Programming language MiniZinc. In the future, he would like to combine his passion for automation with the
research he is doing in operations research.
Abstract: Physical sensors (thermal, light, motion,
etc.) are becoming ubiquitous and offer important benefits to society.
However, allowing sensors into our private spaces has resulted in
considerable privacy concerns. Differential privacy has been developed
to help alleviate these privacy concerns. In this talk, we’ll develop
and define a framework for releasing physical data that both preserves utility and provides privacy. Our notion of closeness of physical data
will be defined via the Earth Mover Distance and we’ll discuss the
implications of this choice. Physical data, such as temperature
distributions, are often only accessible to us via a linear
transformation of the data.
We’ll analyse the implications of our privacy definition for linear
inverse problems, focusing on those that are traditionally considered
to be "ill-conditioned”. We’ll then instantiate our framework with the
heat kernel on graphs and discuss how the privacy parameter relates to
the connectivity of the graph. Our work indicates that it is possible
to produce locally private sensor measurements that both keep the exact
locations of the heat sources private and permit recovery of the
"general geographic vicinity" of the sources. Joint work with Anna C.
Gilbert.
Abstract: A lot of well-studied problems in CS
Theory are about making “sketches” of graphs that occupy much less
space than the graph itself, but where the shortest path distances of
the graph can still be approximately recovered from the sketch. For
example, in the literature on Spanners, we seek a sparse subgraph whose
distance metric approximates that of the original graph. In the Emulator
literature, we relax the requirement that the approximating graph is a
subgraph. Most generally, in Distance Oracles, the sketch can be an
arbitrary data structure, so long as it can approximately answer
queries about the pairwise distance between nodes in the original graph.
Research on these objects typically focuses on optimizing the
worst-case tradeoff between the quality of the approximation and the
amount of space that the sketch occupies. In this talk, we will survey
a recent leap in understanding about this tradeoff, overturning the
conventional wisdom on the problem. Specifically, the tradeoff is not
smooth, but rather it follows a new discrete hierarchy in which the
quality of the approximation that can be obtained jumps considerably at
certain predictable size thresholds. The proof is graph-theoretic and
relies on building large families of graphs with large discrepancies in
their metrics.
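To make the spanner notion concrete, here is a minimal sketch of the classical greedy \((2k-1)\)-spanner construction (a standard textbook algorithm, not one of the talk's new results): scan edges by nondecreasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within stretch \(2k-1\).

```python
import heapq
from collections import defaultdict

def greedy_spanner(edges, k):
    """edges: list of (w, u, v). Returns a subset H of edges such that
    dist_H(u, v) <= (2k - 1) * w for every original edge (w, u, v),
    which implies stretch 2k - 1 for all pairs of nodes."""
    adj = defaultdict(list)

    def spanner_dist(src, dst, cutoff):
        # Dijkstra on the spanner built so far, abandoning paths beyond the cutoff
        pq, settled = [(0.0, src)], set()
        while pq:
            d, u = heapq.heappop(pq)
            if u in settled or d > cutoff:
                continue
            if u == dst:
                return d
            settled.add(u)
            for w, v in adj[u]:
                if v not in settled:
                    heapq.heappush(pq, (d + w, v))
        return float("inf")

    H = []
    for w, u, v in sorted(edges):
        if spanner_dist(u, v, (2 * k - 1) * w) > (2 * k - 1) * w:
            H.append((w, u, v))
            adj[u].append((w, v))
            adj[v].append((w, u))
    return H

# hypothetical example: a 5-cycle with one heavier closing edge, k = 2 (stretch 3)
edges = [(1, 0, 1), (1, 1, 2), (1, 2, 3), (1, 3, 4), (2, 4, 0)]
print(greedy_spanner(edges, k=2))   # the heavy edge is redundant at stretch 3
```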
Abstract: A low-diameter decomposition (LDD) of an
undirected graph G is a partitioning of G into components of bounded
diameter, such that only a small fraction of original edges are between
the components. This decomposition has played an instrumental role in the design of low-stretch spanning trees, spanners, distributed algorithms, etc.
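For intuition about the static object, here is a minimal, simplified sketch of ball growing on an unweighted graph (mine, with a vertex-count stopping rule; the talk's dynamic algorithm is different and the classical analysis bounds cut edges rather than boundary vertices): each ball grows layer by layer until the next BFS layer is smaller than a beta fraction of the ball, so cluster radii stay \(O(\log n / \beta)\).

```python
from collections import defaultdict

def ball_growing_ldd(adj, beta):
    """adj: dict node -> list of neighbors (undirected, unweighted).
    Grow BFS balls from unclustered vertices; stop a ball once the next
    layer has at most beta * (current ball size) new vertices."""
    cluster, next_id = {}, 0
    for s in adj:
        if s in cluster:
            continue
        ball, frontier = {s}, [s]
        while True:
            nxt = {v for u in frontier for v in adj[u]
                   if v not in cluster and v not in ball}
            if len(nxt) <= beta * len(ball):
                break                      # small boundary: cut here
            ball |= nxt
            frontier = list(nxt)
        for v in ball:
            cluster[v] = next_id
        next_id += 1
    return cluster

# hypothetical example: a 4x4 grid graph with beta = 0.5
def grid(k):
    adj = defaultdict(list)
    for i in range(k):
        for j in range(k):
            for di, dj in ((0, 1), (1, 0)):
                if i + di < k and j + dj < k:
                    adj[(i, j)].append((i + di, j + dj))
                    adj[(i + di, j + dj)].append((i, j))
    return adj

print(ball_growing_ldd(grid(4), beta=0.5))
```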
A natural question is whether such a decomposition can be efficiently
maintained/updated as G undergoes insertions/deletions of edges. We
make the first step towards answering this question by designing a
fully-dynamic graph algorithm that maintains an LDD in sub-linear
update time.
It is known that any undirected graph G admits a spanning tree T with
nearly logarithmic average stretch, which can be computed in nearly linear time. Such trees underlie much of the recent progress in static algorithms for combinatorial and scientific flows. Using our
dynamic LDD algorithm, we present the first non-trivial algorithm that
dynamically maintains a low-stretch spanning tree in \(\tilde{O}(t^2)\)
amortized update time, while achieving \((t + \sqrt{n^{1+o(1)}/t})\)
stretch, for every \(1 \leq t \leq n\).
Joint work with Sebastian Krinninger.
March 23: Closed, Spring Break
Abstract: The talk is based on https://arxiv.org/abs/1802.08318.
Abstract: Consider the problem of selling items to
a unit-demand buyer. Most work on maximizing seller revenue considers
either a setting that is single dimensional, such as where the items
are identical, or multi-dimensional, where the items are heterogeneous.
With respect to revenue-optimal mechanisms, these settings sit at
extreme ends of a spectrum: from simple and fully characterized
(single-dimensional) to complex and nebulous (multi-dimensional).
In this paper, we identify a setting that sits in between these
extremes. We consider a seller who has three services {A,B,C} for sale
to a single buyer with a value v and an interest G from {A,B,C}, and
there is a known partial ordering over the services. For example,
suppose the seller is selling {internet}, {internet, phone}, and
{internet, cable tv}. A buyer with interest {internet} would be
satisfied by receiving phone or cable tv in addition, but a customer
whose interest is {internet, phone} cannot be satisfied by any other
option. Thus this corresponds to a partial-ordering where {internet}
> {internet, phone} and {internet} > {internet, cable tv}, but
{internet, phone} and {internet, cable tv} are not comparable.
We show formally that partially-ordered items lie in a space of their
own, in between identical and heterogeneous items: there exist
distributions over (value, interest) pairs for three partially-ordered
items such that the menu complexity of the optimal mechanism is
unbounded, yet for all distributions there exists an optimal mechanism
of finite menu complexity. So this setting is vastly more complex than
identical items (where the menu complexity is one), or even
“totally-ordered” items as in the FedEx Problem [FGKK16] (where the
menu complexity is at most seven, for three items), yet drastically
more structured than heterogeneous items (where the menu complexity can
be uncountable [DDT15]). We achieve this result by proving a
characterization of the class of best duals and by giving a primal
recovery algorithm which obtains the optimal mechanism. In addition, we
(1) extend our lower-bound to the Multi-Unit Pricing setting, (2) give
a tighter and deterministic characterization of the optimal mechanism
when the buyer’s distribution satisfies the declining marginal revenue
condition, and (3) prove a master theorem that allows us to reason
about duals instead of distributions.
Joint work with Nikhil Devanur, Raghuvansh Saxena, Ariel Schvartzman, and Matt Weinberg.
Kira will be around all week. Please email her to schedule one-on-one meetings.
Abstract: In this talk, we study solvers for
geometrically embedded graph structured block linear systems. The
general form of such systems, PSD-Graph-Structured Block Matrices
(PGSBM), arise in scientific computing, linear elasticity, the inner
loop of interior point algorithms for linear programming, and can be
viewed as extensions of graph Laplacians into multiple labels at each
graph vertex. Linear elasticity problems, more commonly referred to as
trusses, describe forces on a geometrically embedded object.
We present an asymptotically faster algorithm for solving linear
systems in well-shaped 3-D trusses. Our algorithm utilizes the
geometric structures to combine nested dissection and support theory,
which are both well studied techniques for solving linear systems. We
decompose a well-shaped 3-D truss into balanced regions with small
boundaries, run Gaussian elimination to eliminate the interior
vertices, and then solve the remaining linear system by preconditioning
with the boundaries.
On the other hand, we prove that the geometric structures are
"necessary" for designing fast solvers. Specifically, solving linear
systems in general trusses is as hard as solving general linear systems
over the reals. Furthermore, we give some other PGSBM linear systems for
which fast solvers imply fast solvers for general linear systems.
Based on joint work with Robert Schwieterman and Rasmus Kyng.
Abstract:
In 1995 Kim famously proved the Ramsey bound \(R(3,t) \ge c t^2/\log
t\) by constructing an \(n\)-vertex graph that is triangle-free and has
independence number at most \(C \sqrt{n \log n}\). We extend this
celebrated result, which is best possible up to the value of the
constants, by approximately decomposing the complete graph \(K_n\) into
a packing of such nearly optimal Ramsey \(R(3,t)\) graphs.
More precisely, for any \(\epsilon>0\) we find an edge-disjoint
collection \((G_i)_i\) of \(n\)-vertex graphs \(G_i \subseteq K_n\)
such that (a) each \(G_i\) is triangle-free and has independence number
at most \(C_\epsilon \sqrt{n \log n}\), and (b) the union of all the
\(G_i\) contains at least \((1-\epsilon)\binom{n}{2}\) edges. Our
algorithmic proof proceeds by sequentially choosing the graphs \(G_i\)
via a semi-random (i.e., Rödl nibble type) variation of the
triangle-free process.
As an application, we prove a conjecture in Ramsey theory by Fox,
Grinshpun, Liebenau, Person, and Szabó (concerning a Ramsey-type
parameter introduced by Burr, Erdös, Lovász in 1976). Namely, denoting
by \(s_r(H)\) the smallest minimum degree of \(r\)-Ramsey minimal
graphs for \(H\), we close the existing logarithmic gap for \(H=K_3\)
and establish that \(s_r(K_3) = \Theta(r^2 \log r)\).
Based on joint work with Lutz Warnke.
Abstract: In a self-organizing particle system, an abstraction of programmable
matter, simple computational elements called particles with limited
memory and communication self-organize to solve system-wide problems of
movement, coordination, and configuration.
In this paper, we consider stochastic, distributed, local, asynchronous
algorithms for 'shortcut bridging', in which particles self-assemble
bridges over gaps that simultaneously balance minimizing the length and
cost of the bridge. Army ants of the genus Eciton
have been observed exhibiting a similar behavior in their foraging
trails, dynamically adjusting their bridges to satisfy an efficiency
tradeoff using local interactions. Using techniques from Markov chain
analysis, we rigorously analyze our algorithm, show
it achieves a near-optimal balance between the competing factors of path
length and bridge cost, and prove that it exhibits a dependence on the
angle of the gap being 'shortcut' similar to that of the ant bridges. We
also present simulation results that qualitatively
compare our algorithm with the army ant bridging behavior. Our work
presents a plausible explanation of how convergence to globally optimal
configurations can be achieved via local interactions by simple
organisms (e.g., ants) with some limited computational
power and access to random bits. The proposed algorithm demonstrates the
robustness of the stochastic approach to algorithms for programmable
matter, as it is a surprisingly simple extension of a stochastic
algorithm for compression.
This is joint work between my advisor Andrea Richa and myself at ASU, and Sarah Cannon and Prof. Dana Randall here at GaTech.
Abstract:
We show that variants of spectral sparsification routines can preserve the total spanning tree counts of graphs, which, by Kirchhoff's matrix-tree theorem, are equivalent to determinants of graph Laplacian minors, or equivalently, of SDDM matrices. Our analysis utilizes this combinatorial connection to bridge between statistical leverage scores / effective resistances and the analysis of random graphs by [Janson, Combinatorics, Probability and Computing '94]. This leads to a routine
that in quadratic time, sparsifies a graph down to about \(n^{1.5}\)
edges in ways that preserve both the determinant and the distribution
of spanning trees (provided the sparsified graph is viewed as a random
object). Extending this algorithm to work with Schur complements and
approximate Cholesky factorizations leads to algorithms for counting
and sampling spanning trees which are nearly optimal for dense graphs.
We give an algorithm that computes a \((1\pm \delta)\) approximation to
the determinant of any SDDM matrix with constant probability in about
\(n^2\delta^{-2}\) time. This is the first routine for graphs that
outperforms general-purpose routines for computing determinants of
arbitrary matrices. We also give an algorithm that generates in about
\(n^2\delta^{-2}\) time a spanning tree of a weighted undirected graph
from a distribution with total variation distance of \(\delta\) from
the w-uniform distribution.
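To make the leverage score / effective resistance quantities concrete, here is a dense reference sketch (mine, far slower than the algorithms above): compute effective resistances from the pseudoinverse of the graph Laplacian; in spectral sparsification, an edge is typically sampled with probability proportional to its weight times its effective resistance.

```python
import numpy as np

def effective_resistances(n, edges):
    """edges: list of (u, v, weight). Returns the effective resistance of each
    edge, computed densely via the pseudoinverse of the graph Laplacian."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    Lp = np.linalg.pinv(L)
    return [Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for u, v, _ in edges]

# hypothetical example: a unit-weight triangle with one pendant edge
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0), (2, 3, 1.0)]
print(effective_resistances(4, edges))   # triangle edges: 2/3 each, pendant edge: 1
```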
This is joint work with John Peebles, Richard Peng and Anup B. Rao.
Abstract: We prove that every triangle-free graph with maximum
degree \(\Delta\) has list chromatic number at most \((1+o(1))\frac{\Delta}{\ln
\Delta}\). This matches the best-known bound for graphs of girth at least
5. We also provide a new proof that for any \(r\geq4\) every
\(K_r\)-free graph has list-chromatic number at most
\(200r\frac{\Delta\ln\ln\Delta}{\ln\Delta}\).
Abstract:
The search for the asymptotics of the Ramsey function R(3,k) has a long
and fascinating history. It begins in the hill country surrounding Budapest and winds over the decades through Europe, America, Korea, and Rio de Janeiro. We explore it through a CS lens, giving algorithms that provide the various upper and lower bounds. The arguments are various more or less sophisticated uses of Erdős Magic and, indeed,
many of the most important advances in the Probabilistic Method have
come from these investigations.