Maria-Florina Balcan (Carnegie Mellon University), Data-driven algorithm design

Data-driven algorithm design for combinatorial problems is an important aspect of modern data science. Rather than using off-the-shelf algorithms that offer only worst-case performance guarantees, practitioners typically optimize over large families of parameterized algorithms, tuning the parameters on a training set of problem instances from their domain to find a configuration with high expected performance over future instances. However, most of this work comes with no performance guarantees. The challenge is that for many combinatorial problems, including partitioning and subset selection problems, a small tweak to the parameters can cause a cascade of changes in the algorithm's behavior, so the algorithm's performance is a discontinuous function of its parameters.
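
As a toy illustration of this tuning loop (a minimal sketch in Python; the parameterized family alg(param, instance) and the synthetic instances are hypothetical, not from the talk), the empirical performance is a piecewise-constant, hence discontinuous, function of the parameter, and batch selection simply maximizes the average over the training sample:

    import random

    def alg(param, instance):
        # Toy parameterized algorithm: it "succeeds" on an instance only
        # when param falls in an instance-specific window, mimicking the
        # cascading, discontinuous behavior of tuned combinatorial algorithms.
        lo, value = instance
        return value if lo <= param <= lo + 0.2 else 0.0

    random.seed(0)
    training_set = [(random.random(), random.random()) for _ in range(200)]

    # Grid search: estimate the expected performance of each candidate
    # parameter on the training sample and keep the empirically best one.
    grid = [i / 100 for i in range(101)]
    best_param = max(grid, key=lambda p: sum(alg(p, x) for x in training_set))
    print("selected parameter:", best_param)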

In this talk, I will present new work that helps put data-driven combinatorial algorithm selection on firm foundations. This includes strong computational and statistical performance guarantees for both the batch and online scenarios, in which a collection of typical problem instances from the given application is presented either all at once or in an online fashion, respectively. I will describe both specific examples (for clustering, partitioning, and subset selection problems) and general principles that emerge in this context, including general techniques for sample complexity guarantees in the batch setting and no-regret guarantees in the online setting.

Nikhil Bansal (CWI and Eindhoven University of Technology), Discrepancy, Rounding and Approximation

Discrepancy theory deals with the following question: given a set system on some universe of elements, color the elements red and blue so that each set in the system is colored as evenly as possible. I will give an overview of discrepancy and describe some of its applications. I will then focus on selected results and techniques in discrepancy and, in particular, show how they can be used to design new general rounding techniques that lead to improved approximation guarantees for various algorithmic problems.
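
To make the objective concrete (a minimal sketch in Python, not from the talk): with colors encoded as +1/-1, the discrepancy of a coloring is the largest imbalance over the sets. A uniformly random coloring achieves \(O(\sqrt{n \log m})\) for \(m\) sets on \(n\) elements, by a Chernoff bound and a union bound, and Spencer's theorem improves this to \(O(\sqrt{n})\) when \(m = n\):

    import random

    def discrepancy(sets, coloring):
        # Largest |#red - #blue| imbalance over all sets in the system.
        return max(abs(sum(coloring[e] for e in s)) for s in sets)

    random.seed(1)
    n, m = 100, 100
    sets = [[e for e in range(n) if random.random() < 0.5] for _ in range(m)]
    coloring = {e: random.choice((-1, 1)) for e in range(n)}
    print("discrepancy of a random coloring:", discrepancy(sets, coloring))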

Maria Chudnovsky (Princeton University), Induced subgraphs and tree decompositions

Tree decompositions are a powerful tool in structural graph theory, traditionally used in the context of forbidden graph minors. Connecting tree decompositions and forbidden induced subgraphs has so far remained out of reach. Recently, we obtained several results in this direction; this talk will survey them.

Nicole Immorlica (Microsoft Research), Incentivizing Exploration with Selective Data Disclosure

We study the design of rating systems that incentivize efficient social learning. Agents arrive sequentially and choose actions, each of which yields a reward drawn from an unknown distribution. A policy maps the rewards of previously chosen actions to messages for arriving agents. The regret of a policy is the total difference, over all rounds, between the expected reward of the best action and the expected reward induced by the policy. Prior work proposes policies that recommend a single action to each agent, obtaining optimal regret under standard rationality assumptions. We instead assume a frequentist behavioral model and, accordingly, restrict attention to disclosure policies whose messages consist of the actions and rewards from a subsequence of past agents, chosen ex ante. We design a policy with optimal regret in the worst case over reward distributions. Our research suggests three components of effective policies: independent focus groups, group aggregators, and interlaced information structures.
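
The following simulation sketch (hypothetical numbers; it illustrates the model, not the paper's policy) shows why disclosure design matters: under full disclosure, greedy frequentist agents can herd on a suboptimal action and accumulate linear regret, which is exactly what selective disclosure is designed to prevent:

    import random

    random.seed(2)
    means = [0.5, 0.6]   # unknown Bernoulli reward means; action 1 is best
    T = 1000
    # One initial sample of each action, then a fully disclosed history.
    history = [(a, float(random.random() < means[a])) for a in (0, 1)]

    regret = 0.0
    for _ in range(T):
        # Frequentist agent: pick the action with the best empirical mean.
        emp = []
        for a in (0, 1):
            rs = [r for (b, r) in history if b == a]
            emp.append(sum(rs) / len(rs) if rs else 0.0)
        a = 0 if emp[0] >= emp[1] else 1
        history.append((a, float(random.random() < means[a])))
        regret += max(means) - means[a]

    print("regret under full disclosure:", round(regret, 1))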

Joint work with Jieming Mao, Alex Slivkins and Steven Wu.

Eduardo Sany Laber (Pontifical Catholic University of Rio de Janeiro), On the Price of Explainability for some Clustering Problems

Machine learning models and algorithms are used in a number of systems that make decisions affecting our lives. Explainable methods are therefore desirable, so that people can better understand the behaviour of these systems. However, we may be forced to sacrifice quality and/or efficiency in order to achieve explainability. In this talk, we investigate, from a theoretical perspective, the price of explainability for some clustering problems.

Alexander Razborov (The University of Chicago), Theons and Quasi-Randomness

There are two known approaches to the theory of limits of discrete combinatorial objects: geometric (graph limits) and algebraic (flag algebras). In the first part of the talk, we present a general framework intended to combine useful features of both theories and compare it with previous attempts of this kind. Our main objects are \(T\)-ons, for a universal relational first-order theory \(T\); they generalize all previously considered special cases, some of them (like permutons) in a rather non-trivial way.

In the second part, we apply this framework to offer a new perspective on quasi-randomness for combinatorial objects more complicated than ordinary graphs. Our quasi-randomness properties are natural in the sense that they do not use ad hoc densities and are preserved under the operation of defining combinatorial structures of one kind from structures of a different kind. One key concept in this theory is unique coupleability, roughly meaning that any alignment of two objects on the same ground set should “look” random.
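
As a classical point of comparison (the Chung–Graham–Wilson setting for ordinary graphs, stated here for orientation and not taken from the talk), a graph sequence \((G_n)\) is quasi-random with density \(p\) precisely when
\[
t(K_2, G_n) \to p \qquad \text{and} \qquad t(C_4, G_n) \to p^4,
\]
where \(t(H, G)\) denotes the homomorphism density of \(H\) in \(G\); the framework above asks which analogues of such properties survive for richer relational structures.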

Based on two joint papers with Leonardo Coregliano: Russian Mathematical Surveys 75(4) (2020) and arXiv:2012.11773.

Luca Trevisan (Bocconi University), Graph and Hypergraph Sparsification

A graph \(H\) is a sparsifier of a graph \(G\) if \(H\) has many fewer edges than \(G\) and, in an appropriate technical sense, \(H\) “approximates” \(G\). Sparsifiers are useful as compressed representations of graphs and for speeding up certain graph algorithms. In a “cut sparsifier,” the notion of approximation is that every cut is crossed by approximately the same number of edges in \(G\) as in \(H\). In a “spectral sparsifier,” a stronger, linear-algebraic notion of approximation holds. Similar definitions can be given for hypergraphs.
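
In the standard formulation (a sketch of the usual definitions with the approximation parameter made explicit; notation mine, not quoted from the talk), writing \(L_G\) and \(L_H\) for the graph Laplacians, \(H\) is an \(\varepsilon\)-spectral sparsifier of \(G\) when
\[
(1-\varepsilon)\, x^{\top} L_G x \;\le\; x^{\top} L_H x \;\le\; (1+\varepsilon)\, x^{\top} L_G x \qquad \text{for all } x \in \mathbb{R}^{V},
\]
and an \(\varepsilon\)-cut sparsifier when the same inequalities hold for all indicator vectors \(x \in \{0,1\}^{V}\), for which \(x^{\top} L_G x\) is exactly the weight of the corresponding cut. Quantifying over all real vectors rather than just indicators is what makes the spectral notion stronger.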

We discuss recent progress on constructions and lower bounds for graph and hypergraph sparsification, and we point out some challenging open problems.

Bianca Zadrozny (IBM Research Brazil), Evaluating classifier learning methods under covariate shift and spatial correlation

Classifier learning methods commonly assume that the training data consist of examples drawn at random from the same distribution as the test examples about which the learned model is expected to make predictions. In the real world, however, the joint distribution of the model's inputs and outputs often differs between training and test data, a problem known as sample selection bias or dataset shift. In this talk, I will review existing methods for dealing with this problem, in particular for the special case known as covariate shift, in which only the input distribution changes and the conditional distribution of the output given an input is assumed to remain fixed. I will then introduce the problem of covariate shift in geospatial data and illustrate the challenges it poses by assessing existing methods for evaluating the accuracy of classifier learning methods under covariate shift and spatial correlation.
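
As a sketch of one standard correction in this class (importance weighting, a common covariate-shift technique, not necessarily one covered in the talk; the distributions and classifier below are made up for illustration), one reweights training losses by the density ratio \(w(x) = p_{\text{test}}(x)/p_{\text{train}}(x)\), which yields an unbiased estimate of the test risk precisely because \(p(y \mid x)\) is shared:

    import math
    import random

    random.seed(3)

    # Training inputs ~ N(0,1), test inputs ~ N(1,1); p(y|x) is fixed.
    def p_train(x): return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    def p_test(x):  return math.exp(-(x - 1) ** 2 / 2) / math.sqrt(2 * math.pi)

    def label(x): return 1 if x > 0.5 else 0   # shared conditional p(y|x)
    def model(x): return 1 if x > 0.0 else 0   # some trained classifier

    xs = [random.gauss(0, 1) for _ in range(100000)]
    losses = [float(model(x) != label(x)) for x in xs]

    naive = sum(losses) / len(losses)
    weighted = sum(p_test(x) / p_train(x) * l
                   for x, l in zip(xs, losses)) / len(xs)
    print("naive training-error estimate:         ", round(naive, 3))
    print("importance-weighted test-risk estimate:", round(weighted, 3))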

