MIFODS Seminar

MIFODS runs a joint seminar with other seminars at MIT: the Stochastics and Statistics seminar, the LIDS seminar, the ToC colloquium, the Machine Learning seminar, the Applied Math colloquium, and the Combinatorics seminar.

Seminars

All seminars in chronological order

Afonso Bandeira (NYU)

Statistical estimation under group actions: The Sample Complexity of Multi-Reference Alignment

Video
Many problems in signal/image processing and computer vision amount to estimating a signal, image, or three-dimensional structure/scene from corrupted measurements. A particularly challenging form of measurement corruption is latent transformations of the underlying signal to be recovered. Many such transformations can be described as a group acting on the object to be recovered. Examples include the Simultaneous Localization and Mapping (SLAM) problem in robotics and computer vision, where pictures of a scene are obtained from different positions and orientations; Cryo-Electron Microscopy (Cryo-EM) imaging, where projections of a molecular density are taken at unknown rotations; and several others. One fundamental example of this type of problem is Multi-Reference Alignment: given a group acting on a space, the goal is to estimate an orbit of the group action from noisy samples. For example, in one of its simplest forms, one is tasked with estimating a signal from noisy cyclically shifted copies. We will show that the number of observations needed by any method has a surprising dependency on the signal-to-noise ratio (SNR) and on algebraic properties of the underlying group action. Remarkably, in some important cases, this sample complexity is achieved with computationally efficient methods based on computing invariants under the group of transformations.
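
The simplest form mentioned above, recovering a signal from noisy cyclically shifted copies, lends itself to a compact illustration of the invariants idea: the mean and the power spectrum of an observation do not change under cyclic shifts, so they can be averaged across observations without ever estimating the latent shifts. Below is a minimal NumPy sketch (the signal length, noise level, and sample size are arbitrary choices; recovering the Fourier phases as well would require a third-order invariant such as the bispectrum):

```python
import numpy as np

rng = np.random.default_rng(0)
L, n, sigma = 21, 5000, 1.0            # signal length, #observations, noise level

x = rng.standard_normal(L)             # ground-truth signal (unknown to the estimator)
shifts = rng.integers(0, L, size=n)    # latent cyclic shifts (never observed)
obs = np.stack([np.roll(x, s) for s in shifts]) + sigma * rng.standard_normal((n, L))

# Shift-invariant moments: averaging them requires no knowledge of the shifts.
mean_inv = obs.mean()                  # estimates the mean of x
power_inv = (np.abs(np.fft.fft(obs, axis=1)) ** 2).mean(axis=0) - L * sigma**2

true_power = np.abs(np.fft.fft(x)) ** 2
print("mean error:      ", abs(mean_inv - x.mean()))
print("power-spec error:", np.linalg.norm(power_inv - true_power) / np.linalg.norm(true_power))
```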

Justin Solomon (MIT)

Scaling and Broadening the Scope of Computational Transport

Video
The theory of optimal transport links probability to geometry, providing a natural means to characterize, compare, and manipulate probability distributions supported on a geometric domain. While optimal transport has long been identified as a candidate for improving pipelines in computer vision, graphics, statistics, and other disciplines, computational considerations have made it challenging to apply in practice; recent improvements in the computational pipeline, however, have motivated renewed interest in this discipline among members of the computer science community. In this talk, I will discuss recent efforts jointly with collaborators and students in the MIT Geometric Data Processing Group to develop efficient algorithms for computational optimal transport and derived problems, leveraging structure in key applications of this machinery. I will highlight new techniques for stochastic transport, transport along discrete surfaces, and transport along graphs, as well as extensions of the basic problem to tensor-valued measures and to quadratic assignment problems.
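
While the talk covers far more than fits in a snippet, a workhorse behind much recent computational transport is entropic regularization solved by Sinkhorn matrix scaling; the sketch below is that generic baseline, not the specific algorithms from the talk (the grid, bump widths, and regularization strength are illustrative):

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.05, iters=500):
    """Entropy-regularized optimal transport between histograms mu and nu."""
    K = np.exp(-C / eps)              # Gibbs kernel of the ground cost
    u = np.ones_like(mu)
    for _ in range(iters):            # alternate the two marginal constraints
        v = nu / (K.T @ u)
        u = mu / (K @ v)
    P = u[:, None] * K * v[None, :]   # approximate transport plan
    return P, float((P * C).sum())    # plan and its transport cost

# Two bump distributions on a 1-D grid with squared-distance ground cost.
grid = np.linspace(0, 1, 100)
mu = np.exp(-(grid - 0.3) ** 2 / 0.01); mu /= mu.sum()
nu = np.exp(-(grid - 0.7) ** 2 / 0.02); nu /= nu.sum()
C = (grid[:, None] - grid[None, :]) ** 2

P, cost = sinkhorn(mu, nu, C)
print("regularized transport cost:", cost)   # close to (0.7 - 0.3)^2 here
```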

David Sontag (MIT)

When Inference is Tractable

Video
A key capability of artificial intelligence will be the ability to reason about abstract concepts and draw inferences. Where data is limited, probabilistic inference in graphical models provides a powerful framework for performing such reasoning, and can even be used as modules within deep architectures. But when is probabilistic inference computationally tractable? I will present recent theoretical results that substantially broaden the class of provably tractable models by exploiting model stability (Lang, Sontag, Vijayaraghavan, AISTATS ’18), structure in model parameters (Weller, Rowland, Sontag, AISTATS ’16), and reinterpreting inference as ground truth recovery (Globerson, Roughgarden, Sontag, Yildirim, ICML ’15).
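
As a point of reference for what "tractable" means here, the classical easy case is inference on tree-structured models, where exact marginals come from message passing in time linear in the model size; the sketch below (not drawn from the cited papers) shows this for a small chain with toy potentials:

```python
import numpy as np

def chain_marginals(unaries, pairwise):
    """Exact marginals of a chain MRF p(x) ∝ Π_t φ_t(x_t) Π_t ψ(x_t, x_{t+1})
    via forward-backward message passing, in O(T k^2) time."""
    T, k = unaries.shape
    fwd, bwd = np.ones((T, k)), np.ones((T, k))
    for t in range(1, T):                       # forward messages
        fwd[t] = pairwise.T @ (fwd[t - 1] * unaries[t - 1])
        fwd[t] /= fwd[t].sum()                  # normalize for numerical stability
    for t in range(T - 2, -1, -1):              # backward messages
        bwd[t] = pairwise @ (bwd[t + 1] * unaries[t + 1])
        bwd[t] /= bwd[t].sum()
    marg = fwd * unaries * bwd
    return marg / marg.sum(axis=1, keepdims=True)

# Toy chain: 5 binary variables, attractive coupling, one informative node.
unaries = np.ones((5, 2)); unaries[0] = [0.9, 0.1]
pairwise = np.array([[0.8, 0.2], [0.2, 0.8]])
print(chain_marginals(unaries, pairwise))       # evidence at node 0 decays along the chain
```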

Ohad Shamir (Weizmann Institute)

Is Depth Needed for Deep Learning?

Video
Deep learning, as its name indicates, is based on training artificial neural networks with many layers. A key theoretical question is to understand why such depth is beneficial, and when it is provably necessary in order to express certain types of functions. In fact, this question is closely related to circuit complexity, which has long been studied in theoretical computer science -- albeit for different reasons, and for circuits which differ in some important ways from modern neural networks. Despite this similarity, the interaction between the circuit complexity and machine learning research communities is currently quite limited. In this talk, I'll survey some of the recent depth separation results developed in the machine learning community, and discuss open questions where insights from circuit complexity might help. The talk is aimed at a general theoretical computer science audience, and no prior knowledge about deep learning will be assumed.
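
One depth separation is simple enough to state in code: composing a two-ReLU "hat" map with itself k times is a constant-width, depth-k network, yet it produces a sawtooth with roughly 2^k linear pieces, which two-layer networks can only match with width exponential in k (a Telgarsky-style construction; the sketch below just counts the pieces, with an arbitrary grid size):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def hat(x):
    # Tent map on [0, 1] written with two ReLU units: one narrow layer.
    return 2 * relu(x) - 4 * relu(x - 0.5)

k = 5
x = np.linspace(0.0, 1.0, 1025)
y = x.copy()
for _ in range(k):          # depth-k composition, width 2 per layer
    y = hat(y)

# Count linear pieces via slope changes of the piecewise-linear output.
pieces = np.count_nonzero(np.diff(np.sign(np.diff(y)))) + 1
print(f"depth {k}: about {pieces} linear pieces")   # grows like 2^k
```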

Johannes Schmidt-Hieber (Leiden)

Statistical theory for deep neural networks with ReLU activation function

Video
The universal approximation theorem states that neural networks are capable of approximating any continuous function up to a small error that depends on the size of the network. The expressive power of a network does not, however, guarantee that deep networks perform well on data. For that, control of the statistical estimation risk is needed. In the talk, we derive statistical theory for fitting deep neural networks to data generated from the multivariate nonparametric regression model. It is shown that estimators based on sparsely connected deep neural networks with ReLU activation function and properly chosen network architecture achieve the minimax rates of convergence (up to logarithmic factors) under a general composition assumption on the regression function. The framework includes many well-studied structural constraints such as (generalized) additive models. While there is a lot of flexibility in the network architecture, the key tuning parameter is the sparsity of the network. Specifically, we consider large networks in which the number of potential parameters is much larger than the sample size. Interestingly, the depth (number of layers) of the neural network architecture plays an important role, and our theory suggests that scaling the network depth with the logarithm of the sample size is natural.
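
As a toy illustration of the regime in question (and nothing more: the estimator analyzed in the talk also enforces network sparsity, which scikit-learn's generic MLPRegressor does not), one can fit a ReLU network whose depth scales like the logarithm of the sample size to data from a compositional regression function; the function and sizes below are invented:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n, d = 2000, 4
X = rng.uniform(-1, 1, size=(n, d))
# Regression function with compositional structure: built from low-dimensional pieces.
f = np.sin(np.pi * X[:, 0] * X[:, 1]) + (X[:, 2] + X[:, 3]) ** 2
y = f + 0.1 * rng.standard_normal(n)

depth = max(2, int(np.log(n)))        # depth ~ log(sample size), as the theory suggests
net = MLPRegressor(hidden_layer_sizes=(32,) * depth, activation="relu",
                   alpha=1e-4, max_iter=2000, random_state=0)
net.fit(X, y)
print("train R^2:", net.score(X, y))
```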

Sanjoy Dasgupta (UCSD)

Using interaction for simpler and better learning

In the usual setup of supervised learning, the learner is given a stack of labeled examples and told to fit a classifier to them. It would be quite unnatural for a human to learn in this way, and indeed this model is known to suffer from a variety of fundamental hardness barriers. However, many of these hurdles can be overcome by moving to a setup in which the learner interacts with a human (or other information source) during the learning process. We will see how interaction makes it possible to:

1. Learn DNF (disjunctive normal form) concepts.
2. Perform machine teaching in situations where the student’s concept class is unknown.
3. Improve the results of unsupervised learning.

We will present a generic approach to “interactive structure learning” that, for instance, yields simple interactive algorithms for topic modeling and hierarchical clustering. Along the way, we will present a novel cost function for hierarchical clustering, as well as an efficient algorithm for approximately minimizing this cost.
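
The hierarchical clustering cost at the end is presumably the one in which every similarity edge pays its weight times the number of leaves under its endpoints' lowest common ancestor, so that trees merging similar points early (low in the tree) are cheap. A minimal sketch under that assumption, with an invented tree encoding as nested pairs and invented weights:

```python
import numpy as np

def leaves(tree):
    """Leaf set of a hierarchy encoded as nested 2-tuples of int labels."""
    return {tree} if isinstance(tree, int) else leaves(tree[0]) | leaves(tree[1])

def cost(tree, w):
    """Sum over edges {i,j} of w[i,j] times the #leaves under lca(i, j)."""
    if isinstance(tree, int):
        return 0.0
    left, right = leaves(tree[0]), leaves(tree[1])
    crossing = sum(w[i, j] for i in left for j in right)
    return crossing * (len(left) + len(right)) + cost(tree[0], w) + cost(tree[1], w)

# Toy similarity graph on 4 points: {0,1} and {2,3} are strongly similar.
w = np.zeros((4, 4))
for i, j, s in [(0, 1, 1.0), (2, 3, 1.0), (0, 2, 0.1), (1, 3, 0.1)]:
    w[i, j] = w[j, i] = s

good = ((0, 1), (2, 3))   # cuts the weak edges first -> cost 4.8
bad = ((0, 2), (1, 3))    # cuts the strong edges first -> cost 8.4
print(cost(good, w), cost(bad, w))
```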

Carlos Fernandez-Granda (NYU)

TBA

TBA

Shai Shalev-Shwartz (HUJI)

TBA

TBA

Workshop on Sublinear Algorithms and Robust Statistics

Workshop on the Spring '18 semester's theme: Sublinear Algorithms and Robust Statistics