## About

With the emergence of big data, and the huge combinatorial objects that comprise it, there is a need to understand data that is too large to store or view in its entirety. As much of today's data has some sort of combinatorial structure, the questions that now arise have relatively unfamiliar flavors: for example, does the Twitter graph have small diameter? How fast is a networked community growing? Does a distribution over traffic accidents in a very large map have certain shape properties, such as density that increases monotonically with distance to a traffic light, or similarity to a Poisson multinomial distribution? Determining the scope of what is possible to understand in such settings is a basic and fundamental endeavor that has been addressed in multiple ways within the communities of statistics, mathematics, computer science, and information theory.
Models that have been studied include:

- Streaming algorithms, where big data streams by and the algorithm has limited space in which to store information and intermediate computations.
- Sketching algorithms, where the input is compressed (sketched) into a smaller domain in such a way that certain operations can be supported on the sketches.
- Compressive sensing algorithms, where a signal is recovered from noisy measurements.
- Sublinear-time algorithms, which approximate a parameter of the data after viewing only a minuscule fraction of the data.
- Local algorithms, which compute and make decisions on parts of the output while considering only a portion of the input.
- Property testing algorithms, in which a quick determination of what properties the data has must be made in time sublinear in the size of the data.
- Distribution property testing algorithms, in which properties of a distribution must be understood in time sublinear in the size of the sampled domain.
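As a concrete illustration of the streaming model above, the sketch below shows standard reservoir sampling: it maintains a uniform random sample of a stream using space that depends only on the sample size, not the stream's length. This is a generic textbook example, not an algorithm drawn from any particular talk at the workshop.

```python
import random

def reservoir_sample(stream, k):
    """Maintain a uniform random sample of k items from a stream,
    using O(k) space regardless of how long the stream is."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep item i with probability k/(i+1), replacing a
            # uniformly chosen current occupant; this preserves
            # uniformity over all items seen so far.
            j = random.randrange(i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir
```

For example, `reservoir_sample(range(10**6), 10)` returns 10 items chosen uniformly from a million-element stream while ever storing only 10 of them.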

## WORKSHOP & BOOTCAMP

Register here. Registration for the bootcamp and/or workshop is free but mandatory.

### Synopsis

As big data gets bigger, there is a need to analyze data under sublinear constraints -- that is, with algorithms that require only sublinear time, space, measurements, and/or samples.
The goal of this workshop is to bring together experts in various areas of computer science, electrical engineering, statistics, and mathematics to discuss recent work and exciting new challenges. It is hoped that the multidisciplinary nature of the workshop will highlight common goals and themes, as well as facilitate an interchange of technical ideas that may be of use more widely than previously thought.
The workshop will be preceded by a one-day bootcamp on June 10, with the goal of presenting the basic techniques, definitions, and goals of several of the communities.

### Program

The workshop will take place at MIT (room 2-190) on June 11-13. The bootcamp will be held on June 10 (room 32-G449).

### Confirmed workshop speakers

- Ankur Moitra (MIT)
- Michael Mitzenmacher (Harvard)
- Moses Charikar (Stanford)
- Yihong Wu (Yale)
- Alon Orlitsky (UCSD)
- Huy Nguyen (Northeastern)
- Ilias Diakonikolas (USC)
- Deanna Needell (UCLA)
- David Woodruff (CMU)
- Vladimir Braverman (Johns Hopkins)
- Parikshit Gopalan (VMware Research)
- Noga Ron-Zewi (University of Haifa)
- Artur Czumaj (U. of Warwick)
- Sofya Raskhodnikova (Boston U.)
- Jacob Fox (Stanford)
- Andrea Montanari (Stanford)
- Jelani Nelson (Harvard)

### Confirmed bootcamp speakers

- Yufei Zhao (MIT)
- David Gamarnik (MIT)
- Alex Andoni (Columbia)
- Krzysztof Onak (IBM Yorktown)
- Costas Daskalakis (MIT)

### Organizers

- David Gamarnik (MIT)
- Piotr Indyk (MIT)
- Philippe Rigollet (MIT)
- Ronitt Rubinfeld (MIT) -- Lead organizer
- Devavrat Shah (MIT)
- Madhu Sudan (Harvard)
- Rachel Ward (UT Austin)
- Yihong Wu (Yale)
