Fall 2019

Learning under complex structure


Some of the most significant advances in the recent history of statistics and data science have relied on our ability to express and exploit structure in data. This structure may be simple, as in the case of parametric models such as linear regression, low-rank matrix estimation, or principal component analysis, where the data are assumed to be the superposition of a linear-algebraic structure and some well-behaved (e.g., Gaussian) noise. In other cases, the structure is simple only once the correct data representation is known, as in wavelet thresholding for natural image denoising, where linearity arises in the wavelet basis.

The recent explosion of routinely collected data has led scientists to contemplate increasingly sophisticated structural assumptions. In some cases, such as latent variable models, new models aim to capture heterogeneity in the data; in others, complex structures arise naturally as algebraic structures governed by the rigid laws of physics. Understanding how to harness and exploit such structure is key to improving the prediction accuracy of various learning procedures. The ultimate goal is to develop a set of tools that leverage underlying complex structure to pool information across observations, ultimately improving both the statistical accuracy and the computational efficiency of the deployed methods. Bringing together computer scientists, mathematicians, and statisticians will have a transformative impact on this fast-developing avenue of research.
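To make the low-rank-plus-noise model mentioned above concrete, the following is a minimal sketch (not part of the original text; all sizes, the rank, and the noise level are illustrative assumptions): observe Y = L + E with L low rank and E Gaussian, and estimate L by truncating the singular value decomposition of Y.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r = 50, 40, 3  # illustrative dimensions and planted rank

# Planted rank-r signal L and noisy observation Y = L + Gaussian noise.
L = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))
Y = L + 0.1 * rng.standard_normal((n, p))

# Truncated SVD: keep only the top-r singular directions of Y.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
L_hat = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Exploiting the low-rank structure discards most of the noise, so the
# truncated estimate is closer to L than the raw observation Y is.
err_raw = np.linalg.norm(Y - L, "fro") / np.linalg.norm(L, "fro")
err_hat = np.linalg.norm(L_hat - L, "fro") / np.linalg.norm(L, "fro")
print(err_hat < err_raw)
```

The point of the sketch is the one emphasized in the text: knowing the algebraic structure (here, low rank) lets the estimator pool information across all entries of the matrix rather than treating each observation in isolation.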