Joan Bruna on "Mathematical aspects of neural network approximation and learning"
High-dimensional learning remains an outstanding phenomenon where experimental evidence outpaces our current mathematical understanding. Neural networks provide a rich yet intricate class of functions with the statistical ability to break the curse of dimensionality, and in which physical priors can be tightly integrated into the architecture to improve sample efficiency. Despite these advantages, an outstanding theoretical challenge for these models is computational: providing an analysis that explains successful optimization and generalization in the face of existing worst-case computational hardness results.
In this talk, we will describe two aspects of this challenge, covering optimization and approximation in turn. First, we will focus on the framework that lifts parameter optimization to an appropriate measure space. We will review existing results that guarantee global convergence of the resulting Wasserstein gradient flows, and present our recent results on the typical fluctuations of the dynamics around their mean-field evolution, as well as extensions of this framework beyond vanilla supervised learning to account for symmetries in the function. Next, we will discuss the role of depth in terms of approximation, and present novel results establishing so-called ‘depth separation’ for a broad class of functions. We will conclude by discussing consequences in terms of optimization, highlighting current and future mathematical challenges.
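For readers unfamiliar with the measure-space lift mentioned above, the following is a minimal sketch in standard mean-field notation. The symbols (the unit \varphi, the parameters \theta_i, the per-sample loss \ell, and the functional L) are assumptions chosen for illustration and are not taken from the talk itself; the exact setting studied by the authors may differ.

```latex
% Illustrative notation only (not from the abstract).
% A width-n two-layer network, written as an integral against the
% empirical measure of its parameters:
\[
  f(x;\mu_n) \;=\; \frac{1}{n}\sum_{i=1}^{n} \varphi(x;\theta_i)
  \;=\; \int \varphi(x;\theta)\,\mu_n(d\theta),
  \qquad
  \mu_n \;=\; \frac{1}{n}\sum_{i=1}^{n} \delta_{\theta_i}.
\]
% The training loss becomes a functional of the measure:
\[
  L(\mu) \;=\; \mathbb{E}_{(x,y)}\,\ell\bigl(f(x;\mu),\,y\bigr).
\]
% Gradient descent on the particles \theta_i corresponds, up to a
% rescaling of time, to a Wasserstein gradient flow on \mu_t:
\[
  \partial_t \mu_t \;=\; \nabla_\theta \cdot
  \Bigl( \mu_t \,\nabla_\theta \frac{\delta L}{\delta \mu}(\theta;\mu_t) \Bigr).
\]
```

In this notation, the network output is linear in the measure \mu, so L is convex in \mu whenever \ell is convex in its first argument, even though the loss is non-convex in the individual parameters \theta_i.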
Joint work with: Zhengdao Chen, Grant Rotskoff, Eric Vanden-Eijnden, Luca Venturi, Samy Jelassi and Aaron Zweig.