
Department of Applied Mathematics and Theoretical Physics

From the layer maps of neural networks to training procedures and reinforcement learning, compositions of transformations permeate modern AI. These compositional products often involve randomly selected maps, as in weight initialisation, stochastic gradient descent (SGD), and dropout. In reinforcement learning, Bellman-type operators with randomness are iterated to update reward structures and strategies. I will discuss the mathematics and geometry underlying the composition of random transformations. In particular, I will explain a general limit law established in joint work with Gouëzel. Moreover, I will discuss a possible cut-off phenomenon related to the depth of neural networks and the influence of iteration order. Motivated by these observations, and in collaboration with Avelin, Dherin, Gonzalvo, Mazzawi, and Munn, we propose backward variants of SGD that improve stability and convergence while maintaining generalisation.
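
To make the role of iteration order concrete, here is a minimal illustrative sketch (not code from the talk or the joint work mentioned above). It composes random affine contractions f_i(x) = a_i x + b_i, an assumed toy example, and contrasts forward compositions f_n ∘ … ∘ f_1, whose values keep fluctuating, with backward compositions f_1 ∘ … ∘ f_n, which settle towards a limit point. All parameters and map choices below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy random affine contractions f_i(x) = a_i * x + b_i with |a_i| < 1 (illustrative only).
n = 50
a = rng.uniform(0.2, 0.8, size=n)
b = rng.normal(size=n)

def f(i, x):
    return a[i] * x + b[i]

x0 = 0.0

# Forward composition: f_n o ... o f_1 (the newest map is applied last, outermost).
forward = []
x = x0
for i in range(n):
    x = f(i, x)
    forward.append(x)

# Backward composition: f_1 o ... o f_n (the newest map is applied first, innermost).
backward = []
for k in range(1, n + 1):
    y = x0
    for i in reversed(range(k)):
        y = f(i, y)
    backward.append(y)

print("last forward values :", np.round(forward[-5:], 4))   # keep fluctuating
print("last backward values:", np.round(backward[-5:], 4))  # settle to a limit
```

The contrast reflects a standard fact about iterated random functions: under contraction on average, backward compositions converge almost surely, while forward compositions typically converge only in distribution. This is an illustration of iteration order in general, not of the specific limit law or the backward SGD variants discussed in the talk.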

Further information

Time:

Jul 22nd 2025
14:50 to 15:40

Venue:

External

Speaker:

Anders Karlsson (Université de Genève)

Series:

Isaac Newton Institute Seminar Series