Understanding the interaction between the neural network design, the dataset, and gradient descent: the case of convolutions
Hannah Pinson (Eindhoven University of Technology)
Thursday 27th February, 14:00-15:00, Maths 116
Abstract
While the field of deep learning is thriving, we lack a robust mathematical understanding of its fundamental principles. Important “ingredients” of the deep learning process, such as the properties of the architecture or the learning algorithm, are often studied in isolation. However, a comprehensive framework for analyzing their interaction seems crucial for unlocking deeper insights into the learning process of neural networks. In this talk, I will introduce our novel mathematical framework designed to characterize the interplay between dataset properties, network architecture, and the dynamics of gradient descent for linear convolutional neural networks (CNNs). By leveraging concepts from dynamical systems, we derive analytical results that explain the highly nonlinear learning dynamics, revealing which data features are used and encoded within the network's parameters. Furthermore, I will present empirical evidence demonstrating the relevance of our theoretical findings to deep, nonlinear CNNs used in practical applications. This bridge between theory and practice suggests the potential of our framework as a basis for a more general mathematical theory of deep learning.
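To give a flavour of the kind of phenomenon the abstract refers to, the following is a minimal Python sketch, not the speaker's actual framework. It uses a standard toy model from the theory of deep linear networks: in a basis that decouples gradient descent (for circular convolutions, the Fourier modes play this role), each data "feature" evolves independently, and a two-layer linear model gives each mode a sigmoid-like learning curve. All names and parameters here (the mode powers, teacher coefficients, learning rate, and initialization) are illustrative assumptions.

```python
import numpy as np

K = 5                                             # number of data "features" (modes)
power = np.array([1.0, 0.5, 0.25, 0.12, 0.06])    # per-mode data power (variance); illustrative
w_star = np.ones(K)                               # teacher coefficient per mode; illustrative
lr, steps = 0.2, 400                              # gradient-descent hyperparameters

# Two-layer linear model per mode: the effective coefficient is c_k = a_k * b_k.
# Small, balanced initialization, as in standard analyses of deep linear networks.
a = np.full(K, 1e-3)
b = np.full(K, 1e-3)

history = []
for t in range(steps):
    c = a * b
    # Per-mode gradient of the expected squared error E[(c_k x_k - w*_k x_k)^2] / 2,
    # using E[x_k^2] = power_k; the modes do not interact, so the dynamics decouple.
    err = power * (c - w_star)
    a, b = a - lr * b * err, b - lr * a * err     # simultaneous gradient-descent updates
    history.append(c)

history = np.array(history)
for t in range(0, steps, 80):                     # high-power modes saturate first;
    print(t, np.round(history[t], 3))             # low-power modes plateau near zero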