Constrained and Layer-wise Training of Neural Networks
Dr Tiffany Vlaar (University of Glasgow)
Thursday 14th March, 14:00-15:00, Maths 311B / Zoom (ID: 894 0173 1730)
Abstract
My research aims to further our understanding of neural networks. The first part of the talk will focus on parameter constraints. Common techniques used to improve the generalisation performance of deep neural networks, such as L2 regularisation and batch normalisation, amount to imposing constraints on the network parameters, yet despite their widespread use they are often not well understood. In the talk I will describe an approach for efficiently incorporating hard constraints into a stochastic gradient Langevin dynamics framework. Our constraints offer direct control over the parameter space, which allows us to study their effect on generalisation. In the second part of the talk, I will focus on the role played by individual layers and substructures of neural networks: layers differ in their sensitivity to the choice of initialisation and optimiser hyperparameters, and training the layers of a network differently can lead to enhanced generalisation and/or reduced computational cost. Specifically, I will show that 1) a multirate approach can be used to train deep neural networks for transfer learning applications in half the time, without reducing the generalisation performance of the model, and 2) applying the sharpness-aware minimisation (SAM) technique solely to the normalisation layers of the network enhances generalisation, while providing computational savings.
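For readers unfamiliar with sharpness-aware minimisation, the Python sketch below illustrates the general idea of restricting the SAM perturbation to normalisation-layer parameters in PyTorch. It is a minimal, assumption-laden sketch and not the speaker's implementation: the model, loss_fn, optimizer and rho names are placeholders, and the model is assumed to contain at least one BatchNorm or LayerNorm layer.

    # Illustrative sketch: SAM-style perturbation applied only to
    # normalisation-layer parameters (not the speaker's code).
    import torch
    import torch.nn as nn

    def norm_params(model):
        # Collect parameters belonging to normalisation layers only.
        return [p for m in model.modules()
                if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.LayerNorm))
                for p in m.parameters()]

    def sam_norm_step(model, loss_fn, x, y, optimizer, rho=0.05):
        params = norm_params(model)  # assumes this list is non-empty

        # First pass: gradients at the current point.
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()

        # Perturb only the norm-layer parameters in the ascent direction.
        grad_norm = torch.norm(torch.stack(
            [p.grad.norm() for p in params if p.grad is not None]))
        eps = []
        with torch.no_grad():
            for p in params:
                e = None
                if p.grad is not None:
                    e = rho * p.grad / (grad_norm + 1e-12)
                    p.add_(e)
                eps.append(e)

        # Second pass: gradients at the perturbed point, as in SAM.
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()

        # Undo the perturbation, then step with the sharpness-aware gradients.
        with torch.no_grad():
            for p, e in zip(params, eps):
                if e is not None:
                    p.sub_(e)
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()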