Abstract: The proliferation of parameters in over-parametrized neural architectures, while not necessarily harmful to generalization, has driven the computational cost of training some state-of-the-art models beyond what is affordable, limiting fair access to research opportunities and deployment in low-resource environments. Because memory, time, and compute are limited, and to enable private, secure, on-device computation, model compression methods (such as pruning, quantization, and distillation) have risen in popularity. Despite the widespread deployment of pruned models, little is understood about their sparsity structure and trainability properties, especially as different pruning techniques are developed and applied. In the context of the "lottery ticket hypothesis", recently put forward by Frankle & Carbin (2018), this talk will discuss algorithms to empirically find lottery tickets (i.e., sparse sub-networks identified within dense architectures that can be successfully trained from scratch to state-of-the-art performance despite containing only a small fraction of the weights), their transferability properties, and the diversity in structure observed in lottery tickets obtained through different pruning methods.
Short bio: Michela Paganini is a Postdoctoral Researcher at Facebook AI Research in Menlo Park and an affiliate at Lawrence Berkeley National Lab. She joined Facebook in 2018 after earning her PhD in physics from Yale University. During her graduate studies, she worked on the design, development, and deployment of deep learning algorithms for the ATLAS experiment at CERN, with a focus on computer vision and generative modeling. Prior to that, in 2013, she graduated from the University of California, Berkeley with degrees in physics and astrophysics. Her current research focuses on empirically and theoretically characterizing neural network dynamics in the over-parameterized and under-parameterized regimes using pruning as a tool for model compression. She is broadly interested in the science of deep learning, with a focus on connecting emergent behavior in constrained networks to theoretical predictions.
When and where: 10:30 - Sala Seminari DISCO (DISCO seminar room)
More info: email@example.com