Room “Sala Seminari” - Abacus Building (U14)
From Zero-shot Learning to Training-free Generalization
Speaker
Dr Massimiliano Mancini
University of Trento
Abstract
Zero-shot learning is the ability of a machine learning model to recognize semantic concepts unseen during training. While the field has evolved over the years by making the setting more and more challenging, one principle has remained intact: to recognize unseen classes, we need to ground the (visual) input to a semantic description rather than to the class itself. This semantic description, in the form of language, is the key to generalization. Nowadays, with the advent of large multimodal models (LMMs), the concept of "unseen" is brittle and hard to define. Nevertheless, can this principle still be useful for semantic generalization? In this talk, we will briefly introduce zero-shot learning and the fundamental approaches to this task. We will then discuss LMMs and their capabilities. Finally, we will see how the principle of grounding visual information to language can be used to re-purpose LMMs to remove assumptions (e.g., in classification) and address tasks beyond the ones they were designed for (e.g., anomaly detection and compositional recognition) without requiring any training. We will conclude with a discussion of the pros and cons of this paradigm as well as promising future directions.
Short bio
Massimiliano Mancini is an ELLIS member and an Assistant Professor at the University of Trento. He completed his Ph.D. at the Sapienza University of Rome, co-advised by Barbara Caputo and Elisa Ricci. During his Ph.D., he was part of the TeV lab at Fondazione Bruno Kessler and the VANDAL lab at the Italian Institute of Technology, and was a visiting student at the KTH Royal Institute of Technology. After his Ph.D., he joined the University of Tübingen as a postdoc in the Explainable Machine Learning group led by Zeynep Akata. He serves as area chair for major conferences in the field (CVPR, ECCV, NeurIPS, ICRA) and as an associate/area editor for CVIU and TMLR. His research focuses on efficient transfer learning, cross-domain generalization, continual learning, automatic bias identification, and compositional reasoning.
Contact person for the seminar: alessandro.raganato@unimib.it