Author: "Pomponi, Jary" / Topic: neural networks - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Pomponi, Jary"' showing total 2 results


Author: Scardapane, Simone; Baiocchi, Alessandro; Devoto, Alessio; Marsocci, Valerio; Minervini, Pasquale; and Pomponi, Jary
Subjects: MODULAR design, SCIENTIFIC discoveries, DESIGN
Abstract: This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or de-activate parts of their computational graph conditionally on their input. Examples include the dynamic selection of, e.g., input tokens, layers (or sets of layers), and sub-modules inside each layer (e.g., channels in a convolutional filter). We first provide a general formalism to describe these techniques in a uniform way. Then, we introduce three notable implementations of these principles: mixture-of-experts (MoE) networks, token selection mechanisms, and early-exit neural networks. The paper aims to provide a tutorial-like introduction to this growing field. To this end, we analyze the benefits of these modular designs in terms of efficiency, explainability, and transfer learning, with a focus on emerging application areas ranging from automated scientific discovery to semantic communication. [ABSTRACT FROM AUTHOR]
Published: 2024

Author: Pomponi, Jary; Scardapane, Simone; and Uncini, Aurelio
Subjects: *MEMORY, *FOOTPRINTS, *ECOLOGICAL impact, *DEEP learning
Abstract: In this paper, we propose a novel ensembling technique for deep neural networks, which is able to drastically reduce the required memory compared to alternative approaches. In particular, we propose to extract multiple sub-networks from a single, untrained neural network by solving an end-to-end optimization task that combines differentiable scaling over the original architecture with multiple regularization terms favouring the diversity of the ensemble. Since our proposal aims to detect and extract sub-structures, we call it Structured Ensemble. In a large experimental evaluation, we show that our method can achieve higher or comparable accuracy to competing methods while requiring significantly less storage. In addition, we evaluate our ensembles in terms of predictive calibration and uncertainty, showing they compare favourably with the state-of-the-art. Finally, we draw a link with the continual learning literature, and we propose a modification of our framework to handle continuous streams of tasks with a sub-linear memory cost. We compare with a number of alternative strategies to mitigate catastrophic forgetting, highlighting advantages in terms of average accuracy and memory. [ABSTRACT FROM AUTHOR]
Published: 2021
