169 results for "Simone Scardapane"
Search Results
52. A Multimodal Deep Network for the Reconstruction of T2W MR Images.
- Author
- Antonio Falvo, Danilo Comminiello, Simone Scardapane, Michele Scarpiniti, and Aurelio Uncini
- Published
- 2017
- Full Text
- View/download PDF
53. Flexible Generative Adversarial Networks with Non-parametric Activation Functions.
- Author
- Eleonora Grassucci, Simone Scardapane, Danilo Comminiello, and Aurelio Uncini
- Published
- 2017
- Full Text
- View/download PDF
54. On the use of deep recurrent neural networks for detecting audio spoofing attacks.
- Author
- Simone Scardapane, Lucas Stoffl, Florian Röhrbein, and Aurelio Uncini
- Published
- 2017
- Full Text
- View/download PDF
55. In Codice Ratio: OCR of Handwritten Latin Documents using Deep Convolutional Networks.
- Author
- Donatella Firmani, Paolo Merialdo, Elena Nieddu, and Simone Scardapane
- Published
- 2017
56. Inferring 3D change detection from bitemporal optical images
- Author
- Valerio Marsocci, Virginia Coletta, Roberta Ravanelli, Simone Scardapane, and Mattia Crespi
- Subjects
FOS: Computer and information sciences; 3D change detection; remote sensing; deep learning; elevation change detection; dataset; Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); FOS: Electrical engineering, electronic engineering, information engineering; Atomic and Molecular Physics, and Optics; Computer Science Applications; Computers in Earth Sciences; Engineering (miscellaneous)
- Abstract
Change detection (CD) is one of the most active research areas in Remote Sensing (RS). Most recently developed change detection methods are based on deep learning (DL) algorithms. These algorithms generally focus on generating two-dimensional (2D) change maps: they identify planimetric changes in land use/land cover (LULC) but neither consider nor return any information on the corresponding elevation changes. Our work goes one step further, proposing two novel networks able to solve the 2D and 3D CD tasks simultaneously, and the 3DCD dataset, a novel and freely available dataset designed precisely for this multitask setting. In particular, the aim of this work is to lay the foundations for the development of DL algorithms able to automatically infer an elevation (3D) CD map, together with a standard 2D CD map, starting only from a pair of bitemporal optical images. The proposed architectures consist of a transformer-based network, the MultiTask Bitemporal Images Transformer (MTBIT), and a deep convolutional network, the Siamese ResUNet (SUNet). MTBIT is a transformer-based architecture built on a semantic tokenizer; SUNet instead combines skip connections and residual layers in a siamese encoder to learn rich features capable of solving the proposed task efficiently. These models can thus obtain 3D CD maps from two optical images taken at different time instants, without needing to rely directly on elevation data during the inference step. Encouraging results, obtained on the novel 3DCD dataset, are shown. The code and the 3DCD dataset are available at https://sites.google.com/uniroma1.it/3dchangedetection/home-page.
Comment: https://doi.org/10.1016/j.isprsjprs.2022.12.009
- Published
- 2023
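To make the multitask idea in the abstract above concrete, here is a minimal, hypothetical sketch of a joint 2D/3D change-detection objective: a classification term on the binary 2D change map plus a regression term on the elevation-change map. The function name, the choice of binary cross-entropy plus mean absolute error, and the balancing weight `lam` are our illustrative assumptions, not the paper's actual loss.

```python
import math

def multitask_cd_loss(pred2d, true2d, pred3d, true3d, lam=1.0):
    """Illustrative joint 2D/3D change-detection objective: binary
    cross-entropy on the 2D change map plus mean absolute error on the
    elevation-change map, balanced by `lam`. All inputs are flat lists
    over pixels; `pred2d` holds change probabilities in [0, 1]."""
    eps = 1e-12  # guard against log(0)
    bce = -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
               for p, t in zip(pred2d, true2d)) / len(pred2d)
    mae = sum(abs(p - t) for p, t in zip(pred3d, true3d)) / len(pred3d)
    return bce + lam * mae
```

With perfect predictions both terms vanish, so the loss is (numerically) zero; `lam` trades off how much the elevation regression dominates the planimetric classification.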
57. Parallel and distributed training of neural networks via successive convex approximation.
- Author
- Paolo Di Lorenzo and Simone Scardapane
- Published
- 2016
- Full Text
- View/download PDF
58. Diffusion spline adaptive filtering.
- Author
- Simone Scardapane, Michele Scarpiniti, Danilo Comminiello, and Aurelio Uncini
- Published
- 2016
- Full Text
- View/download PDF
59. Distributed spectral clustering based on Euclidean distance matrix completion.
- Author
- Simone Scardapane, Rosa Altilio, Massimo Panella, and Aurelio Uncini
- Published
- 2016
- Full Text
- View/download PDF
60. A Comparison of Consensus Strategies for Distributed Learning of Random Vector Functional-Link Networks.
- Author
- Roberto Fierimonte, Simone Scardapane, Massimo Panella, and Aurelio Uncini
- Published
- 2016
- Full Text
- View/download PDF
61. Benchmarking Functional Link Expansions for Audio Classification Tasks.
- Author
- Simone Scardapane, Danilo Comminiello, Michele Scarpiniti, Raffaele Parisi, and Aurelio Uncini
- Published
- 2016
- Full Text
- View/download PDF
62. A Nonlinear Acoustic Echo Canceller with Improved Tracking Capabilities.
- Author
- Danilo Comminiello, Michele Scarpiniti, Simone Scardapane, Raffaele Parisi, and Aurelio Uncini
- Published
- 2016
- Full Text
- View/download PDF
63. Bidirectional deep-readout echo state networks.
- Author
- Filippo Maria Bianchi, Simone Scardapane, Sigurd Løkse, and Robert Jenssen
- Published
- 2018
64. Learning from Distributed Data Sources Using Random Vector Functional-Link Networks.
- Author
- Simone Scardapane, Massimo Panella, Danilo Comminiello, and Aurelio Uncini
- Published
- 2015
- Full Text
- View/download PDF
65. Functional link expansions for nonlinear modeling of audio and speech signals.
- Author
- Danilo Comminiello, Simone Scardapane, Michele Scarpiniti, Raffaele Parisi, and Aurelio Uncini
- Published
- 2015
- Full Text
- View/download PDF
66. Distributed music classification using Random Vector Functional-Link nets.
- Author
- Simone Scardapane, Roberto Fierimonte, Dianhui Wang, Massimo Panella, and Aurelio Uncini
- Published
- 2015
- Full Text
- View/download PDF
67. Online Selection of Functional Links for Nonlinear System Identification.
- Author
- Danilo Comminiello, Simone Scardapane, Michele Scarpiniti, Raffaele Parisi, and Aurelio Uncini
- Published
- 2015
- Full Text
- View/download PDF
68. Significance-Based Pruning for Reservoir's Neurons in Echo State Networks.
- Author
- Simone Scardapane, Danilo Comminiello, Michele Scarpiniti, and Aurelio Uncini
- Published
- 2015
- Full Text
- View/download PDF
69. FairDrop: Biased Edge Dropout for Enhancing Fairness in Graph Representation Learning
- Author
- Indro Spinelli, Simone Scardapane, Amir Hussain, and Aurelio Uncini
- Subjects
FOS: Computer and information sciences; Machine Learning (cs.LG); Machine Learning (stat.ML); representation learning; social networking (online); topology; task analysis; measurement; artificial intelligence; prediction algorithms
- Abstract
Graph representation learning has become a ubiquitous component in many scenarios, ranging from social network analysis to energy forecasting in smart grids. In several applications, ensuring the fairness of the node (or graph) representations with respect to some protected attributes is crucial for their correct deployment. Yet, fairness in graph deep learning remains under-explored, with few solutions available. In particular, the tendency of similar nodes to cluster on several real-world graphs (i.e., homophily) can dramatically worsen the fairness of these procedures. In this paper, we propose a novel biased edge dropout algorithm (FairDrop) to counteract homophily and improve fairness in graph representation learning. FairDrop can be plugged easily into many existing algorithms, is efficient and adaptable, and can be combined with other fairness-inducing solutions. After describing the general algorithm, we demonstrate its application on two benchmark tasks, specifically as a random walk model for producing node embeddings and as a graph convolutional network for link prediction. We prove that the proposed algorithm can successfully improve the fairness of all models up to a small or negligible drop in accuracy, and that it compares favourably with existing state-of-the-art solutions. In an ablation study, we demonstrate that our algorithm can flexibly interpolate between biasing towards fairness and an unbiased edge dropout. Furthermore, to better evaluate the gains, we propose a new dyadic group definition to measure the bias of a link prediction task when paired with group-based fairness metrics. In particular, we extend the metric used to measure the bias in the node embeddings to take into account the graph structure.
Comment: Submitted to a journal for the peer-review process
- Published
- 2022
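As a rough illustration of the biased edge dropout idea in the abstract above, edges connecting nodes that share the protected attribute (homophilous edges) can be dropped with higher probability than heterophilous ones. This is a hypothetical sketch with our own names and fixed probabilities, not the authors' FairDrop implementation, which interpolates between fair and unbiased dropout via a tunable bias.

```python
import random

def fair_dropout(edges, attr, p_homo=0.7, p_hetero=0.3, seed=0):
    """Biased edge dropout (illustrative): edges whose endpoints share
    the protected attribute are dropped with probability `p_homo`,
    the others with probability `p_hetero`, counteracting homophily.

    edges: list of (u, v) pairs; attr: dict mapping node -> protected
    attribute value. Returns the retained edge list."""
    rng = random.Random(seed)  # seeded for reproducibility
    kept = []
    for u, v in edges:
        p_drop = p_homo if attr[u] == attr[v] else p_hetero
        if rng.random() >= p_drop:
            kept.append((u, v))
    return kept
```

Setting `p_homo=1.0, p_hetero=0.0` makes the bias extreme: only edges across protected groups survive, which is the direction in which such a dropout counteracts homophily.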
70. New trends in urban change detection: detecting 3D changes from bitemporal optical images
- Author
- Valerio Marsocci, Virginia Coletta, Roberta Ravanelli, Simone Scardapane, and Mattia Crespi
- Abstract
Keywords: Urban sustainability, Earth observation, 3D change detection, Deep Learning, Dataset

Nowadays, remote sensing products can provide useful and consistent information about urban areas and their morphological structures at different spatial and temporal resolutions, making it possible to perform long-term spatiotemporal analyses of the historic development of cities and thereby monitor the evolution of their urbanization patterns, a goal strictly related to the United Nations (UN) Sustainable Development Goals (SDGs) concerning the sustainability of cities (SDG 11 - Sustainable Cities and Communities).

In this context, Change Detection (CD) algorithms estimate the changes occurring at ground level and are employed in a wide range of applications, including the identification of urban changes. Most recently developed CD methodologies rely on deep learning architectures. Nevertheless, the CD algorithms currently available are mainly focused on generating two-dimensional (2D) change maps, where the planimetric extent of the areas affected by changes is identified without providing any information on the corresponding elevation (3D) variations. These algorithms can thus only identify planimetric changes, such as appearing/disappearing buildings/trees or shrinking/expanding structures, and cannot satisfy the requirements of applications that need to detect and, above all, quantify the elevation variations occurring in the area of interest (AOI), such as the estimation of volumetric changes in urban areas. It is therefore essential to develop CD algorithms capable of automatically generating an elevation (3D) CD map (a map containing the quantitative changes in elevation for the AOI) together with a standard 2D CD map, from the smallest possible amount of information.

In this contribution, we will present the MultiTask Bitemporal Images Transformer (MTBIT) [1], a recently developed network, belonging to the family of vision Transformers and based on a semantic tokenizer, explicitly designed to solve the 2D and 3D CD tasks simultaneously from bitemporal optical images, and thus without the need to rely directly on elevation data during the inference phase. The MTBIT performances were evaluated in the urban area of Valladolid on the modified version of the 3DCD dataset [2], comparing this architecture with other networks designed to solve the 2D CD task. In particular, MTBIT reaches a metric accuracy equal to 6.46 m, the best performance among the compared architectures, with a limited number of parameters (13.1 M) [1]. The code and the 3DCD dataset are available at https://sites.google.com/uniroma1.it/3dchangedetection/home-page.

References
[1] Marsocci, V., Coletta, V., Ravanelli, R., Scardapane, S., and Crespi, M.: Inferring 3D change detection from bitemporal optical images. ISPRS Journal of Photogrammetry and Remote Sensing, 2023 - in press.
[2] Coletta, V., Marsocci, V., and Ravanelli, R.: 3DCD: a new dataset for 2D and 3D change detection using deep learning techniques. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLIII-B3-2022, 1349-1354, 2022.
- Published
- 2023
71. On the robustness of vision transformers for in-flight monocular depth estimation
- Author
- Simone Ercolino, Alessio Devoto, Luca Monorchio, Matteo Santini, Silvio Mazzaro, and Simone Scardapane
- Abstract
Monocular depth estimation (MDE) has shown impressive performance recently, even in zero-shot or few-shot scenarios. In this paper, we consider the use of MDE on board low-altitude drone flights, which is required in a number of safety-critical and monitoring operations. In particular, we evaluate a state-of-the-art vision transformer (ViT) variant, pre-trained on a massive MDE dataset. We test it both in a zero-shot scenario and after fine-tuning on a dataset of flight records, and compare its performance to that of a classical fully convolutional network. In addition, we evaluate for the first time whether these models are susceptible to adversarial attacks, by optimizing a small adversarial patch that generalizes across scenarios. We investigate several variants of losses for this task, including weighted error losses in which we can customize the design of the patch to selectively decrease the performance of the model on a desired depth range. Overall, our results highlight that (a) ViTs can outperform convolutional models in this context after proper fine-tuning, and (b) they appear to be more robust to adversarial attacks designed in the form of patches, which is a crucial property for this family of tasks.
- Published
- 2023
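The "weighted error loss" idea mentioned in the abstract above can be sketched in a few lines. This is a hypothetical illustration (function name and weighting scheme are ours, not the paper's): errors on pixels whose ground-truth depth falls inside a chosen range get a larger weight, so a patch optimized to maximize this loss degrades the model selectively on that range.

```python
def range_weighted_depth_loss(pred, true, d_min, d_max, w_in=1.0, w_out=0.0):
    """Illustrative depth-error objective weighted by ground-truth depth.

    `pred` and `true` are flat lists of per-pixel depths; pixels whose
    true depth lies in [d_min, d_max] are weighted by `w_in`, the rest
    by `w_out`. Returns the weighted mean absolute error."""
    num = den = 0.0
    for p, t in zip(pred, true):
        w = w_in if d_min <= t <= d_max else w_out
        num += w * abs(p - t)
        den += w
    return num / den if den else 0.0
```

With `w_out=0.0` the loss ignores everything outside the targeted depth band, which is the selective behaviour the abstract describes.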
72. GP-based kernel evolution for L2-Regularization Networks.
- Author
- Simone Scardapane, Danilo Comminiello, Michele Scarpiniti, and Aurelio Uncini
- Published
- 2014
- Full Text
- View/download PDF
73. An effective criterion for pruning reservoir's connections in Echo State Networks.
- Author
- Simone Scardapane, Gabriele Nocco, Danilo Comminiello, Michele Scarpiniti, and Aurelio Uncini
- Published
- 2014
- Full Text
- View/download PDF
74. An interpretable graph-based image classifier.
- Author
- Filippo Maria Bianchi, Simone Scardapane, Lorenzo Livi, Aurelio Uncini, and Antonello Rizzi
- Published
- 2014
- Full Text
- View/download PDF
75. Structured Ensembles: An approach to reduce the memory footprint of ensemble methods
- Author
- Simone Scardapane, Jary Pomponi, and Aurelio Uncini
- Subjects
FOS: Computer and information sciences; Machine Learning (cs.LG); Machine Learning (stat.ML); Artificial Intelligence (cs.AI); continual learning; deep learning; ensemble learning; neural networks; structured pruning; uncertainty; regularization; memory footprint; forgetting; Cognitive Neuroscience
- Abstract
In this paper, we propose a novel ensembling technique for deep neural networks, which is able to drastically reduce the required memory compared to alternative approaches. In particular, we propose to extract multiple sub-networks from a single, untrained neural network by solving an end-to-end optimization task combining differentiable scaling over the original architecture, with multiple regularization terms favouring the diversity of the ensemble. Since our proposal aims to detect and extract sub-structures, we call it Structured Ensemble. On a large experimental evaluation, we show that our method can achieve higher or comparable accuracy to competing methods while requiring significantly less storage. In addition, we evaluate our ensembles in terms of predictive calibration and uncertainty, showing they compare favourably with the state-of-the-art. Finally, we draw a link with the continual learning literature, and we propose a modification of our framework to handle continuous streams of tasks with a sub-linear memory cost. We compare with a number of alternative strategies to mitigate catastrophic forgetting, highlighting advantages in terms of average accuracy and memory.
Comment: Article accepted at Neural Networks
- Published
- 2021
76. A Preliminary Study on Transductive Extreme Learning Machines.
- Author
- Simone Scardapane, Danilo Comminiello, Michele Scarpiniti, and Aurelio Uncini
- Published
- 2013
- Full Text
- View/download PDF
77. Proportionate Algorithms for Blind Source Separation.
- Author
- Michele Scarpiniti, Danilo Comminiello, Simone Scardapane, Raffaele Parisi, and Aurelio Uncini
- Published
- 2013
- Full Text
- View/download PDF
78. Convex combination of MIMO filters for multichannel acoustic echo cancellation.
- Author
- Danilo Comminiello, Simone Scardapane, Michele Scarpiniti, Raffaele Parisi, and Aurelio Uncini
- Published
- 2013
- Full Text
- View/download PDF
79. Music classification using extreme learning machines.
- Author
- Simone Scardapane, Danilo Comminiello, Michele Scarpiniti, and Aurelio Uncini
- Published
- 2013
- Full Text
- View/download PDF
80. Interactive quality enhancement in acoustic echo cancellation.
- Author
- Danilo Comminiello, Simone Scardapane, Michele Scarpiniti, and Aurelio Uncini
- Published
- 2013
- Full Text
- View/download PDF
81. PM10 Forecasting Using Kernel Adaptive Filtering: An Italian Case Study.
- Author
- Simone Scardapane, Danilo Comminiello, Michele Scarpiniti, Raffaele Parisi, and Aurelio Uncini
- Published
- 2012
- Full Text
- View/download PDF
82. A Calibrated Multiexit Neural Network for Detecting Urothelial Cancer Cells
- Author
- Simone Scardapane, Enrico Giarnieri, and L. Lilli
- Subjects
deep convolutional network; cancer; cytopathology; urothelial cell; calibration; deep learning; medical imaging; artificial neural network; Image Interpretation, Computer-Assisted; Computational Biology; Urinary Bladder Neoplasms; Urothelium; Neural Networks, Computer; Algorithms; Humans; General Biochemistry, Genetics and Molecular Biology; General Immunology and Microbiology; Applied Mathematics; Modeling and Simulation; General Medicine; Research Article
- Abstract
Deep convolutional networks have become a powerful tool for medical imaging diagnostics. In pathology, most efforts have focused on the subfield of histology, while cytopathology (which studies diagnostic tools at the cellular level) remains underexplored. In this paper, we propose a novel deep learning model for cancer detection from urinary cytopathology screening images. We leverage recent ideas from the field of multioutput neural networks to provide a model that can efficiently train even on small-scale datasets, such as those typically found in real-world scenarios. Additionally, we argue that calibration (i.e., providing confidence levels that are aligned with the ground truth probability of an event) has been a major shortcoming of prior works, and we experiment with a number of techniques to provide a well-calibrated model. We evaluate the proposed algorithm on a novel dataset, and we show that the combination of focal loss, multiple outputs, and temperature scaling provides a model that is significantly more accurate and calibrated than a baseline deep convolutional network.
- Published
- 2021
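Temperature scaling, one of the calibration techniques the abstract above combines, simply divides the logits by a temperature T (fitted on a validation set) before the softmax: T > 1 softens over-confident predictions without changing the predicted class. A minimal, self-contained sketch (illustrative, not the paper's code):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature scaling: logits are divided by
    `temperature` before normalisation. T > 1 flattens the output
    distribution (less confident), T < 1 sharpens it; the argmax,
    and hence the predicted class, is unchanged."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Because only the spread of the probabilities changes, accuracy is untouched while calibration metrics (e.g., expected calibration error) can improve.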
83. Distributed Training of Graph Convolutional Networks
- Author
- Paolo Di Lorenzo, Simone Scardapane, and Indro Spinelli
- Subjects
FOS: Computer and information sciences; Machine Learning (cs.LG); Machine Learning (stat.ML); Neural and Evolutionary Computing (cs.NE); graph convolutional networks; distributed optimization; consensus; network topology; convolutional neural network; gradient descent; inference; convergence; Signal Processing; Computer Networks and Communications; Information Systems
- Abstract
The aim of this work is to develop a fully-distributed algorithmic framework for training graph convolutional networks (GCNs). The proposed method is able to exploit the meaningful relational structure of the input data, which are collected by a set of agents that communicate over a sparse network topology. After formulating the centralized GCN training problem, we first show how to make inference in a distributed scenario where the underlying data graph is split among different agents. Then, we propose a distributed gradient descent procedure to solve the GCN training problem. The resulting model distributes computation along three lines: during inference, during back-propagation, and during optimization. Convergence to stationary solutions of the GCN training problem is also established under mild conditions. Finally, we propose an optimization criterion to design the communication topology between agents in order to match with the graph describing data relationships. A wide set of numerical results validate our proposal. To the best of our knowledge, this is the first work combining graph convolutional neural networks with distributed optimization.
Comment: Published in IEEE Transactions on Signal and Information Processing over Networks
- Published
- 2021
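The consensus component of distributed training schemes like the one above can be sketched as a gossip-averaging step: each agent replaces its local parameters with a weighted average of its neighbours', and repeated steps drive all agents toward the network-wide average. This is a generic illustration under the assumption of a doubly stochastic mixing matrix, not the paper's exact procedure.

```python
def consensus_step(params, weights):
    """One consensus (gossip) averaging step over a network of agents.

    params: list of per-agent parameter vectors (lists of floats).
    weights: mixing matrix, weights[i][j] = influence of agent j on
    agent i; assumed doubly stochastic so repeated application
    converges to the average of the initial parameters."""
    n, dim = len(params), len(params[0])
    return [
        [sum(weights[i][j] * params[j][k] for j in range(n)) for k in range(dim)]
        for i in range(n)
    ]
```

In actual distributed optimization, each agent would interleave such a mixing step with a local gradient step on its own data shard.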
84. Continual Barlow Twins: continual self-supervised learning for remote sensing semantic segmentation
- Author
- Valerio Marsocci and Simone Scardapane
- Subjects
FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV); Atmospheric Science; Computers in Earth Sciences
- Abstract
In the field of Earth Observation (EO), Continual Learning (CL) algorithms have been proposed to deal with large datasets by decomposing them into several subsets and processing them incrementally. The majority of these algorithms assume that data is (a) coming from a single source, and (b) fully labeled. Real-world EO datasets are instead characterized by a large heterogeneity (e.g., coming from aerial, satellite, or drone scenarios), and for the most part they are unlabeled, meaning they can be fully exploited only through the emerging Self-Supervised Learning (SSL) paradigm. For these reasons, in this paper we propose a new algorithm for merging SSL and CL for remote sensing applications, which we call Continual Barlow Twins (CBT). It combines the advantages of one of the simplest self-supervision techniques, i.e., Barlow Twins, with the Elastic Weight Consolidation method to avoid catastrophic forgetting. In addition, for the first time we evaluate SSL methods on a highly heterogeneous EO dataset, showing the effectiveness of these strategies on a novel combination of three datasets from almost non-overlapping domains (the airborne Potsdam dataset, the satellite US3D dataset, and the drone UAVid dataset), on a crucial downstream task in EO, i.e., semantic segmentation. Encouraging results show the superiority of SSL in this setting, and the effectiveness of creating an incremental effective pretrained feature extractor, based on ResNet50, without relying on the complete availability of all the data, with a valuable saving of time and resources.
- Published
- 2022
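The Barlow Twins objective at the core of CBT pushes the cross-correlation matrix between the (batch-normalised) embeddings of two augmented views toward the identity: diagonal terms enforce invariance across views, off-diagonal terms decorrelate features (redundancy reduction). A small self-contained sketch, with our own normalisation details rather than the original implementation:

```python
def barlow_twins_loss(za, zb, lam=0.005):
    """Illustrative Barlow Twins loss.

    za, zb: lists of embedding vectors (one per sample) from two
    augmented views of the same batch. Each feature column is
    standardised over the batch, the cross-correlation matrix C is
    computed, and the loss sums (1 - C_ii)^2 on the diagonal plus
    lam * C_ij^2 off the diagonal."""
    n, d = len(za), len(za[0])

    def normalise(z):
        cols = []
        for k in range(d):
            col = [row[k] for row in z]
            mu = sum(col) / n
            std = (sum((x - mu) ** 2 for x in col) / n) ** 0.5 or 1.0
            cols.append([(x - mu) / std for x in col])
        return cols  # column-major: d lists of length n

    a, b = normalise(za), normalise(zb)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c_ij = sum(a[i][s] * b[j][s] for s in range(n)) / n
            loss += (1.0 - c_ij) ** 2 if i == j else lam * c_ij ** 2
    return loss
```

Identical views with already-decorrelated features give zero loss; correlated (redundant) features are penalised through the off-diagonal term.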
85. A Meta-Learning Approach for Training Explainable Graph Neural Networks
- Author
- Indro Spinelli, Simone Scardapane, and Aurelio Uncini
- Subjects
FOS: Computer and information sciences; Machine Learning (cs.LG); Machine Learning (stat.ML); Artificial Intelligence; Computer Networks and Communications; Software; Computer Science Applications
- Abstract
In this article, we investigate the degree of explainability of graph neural networks (GNNs). The existing explainers work by finding global/local subgraphs to explain a prediction, but they are applied after a GNN has already been trained. Here, we propose a meta-explainer for improving the level of explainability of a GNN directly at training time, by steering the optimization procedure toward minima that allow post hoc explainers to achieve better results, without sacrificing the overall accuracy of the GNN. Our framework (called MATE, MetA-Train to Explain) jointly trains a model to solve the original task, e.g., node classification, and to provide easily processable outputs for downstream algorithms that explain the model's decisions in a human-friendly way. In particular, we meta-train the model's parameters to quickly minimize the error of an instance-level GNNExplainer trained on-the-fly on randomly sampled nodes. The final internal representation relies on a set of features that can be "better" understood by an explanation algorithm, e.g., another instance of GNNExplainer. Our model-agnostic approach can improve the explanations produced for different GNN architectures and use any instance-based explainer to drive this process. Experiments on synthetic and real-world datasets for node and graph classification show that we can produce models that are consistently easier to explain by different algorithms. Furthermore, this increase in explainability comes at no cost to the accuracy of the model.
- Published
- 2022
86. Learning Speech Emotion Representations in the Quaternion Domain
- Author
- Danilo Comminiello, Simone Scardapane, Eric Guizzo, and Tillman Weyde
- Subjects
FOS: Computer and information sciences; Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Acoustics and Ultrasonics; Computational Mathematics; Computer Science (miscellaneous); Electrical and Electronic Engineering
- Abstract
The modeling of human emotion expression in speech signals is an important, yet challenging task. The high resource demand of speech emotion recognition models, combined with the general scarcity of emotion-labelled data, are obstacles to the development and application of effective solutions in this field. In this paper, we present an approach to jointly circumvent these difficulties. Our method, named RH-emo, is a novel semi-supervised architecture aimed at extracting quaternion embeddings from real-valued monoaural spectrograms, enabling the use of quaternion-valued networks for speech emotion recognition tasks. RH-emo is a hybrid real/quaternion autoencoder network that consists of a real-valued encoder in parallel to a real-valued emotion classifier and a quaternion-valued decoder. On the one hand, the classifier permits to optimize each latent axis of the embeddings for the classification of a specific emotion-related characteristic: valence, arousal, dominance and overall emotion. On the other hand, the quaternion reconstruction enables the latent dimension to develop intra-channel correlations that are required for an effective representation as a quaternion entity. We test our approach on speech emotion recognition tasks using four popular datasets: Iemocap, Ravdess, EmoDb and Tess, comparing the performance of three well-established real-valued CNN architectures (AlexNet, ResNet-50, VGG) and their quaternion-valued equivalents fed with the embeddings created with RH-emo. We obtain a consistent improvement in the test accuracy for all datasets, while drastically reducing the resource demands of the models. Moreover, we performed additional experiments and ablation studies that confirm the effectiveness of our approach. The RH-emo repository is available at: https://github.com/ispamm/rhemo.
Comment: Accepted for publication in IEEE/ACM Transactions on Audio, Speech and Language Processing
- Published
- 2022
- Full Text
- View/download PDF
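Quaternion-valued networks of the kind discussed above process four channels jointly through the Hamilton product, which is what lets quaternion layers share parameters across the four components and capture intra-channel correlations. For reference, a self-contained sketch of the product itself (a standard definition, not code from the paper):

```python
def hamilton_product(q1, q2):
    """Hamilton product of two quaternions given as (w, x, y, z)
    tuples. Quaternion neural layers use this product to mix the four
    components of an input with the four components of a weight."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)
```

The product is non-commutative (i * j = k but j * i = -k), which is why quaternion layers are sensitive to the ordering of inputs and weights.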
87. A Probabilistic Re-Interpretation of Confidence Scores in Multi-Exit Models
- Author
- Aurelio Uncini, Simone Scardapane, and Jary Pomponi
- Subjects
branch neural networks; deep learning; deep neural networks; adaptive computation; fast inference; General Physics and Astronomy
- Abstract
In this paper, we propose a new approach to train a deep neural network with multiple intermediate auxiliary classifiers branching from it. These 'multi-exit' models can be used to reduce the inference time by performing early exit on the intermediate branches if the confidence of the prediction is higher than a threshold. They rely on the assumption that not all the samples require the same amount of processing to yield a good prediction. In this paper, we propose a way to jointly train all the branches of a multi-exit model without hyper-parameters, by weighting the predictions from each branch with a trained confidence score. Each confidence score is an approximation of the real one produced by the branch, and it is calculated and regularized while training the rest of the model. We evaluate our proposal on a set of image classification benchmarks, using different neural models and early-exit stopping criteria.
- Published
- 2021
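The early-exit rule described in the abstract above can be sketched in a few lines. This is a generic illustration of threshold-based early exit, not the authors' trained-confidence variant (which weights each branch's prediction by a learned confidence score):

```python
def early_exit(branch_outputs, threshold=0.9):
    """Threshold-based early exit for a multi-exit model (illustrative).

    branch_outputs: per-branch probability vectors, ordered from the
    earliest (cheapest) branch to the final classifier. Scans the
    branches in order and stops at the first whose confidence (max
    probability) clears `threshold`; otherwise falls through to the
    last branch. Returns (exit_index, predicted_class)."""
    for i, probs in enumerate(branch_outputs):
        if max(probs) >= threshold:
            return i, probs.index(max(probs))
    last = branch_outputs[-1]
    return len(branch_outputs) - 1, last.index(max(last))
```

Lowering the threshold trades accuracy for speed: easy samples exit at shallow branches, hard ones pay the full inference cost.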
88. Avalanche. An end-to-end library for continual learning
- Author
- Simone Scardapane, Martin Mundt, Tyler L. Hayes, Simone Calderara, Keiland W. Cooper, Christopher Kanan, Eden Belouadah, Lorenzo Pellegrini, Adrian Popescu, Matthias De Lange, Fabio Cuzzolin, Jeremy Forest, Jary Pomponi, Subutai Ahmad, Qi She, Luca Antiga, Gido M. van de Ven, Davide Maltoni, Davide Bacciu, Vincenzo Lomonaco, Joost van de Weijer, Marc Masana, Antonio Carta, Gabriele Graffieti, Andreas S. Tolias, German Ignacio Parisi, Andrea Cossu, and Tinne Tuytelaars
- Subjects
FOS: Computer and information sciences; Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); continual learning; software library; software engineering; learning algorithms; codebase; data stream mining; deep learning; end-to-end principle
- Abstract
Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.
Comment: Official website: https://avalanche.continualai.org
- Published
- 2021
89. Guest Editorial: Trends in Reservoir Computing
- Author
-
Miguel C. Soriano, Claudio Gallicchio, Alessio Micheli, and Simone Scardapane
- Subjects
Computer science ,Cognitive Neuroscience ,Reservoir computing ,Computer Vision and Pattern Recognition ,Data science ,Computer Science Applications
- Published
- 2021
90. MARE: Self-supervised multi-attention REsu-net for semantic segmentation in remote sensing
- Author
-
Nikos Komodakis, Simone Scardapane, and Valerio Marsocci
- Subjects
Self supervised learning ,Artificial neural network ,Computer science ,Science ,deep learning ,Land cover ,Net (mathematics) ,vaihingen dataset ,semantic segmentation ,Task (project management) ,linear attention ,self-supervised learning ,remote sensing ,Bag-of-words model ,Remote sensing (archaeology) ,General Earth and Planetary Sciences ,Segmentation ,Remote sensing - Abstract
Scene understanding of satellite and aerial images is a pivotal task in various remote sensing (RS) practices, such as land cover and urban development monitoring. In recent years, neural networks have become a de-facto standard in many of these applications. However, semantic segmentation remains a challenging task. Compared to other computer vision (CV) areas, large labeled datasets are rarely available in RS, due to their high cost and the manpower required. On the other hand, self-supervised learning (SSL) is gaining more and more interest in CV, reaching state-of-the-art performance in several tasks. In spite of this, most SSL models, pretrained on huge datasets like ImageNet, do not perform particularly well on RS data. For this reason, we propose a combination of an SSL algorithm (specifically, Online Bag of Words) and a semantic segmentation algorithm shaped for aerial images (namely, Multistage Attention ResU-Net), showing new encouraging results (i.e., 81.76% mIoU with a ResNet-18 backbone) on the ISPRS Vaihingen dataset.
- Published
- 2021
91. Bayesian neural networks with maximum mean discrepancy regularization
- Author
-
Aurelio Uncini, Jary Pomponi, and Simone Scardapane
- Subjects
FOS: Computer and information sciences ,0209 industrial biotechnology ,Computer Science - Machine Learning ,Optimization problem ,maximum mean discrepancy ,Computer science ,Calibration (statistics) ,Cognitive Neuroscience ,bayesian learning ,Monte Carlo method ,Inference ,Machine Learning (stat.ML) ,02 engineering and technology ,variational approximation ,entropy ,Upper and lower bounds ,Regularization (mathematics) ,Machine Learning (cs.LG) ,Differential entropy ,020901 industrial engineering & automation ,Artificial Intelligence ,Statistics - Machine Learning ,0202 electrical engineering, electronic engineering, information engineering ,Divergence (statistics) ,Interpretability ,Estimator ,Computer Science Applications ,Maximum mean discrepancy ,020201 artificial intelligence & image processing ,Algorithm - Abstract
Bayesian Neural Networks (BNNs) are trained to optimize an entire distribution over their weights instead of a single set, with significant advantages in terms of, e.g., interpretability, multi-task learning, and calibration. Because of the intractability of the resulting optimization problem, most BNNs are either sampled through Monte Carlo methods, or trained by minimizing a suitable Evidence Lower BOund (ELBO) on a variational approximation. In this paper, we propose an optimized version of the latter, wherein we replace the Kullback–Leibler divergence in the ELBO term with a Maximum Mean Discrepancy (MMD) estimator, inspired by recent work in variational inference. After motivating our proposal based on the properties of the MMD term, we proceed to show a number of empirical advantages of the proposed formulation over the state-of-the-art. In particular, our BNNs achieve higher accuracy on multiple benchmarks, including several image classification tasks. In addition, they are more robust to the selection of a prior over the weights, and they are better calibrated. As a second contribution, we provide a new formulation for estimating the uncertainty of a given prediction, showing that it is more robust against adversarial attacks and the injection of input noise than more classical criteria such as the differential entropy.
- Published
- 2021
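The key substitution described in the abstract above — replacing the KL term of the ELBO with a sample-based MMD estimator between posterior and prior samples — can be sketched in a toy scalar form (a minimal illustration with a Gaussian kernel, not the paper's implementation; all names are illustrative):

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """RBF kernel between two scalar samples."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Unbiased estimator of the squared MMD between two sample sets.

    In the paper's setting, xs would be weight samples drawn from the
    variational posterior and ys samples from the prior; the resulting
    value would take the place of the KL term in the ELBO."""
    n, m = len(xs), len(ys)
    k_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
               for i in range(n) for j in range(n) if i != j)
    k_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
               for i in range(m) for j in range(m) if i != j)
    k_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (n * (n - 1)) + k_yy / (m * (m - 1)) - 2.0 * k_xy / (n * m)
```

The estimator is near zero when the two sample sets come from the same distribution (it can even be slightly negative, being unbiased) and grows as the distributions separate, which is what makes it usable as a divergence-like penalty.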
92. A wide multimodal dense U-net for fast magnetic resonance imaging
- Author
-
Simone Scardapane, Danilo Comminiello, Antonio Falvo, Michele Scarpiniti, and Aurelio Uncini
- Subjects
Signal processing ,medicine.diagnostic_test ,Computer science ,business.industry ,deep neural network ,fast MRI ,MR image reconstruction ,multimodal dense U-Net ,multiple sclerosis ,Deep learning ,020206 networking & telecommunications ,Magnetic resonance imaging ,02 engineering and technology ,Iterative reconstruction ,0202 electrical engineering, electronic engineering, information engineering ,medicine ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,Focus (optics) ,business - Abstract
In this paper, a deep learning method for accelerating magnetic resonance imaging (MRI) is presented, which is able to reconstruct undersampled MR images obtained by reducing the k-space data along the phase-encoding direction. In particular, we focus on the reconstruction of MR images related to patients affected by multiple sclerosis (MS), and we propose a new multimodal deep learning architecture that is able to exploit the joint information deriving from the combination of different types of MR images and to accelerate the MRI procedure, while providing high quality in the reconstructed image. Experimental results show the performance improvement of the proposed method with respect to existing models in reconstructing images with an MRI acceleration factor of 4.
- Published
- 2021
93. A multimodal deep network for the reconstruction of T2W MR images
- Author
-
Antonio Falvo, Simone Scardapane, Michele Scarpiniti, Danilo Comminiello, and Aurelio Uncini
- Subjects
Image quality ,Computer science ,02 engineering and technology ,Fluid-attenuated inversion recovery ,multiple sclerosis ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,0302 clinical medicine ,Data acquisition ,0202 electrical engineering, electronic engineering, information engineering ,medicine ,magnetic resonance imaging ,Computer vision ,fast MRI ,medicine.diagnostic_test ,Artificial neural network ,business.industry ,Deep learning ,deep neural network ,Magnetic resonance imaging ,020201 artificial intelligence & image processing ,Acquisition time ,Artificial intelligence ,Mr images ,business - Abstract
Multiple sclerosis is one of the most common chronic neurological diseases affecting the central nervous system. Lesions produced by MS can be observed through two modalities of magnetic resonance (MR), known as T2W and FLAIR sequences, both providing useful information for formulating a diagnosis. However, the long acquisition time makes the acquired MR images vulnerable to motion artifacts, which creates a need to accelerate the MR analysis. In this paper, we present a deep learning method that is able to reconstruct subsampled MR images obtained by reducing the k-space data, while maintaining a high image quality that can be used to observe brain lesions. The proposed method exploits a multimodal neural network approach and also focuses on the data acquisition and processing stages to reduce the execution time of the MR analysis. Results prove the effectiveness of the proposed method in reconstructing subsampled MR images while saving execution time.
- Published
- 2021
94. A New Class of Efficient Adaptive Filters for Online Nonlinear Modeling
- Author
-
Danilo Comminiello, Alireza Nezamdoust, Simone Scardapane, Michele Scarpiniti, Amir Hussain, and Aurelio Uncini
- Subjects
Signal Processing (eess.SP) ,FOS: Computer and information sciences ,Computer Science - Machine Learning ,Sound (cs.SD) ,Systems and Control (eess.SY) ,Electrical Engineering and Systems Science - Systems and Control ,Computer Science - Sound ,Computer Science Applications ,Machine Learning (cs.LG) ,Human-Computer Interaction ,Control and Systems Engineering ,Audio and Speech Processing (eess.AS) ,FOS: Electrical engineering, electronic engineering, information engineering ,Electrical Engineering and Systems Science - Signal Processing ,Electrical and Electronic Engineering ,Software ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Nonlinear models are known to provide excellent performance in real-world applications that often operate in non-ideal conditions. However, such applications often require online processing to be performed with limited computational resources. To address this problem, we propose a new class of efficient nonlinear models for online applications. The proposed algorithms are based on linear-in-the-parameters (LIP) nonlinear filters using functional link expansions. In order to make this class of functional link adaptive filters (FLAFs) efficient, we propose low-complexity expansions and frequency-domain adaptation of the parameters. Among this family of algorithms, we also define the partitioned-block frequency-domain FLAF, whose implementation is particularly suitable for online nonlinear modeling problems. We assess and compare frequency-domain FLAFs with different expansions, providing the best possible tradeoff between performance and computational complexity. Experimental results prove that the proposed algorithms can be considered an efficient and effective solution for online applications, such as acoustic echo cancellation, even in the presence of adverse nonlinear conditions and with limited availability of computational resources., Comment: This work has been accepted for publication in IEEE Transactions on Systems, Man, and Cybernetics: Systems. Copyright may be transferred without notice, after which this version may no longer be accessible
- Published
- 2021
- Full Text
- View/download PDF
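The linear-in-the-parameters structure underlying FLAFs can be illustrated with a minimal time-domain sketch: a trigonometric functional link expansion whose coefficients are adapted by plain LMS (the paper's algorithms operate in the frequency domain with partitioned blocks; this simplified analogue and its names are illustrative only):

```python
import math

def functional_link_expansion(x, order=2):
    """Trigonometric functional link expansion of a scalar input sample:
    [x, sin(pi x), cos(pi x), sin(2 pi x), cos(2 pi x), ...]."""
    feats = [x]
    for p in range(1, order + 1):
        feats.append(math.sin(p * math.pi * x))
        feats.append(math.cos(p * math.pi * x))
    return feats

def flaf_lms(inputs, targets, order=2, mu=0.1):
    """LMS adaptation of the expanded coefficients: because the filter is
    linear in its parameters, the nonlinearity lives entirely in the
    expansion and the update rule stays the standard linear one."""
    w = [0.0] * (2 * order + 1)
    errors = []
    for x, d in zip(inputs, targets):
        g = functional_link_expansion(x, order)
        y = sum(wi * gi for wi, gi in zip(w, g))
        e = d - y
        w = [wi + mu * e * gi for wi, gi in zip(w, g)]
        errors.append(e)
    return w, errors
```

On a target that lies in the span of the expansion (e.g., a mix of a linear term and a sinusoid), the error decays toward zero as the coefficients converge.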
95. Self-supervised learning for medieval handwriting identification: A case study from the Vatican Apostolic Library
- Author
-
Lorenzo Lastilla, Serena Ammirati, Donatella Firmani, Nikos Komodakis, Paolo Merialdo, and Simone Scardapane
- Subjects
Self-supervised learning ,ComputingMethodologies_PATTERNRECOGNITION ,handwriting identification ,manuscripts ,self-supervised learning ,Media Technology ,Library and Information Sciences ,Management Science and Operations Research ,Handwriting identification ,Manuscript ,Computer Science Applications ,Information Systems - Abstract
In this paper, we consider the task of automatically identifying whether different parts of medieval and modern manuscripts can be traced back to the same copyist/scribe, a problem of significant interest in paleography. Currently, the application of deep learning techniques in the context of scribe recognition has been hindered by the lack of a sufficiently large, labeled dataset, since the labeling process is incredibly complex and time-consuming. Here, we propose the first successful application of the recent framework of self-supervised learning to the field of digital paleography, wherein we pretrain a convolutional neural network by leveraging large amounts of unlabeled manuscripts. To this end, we build a novel dataset consisting of both labeled and unlabeled manuscripts for copyist identification extracted from the Vatican Apostolic Library. We show that fine-tuning this model to the task of interest significantly outperforms other baselines, including the common setup of initializing the network from general-domain features, or training the model from scratch, also in terms of generalization power. Overall, our results reveal the strong potential of self-supervised techniques in the field of digital paleography, where unlabeled data (i.e., digitized manuscripts) is nowadays available, while labeled data is scarcer.
- Published
- 2022
96. Adaptive Propagation Graph Convolutional Network
- Author
-
Aurelio Uncini, Indro Spinelli, and Simone Scardapane
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Theoretical computer science ,Interleaving ,Computer Networks and Communications ,Computer science ,graph neural network (GNN) ,Inference ,Machine Learning (stat.ML) ,02 engineering and technology ,Regularization (mathematics) ,Machine Learning (cs.LG) ,convolutional network ,Artificial Intelligence ,Statistics - Machine Learning ,Adaptive system ,0202 electrical engineering, electronic engineering, information engineering ,Differentiable function ,graph data ,node classification ,Artificial neural network ,Graph ,Computer Science Applications ,Convolutional code ,Graph (abstract data type) ,020201 artificial intelligence & image processing ,Laplacian smoothing ,Software
Graph convolutional networks (GCNs) are a family of neural network models that perform inference on graph data by interleaving vertex-wise operations and message-passing exchanges across nodes. Concerning the latter, two key questions arise: (i) how to design a differentiable exchange protocol (e.g., a 1-hop Laplacian smoothing in the original GCN), and (ii) how to characterize the trade-off in complexity with respect to the local updates. In this paper, we show that state-of-the-art results can be achieved by adapting the number of communication steps independently at every node. In particular, we endow each node with a halting unit (inspired by Graves' adaptive computation time) that after every exchange decides whether to continue communicating or not. We show that the proposed adaptive propagation GCN (AP-GCN) achieves superior or similar results to the best proposed models so far on a number of benchmarks, while requiring a small overhead in terms of additional parameters. We also investigate a regularization term to enforce an explicit trade-off between communication and accuracy. The code for the AP-GCN experiments is released as an open-source library., Published in IEEE Transaction on Neural Networks and Learning Systems
- Published
- 2020
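The per-node halting mechanism described in the abstract above can be sketched with a toy propagation rule: each node accumulates a halting probability after every message-passing step and stops communicating once the cumulative value crosses a threshold (in AP-GCN the per-step probability comes from a learned halting unit; here it is a fixed per-node value, and all names are illustrative):

```python
def propagate(features, neighbors):
    """One message-passing step: mean of self and neighbor features."""
    out = []
    for i, fi in enumerate(features):
        vals = [fi] + [features[j] for j in neighbors[i]]
        out.append(sum(vals) / len(vals))
    return out

def adaptive_propagation(features, neighbors, halt_prob, max_steps=10, eps=0.01):
    """Each node keeps propagating until its cumulative halting probability
    exceeds 1 - eps; halted nodes freeze their value but still serve as
    neighbors. Returns the final features and the steps taken per node."""
    h = list(features)
    cum = [0.0] * len(features)
    steps = [0] * len(features)
    active = set(range(len(features)))
    for _ in range(max_steps):
        if not active:
            break
        new_h = propagate(h, neighbors)
        for i in list(active):
            h[i] = new_h[i]
            steps[i] += 1
            cum[i] += halt_prob[i]
            if cum[i] >= 1.0 - eps:
                active.remove(i)
    return h, steps
```

On a 3-node path graph with per-node halting probabilities 0.5, 0.2, and 1.0, the nodes take 2, 5, and 1 propagation steps respectively, showing how communication depth adapts node by node.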
97. Quaternion Neural Networks for 3D Sound Source Localization in Reverberant Environments
- Author
-
Simone Scardapane, Danilo Comminiello, and Michela Ricciardi Celsi
- Subjects
Sound localization ,source localization ,Audio signal ,Artificial neural network ,Ambisonics ,Computer science ,Acoustics ,Direction of arrival ,020206 networking & telecommunications ,02 engineering and technology ,Acoustic source localization ,3D audio ,hypercomplex-valued neural networks ,Sound intensity ,convolutional recurrent neural networks ,Computer Science::Sound ,quaternion neural networks ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Quaternion - Abstract
Localization of sound sources in 3D sound fields is an extremely challenging task, especially when the environments are reverberant and involve multiple sources. In this work, we propose a deep neural network to analyze audio signals recorded by 3D microphones and localize sound sources in a spatial sound field. In particular, we consider first-order Ambisonics microphones to capture 3D acoustic signals and represent them by spherical harmonic decomposition in the quaternion domain. Moreover, to improve the localization performance, we use quaternion input features derived from the acoustic intensity, which is strictly related to the direction of arrival (DOA) of a sound source. The proposed network architecture involves both quaternion-valued convolutional and recurrent layers. Results show that the proposed method is able to exploit both the quaternion-valued representation of ambisonic signals and to improve the localization performance with respect to existing methods.
- Published
- 2020
98. Combined Sparse Regularization for Nonlinear Adaptive Filters
- Author
-
Danilo Comminiello, Simone Scardapane, Michele Scarpiniti, Aurelio Uncini, and Luis A. Azpicueta-Ruiz
- Subjects
Signal Processing (eess.SP) ,Nonlinear system identification ,Computer science ,MathematicsofComputing_NUMERICALANALYSIS ,020206 networking & telecommunications ,02 engineering and technology ,Regularization (mathematics) ,Adaptive filter ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Nonlinear system ,0202 electrical engineering, electronic engineering, information engineering ,Nonlinear adaptive filter ,FOS: Electrical engineering, electronic engineering, information engineering ,Leverage (statistics) ,Electrical Engineering and Systems Science - Signal Processing ,0305 other medical science ,Sparse regularization ,Algorithm ,sparse regularization ,functional links ,linear-in-the-parameters nonlinear filters ,sparse adaptive filters ,adaptive combination of filters - Abstract
Nonlinear adaptive filters often show some sparse behavior, since not all the coefficients are equally useful for modeling a given nonlinearity. Recently, a class of proportionate algorithms has been proposed for nonlinear filters to leverage the sparsity of their coefficients. However, the choice of the norm penalty in the cost function may not always be appropriate for the problem at hand. In this paper, we introduce an adaptive combined scheme, based on a block-based approach involving two nonlinear filters with different regularization, that consistently achieves performance superior to that of the individual rules. The proposed method is assessed in nonlinear system identification problems, showing its effectiveness in taking advantage of the online combination of regularizers., This is a corrected version of the paper presented at EUSIPCO 2018 and published on IEEE https://ieeexplore.ieee.org/document/8552955
- Published
- 2020
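The combination idea in the abstract above — running two filters with different regularizers and adaptively mixing their outputs — can be sketched with two LMS variants (a zero-attracting L1 rule and a leaky L2 rule) whose convex combination weight is adapted online. This is a simplified linear sketch, not the paper's algorithm; all names are illustrative:

```python
import math

def sign(x):
    return (x > 0) - (x < 0)

def combined_sparse_lms(inputs, targets, n_taps=4, mu=0.05, mu_a=1.0,
                        l1=1e-3, l2=1e-3):
    """Adaptive convex combination of two LMS filters with different
    regularizers. The mixing weight lam = sigmoid(a) is adapted so the
    combination favours whichever filter currently yields smaller error."""
    w1 = [0.0] * n_taps   # zero-attracting (L1, sparse) LMS
    w2 = [0.0] * n_taps   # leaky (L2) LMS
    a, buf, errors = 0.0, [0.0] * n_taps, []
    for x, d in zip(inputs, targets):
        buf = [x] + buf[:-1]
        y1 = sum(w * u for w, u in zip(w1, buf))
        y2 = sum(w * u for w, u in zip(w2, buf))
        lam = 1.0 / (1.0 + math.exp(-a))
        y = lam * y1 + (1.0 - lam) * y2
        e, e1, e2 = d - y, d - y1, d - y2
        w1 = [w + mu * e1 * u - l1 * sign(w) for w, u in zip(w1, buf)]
        w2 = [(1.0 - mu * l2) * w + mu * e2 * u for w, u in zip(w2, buf)]
        # gradient step on a through the sigmoid mixing weight
        a += mu_a * e * (y1 - y2) * lam * (1.0 - lam)
        errors.append(e)
    return errors, lam
```

Identifying a sparse FIR system (most taps zero), the combined error decreases while the mixing weight stays strictly inside (0, 1), so neither regularizer is ever discarded outright.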
99. Efficient Data Augmentation Using Graph Imputation Neural Networks
- Author
-
Aurelio Uncini, Simone Scardapane, Michele Scarpiniti, and Indro Spinelli
- Subjects
Artificial neural network ,Graph neural networks ,Computer science ,graph convolution ,Data reconstruction ,graph neural network ,imputation ,computer.software_genre ,ComputingMethodologies_PATTERNRECOGNITION ,Missing data imputation ,data augmentation ,Labeled data ,Leverage (statistics) ,Graph (abstract data type) ,Data mining ,Imputation (statistics) ,computer - Abstract
Recently, data augmentation in the semi-supervised regime, where unlabeled data vastly outnumbers labeled data, has received considerable attention. In this paper, we describe an efficient technique for this task, exploiting a recent framework we proposed for missing data imputation, called graph imputation neural network (GINN). The key idea is to leverage both supervised and unsupervised data to build a graph of similarities between points in the dataset. Then, we augment the dataset by severely damaging a few of the nodes (up to 80% of their features) and reconstructing them using a variation of GINN. On several benchmark datasets, we show that our method can obtain significant improvements compared to a fully-supervised model, and we are able to augment the datasets by a factor of up to 10×. This points to the power of graph-based neural networks to represent structural affinities in the samples for tasks of data reconstruction and augmentation.
- Published
- 2020
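The pipeline in the abstract above (build a similarity graph, damage node features, reconstruct from the graph) can be sketched in miniature. GINN performs the reconstruction with a graph neural network; here simple neighbour averaging stands in for it, and all names are illustrative:

```python
import random

def knn_graph(data, k=2):
    """Build a k-nearest-neighbour similarity graph (squared Euclidean)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    neighbors = []
    for i, row in enumerate(data):
        others = sorted((j for j in range(len(data)) if j != i),
                        key=lambda j: dist(row, data[j]))
        neighbors.append(others[:k])
    return neighbors

def augment(data, damage=0.8, k=2, seed=0):
    """Mask a fraction `damage` of each sample's features, impute the masked
    entries from graph neighbours, and append the reconstructions to the
    original dataset as augmented samples."""
    rng = random.Random(seed)
    neighbors = knn_graph(data, k)
    augmented = []
    for i, row in enumerate(data):
        n_feats = len(row)
        masked = set(rng.sample(range(n_feats), int(damage * n_feats)))
        new_row = []
        for f in range(n_feats):
            if f in masked:
                # impute from graph neighbours (GINN would use a GNN here)
                new_row.append(sum(data[j][f] for j in neighbors[i]) / k)
            else:
                new_row.append(row[f])
        augmented.append(new_row)
    return data + augmented
```

Each pass doubles the dataset; repeating the damage-and-reconstruct step with different masks is what allows augmentation well beyond the original size.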
100. Flexible Generative Adversarial Networks with Non-parametric Activation Functions
- Author
-
Simone Scardapane, Danilo Comminiello, Aurelio Uncini, and Eleonora Grassucci
- Subjects
activation function ,generative adversarial network ,image ,neural network ,Artificial neural network ,business.industry ,Computer science ,Process (engineering) ,Activation function ,Nonparametric statistics ,Stability (learning theory) ,020206 networking & telecommunications ,Context (language use) ,02 engineering and technology ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,0302 clinical medicine ,Kernel (statistics) ,Convergence (routing) ,0202 electrical engineering, electronic engineering, information engineering ,Artificial intelligence ,business - Abstract
Generative adversarial networks (GANs) have become widespread models for complex density estimation tasks such as image generation or image-to-image synthesis. At the same time, training of GANs can suffer from several problems, either of stability or convergence, sometimes hindering their effective deployment. In this paper we investigate whether we can improve GAN training by endowing the neural network models with more flexible activation functions compared to the commonly used rectified linear unit (or its variants). In particular, we evaluate training a deep convolutional GAN wherein all hidden activation functions are replaced with a version of the kernel activation function (KAF), a recently proposed technique for learning non-parametric nonlinearities during the optimization process. In a thorough empirical evaluation on multiple image generation benchmarks, we show that the resulting architectures learn to generate visually pleasing images in a fraction of the epochs, eventually converging to a better solution, even when we equalize (or even lower) the number of free parameters. Overall, this points to the importance of investigating better and more flexible architectures in the context of GANs.
- Published
- 2020
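The kernel activation function named in the abstract above is, at its core, a one-dimensional mixture of kernels over a fixed dictionary with trainable mixing coefficients. A minimal scalar sketch, with the coefficients fitted here by simple SGD against a target nonlinearity (in the actual networks they are trained jointly with all other weights; names are illustrative):

```python
import math

def kaf(x, alphas, dictionary, gamma=1.0):
    """Kernel activation function: a mixture of Gaussian kernels centred on
    a fixed dictionary, with trainable mixing coefficients alphas."""
    return sum(a * math.exp(-gamma * (x - d) ** 2)
               for a, d in zip(alphas, dictionary))

def fit_kaf(target_fn, dictionary, gamma=1.0, lr=0.05, sweeps=2000):
    """Fit the mixing coefficients by stochastic gradient descent so the
    KAF matches a target nonlinearity on the dictionary points."""
    alphas = [0.0] * len(dictionary)
    for _ in range(sweeps):
        for x in dictionary:
            err = target_fn(x) - kaf(x, alphas, dictionary, gamma)
            alphas = [a + lr * err * math.exp(-gamma * (x - d) ** 2)
                      for a, d in zip(alphas, dictionary)]
    return alphas
```

Because the output is linear in the alphas, the shape of the nonlinearity (e.g., something tanh-like, leaky, or non-monotone) can be learned by ordinary gradient descent, which is what gives the GAN's hidden layers their extra flexibility.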