Author: "Rouditchenko, Andrew" / Publisher: arxiv - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Rouditchenko, Andrew"' showing total 2 results

Start Over Author "Rouditchenko, Andrew" Publisher arxiv

2 results on '"Rouditchenko, Andrew"'

1. CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification

Author: Gong, Yuan, Khurana, Sameer, Rouditchenko, Andrew, and Glass, James
Subjects: FOS: Computer and information sciences, Sound (cs.SD), Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Audio classification is an active research area with a wide range of applications. Over the past decade, convolutional neural networks (CNNs) have been the de-facto standard building block for end-to-end audio classification models. Recently, neural networks based solely on self-attention mechanisms such as the Audio Spectrogram Transformer (AST) have been shown to outperform CNNs. In this paper, we find an intriguing interaction between the two very different models - CNN and AST models are good teachers for each other. When we use either of them as the teacher and train the other model as the student via knowledge distillation (KD), the performance of the student model noticeably improves, and in many cases, is better than the teacher model. In our experiments with this CNN/Transformer Cross-Model Knowledge Distillation (CMKD) method we achieve new state-of-the-art performance on FSD50K, AudioSet, and ESC-50.
Published: 2022
Full Text: View/download PDF

2. Routing with Self-Attention for Multimodal Capsule Networks

Author: Duarte, Kevin, Chen, Brian, Shvetsova, Nina, Rouditchenko, Andrew, Thomas, Samuel, Liu, Alexander, Harwath, David, Glass, James, Kuehne, Hilde, and Shah, Mubarak
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: The task of multimodal learning has seen a growing interest recently as it allows for training neural architectures based on different modalities such as vision, text, and audio. One challenge in training such models is that they need to jointly learn semantic concepts and their relationships across different input representations. Capsule networks have been shown to perform well in context of capturing the relation between low-level input features and higher-level concepts. However, capsules have so far mainly been used only in small-scale fully supervised settings due to the resource demand of conventional routing algorithms. We present a new multimodal capsule network that allows us to leverage the strength of capsules in the context of a multimodal learning framework on large amounts of video data. To adapt the capsules to large-scale input data, we propose a novel routing by self-attention mechanism that selects relevant capsules which are then used to generate a final joint multimodal feature representation. This allows not only for robust training with noisy video data, but also to scale up the size of the capsule network compared to traditional routing methods while still being computationally efficient. We evaluate the proposed architecture by pretraining it on a large-scale multimodal video dataset and applying it on four datasets in two challenging downstream tasks. Results show that the proposed multimodal capsule network is not only able to improve results compared to other routing techniques, but also achieves competitive performance on the task of multimodal learning.
Published: 2021
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

2 results on '"Rouditchenko, Andrew"'

1. CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification

2. Routing with Self-Attention for Multimodal Capsule Networks

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Database

2 results on '"Rouditchenko, Andrew"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources