8 results for "Shiv Naga Prasad Vitaladevuni"
Search Results
2. Low-Bit Quantization and Quantization-Aware Training for Small-Footprint Keyword Spotting
- Author
-
Chris Beauchene, Yuriy Mishchenko, Oleg Rybakov, Shiv Naga Prasad Vitaladevuni, Spyros Matsoukas, Ming Sun, and Yusuf Goren
- Subjects
Artificial neural network, Computer science, Low bit, Quantization (signal processing), Small footprint, Keyword spotting, Algorithm - Abstract
In this paper, we investigate novel quantization approaches to reduce the memory and computational footprint of deep neural network (DNN) based keyword spotters (KWS). We propose a new method for offline and online KWS quantization, which we call dynamic quantization: we quantize DNN weight matrices column-wise, using each column's exact individual min-max range, and quantize the DNN layers' inputs and outputs for every input audio frame individually, using the exact min-max range of each input and output vector. We further apply a new quantization-aware training approach that allows us to incorporate quantization errors into the KWS model during training. Together, these approaches significantly improve the performance of KWS at 4-bit and 8-bit quantized precision, achieving end-to-end accuracy close to that of full-precision models while reducing the models' on-device memory footprint by up to 80%.
- Published
- 2019
- Full Text
- View/download PDF
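The column-wise min-max weight quantization described in the abstract can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' implementation; the 8-bit setting, function names, and round-trip check are assumptions:

```python
import numpy as np

def quantize_columns(W, n_bits=8):
    """Quantize each weight column with its own exact min-max range."""
    qmax = 2 ** n_bits - 1
    w_min = W.min(axis=0, keepdims=True)
    w_max = W.max(axis=0, keepdims=True)
    # Avoid division by zero for constant columns.
    scale = np.where(w_max > w_min, (w_max - w_min) / qmax, 1.0)
    q = np.round((W - w_min) / scale).astype(np.uint8 if n_bits <= 8 else np.uint16)
    return q, scale, w_min

def dequantize(q, scale, w_min):
    """Reconstruct approximate float weights from quantized values."""
    return q.astype(np.float32) * scale + w_min

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32)).astype(np.float32)
q, scale, w_min = quantize_columns(W, n_bits=8)
W_hat = dequantize(q, scale, w_min)
# Per-column rounding error is bounded by half a quantization step.
max_err = float(np.abs(W - W_hat).max())
```

The same per-vector min-max idea would apply to each frame's layer inputs and outputs at inference time, which is what makes the scheme "dynamic."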
3. Sub-band Convolutional Neural Networks for Small-footprint Spoken Term Classification
- Author
-
Shiv Naga Prasad Vitaladevuni, Yixin Gao, Chao Wang, Ming Sun, and Chieh-Chi Kao
- Subjects
Computer science, Speech recognition, Small footprint, Convolutional neural network, Keyword spotting, Audio and Speech Processing - Abstract
This paper proposes a Sub-band Convolutional Neural Network for spoken term classification. Convolutional neural networks (CNNs) have proven to be very effective in acoustic applications such as spoken term classification, keyword spotting, speaker identification, and acoustic event detection. Unlike applications in computer vision, the spatial invariance property of 2D convolutional kernels does not fit acoustic applications well, since the meaning of a specific 2D kernel varies greatly along the feature axis of an input feature map. We propose a sub-band CNN architecture that applies different convolutional kernels to each feature sub-band, which makes the overall computation more efficient. Experimental results show that the computational efficiency brought by the sub-band CNN is more beneficial for small-footprint models. Compared to a baseline full-band CNN for spoken term classification on the publicly available Speech Commands dataset, the proposed sub-band CNN architecture reduces computation by 39.7% on commands classification and by 49.3% on digits classification, with accuracy maintained. Comment: Accepted by Interspeech 2019.
- Published
- 2019
- Full Text
- View/download PDF
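The per-sub-band kernel idea can be illustrated with a toy NumPy sketch: split the feature map along the frequency axis and apply a distinct kernel to each band. All sizes and names here are hypothetical, not the paper's architecture:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_valid(x, k):
    """Single-channel 'valid' 2D correlation."""
    windows = sliding_window_view(x, k.shape)      # (H-kh+1, W-kw+1, kh, kw)
    return np.einsum('ijkl,kl->ij', windows, k)

def sub_band_conv(feat, kernels):
    """Apply a distinct kernel to each equal-height frequency sub-band."""
    bands = np.array_split(feat, len(kernels), axis=0)  # split along the feature axis
    return [conv2d_valid(b, k) for b, k in zip(bands, kernels)]

rng = np.random.default_rng(1)
feat = rng.standard_normal((40, 100))                   # e.g. 40 mel bins x 100 frames
kernels = [rng.standard_normal((3, 3)) for _ in range(4)]  # one 3x3 kernel per sub-band
outs = sub_band_conv(feat, kernels)                     # four (8, 98) band outputs
```

Because each kernel only convolves over its own band, the frequency extent each kernel must cover shrinks, which is one intuition behind the reported computation savings.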
4. An Empirical Study of Cross-Lingual Transfer Learning Techniques for Small-Footprint Keyword Spotting
- Author
-
Nikko Strom, Spyros Matsoukas, Ming Sun, Andreas Schwarz, Minhua Wu, and Shiv Naga Prasad Vitaladevuni
- Subjects
Empirical research, Artificial neural network, Computer science, Test set, Speech recognition, Keyword spotting, Transfer of learning, Hidden Markov model - Abstract
This paper presents our work on building a small-footprint keyword spotting system for a resource-limited language, which requires low CPU, memory, and latency. Our keyword spotting system consists of a deep neural network (DNN) and a hidden Markov model (HMM), forming a hybrid DNN-HMM decoder. We investigate different transfer learning techniques to leverage knowledge and data from a resource-abundant source language to improve keyword DNN training for a target language with limited in-domain data. The approaches employed in this paper include: training a DNN on source language data to initialize the target language DNN training; mixing data from the source and target languages in a multi-task DNN training setup; using logits computed by a DNN trained on the source language data to regularize keyword DNN training in the target language; and combinations of these techniques. Given different amounts of target language training data, our experimental results show that these transfer learning techniques successfully improve keyword spotting performance for the target language, as measured by the area under the curve (AUC) of DNN-HMM decoding detection error tradeoff (DET) curves on a large in-house far-field test set.
- Published
- 2017
- Full Text
- View/download PDF
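One of the listed techniques, regularizing target-language training with source-language logits, can be sketched as a combined loss. The L2 logit penalty and the weight `alpha` are assumptions for illustration; the abstract does not specify the exact form:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def regularized_loss(target_logits, labels, source_logits, alpha=0.5):
    """Cross-entropy on target-language labels plus a penalty tying the
    target DNN's logits to those of the source-language DNN."""
    p = softmax(target_logits)
    n = len(labels)
    ce = -np.log(p[np.arange(n), labels] + 1e-12).mean()
    reg = np.mean((target_logits - source_logits) ** 2)
    return ce + alpha * reg

# Toy example: two frames, two classes.
target_logits = np.array([[2.0, 0.0], [0.0, 2.0]])
labels = np.array([0, 1])
source_logits = np.zeros((2, 2))
loss = regularized_loss(target_logits, labels, source_logits, alpha=0.5)
```

With `alpha = 0` the loss reduces to plain cross-entropy, recovering ordinary target-language training.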
5. Max-Pooling Loss Training of Long Short-Term Memory Networks for Small-Footprint Keyword Spotting
- Author
-
Nikko Strom, Arindam Mandal, Anirudh Raju, Shiv Naga Prasad Vitaladevuni, Geng-Shen Fu, Spyros Matsoukas, Sankaran Panchapagesan, George Tucker, and Ming Sun
- Subjects
Computation and Language, Machine Learning, Artificial neural network, Computer science, Speech recognition, Initialization, Keyword spotting, Latency, Hidden Markov model, Smoothing - Abstract
We propose a max-pooling based loss function for training Long Short-Term Memory (LSTM) networks for small-footprint keyword spotting (KWS), with low CPU, memory, and latency requirements. The max-pooling loss training can be further guided by initializing with a cross-entropy loss trained network. A posterior smoothing based evaluation approach is employed to measure keyword spotting performance. Our experimental results show that LSTM models trained using cross-entropy loss or max-pooling loss outperform a cross-entropy loss trained baseline feed-forward Deep Neural Network (DNN). In addition, a randomly initialized LSTM trained with max-pooling loss performs better than a cross-entropy loss trained LSTM. Finally, the max-pooling loss trained LSTM initialized with a cross-entropy pre-trained network shows the best performance, yielding a 67.6% relative reduction in the Area Under the Curve (AUC) measure compared to the baseline feed-forward DNN.
- Published
- 2017
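The max-pooling loss idea, applying cross-entropy only to the frame with the highest keyword posterior, can be sketched in a frame-posterior formulation. This is a minimal illustration; function and variable names are hypothetical:

```python
import numpy as np

def max_pooling_loss(frame_logits, keyword_idx):
    """Cross-entropy on the single frame whose keyword posterior is highest.

    frame_logits: (T, C) logits over C classes for T frames of a keyword segment.
    Returns the loss value and the selected frame index."""
    z = frame_logits - frame_logits.max(axis=1, keepdims=True)
    post = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)  # (T, C) frame posteriors
    best = int(np.argmax(post[:, keyword_idx]))              # frame to back-propagate through
    return -np.log(post[best, keyword_idx] + 1e-12), best

# Toy segment: 5 frames, 2 classes; frame 3 is most confident for the keyword.
frame_logits = np.zeros((5, 2))
frame_logits[3, 1] = 5.0
loss, best = max_pooling_loss(frame_logits, keyword_idx=1)
```

During training, gradients would flow only through the selected frame, so the network is pushed to produce one strong peak per keyword occurrence rather than high posteriors on every frame.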
6. Multi-Task Learning and Weighted Cross-Entropy for DNN-Based Keyword Spotting
- Author
-
Shiv Naga Prasad Vitaladevuni, Ming Sun, Aparna Khare, Sankaran Panchapagesan, Spyros Matsoukas, Arindam Mandal, and Bjorn Hoffmeister
- Subjects
Cross entropy, Computer science, Speech recognition, Keyword spotting, Multi-task learning - Published
- 2016
- Full Text
- View/download PDF
7. Model Compression Applied to Small-Footprint Keyword Spotting
- Author
-
Ming Sun, Sankaran Panchapagesan, Shiv Naga Prasad Vitaladevuni, George Tucker, Minhua Wu, and Geng-Shen Fu
- Subjects
Model compression ,Computer science ,Keyword spotting ,Small footprint ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,02 engineering and technology ,Data mining ,computer.software_genre ,010301 acoustics ,01 natural sciences ,computer - Published
- 2016
- Full Text
- View/download PDF
8. Model Shrinking for Embedded Keyword Spotting
- Author
-
Varun K. Nagaraja, Bjorn Hoffmeister, Shiv Naga Prasad Vitaladevuni, and Ming Sun
- Subjects
Support vector machine ,Computer science ,business.industry ,Keyword spotting ,Pattern recognition ,Feature selection ,Artificial intelligence ,business ,Classifier (UML) - Abstract
In this paper, we present two approaches to improving the computational efficiency of a keyword spotting system running on a resource-constrained device. This embedded keyword spotting system detects a pre-specified keyword in real time at low CPU and memory cost. Our system is a two-stage cascade. The first stage extracts keyword hypotheses from input audio streams. After the first stage is triggered, hand-crafted features are extracted from the keyword hypothesis and fed to a support vector machine (SVM) classifier in the second stage. This paper focuses on improving the computational efficiency of the second-stage SVM classifier. More specifically, we select a subset of feature dimensions and merge support vectors to shrink the SVM classifier, while maintaining keyword spotting performance. Experimental results indicate that we can remove more than 36% of the non-discriminative SVM features and reduce the number of support vectors by more than 60% without significant performance degradation, resulting in more than 15% relative reduction in CPU utilization.
- Published
- 2015
- Full Text
- View/download PDF
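The two ideas, feature-dimension selection and support-vector reduction, can be illustrated with a toy NumPy sketch. The magnitude-based selection rule and the greedy pairwise merging below are stand-ins for whatever criteria the paper actually uses, and all names are hypothetical:

```python
import numpy as np

def select_features(support_vectors, dual_coef, keep_frac=0.64):
    """Rank feature dimensions by |w| of the equivalent linear weight vector
    and keep the most discriminative ones (illustrative criterion only)."""
    w = dual_coef @ support_vectors                    # (n_features,)
    order = np.argsort(-np.abs(w))
    k = max(1, int(keep_frac * support_vectors.shape[1]))
    return np.sort(order[:k])

def merge_support_vectors(support_vectors, dual_coef, n_keep):
    """Greedily merge the closest pair of support vectors into their
    coefficient-weighted average until only n_keep remain."""
    sv = support_vectors.copy()
    a = dual_coef.astype(float).copy()
    while len(sv) > n_keep:
        d = np.linalg.norm(sv[:, None] - sv[None, :], axis=2)
        np.fill_diagonal(d, np.inf)
        i, j = np.unravel_index(np.argmin(d), d.shape)
        tot = a[i] + a[j]
        merged = (a[i] * sv[i] + a[j] * sv[j]) / tot if tot != 0 else 0.5 * (sv[i] + sv[j])
        keep = [t for t in range(len(sv)) if t not in (i, j)]
        sv = np.vstack([sv[keep], merged])
        a = np.append(a[keep], tot)
    return sv, a

rng = np.random.default_rng(2)
sv = rng.standard_normal((10, 5))      # 10 support vectors, 5 feature dimensions
a = rng.standard_normal(10)            # dual coefficients
idx = select_features(sv, a, keep_frac=0.6)
sv2, a2 = merge_support_vectors(sv, a, n_keep=4)
```

Both steps shrink the dot-product work the second-stage classifier must do per keyword hypothesis, which is where the reported CPU savings come from.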
Discovery Service for Jio Institute Digital Library