345 results on '"Sreenivasa Rao"'
Search Results
2. Efficient design of an authenticated key agreement protocol for dew-assisted IoT systems
- Author
-
Ankita Mishra, Mohammad S. Obaidat, Dheerendra Mishra, Saurabh Rana, and Y. Sreenivasa Rao
- Subjects
Correctness, Group method of data handling, Computer science, business.industry, Cloud computing, Mutual authentication, Bottleneck, Theoretical Computer Science, Hardware and Architecture, Session (computer science), Latency (engineering), business, Protocol (object-oriented programming), Software, Information Systems, Computer network
- Abstract
Real-time communication is a significant aspect of the Internet of Things (IoT). IoT-enabled devices require the immediate adoption of the highly distributed and heterogeneous framework of collateral merits. Moreover, cloud-based streaming services for IoT have disadvantages such as the inability to provide low latency, mobility support, location-awareness, and real-time data handling, which makes ubiquitous connectivity between the IoT device and server difficult. At the same time, the concept of dew computing modifies the current mechanism of cloud-based services for IoT. It minimizes the response time of comprehensive data collected by nearby resources. However, the rapid advancement of IoT calls for evolving security measures to address emerging challenges. To address the security issues, a mutual authentication architecture has been introduced for dew computing, which ensures secure and authorized session establishment without the requirement of a trusted server. The main objective of the proposed framework is to avoid bottleneck situations without compromising efficiency and security in real-time communication to IoT users through dew computing. To ensure the correctness of the protocol, a proof of security and a simulation using AVISPA are presented. A performance analysis and a comparative study are also conducted to show the advantage in efficiency.
- Published
- 2021
3. Approaches for Multilingual Phone Recognition in Code-switched and Non-code-switched Scenarios Using Indian Languages
- Author
-
K. M. Srinivasa Raghavan, K. E. Manjunath, K. Sreenivasa Rao, V. Ramasubramanian, and Dinesh Babu Jayagopi
- Subjects
General Computer Science, Language identification, Computer science, Speech recognition, Window (computing), 020206 networking & telecommunications, 02 engineering and technology, Code-switching, 01 natural sciences, Phone, International Phonetic Alphabet, 0103 physical sciences, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, 0202 electrical engineering, electronic engineering, information engineering, Code (cryptography), Transcription (software), 010301 acoustics, Utterance
- Abstract
In this study, we evaluate and compare two different approaches for multilingual phone recognition in code-switched and non-code-switched scenarios. The first approach is a front-end Language Identification (LID) system that switches to a monolingual phone recognizer (LID-Mono), trained individually on each of the languages present in the multilingual dataset. In the second approach, a common multilingual phone-set derived from the International Phonetic Alphabet (IPA) transcription of the multilingual dataset is used to develop a Multilingual Phone Recognition System (Multi-PRS). The bilingual code-switching experiments are conducted using the Kannada and Urdu languages. In the first approach, LID is performed using state-of-the-art i-vectors. Both monolingual and multilingual phone recognition systems are trained using Deep Neural Networks. The performance of the LID-Mono and Multi-PRS approaches is compared and analysed in detail. It is found that the performance of the Multi-PRS approach is superior to that of the more conventional LID-Mono approach in both code-switched and non-code-switched scenarios. For code-switched speech, the effect of the length of the segments used to perform LID on the performance of the LID-Mono system is studied by varying the window size from 500 ms to 5.0 s, and up to the full utterance. The LID-Mono approach depends heavily on the accuracy of the LID system, and LID errors cannot be recovered. The Multi-PRS system, by virtue of not requiring front-end LID switching and being designed around the common multilingual phone-set derived from several languages, is not constrained by the accuracy of the LID system. It therefore performs effectively on code-switched and non-code-switched speech, offering lower Phone Error Rates than the LID-Mono system.
- Published
- 2021
4. Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework
- Author
-
Hemant A. Patil, Nirmalya Sen, Krothapalli Sreenivasa Rao, T. K. Basu, Shyamal Kumar Das Mandal, and Md. Sahidullah
Affiliations: R. H. Sapat College of Engineering Management Studies & Research; Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria); Centre National de la Recherche Scientifique (CNRS); Université de Lorraine (UL); Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT); Indian Institute of Technology Kharagpur (IIT Kharagpur)
- Subjects
Signal Processing (eess.SP), FOS: Computer and information sciences, Computer Science - Machine Learning, Linguistics and Language, Boosting (machine learning), Computer science, Speech recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Speaker Recognition, Language and Linguistics, Machine Learning (cs.LG), 030507 speech-language pathology & audiology, 03 medical and health sciences, GMM-UMB Classifier, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, Classifier (linguistics), FOS: Electrical engineering, electronic engineering, information engineering, 0202 electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Signal Processing, Duration (project management), Short Test Utterance, Perspective (graphical), 020206 networking & telecommunications, Speaker recognition, Mixture model, GMM-SVM Classifier, Human-Computer Interaction, Support vector machine, ComputingMethodologies_PATTERNRECOGNITION, Utterance Partitioning, Duration Variability, Computer Vision and Pattern Recognition, 0305 other medical science, Software, Utterance
- Abstract
The performance of a speaker recognition system is highly dependent on the amount of speech used in enrollment and test. This work presents a detailed experimental review and analysis of the GMM-SVM based speaker recognition system in the presence of duration variability. This article also reports a comparison of the performance of the GMM-SVM classifier with its precursor technique, the Gaussian mixture model-universal background model (GMM-UBM) classifier, in the presence of duration variability. The goal of this research work is not to propose a new algorithm for improving speaker recognition performance in the presence of duration variability. Rather, the main focus of this work is on utterance partitioning (UP), a commonly used strategy to compensate for the duration variability issue. We have analysed in detail the impact of training utterance partitioning on speaker recognition performance under the GMM-SVM framework. We further investigate the reason why utterance partitioning is important for boosting speaker recognition performance. We have also shown in which cases utterance partitioning is useful and in which it is not. Our study has revealed that utterance partitioning does not reduce the data imbalance problem of the GMM-SVM classifier, as claimed in an earlier study. Apart from these, we also discuss issues related to the impact of parameters such as the number of Gaussians, the supervector length, and the amount of splitting required for obtaining better performance in short and long duration test conditions from a speech duration perspective. We have performed the experiments with telephone speech from the POLYCOST corpus consisting of 130 speakers.
International Journal of Speech Technology, Springer Verlag, in press.
- Published
- 2021
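As a rough illustration of the utterance partitioning idea described in the abstract above, the sketch below splits one training utterance (a list of feature frames) into several contiguous sub-utterances, each of which could then yield its own supervector. The function name `partition_utterance` and the choice of contiguous, near-equal splits are assumptions made here for illustration; the paper may partition differently.

```python
def partition_utterance(frames, num_parts):
    """Split a list of feature frames into num_parts contiguous sub-utterances.

    Hypothetical helper: contiguous, near-equal splitting is an assumption,
    not necessarily the partitioning scheme evaluated in the paper.
    """
    if num_parts < 1 or num_parts > len(frames):
        raise ValueError("num_parts must be between 1 and the number of frames")
    base, extra = divmod(len(frames), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        # The first `extra` parts receive one additional frame each.
        end = start + base + (1 if i < extra else 0)
        parts.append(frames[start:end])
        start = end
    return parts


# Ten placeholder "frames" stand in for real acoustic feature vectors.
parts = partition_utterance(list(range(10)), 3)
part_sizes = [len(p) for p in parts]
```

Each sub-utterance is shorter than the original, which is one way training data can be brought closer to short-duration test conditions.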
5. Relation Prediction of Co-Morbid Diseases Using Knowledge Graph Completion
- Author
-
Pabitra Mitra, Krothapalli Sreenivasa Rao, and Saikat Biswas
- Subjects
Models, Statistical, Theoretical computer science, Databases, Factual, Relation (database), Computer science, Applied Mathematics, Association (object-oriented programming), Computational Biology, Comorbidity, Disease, Co morbid, Markov Chains, Task (project management), Gene Ontology, Knowledge graph, Data Display, Genetics, Task analysis, Humans, Embedding, Protein Interaction Maps, Algorithms, Biotechnology
- Abstract
A co-morbid disease condition refers to the simultaneous presence of one or more diseases along with the primary disease. A patient suffering from co-morbid diseases faces a higher mortality risk than a patient with a single disease. It is therefore necessary to predict co-morbid disease pairs. Although researchers have proposed several methods for predicting co-morbid diseases in past years, little work has been done on prediction using knowledge graph embedding with tensor factorization. Moreover, complex-valued vector-based tensor factorization has not been used in any knowledge graph with biological and biomedical entities. We propose a tensor factorization based approach on biological knowledge graphs. Our method introduces the concept of complex-valued embedding in knowledge graphs with biological entities. Here, we build a knowledge graph with disease-gene associations and their corresponding background information. To predict the association between prevalent diseases, we use a ComplEx embedding based tensor decomposition method. In addition, we obtain new prevalent disease pairs using the MCL algorithm in a disease-gene-gene network and check their corresponding inter-relations using an edge prediction task.
- Published
- 2021
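The ComplEx embedding mentioned in the abstract above scores a (subject, relation, object) triple as the real part of a trilinear product in which the object embedding is conjugated. The following minimal sketch shows only that scoring function; the entity names, the relation name `comorbid_with`, and the two-dimensional embedding values are hypothetical and are not taken from the paper.

```python
def complex_score(subject, relation, obj):
    """ComplEx triple score: Re(sum_k s_k * r_k * conj(o_k))."""
    return sum(
        s * r * o.conjugate() for s, r, o in zip(subject, relation, obj)
    ).real


# Hypothetical 2-dimensional complex embeddings, chosen only for illustration.
disease_a = [1 + 1j, 0.5 - 0.5j]
disease_b = [1 - 1j, 0.5 + 0.5j]
comorbid_with = [0 + 1j, 1 + 0j]

forward = complex_score(disease_a, comorbid_with, disease_b)
backward = complex_score(disease_b, comorbid_with, disease_a)
```

Because the object embedding is conjugated, the score can differ when subject and object are swapped, so a relation with a non-zero imaginary part can model asymmetric links.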
6. Robust vowel region detection method for multimode speech
- Author
-
K. Sreenivasa Rao and Kumud Tripathi
- Subjects
Computer Networks and Communications, Computer science, Speech recognition, 020207 software engineering, TIMIT, 02 engineering and technology, Signal, language.human_language, Bengali, Hardware and Architecture, Phone, Vowel, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, language, Software, Vocal tract, Utterance, Continuous wavelet transform
- Abstract
The aim of this paper is to explore a robust method for vowel region detection from multimode speech. In realistic scenarios, speech can be classified into three modes, namely conversation, extempore, and read. The existing method detects vowels from speech recorded in a clean environment, which may not be appropriate for multimode speech tasks. To address this issue, we propose an approach based on continuous wavelet transform coefficients and phone boundaries for detecting the vowel regions in different modes of the speech signal. For evaluation of the proposed vowel region (VR) detection technique, the TIMIT (read speech) and Bengali (read, extempore, and conversation speech) corpora are used. The proposed VR detection technique is compared to the state-of-the-art methods. The experiments show a significant gain in performance for the proposed technique over the state-of-the-art methods. The efficiency of the proposed technique is shown by extracting vocal tract and excitation source features from automatically detected VRs for developing the multilingual speech mode classification (MSMC) model. The evaluation results show that the performance of the MSMC model is significantly better when features are extracted from the vowel regions than when they are extracted from the entire speech utterance.
- Published
- 2021
7. VOP detection for read and conversation speech using CWT coefficients and phone boundaries
- Author
-
K. Sreenivasa Rao and Kumud Tripathi
- Subjects
Signal processing, General Computer Science, Computer science, Speech recognition, 020206 networking & telecommunications, TIMIT, Computational intelligence, 02 engineering and technology, Signal, Speech segmentation, 030507 speech-language pathology & audiology, 03 medical and health sciences, Identification (information), Vowel, 0202 electrical engineering, electronic engineering, information engineering, 0305 other medical science, Continuous wavelet transform
- Abstract
In this paper, we propose a novel approach for accurate detection of vowel onset points (VOPs). A VOP is the instant at which a vowel begins in a speech signal. Precise identification of VOPs is important for various speech applications such as speech segmentation and speech rate modification. Existing methods detect the majority of VOPs to within a 40 ms deviation, which may not be accurate enough for the above speech applications. To address this issue, we propose a two-stage approach for accurate detection of VOPs. In the first stage, VOPs are detected using continuous wavelet transform coefficients, and in the second stage the positions of the detected VOPs are corrected using phone boundaries. The phone boundaries are detected by the spectral transition measure method. Experiments are done using the TIMIT and Bengali speech corpora. The performance of the proposed approach is compared with two standard signal processing based methods as well as with a recent VOP detection technique. The evaluation results show that the proposed method performs better than the existing methods.
- Published
- 2021
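The second stage described in the abstract above, correcting detected VOPs with phone boundaries, might be sketched as follows. The abstract does not state the correction rule, so snapping each VOP to its nearest boundary, the function name `correct_vops`, and the 40 ms `max_shift` limit are all assumptions made here for illustration.

```python
def correct_vops(detected_vops, phone_boundaries, max_shift=0.04):
    """Move each detected VOP (in seconds) to the nearest phone boundary.

    Assumed rule: a VOP is moved only if a boundary lies within max_shift
    seconds; otherwise the first-stage estimate is kept unchanged.
    """
    corrected = []
    for vop in detected_vops:
        nearest = min(phone_boundaries, key=lambda b: abs(b - vop))
        corrected.append(nearest if abs(nearest - vop) <= max_shift else vop)
    return corrected


# Illustrative times in seconds, not taken from TIMIT or the Bengali corpus.
first_stage = [0.12, 0.50, 0.93]
boundaries = [0.10, 0.48, 0.70, 1.00]
second_stage = correct_vops(first_stage, boundaries)
```

In this toy input the first two VOPs move to boundaries 20 ms away, while the third stays where it is because its nearest boundary is 70 ms away.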
8. Multilingual Audio-Visual Smartphone Dataset and Evaluation
- Author
-
Christoph Busch, Krothapalli Sreenivasa Rao, S. R. Mahadeva Prasanna, P. N. Aravinda Reddy, Hareesh Mandalapu, Raghavendra Ramachandra, and Pabitra Mitra
- Subjects
FOS: Computer and information sciences, Computer Science - Cryptography and Security, General Computer Science, Biometrics, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine learning, computer.software_genre, Facial recognition system, presentation attack detection, Robustness (computer science), General Materials Science, audio-visual speaker recognition, business.industry, SIGNAL (programming language), General Engineering, Usability, Smartphone biometrics, Speaker recognition, TK1-9971, Visualization, Electrical engineering. Electronics. Nuclear engineering, Noise (video), Artificial intelligence, multilingual, business, Cryptography and Security (cs.CR), computer
- Abstract
Smartphones have been equipped with biometric-based verification systems to provide security in highly sensitive applications. Audio-visual biometrics are becoming popular due to their usability, and they are challenging to spoof because of their multimodal nature. In this work, we present an audio-visual smartphone dataset captured with five different recent smartphones. This new dataset contains 103 subjects captured in three different sessions covering different real-world scenarios. Three different languages are recorded in this dataset to include the problem of language dependency of speaker recognition systems. These unique characteristics of the dataset will pave the way to implementing novel state-of-the-art unimodal or audio-visual speaker recognition systems. We also report the performance of benchmarked biometric verification systems on our dataset. The robustness of biometric algorithms is evaluated against multiple dependencies such as signal noise, device, and language, and against presentation attacks such as replay and synthesized signals, with extensive experiments. The obtained results raise many concerns about the generalization properties of state-of-the-art biometric methods on smartphones.
- Published
- 2021
9. Moving ridge neuronal espionage network simulation for reticulum invasion sensing
- Author
-
G. Sreeram, Parveen Nikhat, B. Deevana Raju, K. Sreenivasa Rao, and S. Pradeep
- Subjects
General Computer Science, Artificial neural network, Computer science, business.industry, Human intelligence, 020206 networking & telecommunications, 02 engineering and technology, Intrusion detection system, Machine learning, computer.software_genre, Theoretical Computer Science, Network simulation, Categorization, Knowledge extraction, Container (abstract data type), 0202 electrical engineering, electronic engineering, information engineering, Oversampling, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer
- Abstract
Purpose: The paper aims to precise and fast categorization on to transaction evolves into indispensable. The effective capacity difficulty of all the IDS simulates today at below discovery amount of fewer regular barrage associations and therefore the next warning rate.
Design/methodology/approach: The reticulum perception is that the methods which examine and determine the scheme of contact on unearths toward number of dangerous and perchance fateful interchanges occurring toward the system. Within character of guaranteeing the slumberous, opening and uprightness count of to socialize for professional. The precise and fast categorization on to transaction evolves into indispensable. The effective capacity difficulty of all the intrusion detection simulation (IDS) simulates today at below discovery amount of fewer regular barrage associations and therefore the next warning rate. The container with systems of connections are reproduction everything beacon subject to the series of actions to achieve results accepts exists a contemporary well-known method. At the indicated motivation a hybrid methodology supported pairing distinct ripple transformation and human intelligence artificial neural network (ANN) for IDS is projected. The lack of balance of the situation traversing the space beyond information range was eliminated through synthetic minority oversampling technique-based oversampling have low regular object and irregular below examine of the dominant object. We are binding with three layer ANN is being used for classification, and thus the experimental results on knowledge discovery databases are being used for the facts in occurrence of accuracy rate and disclosure estimation toward identical period. True and false made up accepted.
Findings: At the indicated motivation a hybrid methodology supported pairing distinct ripple transformation and human intelligence ANN for IDS is projected. The lack of balance of the situation traversing the space beyond information range was eliminated through synthetic minority oversampling technique-based oversampling have low regular object and irregular below examine of the dominant object.
Originality/value: Chain interruption discovery is the series of actions for the results knowing the familiarity opening and honor number associate order, the scientific categorization undertaking become necessary. The capacity issues of invasion discovery is the order to determine and examine. The arrangement of simulations at the occasion under discovery estimation for low regular aggression associations and above made up feeling sudden panic amount.
- Published
- 2020
10. BOXREC
- Author
-
Krothapalli Sreenivasa Rao, Shamik Sural, Debopriyo Banerjee, and Niloy Ganguly
- Subjects
Focus (computing), Information retrieval, Casual, business.industry, Computer science, E-commerce, Recommender system, Clothing, Session (web analytics), Theoretical Computer Science, Set (abstract data type), Artificial Intelligence, Pairwise comparison, business
- Abstract
Fashionable outfits are generally created by expert fashionistas, who use their creativity and in-depth understanding of fashion to make attractive outfits. Over the past few years, automation of outfit composition has gained much attention from the research community. Most of the existing outfit recommendation systems focus on pairwise item compatibility prediction (using visual and text features) to score an outfit combination having several items, followed by recommendation of top-n outfits or a capsule wardrobe having a collection of outfits based on user’s fashion taste. However, none of these consider a user’s preference of price range for individual clothing types or an overall shopping budget for a set of items. In this article, we propose a box recommendation framework—BOXREC—which at first collects user preferences across different item types (namely, top-wear, bottom-wear, and foot-wear) including price range of each type and a maximum shopping budget for a particular shopping session. It then generates a set of preferred outfits by retrieving all types of preferred items from the database (according to user specified preferences including price ranges), creates all possible combinations of three preferred items (belonging to distinct item types), and verifies each combination using an outfit scoring framework—BOXREC-OSF. Finally, it provides a box full of fashion items, such that different combinations of the items maximize the number of outfits suitable for an occasion while satisfying maximum shopping budget. We create an extensively annotated dataset of male fashion items across various types and categories (each having associated price) and a manually annotated positive and negative formal as well as casual outfit dataset. We consider a set of recently published pairwise compatibility prediction methods as competitors of BOXREC-OSF. Empirical results show superior performance of BOXREC-OSF over the baseline methods. 
We found encouraging results by performing both quantitative and qualitative analysis of the recommendations produced by BOXREC. Finally, based on user feedback corresponding to the recommendations given by BOXREC, we show that disliked or unpopular items can be a part of attractive outfits.
- Published
- 2020
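The combination step described in the BOXREC abstract above (one item from each of three distinct types, subject to price preferences) can be pictured with a small sketch. Everything below is illustrative: the item names and prices are made up, the function `candidate_outfits` is hypothetical, the budget is applied per outfit as a simplification, and the outfit scoring performed by BOXREC-OSF is not modelled.

```python
from itertools import product


def candidate_outfits(tops, bottoms, footwear, budget):
    """Return every (top, bottom, footwear) triple whose total price fits the budget.

    Each item is a (name, price) pair. Scoring the outfits for compatibility,
    which the abstract assigns to BOXREC-OSF, is deliberately left out.
    """
    outfits = []
    for top, bottom, shoe in product(tops, bottoms, footwear):
        total = top[1] + bottom[1] + shoe[1]
        if total <= budget:
            outfits.append(((top[0], bottom[0], shoe[0]), total))
    return outfits


# Made-up catalogue entries for illustration only.
tops = [("shirt", 30), ("blazer", 80)]
bottoms = [("chinos", 40), ("jeans", 35)]
footwear = [("loafers", 60), ("sneakers", 45)]
outfits = candidate_outfits(tops, bottoms, footwear, budget=130)
```

Of the eight possible triples here, only the four that use the cheaper top fit within the budget of 130.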
11. $hf_0$: A Hybrid Pitch Extraction Method for Multimodal Voice
- Author
-
M Gurunath Reddy, Pradeep Rengaswamy, K. Sreenivasa Rao, and Pallab Dasgupta
- Subjects
0209 industrial biotechnology, Signal processing, Audio signal, business.industry, Computer science, Noise (signal processing), Applied Mathematics, Deep learning, Pattern recognition, 02 engineering and technology, Fundamental frequency, Convolutional neural network, Pitch class, 020901 industrial engineering & automation, Signal Processing, Artificial intelligence, business, Representation (mathematics)
- Abstract
Pitch or fundamental frequency ($f_0$) estimation is a fundamental problem extensively studied for its potential speech and clinical applications. The existing $f_0$ estimation methods degrade in performance when applied over real-time audio signals with varying $f_0$ modulations and high SNR environment. In this work, an $f_0$ estimation method using both signal processing and deep learning approaches is developed. Specifically, we train a convolutional neural network to map the periodicity-rich input representation to pitch classes, such that the number of pitch classes is drastically reduced compared to existing deep learning approaches. Then, the accurate $f_0$ is estimated from the nominal pitch classes based on signal processing approaches. The experimental results show that the proposed method generalizes to unseen modulations of speech and noisy signals (with various types of noise) for large-scale datasets. Also, the proposed hybrid model significantly reduces the number of learning parameters required to train the model compared to other methods. Furthermore, the evaluation measures show that the proposed method performs significantly better than the state-of-the-art signal processing and deep learning approaches.
- Published
- 2020
12. Multilingual and multimode phone recognition system for Indian languages
- Author
-
K. Sreenivasa Rao, M. Kiran Reddy, and Kumud Tripathi
- Subjects
Linguistics and Language, Computer science, Speech recognition, media_common.quotation_subject, 02 engineering and technology, 01 natural sciences, Language and Linguistics, Telugu, Mode (computer interface), Phone, 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, Conversation, 010301 acoustics, media_common, Artificial neural network, Communication, 020206 networking & telecommunications, language.human_language, Computer Science Applications, ComputingMethodologies_PATTERNRECOGNITION, Bengali, Modeling and Simulation, language, Computer Vision and Pattern Recognition, Software, Vocal tract, Utterance
- Abstract
The aim of this paper is to develop a flexible framework capable of automatically recognizing phonetic units present in a speech utterance of any language spoken in any mode. In this study, we considered two modes of speech: conversation and read modes in four Indian languages, namely, Telugu, Kannada, Odia, and Bengali. The proposed approach consists of two stages: (i) Automatic speech mode classification (SMC) and (ii) Automatic phoneme recognition using mode-specific multilingual phone recognition system (MMPRS). The vocal tract and excitation source features are considered for classifying speech modes using feed forward neural networks (FFNNs). The vocal tract, excitation source, and tandem features are used in training deep neural network (DNN)-based multilingual phone recognition systems (MPRSs). The performance of the proposed approach is compared with baseline mode-dependent and mode-independent MPRSs. Experimental results show that the proposed approach which combines both SMC and MMPRS into a single system outperforms the baseline phone recognition systems.
- Published
- 2020
13. Behaviour and Emotions of Working Professionals Towards Online Learning Systems
- Author
-
Venkata Ramana Attili, Sreenivasa Rao Annaluri, Suresh Reddy Gali, and Ramasubbareddy Somula
- Subjects
Computer science, Online learning, Sentiment analysis, Applied psychology, Computer Science Applications
- Abstract
Student behaviour in the classroom depends on various influential factors (such as family, friends, locality, habits, etc.). Once students enter professional life after graduating, they find it difficult to return to the learning process due to a variety of issues. In such situations, most of them turn to online courses to improve their skills or to earn a promotion at work by upgrading their academic degrees. The number of working professionals attending online classes is increasing rapidly due to the vast development in technology in recent times and the demand for innovative technologies. In this paper, a detailed study of participants from different work domains was carried out to study the sentiments of working professionals by analysing their behaviour and emotions using Hadoop, big data, and the R language. Using the RFacebook API, the functioning of the students was analysed in this work with R programming. Results have shown that the behaviour of 89% of working professionals is positive and that, emotionally, 75% were satisfied with online courses. However, many also expressed a tendency to be lazy about online courses.
- Published
- 2020
14. VEP Detection for Read, Extempore and Conversation Speech
- Author
-
K. Sreenivasa Rao and Kumud Tripathi
- Subjects
Computer science, Speech recognition, media_common.quotation_subject, 020208 electrical & electronic engineering, 020206 networking & telecommunications, 02 engineering and technology, Signal, Computer Science Applications, Theoretical Computer Science, Mode (computer interface), Vowel, 0202 electrical engineering, electronic engineering, information engineering, Conversation, Electrical and Electronic Engineering, media_common
- Abstract
In this paper, we propose a novel approach for accurate detection of the vowel end points (VEPs) in any mode of speech. VEP is the instant at which the vowel ends in the speech signal. In t...
- Published
- 2020
15. Ensemble framework for concept-drift detection in multidimensional streaming data
- Author
-
Kalli Srinivasa Nageswara Prasad, Annaluri Sreenivasa Rao, and Attili Venkata Ramana
- Subjects
Concept drift, Computer science, InformationSystems_DATABASEMANAGEMENT, 020206 networking & telecommunications, 02 engineering and technology, computer.software_genre, Computer Graphics and Computer-Aided Design, Computer Science Applications, ComputingMethodologies_PATTERNRECOGNITION, Hardware and Architecture, Streaming data, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Data mining, Tuple, computer, Software, Sequence (medicine)
- Abstract
The potential objective of data mining (DM) over the data streaming is the detection of concept-drift. Concept-Drift signifies a diversity among the data tuples streamed in the sequence. The concep...
- Published
- 2020
16. System security enhancement using hybrid HUA-GPC approach under transmission line(s) and/or generator(s) outage conditions
- Author
-
Ravi Srinivas Lanka, Amarendra Alluri, and Rayapudi Sreenivasa Rao
- Subjects
Generator (computer programming), Computer science, Transmission line, business.industry, Modeling and Simulation, Electrical engineering, Electrical and Electronic Engineering, business, Computer Science Applications
- Published
- 2021
17. A Survey on Cognitive Radio Network Models for optimizing Secondary User Transmission
- Author
-
Sruthi Guda and Sreenivasa Rao Duggirala
- Subjects
Channel conflict, Cognitive radio, Transmission (telecommunications), Transmission delay, Computer science, business.industry, ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS, Throughput, Channel (broadcasting), business, Radio spectrum, Network model, Computer network
- Abstract
Radio spectrum is a valuable and highly limited resource. Due to the massive growth in communication devices, it is not possible to accommodate all users within the available spectrum bands. The main reason is that some frequency bands, such as mobile, are overutilized, whereas others, such as radio and TV, are underutilized. Cognitive radio (CR) technology is considered a prominent solution for using spectrum resources effectively. A cognitive radio is an intelligent radio system that senses its environment and changes its transmission parameters accordingly. This article surveys different CR network models that facilitate secondary user (SU) transmission and enhance the transmission quality of secondary users in terms of parameters such as throughput, transmission delay, channel conflict probability, and channel loss probability.
- Published
- 2021
- Full Text
- View/download PDF
18. Knowledge Distillation for Singing Voice Detection
- Author
-
Gurunath Reddy M, K. Sreenivasa Rao, Soumava Paul, and Partha Pratim Das
- Subjects
FOS: Computer and information sciences ,Sound (cs.SD) ,Computer Science - Machine Learning ,Computer Science - Computation and Language ,Artificial neural network ,Computer science ,business.industry ,Deep learning ,Computation ,Machine learning ,computer.software_genre ,Computer Science - Sound ,Machine Learning (cs.LG) ,Domain (software engineering) ,Task (computing) ,Audio and Speech Processing (eess.AS) ,Software deployment ,Singular value decomposition ,FOS: Electrical engineering, electronic engineering, information engineering ,Music information retrieval ,Artificial intelligence ,business ,Computation and Language (cs.CL) ,computer ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Singing Voice Detection (SVD) has been an active area of research in music information retrieval (MIR). Currently, two deep neural network-based methods, one based on CNN and the other on RNN, exist in the literature that learn optimized features for the voice detection (VD) task and achieve state-of-the-art performance on common datasets. Both of these models have a large number of parameters (1.4M for the CNN and 65.7K for the RNN) and hence are not suitable for deployment on devices like smartphones or embedded sensors with limited memory and computation power. The most popular method to address this issue is known as knowledge distillation in the deep learning literature (in addition to model compression), where a large pre-trained network known as the teacher is used to train a smaller student network. Given the wide applications of SVD in music information retrieval, to the best of our knowledge, model compression for practical deployment has not yet been explored. In this paper, efforts have been made to investigate this issue using both conventional and ensemble knowledge distillation techniques. Accepted at INTERSPEECH 2021. 5 pages, 3 figures
- Published
- 2021
- Full Text
- View/download PDF
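The teacher-student training described in the abstract of result 18 can be illustrated with a minimal sketch of a distillation loss. This is not the paper's implementation: the function names, the two-class logits, the temperature of 4.0, and the weighting `alpha` are all assumptions chosen for illustration, and the `temperature ** 2` scaling is a common convention rather than something stated in the abstract.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term (hypothetical weights)."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Cross-entropy between the softened teacher and student distributions.
    soft_ce = -sum(t * math.log(s) for t, s in zip(soft_teacher, soft_student))
    # Ordinary cross-entropy of the student against the ground-truth class index.
    hard_ce = -math.log(softmax(student_logits)[true_label])
    return alpha * temperature ** 2 * soft_ce + (1.0 - alpha) * hard_ce

# Hypothetical logits for the classes [voice, no voice].
teacher = [2.0, -1.0]
agreeing_loss = distillation_loss([2.0, -1.0], teacher, true_label=0)
disagreeing_loss = distillation_loss([-1.0, 2.0], teacher, true_label=0)
```

A student whose outputs agree with the teacher receives a smaller loss than one that disagrees, which is the signal that lets a small network inherit the behaviour of a large one.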
19. Construction of scale for measuring attitude of students towards academic environment
- Author
-
M Sampath Kumar, V. Sudha Rani, Gec Vidya Sagar, and I. Sreenivasa Rao
- Subjects
Scale (ratio) ,Computer science ,Industrial engineering - Published
- 2021
- Full Text
- View/download PDF
20. Robust f0 extraction from monophonic signals using adaptive sub-band filtering
- Author
-
M. Kiran Reddy, Pradeep Rengaswamy, Pallab Dasgupta, and Krothapalli Sreenivasa Rao
- Subjects
Linguistics and Language ,Computer science ,Communication ,Speech recognition ,Autocorrelation ,Frame (networking) ,020206 networking & telecommunications ,02 engineering and technology ,Fundamental frequency ,Lyrics ,01 natural sciences ,Language and Linguistics ,Computer Science Applications ,Viterbi decoder ,Modeling and Simulation ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Extraction methods ,Computer Vision and Pattern Recognition ,Emotion recognition ,Singing ,010301 acoustics ,Software - Abstract
Fundamental frequency (f0) extraction plays an important role in processing of monophonic signals such as speech and song. It is essential in various real-time applications such as emotion recognition, speech/singing voice discrimination and so on. Several f0 extraction methods have been proposed over the years, but no one algorithm works well for both speech and song. In this paper, we propose a novel approach that can accurately estimate f0 from speech as well as songs. First, voiced/unvoiced detection is performed using a novel RNN-LSTM based approach. Then, each voiced frame is decomposed into several sub-bands. From each sub-band of a voiced frame, the candidate pitch periods are identified using autocorrelation and non-linear operations. Finally, Viterbi decoding is used to form the final pitch contours. The performance of the proposed method is evaluated using popular speech (Keele, CMU-ARCTIC), and song (MIR-1K, LYRICS) databases. The evaluation results show that the proposed method performs equally well for speech and monophonic songs, and is better than the state-of-the-art methods. Further, the efficacy of proposed f0 extraction method is demonstrated by developing an interactive SARGAM learning tool.
- Published
- 2020
- Full Text
- View/download PDF
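The autocorrelation step mentioned in the abstract of result 20 can be sketched on a single frame. This is only a simplified illustration under stated assumptions: it works on the full-band signal, omits the sub-band decomposition, non-linear operations, and Viterbi decoding described in the abstract, and the function name and the search range `f0_min`/`f0_max` are hypothetical.

```python
import math

def autocorr_pitch(frame, sample_rate, f0_min=60.0, f0_max=500.0):
    """Return an f0 estimate in Hz from the autocorrelation peak of one frame."""
    lag_min = int(sample_rate / f0_max)  # shortest pitch period considered
    lag_max = int(sample_rate / f0_min)  # longest pitch period considered
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, min(lag_max, len(frame) - 1) + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag

# Synthetic 200 Hz sine sampled at 8 kHz (period of 40 samples).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
estimate = autocorr_pitch(frame, sample_rate)
```

On real speech or singing a single frame can yield several plausible peaks, which is presumably why the abstract keeps multiple candidate periods per sub-band and resolves them with Viterbi decoding.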
21. Detection of Specific Language Impairment in Children Using Glottal Source Features
- Author
-
Krothapalli Sreenivasa Rao, Paavo Alku, Mittapalle Kiran Reddy, Indian Institute of Technology Kharagpur, Dept Signal Process and Acoust, Aalto-yliopisto, and Aalto University
- Subjects
glottal source parameters ,General Computer Science ,Artificial neural network ,Computer science ,Speech recognition ,General Engineering ,Developmental dysphasia ,Speech corpus ,Specific language impairment ,medicine.disease ,Speech therapy ,support vector machines ,Support vector machine ,Discriminative model ,Cepstrum ,openSMILE ,medicine ,General Materials Science ,Language disorder ,Mel-frequency cepstrum ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Voice source ,artificial neural networks ,lcsh:TK1-9971 - Abstract
Developmental dysphasia, also known as specific language impairment (SLI), is a language disorder in children that involves difficulty in speaking and understanding spoken words. Detecting SLI at an early stage is very important for successful speech therapy in children. In this paper, we propose a novel approach based on glottal source features for detecting children with SLI using the speech signal. The proposed method utilizes time- and frequency-domain glottal parameters, which are extracted from the voice source signal obtained using glottal inverse filtering (GIF). In addition, Mel-frequency cepstral coefficient (MFCC) and openSMILE based acoustic features are also extracted from speech utterances. Two machine learning algorithms, namely, support vector machine (SVM) and feed-forward neural network (FFNN), are trained separately for the MFCC, openSMILE and glottal features. A leave-fourteen-speakers-out cross-validation strategy is used for evaluating the classifiers. The experiments are conducted using the SLI speech corpus launched by the LANNA research group. Experimental results show that the glottal parameters contain significant discriminative information required for identifying children with SLI. Furthermore, the complementary nature of glottal parameters is investigated by independently combining these features with the MFCC and openSMILE acoustic features. The overall results indicate that the glottal features, when used in combination with the MFCC feature set, provide the best performance with the FFNN classifier in the speaker-independent scenario.
- Published
- 2020
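The leave-fourteen-speakers-out strategy named in the abstract of result 21 can be sketched as a grouped split in which every utterance of a held-out speaker goes to the test set. The function name, the speaker identifiers, and the synthetic corpus below are assumptions for illustration; the abstract does not say how speakers are assigned to folds.

```python
def leave_speakers_out_folds(speaker_ids, speakers_per_fold=14):
    """Yield (train_indices, test_indices) with whole speakers held out per fold."""
    unique_speakers = sorted(set(speaker_ids))
    for start in range(0, len(unique_speakers), speakers_per_fold):
        held_out = set(unique_speakers[start:start + speakers_per_fold])
        test_idx = [i for i, s in enumerate(speaker_ids) if s in held_out]
        train_idx = [i for i, s in enumerate(speaker_ids) if s not in held_out]
        yield train_idx, test_idx

# Hypothetical corpus: 28 speakers with 3 utterances each.
speaker_ids = [f"spk{n:02d}" for n in range(28) for _ in range(3)]
folds = list(leave_speakers_out_folds(speaker_ids))
```

Because no speaker appears on both sides of a split, the reported accuracy reflects the speaker-independent scenario that the abstract emphasizes.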
22. E2‐SR: a novel energy‐efficient secure routing scheme to protect MANET‐IoT
- Author
-
Sreenivasa Rao and Maitreyi Ponguwala
- Subjects
Authentication ,Computer science ,Network packet ,business.industry ,Data security ,020302 automobile design & engineering ,020206 networking & telecommunications ,Cryptography ,02 engineering and technology ,Mobile ad hoc network ,Certificate ,Encryption ,Computer Science Applications ,Public-key cryptography ,Elliptic curve ,0203 mechanical engineering ,Data integrity ,0202 electrical engineering, electronic engineering, information engineering ,Hash chain ,Electrical and Electronic Engineering ,business ,Computer network - Abstract
Integration of mobile ad hoc networks (MANETs) and the internet of things (IoT) is an emerging paradigm for enabling opportunistic communication in IoT. However, the lack of infrastructure in MANETs increases the involvement of adversaries in the IoT environment. Thus, security provisioning against severe adversaries is still challenging in MANET-IoT. In this paper, we propose an energy-efficient secure routing (E2-SR) scheme to ensure data security and integrity in MANET-IoT. We adapt certificate-based authentication in a Hash Chain based Certificate Authentication (HCCA) scheme. Cluster formation involves secure verification of IoT devices by elliptic curve group law formulations. For cluster formation, a secure dual head clustering with elliptic curve verification (SDHC-EC) algorithm is proposed. Secure routing is enabled by a new Worst Case Particle Swarm Optimization (WC-PSO) algorithm. The WC-PSO algorithm is supported by a dual state Markov chain model (DS-MCM) for security enhancement. Data security is ensured with data integrity using a novel dual XOR-Rivest Cipher6 Encryption with Fuzzy Evaluation (DXOR-RC6 with FE) algorithm. The proposed MANET-IoT network is modeled and tested in the ns-3.26 environment. The evaluation shows that E2-SR achieves better results in packet delivery ratio, throughput, residual energy, and routing overhead.
- Published
- 2019
- Full Text
- View/download PDF
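The abstract of result 22 builds its authentication on a hash chain. The general idea of a hash chain can be sketched as follows; this is a generic illustration and not the HCCA scheme itself, whose details the abstract does not give. The function names, the seed, the chain length, and the choice of SHA-256 are assumptions.

```python
import hashlib

def build_hash_chain(seed, length):
    """Return [H(seed), H(H(seed)), ...] containing `length` links."""
    chain, value = [], seed
    for _ in range(length):
        value = hashlib.sha256(value).digest()
        chain.append(value)
    return chain

def verify_link(revealed, trusted):
    """A revealed link is valid if hashing it once yields the trusted link."""
    return hashlib.sha256(revealed).digest() == trusted

chain = build_hash_chain(b"hypothetical-device-secret", 5)
anchor = chain[-1]  # last link, assumed to be shared with the verifier in advance
is_valid = verify_link(chain[-2], anchor)   # the genuine previous link
is_forged = verify_link(b"forged", anchor)  # an arbitrary value
```

Verification costs a single hash per check, which is one reason hash chains are attractive on energy-constrained devices.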
23. Children’s Story Classification in Indian Languages Using Linguistic and Keyword-based Features
- Author
-
K. Sreenivasa Rao and D. M. Harikrishna
- Subjects
General Computer Science ,business.industry ,Latent semantic analysis ,Computer science ,02 engineering and technology ,computer.software_genre ,Telugu ,language.human_language ,Support vector machine ,Naive Bayes classifier ,Rule-based machine translation ,020204 information systems ,Classifier (linguistics) ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,language ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,Natural language processing ,Sentence - Abstract
The primary objective of this work is to classify Hindi and Telugu stories into three genres: fable, folk-tale, and legend. In this work, we propose a framework for story classification (SC) using keyword and part-of-speech (POS) features. To improve the performance of the SC system, feature reduction techniques and combinations of various POS tags are explored. Further, we investigate the performance of SC by dividing the story into parts depending on its semantic structure. In this work, stories are (i) manually divided into parts based on their semantics as introduction, main, and climax; and (ii) automatically divided into equal parts based on the number of sentences in a story as initial, middle, and end. We have also examined a sentence increment model, which aims at determining the optimum number of sentences required to identify the story genre by incremental selection of sentences in a story. Experiments are conducted on Hindi and Telugu story corpora consisting of 300 and 150 short stories, respectively. The performance of the SC system is evaluated using different combinations of keyword and POS-based features, with three well-established machine learning classifiers: (i) Naive Bayes (NB), (ii) k-Nearest Neighbour (KNN), and (iii) Support Vector Machine (SVM). Performance of the classifier is evaluated using 10-fold cross-validation, and effectiveness of the classifier is measured using precision, recall, and F-measure. From the classification results, it is observed that adding linguistic information boosts the performance of story classification. In view of the structure of the story, the main and initial parts of the story have shown comparatively better performance. The results from the sentence incremental model indicate that the first nine and seven sentences in Hindi and Telugu stories, respectively, are sufficient for better classification of stories. In most of the studies, SVM models outperformed the other models in classification accuracy.
- Published
- 2019
- Full Text
- View/download PDF
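The precision, recall, and F-measure used for evaluation in the abstract of result 23 can be computed per genre as sketched below. The function name and the toy labels are hypothetical and are not data from the paper.

```python
def precision_recall_f1(true_labels, predicted_labels, target):
    """Per-class precision, recall and F-measure for the class `target`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Toy labels for illustration only.
true_genres = ["fable", "fable", "legend", "folk-tale", "fable", "legend"]
predicted = ["fable", "legend", "legend", "folk-tale", "fable", "fable"]
precision, recall, f1 = precision_recall_f1(true_genres, predicted, "fable")
```

In a 10-fold setup these values would typically be computed on each held-out fold and then averaged.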
24. DNN-Based Cross-Lingual Voice Conversion Using Bottleneck Features
- Author
-
M. Kiran Reddy and K. Sreenivasa Rao
- Subjects
FOS: Computer and information sciences ,Sound (cs.SD) ,Computer Science - Machine Learning ,0209 industrial biotechnology ,Computer Networks and Communications ,Computer science ,Speech recognition ,Speech synthesis ,Computational intelligence ,02 engineering and technology ,computer.software_genre ,Computer Science - Sound ,Telugu ,Bottleneck ,Machine Learning (cs.LG) ,Personalization ,020901 industrial engineering & automation ,Audio and Speech Processing (eess.AS) ,Artificial Intelligence ,FOS: Electrical engineering, electronic engineering, information engineering ,0202 electrical engineering, electronic engineering, information engineering ,Artificial neural network ,General Neuroscience ,language.human_language ,Tamil ,language ,Malayalam ,020201 artificial intelligence & image processing ,computer ,Software ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Cross-lingual voice conversion (CLVC) is quite challenging since the source and target speakers speak different languages. It is essential for various applications such as developing mixed-language speech synthesis systems, customization of speaking devices, etc. This paper proposes a deep neural network (DNN)-based approach utilizing bottleneck features for CLVC. In the proposed method, the speaker-independent information present in the speech signals from different languages is represented by using the bottleneck features extracted from a deep auto-encoder. A DNN model is trained to learn the mapping between bottleneck features and the corresponding spectral features of the target speaker. The proposed approach can capture speaker-specific characteristics of a target speaker, and requires no speech data from the source speaker during training. The performance of the proposed method is evaluated using data from three Indian languages: Telugu, Tamil and Malayalam. The experimental results show that the proposed method can effectively convert the source speaker voice to target speaker voice in a cross-lingual scenario.
- Published
- 2019
- Full Text
- View/download PDF
25. Aggressive Development of Quantum-Dot Cellular Automata Technology (QCA) in Computers and Communications Networks
- Author
-
Sreenivasa Rao Ijjada and Adepu Hariprasad
- Subjects
Development (topology) ,Computer architecture ,Computer science ,General Engineering ,Cellular automaton - Published
- 2019
- Full Text
- View/download PDF
26. CWT-Based Approach for Epoch Extraction From Telephone Quality Speech
- Author
-
K. Sreenivasa Rao, Y. Madhu Keerthana, and M. Kiran Reddy
- Subjects
Channel (digital image) ,Computer science ,Epoch (reference date) ,Applied Mathematics ,Speech recognition ,020206 networking & telecommunications ,02 engineering and technology ,Speech processing ,Signal ,Identification (information) ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,Electrical and Electronic Engineering ,Continuous wavelet transform ,Vocal tract - Abstract
Epochs are the instants of significant excitation of the vocal tract system. Existing methods can extract epochs accurately from clean speech signals. However, identification of epoch locations from band-limited telephonic speech is challenging due to the attenuation of the fundamental frequency component and degradation caused by the channel effect. This letter proposes an epoch extraction method that can accurately extract epochs from clean as well as telephonic speech signals. In the proposed method, the significant impulse-like discontinuities are extracted directly from the speech signal using the continuous wavelet transform. The performance of the proposed method is evaluated using three speakers, namely, SLT, BDL, and JMK, from the CMU Arctic database. The clean speech is simulated using G.191 software tools to obtain telephonic speech. Experimental results show that the epoch identification rate of the proposed method is significantly better than that of the state-of-the-art methods for telephone quality speech.
- Published
- 2019
- Full Text
- View/download PDF
27. LSTM-Based Robust Voicing Decision Applied to DNN-Based Speech Synthesis
- Author
-
K. Sreenivasa Rao, M. Kiran Reddy, and R. Pradeep
- Subjects
021110 strategic, defence & security studies ,Artificial neural network ,Computer science ,Speech quality ,Speech recognition ,media_common.quotation_subject ,0211 other engineering and technologies ,Speech synthesis ,02 engineering and technology ,computer.software_genre ,Long short term memory ,Control and Systems Engineering ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,Voice ,020201 artificial intelligence & image processing ,Quality (business) ,computer ,Software ,Parametric statistics ,media_common - Abstract
The quality of statistical parametric speech synthesis (SPSS) relies on voiced/unvoiced classification. Errors in voicing decision can contribute to significant degradation in speech quality. This paper proposes a robust voicing detection method based on power spectrum and long short term memory (LSTM) network for SPSS. The performance of the proposed method is evaluated using CMU Arctic, Keele and MIR-1K databases. Further, the effectiveness of the proposed method is analyzed for deep neural network (DNN)-based SPSS. The results show that the proposed method can better classify the voiced and unvoiced speech segments, which significantly improves the speech quality.
- Published
- 2019
- Full Text
- View/download PDF
28. Quantum-Dot Cellular Automata Technology for High-Speed High-Data-Rate Networks
- Author
-
Sreenivasa Rao Ijjada and Adepu Hariprasad
- Subjects
Very-large-scale integration ,0209 industrial biotechnology ,Computer science ,Applied Mathematics ,Transistor ,NAND gate ,Quantum dot cellular automaton ,Hardware_PERFORMANCEANDRELIABILITY ,02 engineering and technology ,law.invention ,020901 industrial engineering & automation ,XNOR gate ,CMOS ,law ,Logic gate ,Signal Processing ,Hardware_INTEGRATEDCIRCUITS ,Electronic engineering ,Hardware_LOGICDESIGN ,Electronic circuit - Abstract
In today’s technology, optical networking plays a vital role in reducing data losses, thereby providing higher data rates between the transceivers. The very large scale integration circuits in the modulator–demodulator (MODEM), usually fabricated using complementary metal oxide semiconductor (CMOS) technology, have serious scaling limitations; hence, device scaling beyond 65 nm technology becomes highly challenging. Quantum-dot cellular automata (QCA) is one of the most promising nanotechnologies that enable smaller designs, i.e. 60% less design area than CMOS technology, with the capability to achieve high speed by taking fewer cycles than CMOS designs, thereby reducing scaling issues. QCA-based designs are considered the best alternative to transistor-based (CMOS) designs. This paper deals with the implementation of the logic gates NOT, AND, OR, NAND, NOR, XOR and XNOR using both CMOS and QCA technologies, while QCA allows more possible design structures for each logic gate to enable optimization of the area. Finally, the proposed QCA-based logic gate designs and the CMOS-based designs are compared in terms of design area, cell count and speed. QCADesigner 2.0.3 and Cadence Virtuoso computer-aided design tools are used for carrying out the work.
- Published
- 2019
- Full Text
- View/download PDF
29. Incorporation of Manner of Articulation Constraint in LSTM for Speech Recognition
- Author
-
K. Sreenivasa Rao and R. Pradeep
- Subjects
0209 industrial biotechnology ,Sonorant ,Computer science ,Applied Mathematics ,Speech recognition ,TIMIT ,Linear prediction ,02 engineering and technology ,Obstruent ,Manner of articulation ,020901 industrial engineering & automation ,Recurrent neural network ,Signal Processing ,Spectral flatness ,Articulation (phonetics) - Abstract
The variants of recurrent neural networks such as long short-term memory (LSTM) and gated recurrent unit are successful in sequence modelling tasks such as automatic speech recognition. However, the decoded sequence is prone to false substitutions, insertions and deletions. In our work, we investigate the outcome of the hidden layers in an LSTM trained on the TIMIT dataset. Interestingly, we found that the first hidden layer was capturing information related to some broad manners of articulation. The successive hidden layers try to cluster among the broad manners of articulation. We detected two broad manners of articulation, namely sonorants (vowels, semi-vowels, nasals) and obstruents (fricatives, stops, affricates), by exploiting the spectral flatness measure (SFM) on the linear prediction coefficients. We define an additional gate, called the manner of articulation gate, that is high if the broad manner of articulation of the $t$th frame is the same as that of the $(t+1)$th frame. The manner of articulation detection is embedded at the output of the activation gate of the LSTM at the first hidden layer. By doing so, the sonorants being substituted as obstruents are minimized at the output layer. The proposed method decreased the phone error rates by 0.7% when evaluated on the core test set of TIMIT.
- Published
- 2019
- Full Text
- View/download PDF
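Two ingredients named in the abstract of result 29 can be sketched in a few lines: the spectral flatness measure, taken here in its usual form as the ratio of the geometric mean to the arithmetic mean of a power spectrum, and a gate that is high when consecutive frames share a broad manner class. The function names, the example spectra, and the class labels are assumptions; the abstract computes the SFM from linear prediction coefficients, a step this sketch leaves out.

```python
import math

def spectral_flatness(power_spectrum):
    """Geometric mean divided by arithmetic mean of strictly positive power values."""
    count = len(power_spectrum)
    log_mean = sum(math.log(p) for p in power_spectrum) / count
    arithmetic_mean = sum(power_spectrum) / count
    return math.exp(log_mean) / arithmetic_mean

def manner_gate(frame_classes):
    """Gate per frame t: 1 if frames t and t+1 share a broad manner class, else 0."""
    return [1 if a == b else 0 for a, b in zip(frame_classes, frame_classes[1:])]

flat = spectral_flatness([1.0, 1.0, 1.0, 1.0])         # perfectly flat spectrum
peaky = spectral_flatness([100.0, 0.01, 0.01, 0.01])   # one dominant component
gate = manner_gate(["sonorant", "sonorant", "obstruent", "obstruent", "sonorant"])
```

A flat, noise-like spectrum gives a value of 1, while a spectrum dominated by a few strong components gives a value close to 0, which is what makes the measure usable for separating broad sound classes.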
30. PrivGuard: Sensitivity Guided Anonymization based PPDM with Automatic Selection of Sensitive Attributes
- Author
-
M. Sreenivasa Rao and V. Uma Rani
- Subjects
Information privacy ,Computer science ,Rank (computer programming) ,02 engineering and technology ,computer.software_genre ,Credit card ,Tree (data structure) ,Naive Bayes classifier ,Information sensitivity ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Information gain ratio ,020201 artificial intelligence & image processing ,Data mining ,Sensitivity (control systems) ,computer - Abstract
Privacy Preserving Data Mining (PPDM) is the extraction of hidden patterns or knowledge from large volumes of data without revealing sensitive personal or business information. The data used for data mining may contain sensitive data such as Social Security number, salary, name, credit card number, etc. Disclosure of such information is a threat to the privacy of individuals. The aim of PPDM is to protect the privacy of sensitive information in the data used for data mining. Several methods have been developed based on anonymization, perturbation and cryptography. All these methods take a list of sensitive attributes as input from the data owner. Another limitation is that they transform the data without considering the level of sensitivity of the attributes. We propose a framework for PPDM based on anonymization guided by the sensitivity rank of the attribute. This work also automatically identifies the sensitive attributes in the data. The proposed work, PrivGuard: Sensitivity Guided Anonymization based PPDM with Automatic Selection of Sensitive Attributes, finds sensitive attributes in the database by finding the Sensitivity Rank for each attribute. To find the Sensitivity Rank, it ranks each attribute by calculating attribute evaluation measures such as Information Gain, Symmetric Uncertainty attribute evaluation, Gain Ratio, OneR attribute evaluation, etc. It then computes the sensitivity rank and uses it to decide how much anonymization is required to provide privacy. This method can balance data privacy and data utility by applying an appropriate level of anonymization using the taxonomy tree of the attribute. The level of anonymization is calculated by finding the generalization score based on the attribute sensitivity rank. Finally, C4.5 and Naive Bayes classifiers are built on the anonymized data and compared with other anonymization methods. Our method outperforms existing methods, and its results are very close to those of data mining on the original data.
- Published
- 2021
- Full Text
- View/download PDF
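Information gain, one of the attribute evaluation measures listed in the abstract of result 30, can be sketched as the reduction in class entropy after splitting on an attribute. The function names and the toy attribute values are hypothetical, and the abstract does not specify how this measure is combined with the others into a sensitivity rank.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(attribute_values, class_labels):
    """Class entropy minus the weighted entropy after splitting on the attribute."""
    total = len(class_labels)
    groups = {}
    for value, label in zip(attribute_values, class_labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(class_labels) - remainder

# Toy data for illustration only.
classes = ["yes", "yes", "no", "no"]
informative_gain = information_gain(["a", "a", "b", "b"], classes)
uninformative_gain = information_gain(["a", "b", "a", "b"], classes)
```

An attribute that determines the class exactly attains the full class entropy as its gain, while an attribute that is independent of the class has a gain of zero.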
31. WITHDRAWN: Outage analysis of half-duplex decode-and-forward relaying with cooperative NOMA in vehicular networks
- Author
-
Avinash Sharma, Sravani Potula, Sreenivasa Rao Ijjada, and Karunakar Reddy Santhamgari
- Subjects
010302 applied physics ,Vehicular ad hoc network ,Computer science ,business.industry ,ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS ,Data_CODINGANDINFORMATIONTHEORY ,02 engineering and technology ,021001 nanoscience & nanotechnology ,Transmitter power output ,01 natural sciences ,law.invention ,Relay ,law ,Linear network coding ,0103 physical sciences ,Computer Science::Networking and Internet Architecture ,Bit error rate ,Fading ,0210 nano-technology ,business ,Decoding methods ,5G ,Computer Science::Information Theory ,Computer network - Abstract
Vehicular networks are emerging as a promising technology in 5G communication, with many challenges still unaddressed. Due to the dynamic nature of vehicular networks and their fast-varying channel conditions, many researchers have restricted their work to block fading channels. In this paper, in contrast to orthogonal multiple access, cooperative non-orthogonal multiple access is considered for two-way decode-and-forward relaying with network coding to mitigate the impact of continuous-time varying channels. An explicit closed-form expression is derived for the sum outage probabilities at source and destination. Simulation results show that the packet error rate decreases for optimal data phases, and that the position of the decodable relay relative to the source and the transmit power affect the sum outage probabilities, compared with two-way amplify-and-forward relaying and decode-and-forward with coding and decoding schemes.
- Published
- 2021
- Full Text
- View/download PDF
32. Pairing-based CP-ABE with constant-size ciphertexts and secret keys for cloud environment
- Author
-
Ashok Kumar Das, Saru Kumari, Muhammad Khurram Khan, Vanga Odelu, Y. Sreenivasa Rao, and Kim-Kwang Raymond Choo
- Subjects
mobile cloud computing ,Computer science ,Cloud computing ,Data_CODINGANDINFORMATIONTHEORY ,02 engineering and technology ,Computer security ,computer.software_genre ,Encryption ,0202 electrical engineering, electronic engineering, information engineering ,Access structure ,business.industry ,020206 networking & telecommunications ,constant-size ciphertexts ,ciphertext-policy attribute-based encryption ,Computer security model ,lightweight mobile devices ,Symmetric-key algorithm ,Hardware and Architecture ,020201 artificial intelligence & image processing ,business ,Constant (mathematics) ,Law ,computer ,Mobile device ,Software ,AND gate ,constant-size secret keys - Abstract
A ciphertext-policy attribute-based encryption (CP-ABE) scheme can be deployed in a mobile cloud environment to ensure that data outsourced to the cloud will be protected from unauthorized access. Since mobile devices are generally resource-constrained and cloud services are Internet-based and pay-by-use, CP-ABE schemes designed for a mobile cloud deployment should have constant sizes for secret keys and ciphertexts. However, most existing CP-ABE schemes do not provide both constant-size ciphertexts and secret keys. Thus, in this paper, we propose a new pairing-based CP-ABE scheme, which offers both constant-size ciphertexts and secret keys (CSCTSK) with an expressive AND gate access structure. We then show that the proposed CP-ABE-CSCTSK scheme is secure against a chosen-ciphertext adversary in the selective security model, and present a comparative summary to demonstrate the utility of the scheme.
- Published
- 2017
- Full Text
- View/download PDF
33. A Triple Band-Notched UWB Antenna for Microwave Imaging Applications
- Author
-
I. Govardhani and D. Sreenivasa Rao
- Subjects
Microwave imaging ,HFSS ,Computer science ,business.industry ,Communications satellite ,Specific absorption rate ,Optoelectronics ,Antenna (radio) ,Wideband ,Microwave transmission ,business ,Band rejection - Abstract
In this paper, a unique wideband monopole triple band-notched antenna for microwave imaging applications is proposed. It is designed on a 38 × 38 mm² Fire-Resistant (FR-4) epoxy substrate. It can avoid the interference that arises from the Wi-MAX (3.3 GHz - 3.7 GHz), WLAN (5 GHz - 6 GHz), and X-band satellite communication uplink (7.9 GHz - 8.4 GHz) bands. Three slots are introduced in the radiating part to achieve the band rejection characteristics. The designed antenna is enhanced using the electromagnetic simulation software Ansoft HFSS. The prototype antenna is fabricated and tested. The designed antenna operates in a band from 3.1 to 11 GHz and exhibits excellent radiation characteristics. A 50 mm 3-D breast model is designed, and simulated specific absorption rate values are estimated for glandular, fatty, skin and tumor tissue to detect cancer at an early stage.
- Published
- 2021
- Full Text
- View/download PDF
34. Bacterial Foraging Optimized Parameters for ANN using Adaptive Harris Hawks Weight Optimization
- Author
-
Manchikalapudi Satya Sai Ram, Duggirala Sreenivasa Rao, and Pedalanka P S Subhashini
- Subjects
Computer science ,business.industry ,Deep learning ,Ant colony optimization algorithms ,020208 electrical & electronic engineering ,Feature extraction ,Particle swarm optimization ,Pattern recognition ,02 engineering and technology ,Speaker recognition ,Field (computer science) ,ComputingMethodologies_PATTERNRECOGNITION ,0202 electrical engineering, electronic engineering, information engineering ,Preprocessor ,020201 artificial intelligence & image processing ,Artificial intelligence ,Mel-frequency cepstrum ,business - Abstract
Speaker recognition is important for validating user identity in the field of authentication and surveillance, using features extracted from the audio speech signal. Speaker recognition can be divided into two modules, namely training and testing. The capability of recognition systems to identify speakers based on waveform distribution depends largely on how the recognition system trains model parameters to provide the best class discrimination. The mel-frequency cepstral coefficients (MFCCs) of each speech sample are obtained initially in the training phase by preprocessing the audio speech signal. The features are then matched to the target speaker using RBF-ANN. Recognition is based on an estimation of a sufficiently large number of acoustic features. In the proposed work, Bacterial Foraging Optimized (BFO) parameters are provided as input to the RBF-ANN model. The ANN weights are updated using the Adaptive Harris Hawks Optimization (AHHO) method to improve system performance. The performance of the proposed DNN-RBF based AHHO is compared with three different deep learning based optimization algorithms: Modified Grey Wolf Optimization (MGWO), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO). The results show that the proposed model achieves high speaker recognition accuracy compared to traditional methods.
- Published
- 2021
- Full Text
- View/download PDF
35. Audio-Visual Biometric Recognition and Presentation Attack Detection: A Comprehensive Survey
- Author
-
Krothapalli Sreenivasa Rao, Aravinda Reddy P N, S. R. Mahadeva Prasanna, Raghavendra Ramachandra, Christoph Busch, Hareesh Mandalapu, and Pabitra Mitra
- Subjects
FOS: Computer and information sciences ,Computer Science - Cryptography and Security ,General Computer Science ,Biometrics ,Computer science ,media_common.quotation_subject ,Feature extraction ,02 engineering and technology ,Facial recognition system ,Field (computer science) ,Presentation ,presentation attack detection ,audio-visual person recognition ,Human–computer interaction ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Vulnerability (computing) ,media_common ,Authentication ,General Engineering ,Benchmarking ,020201 artificial intelligence & image processing ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,lcsh:TK1-9971 ,Cryptography and Security (cs.CR) - Abstract
Biometric recognition is a trending technology that uses data on unique characteristics to identify or verify/authenticate users in security applications. Among the classically used biometrics, voice and face attributes are the most propitious for prevalent applications in day-to-day life because they are easy to obtain through restrained and user-friendly procedures. The pervasiveness of low-cost audio and face capture sensors in smartphones, laptops, and tablets has made the advantage of voice and face biometrics more exceptional when compared to other biometrics. For many years, acoustic information alone has been a great success in automatic speaker verification applications. Meanwhile, the last decade or two has also witnessed a remarkable ascent in face recognition technologies. Nonetheless, in adverse unconstrained environments, neither of these techniques achieves optimal performance. Since audio-visual information carries correlated and complementary information, integrating them into one recognition system can increase the system’s performance. The vulnerability of biometrics towards presentation attacks and audio-visual data usage for the detection of such attacks is also a hot topic of research. This paper presents a comprehensive survey of existing state-of-the-art audio-visual recognition techniques, publicly available databases for benchmarking, and Presentation Attack Detection (PAD) algorithms. Further, a detailed discussion on challenges and open problems is presented in this field of biometrics. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
- Published
- 2021
36. The VLSI Realization of Sign-Magnitude Decimal Multiplication Efficiency
- Author
-
K. Sreenivasa Rao and Reddipogula Chandra Babu
- Subjects
Reduction (complexity) ,Very-large-scale integration ,Adder ,Computer science ,Multiplier (economics) ,Multiplication ,Arithmetic ,Realization (systems) ,Decimal ,Sign (mathematics) - Abstract
Multiplication is a dynamic procedure in which intermediate partial products (IPPs) are typically picked from a set of multiples of pre-calculated radix-10 X. Many plays require just [0, 5] by encoding the Y digits to a one-hot representation of the signed digits in [−5, 5]. This eliminates the sense of choice at the cost of an additional IPP. Two's-complement signed-digit (TCSD) encoding is also used to characterize IPPs that allow dynamic negation (through one XOR per bit of X multiples) of Y-coded digits in [−5, −1]. With the generation of 17 IPPs for 16-digit operands, we are able to launch a partial product reduction (PPR) with 16 IPPs that improves VLSI regularity. We thus save 75% of the negating XORs by using sign-magnitude signed-digit (SMSD) encoding. For first-level PPR, we create an efficient adder with two SMSD input numbers, the total number defined by the TCSD encoding. Multi-level TCSD 2:1 reduction results in two TCSD combined partial items jointly subject to a special early-initiated conversion scheme for the final binary-coded decimal portion. As such, the VLSI implementation of a 16-digit parallel decimal multiplier is synthesized, and the results show some increase in efficiency over previous similar designs.
- Published
- 2021
37. A Score-Level Fusion Method for Protecting Fingerprint and Palmprint Templates
- Author
-
Vallabhadas Dilip Kumar, Mulagala Sandhya, Y. Sreenivasa Rao, Maurya Anup Kumar, and Sahoo Biswajeet
- Subjects
Password ,Level fusion ,Authentication ,Biometrics ,Computer science ,business.industry ,Data_MISCELLANEOUS ,Fingerprint (computing) ,Binary number ,Pattern recognition ,Security token ,Template ,Artificial intelligence ,business - Abstract
Today computers are in virtually everything we touch. With this increased use of computers, there is a need for secure authentication of users based on their biometric characteristics. If a biometric template is lost or stolen, it is lost permanently and, unlike a password or token in a traditional authentication system, cannot be reissued. Taking two well-known traits, fingerprint and palmprint, into consideration, a multi-biometric system is built. Binary vectors are constructed for each biometric trait, the match score of each trait is calculated individually, and score-level fusion is then applied using T-operators. Compared with previous multi-biometric schemes, the experimental results show good accuracy: an EER of 6.58% is reported, which shows the validity of the proposed method.
- Published
- 2021
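The score-level fusion step described in the abstract of entry 37 can be illustrated with a minimal sketch. The abstract does not say which T-operators are used, so the product and minimum T-norms and the probabilistic-sum T-conorm below are assumptions chosen because they are standard textbook operators. The function names, the [0, 1] score range, and the decision threshold are likewise hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the paper's actual T-operators and thresholds
# are not given in the abstract. Scores are assumed normalized to [0, 1].

def t_norm_product(a: float, b: float) -> float:
    """Product T-norm: T(a, b) = a * b."""
    return a * b


def t_norm_min(a: float, b: float) -> float:
    """Minimum (Goedel) T-norm: T(a, b) = min(a, b)."""
    return min(a, b)


def t_conorm_prob_sum(a: float, b: float) -> float:
    """Probabilistic-sum T-conorm: S(a, b) = a + b - a * b."""
    return a + b - a * b


def fuse_scores(fingerprint_score: float, palmprint_score: float,
                operator=t_norm_product) -> float:
    """Fuse two normalized match scores with the chosen T-operator."""
    for score in (fingerprint_score, palmprint_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError("match scores must lie in [0, 1]")
    return operator(fingerprint_score, palmprint_score)


def accept(fused_score: float, threshold: float = 0.5) -> bool:
    """Accept the claimed identity if the fused score reaches the threshold.

    The threshold value is a placeholder, not a figure from the paper.
    """
    return fused_score >= threshold


if __name__ == "__main__":
    fp, pp = 0.8, 0.5  # hypothetical match scores
    for op in (t_norm_product, t_norm_min, t_conorm_prob_sum):
        fused = fuse_scores(fp, pp, op)
        print(op.__name__, fused, accept(fused))
```

A T-norm never exceeds the smaller of the two scores, so it behaves like a strict "both traits must match" rule, while a T-conorm never falls below the larger score and is more permissive. Which behaviour suits a given system depends on its target error rates.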
38. WITHDRAWN: SS < 30 mV/dec; Hybrid tunnel FET 3D analytical model for IoT applications
- Author
-
M. Sushanth Babu, Ajaykumar Dharmireddy, Avinash Sharma, and Sreenivasa Rao Ijjada
- Subjects
010302 applied physics ,business.industry ,Ambipolar diffusion ,Computer science ,Transistor ,Electrical engineering ,02 engineering and technology ,021001 nanoscience & nanotechnology ,01 natural sciences ,law.invention ,Power (physics) ,law ,Booster (electric power) ,0103 physical sciences ,0210 nano-technology ,Internet of Things ,business ,Scaling - Abstract
Low-power, high-speed devices are the future of transistor technology. The low power and higher the wield of Strained Channels with functionality booster become more significant with device scaling in device modeling. The tunnel FET (TFET) is the cynosure device in present and future transistor technology. This paper reviews recent work on different TFET gate structures and then makes a few suggestions for the development of a new TFET structure. A 3D analytical method based on Young's parabolic approximation is presented to develop the TFET. The proposed model may have the lowest sub-threshold slope, improve the drive current, and reduce the ambipolar leakage current when compared with other existing TFET gate structures. The Sentaurus TCAD simulator tool is used for device modeling and characterization.
- Published
- 2020
39. Identification of glottal instants using electroglottographic signal for vulnerable cases of voicing
- Author
-
Tanumay Mandal, Krothapalli Sreenivasa Rao, and Sanjay K. Gupta
- Subjects
biomedical applications ,lcsh:Medical technology ,Computer science ,pathological egg signal ,Speech recognition ,0206 medical engineering ,education ,Phase (waves) ,Health Informatics ,02 engineering and technology ,Signal ,Article ,030218 nuclear medicine & medical imaging ,glottal instant identification ,03 medical and health sciences ,medical signal detection ,0302 clinical medicine ,Health Information Management ,phase information ,Closing (morphology) ,medical signal processing ,instant detection methods ,glottal opening ,electroglottographic signal ,feature extraction ,fungi ,speech recognition ,respiratory system ,020601 biomedical engineering ,Identification (information) ,lcsh:R855-855.5 ,glottal closing ,Voice - Abstract
Robust detection of glottal instants is essential for various speech and biomedical applications. Glottal closing and glottal opening are two crucial instants/epochs of a glottal cycle. The first-order derivative of the Electroglottographic (EGG) signal demonstrates important peaks at those locations for standard voicing, but the detection of glottal instants becomes erroneous when the peak-to-peak amplitude of the EGG signal is very low, irregular and unpredictable. In this work, a new efficient method is proposed for identifying glottal instants from EGG signals, including segments where the signal is feeble with irregular periodicity. The overall detection accuracy is enhanced by identifying the glottal instants over the whole signal, including its vulnerable segments. As the phase of a signal is uniform in nature, the phase information of the EGG signal has been explored to detect glottal instants accurately. Under low EGG signal strength, the proposed method performs remarkably better than existing instant detection methods, and for pathological EGG signals its detection accuracy of glottal instants is better than that of other existing methods.
- Published
- 2020
40. Glottal Closure Instants Detection from EGG Signal by Classification Approach
- Author
-
K. Sreenivasa Rao, M Gurunath Reddy, and Partha Pratim Das
- Subjects
Computer science ,Speech recognition ,Glottal closure ,Signal - Published
- 2020
41. Change Impact Analysis in Software Versioning
- Author
-
Kalli Srinivasa Nageswara Prasad, Vinit Kumar Gunjan, Annaluri Sreenivasa Rao, and Madapuri Rudra Kumar
- Subjects
business.industry ,Computer science ,Change impact analysis ,Software engineering ,business ,Software versioning - Published
- 2020
42. Change Request Impacts in Software Maintenance
- Author
-
Vinit Kumar Gunjan, Annaluri Sreenivasa Rao, Kalli Srinivasa Nageswara Prasad, and Madapuri Rudra Kumar
- Subjects
Class (computer programming) ,Software ,Process management ,Process (engineering) ,Computer science ,business.industry ,Component (UML) ,Change request ,Change management ,Software maintenance ,business ,Practical implications - Abstract
This book discusses Change Management Impact Analysis and how this method is used to analyze the risks and benefits of a change management initiative when it pertains to obtaining critical insight into how the change management program budget should be allotted. The process also offers useful indicators for what areas within the system should be monitored during the change management process. This book presents theoretical analysis of practical implications and surveys, along with analysis. It covers the functions aimed at identifying various stakeholders associated with the software such as requirement component, design component, and class component. The book talks about the interrelationship between the change and the effects on the rest of the system and dives deeper to include the critical role that the analysis places on the existing multiple functions such as estimating the development costs, the project overhead costs, cost for the modification of the system, and system strength or detecting errors in the system during the process. Case studies are also included to help researchers and practitioners absorb the material presented. This book is useful to graduate students, researchers, academicians, institutions, and professionals who are interested in exploring the areas of Impact Analysis.
- Published
- 2020
43. Change Request Impact Analysis Tools
- Author
-
Madapuri Rudra Kumar, Vinit Kumar Gunjan, Kalli Srinivasa Nageswara Prasad, and Annaluri Sreenivasa Rao
- Subjects
World Wide Web ,Computer science ,Change request ,Analysis tools - Published
- 2020
44. Articulatory-feature-based methods for performance improvement of Multilingual Phone Recognition Systems using Indian languages
- Author
-
Ramasubramanian, Dinesh Babu Jayagopi, K. Sreenivasa Rao, and K E Manjunath
- Subjects
Ground truth ,Multidisciplinary ,Bengali ,Phone ,Computer science ,Speech recognition ,language ,Feature based ,Mel-frequency cepstrum ,Performance improvement ,Hidden Markov model ,language.human_language ,Oracle - Abstract
In this work, the performance of Multilingual Phone Recognition System (Multi-PRS) is improved using articulatory features (AFs). Four Indian languages – Kannada, Telugu, Bengali and Odia – are used for developing Multi-PRS. The transcription is derived using international phonetic alphabets (IPAs). Multi-PRS is trained using hidden Markov models and the state-of-the-art Deep Neural Networks (DNNs). AFs for five AF groups – place, manner, roundness, frontness and height – are predicted from Mel-frequency cepstral coefficients (MFCCs) using DNNs. The oracle AFs, which are derived from the ground truth IPA transcriptions, are used to set the best performance realizable by the predicted AFs. The performances of predicted and oracle AFs are compared. In addition to the AFs, the phone posteriors are explored to further boost the performance of Multi-PRS. Multi-task learning is explored to improve the prediction accuracy of AFs and thereby reduce the Phone Error Rates (PERs) of Multi-PRSs. Fusion of AFs is done using two approaches: i) lattice re-scoring approach and ii) AFs as tandem features. We show that oracle AFs by feature fusion with MFCCs offer a remarkably low target of PER of 10.4%, which is 24.7% absolute reduction compared with baseline Multi-PRS with MFCCs alone. The best performing system using predicted AFs has shown 3.2% reduction in absolute PER (9.1% reduction in relative PER) compared with baseline Multi-PRS. The best performance is obtained using the tandem approach for fusion of various AFs and phone posteriors.
- Published
- 2020
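The Phone Error Rate figures quoted in the abstract of entry 44 can be cross-checked with a short calculation. This is only a sanity-check sketch: the baseline PER is not stated in the abstract and is inferred here from the oracle PER (10.4%) and its absolute reduction (24.7%). The function names are hypothetical.

```python
# Sanity-check sketch: the baseline PER is inferred, not quoted from the paper.

def baseline_from_absolute_reduction(per_after: float, abs_reduction: float) -> float:
    """Recover the baseline PER (in %) from a reduced PER and its absolute reduction."""
    return per_after + abs_reduction


def relative_reduction(baseline_per: float, abs_reduction: float) -> float:
    """Relative PER reduction (in %) for a given absolute reduction."""
    return 100.0 * abs_reduction / baseline_per


if __name__ == "__main__":
    oracle_per = 10.4            # % PER with oracle AFs (from the abstract)
    oracle_abs_reduction = 24.7  # % absolute reduction (from the abstract)
    predicted_abs_reduction = 3.2  # % absolute reduction with predicted AFs

    baseline = baseline_from_absolute_reduction(oracle_per, oracle_abs_reduction)
    predicted_per = baseline - predicted_abs_reduction
    relative = relative_reduction(baseline, predicted_abs_reduction)

    print(f"inferred baseline PER: {baseline:.1f}%")
    print(f"inferred PER with predicted AFs: {predicted_per:.1f}%")
    print(f"relative reduction with predicted AFs: {relative:.1f}%")
```

Under this reading the inferred baseline is about 35.1%, and a 3.2-point absolute reduction corresponds to a relative reduction of roughly 9.1%, which agrees with the figure given in the abstract.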
45. Development and analysis of multilingual phone recognition systems using Indian languages
- Author
-
Dinesh Babu Jayagopi, V. Ramasubramanian, K. Sreenivasa Rao, and K. E. Manjunath
- Subjects
Linguistics and Language ,Computer science ,business.industry ,computer.software_genre ,Language and Linguistics ,language.human_language ,Telugu ,Human-Computer Interaction ,Kannada ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Bengali ,Transcription (linguistics) ,Phone ,language ,Recognition system ,Computer Vision and Pattern Recognition ,Artificial intelligence ,0305 other medical science ,business ,computer ,Software ,Natural language processing - Abstract
In this paper, the development of a Multilingual Phone Recognition System (Multi-PRS) using four Indian languages (Kannada, Telugu, Bengali, and Odia) is described. Multi-PRS is a universal Phone Recognition System (PRS), which performs phone recognition independent of any language. Transcription based on the International Phonetic Alphabet is used for grouping acoustically similar phonetic units from multiple languages. Multilingual phone recognisers for Indian languages are studied using two broad groups, namely Dravidian languages and Indo-Aryan languages. Dravidian and Indo-Aryan languages are grouped separately to develop Bilingual PRSs. We have explored both HMMs and DNNs for developing PRSs under both context-dependent and context-independent setups. The state-of-the-art DNNs have outperformed the HMMs. The performance of Multi-PRSs is analysed and compared with that of the monolingual PRSs. The advantages of Multi-PRSs over monolingual PRSs are discussed. Further, we have developed tandem Multi-PRSs using phone posteriors as tandem features to improve the performance of the baseline Multi-PRSs. It is found that the tandem Multi-PRSs have outperformed the baseline Multi-PRSs in all the cases.
- Published
- 2019
46. Multilingual speech mode classification model for Indian languages
- Author
-
K. Sreenivasa Rao and Kumud Tripathi
- Subjects
Computer science ,Speech recognition ,language.human_language ,Telugu ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Identification (information) ,Bengali ,Multilayer perceptron ,Feature (machine learning) ,language ,Mel-frequency cepstrum ,0305 other medical science ,Pitch contour ,Vocal tract - Abstract
This paper explores the vocal tract and excitation source information for the multilingual speech mode classification (MSMC) task. MSMC is a language-independent speech mode classification model that can detect the mode of speech spoken in any language. Here, we considered data of three broad speech modes (conversation, extempore, and read) from three Indian languages, namely Telugu, Bengali, and Odia. The vocal tract information is captured using Mel-frequency cepstral coefficients. The pitch contour processed at the supra-segmental level represents the excitation source information. The MSMC model is developed using a multilayer perceptron. Experimental results show that the vocal tract features provide better overall identification accuracy than the excitation source information. Further, an improvement in overall accuracy is achieved by combining the scores obtained by two separate MSMC models based on excitation source and vocal tract features. The results generated using the combined score outperform the model developed using the standard vocal tract feature.
- Published
- 2020
47. A robust unsupervised pattern discovery and clustering of speech signals
- Author
-
Sreenivasa Rao K, Lokendra Birla, and Kishore Kumar R
- Subjects
Vocabulary ,Phrase ,Computer science ,media_common.quotation_subject ,Speech recognition ,Image processing ,Mixture model ,Speech processing ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Identification (information) ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial Intelligence ,Histogram ,Signal Processing ,Computer Vision and Pattern Recognition ,0305 other medical science ,Cluster analysis ,Software ,media_common - Abstract
In this paper, a novel approach to unsupervised pattern discovery for speech signals is proposed. The proposed work deviates from the standard speech recognition task, and aims to cluster the speech utterances based on the vocabulary of a broad topic. It attempts to discover the matched sequence of phonetic units by making use of the repeated patterns between the speech signals. Identification of matched sequence of phonetic patterns helps in clustering the speech signals, automatically. The proposed approach uses the posterior features derived from Gaussian mixture model (GMM) to find the repeated structure between the speech signals. Image processing techniques are used to identify these matched acoustic patterns. An angle histogram-based method is used to extract the desired matched keyword/phrase patterns present in a pair of speech utterances. The performance of the proposed method is evaluated on Hindi and Bengali news speech corpora using standard objective measures, and also compared with state-of-the-art techniques. The matched pairs of speech utterances obtained by the proposed method are grouped into broader classes using an appropriate clustering technique. The final clusters represent the broader classes of information such as politics, sports, and weather.
- Published
- 2018
48. Inverse filter based excitation model for HMM‐based speech synthesis system
- Author
-
Mittapalle Kiran Reddy and Krothapalli Sreenivasa Rao
- Subjects
0209 industrial biotechnology ,Excitation signal ,Computer science ,Inverse filter ,Speech synthesis ,02 engineering and technology ,Residual ,computer.software_genre ,Signal ,030507 speech-language pathology & audiology ,03 medical and health sciences ,020901 industrial engineering & automation ,Quality (physics) ,Signal Processing ,Electrical and Electronic Engineering ,0305 other medical science ,Hidden Markov model ,Algorithm ,computer ,Excitation - Abstract
Even today, the speech generated by the hidden Markov model (HMM)-based speech synthesis system (HTS) still exhibits buzziness due to improper modelling of the excitation signal. This study proposes an efficient excitation modelling approach for improving the quality of HTS. In the proposed method, the residual signal obtained from the inverse filter is parameterised as excitation features. HMMs are used to model these excitation parameters. During synthesis, the excitation signal is constructed by overlap-adding the natural residual segments, and the excitation signal is further modified as per the target source features generated from the HMMs. The proposed approach is incorporated in the HTS. Performance evaluation results indicate that the proposed method enhances the quality of synthesis, and is better than the state-of-the-art approaches used for modelling the excitation signal.
- Published
- 2018
49. Epoch detection from emotional speech signal using zero time windowing
- Author
-
Md. Shah Fahad, K. Sreenivasa Rao, and Jainath Yadav
- Subjects
0209 industrial biotechnology ,Linguistics and Language ,Computer science ,Epoch (reference date) ,Communication ,Speech recognition ,Sample (statistics) ,02 engineering and technology ,Signal ,Language and Linguistics ,Computer Science Applications ,Zero (linguistics) ,030507 speech-language pathology & audiology ,03 medical and health sciences ,020901 industrial engineering & automation ,Amplitude ,Computer Science::Sound ,Aperiodic graph ,Modeling and Simulation ,A priori and a posteriori ,Computer Vision and Pattern Recognition ,0305 other medical science ,Software ,Group delay and phase delay - Abstract
The main objective of this work is to enhance the performance of epoch detection in the case of emotional speech. Existing epoch estimation methods require either modeling of the vocal-tract system or a priori information of the average pitch period. The performance of existing epoch estimation methods degrades significantly due to rapid variation of the pitch period in the emotional speech. In the present work, we have utilized the advantage of zero time windowing method, which provides instantaneous spectral information at each sample point due to the contribution of that sample point itself. The amplitudes of spectral peaks are higher at the instants of epochs compared to neighbouring sample points. The proposed method uses the sum of three prominent spectral peaks at each sampling instant of the Hilbert envelope of Numerator Group Delay (HNGD) spectrum, for accurate detection of epochs in the emotional speech. The experimental result shows that the accuracy of the proposed method is better than existing methods in the case of emotional speech. It is also observed that the proposed method works well even for the aperiodic nature of the speech signal and it is robust against emotional speech.
- Published
- 2018
50. A 75 μW Two-Stage Op-Amp using 0.18μW CMOS Technology for High-Speed Operations
- Author
-
B. Naresh, Sreenivasa Rao Ijjada, and K. Shashidhar
- Subjects
CMOS ,law ,Computer science ,business.industry ,Operational amplifier ,Electrical engineering ,General Physics and Astronomy ,Stage (hydrology) ,business ,law.invention - Published
- 2019
Discovery Service for Jio Institute Digital Library