13 results for "Razzazi, Farbod"
Search Results
2. LMDT: A weakly-supervised large-margin-domain-transfer for handwritten digit recognition.
- Author
- Hosseinzadeh, Hamidreza and Razzazi, Farbod
- Subjects
- HANDWRITING recognition (Computer science), DISCRIMINATION learning, DATA distribution, COEFFICIENTS (Statistics), ERROR rates
- Abstract
Performance of handwritten character recognition systems degrades significantly when they are trained and tested on different databases. In this paper, we propose a novel large-margin domain-transfer algorithm that jointly reduces the data-distribution mismatch between the training (source) and test (target) datasets and learns a target classifier from a set of pre-learned classifiers trained on labeled source data, together with a few available target labels. The proposed method optimizes the combination coefficients of the pre-learned classifiers to minimize the mismatch between results on the source and target datasets. Our method is applicable in both semi-supervised and unsupervised domain-adaptation scenarios, whereas most previous competing domain-adaptation methods work only in the semi-supervised scenario. Experiments on adaptation across different handwritten digit datasets demonstrate that the method achieves superior classification accuracy on target sets compared with state-of-the-art methods. Quantitative evaluation shows that unsupervised adaptation reduces the error rate by 40.2% compared with an SVM classifier trained on the labeled samples from the source domain. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
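The coefficient-optimization idea in the LMDT abstract above can be illustrated with a toy sketch: a few hypothetical "pre-learned" source classifiers are combined by fitting combination weights on a handful of labeled target samples. This is only a least-squares stand-in for the paper's large-margin objective; the classifiers, data, and all names below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three hypothetical "pre-learned" source classifiers, each giving a
# real-valued decision score for a 2-D input (sign = predicted class).
classifiers = [
    lambda X: X[:, 0],            # relies on feature 0
    lambda X: X[:, 1],            # relies on feature 1
    lambda X: X[:, 0] + X[:, 1],  # uses both
]

# A few labeled target-domain samples (labels in {-1, +1}).
X_t = rng.normal(size=(20, 2))
y_t = np.sign(X_t[:, 0] + 0.2 * X_t[:, 1])

# Stack the base scores and fit the combination coefficients by least
# squares -- a simple stand-in for the paper's large-margin optimization.
S = np.column_stack([f(X_t) for f in classifiers])   # shape (20, 3)
beta, *_ = np.linalg.lstsq(S, y_t, rcond=None)

def predict(X):
    scores = np.column_stack([f(X) for f in classifiers]) @ beta
    return np.sign(scores)

acc = np.mean(predict(X_t) == y_t)
```

Because the target decision boundary lies in the span of the base classifiers' scores, the fitted combination separates the labeled target samples well.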
3. SR-NBS: A fast sparse representation based N-best class selector for robust phoneme classification.
- Author
- Saeb, Armin, Razzazi, Farbod, and Babaie-Zadeh, Massoud
- Subjects
- AUTOMATIC speech recognition, SEARCH algorithms, ROBUST control, PHONEME (Linguistics), PROBLEM solving, COMPUTATIONAL complexity
- Abstract
Although exemplar-based approaches have shown good accuracy in classification problems, some limitations are observed in the accuracy of exemplar-based automatic speech recognition (ASR) applications. The main limitation of these algorithms is their high computational complexity, which makes them difficult to extend to ASR applications. In this paper, an N-best class selector is introduced based on sparse representation (SR) and a tree search strategy. In this approach, classification is performed in three steps. First, the set of training samples similar to the specific test sample is selected by a k-dimensional (KD) tree search algorithm. Then, an SR-based N-best class selector is used to limit the classification to certain classes. This adapts the classifier to each test sample and reduces the empirical risk. Finally, a well-known low-error-rate classifier is trained on the selected exemplar samples and employed to classify among the candidate classes. The algorithm is applied to phoneme classification and compared with several well-known phoneme classifiers in terms of accuracy and complexity. With this approach, we obtain a competitive classification rate with promising computational complexity compared with state-of-the-art phoneme classifiers in clean and well-known acoustic noisy environments, which makes this approach a suitable candidate for ASR applications. [Copyright Elsevier]
- Published
- 2014
- Full Text
- View/download PDF
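The three-step pipeline described in the SR-NBS abstract above can be sketched on toy data. Here brute-force distance sorting stands in for the KD-tree search, inverse-distance class voting stands in for the sparse-representation coefficients, and a nearest-centroid rule stands in for the final classifier; all data and class layouts are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy training set: 4 classes as well-separated 2-D Gaussian clusters.
centers = np.array([[0, 0], [4, 0], [0, 4], [4, 4]], float)
X_train = np.vstack([c + 0.3 * rng.normal(size=(30, 2)) for c in centers])
y_train = np.repeat(np.arange(4), 30)

def n_best_classify(x, k=15, n_best=2):
    # Step 1: nearest-neighbor candidate set (brute force here; the
    # paper uses a KD-tree search for this step).
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]
    # Step 2: shortlist the N best classes by how strongly the neighbors
    # "represent" x (inverse-distance weights stand in for the
    # sparse-representation coefficients).
    w = 1.0 / (d[idx] + 1e-9)
    scores = np.bincount(y_train[idx], weights=w, minlength=4)
    candidates = np.argsort(scores)[::-1][:n_best]
    # Step 3: final decision among the shortlisted classes only, here by
    # nearest class centroid (stand-in for the final trained classifier).
    cents = np.array([X_train[y_train == c].mean(axis=0) for c in candidates])
    return candidates[np.argmin(np.linalg.norm(cents - x, axis=1))]

pred = n_best_classify(np.array([3.8, 0.2]))
```

Restricting the final classifier to the N-best shortlist is what keeps the per-sample cost low while the candidate set adapts to each test sample.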
4. A clustering based feature selection method in spectro-temporal domain for speech recognition
- Author
- Esfandian, Nafiseh, Razzazi, Farbod, and Behrad, Alireza
- Subjects
- SPEECH perception, SIGNAL processing, CLUSTER analysis (Statistics), FINITE Gaussian mixture models (Statistics), COMPARATIVE studies, COVARIANCE matrices
- Abstract
Spectro-temporal representation of speech has become one of the leading signal representation approaches in speech recognition systems in recent years. This representation suffers from the high dimensionality of the feature space, which makes the domain unsuitable for practical speech recognition systems. In this paper, a new clustering-based method is proposed for secondary feature selection/extraction in the spectro-temporal domain. In the proposed representation, Gaussian mixture models (GMM) and weighted K-means (WKM) clustering techniques are applied to the spectro-temporal domain to reduce the dimensionality of the feature space. The elements of the centroid vectors and covariance matrices of the clusters are taken as the attributes of the secondary feature vector of each frame. To evaluate the efficiency of the proposed approach, the new feature vectors were tested on classification of phonemes in the main phoneme categories of the TIMIT database. Employing the proposed secondary feature vector yielded a significant improvement in the classification rate of different sets of phonemes compared with MFCC features. The average improvement in the classification rate of voiced plosives relative to MFCC features is 5.9% using WKM clustering and 6.4% using GMM clustering. The greatest improvement, about 7.4%, is obtained by using WKM clustering in classification of front vowels relative to MFCC features. [Copyright Elsevier]
- Published
- 2012
- Full Text
- View/download PDF
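The secondary-feature construction described above (cluster the spectro-temporal points of a frame, then stack centroid and covariance elements) can be sketched as follows. Plain K-means stands in for the paper's weighted K-means / GMM step, and the two-blob "frame" is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# A toy "spectro-temporal" frame: points in a 2-D plane drawn from two
# blobs (all coordinates are invented for illustration).
points = np.vstack([
    rng.normal([1.0, 1.0], 0.1, size=(50, 2)),
    rng.normal([3.0, 2.0], 0.1, size=(50, 2)),
])

def frame_features(pts, n_clusters=2, n_iter=20):
    # Plain K-means (a stand-in for the paper's WKM / GMM clustering);
    # the secondary feature vector stacks each cluster's centroid and
    # covariance-matrix elements.
    centroids = np.array([pts.min(axis=0), pts.max(axis=0)])  # spread init
    for _ in range(n_iter):
        d = np.linalg.norm(pts[:, None, :] - centroids[None], axis=2)
        labels = np.argmin(d, axis=1)
        centroids = np.array([pts[labels == c].mean(axis=0)
                              for c in range(n_clusters)])
    feats = []
    for c in range(n_clusters):
        feats.extend(centroids[c])                        # centroid elements
        feats.extend(np.cov(pts[labels == c].T).ravel())  # covariance elements
    return np.array(feats)

fv = frame_features(points)  # 2 clusters x (2 centroid + 4 covariance) = 12
```

The point of the construction is dimensionality reduction: however many spectro-temporal points the frame has, the secondary vector has a fixed, small length.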
5. A novel approach to HMM-based speech recognition systems using particle swarm optimization
- Author
- Najkar, Negin, Razzazi, Farbod, and Sameti, Hossein
- Subjects
- HIDDEN Markov models, ALGORITHMS, SPEECH perception, PARTICLE swarm optimization, COMPUTATIONAL complexity, DYNAMIC programming, MAXIMUM likelihood statistics
- Abstract
The main core of HMM-based speech recognition systems is the Viterbi algorithm. The Viterbi algorithm uses dynamic programming to find the best alignment between the input speech and a given speech model. In this paper, dynamic programming is replaced by a search method based on the particle swarm optimization algorithm. The main idea is to generate an initial population of segmentation vectors in the solution search space and to improve the locations of the segments with an updating algorithm. Several methods are introduced and evaluated for the representation of particles and their corresponding movement structures. In addition, two segmentation strategies are explored. The first is standard segmentation, which tries to maximize the likelihood function for each competing acoustic model separately. In the second, a global segmentation is tied across several models, and the system tries to optimize the likelihood using this common tied segmentation. The results show that the effect of these factors is noticeable in finding the global optimum while maintaining the system's accuracy. The idea was tested on isolated-word recognition and phone classification tasks and shows significant performance in both accuracy and computational complexity. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
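The core idea above — treat the alignment's segmentation vector as a particle position and let a swarm search for the likelihood maximum instead of dynamic programming — can be sketched on a toy two-state alignment. The observation model, state means, and swarm constants below are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy observation sequence: state A (mean 0) for 8 frames, then
# state B (mean 5) for 12 frames.
obs = np.concatenate([rng.normal(0, 1, 8), rng.normal(5, 1, 12)])
T = len(obs)

def log_likelihood(b):
    # Alignment score for boundary b: frames [0, b) belong to state A
    # (mean 0), frames [b, T) to state B (mean 5), unit variance.
    b = int(np.clip(round(b), 1, T - 1))
    return -0.5 * np.sum(obs[:b] ** 2) - 0.5 * np.sum((obs[b:] - 5) ** 2)

# A minimal particle swarm over the (here 1-D) segmentation vector.
n_particles, n_iter = 10, 40
pos = rng.uniform(1, T - 1, n_particles)
vel = np.zeros(n_particles)
pbest = pos.copy()
pbest_val = np.array([log_likelihood(p) for p in pos])
gbest = pbest[np.argmax(pbest_val)]

for _ in range(n_iter):
    r1, r2 = rng.random(n_particles), rng.random(n_particles)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 1, T - 1)
    vals = np.array([log_likelihood(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[np.argmax(pbest_val)]

boundary = int(round(gbest))  # should land near the true change at frame 8
```

In the paper the particle is a full multi-segment vector per utterance rather than a single boundary, but the update rule has the same shape.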
6. Corrigendum to "Non-invasive Localization of the Ectopic Foci of Focal Atrial Tachycardia by Using ECG Signal based Sparse Decomposition Algorithm".
- Author
- Mohammadi, Fatemeh, Sheikhani, Ali, Razzazi, Farbod, and Ghorbani Sharif, Alireza
- Subjects
- TACHYCARDIA, ELECTROCARDIOGRAPHY, ALGORITHMS
- Published
- 2022
- Full Text
- View/download PDF
7. A non-linear mapping representing human action recognition under missing modality problem in video data.
- Author
- Gharahdaghi, Aidin, Razzazi, Farbod, and Amini, Arash
- Subjects
- HUMAN behavior, NONLINEAR operators, KINECT (Motion sensor), VIDEOS, DATABASES
- Abstract
Human action recognition from standard video files is a well-studied problem in the literature. In this study, we assume access to single-modality standard data of some actions (training data). Based on this data, we aim to identify the actions present in target-modality video data without any explicit source-target relationship information. In this case, the training and test phases of the recognition task are based on different imaging modalities. Our goal in this paper is to introduce a mapping (a nonlinear operator) on both modalities such that the outcomes share some common features. These common features are then used to recognize the actions in each domain. Simulation results on the MSRDailyActivity3D, MSRActionPairs, UTKinect-Action3D, and SBU Kinect Interaction datasets show that the introduced method outperforms state-of-the-art methods by a success-rate margin of 15% on average. • We propose a nonlinear mapping for cross-modal human action recognition. • We propose a cropping strategy to extract the salient segment of video frames. • The method is developed without using any joint RGB-D auxiliary dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
8. Non-invasive localization of the ectopic foci of focal atrial tachycardia by using ECG signal based sparse decomposition algorithm.
- Author
- Mohammadi, Fatemeh, Sheikhani, Ali, Razzazi, Farbod, and Ghorbani Sharif, Alireza
- Subjects
- ALGORITHMS, TACHYCARDIA, ATRIAL flutter, SUPRAVENTRICULAR tachycardia, INDEPENDENT component analysis, LEFT heart atrium, IMAGE reconstruction
- Abstract
Over 95% of supraventricular tachycardias (SVT) can be treated using an electrophysiology study (EPS) and catheter ablation. Given the lack of accurate tools to guide electrophysiologists to the exact location of ectopic foci, the aim of the present study was to locate the ectopic foci of focal atrial tachycardia (FAT) using information extracted from simple, non-invasive electrocardiogram (ECG) signals. To this end, 32 ECG signals were collected from the Tehran arrhythmia clinic database, and independent features were obtained through independent component analysis (ICA). Subsequently, the location of the arrhythmia in the right or left atrium was established and classified by a sparse decomposition algorithm, using a dictionary developed from these independent features. This algorithm represents the spatial-temporal pattern of the input as a linear combination of the basis functions in the dictionary; over the past few decades, it has also been widely utilized in image compression and restoration as well as source separation and classification. The results demonstrated that the sparse decomposition algorithm was well able to locate arrhythmias in the right and left atria, with a mean accuracy of 93.27 ± 2.78. Since FAT generally emerges in certain anatomical locations, five areas in the right atrium and four areas in the left atrium were defined as nine classes; the accuracy across these classes was 70.24% on average. It was therefore concluded that ECG signals can accurately estimate the location of ectopic foci before EPS. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
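The dictionary-based classification above can be sketched with a common simplification: represent the test signal over each class's sub-dictionary and pick the class with the smallest reconstruction residual. Plain least squares stands in here for the sparse decomposition, and the two sub-dictionaries and the "right"/"left" labels are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical dictionary: atoms (columns) grouped by class, e.g.
# feature patterns for "right atrium" vs "left atrium" foci.
D_right = rng.normal(size=(16, 5)) + np.linspace(0, 2, 16)[:, None]
D_left  = rng.normal(size=(16, 5)) - np.linspace(0, 2, 16)[:, None]

def classify(x):
    # Represent x over each class sub-dictionary and pick the class with
    # the smallest reconstruction residual -- a least-squares stand-in
    # for the sparse-decomposition step in the paper.
    residuals = []
    for D in (D_right, D_left):
        coef, *_ = np.linalg.lstsq(D, x, rcond=None)
        residuals.append(np.linalg.norm(D @ coef - x))
    return ["right", "left"][int(np.argmin(residuals))]

# A test signal built from the "right" sub-dictionary (plus small noise)
# should be reconstructed best by that sub-dictionary.
x = D_right @ rng.normal(size=5) + 0.05 * rng.normal(size=16)
label = classify(x)
```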
9. Enhanced target detection using a new cognitive sonar waveform design in shallow water.
- Author
- Pakdel Azar, Omid, Amiri, Hadi, and Razzazi, Farbod
- Subjects
- SONAR, WATER depth, GENETIC algorithms
- Abstract
A cognitive waveform design is proposed for use in SONAR (Sound Navigation and Ranging) systems for detecting targets in underwater environments. The study adopts a waveform design procedure that exploits prior knowledge of environmental reverberation and target signals, based on maximizing the signal-to-interference-plus-noise ratio (SINR) at the receiver. A genetic algorithm is employed to find the optimal specifications in a wideband waveform design. Using an initial Linear Frequency Modulated (LFM) waveform as input to the proposed optimization algorithm, an optimal waveform is obtained that shows significant improvement in SINR. A number of experiments simulate the Probability of Detection (POD) versus the Signal-to-Reverberation Ratio (SRR) in underwater environments. The simulation results reveal considerable improvement in target detection using the optimal waveform design compared with the initial LFM waveform, as well as a substantial reduction in the bandwidth of the optimized waveform. Our method solves the problem of finding the optimal frequency for designing a cognitive waveform appropriate to the conditions of the environment and the target; the genetic algorithm solves the waveform design problem that maximizes the SINR. The paper provides a general methodology for extracting the optimal frequency parameters of the waveform and can be extended to other parameters in cognitive design. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
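The SINR-maximizing search above can be sketched with a toy real-coded genetic algorithm over a single waveform parameter (a carrier frequency). The SINR landscape below is an invented stand-in for the paper's reverberation and target models, and the GA constants are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)

def sinr(f):
    # Hypothetical SINR landscape over carrier frequency f (kHz): a
    # target response peaked near 3 kHz against reverberation that
    # decays with frequency, plus a constant noise floor.
    target = np.exp(-(f - 3.0) ** 2)
    reverb = 1.0 / (1.0 + f)
    return target / (reverb + 0.1)

# Minimal real-coded genetic algorithm: truncation selection,
# blend crossover, Gaussian mutation.
pop = rng.uniform(0.0, 10.0, size=40)
for _ in range(30):
    fit = sinr(pop)
    parents = pop[np.argsort(fit)[::-1][:20]]           # keep the best half
    a = rng.choice(parents, 40)
    b = rng.choice(parents, 40)
    w = rng.random(40)
    pop = w * a + (1 - w) * b                           # blend crossover
    pop = np.clip(pop + rng.normal(0, 0.2, 40), 0, 10)  # mutation

best_f = pop[np.argmax(sinr(pop))]  # should converge near the SINR peak
```

In the paper the chromosome encodes the full wideband waveform specification rather than one scalar, but the select / cross / mutate loop is the same.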
10. A scale–rate filter selection method in the spectro-temporal domain for phoneme classification.
- Author
- Fartash, Mehdi, Setayeshi, Saeed, and Razzazi, Farbod
- Subjects
- FEATURE extraction, MACHINE learning, GENETIC algorithms, MACHINE theory, EVOLUTIONARY algorithms, SUPPORT vector machines, MATHEMATICAL models, COMPUTER systems
- Abstract
Recently, there has been a significant increase in studies employing auditory models in speech recognition systems. In this paper, we propose a new evolutionarily tuned feature extraction method based on spectro-temporal analysis. In our model, each phoneme has its own subspace, with a specific best scale for the spectral filter and a specific best rate for the temporal filter. These two parameters are obtained by a genetic cellular-automata evolutionary algorithm. The features extracted from the phoneme-specific subspace are classified by a binary one-versus-rest support vector machine, and a multiclass classifier for all phonemes is built by combining these sub-models. The proposed method improves the discrimination of phonemes significantly, especially for highly confusable phonemes. To show the efficiency of the proposed feature sets, they were empirically compared with two baseline models. The relative improvements in classification rate are about 10% for voiced plosives, unvoiced plosives, and nasals, and about 7.38% for front vowels, relative to the state-of-the-art baseline model. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
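The per-phoneme scale-rate selection above can be sketched as a search over a small (scale, rate) grid scored by class separability. Exhaustive search and a Fisher ratio stand in for the paper's genetic cellular-automata evolution, and the grid values and simulated filter responses are invented.

```python
import numpy as np

rng = np.random.default_rng(8)

scales = [0.5, 1.0, 2.0]   # hypothetical scale grid (cyc/oct)
rates  = [2.0, 4.0, 8.0]   # hypothetical rate grid (Hz)

def response(s, r, label):
    # Simulated filter output: only the (1.0, 4.0) pair produces a
    # class-dependent shift for the target phoneme (label 1).
    shift = 2.0 if (s, r) == (1.0, 4.0) and label == 1 else 0.0
    return rng.normal(shift, 1.0, 60)

def fisher_ratio(a, b):
    # Between-class separation over within-class spread.
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var() + 1e-12)

# Exhaustive grid search (the paper evolves this choice instead).
best, best_score = None, -1.0
for s in scales:
    for r in rates:
        score = fisher_ratio(response(s, r, 1), response(s, r, 0))
        if score > best_score:
            best, best_score = (s, r), score
```

The selected (scale, rate) pair would then define the subspace whose features feed the one-versus-rest SVM for that phoneme.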
11. Decision fusion of horizontal and vertical trajectories for recognition of online Farsi subwords
- Author
- Ghods, Vahid, Kabir, Ehsanollah, and Razzazi, Farbod
- Subjects
- DATA fusion (Statistics), MATHEMATICAL combinations, FEATURE selection, CLASSIFICATION, DIGITAL signal processing, PATTERN recognition systems
- Abstract
Online handwriting is formed by a combination of horizontal and vertical trajectories. If these trajectories are treated separately, new recognition methods emerge; in contrast, a single classifier is usually used to recognize handwriting. In this work, features for the x(t) and y(t) signals were proposed and used to build two separate classifiers. After initial recognition by these classifiers, their results were fused for the final recognition. Using HMM classifiers and the simple product rule for decision fusion, the recognition results on 42 classes of Farsi subwords showed promising performance. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
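The product-rule fusion mentioned above is simple to sketch: multiply the per-class posteriors from the two trajectory classifiers element-wise and take the argmax. The posterior values below are invented for illustration (3 candidate classes instead of the paper's 42).

```python
import numpy as np

# Per-class posterior estimates from two hypothetical classifiers, one
# trained on the horizontal x(t) signal, one on the vertical y(t) signal.
p_horizontal = np.array([0.50, 0.30, 0.20])
p_vertical   = np.array([0.20, 0.45, 0.35])

# Product-rule fusion: multiply the class posteriors element-wise and
# renormalize; the fused decision is the argmax.
fused = p_horizontal * p_vertical
fused /= fused.sum()
decision = int(np.argmax(fused))  # decision == 1
```

Note that the horizontal classifier alone would pick class 0; the product rule lets the vertical classifier's stronger evidence for class 1 override it.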
12. Phone-based filter parameter optimization of filter and sum robust speech recognition using likelihood maximization
- Author
- Kouhi-Jelehkaran, Bahram, Bakhshi, Hamidreza, and Razzazi, Farbod
- Subjects
- MATHEMATICAL optimization, AUTOMATIC speech recognition, MAXIMUM entropy method, INFORMATION filtering, MICROPHONES, CALIBRATION, ALGORITHMS
- Abstract
Because of noise and reverberation, the accuracy of speech recognition systems decreases as the distance between the talker and the microphone increases. By using microphone arrays and appropriate filtering of the received signals, the accuracy of the recognizer can be increased. The many proposed methods for using microphone arrays fall into two main approaches: systems that perform two independent stages, array processing followed by recognition, and systems that use array processing to generate a sequence of features that maximizes the likelihood of generating the correct hypothesis in the recognition phase. Following the second approach, this paper proposes a new method for microphone array processing in which the array-processing parameters are adjusted in a calibration phase, based on the phones of the language and the maximum likelihood criterion. The optimized filter parameters are stored and used during the recognition phase, in which a modified Viterbi algorithm employs the optimal phone-based filter parameters. The proposed algorithm is formulated analytically, and a Persian-language task is used to measure the improvement in speech recognition accuracy relative to the delay-and-sum and utterance-based filter-and-sum algorithms. The results show a 12.2% improvement in accuracy over the utterance-based algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
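The delay-and-sum baseline mentioned above is easy to sketch on a toy two-microphone array: undo the known steering delay on the second channel and average, which reduces the uncorrelated noise. The paper's filter-and-sum method replaces the fixed steering with phone-dependent filters optimized for recognizer likelihood; the signal, delay, and noise levels below are invented.

```python
import numpy as np

fs = 16000
t = np.arange(0, 0.02, 1 / fs)           # 320 samples
clean = np.sin(2 * np.pi * 440 * t)      # toy source signal

# Two-microphone toy array: the same source arrives at mic 2 with a
# 5-sample delay; independent noise at each microphone.
rng = np.random.default_rng(7)
delay = 5
mic1 = clean + 0.3 * rng.normal(size=len(t))
mic2 = np.roll(clean, delay) + 0.3 * rng.normal(size=len(t))

# Delay-and-sum beamforming: align the channels, then average.
aligned2 = np.roll(mic2, -delay)
beamformed = 0.5 * (mic1 + aligned2)

def snr_db(sig):
    noise = sig - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

gain = snr_db(beamformed) - snr_db(mic1)  # ~3 dB for 2 mics
```

Averaging M channels with independent noise improves the SNR by about 10·log10(M) dB, which is the baseline the phone-based filter optimization then improves on.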
13. A weakly supervised representation learning for modulation recognition of short duration signals.
- Author
- Hosseinzadeh, Hamidreza, Einalou, Zahra, and Razzazi, Farbod
- Subjects
- SUPERVISED learning, PROBLEM solving, ADDITIVE white Gaussian noise channels, DIGITAL modulation
- Abstract
• A weakly supervised representation learning (WSRL) method is proposed for AMR of short-duration signals. • WSRL is robust when facing a limited number of labeled signals. • Experimental results indicate the effectiveness of WSRL in AWGN and flat-fading channels. Modulation recognition of short-duration signals is a challenging problem in various civil and military applications, and computing discriminative features for these signals is a serious difficulty. In this study, a robust feature representation is presented to solve this problem. In addition, the learned feature representation has considerable potential to recognize different modulation types when only a limited number of labeled signals is available. In the proposed method, the dataset is randomly divided into equal-size disjoint subsets, and fuzzy c-means clustering is performed on the data in each subset to obtain pseudo cluster labels. A projection function is learned for each subset to create an ensemble of projection functions, and the data are then represented using these functions. The learned feature representation is tested in AWGN and flat-fading channels. Experimental results indicate a significant improvement in classification accuracy compared with conventional features. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
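The subset/pseudo-label/ensemble pipeline above can be sketched on toy data: split the unlabeled set into disjoint subsets, cluster each subset, and represent every sample by its distances to each subset's centroids. Plain k-means stands in for the fuzzy c-means step, the distance map stands in for the learned projection functions, and all data and sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(5)

# Unlabeled toy "signal features": two latent groups in 4-D.
X = np.vstack([rng.normal(0, 1, (40, 4)), rng.normal(3, 1, (40, 4))])
X = X[rng.permutation(len(X))]

def cluster_centroids(data, k=2, n_iter=15):
    # Plain k-means stands in for the fuzzy c-means step that yields
    # the pseudo cluster labels in the paper.
    cents = data[np.argsort(data.sum(axis=1))[[0, -1]]]  # spread init
    for _ in range(n_iter):
        d = np.linalg.norm(data[:, None] - cents[None], axis=2)
        lbl = d.argmin(axis=1)
        cents = np.array([data[lbl == c].mean(axis=0) if (lbl == c).any()
                          else cents[c] for c in range(k)])
    return cents

# Split the data into disjoint subsets and learn one "projection" per
# subset -- here simply distances to that subset's centroids -- then
# represent every sample by the concatenated ensemble of projections.
subsets = np.array_split(rng.permutation(len(X)), 4)
projections = [cluster_centroids(X[idx]) for idx in subsets]

def represent(x):
    return np.concatenate(
        [np.linalg.norm(cents - x, axis=1) for cents in projections])

Z = np.vstack([represent(x) for x in X])  # 80 samples x (4 subsets * 2)
```

The ensemble of projections is what gives the representation its robustness: no single clustering has to be right, and the concatenated view can be fed to any downstream classifier trained on the few available labels.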
Discovery Service for Jio Institute Digital Library