Author: "Dehghan Firoozabadi, Ali" / Topic: spectral estimation - Searchworks@Jio Institute Digital Library Search Results

Your search for "Dehghan Firoozabadi, Ali" returned 3 results

1. A novel method for estimating the number of speakers based on generalized eigenvalue–vector decomposition and adaptive wavelet transform by using K-means clustering

Author: Dehghan Firoozabadi, Ali, Irarrazaval, Pablo, Adasme, Pablo, Zabala-Blanco, David, and Azurdia-Meza, Cesar
Published: 2020
Full Text: View/download PDF

2. Speaker Counting Based on a Novel Hive Shaped Nested Microphone Array by WPT and 2D Adaptive SRP Algorithms in Near-Field Scenarios.

Author: Dehghan Firoozabadi, Ali, Adasme, Pablo, Zabala-Blanco, David, Palacios Játiva, Pablo, and Azurdia-Meza, Cesar
Subjects: *MICROPHONES, *MICROPHONE arrays, *FISHER discriminant analysis, *ACOUSTIC localization, *RECURRENT neural networks, *SPEECH enhancement, *ALGORITHMS
Abstract: Speech processing algorithms, especially sound source localization (SSL), speech enhancement, and speaker tracking are considered to be the main fields in this application. Most speech processing algorithms require knowing the number of speakers for real implementation. In this article, a novel method for estimating the number of speakers is proposed based on the hive shaped nested microphone array (HNMA) by wavelet packet transform (WPT) and 2D sub-band adaptive steered response power (SB-2DASRP) with phase transform (PHAT) and maximum likelihood (ML) filters, and, finally, the agglomerative classification and elbow criteria for obtaining the number of speakers in near-field scenarios. The proposed HNMA is presented for aliasing and imaging elimination and preparing the proper signals for the speaker counting method. In the following, the Blackman–Tukey spectral estimation method is selected for detecting the proper frequency components of the recorded signal. The WPT is considered for smart sub-band processing by focusing on the frequency bins of the speech signal. In addition, the SRP method is implemented in 2D format and adaptively by ML and PHAT filters on the sub-band signals. The SB-2DASRP peak positions are extracted on various time frames based on the standard deviation (SD) criteria, and the final number of speakers is estimated by unsupervised agglomerative clustering and elbow criteria. The proposed HNMA-SB-2DASRP method is compared with the frequency-domain magnitude squared coherence (FD-MSC), i-vector probabilistic linear discriminant analysis (i-vector PLDA), ambisonics features of the correlational recurrent neural network (AF-CRNN), and speaker counting by density-based classification and clustering decision (SC-DCCD) algorithms on noisy and reverberant environments, which represents the superiority of the proposed method for real implementation. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
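
Note on the abstract above: it names the Blackman–Tukey spectral estimation method without giving details. A minimal sketch of the textbook estimator (window the biased autocorrelation, then take its cosine transform) might look like the following. The Bartlett lag window, the `max_lag` value, the frequency grid, and the function name `blackman_tukey_psd` are illustrative assumptions, not taken from the article.

```python
import math


def blackman_tukey_psd(x, max_lag, n_freqs=100):
    """Textbook Blackman-Tukey PSD estimate (sketch, not the article's code).

    Returns (freqs, psd), with freqs in cycles/sample on [0, 0.5).
    """
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]  # remove the DC offset first
    # Biased autocorrelation estimate for lags 0..max_lag (requires max_lag < n)
    r = [sum(xc[i] * xc[i + m] for i in range(n - m)) / n
         for m in range(max_lag + 1)]
    # Bartlett (triangular) lag window: an assumed choice for this sketch
    w = [1.0 - m / (max_lag + 1) for m in range(max_lag + 1)]
    freqs = [k / (2.0 * n_freqs) for k in range(n_freqs)]
    psd = []
    for f in freqs:
        # The windowed autocorrelation is symmetric, so its DFT is a cosine sum
        s = w[0] * r[0]
        for m in range(1, max_lag + 1):
            s += 2.0 * w[m] * r[m] * math.cos(2.0 * math.pi * f * m)
        psd.append(s)
    return freqs, psd


# Hypothetical test signal: a sinusoid at 0.1 cycles/sample
signal = [math.sin(2.0 * math.pi * 0.1 * t) for t in range(256)]
freqs, psd = blackman_tukey_psd(signal, max_lag=32)
peak_freq = freqs[max(range(len(psd)), key=psd.__getitem__)]
```

A smaller `max_lag` gives a smoother but lower-resolution spectrum. The abstract does not say which trade-off the authors chose.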

3. 3D Multiple Sound Source Localization by Proposed Cuboids Nested Microphone Array in Combination with Adaptive Wavelet-Based Subband GEVD.

Author: Dehghan Firoozabadi, Ali, Irarrazaval, Pablo, Adasme, Pablo, Zabala-Blanco, David, Palacios-Játiva, Pablo, and Azurdia-Meza, Cesar
Subjects: ACOUSTIC localization, LOCALIZATION (Mathematics), MICROPHONE arrays, PROBABILITY density function, K-means clustering, WAVELET transforms
Abstract: Sound source localization is one of the applicable areas in speech signal processing. The main challenge appears when the aim is a simultaneous multiple sound source localization from overlapped speech signals with an unknown number of speakers. Therefore, a method able to estimate the number of speakers, along with the speaker's location, and with high accuracy is required in real-time conditions. The spatial aliasing is an undesirable effect of the use of microphone arrays, which decreases the accuracy of localization algorithms in noisy and reverberant conditions. In this article, a cuboids nested microphone array (CuNMA) is first proposed for eliminating the spatial aliasing. The CuNMA is designed to receive the speech signal of all speakers in different directions. In addition, the inter-microphone distance is adjusted for considering enough microphone pairs for each subarray, which prepares appropriate information for 3D sound source localization. Subsequently, a speech spectral estimation method is considered for evaluating the speech spectrum components. The suitable spectrum components are selected and the undesirable components are denied in the localization process. The speech information is different in frequency bands. Therefore, the adaptive wavelet transform is used for subband processing in the proposed algorithm. The generalized eigenvalue decomposition (GEVD) method is implemented in sub-bands on all nested microphone pairs, and the probability density function (PDF) is calculated for estimating the direction of arrival (DOA) in different sub-bands and continuing frames. The proper PDFs are selected by thresholding on the standard deviation (SD) of the estimated DOAs and the rest are eliminated. This process is repeated on time frames to extract the best DOAs. Finally, K-means clustering and silhouette criteria are considered for DOAs classification in order to estimate the number of clusters (speakers) and the related DOAs. 
All DOAs in each cluster are intersected for estimating the position of the 3D speakers. The closest point to all DOA planes is selected as a speaker position. The proposed method is compared with a hierarchical grid (HiGRID), perpendicular cross-spectra fusion (PCSF), time-frequency wise spatial spectrum clustering (TF-wise SSC), and spectral source model-deep neural network (SSM-DNN) algorithms based on the accuracy and computational complexity of real and simulated data in noisy and reverberant conditions. The results show the superiority of the proposed method in comparison with other previous works. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF
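
Note on the abstract above: its final step estimates the number of speakers by clustering DOA estimates with K-means and choosing the cluster count by a silhouette criterion. The sketch below illustrates that general idea under simplifying assumptions: DOAs are treated as one-dimensional angles in degrees, centroids are initialized deterministically from quantiles, and all function names and sample values are hypothetical. It is not the article's implementation.

```python
def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's K-means on scalars with a deterministic quantile init."""
    data = sorted(values)
    n = len(data)
    centers = [data[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def mean_silhouette(values, labels, k):
    """Average silhouette score using absolute difference as the distance."""
    scores = []
    for i, v in enumerate(values):
        own = [abs(v - u) for j, u in enumerate(values)
               if j != i and labels[j] == labels[i]]
        others = []
        for c in range(k):
            if c == labels[i]:
                continue
            dists = [abs(v - u) for u, lab in zip(values, labels) if lab == c]
            if dists:
                others.append(sum(dists) / len(dists))
        if not own or not others:
            scores.append(0.0)  # convention for singleton or isolated clusters
            continue
        a = sum(own) / len(own)  # mean distance within the own cluster
        b = min(others)          # mean distance to the nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def estimate_num_speakers(doas, k_max=5):
    """Pick the k in 2..k_max with the highest mean silhouette score."""
    best_k, best_score = None, float("-inf")
    for k in range(2, k_max + 1):
        labels, _ = kmeans_1d(doas, k)
        score = mean_silhouette(doas, labels, k)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Hypothetical DOA estimates (degrees) scattered around three directions
doas = [28, 29, 30, 31, 32, 88, 89, 90, 91, 92, 148, 149, 150, 151, 152]
num_speakers = estimate_num_speakers(doas)
```

The silhouette score is undefined for a single cluster, so this sketch cannot return one speaker. Real DOAs are also circular (0 and 360 degrees coincide), which the absolute-difference distance used here ignores.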
