Descriptor: "Computer Science::Sound" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Computer Science::Sound"' showing total 30,752 results

1. Test-Measured Rényi Divergences

Author: Milan Mosonyi and Fumio Hiai
Subjects: FOS: Computer and information sciences, Quantum Physics, Computer Science::Sound, Information Theory (cs.IT), Computer Science - Information Theory, FOS: Physical sciences, Mathematical Physics (math-ph), Library and Information Sciences, Quantum Physics (quant-ph), Mathematical Physics, Computer Science Applications, Information Systems
Abstract: One possibility of defining a quantum Rényi $\alpha$-divergence of two quantum states is to optimize the classical Rényi $\alpha$-divergence of their post-measurement probability distributions over all possible measurements (measured Rényi divergence), and maybe regularize these quantities over multiple copies of the two states (regularized measured Rényi $\alpha$-divergence). A key observation behind the theorem for the strong converse exponent of asymptotic binary quantum state discrimination is that the regularized measured Rényi $\alpha$-divergence coincides with the sandwiched Rényi $\alpha$-divergence when $\alpha>1$. Moreover, it also follows from the same theorem that to achieve this, it is sufficient to consider $2$-outcome measurements (tests) for any number of copies (this is somewhat surprising, as achieving the measured Rényi $\alpha$-divergence for $n$ copies might require a number of measurement outcomes that diverges in $n$, in general). In view of this, it seems natural to expect the same when $\alpha<1$ [...] $\alpha>1$ case. Comment: v3: 30 pages, minor improvements. Thanks to a comment by an anonymous reviewer, we can now show that the two different ways to regularize the test-measured Rényi $\alpha$-divergence lead to different quantities
Published: 2023
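
Note: the abstract above builds on the classical Rényi $\alpha$-divergence of two probability distributions, $D_\alpha(p\|q) = \frac{1}{\alpha-1}\log\sum_i p_i^{\alpha} q_i^{1-\alpha}$. The following is a minimal sketch of that classical quantity only, not of the paper's quantum or regularized constructions. The function names are hypothetical, and restricting a "test" to a deterministic two-outcome coarse-graining of the sample space is an assumption made for illustration.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi alpha-divergence (natural log) of distributions p and q."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("this sketch covers alpha > 0, alpha != 1 only")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # the term p^alpha * q^(1-alpha) vanishes
        if qi == 0:
            if alpha > 1:
                return math.inf  # p is not supported inside q
            continue  # for alpha < 1 the term is zero
        total += pi ** alpha * qi ** (1 - alpha)
    if total == 0:
        return math.inf  # disjoint supports
    return math.log(total) / (alpha - 1)

def two_outcome_coarse_graining(p, subset):
    """Distribution of a deterministic 2-outcome test: (P[subset], 1 - P[subset])."""
    mass = sum(p[i] for i in subset)
    return [mass, 1.0 - mass]

# Illustration (assumed example data): a two-outcome test cannot increase
# the divergence, by the data-processing inequality.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
full = renyi_divergence(p, q, 2.0)
coarse = renyi_divergence(two_outcome_coarse_graining(p, {0}),
                          two_outcome_coarse_graining(q, {0}), 2.0)
```

Here `coarse` is at most `full`, which is the classical fact behind optimizing over measurements.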

2. Quantum Rényi Divergences and the Strong Converse Exponent of State Discrimination in Operator Algebras

Author: Fumio Hiai and Milán Mosonyi
Subjects: FOS: Computer and information sciences, Quantum Physics, Nuclear and High Energy Physics, Information Theory (cs.IT), Computer Science - Information Theory, Mathematics - Operator Algebras, FOS: Physical sciences, Statistical and Nonlinear Physics, Computer Science::Social and Information Networks, Mathematical Physics (math-ph), 81P45, 81P18, 94A17, 46L52, 46L53, 81R15, 62H15, High Energy Physics::Theory, Computer Science::Sound, FOS: Mathematics, Quantum Physics (quant-ph), Operator Algebras (math.OA), Mathematical Physics
Abstract: The sandwiched Rényi $\alpha$-divergences of two finite-dimensional quantum states play a distinguished role among the many quantum versions of Rényi divergences as the tight quantifiers of the trade-off between the two error probabilities in the strong converse domain of state discrimination. In this paper we show the same for the sandwiched Rényi divergences of two normal states on an injective von Neumann algebra, thereby establishing the operational significance of these quantities. Moreover, we show that in this setting, again similarly to the finite-dimensional case, the sandwiched Rényi divergences coincide with the regularized measured Rényi divergences, another distinctive feature of the former quantities. Our main tool is an approximation theorem (martingale convergence) for the sandwiched Rényi divergences, which may be used for the extension of various further results from the finite-dimensional to the von Neumann algebra setting. We also initiate the study of the sandwiched Rényi divergences of pairs of states on a $C^*$-algebra, and show that the above operational interpretation, as well as the equality to the regularized measured Rényi divergence, holds more generally for pairs of states on a nuclear $C^*$-algebra. Comment: 40 pages. To appear in Annales Henri Poincaré
Published: 2022

3. Large-Signal Equivalent-Circuit Model of Asymmetric Electrostatic Transducers

Author: Jorge M. Monsalve, Lutz Ehrig, Harald Schenk, Bert Kaiser, Holger Conrad, Michael Stolz, David Schuffenhauer, and Anton Melnikov
Subjects: Physics, Microphone, Acoustics, Computer Science Applications, law.invention, Capacitor, Nonlinear system, Transducer, Computer Science::Sound, Control and Systems Engineering, law, Distortion, Equivalent circuit, Electrical and Electronic Engineering, Charge amplifier, Network model
Abstract: This article presents a circuit model that is able to capture the full nonlinear behavior of an asymmetric electrostatic transducer whose dynamics are governed by a single degree of freedom. Effects such as stress-stiffening and pull-in are accounted for. The simulation of a displacement-dependent capacitor and a nonlinear spring is accomplished with arbitrary behavioral sources, which are a standard component of circuit simulators. As an application example, the parameters of the model were fitted to emulate the behavior of an electrostatic MEMS loudspeaker whose finite-element (FEM) simulations and acoustic characterisation were already reported in the literature. The obtained waveforms show good agreement with the amplitude and distortion that were reported both in the transient FEM simulations and in the experimental measurements. This model is also used to predict the performance of this device as a microphone, coupling it to a two-stage charge amplifier. Additional complex behaviors can be introduced to this network model if required.
Published: 2022

4. Fast Exact Dynamic Time Warping on Run-Length Encoded Time Series

Author: Vincent Froese, Brijnesh Jain, Maciej Rymar, Mathias Weller, Technische Universität Berlin (TU), Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Informatique Gaspard-Monge (LIGM), and École des Ponts ParisTech (ENPC)-Centre National de la Recherche Scientifique (CNRS)-Université Gustave Eiffel
Subjects: FOS: Computer and information sciences, Dynamic Programming, General Computer Science, Applied Mathematics, [INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS], 510 Mathematik, Block Matrices, 68W32, block matrix, Computer Science Applications, Computer Science::Sound, Computer Science - Data Structures and Algorithms, line intersections, Data Structures and Algorithms (cs.DS), Sparse Data, Time Series Similarity
Abstract: Dynamic Time Warping (DTW) is a well-known similarity measure for time series. The standard dynamic programming approach to compute the DTW distance of two length-$n$ time series, however, requires $O(n^2)$ time, which is often too slow for real-world applications. Therefore, many heuristics have been proposed to speed up the DTW computation. These are often based on lower bounding techniques, approximating the DTW distance, or considering special input data such as binary or piecewise constant time series. In this paper, we present a first exact algorithm to compute the DTW distance of two run-length encoded time series whose running time only depends on the encoding lengths of the inputs. The worst-case running time is cubic in the encoding length. In experiments we show that our algorithm is indeed fast for time series with short encoding lengths.
Published: 2022
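
Note: for orientation, the standard quadratic-time dynamic program that the abstract above uses as its baseline can be sketched as follows, together with a run-length encoder showing what "encoding length" refers to. This is not the paper's algorithm, which operates directly on the encoded series. The function names are hypothetical, and the absolute difference as local cost is an assumption (other conventions, such as squared differences, are also common).

```python
def run_length_encode(series):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs = []
    for value in series:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs

def dtw_distance(x, y):
    """Standard O(len(x) * len(y)) dynamic program for DTW on non-empty series.

    Local cost is |x[i] - y[j]| (an assumption for this sketch).
    Only two rows of the DP table are kept in memory.
    """
    inf = float("inf")
    m = len(y)
    prev = [0.0] + [inf] * m  # row 0: only the empty-empty alignment costs 0
    for i in range(1, len(x) + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of: step in x, step in y, step in both
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]

# A piecewise constant series of length 6 has an encoding of length 3.
runs = run_length_encode([1, 1, 2, 2, 2, 3])
```

The run-length encoding of a piecewise constant series can be much shorter than the series itself, which is the regime the abstract targets.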

5. Experimental and Numerical Investigation of Acoustic Performance for Full-Sized SPS

Author: Yu Gao, Kun Liu, Wenan Jiang, and Yubin Fang
Subjects: Article Subject, Computer Science::Sound, Mechanics of Materials, Mechanical Engineering, Geotechnical Engineering and Engineering Geology, Condensed Matter Physics, Civil and Structural Engineering
Abstract: The sound insulation of a sandwich plate system (SPS) was measured by the sound pressure method under fixed-support boundary conditions in a reverberant sound field. The results were compared with those obtained using the finite element method. The sound insulation curves obtained via experiments and numerical simulation were observed to be in agreement. This indicates that the numerical simulation method can effectively reflect the sound insulation performance of the structure. In addition, the influence of different parameters on the sound insulation performance of the structure was evaluated using the finite element method, and the parameters were ranked by their influence on sound insulation using the MATLAB function “fsrftest”. It was found that in the low-frequency domain, the length-to-width ratio of the SPS had the most significant effect on the sound insulation performance of the structure, and the mass ratio of the panel to core exerted the least influence on it. Furthermore, in the medium- and high-frequency domain, the main factors affecting the sound insulation differed between frequency ranges, so the frequency range should be considered during the design of the structure. The results can provide technical support for the analysis of the sound insulation performance of SPSs.
Published: 2022

6. Gradient Solitons on Doubly Warped Product Manifolds

Author: Blaga, Adara M. and Taştan, Hakan M.
Subjects: Mathematics - Differential Geometry, Differential Geometry (math.DG), Computer Science::Sound, FOS: Mathematics, Statistical and Nonlinear Physics, Mathematics::Differential Geometry, Mathematical Physics
Abstract: First, we provide new characterizations for doubly warped product manifolds. Then we consider several types of gradient solitons on them, such as Riemann, Ricci, Yamabe and conformal, and examine the effect of a gradient soliton on a doubly warped product on its factor manifolds. Finally, we investigate the concircularly flat and conharmonically flat cases of doubly warped products. Comment: 14 pages
Published: 2022

7. Algorithms for audio inpainting based on probabilistic nonnegative matrix factorization

Author: Ondřej Mokrý, Paul Magron, Thomas Oberlin, Cédric Févotte, Brno University of Technology [Brno] (BUT), Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria), Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Université de Lorraine (UL), Centre National de la Recherche Scientifique (CNRS), Institut Supérieur de l'Aéronautique et de l'Espace (ISAE-SUPAERO), Signal et Communications (IRIT-SC), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse Capitole (UT Capitole), Université Toulouse 1 Capitole (UT1), Université de Toulouse (UT), Université Toulouse - Jean Jaurès (UT2J), Université Toulouse III - Paul Sabatier (UT3), Institut National Polytechnique (Toulouse) (Toulouse INP), Toulouse Mind & Brain Institut (TMBI), Université Fédérale Toulouse Midi-Pyrénées, Czech Science Foundation (GAČR) Project No. 20-29009S, Artificial and Natural Intelligence Toulouse Institute (ANITI, 2019) ANR-19-P3IA-0004, and European Research Council (ERC FACTORY) CoG-6681839
Subjects: FOS: Computer and information sciences, Sound (cs.SD), [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing, audio inpainting, nonnegative matrix factorization, Computer Science - Sound, expectation-maximization, [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, Audio and Speech Processing (eess.AS), Computer Science::Sound, Control and Systems Engineering, alternating minimization, Signal Processing, FOS: Electrical engineering, electronic engineering, information engineering, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Software, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Audio inpainting, i.e., the task of restoring missing or occluded audio signal samples, usually relies on sparse representations or autoregressive modeling. In this paper, we propose to structure the spectrogram with nonnegative matrix factorization (NMF) in a probabilistic framework. First, we treat the missing samples as latent variables, and derive two expectation-maximization algorithms for estimating the parameters of the model, depending on whether we formulate the problem in the time or time-frequency domain. Then, we treat the missing samples as parameters, and we address this novel problem by deriving an alternating minimization scheme. We assess the potential of these algorithms for the task of restoring short- to middle-length gaps in music signals. Experiments reveal great convergence properties of the proposed methods, as well as competitive performance when compared to state-of-the-art audio inpainting techniques.
Published: 2023

8. Audio Bank: A High-Level Acoustic Signal Representation for Audio Event Recognition

Author: Sukanya Sonowal, Jin Young Choi, and Tushar Sandhan
Subjects: FOS: Computer and information sciences, Audio mining, Sound (cs.SD), Artificial neural network, Event (computing), Computer science, business.industry, Feature vector, Speech recognition, Speech coding, Acoustic model, Pattern recognition, Computer Science - Sound, Non-negative matrix factorization, Computer Science - Information Retrieval, Support vector machine, Audio and Speech Processing (eess.AS), Computer Science::Sound, Computer Science::Multimedia, FOS: Electrical engineering, electronic engineering, information engineering, Artificial intelligence, business, Information Retrieval (cs.IR), Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Automatic audio event recognition plays a pivotal role in making human-robot interaction closer and has wide applicability in industrial automation, control and surveillance systems. An audio event is composed of intricate phonic patterns which are harmonically entangled. Audio recognition is dominated by low- and mid-level features, which have demonstrated their recognition capability but have high computational cost and low semantic meaning. In this paper, we propose a new computationally efficient framework for audio recognition. Audio Bank, a new high-level representation of audio, is comprised of distinctive audio detectors representing each audio class in frequency-temporal space. The dimensionality of the resulting feature vector is reduced using non-negative matrix factorization, preserving its discriminability and rich semantic information. The high audio recognition performance using several classifiers (SVM, neural network, Gaussian process classification and k-nearest neighbors) shows the effectiveness of the proposed method. Comment: 6 pages, 9 figures, published in IEEE International Conf ICCAS 2014 (Best paper award)
Published: 2023

9. Simulation Analysis of the Sound Transmission Loss of Composite Laminated Cylindrical Shells with Applied Acoustic Coverings

Author: Bin Li, Haoyang Ding, Zhigang Hu, Junya Zhao, Menghui Qi, and Zhiyong Wang
Subjects: Article Subject, Computer Science::Sound, Mechanics of Materials, Mechanical Engineering, Geotechnical Engineering and Engineering Geology, Condensed Matter Physics, Civil and Structural Engineering
Abstract: The noise reduction problem of composite laminated cylindrical shells at low and medium frequencies, from 150 Hz to 1000 Hz, is addressed by laying acoustic coverings. Noise reduction experiments are conducted in a cylindrical shell cavity lined with melamine foam, sound-absorbing cotton, and multilayer combination materials, and the corresponding transmission loss curves are obtained. Additionally, based on the LMS Virtual Lab acoustic simulation software, finite element models corresponding to the noise reduction experiments are established, and the acoustic cavity’s simple positive frequency and acoustic response of the cavity are numerically calculated. Based on this, the influence law for a laid acoustic cover layer on the sound transmission loss of a cylindrical shell is investigated. The results show that the noise reduction of sound-absorbing cotton with the same thickness is about 1.26 times that of melamine foam, and the noise reduction of melamine foam with the same mass is about 1.42 times that of sound-absorbing cotton. For multilayer laying, the noise reduction of adding the same thickness of butyl rubber is about 6.18 times that of melamine foam, and the larger the laying ratio is, the better the noise reduction effect will be.
Published: 2022

10. Neural Network-Based Dynamic Segmentation and Weighted Integrated Matching of Cross-Media Piano Performance Audio Recognition and Retrieval Algorithm

Author: Tianshu Wang
Subjects: Big Data, Article Subject, General Computer Science, Computer Science::Sound, General Mathematics, General Neuroscience, Image Processing, Computer-Assisted, Recognition, Psychology, Neural Networks, Computer, General Medicine, Algorithms
Abstract: This paper presents a dynamic segmentation and weighted comprehensive matching algorithm based on neural networks for cross-media piano performance audio recognition and retrieval. The 3D convolutional neural network process is separated to compress the network parameters and improve the computational speed. Skip connection and layer-wise learning rate solve the problem that the separated network is challenging to train. The piano performance audio recognition is facilitated by shuffle operation. In pattern recognition, music retrieval algorithms are gaining more and more attention due to their ease of implementation and efficiency. However, the problems of imprecise dynamic note segmentation and inconsistent matching templates directly affect the accuracy of the MIR algorithm. We propose a dynamic threshold-based segmentation and weighted comprehensive matching algorithm to solve these problems. The amplitude difference step is dynamically set, and the notes are segmented according to the changing threshold to improve the accuracy of note segmentation. A standard score frequency is used to transform the pitch template to achieve input normalization to enhance the accuracy of matching. Direct matching and DTW matching are fused to improve the adaptability and robustness of the algorithm. Finally, the effectiveness of the method is experimentally demonstrated. This paper implements the data collection and processing, audio recognition, and retrieval algorithm for cross-media piano performance big data through three main modules: the collection, processing, and storage module of cross-media piano performance big data, the building module of audio recognition of cross-media piano performance big data, and the dynamic precision module of cross-media piano performance big data.
Published: 2022

11. Application of Spectrum Analysis Technology in Music Audio Analysis

Author: Yi Li
Subjects: Article Subject, General Computer Science, Computer Science::Sound
Abstract: In order to improve the music analysis technology, this paper studies the music analysis technology combined with the spectrum analysis technology and builds an intelligent audio analysis model. In this paper, the nonlinear theoretical method is adopted, and the motion equation of the audio frequency is obtained through variable processing, so as to obtain the mean square fluctuation of the two sound wave models and then obtain the entanglement and compression. Moreover, this paper introduces the combined mode method to describe the interaction of the two laser fields and atomic matter and verifies that both the differential modes are decoupled from the interaction and only the sum mode participates in the interaction. The experiment verifies that the music audio analysis system based on spectrum analysis technology proposed in this paper can play an important role in music analysis.
Published: 2022

12. Gaussian mixture model based adaptive control for uncertain nonlinear systems with complex state constraints

Author: Yong Zhao, Rong Chen, Yuzhu Bai, and Yi Wang
Subjects: Nonlinear system, Adaptive control, Computer Science::Sound, Control theory, Computer science, Robustness (computer science), Mechanical Engineering, Bounded function, Terminal sliding mode, Aerospace Engineering, Nonlinear control, Mixture model
Abstract: This paper addresses an uncertain nonlinear control system problem with complex state constraints and mismatched uncertainties. A novel Gaussian Mixture Model (GMM) based adaptive PID-Nonsingular Terminal Sliding Mode Control (NTSMC) (GMM-adaptive-PID-NTSMC) method is proposed. It is achieved by combining a GMM based adaptive potential function with a novel switching surface of PID-NTSMC. Next, the stability of the closed-loop system is proved. The main contribution of this paper is that the GMM method is applied to obtain the analytic description of the complex bounded state constraints, ensuring that the states’ constraints are not violated with GMM-based adaptive potential function. The developed potential function can consider the influence of uncertainties. More importantly, the GMM-adaptive-PID-NTSMC can be generalized to control a more representative class of uncertain nonlinear systems with constrained states and mismatched uncertainties. In addition, the proposed controller enhances the robustness, and requires less control cost and reduces the steady state error with respect to the Artificial Potential Function based Nonsingular Terminal Sliding Mode Control (APF-NTSMC), GMM-NTSMC and GMM-adaptive-NTSMC. At last, numerical simulation is performed to validate the superior performance of the proposed controller.
Published: 2022

13. Robust shallow water reverberation reduction methods based on low-rank and sparsity decomposition

Author: Yunchao Zhu, Rui Duan, and Kunde Yang
Subjects: Acoustics and Ultrasonics, Arts and Humanities (miscellaneous), Computer Science::Sound
Abstract: Using the characteristics of low rank for reverberation and sparsity for the target echo in multi-ping detection, the low-rank and sparsity decomposition method can effectively reduce reverberation. However, in the case of highly sparse reverberation or a stationary target, the distinctions in the characteristics between the reverberation and target echo become ambiguous. As a result, the reverberation reduction performance is degraded. To guarantee a meaningful decomposition based on the random orthogonal model and random sparsity model, the identifiability condition (IC) for the decomposition was derived from the perspective of the low-rank matrix and sparse matrix, respectively. According to the IC, sparsity compensation for the low-rank matrix was proposed to address the false alarm probability inflation (FAPI) induced by highly sparse reverberation. In addition, increasing the dimension of the sparse matrix was also proposed to manage the detection probability shrinkage caused by a stationary target. The robust reverberation reduction performance was validated via simulations and field experiments. It is demonstrated that FAPI can be eliminated by increasing the sparse coefficient of the low-rank matrix to 0.30 and a stationary target could be detected with a large ping number, i.e., a high dimension, of the sparse matrix.
Published: 2022

14. Acoustic and Viscometric Investigations of Aqueous D-Pantothenic Acid Calcium Hemi-salt

Author: Subhash V. Kinnake
Subjects: Computer Science::Sound
Abstract: Acoustic and viscometric measurements are increasingly being used to investigate the properties of pure components as well as the nature, strength, and order of intermolecular interactions between constituents in solution. The density (ρ), viscosity (η), and ultrasonic velocity (U) of aqueous D-Pantothenic acid hemi-calcium salt at 298.15 K and 300.15 K were measured. Various thermo-acoustic parameters such as adiabatic compressibility, free length (Lf), free volume (Vf), internal pressure, acoustic impedance (Z), Gibbs free energy (G), and molar sound velocity (R) have been calculated using experimentally measured data. Acoustic parameters are useful in understanding molecular interactions in binary liquid mixtures. Keywords: Ultrasonic velocity, Free length, acoustical parameters, Acoustic impedance.
Published: 2022

15. Selective Partial Update Adaptive Filtering Algorithms for Block-Sparse System Identification

Author: Dandan Wei and Qing Xu
Subjects: Computer Science::Sound, Computer Networks and Communications, Hardware_ARITHMETICANDLOGICSTRUCTURES, Information Systems
Abstract: The block-sparse normalized least mean square (BS-NLMS) algorithm, which takes advantage of sparsity, shows fast convergence in adaptive block-sparse system identification, adaptive control, and other industrial informatics applications. It is also attractive in acoustic processing, where a long impulse response and a highly correlated, sparse echo path are encountered. However, the major drawback of BS-NLMS is its high computational complexity. This paper proposes a novel selective partial-update block-sparse normalized least mean square (SPU-BS-NLMS) algorithm. Compared with conventional BS-NLMS for block-sparse system identification, the proposed selective partial-update block-sparse NLMS algorithm updates only a subset of blocks, determined by the smallest squared Euclidean norm at each iteration, instead of all block coefficients, to save computations. Computational complexity analysis is conducted to help researchers select appropriate parameters for practical realizations and applications. Computer simulations on acoustic echo cancellation are conducted to verify the results and the effectiveness of the proposed algorithm.
Published: 2022
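
Note: the algorithms in the abstract above extend the plain normalized least mean square (NLMS) filter. A minimal sketch of that plain NLMS update for system identification is given below. It implements neither the block-sparse penalty nor the selective partial-update block selection of the paper. The function names, the step size `mu`, the regularizer `eps`, and the example impulse response are all assumptions chosen for illustration.

```python
import random

def fir_filter(h, x):
    """Output of an FIR system with impulse response h driven by input x."""
    buf = [0.0] * len(h)
    out = []
    for sample in x:
        buf = [sample] + buf[:-1]  # newest sample first
        out.append(sum(hi * bi for hi, bi in zip(h, buf)))
    return out

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Plain NLMS: w <- w + mu * e * x_vec / (eps + ||x_vec||^2)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps
    for sample, desired in zip(x, d):
        buf = [sample] + buf[:-1]
        estimate = sum(wi * bi for wi, bi in zip(w, buf))
        error = desired - estimate
        norm = eps + sum(b * b for b in buf)
        w = [wi + mu * error * bi / norm for wi, bi in zip(w, buf)]
    return w

# Hypothetical block-sparse echo path: one active block of two taps.
rng = random.Random(0)
true_h = [0.0, 0.0, 0.8, -0.4, 0.0, 0.0]
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = fir_filter(true_h, x)  # noiseless desired signal
w = nlms_identify(x, d, num_taps=len(true_h))
```

Every coefficient is updated at every sample here, so the per-sample cost grows with the filter length. Reducing that cost by updating only selected blocks is the motivation stated in the abstract.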

16. Recognition and Error Correction Techniques for Piano Playing Music Based on Convolutional Cyclic Hashing Method

Author: Dan Wang
Subjects: Computer Science::Sound, Computer Networks and Communications, Electrical and Electronic Engineering, Computer Science::Databases, Information Systems
Abstract: Music as a sound symbol can express what people think; music is both a form of social behavior and can promote people’s emotional communication; music is also a form of entertainment and can enrich people’s spiritual life. In this paper, we propose a new convolutional recurrent hashing method CRNNH, which uses multilayer RNN to learn to discriminate piano playing music using convolutional feature map sequences. Firstly, a convolutional feature map sequence learning preserving similarity hash function is designed consisting of multilayer convolutional feature maps extracted from multiple convolutional layers of a pretrained CNN; secondly, a new deep learning framework is proposed to generate hash codes using a multilayer RNN, which directly uses the convolutional feature maps as input to preserve the spatial structure of the feature maps; finally, a new loss function is proposed to preserve the semantic similarity and balance of the hash codes, while considering the quantization error generated when the hash layer outputs binary hash codes. The experimental results illustrate that the proposed CRNNH can obtain better performance compared to other hashing methods.
Published: 2022

17. AcousNet: A Deep Learning Based Approach to Dynamic 3D Holographic Acoustic Field Generation From Phased Transducer Array

Author: Yao Guo, Song Liu, David C. Jeong, Yuyu Jia, and Chengxi Zhong
Subjects: Forward kinematics, Control and Optimization, Inverse kinematics, Computer science, Phased array, Mechanical Engineering, Acoustics, Biomedical Engineering, Holography, Acoustic wave, Sound power, Interference (wave propagation), Computer Science Applications, law.invention, Computer Science::Robotics, Human-Computer Interaction, Transducer, Computer Science::Sound, Artificial Intelligence, Control and Systems Engineering, law, ComputerSystemsOrganization_SPECIAL-PURPOSEANDAPPLICATION-BASEDSYSTEMS, Computer Vision and Pattern Recognition
Abstract: Holographic acoustic field has shown great potential for non-contact robotic manipulations of millimeter or sub-millimeter size objects to effectively deliver acoustic power. The latest technology for generating dynamic holographic acoustic field is through phased transducer array, where relative phases of emitted acoustic waves from transducers are independently controlled to modulate the acoustic interference field. While the forward kinematics of a phased array based robotic manipulation system is simple and straightforward, the inverse kinematics (i.e., the mapping from a given holographic acoustic field to array phases for control purpose), however, is mathematically non-linear and unsolvable, presenting challenges in developing wider applications of holographic acoustic field for robotic manipulation. Considering, thus far, there are still no effective solutions reported, the authors put intensive efforts to solve this problem using a machine learning approach, which we refer to as AcousNet. Experimental results demonstrate the effectiveness of the proposed method for dynamic holographic acoustic field generation from phased transducer array.
Published: 2022

18. Research on Low-Frequency Noise Control Based on Fractal Coiled Acoustic Metamaterials

Author: Hongyu Cui, Chengtao Liu, and Haoming Hu
Subjects: Article Subject, Computer Science::Sound, Mechanics of Materials, Mechanical Engineering, Physics::Optics, Physics::Classical Physics, Geotechnical Engineering and Engineering Geology, Condensed Matter Physics, Civil and Structural Engineering
Abstract: In this paper, a self-similar fractal coiled acoustic metamaterial is designed for the low-frequency noise control problem by combining a perforated plate and a coiled back cavity structure. Based on the multiphysics field coupling method and thermoviscous acoustic theory, the effect of structural parameter changes on the sound absorption performance of the metamaterial structure is investigated using finite element analysis. The sound energy dissipation mechanism of the metamaterial structure is studied. Finally, 3D printing technology is used to prepare the metamaterial model structure, and the low-frequency sound absorption performance of the metamaterial structure is tested through structural sound absorption performance experiments, which verify that the metamaterial has subwavelength sound absorption performance.
Published: 2022

19. A Novel Denoise Method of Acoustic Signal from Train Bearings Based on Resampling Technique and Improved Crazy Climber Algorithm

Author: Yali Sun, Hua Li, Xing Zhao, Jiyou Fei, Xiaodong Liu, and Yijie Niu
Subjects: Article Subject, Computer Science::Sound, Mechanics of Materials, Mechanical Engineering, Geotechnical Engineering and Engineering Geology, Condensed Matter Physics, Civil and Structural Engineering
Abstract: The wayside acoustic defective bearing detector system (TADS) is located on both sides of the railway, so the acoustic signals recorded by the microphone include not only the sound from the train bearings but also sound from other disturbance sources. The heavy noise and multisource acoustic signals badly reduce the reliability and accuracy of the TADS detection result. In order to extract the useful information from the recorded signal exactly and efficiently, a novel denoising method based on the short-time Fourier transform (STFT) and an improved Crazy Climber algorithm is proposed in this paper. First, the STFT is performed on the recorded acoustic signals to obtain the time-frequency distribution matrix. Building on the original algorithm, a novel movement rule and a fitting process for the ridge lines are presented, which extract the time-frequency ridge lines of the acoustic signal accurately and rapidly. In this way, the important information from the train bearings can be separated from the heavy noise and other signals. Finally, simulation and experimental verifications were carried out, and the denoising method based on the STFT and the improved Crazy Climber algorithm proved effective in extracting the ridge lines of the time-frequency distribution matrix and in separating the useful information from the recorded acoustic signals.
Published: 2022
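
Illustrative note on entry 19: the first step in the abstract, computing an STFT time-frequency matrix, can be sketched as below. This is a minimal NumPy sketch under stated assumptions, not the authors' method: the function names, frame length, hop size, and the synthetic 1 kHz tone are all hypothetical, and the ridge is found by naive per-frame peak picking, a much simpler stand-in for the improved Crazy Climber algorithm.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Return the (n_frames, frame_len // 2 + 1) STFT magnitude matrix."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row t holds samples t*hop ... t*hop + frame_len - 1.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.abs(np.fft.rfft(signal[idx] * window, axis=1))

def peak_ridge(magnitude, sample_rate, frame_len=256):
    """Frequency in Hz of the strongest bin in each frame (naive peak picking)."""
    return np.argmax(magnitude, axis=1) * sample_rate / frame_len

# Hypothetical test signal: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
tf_matrix = stft_magnitude(tone)
ridge_hz = peak_ridge(tf_matrix, sample_rate)
```

With these assumed settings the tone falls exactly on a frequency bin, so the picked ridge is flat. A real bearing signal would need the ridge-tracking rules the abstract describes.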

20. Automatic Classification Method of Music Genres Based on Deep Belief Network and Sparse Representation

Author: Lina Pan
Subjects: ComputingMethodologies_PATTERNRECOGNITION, Article Subject, Computer Science::Sound, General Mathematics
Abstract: To address the poor classification performance, low accuracy, and long running time of current automatic music genre classification methods, an automatic classification method based on a deep belief network and sparse representation is proposed. The music signal is preprocessed by framing, pre-emphasis, and windowing, and the characteristic parameters of the music signal are extracted by Mel-frequency cepstral coefficient (MFCC) analysis. The restricted Boltzmann machines are trained layer by layer to obtain the connection weights between the layers of the deep belief network model. According to the output classification, the connection weights in the model are fine-tuned using the error back-propagation algorithm. Based on the fine-tuned deep belief network model, the structure of the music genre classification network is designed. This is combined with a sparse-representation classification algorithm: for the training samples of each music genre, the sparse solution is obtained using the minimum norm, the sparse representation of the test vector is calculated, the category of the sample is judged, and the automatic classification of music genres is realized. The experimental results show that the proposed method classifies music genres better, with higher classification accuracy and effectively shorter classification time.
Published: 2022
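
Illustrative note on entry 20: the preprocessing chain named in the abstract (pre-emphasis, framing, windowing) can be sketched as below. This is a minimal NumPy sketch, not the author's implementation. The function name, the pre-emphasis coefficient 0.97, the 400-sample frame, the 160-sample hop, and the Hamming window are common defaults assumed here for illustration, and pre-emphasis is applied before framing for simplicity.

```python
import numpy as np

def preprocess(signal, frame_len=400, hop=160, alpha=0.97):
    """Pre-emphasize, frame, and window a 1-D signal (hypothetical helper)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    # Pre-emphasis: y[n] = x[n] - alpha * x[n-1], with y[0] = x[0].
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Framing: keep only full frames; row t starts at sample t * hop.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    # Windowing: taper each frame with a Hamming window.
    return emphasized[idx] * np.hamming(frame_len)

# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
frames = preprocess(np.sin(2 * np.pi * 440 * np.arange(16000) / 16000))
```

Each row of the result is one windowed frame, ready for a per-frame spectrum and Mel filter bank in an MFCC pipeline.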

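21. Study on the Design of an Ultrasonic Vibration System for Forming a Rotating Sound Field: Quantitative Evaluation of Changes in Vibration Distribution Caused by Transducer Driving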

Subjects: Computer Science::Sound, sound field, flexural vibration, ultrasonic vibration, Langevin transducer, Physics::Chemical Physics, Computer Science::Formal Languages and Automata Theory
Abstract: A rotating sound field excited by a flexurally vibrating disk or ring is used in some high-power ultrasonic applications. In general, two or more transducers are used for this purpose, with the number or position of the transducers varied. When the transducers are attached to the disk or ring, the vibration distribution in it does not exactly match the natural mode of vibration. This paper presents a quantitative evaluation of this difference in the flexural vibration distribution of a disk driven by Langevin transducers.
Published: 2022

22. On the Robustness and Efficiency of the Plane-Wave-Enriched FEM with Variable q-Approach on the 2D Room Acoustics Problem

Author: Shunichi Mukae, Takeshi Okuzono, and Kimihiro Sakagami
Subjects: wave-based modeling, Computer Science::Sound, architectural acoustics, finite element method, acoustic simulation, General Medicine, Mathematics::Numerical Analysis
Abstract: Partition of unity finite element method with plane wave enrichment (PW-FEM) uses a shape function with a set of plane waves propagating in various directions. For room acoustic simulations in a frequency domain, PW-FEM can be an efficient wave-based prediction method, but its practical applications and especially its robustness must be studied further. This study elucidates PW-FEM robustness via 2D real-scale office room problems including rib-type acoustic diffusers. We also demonstrate PW-FEM performance using a sparse direct solver and a high-order Gauss–Legendre rule with a recently developed rule for ascertaining the number of integration points against the classical linear and quadratic FEMs. Numerical experiments investigating mesh size and room geometrical complexity effects on the robustness of PW-FEM demonstrated that PW-FEM becomes more robust at wide bands when using a mesh in which the maximum element size maintains a comparable value to the wavelength of the upper-limit frequency. Moreover, PW-FEM becomes unstable with lower spatial resolution mesh, especially for rooms with complex shape. Comparisons of accuracies and computational costs of linear and quadratic FEM revealed that PW-FEM requires twice the computational time of the quadratic FEM with a mesh having spatial resolution of six elements per wavelength, but it is highly accurate at wide bands with lower memory and with markedly fewer degrees of freedom. As an additional benefit of PW-FEM, the impulse response waveform of quadratic FEM in a time domain was found to deteriorate over time, but the PW-FEM waveform can maintain accurate waveforms over a long time.
Published: 2022

23. Contribution of Even/Odd Sound Wave Modes in Human Cochlear Model on Excitation of Traveling Waves and Determination of Cochlear Input Impedance

Author: Wenjia Hong and Yasushi Horii
Subjects: auditory mechanism, cochlea, traveling wave theory, compressible perilymph, even/odd mode analysis, Computer Science::Sound, Computer Science::Mathematical Software, Computer Science::Programming Languages, General Medicine
Abstract: Based on the Navier–Stokes equation for compressible media, this work studies the acoustic properties of a human cochlear model in which the scala vestibuli and scala tympani are filled with compressible perilymph. Since sound waves propagate as compression waves in perilymph, this model can precisely handle wave-based phenomena. Time domain analysis showed that a sound wave (fast wave) first propagates in the scala vestibuli and scala tympani, and then a traveling wave (slow wave) is generated by the sound wave with some delay. Detailed studies based on even and odd mode analysis indicate that an odd mode sound wave, that is, the difference in sound pressure between the scala vestibuli and scala tympani, excites Békésy’s traveling wave, while an even mode sound wave determines the input impedance of the cochlea.
Published: 2022

24. Optimization and Modeling of Radial Pitch Diameter Difference in Tapping of AISI H13

Author: Jie Ren, Tingting Li, Zhi Chen, Yu Meng, Rui Zhang, and Xianguo Yan
Subjects: Article Subject, Computer Science::Sound, General Engineering, General Materials Science
Abstract: The radial pitch diameter difference has a great influence on the quality of the internal thread. However, it is difficult to accurately control the radial pitch diameter difference of the thread in the tapping. Therefore, the influence of various factors on radial pitch diameter difference for tapping AISI H13 steel was studied in this paper. Parameters with optimum radial pitch diameter difference were determined by the Taguchi method, and the tapping experiment was carried out according to Taguchi L18 orthogonal array. Based on the signal-to-noise ratio and variance analysis, the experimental results were evaluated to determine the combination of factors to obtain the smallest radial pitch diameter difference and the influence level of each factor on radial pitch diameter difference, and the prediction equation of radial pitch diameter difference was established through the regression analysis. The results show that the combination of factors to obtain the smallest radial pitch diameter difference is a hone radius of 10 μm, a spindle speed of 100 rev/min, and a chamfer length of 2 pitches; the order of importance of the influencing factors on radial pitch diameter difference is spindle speed, followed by hone radius and chamfer length, and their percentage contribution rates are 61.54%, 24.53%, and 6.16%, respectively; the coefficient of determination $R^2$ of the prediction equation is 0.925; the confirmation experiment conducted with 95% confidence level shows that Taguchi method and prediction equation successfully optimize and predict radial pitch diameter difference.
Published: 2022
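
Illustrative note on entry 24: the abstract ranks factor combinations by a Taguchi signal-to-noise ratio. The abstract does not state which form is used; since the goal is the smallest radial pitch diameter difference, the sketch below assumes the standard smaller-the-better form, S/N = -10 log10((1/n) Σ y_i²). The function name and the measurement values are hypothetical and are not taken from the paper.

```python
import math

def sn_smaller_is_better(values):
    """Taguchi smaller-the-better S/N ratio in dB for repeated measurements."""
    if not values:
        raise ValueError("at least one measurement is required")
    mean_square = sum(v * v for v in values) / len(values)
    return -10.0 * math.log10(mean_square)

# Hypothetical repeated measurements (arbitrary units) for two trial runs.
run_a = [0.01, 0.02]
run_b = [0.05, 0.06]
sn_a = sn_smaller_is_better(run_a)
sn_b = sn_smaller_is_better(run_b)
```

Under this form, the run with the smaller measured differences receives the higher S/N ratio, so the preferred factor levels are those that maximize it.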

25. Acoustic properties of the deep screened sonar radiator

Author: Anatolii Derepa, Olha Pozdniakova, Oleksandr Leiko, Oleksii Bohdanov, Anastasіia Osadcha, and Kateryna Shyshkova
Subjects: Computer Science::Sound, Computer Science::Neural and Evolutionary Computation, Physics::Medical Physics, Physics::Optics
Abstract: The necessity of using rigid acoustic screens in hydroacoustic stations with deep-water antennas of variable depth is substantiated. Using the method of coupled fields in multiconnected regions, analytical expressions are obtained for calculating the acoustic fields of antennas formed from cylindrical piezoceramic radiators with rigid screens in the form of open rings of finite thickness. The solution of the “pass-through” radiation problem for the shielded transducer takes into account the interactions of the electrical, mechanical, and acoustic fields during energy conversion; the interactions of the acoustic fields of the shell and the screen during energy generation; and the processes of energy conversion and formation. Quantitative results of acoustic field calculations based on the derived expressions are presented.
Published: 2022

26. Local Sound Speed Estimation for Pulse-Echo Ultrasound in Layered Media

Author: Arsenii V. Telichko, Jose G. Vilches-Moure, Ramasamy Paulmurugan, Jeremy J. Dahl, Rehman Ali, Huaijun Wang, and Uday Kumar Sukumar
Subjects: Synthetic aperture radar, Reverberation, geography, geography.geographical_feature_category, Acoustics and Ultrasonics, Phantoms, Imaging, Computer science, Acoustics, Estimator, Function (mathematics), Article, Imaging phantom, Rats, Sound, Heart Rate, Computer Science::Sound, Speed of sound, Animals, Coherence (signal processing), Electrical and Electronic Engineering, Instrumentation, Sound (geography), Ultrasonography
Abstract: Our previous methodology in local sound speed estimation utilized time delays measured by the cross correlation of delayed full-synthetic aperture channel data to estimate the average speed of sound. However, focal distortions in this methodology lead to biased estimates of the average speed of sound, which, in turn, leads to biased estimates of the local speed of sound. Here, we demonstrate the bias in the previous methodology and introduce a coherence-based average sound speed estimator that eliminates this bias and is computationally much cheaper in practice. Because this coherence-based approach estimates the average sound speed in the medium over an equally spaced grid in depth rather than time, we derive a refined model that relates the local and average speeds of sound as a function of depth in layered media. A fast, closed-form inversion of this model yields highly accurate local sound speed estimates. The root-mean-square (rms) error of local sound speed reconstruction in simulations of two-layer media is 4.6 and 2.5 m/s at 4 and 8 MHz, respectively. This work examines the impact of frequency, f-number, aberration, and reverberation on sound speed estimation. Phantom and in vivo experiments in rats further validate the coherence-based sound speed estimator.
Published: 2022

27. Acoustic emissions in directed energy deposition processes

Author: Tobias Hauser, Raven T. Reisch, Tobias Kamps, Alexander F. H. Kaplan, and Joerg Volpp
Subjects: Computer Science::Sound, Control and Systems Engineering, Mechanical Engineering, Industrial and Manufacturing Engineering, Software, Computer Science Applications
Abstract: Acoustic emissions in directed energy deposition processes such as wire arc additive manufacturing and directed energy deposition with laser beam/metal are investigated within this work, as many insights about the process can be gained from this. In both processes, experienced operators can hear whether a process is running stable or not. Therefore, different experiments for stable and unstable processes with common process anomalies were carried out, and the acoustic emissions as well as process camera images were captured. Thereby, it was found that stable processes show a consistent mean intensity in the acoustic emissions for both processes. For wire arc additive manufacturing, it was found that by the Mel spectrum, a specific spectrum adapted to human hearing, the occurrence of different process anomalies can be detected. The main acoustic source in wire arc additive manufacturing is the plasma expansion of the arc. The acoustic emissions and the occurring process anomalies are mainly correlating with the size of the arc because that is essentially the ionized volume leading to the air pressure which causes the acoustic emissions. For directed energy deposition with laser beam/metal, it was found that by the Mel spectrum, the occurrence of an unstable process can also be detected. The main acoustic emissions are created by the interaction between the powder and the laser beam because the powder particles create an air pressure through the expansion of the particles from the solid state to the liquid state when these particles are melted. These findings can be used to achieve an in situ quality assurance by an in-process analysis of the acoustic emissions.
Published: 2022

28. Unsupervised Speech Enhancement Using Dynamical Variational Autoencoders

Author: Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Vers des robots à l’intelligence sociale au travers de l’apprentissage, de la perception et de la commande (ROBOTLEARN), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université Grenoble Alpes (UGA), CentraleSupélec, Institut d'Électronique et des Technologies du numéRique (IETR), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Nantes Université - pôle Sciences et technologie, Nantes Université (Nantes Univ)-Nantes Université (Nantes Univ), GIPSA - Cognitive Robotics, Interactive Systems, & Speech Processing (GIPSA-CRISSP), GIPSA Pôle Parole et Cognition (GIPSA-PPC), Grenoble Images Parole Signal Automatique (GIPSA-lab), Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), Université Grenoble Alpes (UGA)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), Université Grenoble Alpes (UGA)-Grenoble Images Parole Signal Automatique (GIPSA-lab), Université Grenoble Alpes (UGA), ANR-19-P3IA-0003, ANR-3IA MIAI, ANR-19-CE33-0008-01, ANR-JCJC ML3RI, GA #871245, EC, ANR-19-P3IA-0003,MIAI,MIAI @ Grenoble Alpes(2019), ANR-19-CE33-0008,ML3RI,Apprentissage de bas-niveau d'ineractions robotiques multi-modales avec plusieurs personnes(2019), European Project: 871245,H2020-EU.2.1.1. - INDUSTRIAL LEADERSHIP - Leadership in enabling and industrial technologies - Information and Communication Technologies (ICT),SPRING(2020), and European Project: H2020,SPRING
Subjects: Computer Science::Machine Learning, FOS: Computer and information sciences, Computer Science - Machine Learning, Sound (cs.SD), Acoustics and Ultrasonics, Computer Science - Artificial Intelligence, Noise measurement, Speech enhancement, Time series analysis, Computer Science - Sound, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], Machine Learning (cs.LG), Time-domain analysis, Computational Mathematics, Artificial Intelligence (cs.AI), [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Computer Science::Sound, Audio and Speech Processing (eess.AS), Recording, FOS: Electrical engineering, electronic engineering, information engineering, Computer Science (miscellaneous), Training, Inference algorithms, Electrical and Electronic Engineering, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Dynamical variational autoencoders (DVAEs) are a class of deep generative models with latent variables, dedicated to model time series of high-dimensional data. DVAEs can be considered as extensions of the variational autoencoder (VAE) that include temporal dependencies between successive observed and/or latent vectors. Previous work has shown the interest of using DVAEs over the VAE for speech spectrograms modeling. Independently, the VAE has been successfully applied to speech enhancement in noise, in an unsupervised noise-agnostic set-up that requires neither noise samples nor noisy speech samples at training time, but only requires clean speech signals. In this paper, we extend these works to DVAE-based single-channel unsupervised speech enhancement, hence exploiting both speech signals unsupervised representation learning and dynamics modeling. We propose an unsupervised speech enhancement algorithm that combines a DVAE speech prior pre-trained on clean speech signals with a noise model based on nonnegative matrix factorization, and we derive a variational expectation-maximization (VEM) algorithm to perform speech enhancement. The algorithm is presented with the most general DVAE formulation and is then applied with three specific DVAE models to illustrate the versatility of the framework. Experimental results show that the proposed DVAE-based approach outperforms its VAE-based counterpart, as well as several supervised and unsupervised noise-dependent baselines, especially when the noise type is unseen during training.
Published: 2022

29. Lightweight Speaker Recognition in Poincaré Spaces

Author: Seokhyeong Kang, Kim Sung-Bin, Tae-Hyun Oh, and Jieun Lee
Subjects: Artificial neural network, Computer science, Applied Mathematics, Speech recognition, Hyperbolic space, Hyperbolic geometry, Speaker recognition, Euclidean distance, Discriminative model, Computer Science::Sound, Signal Processing, Metric (mathematics), Embedding, Electrical and Electronic Engineering
Abstract: This letter proposes a lightweight model for speaker recognition by leveraging a hyperbolic space. Speaker recognition performance depends heavily on the distinctiveness of the speaker embeddings induced by metric learning. However, most state-of-the-art embedding methods are based on the Euclidean metric space, which does not account for the inherent hierarchical structures of voice characteristics. Recent developments in neural hyperbolic geometry have demonstrated its effectiveness in modeling continuous hierarchical structures, which have typically been cumbersome to model with standard deep neural networks. This facet provides the additional by-product of a compact representation. Inspired by the favorable properties of hyperbolic geometry, we developed a hyperbolic ResNet for speaker recognition. We found that in smaller dimension regimes than typical cases, the learned speaker embeddings are more discriminative; in other words, more compact at the same level of performance. Our experiments on the large-scale VoxCeleb datasets show that, given the limited channel dimensions of neural networks, our method consistently performs favorably against the standard ResNet for both speaker recognition and verification tasks.
Published: 2022

30. Probability of Resolution of MUSIC and g-MUSIC: An Asymptotic Approach

Author: David Schenck, Marius Pesavento, and Xavier Mestre
Subjects: Signal Processing (eess.SP), Performances analysis, Iterative methods, Covariance matrices, Covariance matrix, Eigenvalue and eigenfunctions, Direction of arrival estimation, FOS: Electrical engineering, electronic engineering, information engineering, Cost functions, Random variables, Electrical Engineering and Systems Science - Signal Processing, Electrical and Electronic Engineering, Central Limit Theorem, Multiple signal classification, Eigenvalues and eigenfunctions, Stochastic systems, Signal resolution, Cost-function, Direction of arrival, Behavioral science, Signal classification, Computer Science::Sound, Probability of resolution, Signal Processing, Behavioral research, G-multiple signal classification
Abstract: In this article, the outlier production mechanism of the conventional Multiple Signal Classification (MUSIC) and the g-MUSIC Direction-of-Arrival (DoA) estimation techniques is investigated using tools from Random Matrix Theory (RMT). A general Central Limit Theorem (CLT) is derived that allows one to analyze the asymptotic stochastic behavior of eigenvector-based cost functions in the asymptotic regime where the number of snapshots and the number of antennas increase without bound at the same rate. Furthermore, this CLT is used to provide an accurate prediction of the resolution capabilities of the MUSIC and the g-MUSIC DoA estimation methods. The finite dimensional distribution of the MUSIC and the g-MUSIC cost functions is shown to be asymptotically jointly Gaussian distributed in the asymptotic regime. Comment: This work has been accepted for publication in the IEEE Transactions on Signal Processing. Copyright may be transferred without notice, after which this version may no longer be accessible
Published: 2022

31. ACCURACY OF DETERMINATION OF LINEAR SPECTRAL FREQUENCIES

Author: P. I. Kuzin, E. I. Kuzina, A. P. Boyko, D. A. Starikov, and E. N. Chapurin
Subjects: Computer Science::Sound
Abstract: The article describes the basic idea of coding a speech signal by linear prediction (Linear Predictive Coding, LPC): instead of the parameters of the speech signal, the communication line carries the encoded parameters of a filter that is, in a sense, an equivalent of the human vocal tract, together with the parameters of the excitation signal of this filter (tone or noise). The essence of the parameters of the synthesizing filter, the linear prediction (LP) coefficients calculated in the process of frame-by-frame adaptive filtering, is disclosed. The main advantage of these coefficients is stated, namely their ability to completely describe the state of the predictor filter, as well as their main disadvantage, established in numerous studies: their sensitivity to quantization errors prevents the direct transmission of LP coefficients over the communication channel. The necessity of searching for mathematically equivalent parameters of the reducing filter is substantiated. Alternative parameters for representing the vocal tract model, called Linear Spectral Frequencies (LSP), are presented; they are currently the parameters most often used in low-bit-rate speech codecs. The main difficulty of calculating the LSP directly from the LP coefficients in real time is described, and it is shown that most of the analyzer’s processor time is, as a rule, spent on this task. It is shown that, despite the high level of development of digital processors, calculating the parameters of the vocal tract model in real time remains one of the main difficulties in speech coding. That is why methods that reduce the complexity of this procedure are currently of significant interest. A new method for calculating the LSP and an algorithm based on it are proposed. Its main advantages over existing methods are described. The possible improvement in the quality of speech processing due to the use of the developed method for calculating the LSP is assessed.
Published: 2022
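
Illustrative note on entry 31: the linear prediction coefficients that the abstract starts from can be estimated per frame with the classical autocorrelation method. The sketch below is a minimal NumPy illustration of that standard step only; it is not the article's LSP algorithm, and the function name and test frame are hypothetical. It solves the normal equations R a = r directly, where a predicts x[n] from x[n-1], ..., x[n-p].

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Solve the autocorrelation normal equations R a = r for the predictor a."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz matrix with R[i, j] = r[|i - j|].
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[lags], r[1 : order + 1])

# Hypothetical frame: a decaying exponential, x[n] = 0.9 * x[n-1].
frame = 0.9 ** np.arange(200)
coeffs = lpc_coefficients(frame, order=2)
```

For this first-order test signal the second coefficient comes out essentially zero. Production codecs usually use the Levinson-Durbin recursion instead of a general linear solve, and then convert the coefficients to LSP for quantization.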

32. Effective excitation of bulk plasmon-polaritons in hyperbolic metamaterials for high-sensitivity refractive index sensing

Author: Ruoqin Yan, Tao Wang, Huimin Wang, Xinzhao Yue, Lu Wang, Yuandong Wang, and Jinyan Zhang
Subjects: Quantitative Biology::Subcellular Processes, Computer Science::Sound, Quantitative Biology::Molecular Networks, Materials Chemistry, Physics::Optics, General Chemistry
Abstract: The study of hyperbolic metamaterial (HMM) refractive index sensors is an active field of plasmonics and nanophotonics. Our study provides the basis for the development of ultrasensitive HMM sensors related to biochemical sensing.
Published: 2022

33. Masking and noise reduction processing of music signals in reverberant music

Author: Shenghuan Zhang and Ye Cheng
Subjects: Computer Science::Sound, Artificial Intelligence, Software, Information Systems
Abstract: Noise will be inevitably mixed with music signals in the recording process. To improve the quality of music signals, it is necessary to reduce noise as much as possible. This article briefly introduces noise, the masking effect, and the spectral subtraction method for reducing noise in reverberant music. The spectral subtraction method was improved by the human ear masking effect to enhance its noise reduction performance. Simulation experiments were carried out on the traditional and improved spectral subtraction methods. The results showed that the improved spectral subtraction method could reduce the noise in reverberant music more effectively; under an objective evaluation criterion, the signal-to-noise ratio, the de-reverberated music signal processed by the improved spectral subtraction method had a higher signal-to-noise ratio; under a subjective evaluation criterion, mean opinion score (MOS), the de-reverberated music signal processed by the improved spectral subtraction method also had a better evaluation.
Published: 2022
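
Illustrative note on entry 33: the traditional spectral subtraction baseline mentioned in the abstract can be sketched as below. This is a minimal NumPy sketch of basic magnitude spectral subtraction on non-overlapping frames, not the masking-based improvement proposed in the article. The function name, the spectral floor of 0.01, and the synthetic tone-plus-noise data are assumptions for illustration.

```python
import numpy as np

def spectral_subtract(frames, noise_frames, floor=0.01):
    """Subtract an average noise magnitude spectrum from each frame.

    frames:       (n_frames, frame_len) noisy time-domain frames
    noise_frames: (m_frames, frame_len) frames assumed to hold noise only
    """
    spectrum = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectrum)
    # Noise estimate: mean magnitude per frequency bin over noise-only frames.
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)
    # Subtract, but never go below a small fraction of the original magnitude.
    cleaned = np.maximum(magnitude - noise_mag, floor * magnitude)
    # Reuse the noisy phase and return to the time domain.
    cleaned_spectrum = cleaned * np.exp(1j * np.angle(spectrum))
    return np.fft.irfft(cleaned_spectrum, n=frames.shape[1], axis=1)

# Hypothetical data: 20 frames of a pure tone plus white noise.
rng = np.random.default_rng(0)
frame_len = 256
tone = np.sin(2 * np.pi * 8 * np.arange(frame_len) / frame_len)
clean = np.tile(tone, (20, 1))
noisy = clean + 0.3 * rng.standard_normal(clean.shape)
noise_only = 0.3 * rng.standard_normal((10, frame_len))
enhanced = spectral_subtract(noisy, noise_only)
```

The spectral floor limits the isolated residual peaks (musical noise) that plain subtraction tends to leave behind. A perceptual masking threshold, as the abstract describes, is one way to tune how much is subtracted per bin.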

34. Encoder-Decoder Based Attractors for End-to-End Neural Diarization

Author: Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Paola Garcia
Subjects: FOS: Computer and information sciences, Sound (cs.SD), Computational Mathematics, Acoustics and Ultrasonics, Audio and Speech Processing (eess.AS), Computer Science::Sound, FOS: Electrical engineering, electronic engineering, information engineering, Computer Science (miscellaneous), Electrical and Electronic Engineering, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This paper investigates an end-to-end neural diarization (EEND) method for an unknown number of speakers. In contrast to the conventional cascaded approach to speaker diarization, EEND methods are better in terms of speaker overlap handling. However, EEND still has a disadvantage in that it cannot deal with a flexible number of speakers. To remedy this problem, we introduce an encoder-decoder-based attractor calculation module (EDA) to EEND. Once frame-wise embeddings are obtained, EDA sequentially generates speaker-wise attractors on the basis of a sequence-to-sequence method using an LSTM encoder-decoder. The attractor generation continues until a stopping condition is satisfied; thus, the number of attractors can be flexible. Diarization results are then estimated as dot products of the attractors and embeddings. The embeddings from speaker overlaps result in larger dot product values with multiple attractors; thus, this method can deal with speaker overlaps. Because the maximum number of output speakers is still limited by the training set, we also propose an iterative inference method to remove this restriction. Further, we propose a method that aligns the estimated diarization results with the results of an external speech activity detector, which enables fair comparison against cascaded approaches. Extensive evaluations on simulated and real datasets show that EEND-EDA outperforms the conventional cascaded approach. Accepted to IEEE/ACM TASLP. This article is based on our previous conference paper arxiv:2005.09921
Published: 2022
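
Illustrative note on entry 34: the abstract states that diarization results are estimated as dot products of the attractors and the frame-wise embeddings. The sketch below shows only that final step in NumPy. The function name, the sigmoid mapping to per-speaker activity probabilities, the 0.5 threshold, and the toy vectors are assumptions for illustration; the attractor generation by the LSTM encoder-decoder is not reproduced.

```python
import numpy as np

def speaker_activity(embeddings, attractors, threshold=0.5):
    """Score every frame against every attractor.

    embeddings: (n_frames, dim) frame-wise embeddings
    attractors: (n_speakers, dim) speaker-wise attractors
    Returns (probabilities, active) with shape (n_frames, n_speakers).
    """
    logits = embeddings @ attractors.T
    probabilities = 1.0 / (1.0 + np.exp(-logits))
    return probabilities, probabilities > threshold

# Toy example: two speakers; frames are speaker 0 only, overlap, and silence.
attractors = np.array([[1.0, 0.0], [0.0, 1.0]])
embeddings = np.array([[4.0, -4.0], [4.0, 4.0], [-4.0, -4.0]])
probabilities, active = speaker_activity(embeddings, attractors)
```

Because each speaker is scored independently, an overlapped frame can be active for several speakers at once, which is the property the abstract highlights.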

35. Chance constrained conic-segmentation support vector machine with uncertain data

Author: Shen Peng, Gianpiero Canessa, and Zhihua Allen-Zhao
Subjects: FOS: Computer and information sciences, Computer Science::Machine Learning, Computer Science - Machine Learning, Applied Mathematics, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Machine Learning (cs.LG), Statistics::Machine Learning, ComputingMethodologies_PATTERNRECOGNITION, Optimization and Control (math.OC), Computer Science::Sound, Artificial Intelligence, Computer Science::Computer Vision and Pattern Recognition, FOS: Mathematics, Mathematics - Optimization and Control
Abstract: Support vector machines (SVMs) are a well-known class of supervised learning algorithms. The conic-segmentation SVM (CS-SVM) is a natural multiclass analogue of the standard binary SVM; CS-SVM models deal with the situation where the exact values of the data points are known. This paper studies CS-SVM when the data points are uncertain or mislabelled. With some properties of the distributions known, a chance-constrained CS-SVM approach is used to ensure a small probability of misclassification for the uncertain data. A geometric interpretation is presented to show how CS-SVM works. Finally, we present experimental results to investigate the performance of the chance-constrained CS-SVM. Accepted paper for Annals of Mathematics and Artificial Intelligence
Published: 2023

36. Self-supervised speech denoising using only noisy audio signals

Author: Jiasong Wu, Qingchun Li, Guanyu Yang, Lei Li, Lotfi Senhadji, Huazhong Shu, Laboratoire Traitement du Signal et de l'Image (LTSI), Université de Rennes (UR)-Institut National de la Santé et de la Recherche Médicale (INSERM), Centre de Recherche en Information Biomédicale sino-français (CRIBS), Université de Rennes (UR)-Southeast University [Jiangsu]-Institut National de la Santé et de la Recherche Médicale (INSERM), Laboratory of Image Science and Technology [Nanjing] (LIST), Southeast University [Jiangsu]-School of Computer Science and Engineering, National Key Research and Development Program of China, NKRDPC: 2021ZD0113202, Institut National de la Santé et de la Recherche Médicale, Inserm, and National Natural Science Foundation of China, NSFC: 50912040302, 61876037, 62171125
Subjects: FOS: Computer and information sciences, Audio sub-sampler, Linguistics and Language, Sound (cs.SD), Speech denoising, Communication, Computer Science - Sound, Language and Linguistics, Computer Science Applications, Self-supervised, Training target, Computer Science::Sound, Audio and Speech Processing (eess.AS), Modeling and Simulation, FOS: Electrical engineering, electronic engineering, information engineering, Computer Vision and Pattern Recognition, [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing, Software, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: In traditional speech denoising tasks, clean audio signals are often used as the training target, but absolutely clean signals must be collected with expensive recording equipment or in studios with strictly controlled environments. To overcome this drawback, we propose an end-to-end self-supervised speech denoising training scheme using only noisy audio signals, named Only-Noisy Training (ONT), without extra training conditions. The proposed ONT strategy constructs training pairs only from each single noisy audio, and it contains two modules: a training audio pair generation module and a speech denoising module. The first module adopts a random audio sub-sampler on each noisy audio to generate training pairs. The sub-sampled pairs are then fed into a novel complex-valued speech denoising module. Experimental results show that the proposed method not only eliminates the high dependence on clean targets of traditional audio denoising tasks, but also achieves on-par or better performance than other training strategies. Availability: ONT is available at https://github.com/liqingchunnnn/Only-Noisy-Training. Comment: 11 pages, 4 figures, 6 tables
Published: 2023

37. The Utilization of Signal Analysis by Using Short Time Fourier Transform

Author: Indiati Retno Palupi and Wiji Raharjo
Subjects: Computer Science::Sound, Physics::Geophysics
Abstract: Signal analysis is a part of geophysical work. It is important for analysing the character of a signal or waveform in geophysics. In this paper, an earthquake waveform is used as the example. One method for doing this is the Short Time Fourier Transform. It applies the basic concept of the Fast Fourier Transform over short periods of time in the waveform, while at the same moment a convolution is performed between the waveform and the mother wavelet, resulting in the spectrogram. Finally, the spectrogram shows the power spectrum, or the magnitude of the amplitude, at each time in the waveform. It relates to the energy of the earthquake. The result includes three parameters: time, frequency and the spectrogram. This makes it easier for the geophysicist to analyse how the frequency changes over time based on the spectrogram colour. Besides that, it can be used to identify the arrival times of the P and S waves, which are important information for calculating the hypocentre location of the earthquake.
Published: 2021

38. 3-D Source Location by Neural Network for FBG Acoustic Emission Sensors

Author: Qingbo Liu, Jingchuan Zhang, Dongyue Liu, Tao Fu, Chenggui Li, Xiaohui Liang, and Peng Wei
Subjects: Materials science, Explosive material, Artificial neural network, business.industry, Astrophysics::High Energy Astrophysical Phenomena, Physics::Optics, Bayesian interpretation of regularization, Optics, Fiber Bragg grating, Acoustic emission, Computer Science::Sound, Sensitivity (control systems), Electrical and Electronic Engineering, business, Instrumentation
Abstract: Fiber Bragg grating acoustic emission sensors have been used in many applications. In this paper, based on four fiber Bragg grating acoustic emission sensors, an acoustic emission location experiment is carried out on the surface of a cylindrical polymer-bonded explosive specimen. Due to the difference in the strain sensitivity of fiber Bragg grating in different directions, the traditional time-difference location method is not suitable for fiber Bragg grating acoustic emission sensors. A 4-layer dense neural network for fiber Bragg grating acoustic emission sensors, together with a Bayesian regularization method, is used to calculate the coordinates of the acoustic emission wave source. The experimental results show that the neural network location method is suitable for fiber Bragg grating acoustic emission sensors.
Published: 2021

39. Second sound in the crossover from the Bose-Einstein condensate to the Bardeen-Cooper-Schrieffer superfluid

Author: Daniel K. Hoffmann, Vijay Pal Singh, Thomas Paintner, Manuel Jäger, Wolfgang Limmer, Ludwig Mathey, and Johannes Hecker Denschlag
Subjects: Condensed Matter::Quantum Gases, Multidisciplinary, DDC 530 / Physics, Quantum fluids and solids, Condensed Matter::Other, Science, General Physics and Astronomy, Bose-Einstein condensates, General Chemistry, Bose-Einstein condensation, General Biochemistry, Genetics and Molecular Biology, Article, Electron gas, Phase transitions and critical phenomena, Computer Science::Sound, Fermi-Gas, Thermodynamics, ddc:530, Thermodynamik
Abstract: Second sound is an entropy wave which propagates in the superfluid component of a quantum liquid. Because it is an entropy wave, it probes the thermodynamic properties of the quantum liquid. Here, we study second sound propagation for a large range of interaction strengths within the crossover between a Bose-Einstein condensate (BEC) and the Bardeen-Cooper-Schrieffer (BCS) superfluid, extending previous work at unitarity. In particular, we investigate the strongly-interacting regime where currently theoretical predictions only exist in terms of an interpolation in the crossover. Working with a quantum gas of ultracold fermionic 6Li atoms with tunable interactions, we show that the second sound speed varies only slightly in the crossover regime. By varying the excitation procedure, we gain deeper insight on sound propagation. We compare our measurement results with classical-field simulations, which help with the interpretation of our experiments.
Published: 2021

40. Ambiguity Function Analysis for Orthogonal-LFM Waveform Based Multistatic Radar

Author: Valarmathi Jayaraman and Dillip Dash
Subjects: Cross-correlation, Ambiguity function, Computer science, Acoustics, Matched filter, Bandwidth (signal processing), symbols.namesake, Signal-to-noise ratio, Computer Science::Sound, symbols, Multistatic radar, Waveform, Electrical and Electronic Engineering, Instrumentation, Doppler effect
Abstract: Over the last decade, research on multistatic radar waveform design has attracted significant attention among radar system designers. Among the waveform design techniques, linear frequency modulated (LFM) waveforms are widely used, but the problems associated with them are range-Doppler coupling and the complexity of achieving adequate pulse-to-pulse diversity. In this work, orthogonal linear frequency modulated (OLFM) waveforms, which have good correlation properties and a large time-bandwidth product, are used as the transmitted waveforms. The ambiguity function, which is the output of the matched filter, is used to analyze the target resolution capabilities of the OLFM waveforms. The mathematical expressions of the ambiguity function with OLFM waveforms for the multistatic architecture are represented based on the target fluctuations observed at the receiver. These expressions allow the ambiguity in resolving target position and velocity vectors to be evaluated for any transmitted waveform in the multistatic topology. The normalized cross-correlation value of the LFM waveforms is compared with that of the OLFM waveforms, which signifies the isolation factor between two transmitted waveforms at the receiver. A performance comparison is made based on the probability of target detection for various multistatic scenarios under various signal-to-noise ratio conditions.
Published: 2021

41. A study on the algorithm of ultrasonic detection and recognition based on DAG‐SVMs mixed HMM of teleoperation gestures for intelligent manufacturing devices

Author: Chenguang Zhang, Danling Wu, Kangzheng Huang, and Dianting Liu
Subjects: Technological innovations. Automation, Computer science, Manufactures, Computer Science::Human-Computer Interaction, Industrial and Manufacturing Engineering, DAG‐SVMs, TS1-2301, Computer Science::Robotics, Artificial Intelligence, teleoperation gesture, Doppler shift, Computer vision, HMM, Hidden Markov model, Ultrasonic detection, business.industry, HD45-45.2, Computer Science Applications, Support vector machine, Hardware and Architecture, Computer Science::Sound, intelligent manufacturing, Teleoperation, Artificial intelligence, business, Gesture
Abstract: The position and status of a machine or piece of equipment can often be controlled remotely by gestures in an intelligent manufacturing environment. To solve the problem that gestures with two directions, such as left and right, cannot be detected by a single ultrasonic frequency, two different ultrasonic frequencies are used to detect gestures via the Doppler shift, and a gesture recognition algorithm based on the DAG‐SVMs mixed Hidden Markov Model (HMM) is proposed to identify and classify the extracted feature sequences. Thus, four more types of gestures are added beyond that of reading display screen information, and comparative experiments to classify and recognise teleoperation gestures are made with DAG‐SVMs, the HMM, the DAG‐SVMs mixed HMM, and other improved HMM algorithms. The test results show that the mean rate of gesture recognition for the algorithm based on the DAG‐SVMs mixed HMM is 94.917%, which is 9.497% higher than that of the unimproved HMM, and its recognition accuracy for complex teleoperation gestures is improved by 2.3% compared with other improved HMM algorithms. The experimental results show that the DAG‐SVMs mixed HMM algorithm is effective at recognising teleoperation gestures and can perform gesture recognition accurately.
Published: 2021

42. Binaural Synthetic Aperture Imaging of the Field of Audition as the Head Rotates and Localisation Perception of Monophonic Sound Listened to through Headphones

Author: Duncan Tamsett
Subjects: binaural localisation, headphones, synthetic aperture, Computer Science::Sound, binaural audition, Physics, QC1-999, monophonic sound, General Medicine
Abstract: A human listening to monophonic sound through headphones perceives the sound to emanate from a point inside the head at the auditory centre at effectively zero range. The extent to which this is predicted by synthetic-aperture calculation performed in response to head rotation is explored. The instantaneous angle between the auditory axis and the acoustic source, lambda, for the zero inter-aural time delay imposed by headphones is 90°. The lambda hyperbolic cone simplifies to the auditory median plane, which intersects a spherical surface centred on the auditory centre, along a prime meridian lambda circle. In a two-dimensional (2-D) synthetic-aperture computation, points of intersection of all lambda circles as the head rotates constitute solutions to the directions to acoustic sources. Geometrically, lambda circles cannot intersect at a point representing the auditory centre; nevertheless, 2-D synthetic aperture images for a pure turn of the head and for a pure lateral tilt yield solutions as pairs of points on opposite sides of the head. These can reasonably be interpreted to be perceived at the sums of the position vectors of the pairs of points on the acoustic image, i.e., at the auditory centre. But, a turn of the head on which a fixed lateral tilt of the auditory axis is concomitant (as in species of owl) yields a 2-D synthetic-aperture image without solution. However, extending a 2-D synthetic aperture calculation to a three-dimensional (3-D) calculation will generate a 3-D acoustic image of the field of audition that robustly yields the expected solution.
Published: 2021

43. Analysis of Sound Absorbing Properties of Activated Carbon Fiber Felts Based on Density Change

Author: SHEN Yue, YAN Xuefeng, and LIU Qixia
Subjects: Technology, activated carbon fiber, Computer Science::Sound, felts, characteristic impedance, sound absorption coefficient, propagation constant, structural constant
Abstract: In order to study the sound absorbing properties of activated carbon fiber felts, three kinds of activated carbon fiber felts with different densities were selected, and a double-channel acoustic analyzer with an impedance tube was used to test the surface acoustic impedance for sound waves in the 125–2500 Hz frequency range. Then four acoustic parameters, namely the sound absorption coefficient, the characteristic impedance, the propagation constant, and the structural constant, were calculated. Finally, the effect of density on the sound absorbing properties of activated carbon fiber felts was analyzed. It was found that with the increase of the frequency, the sound absorption coefficient and the propagation constant of the sample increased, while the characteristic impedance ratio and the structural constant decreased, although the decline became smaller and tended to be stable. With the increase of the density of activated carbon fiber felts, the acoustic parameters of the samples all increased, but the first resonance frequency moved towards low frequency, and the corresponding sound absorption coefficient increased.
Published: 2021

44. Spectrogram image encoding to provide variable audio data rates and preserve its sound quality

Author: Sergey V. Dvoryankin, Artem E. Zenov, Roman A. Ustinov, and Nikita S. Dvoryankin
Subjects: Information theory, Computer Science::Sound, speech information protection, audio control, speech compression, speech enhancement, sinusoidal model, spectral inversion, short-term fourier transform, audio fingerprinting, Information technology, General Medicine, Q350-390, T58.5-58.64
Abstract: In applications of audio control and fixation under conditions of information-technical counteraction, noise clearing, formation of digital watermarks, audio fingerprinting, protective text audio markers, etc., a compact representation of speech signals for subsequent transmission-storage is required that maximally preserves the similarity of the sound quality of the restored speech to the original and eliminates accompanying interference. The proposed audio codec is based on the narrow-band sine Gaussian model of speech analysis/synthesis, where its representation as a superposition of harmonic components weighted by a Gaussian window applies to all types of speech frames, as well as on universal and special methods of construction and image processing of narrow-band dynamic spectrograms, in particular by applying compression-recovery algorithms to them, which makes it possible to regulate the speech stream rate within a wide range of 1.2–16 kbit/s with adaptation to changes in the bandwidth of the audio data transmission-storage channel, caused, in particular, by both objective factors and the actions of an intruder. This work aims to select the best parameters on the spectrogram images that reduce the overall bitrate, remove the influence of noise and interference, and allow the use of spectral inversion methods and algorithms to recover the speech signal with the same or better quality. The parameters are extracted from the spectrogram images obtained with the short-time Fourier transform, using methods to extract the amplitudes, frequencies, phases and development tracks of selected local or global maxima (peaks) of the speech signal on the spectral slices.
The communication channel can transmit either the parameters themselves, or the results of compression-encoding of the image to restore the image of the original spectrogram with the selection of peak parameters already on it with the subsequent synthesis of speech or for direct spectral inversion of the image into speech. It is possible to correct the reconstructed spectrogram by using a priori information about the speaker's speech from his pre-generated voice database.
Published: 2021

45. Excitation of Airborne Acoustic Surface Modes Driven by a Turbulent Flow

Author: Shishir Damani, William J. Devenport, Timothy A. Starkey, Nathan Alexander, Alastair P. Hibbins, J. Roy Sambles, Benjamin P. Pearce, and Samuel Shelley
Subjects: Physics::Fluid Dynamics, Physics, Coupling (physics), Computer Science::Sound, Surface wave, Turbulence, Excited state, Aeroacoustics, Aerospace Engineering, Mechanics, Static pressure, Boundary layer thickness, Excitation
Abstract: This experiment demonstrates the generation of trapped acoustic surface waves excited by a turbulent flow source through the coupling of pressure fluctuations at the interface between an acoustic m...
Published: 2021

46. Audio Encryption Algorithm Based on Chen Memristor Chaotic System

Author: Wanying Dai, Xiangliang Xu, Xiaoming Song, and Guodong Li
Subjects: fast Walsh–Hadamard transform, memristor chaotic system, audio map, channel shuffle, encryption algorithm, Physics and Astronomy (miscellaneous), General Mathematics, Chemistry (miscellaneous), Computer Science::Sound, Computer Science::Multimedia, Computer Science (miscellaneous), QA1-939, Mathematics, Computer Science::Cryptography and Security
Abstract: The data space for audio signals is large, the correlation is strong, and the traditional encryption algorithm cannot meet the needs of efficiency and safety. To solve this problem, an audio encryption algorithm based on the Chen memristor chaotic system is proposed. The core idea of the algorithm is to encrypt the audio signal into color image information. Most traditional audio encryption algorithms transmit the ciphertext in the form of noise, which makes it easy to attract the attention of attackers. In this paper, a special encryption method is used to obtain higher security. Firstly, the Fast Walsh–Hadamard Transform (FWHT) is used to compress and denoise the signal. Different from the Fast Fourier Transform (FFT) and the Discrete Cosine Transform (DCT), FWHT has good energy compression characteristics. In addition, compared with the triangular basis function of the Fast Fourier Transform, the rectangular basis function of the FWHT can be more effectively implemented in the digital circuit to transform the reconstructed dual-channel audio signal into the R and B layers of the digital image matrix, respectively. Furthermore, a new Chen memristor chaotic system solves the periodic window problems, such as the limited chaos range and nonuniform distribution. It can generate a mask block with high complexity and fill it into the G layer of the color image matrix to obtain a color audio image. Next, combining plaintext information with color audio images, interactive channel shuffling can not only weaken the correlation between adjacent samples, but also effectively resist selective plaintext attacks. Finally, the cryptographic block is used for overlapping diffusion encryption to fill the silence period of the speech signal, so as to obtain the ciphertext audio.
Experimental results and comparative analysis show that the algorithm is suitable for different types of audio signals, and can resist many common cryptographic analysis attacks. Compared with that of similar audio encryption algorithms, the security index of the algorithm is better, and the efficiency of the algorithm is greatly improved.
Published: 2022

47. Low-Complexity 2D-MUSIC for Joint Range and Angle Estimation of Frequency Modulated Continuous-Wave Radar

Author: Yongchul Jung, Seunghyeok Lee, Seongjoo Lee, and Yunho Jung
Subjects: Radiation, Computer Science::Sound, Computer Networks and Communications, Electrical and Electronic Engineering, Instrumentation
Abstract: A pre-processing technique is proposed to reduce the complexity of two-dimensional multiple signal classification (2D-MUSIC) for the joint range and angle estimation of frequency-modulated continuous-wave (FMCW) radar systems. By using the central symmetry of the angle steering vector from a uniform linear array (ULA) antenna and the linearity of the beat signal in the FMCW radar, this pre-processing technique transforms 2D-MUSIC from complex values into real values. To compare the computational complexity of the proposed algorithm with the conventional 2D-MUSIC, we measured the CPU processing time for various numbers of snapshots, and the evaluation results indicated that the 2D-MUSIC with the proposed pre-processing technique is approximately three times faster than the conventional 2D-MUSIC.
Published: 2021

48. An Intelligent Ear Recognition Technique

Author: Yahya Hussein and ALI Sahan
Subjects: ComputingMethodologies_PATTERNRECOGNITION, Computer Science::Sound, Computer Science Applications
Abstract: The human ear has unique and attractive details; therefore, human ear recognition is one of the most important fields in the biometric domain. In this work, we propose an efficient and intelligent ear recognition technique based on particle swarm optimization, discrete wavelet transform, and fuzzy neural network. The discrete wavelet transform is used to provide comprise and effective features about the ear image, while particle swarm optimization is utilized to select more effective and attractive features. Furthermore, using particle swarm optimization reduces the complexity of the classification stage, since it reduces the number of features. A fuzzy neural network is used in the classification stage in order to provide strong discrimination between the testing and training ear images. Many experiments were performed using two ear databases to examine the accuracy of the proposed technique. The analysis of the results indicates that the presented technique achieves high recognition accuracy on various data sets with less complexity. Keywords: Ear recognition; bio-metric; discrete wavelet transform, particle swarm optimization, fuzzy neural network.
Published: 2021

49. A discontinuous Galerkin coupling for nonlinear elasto-acoustics

Author: Vanja Nikolić, Markus Muhr, and Barbara Wohlmuth
Subjects: Coupling, Applied Mathematics, General Mathematics, Numerical Analysis (math.NA), 65M12, 35L70, Computational Mathematics, Nonlinear system, Classical mechanics, Computer Science::Sound, Discontinuous Galerkin method, FOS: Mathematics, Mathematics - Numerical Analysis, Mathematics
Abstract: Inspired by medical applications of high-intensity ultrasound, we study a coupled elasto-acoustic problem with general acoustic nonlinearities of quadratic type as they arise, for example, in the Westervelt and Kuznetsov equations of nonlinear acoustics. We derive convergence rates in the energy norm of a finite element approximation to the coupled problem in a setting that involves different acoustic materials and hence jumps within material parameters. A subdomain-based discontinuous Galerkin approach realizes the acoustic-acoustic coupling of different materials. At the same time, elasto-acoustic interface conditions are used for a mutual exchange of forces between the different models. Numerical simulations back up the theoretical findings in a three-dimensional setting with academic test cases as well as in an application-oriented simulation, where the modeling of human tissue as an elastic versus an acoustic medium is compared. Comment: 41 pages, 10 figures
Published: 2021

50. Spatial interpolation methods for virtual rotating array beamforming with arbitrary microphone configurations

Author: Ce Zhang, Wei Ma, and Jiacheng Yang
Subjects: Beamforming, Microphone array, Computer science, Microphone, Mechanical Engineering, Acoustics, Aerospace Engineering, Acoustic source localization, Multivariate interpolation, Computer Science::Sound, Space and Planetary Science, Control and Systems Engineering, Inverse distance weighting, Radial basis function, Computers in Earth Sciences, Social Sciences (miscellaneous), ComputingMethodologies_COMPUTERGRAPHICS, Interpolation
Abstract: Virtual rotating array (VRA) beamforming is a robust technique for the identification of rotating sound sources in the frequency domain. Under normal circumstances, the microphone array is configured in a ring geometry centred on the rotating axis. Two interpolation methods for arbitrary microphone configurations were proposed by Jekosch and Sarradj (Acoustics, 2020). One constructs a mesh between all stationary microphones using Delaunay triangulation; the other is a meshless technique based on radial basis functions. However, whether other spatial interpolation methods can be used in VRA beamforming with arbitrary microphone configurations is still unclear. This paper adds several new spatial interpolation methods to VRA beamforming and compares the performance of these interpolation methods in detail in simulations. The simulation results demonstrate that all these interpolation methods can be successfully applied in VRA beamforming with arbitrary microphone configurations. The inverse distance weighting interpolation method has the best performance in rotating sound source localization. In addition, all these interpolation methods have poor spectrum construction capability and sound source strength precision.
Published: 2021
