Author: "Zoran Cvetkovic" / Language: undetermined - Searchworks@Jio Institute Digital Library Search Results

1. Structured Errors-in-Variables Modelling for Cortico-Muscular Coherence Enhancement

Author: Zhenghao Guo, Verity M. McClelland, Wei Dai, and Zoran Cvetkovic
Published: 2023
Full Text: View/download PDF

2. Towards Robust Waveform-Based Acoustic Models

Author: Dino Oglic, Zoran Cvetkovic, Peter Sollich, Steve Renals, and Bin Yu
Subjects: FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Machine Learning, Acoustics and Ultrasonics, Machine Learning (stat.ML), Computer Science - Sound, Machine Learning (cs.LG), Computational Mathematics, Audio and Speech Processing (eess.AS), Statistics - Machine Learning, FOS: Electrical engineering, electronic engineering, information engineering, Computer Science (miscellaneous), Electrical and Electronic Engineering, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We study the problem of learning robust acoustic models in adverse environments, characterized by a significant mismatch between training and test conditions. This problem is of paramount importance for the deployment of speech recognition systems that need to perform well in unseen environments. First, we characterize data augmentation theoretically as an instance of vicinal risk minimization, which aims at improving risk estimates during training by replacing the delta functions that define the empirical density over the input space with an approximation of the marginal population density in the vicinity of the training samples. More specifically, we assume that local neighborhoods centered at training samples can be approximated using a mixture of Gaussians, and demonstrate theoretically that this can incorporate robust inductive bias into the learning process. We then specify the individual mixture components implicitly via data augmentation schemes, designed to address common sources of spurious correlations in acoustic models. To avoid potential confounding effects on robustness due to information loss, which has been associated with standard feature extraction techniques (e.g., FBANK and MFCC features), we focus on the waveform-based setting. Our empirical results show that the approach can generalize to unseen noise conditions, with 150% relative improvement in out-of-distribution generalization compared to training using the standard risk minimization principle. Moreover, the results demonstrate competitive performance relative to models learned using a training sample designed to match the acoustic conditions characteristic of test utterances.
Published: 2022
Full Text: View/download PDF

3. Dysarthric Speech Recognition From Raw Waveform with Parametric CNNs

Author: Zhengjun Yue, Erfan Loweimi, Heidi Christensen, Jon Barker, and Zoran Cvetkovic
Abstract: Raw waveform acoustic modelling has recently received increasing attention. Compared with the task-blind hand-crafted features which may discard useful information, representations directly learned from the raw waveform are task-specific and potentially include all task-relevant information. In the context of automatic dysarthric speech recognition (ADSR), raw waveform acoustic modelling is under-explored owing to data scarcity. Parametric CNNs can compensate for this problem owing to having notably fewer parameters and requiring less training data in comparison with conventional non-parametric CNNs. In this paper, we explore the usefulness of raw waveform acoustic modelling using various parametric CNNs for ADSR. Additionally, we investigate the properties of the learned filters and monitor the training dynamics of various models. Furthermore, we study the effectiveness of data augmentation and multi-stream acoustic modelling through combining the non-parametric and parametric CNNs fed by hand-crafted and raw waveform features. Experimental results on the widely used TORGO dysarthric database show that the parametric CNNs significantly outperform the non-parametric CNNs on dysarthric speech (up to 2.7% and 1.8% absolute error reduction), reaching up to 35.9% and 11.9% WERs for dysarthric and typical speech, respectively. Multi-streaming acoustic modelling further improves the performance, resulting in up to 33.2% and 10.3% WERs for dysarthric and typical speech, respectively.
Published: 2022
Full Text: View/download PDF

4. Similarity Maps for Ventricular Arrhythmia Classification

Author: Qing Lin, Hak-Keung Lam, Michael J. Curtis, and Zoran Cvetkovic
Subjects: Electrocardiography, Death, Sudden, Cardiac, Ventricular Fibrillation, Tachycardia, Ventricular, Humans, Arrhythmias, Cardiac
Abstract: Ventricular arrhythmias are the primary arrhythmias that cause sudden cardiac death. In current clinical and preclinical research, the discovery of new therapies and their translation is hampered by the lack of consistency in diagnostic criteria for distinguishing between ventricular tachycardia (VT) and ventricular fibrillation (VF). This study develops a new set of features, similarity maps, for discrimination between VT and VF using deep neural network architectures. The similarity maps are designed to capture the similarity and the regularity within an ECG trace. Our experiments show that the similarity maps lead to a substantial improvement in distinguishing VT and VF.
Published: 2022

5. Learning Waveform-Based Acoustic Models Using Deep Variational Convolutional Neural Networks

Author: Dino Oglic, Zoran Cvetkovic, and Peter Sollich
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Acoustics and Ultrasonics, Computer science, Feature extraction, Probabilistic logic, Machine Learning (stat.ML), Convolutional neural network, Machine Learning (cs.LG), Computational Mathematics, Statistics - Machine Learning, Robustness (computer science), Computer Science (miscellaneous), Waveform, Electrical and Electronic Engineering, Stochastic neural network, Divergence (statistics), Algorithm, Parametric statistics
Abstract: We investigate the potential of stochastic neural networks for learning effective waveform-based acoustic models. The waveform-based setting, inherent to fully end-to-end speech recognition systems, is motivated by several comparative studies of automatic and human speech recognition that associate standard non-adaptive feature extraction techniques with information loss which can adversely affect robustness. Stochastic neural networks, on the other hand, are a class of models capable of incorporating rich regularization mechanisms into the learning process. We consider a deep convolutional neural network that first decomposes speech into frequency sub-bands via an adaptive parametric convolutional block where filters are specified by cosine modulations of compactly supported windows. The network then employs standard non-parametric 1D convolutions to extract relevant spectro-temporal patterns while gradually compressing the structured high dimensional representation generated by the parametric block. We rely on a probabilistic parametrization of the proposed neural architecture and learn the model using stochastic variational inference. This requires evaluation of an analytically intractable integral defining the Kullback-Leibler divergence term responsible for regularization, for which we propose an effective approximation based on the Gauss-Hermite quadrature. Our empirical results demonstrate a superior performance of the proposed approach over comparable waveform-based baselines and indicate that it could lead to robustness. Moreover, the approach outperforms a recently proposed deep convolutional neural network for learning of robust acoustic models with standard FBANK features.
Published: 2021
Full Text: View/download PDF
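
The abstract above mentions approximating an intractable integral with Gauss-Hermite quadrature. As a rough illustration of the general rule only (not the paper's KL-divergence approximation), the sketch below estimates a Gaussian expectation E[f(X)] with the standard 3-point rule. The function name `gauss_hermite_expectation` and the test integrand are assumptions made for this example.

```python
import math

# Standard 3-point Gauss-Hermite rule for the weight exp(-x^2):
# nodes 0 and +/- sqrt(3/2), weights 2*sqrt(pi)/3 and sqrt(pi)/6.
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def gauss_hermite_expectation(f, mu, sigma):
    """Approximate E[f(X)] for X ~ N(mu, sigma^2).

    Uses the change of variables x = mu + sqrt(2) * sigma * t, which turns
    the Gaussian integral into (1 / sqrt(pi)) * integral of exp(-t^2) f(x) dt.
    """
    total = 0.0
    for t, w in zip(NODES, WEIGHTS):
        total += w * f(mu + math.sqrt(2.0) * sigma * t)
    return total / math.sqrt(math.pi)


# Illustrative check: the second moment of N(1, 2^2) is mu^2 + sigma^2 = 5.
second_moment = gauss_hermite_expectation(lambda x: x * x, mu=1.0, sigma=2.0)
print(second_moment)
```

A 3-point rule is exact for polynomial integrands up to degree 5; smoother but non-polynomial integrands would generally need more nodes.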

6. Raw source and filter modelling for dysarthric speech recognition

Author: Zhengjun Yue, Erfan Loweimi, and Zoran Cvetkovic
Abstract: Acoustic modelling for automatic dysarthric speech recognition (ADSR) is a challenging task. Data deficiency is a major problem and substantial differences between the typical and dysarthric speech complicate transfer learning. In this paper, we build acoustic models using the raw magnitude spectra of the source and filter components. The proposed multi-stream model consists of convolutional and recurrent layers. It allows for fusing the vocal tract and excitation components at different levels of abstraction and after per-stream pre-processing. We show that such multi-stream processing leverages these two information streams and helps the model towards normalising the speaker attributes and speaking style. This potentially leads to better handling of the dysarthric speech with a large inter-speaker and intra-speaker variability. We compare the proposed system with various features, study the training dynamics, explore the usefulness of data augmentation and provide interpretation for the learned convolutional filters. On the widely used TORGO dysarthric speech corpus, the proposed approach results in up to 1.7% absolute WER reduction for dysarthric speech compared with the MFCC baseline. Our best model reaches up to 40.6% and 11.8% WER for dysarthric and typical speech, respectively.
Published: 2022
Full Text: View/download PDF

7. Multi-modal acoustic-articulatory feature fusion for dysarthric speech recognition

Author: Zhengjun Yue, Erfan Loweimi, Zoran Cvetkovic, Heidi Christensen, and Jon Barker
Abstract: Building automatic speech recognition (ASR) systems for speakers with dysarthria is a very challenging task. Although multi-modal ASR has received increasing attention recently, incorporating real articulatory data with acoustic features has not been widely explored in the dysarthric speech community. This paper investigates the effectiveness of multi-modal acoustic modelling for dysarthric speech recognition using acoustic features along with articulatory information. The proposed multi-stream architectures consist of convolutional, recurrent and fully-connected layers allowing for bespoke per-stream pre-processing, fusion at the optimal level of abstraction and post-processing. We study the optimal fusion level/scheme as well as training dynamics in terms of cross-entropy and WER using the popular TORGO dysarthric speech database. Experimental results show that fusing the acoustic and articulatory features at the empirically found optimal level of abstraction achieves a remarkable performance gain, leading to up to 4.6% absolute (9.6% relative) WER reduction for speakers with dysarthria.
Published: 2022
Full Text: View/download PDF

8. Dictionary Learning Strategies for Cortico-Muscular Coherence Detection and Estimation

Author: Shengjia Du, Qi Yu, Wei Dai, Verity McClelland, and Zoran Cvetkovic
Subjects: Electromyography, Muscles, Motor Cortex, Electroencephalography, Algorithms
Abstract: The spectral method of cortico-muscular coherence (CMC) can reveal the communication patterns between the cerebral cortex and muscle periphery, thus providing guidelines for the development of new therapies for movement disorders and insights into fundamental motor neuroscience. The method is applied to electroencephalogram (EEG) and surface electromyogram (sEMG) recorded synchronously during a motor task. However, synchronous EEG and sEMG components are typically too weak compared to additive noise and background activities making significant coherence very difficult to detect. Dictionary learning and sparse representation have been proved effective in enhancing CMC levels. In this paper, we explore the potential of a recently proposed dictionary learning algorithm in combination with an improved component selection algorithm for CMC enhancement. The effectiveness of the method was demonstrated using neurophysiological data where it achieved considerable improvements in CMC levels.
Published: 2021

9. Speech Acoustic Modelling from Raw Phase Spectrum

Author: Steve Renals, Zoran Cvetkovic, Peter Bell, and Erfan Loweimi
Subjects: phase-based source-filter separation, Signal processing, acoustic modelling, Computer science, Speech recognition, Phase (waves), Information processing, Filter (signal processing), multi-head CNNs, Information fusion, ASR, Phase spectrum, Vocal tract, Raw phase spectrum
Abstract: Magnitude spectrum-based features are the most widely employed front-ends for acoustic modelling in automatic speech recognition (ASR) systems. In this paper, we investigate the possibility and efficacy of acoustic modelling using the raw short-time phase spectrum. In particular, we study the usefulness of the raw wrapped, unwrapped and minimum-phase phase spectra as well as the phase of the source and filter components for acoustic modelling. Furthermore, we explore the effectiveness of simultaneous deployment of the vocal tract and excitation components of the raw phase spectrum using multi-head CNNs and investigate multiple information fusion schemes. This paves the way for developing effective phase-based multi-stream information processing systems for speech recognition. The performance, even for wrapped phase with a noise-like shape, is comparable to or better than the magnitude-based classic features, and up to 4.8% WER has been achieved in the WSJ (Eval-92) task.
Published: 2021
Full Text: View/download PDF

10. Multiscale Wavelet Transfer Entropy with Application to Corticomuscular Coupling Analysis

Author: Osvaldo Simeone, Zhenghao Guo, Kerry R. Mills, Zoran Cvetkovic, and Verity M. McClelland
Subjects: Signal Processing (eess.SP), FOS: Computer and information sciences, Computer science, Computer Science - Information Theory, Stationary wavelet transform, Entropy, 0206 medical engineering, Biomedical Engineering, 02 engineering and technology, Electroencephalography, Wavelet, medicine, FOS: Electrical engineering, electronic engineering, information engineering, Coherence (signal processing), Humans, Sensitivity (control systems), Electrical Engineering and Systems Science - Signal Processing, Entropy (energy dispersal), Muscle, Skeletal, medicine.diagnostic_test, business.industry, Electromyography, Information Theory (cs.IT), Motor Cortex, Pattern recognition, Neurophysiology, 020601 biomedical engineering, Quantitative Biology - Neurons and Cognition, FOS: Biological sciences, Transfer entropy, Neurons and Cognition (q-bio.NC), Artificial intelligence, business
Abstract: Objective: Functional coupling between the motor cortex and muscle activity is commonly detected and quantified by cortico-muscular coherence (CMC) or Granger causality (GC) analysis, which are applicable only to linear couplings and are not sufficiently sensitive: some healthy subjects show no significant CMC and GC, and yet have good motor skills. The objective of this work is to develop measures of functional cortico-muscular coupling that have improved sensitivity and are capable of detecting both linear and non-linear interactions. Methods: A multiscale wavelet transfer entropy (TE) methodology is proposed. The methodology relies on a dyadic stationary wavelet transform to decompose electroencephalogram (EEG) and electromyogram (EMG) signals into functional bands of neural oscillations. Then, it applies TE analysis based on a range of embedding delay vectors to detect and quantify intra- and cross-frequency band cortico-muscular coupling at different time scales. Results: Our experiments with neurophysiological signals substantiate the potential of the developed methodologies for detecting and quantifying information flow between EEG and EMG signals for subjects with and without significant CMC or GC, including non-linear cross-frequency interactions, and interactions across different temporal scales. The obtained results are in agreement with the underlying sensorimotor neurophysiology. Conclusion: These findings suggest that the concept of multiscale wavelet TE provides a comprehensive framework for analysing cortex-muscle interactions. Significance: The proposed methodologies will enable developing novel insights into movement control and neurophysiological processes more generally. Comment: 12 pages. Accepted version, to appear in IEEE Transactions on Biomedical Engineering.
Published: 2021
Full Text: View/download PDF

11. A Deep 2D Convolutional Network for Waveform-Based Speech Recognition

Author: Steve Renals, Zoran Cvetkovic, Dino Oglic, and Peter Bell
Subjects: parametric filters, deep convolutional networks, Computer Science::Sound, Computer science, Speech recognition, automatic speech recognition, Waveform, robustness, raw speech
Abstract: Due to limited computational resources, acoustic models of early automatic speech recognition (ASR) systems were built in low-dimensional feature spaces that incur considerable information loss at the outset of the process. Several comparative studies of automatic and human speech recognition suggest that this information loss can adversely affect the robustness of ASR systems. To mitigate that and allow for learning of robust models, we propose a deep 2D convolutional network in the waveform domain. The first layer of the network decomposes waveforms into frequency sub-bands, thereby representing them in a structured high-dimensional space. This is achieved by means of a parametric convolutional block defined via cosine modulations of compactly supported windows. The next layer embeds the waveform in an even higher-dimensional space of high-resolution spectro-temporal patterns, implemented via a 2D convolutional block. This is followed by a gradual compression phase that selects the most relevant spectro-temporal patterns using wide-pass 2D filtering. Our results show that the approach significantly outperforms alternative waveform-based models on both noisy and spontaneous conversational speech (24% and 11% relative error reduction, respectively). Moreover, this study provides empirical evidence that learning directly from the waveform domain could be more effective than learning using hand-crafted features.
Published: 2020
Full Text: View/download PDF
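
The abstract above describes a first layer built from cosine modulations of compactly supported windows. The sketch below illustrates that idea for a single band-pass filter, assuming a Hann window and a fixed centre frequency. The window choice, the filter length, and the names `cosine_modulated_filter` and `magnitude_response` are assumptions for illustration; the paper's parametrization and how its parameters are learned are not reproduced here.

```python
import math


def cosine_modulated_filter(center_hz, sample_rate_hz, length):
    """FIR taps: a Hann window (compact support) times a cosine at center_hz."""
    taps = []
    for n in range(length):
        # Hann window, zero at both ends of its support.
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
        # Time in seconds, centred on the middle tap so the filter is symmetric.
        t = (n - (length - 1) / 2.0) / sample_rate_hz
        taps.append(window * math.cos(2.0 * math.pi * center_hz * t))
    return taps


def magnitude_response(taps, freq_hz, sample_rate_hz):
    """Magnitude of the filter's discrete-time Fourier transform at freq_hz."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    real = sum(h * math.cos(omega * n) for n, h in enumerate(taps))
    imag = sum(h * math.sin(omega * n) for n, h in enumerate(taps))
    return math.hypot(real, imag)


# Hypothetical settings: a 1 kHz sub-band at a 16 kHz sampling rate.
taps = cosine_modulated_filter(1000.0, 16000.0, 401)
in_band = magnitude_response(taps, 1000.0, 16000.0)
out_of_band = magnitude_response(taps, 3000.0, 16000.0)
print(in_band > out_of_band)
```

Modulating the window shifts its low-pass response to the centre frequency, so a bank of such filters with different centre frequencies splits a waveform into sub-bands.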

12. Localization Uncertainty In Time-Amplitude Stereophonic Reproduction

Author: Toon van Waterschoot, Marc Moonen, Enzo De Sena, Huseyin Hacihabiboglu, and Zoran Cvetkovic
Subjects: Stereophony, Acoustics and Ultrasonics, Computer science, recording and reproduction, 01 natural sciences, law.invention, 030507 speech-language pathology & audiology, 03 medical and health sciences, symbols.namesake, Position (vector), law, Audio and Speech Processing (eess.AS), 0103 physical sciences, Computer Science (miscellaneous), FOS: Electrical engineering, electronic engineering, information engineering, Active listening, Electrical and Electronic Engineering, 010301 acoustics, Sweet spot, Pearson product-moment correlation coefficient, Computational Mathematics, Stereophonic sound, Amplitude, auditory modeling, symbols, localization uncertainty, panning, 0305 other medical science, Algorithm, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This article studies the effects of inter-channel time and level differences in stereophonic reproduction on perceived localization uncertainty, which is defined as how difficult it is for a listener to tell where a sound source is located. Towards this end, a computational model of localization uncertainty is proposed first. The model calculates inter-aural time and level difference cues, and compares them to those associated with free-field point-like sources. The comparison is carried out using a particular distance functional that replicates the increased uncertainty observed experimentally with inconsistent inter-aural time and level difference cues. The model is validated by formal listening tests, achieving a Pearson correlation of 0.99. The model is then used to predict localization uncertainty for stereophonic setups and a listener in central and off-central positions. Results show that amplitude methods achieve a slightly lower localization uncertainty for a listener positioned exactly in the center of the sweet spot. As soon as the listener moves away from that position, the situation reverses, with time-amplitude methods achieving a lower localization uncertainty.
Published: 2020
Full Text: View/download PDF

13. Bilinear Dictionary Update via Linear Least Squares

Author: Jubo Zhu, Zoran Cvetkovic, Qi Yu, and Wei Dai
Subjects: Training set, Iterative method, Computer science, Bilinear interpolation, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), 020206 networking & telecommunications, 02 engineering and technology, Inverse problem, 0202 electrical engineering, electronic engineering, information engineering, Embedding, 020201 artificial intelligence & image processing, Total least squares, Algorithm, Linear least squares, Sparse matrix
Abstract: Algorithms for dictionary learning aim to learn a dictionary under which training data have sparse representations. This paper addresses the dictionary update sub-problem, the goal of which is to update the dictionary and the corresponding sparse coefficients given a fixed sparsity pattern. It is a non-convex bilinear inverse problem, and hence challenging to solve. Inspired by a recent work by Ling and Strohmer, we re-formulate the dictionary update problem as a linear least squares problem, which is convex and easy to solve. Necessary bounds on the number of training samples required for a unique solution are derived when the exact sparsity pattern is known. Further, for dictionary update with unknown sparsity patterns, an efficient iterative algorithm based on total least squares is developed. Embedding the new dictionary update procedure into an overall dictionary learning algorithm achieves better numerical performance compared to state-of-the-art algorithms.
Published: 2019
Full Text: View/download PDF

14. Simulation of coupled volume acoustics with coupled volume scattering delay network models

Author: Huseyin Hacihabiboglu, Enzo De Sena, Zoran Cvetkovic, Zühre Sü Gül, and Timuçin B. Atalay
Subjects: Physics, Acoustics and Ultrasonics, Arts and Humanities (miscellaneous), Volume (thermodynamics), Acoustics, Volume scattering, Network model
Published: 2021
Full Text: View/download PDF

15. Perceptual Soundfield Reconstruction In Three Dimensions Via Sound Field Extrapolation

Author: Huseyin Hacihabiboglu, Enzo De Sena, Ege Erdem, and Zoran Cvetkovic
Subjects: Microphone array, Computer science, Microphone, Acoustics, Extrapolation, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020206 networking & telecommunications, 02 engineering and technology, Rendering (computer graphics), law.invention, Sound recording and reproduction, 030507 speech-language pathology & audiology, 03 medical and health sciences, Stereophonic sound, law, 0202 electrical engineering, electronic engineering, information engineering, Loudspeaker, 0305 other medical science, ComputingMethodologies_COMPUTERGRAPHICS
Abstract: Perceptual sound field reconstruction (PSR) is a spatial audio recording and reproduction method based on the application of stereophonic panning laws in microphone array design. PSR allows rendering a perceptually veridical and stable auditory perspective in the horizontal plane of the listener, and involves recording using near-coincident microphone arrays. This paper extends the PSR concept to three dimensions using sound field extrapolation carried out in the spherical-harmonic domain. Sound field rendering is performed using a two-level loudspeaker rig. An active-intensity-based analysis of the rendered sound field shows that the proposed approach can render the direction of monochromatic plane waves accurately.
Published: 2019

16. Dictionary Learning with BLOTLESS Update

Author: Zoran Cvetkovic, Qi Yu, Wei Dai, and Jubo Zhu
Subjects: Inverse problems, Signal Processing (eess.SP), FOS: Computer and information sciences, Computer Science - Machine Learning, Training set, Computer science, 020206 networking & telecommunications, Dictionary learning, 02 engineering and technology, Sparse approximation, Inverse problem, Machine Learning (cs.LG), Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, FOS: Electrical engineering, electronic engineering, information engineering, Electrical and Electronic Engineering, Total least squares, Electrical Engineering and Systems Science - Signal Processing, Neural coding, Algorithm, Sparse representation, Block (data storage)
Abstract: Algorithms for learning a dictionary to sparsely represent a given dataset typically alternate between sparse coding and dictionary update stages. Methods for dictionary update aim to minimise expansion error by updating dictionary vectors and expansion coefficients given patterns of non-zero coefficients obtained in the sparse coding stage. We propose a block total least squares (BLOTLESS) algorithm for dictionary update. BLOTLESS updates a block of dictionary elements and the corresponding sparse coefficients simultaneously. In the error-free case, three necessary conditions for exact recovery are identified. Lower bounds on the number of training data are established so that the necessary conditions hold with high probability. Numerical simulations show that the bounds approximate well the number of training data needed for exact dictionary recovery. Numerical experiments further demonstrate several benefits of dictionary learning with BLOTLESS update compared with state-of-the-art algorithms, especially when the amount of training data is small.
Published: 2019
Full Text: View/download PDF

17. Cortico-Muscular Coherence Enhancement Via Sparse Signal Representation

Author: Zoran Cvetkovic, Qi Yu, Wei Dai, Verity M. McClelland, and Yuhang Xu
Subjects: musculoskeletal diseases, medicine.diagnostic_test, business.industry, Computer science, Noise (signal processing), 0206 medical engineering, Motor control, 020206 networking & telecommunications, Pattern recognition, 02 engineering and technology, Coherence (statistics), Electroencephalography, Neurophysiology, 020601 biomedical engineering, Signal, 0202 electrical engineering, electronic engineering, information engineering, medicine, Artificial intelligence, business, psychological phenomena and processes, Sparse matrix
Abstract: Identification of specific cortico-muscular interactions is essential for understanding sensorimotor control. These interactions are commonly studied by analyzing cortico-muscular coherence (CMC) between electroencephalogram (EEG) and surface electromyogram (sEMG) recorded synchronously under a motor control task. However, the presence of noise and components irrelevant to the monitored task weakens CMC so that it is often very difficult to detect. This study proposes an approach based on dictionary learning and sparse signal representation combined with a component selection algorithm to extract versions of EEG and sEMG signals which contain higher relative levels of coherent components. Evaluations using neurophysiological data show that the method achieves a substantial increase in CMC levels.
Published: 2018
Full Text: View/download PDF

18. P24-S Abnormal patterns of corticomuscular and intermuscular coherence in acquired and idiopathic/genetic childhood dystonias

Author: Verity M. McClelland, Jean-Pierre Lin, Peter Brown, Kerry R. Mills, and Zoran Cvetkovic
Subjects: Dystonia, medicine.medical_specialty, medicine.diagnostic_test, business.industry, Coherence (statistics), Index finger, Electroencephalography, Thumb, Audiology, medicine.disease, Sensory Systems, body regions, medicine.anatomical_structure, Neurology, Forearm, Physiology (medical), medicine, Mann–Whitney U test, Neurology (clinical), business, Sensorimotor cortex
Abstract: Background: Sensorimotor processing is abnormal in Idiopathic/Genetic dystonias but remains unstudied in Acquired dystonias. We test the hypothesis that sensory modulation of Beta-Corticomuscular coherence (CMC) and Intermuscular coherence (IMC) differs with dystonia aetiology. Methods: Participants: 11 children with Acquired dystonia, 5 with Genetic/Idiopathic dystonia and 13 typically-developing-children (TDC) (12–18 yrs). The child grasped a ruler between thumb and index finger. Mechanical perturbations were provided by an electromechanical tapper. Surface EMG (first dorsal interosseous and forearm extensors) and bipolar EEG over contralateral sensorimotor cortex were recorded in 5-s epochs (200×). Signals were amplified, bandpass filtered (EEG 0.5–100 Hz; EMG 5–250 Hz) and digitised (1024 Hz). CMC and IMC were computed using a 512-point short-time Fourier transform and 95% confidence levels were derived. The analysis window moved across the epoch to assess change in CMC/IMC over time. Results: Beta-CMC (14–36 Hz) was identified in 13/13 TDCs, 3/5 children with Idiopathic/Genetic and 9/11 with Acquired dystonia. Beta-CMC magnitude increased significantly from baseline to early post-stimulus in TDCs and Acquired dystonia (Wilcoxon signed rank test p = 0.001 and p = 0.004 respectively), but not in Idiopathic/Genetic dystonia (p = 0.959). Post-stimulus beta-CMC magnitude was significantly higher in TDCs than Idiopathic/Genetic dystonia (Mann Whitney p = 0.002) and in Acquired than Idiopathic/Genetic dystonia (p = 0.038). Beta-IMC was similar across groups. Prominent low frequency (4–8 Hz) IMC was seen in all dystonia patients, but not in TDCs, and correlated with severity (BFMDRS-m). Conclusion: Idiopathic/Genetic and Acquired dystonia share an abnormal low-frequency IMC but patterns of beta-CMC sensory modulation are distinct, indicating different sensorimotor processing abnormalities between these groups.
Published: 2019
Full Text: View/download PDF
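
The abstract above computes coherence with a 512-point short-time Fourier transform at a 1024 Hz sampling rate and derives 95% confidence levels. The sketch below works through the implied arithmetic: the frequency resolution, the FFT bins covering the 14–36 Hz beta band, and a confidence threshold. The threshold uses the commonly cited formula 1 - alpha^(1/(L-1)) for L non-overlapping segments. Whether the study used this formula, and the segment count of 200 (one segment per epoch), are assumptions; all function names are hypothetical.

```python
def frequency_resolution_hz(sample_rate_hz, n_fft):
    """Spacing between adjacent FFT bins."""
    return sample_rate_hz / n_fft


def band_bins(low_hz, high_hz, sample_rate_hz, n_fft):
    """Indices of the non-negative-frequency bins that fall inside a band."""
    resolution = sample_rate_hz / n_fft
    return [k for k in range(n_fft // 2 + 1) if low_hz <= k * resolution <= high_hz]


def coherence_confidence_level(n_segments, alpha=0.05):
    """Coherence value exceeded with probability alpha under independence.

    Assumes n_segments non-overlapping, independent segments.
    """
    return 1.0 - alpha ** (1.0 / (n_segments - 1))


resolution = frequency_resolution_hz(1024, 512)   # 1024 / 512 = 2.0 Hz per bin
beta_bins = band_bins(14, 36, 1024, 512)          # bins 7 through 18
threshold = coherence_confidence_level(200)       # 200 segments is an assumption
print(resolution, beta_bins[0], beta_bins[-1], threshold)
```

Under these assumptions, coherence values above the threshold in any beta-band bin would be treated as significant at the 95% level.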

19. Searching and managing references for medical research and results' publishing

Author: Snezana Popovic, Zoran Cvetkovic, and Emil Vlajic
Subjects: Publishing, business.industry, Computer science, Library science, Medical research, business
Published: 2016
Full Text: View/download PDF

20. Deconvolution of the glottal pulse using a finite-difference solution of the acoustical Klein-Gordon equation

Author: Edward Roy Pike, H. S. Kalsi, and Zoran Cvetkovic
Subjects: Blind deconvolution, Speech recognition, Finite difference, Finite difference method, 020206 networking & telecommunications, 02 engineering and technology, Wave equation, 030507 speech-language pathology & audiology, 03 medical and health sciences, symbols.namesake, Computer Science::Sound, 0202 electrical engineering, electronic engineering, information engineering, symbols, Waveform, Applied mathematics, Deconvolution, 0305 other medical science, Klein–Gordon equation, Impulse response, Mathematics
Abstract: Deconvolution of the glottal-pulse waveform from the speech signal remains an active field of research although dating back over half a century. In the main, existing approaches use classical inverse filtering frequency-domain methods to estimate both the vocal-tract and glottal-pulse waveforms. In this paper, we adopt a new approach which takes advantage of two relatively recent developments: firstly, the physical modeling of the speech process by means of the Klein-Gordon wave equation of relativistic quantum mechanics and, secondly, a finite-difference calculation of this equation to find the impulse response of the vocal tract. This approach allows accurate parameterisation of the impulse response which simplifies the blind deconvolution. Results show considerable improvement compared with existing algorithms when applied to synthetic speech where the ground truth is known.
Published: 2017
Full Text: View/download PDF

21. Cortico-muscular coherence enhancement via coherent Wavelet enhanced Independent Component Analysis

Author: Zoran Cvetkovic, Verity M. McClelland, Kerry R. Mills, and Yuhang Xu
Subjects: Computer science, Speech recognition, 0206 medical engineering, Wavelet Analysis, 02 engineering and technology, Electroencephalography, 03 medical and health sciences, 0302 clinical medicine, Wavelet, Component (UML), medicine, Humans, Coherence (signal processing), Muscle, Skeletal, medicine.diagnostic_test, Electromyography, Motor Cortex, Motor control, 020601 biomedical engineering, Independent component analysis, Noise, medicine.anatomical_structure, Algorithms, 030217 neurology & neurosurgery, Motor cortex
Abstract: Functional coupling between the motor cortex and muscle activity is usually detected and characterized using the spectral method of cortico-muscular coherence (CMC) between surface electromyogram (sEMG) and electroencephalogram (EEG) recorded synchronously during a motor control task. However, CMC is often weak and not easily detectable in all individuals. One of the reasons for the low levels of CMC is the presence of noise and components unrelated to the considered tasks in recorded sEMG and EEG signals. In this paper we propose a method for enhancing relative levels of sEMG components coherent with synchronous EEG signals via a variant of Wavelet Independent Component Analysis combined with a novel component selection algorithm. The effectiveness of the proposed algorithm is demonstrated using data collected in neurophysiological experiments.
Published: 2017
Full Text: View/download PDF

22. Delay estimation between EEG and EMG via coherence with time lag

Author: Kerry R. Mills, Zoran Cvetkovic, Verity M. McClelland, and Yuhang Xu
Subjects: Quantitative Biology::Neurons and Cognition, medicine.diagnostic_test, Computer science, Speech recognition, 0206 medical engineering, Time lag, Spectral density, Motor control, 02 engineering and technology, Electroencephalography, 020601 biomedical engineering, Maxima and minima, 03 medical and health sciences, 0302 clinical medicine, medicine.anatomical_structure, medicine, Coherence (signal processing), Algorithm, 030217 neurology & neurosurgery, Motor cortex
Abstract: The traditional way to estimate the time delay between the motor cortex and the periphery is based on the estimation of the slope of the phase of the cross spectral density between motor cortex electroencephalogram (EEG) and electromyography (EMG) signals recorded synchronously during a motor control task. There are several issues that could make the delay estimation using this method subject to errors, leading frequently to estimates which are in disagreement with underlying physiology. This study introduces the cortico-muscular coherence with time lag (CMCTL) function and proposes a method for estimating the delay based on finding its local maxima. We further address the issue of the interpretation of such time delay in multi-path propagation systems. Delay estimates obtained using the proposed method are more consistent compared with results obtained using the phase method and in better agreement with physiological facts.
Published: 2016
Full Text: View/download PDF
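
The lagged-coherence idea described in entry 22 can be illustrated with a small numerical sketch. This is not the authors' CMCTL estimator; it is a simplified stand-in that shifts one signal by a candidate lag, computes a Welch-style magnitude-squared coherence averaged over a band of FFT bins, and picks the lag with the largest value. The function names, the 32-sample Hann segments, the bin range, and the synthetic white-noise signals are all assumptions made for illustration.

```python
import numpy as np

def band_coherence(x, y, seg_len=32, bins=slice(2, 12)):
    """Welch-style magnitude-squared coherence, averaged over a range of FFT bins."""
    n_seg = min(len(x), len(y)) // seg_len
    win = np.hanning(seg_len)
    X = np.fft.rfft(x[:n_seg * seg_len].reshape(n_seg, seg_len) * win, axis=1)
    Y = np.fft.rfft(y[:n_seg * seg_len].reshape(n_seg, seg_len) * win, axis=1)
    sxy = (X * np.conj(Y)).mean(axis=0)      # averaged cross-spectrum
    sxx = (np.abs(X) ** 2).mean(axis=0)      # averaged auto-spectra
    syy = (np.abs(Y) ** 2).mean(axis=0)
    return float((np.abs(sxy) ** 2 / (sxx * syy))[bins].mean())

def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) at which y, advanced by that lag, is most coherent with x."""
    scores = [band_coherence(x[:len(x) - lag], y[lag:]) for lag in range(max_lag + 1)]
    return int(np.argmax(scores))

# Hypothetical test signals: y is x delayed by 12 samples plus a little noise.
rng = np.random.default_rng(0)
true_delay = 12
x = rng.standard_normal(16384)
y = np.roll(x, true_delay) + 0.1 * rng.standard_normal(x.size)
estimated = estimate_delay(x, y, max_lag=30)
```

Short segments are used on purpose: when the candidate lag is wrong, the two signals are misaligned within each segment and the averaged coherence drops, which is what makes the maximum informative.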

23. Commutativity of block decimators and expanders with arbitrary rational sampling ratios and block lengths

Author: Bingo Wing-Kuen Ling, Charlotte Yuk-Fan Ho, and Zoran Cvetkovic
Subjects: Discrete mathematics, Applied Mathematics, Sampling (statistics), Statistics::Computation, Combinatorics, Computational Theory and Mathematics, Artificial Intelligence, Block (telecommunications), Signal Processing, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Statistics, Probability and Uncertainty, Time complexity, Commutative property, Integer (computer science), Mathematics
Abstract: It is well known that samplers are linear time varying systems, so in general, the commutativity of samplers does not hold. There are some existing results on the commutativity of conventional decimators and expanders, block samplers with the same integer block lengths but different integer sampling ratios, and block samplers with different integer block lengths and integer sampling ratios. This paper extends the existing results to a necessary and sufficient condition for the commutativity of block decimators and expanders with arbitrary rational sampling ratios and block lengths.
Published: 2012
Full Text: View/download PDF
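
As background for entry 23, the classical result for conventional (non-block) samplers is that an M-fold decimator and an L-fold expander commute exactly when M and L are coprime. The sketch below checks only this baseline case numerically; it does not implement the block-sampler condition derived in the paper, and the helper names are illustrative.

```python
from math import gcd

def decimate(x, M):
    """Keep every M-th sample."""
    return x[::M]

def expand(x, L):
    """Insert L - 1 zeros after every sample."""
    y = [0] * (len(x) * L)
    y[::L] = x
    return y

def commute(M, L, n=60):
    """Compare decimate-then-expand with expand-then-decimate on a nonzero test ramp."""
    x = list(range(1, n + 1))
    a = expand(decimate(x, M), L)
    b = decimate(expand(x, L), M)
    m = min(len(a), len(b))
    return a[:m] == b[:m]

# The two orders agree for these pairs exactly when gcd(M, L) == 1.
results = {(M, L): commute(M, L) for (M, L) in [(2, 3), (3, 5), (2, 2), (4, 6)]}
```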

24. Modulation of corticomuscular coherence by peripheral stimuli

Author: Verity M. McClelland, Zoran Cvetkovic, and Kerry R. Mills
Subjects: Adult, Male, musculoskeletal diseases, Stimulation, macromolecular substances, Isometric exercise, Electroencephalography, Stimulus (physiology), Young Adult, Isometric Contraction, Physical Stimulation, medicine, Humans, Peripheral Nerves, Hand Strength, medicine.diagnostic_test, Proprioception, General Neuroscience, Motor Cortex, technology, industry, and agriculture, Index finger, Middle Aged, musculoskeletal system, Electric Stimulation, body regions, medicine.anatomical_structure, Reflex, Female, Psychology, Neuroscience, Psychomotor Performance, Motor cortex
Abstract: The purpose of this study was to investigate the effects of peripheral afferent stimuli on the synchrony between brain and muscle activity as estimated by corticomuscular coherence (CMC). Electroencephalogram (EEG) from sensorimotor cortex and electromyogram (EMG) from two intrinsic hand muscles were recorded during a key grip motor task, and the modulation of CMC caused by afferent electrical and mechanical stimulation was measured. The particular stimuli used were graded single-pulse electrical stimuli, above threshold for perception and activating cutaneous afferents, applied to the dominant or non-dominant index finger, and a pulsed mechanical displacement of the gripped object causing the subject to feel as if the object may be dropped. Following electrical stimulation of the dominant index finger, the level of β-range (14-36 Hz) CMC was reduced in a stimulus intensity-dependent fashion for up to 400 ms post-stimulus, then returned with greater magnitude before falling to baseline levels over 2.5 s, outlasting the reflex and evoked changes in EMG and EEG. Subjects showing no baseline β-range CMC nevertheless showed post-stimulus increases in β-range CMC with the same time course as those with baseline β-range CMC. The mechanical stimuli produced similar modulation of β-range CMC. Electrical stimuli to the non-dominant index finger produced no significant increase in β-range CMC. The results suggest that both cutaneous and proprioceptive afferents have access to circuits generating CMC, but that only a functionally relevant stimulus produces significant modulation of the background β-range CMC, providing further evidence that β-range CMC has an important role in sensorimotor integration.
Published: 2012
Full Text: View/download PDF

25. Voxel-wise quantification of myocardial perfusion by cardiac magnetic resonance. Feasibility and methods comparison

Author: Amedeo Chiribiri, Guillaume Leopold Theodorus Frederik Hautvast, Niloufar Zarinabad, Zoran Cvetkovic, Andreas Schuster, Masaki Ishida, Eike Nagel, and Philip Batchelor
Subjects: Computer science, Magnetic Resonance Imaging, Cine, Context (language use), Coronary Artery Disease, computer.software_genre, Sensitivity and Specificity, Article, Synthetic data, Imaging phantom, Imaging, Three-Dimensional, Voxel, Coronary Circulation, Image Interpretation, Computer-Assisted, Humans, Radiology, Nuclear Medicine and imaging, Autoregressive–moving-average model, business.industry, Myocardial Perfusion Imaging, Reproducibility of Results, Blood flow, Image Enhancement, Feasibility Studies, Deconvolution, Nuclear medicine, business, Perfusion, computer, Blood Flow Velocity, Magnetic Resonance Angiography, Biomedical engineering
Abstract: The purpose of this study is to enable high spatial resolution voxel-wise quantitative analysis of myocardial perfusion in dynamic contrast-enhanced cardiovascular MR, in particular by finding the most favorable quantification algorithm in this context. Four deconvolution algorithms (Fermi function modeling, deconvolution using B-spline basis, deconvolution using exponential basis, and autoregressive moving average modeling) were tested to calculate voxel-wise perfusion estimates. The algorithms were developed on synthetic data and validated against a true gold-standard using a hardware perfusion phantom. The accuracy of each method was assessed for different levels of spatial averaging and perfusion rate. Finally, voxel-wise analysis was used to generate high resolution perfusion maps on real data acquired from five patients with suspected coronary artery disease and two healthy volunteers. On both synthetic and perfusion phantom data, the B-spline method had the highest error in estimation of myocardial blood flow. The autoregressive moving average modeling and exponential methods gave accurate estimates of myocardial blood flow. The Fermi model was the most robust method to noise. Both simulations and maps in the patients and hardware phantom showed that voxel-wise quantification of myocardial perfusion is feasible and can be used to detect abnormal regions. Magn Reson Med 68:1994-2004, 2012. © 2012 Wiley Periodicals, Inc.
Published: 2012
Full Text: View/download PDF

26. Multichannel Dereverberation Theorems and Robustness Issues

Author: Zoran Cvetkovic and H. Hacihabiboglu
Subjects: Approximation theory, Inverse system, Acoustics and Ultrasonics, Finite impulse response, Control theory, Robustness (computer science), Noise reduction, Linear system, Hardware_ARITHMETICANDLOGICSTRUCTURES, Electrical and Electronic Engineering, Impulse (physics), Moore–Penrose pseudoinverse, Mathematics
Abstract: Multichannel dereverberation amounts to the inversion of a multiple-input/multiple-output linear time-invariant system. In this paper, necessary and sufficient conditions for perfect dereverberation using stable and finite impulse response (FIR) filters are established. It is then shown that the inverse system given by the pseudoinverse of the original transfer function matrix exhibits a noise reduction property. A necessary and sufficient condition under which this pseudoinverse system is FIR is also given. Further, an FIR approximation to the pseudoinverse system is considered and the effects of the length of this approximation on the dereverberation accuracy are investigated. Finally, an analytical and numerical assessment of the dependence of the dereverberation accuracy on the accuracy of the acquisition of room impulse responses is provided.
Published: 2012
Full Text: View/download PDF
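
To make the idea of exact FIR inversion in entry 26 concrete, here is a minimal time-domain sketch for one source and two microphones. It solves for two short FIR filters whose outputs sum to a unit impulse, using a pseudoinverse of stacked convolution matrices. This is a generic least-squares formulation, not the transfer-function-matrix construction analysed in the paper, and the impulse responses `h1` and `h2` are made-up values with no common zeros.

```python
import numpy as np

def conv_matrix(h, n_cols):
    """Matrix T such that T @ g equals np.convolve(h, g) for len(g) == n_cols."""
    T = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        T[j:j + len(h), j] = h
    return T

# Hypothetical room impulse responses (assumed values, sharing no zeros).
h1 = np.array([1.0, 0.5, 0.2])
h2 = np.array([1.0, -0.4, 0.1])

Lg = len(h1) - 1                       # inverse filter length per channel
A = np.hstack([conv_matrix(h1, Lg), conv_matrix(h2, Lg)])
d = np.zeros(A.shape[0])
d[0] = 1.0                             # target: unit impulse, no delay

g = np.linalg.pinv(A) @ d              # pseudoinverse solution
g1, g2 = g[:Lg], g[Lg:]
equalized = np.convolve(h1, g1) + np.convolve(h2, g2)
```

With a single channel an exact FIR inverse of an FIR response generally does not exist; the second channel is what makes the stacked system square and solvable here.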

27. On the Design and Implementation of Higher Order Differential Microphones

Author: Zoran Cvetkovic, Huseyin Hacihabiboglu, and E. De Sena
Subjects: Beamforming, Optimization problem, Acoustics and Ultrasonics, Cardioid, Acoustics, Array data structure, Convex combination, Electrical and Electronic Engineering, Omnidirectional antenna, Trigonometric polynomial, Directivity, Algorithm, Mathematics
Abstract: A novel systematic approach to the design of directivity patterns of higher order differential microphones is proposed. The directivity patterns are obtained by optimizing a cost function which is a convex combination of a front-back energy ratio and uniformity within a frontal sector of interest. Most of the standard directivity patterns - omnidirectional, cardioid, subcardioid, hypercardioid, supercardioid - are particular solutions of this optimization problem with specific values of two free parameters: the angular width of the frontal sector and the convex combination factor. More general solutions of practical use are obtained by varying these two parameters. Many of these optimal directivity patterns are trigonometric polynomials with complex roots. A new differential array structure that enables the implementation of general higher order directivity patterns, with complex or real roots, is then proposed. The effectiveness of the proposed design framework and the implementation structure are illustrated by design examples, simulations, and measurements.
Published: 2012
Full Text: View/download PDF
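
One ingredient of the cost function in entry 27, the front-back energy ratio, can be explored for first-order patterns of the form a + (1 - a) cos(theta). The sketch below is restricted to that first-order family and to the front-back term alone (no uniformity term), and it assumes energies are integrated over the sphere; these choices are illustrative assumptions rather than the paper's formulation.

```python
import math

def front_back_ratio(a):
    """Front-to-back energy ratio of the pattern a + (1 - a) * cos(theta).

    With u = cos(theta), spherical integration reduces to integrating the
    squared pattern over u in [0, 1] (front) and u in [-1, 0] (back).
    """
    b = 1.0 - a
    front = a * a + a * b + b * b / 3.0
    back = a * a - a * b + b * b / 3.0
    return front / back

# Coarse grid search over the single pattern parameter a in [0, 1].
candidates = [k / 1000.0 for k in range(0, 1001)]
best_a = max(candidates, key=front_back_ratio)
```

Under these assumptions the maximizer lands near a = (sqrt(3) - 1) / 2, the value usually quoted for the supercardioid, while a = 0.5 (cardioid) gives a ratio of 7 and a = 1 (omnidirectional) gives 1.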

28. Combined Features and Kernel Design for Noise Robust Phoneme Classification Using Support Vector Machines

Author: Bin Yu, Peter Sollich, Jibran Yousafzai, and Zoran Cvetkovic
Subjects: Acoustics and Ultrasonics, Noise measurement, Computer science, business.industry, Speech recognition, Pattern recognition, Speech processing, Support vector machine, ComputingMethodologies_PATTERNRECOGNITION, Kernel method, Computer Science::Sound, Robustness (computer science), Cepstrum, Waveform, Artificial intelligence, Electrical and Electronic Engineering, User interface, business
Abstract: This paper proposes methods for combining cepstral and acoustic waveform representations for a front-end of support vector machine (SVM)-based speech recognition systems that are robust to additive noise. The key issue of kernel design and noise adaptation for the acoustic waveform representation is addressed first. Cepstral and acoustic waveform representations are then compared on a phoneme classification task. Experiments show that the cepstral features achieve very good performance in low noise conditions, but suffer severe performance degradation already at moderate noise levels. Classification in the acoustic waveform domain, on the other hand, is less accurate in low noise but exhibits a more robust behavior in high noise conditions. A combination of the cepstral and acoustic waveform representations achieves better classification performance than either of the individual representations over the entire range of noise levels tested, down to -18 dB SNR.
Published: 2011
Full Text: View/download PDF

29. S67. Corticomuscular coherence in childhood dystonia

Author: Verity M. McClelland, Jean-Pierre Lin, Zoran Cvetkovic, and Kerry R. Mills
Subjects: Dystonia, medicine.medical_specialty, Sensory stimulation therapy, medicine.diagnostic_test, business.industry, Sensory system, Index finger, Thumb, Audiology, Electroencephalography, medicine.disease, Sensory Systems, body regions, medicine.anatomical_structure, Neurology, Forearm, Physiology (medical), Medicine, Neurology (clinical), medicine.symptom, business, Myoclonus
Abstract: Introduction There is growing evidence of abnormal sensorimotor processing in idiopathic and genetic dystonias. Mechanisms may differ in patients with secondary/acquired dystonias, but there are few physiological studies in this group. Corticomuscular coherence (CMC) quantifies synchrony between oscillatory EEG and EMG activity and reflects bidirectional cortex-muscle interaction. Beta range CMC is modulated by sensory stimuli. Several studies show elevated 3–7 Hz intermuscular coherence (IMC) in myoclonus dystonia or DYT1 dystonia. Neither IMC nor CMC has been studied in acquired dystonia. This study aimed to record CMC/IMC in children with genetic/idiopathic and acquired dystonia, and to assess its modulation by afferent stimuli. Methods 16 children with dystonia and 13 typically developing children (TDC) participated (age 12–18 yrs). The child grasped a 15 cm ruler between thumb and index finger of the dominant hand. Mechanical perturbations to the ruler were provided by an electromechanical tapper. Surface EMG was recorded from first dorsal interosseous (FDI) and forearm extensors (FEx). Bipolar scalp EEG was recorded over contralateral sensorimotor cortex. Up to 200 5-s epochs of data were collected per subject. EMG and EEG signals were amplified, bandpass filtered (EEG 0.5–100 Hz; EMG 5–250 Hz) and sampled at 1024 Hz. Coherence was calculated for EEG:FDI, EEG:FEx and FDI:FEx using a short-time Fourier transform giving a frequency resolution of 2 Hz. 95% confidence levels for significant coherence were calculated for each child. Bonferroni correction was applied to maintain type I error level at 0.05. The 500 ms window was moved across the 5-s data epoch in 50 ms steps to assess CMC/IMC change over time. CMC/IMC was compared between baseline, early (0.5–2 s) and late (2.0–3.5 s) post-stimulus periods using Fisher transformed coherency and Student t-test. Findings were compared between groups.
Results Significant beta range (14–36 Hz) CMC and IMC were detected post-stimulus in all children, but were seen more consistently in the TDCs. An adult pattern of beta CMC/IMC modulation (McClelland et al., 2012) was seen in 10/13 TDCs versus only 4/16 children with dystonia (all acquired). For the TDC group, beta-CMC for combined FDI:EEG and FEx:EEG increased significantly from baseline to the early post-stimulus period (p = 0.015), returning to baseline in the late post-stimulus period. For the genetic and acquired dystonia groups the beta-CMC in the post-stimulus period was not significantly increased from baseline. For IMC, the change from baseline to early post-stimulus was significant for both the TDC (p = 0.003) and the acquired dystonia group (p = 0.047) but not the genetic/idiopathic dystonia group (p = 0.088). Conclusion Beta range CMC can be identified in children with dystonia but both its magnitude and its modulation by sensory stimulation are less consistent than in TDCs. The findings also suggest differences in CMC/IMC between genetic/idiopathic vs acquired dystonia.
Published: 2018
Full Text: View/download PDF
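
The abstract of entry 29 mentions 95% confidence levels for coherence with a Bonferroni correction but does not give the formula. A commonly used threshold for magnitude-squared coherence estimated from L non-overlapping segments is 1 - alpha^(1/(L-1)); the sketch below assumes that expression and a plain Bonferroni division of alpha. Whether the study used exactly this form is an assumption, and the function names are illustrative.

```python
import math

def coherence_threshold(n_segments, alpha=0.05, n_tests=1):
    """Significance threshold for magnitude-squared coherence.

    Assumes independent, non-overlapping segments and the usual
    1 - alpha**(1 / (L - 1)) null-hypothesis level; alpha is divided
    by n_tests as a Bonferroni correction.
    """
    corrected_alpha = alpha / n_tests
    return 1.0 - corrected_alpha ** (1.0 / (n_segments - 1))

def fisher_z(coherency_magnitude):
    """Fisher (arctanh) transform, often applied before t-tests on coherency."""
    return math.atanh(coherency_magnitude)

single_test = coherence_threshold(200)
corrected = coherence_threshold(200, n_tests=12)  # 12 is an arbitrary example count
```

Correcting for more comparisons lowers the effective alpha and therefore raises the coherence value required for significance.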

30. Simulation of Directional Microphones in Digital Waveguide Mesh-Based Models of Room Acoustics

Author: Huseyin Hacihabiboglu, Zoran Cvetkovic, and Banu Gunel
Subjects: Architectural acoustics, Audio signal, Acoustics and Ultrasonics, Computer Science::Sound, Computer science, Microphone, Acoustics, Waveguide (acoustics), Musical instrument, Electrical and Electronic Engineering, Room acoustics, Directivity, Sound intensity
Abstract: Digital waveguide mesh (DWM) models are time-domain numerical methods providing computationally simple solutions for wave propagation problems. They have been used in various acoustical modeling and audio synthesis applications including synthesis of musical instrument sounds and speech, and modeling of room acoustics. A successful model of room acoustics should be able to account for source and receiver directivity. Methods for the simulation of directional sources in DWM models were previously proposed. This paper presents a method for the simulation of directional microphones in DWM-based models of room acoustics. The method is based on the directional weighting of the microphone response according to the instantaneous direction of incidence at a given point. The direction of incidence is obtained from instantaneous intensity that is calculated from local pressure values in the DWM model. The calculation of instantaneous intensity in DWM meshes and the directional accuracies of different mesh topologies are discussed. An intensity-based formulation for the response of a directional microphone is given. Simulation results for an actual microphone with frequency-dependent, non-ideal directivity function are presented.
Published: 2010
Full Text: View/download PDF

31. Structured prediction for differentiating between normal rhythms, ventricular tachycardia, and ventricular fibrillation in the ECG

Author: Zoran Cvetkovic, Yaqub Alwan, and Michael J. Curtis
Subjects: medicine.medical_specialty, Support Vector Machine, Heart Ventricles, macromolecular substances, Ventricular tachycardia, Electrocardiography, Rhythm, Internal medicine, Medicine, Humans, cardiovascular diseases, Structured prediction, Fibrillation, medicine.diagnostic_test, business.industry, medicine.disease, Markov Chains, Ventricular fibrillation, Ventricular Fibrillation, cardiovascular system, Cardiology, Tachycardia, Ventricular, medicine.symptom, business, Algorithms
Abstract: Recent studies have been performed on feature selection for diagnostics between non-ventricular rhythms and ventricular arrhythmias, or between non-ventricular fibrillation and ventricular fibrillation. However, they did not assess classification directly between non-ventricular rhythms, ventricular tachycardia and ventricular fibrillation, which is important in both a clinical setting and preclinical drug discovery. In this study it is shown that in a direct multiclass setting, the selected features from these studies are not capable of differentiating between ventricular tachycardia and ventricular fibrillation. A high dimensional feature space, Fourier magnitude spectra, is proposed for classification, in combination with the structured prediction method conditional random fields. An improvement in overall accuracy and in the sensitivity of every category under investigation is achieved.
Published: 2016

32. Single-Bit Oversampled A/D Conversion With Exponential Accuracy in the Bit Rate

Author: Zoran Cvetkovic, B.F. Logan, and Ingrid Daubechies
Subjects: Pointwise, Signal reconstruction, Image processing, Code rate, Library and Information Sciences, Computer Science Applications, Exponential function, Control theory, Oversampling, Dither, Linear combination, Algorithm, Information Systems, Mathematics
Abstract: A scheme for simple oversampled analog-to-digital (A/D) conversion using single-bit quantization is presented. The scheme is based on recording positions of zero-crossings of the input signal added to a deterministic dither function. This information can be represented in a manner such that the bit rate increases only logarithmically with the oversampling factor r. The input band-limited signal can be reconstructed from this information locally with O(1/r) pointwise error, resulting in an exponentially decaying distortion-rate characteristic. In the course of studying the accuracy of the proposed A/D conversion scheme, some new results are established about reconstruction of band-limited signals from irregular samples using linear combination of functions with fast decay. Schemes for local interpolation of band-limited signals from quantized irregular samples are also proposed.
Published: 2007
Full Text: View/download PDF

33. The content of copper and zinc in human ulcered atherosclerotic plaque

Author: Branko Petrovic, Zoran Cvetkovic, Djordje Radak, Nebojsa Tasic, Gordana Djordjevic-Denic, and Vesna Lackovic
Subjects: medicine.medical_specialty, oligoelements, business.industry, medicine.medical_treatment, lcsh:R, Significant difference, Ultrasound, lcsh:Medicine, chemistry.chemical_element, General Medicine, Zinc, Carotid endarterectomy, medicine.disease, Gastroenterology, Obesity, Surgery, Blood pressure, chemistry, Internal medicine, Diabetes mellitus, medicine, atherosclerosis, Endothelial dysfunction, business, ulcered plaque
Abstract: INTRODUCTION Copper and zinc have a significant antiatherogenic effect, influencing the activity of antioxidant enzymes (glutathione peroxidase and superoxide dismutase), the mechanism of apoptosis, and other mechanisms. Few studies have shown increased copper and zinc concentrations in atherosclerotic plaque in comparison to normal vascular tissue. AIM The aim of the study was to compare copper and zinc concentrations in carotid artery tissue without significant atherosclerotic changes and in human ulcered atherosclerotic plaque. MATERIAL AND METHODS The study was conducted on 66 patients. Carotid endarterectomy due to significant carotid atherosclerotic changes with cerebrovascular disorders was performed in 54 patients (81.8%). The control group consisted of 12 patients (18.2%) without carotid atherosclerotic changes, operated on due to symptomatic kinking and coiling of the carotid artery. The operated group consisted of 38 men (62.96%) and 16 women (37.04%). The control group had equal numbers of men and women: six men (50%) and six women (50%). Preoperatively, all patients were examined by a vascular surgeon, a neurologist and a cardiologist. Duplex sonography of carotid and vertebral arteries was performed with an Aloca DSD 630 ultrasound with mechanical and linear transducer 7.7 MHz. Indication for surgical treatment was obtained according to non-invasive diagnostic protocol and neurological symptoms. Copper and zinc concentrations in human ulcered atherosclerotic plaque and carotid artery segment were estimated by spectrophotometry (Varian AA-5). RESULTS The average age of our patients was 59.8±8.1 years. For males the average age was 76.1±9.8 years, and for females 42.4±5.8 years. In the group with carotid endarterectomy female patients were significantly younger than male patients (p
Published: 2004
Full Text: View/download PDF

34. Nonuniform oversampled filter banks for audio signal processing

Author: Zoran Cvetkovic and J.D. Johnston
Subjects: Audio signal, Acoustics and Ultrasonics, Computer science, Signal reconstruction, Speech recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, computer.software_genre, Filter bank, Anti-aliasing, Aliasing, Filter (video), Electronic engineering, Oversampling, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Audio signal processing, computer, Software
Abstract: In emerging audio technology applications, there is a need for decompositions of audio signals into oversampled subband components with time-frequency resolution which mimics that of the cochlear filter bank and with high aliasing attenuation in each of the subbands independently, rather than aliasing cancellation properties. We present a design of nearly perfect reconstruction nonuniform oversampled filter banks which implement signal decompositions of this kind.
Published: 2003
Full Text: View/download PDF

35. Resilience properties of redundant expansions under additive noise and quantization

Author: Zoran Cvetkovic
Subjects: Bandlimiting, Discrete mathematics, Source code, Computational complexity theory, Logarithm, Signal reconstruction, media_common.quotation_subject, Library and Information Sciences, Computer Science Applications, Euclidean distance, Quantization (physics), Exponential error, Applied mathematics, Information Systems, Mathematics, media_common
Abstract: Representing signals using coarsely quantized coefficients of redundant expansions is an interesting source coding paradigm, the most important practical case of which is oversampled analog-to-digital (A/D) conversion. Signal reconstruction from quantized redundant expansions and the accuracy of such representations are problems which are not well understood and we study them in this paper for uniform scalar quantization in finite-dimensional spaces. To give a more global perspective, we first present an analysis of the resilience of redundant expansions to degradation by additive noise in general, and then focus on the effects of uniform scalar quantization. The accuracy of signal representations obtained by applying uniform scalar quantization to coefficients of redundant expansions, measured as the mean-squared Euclidean norm of the reconstruction error, has been previously shown to be lower-bounded by a 1/r^2 expression. We establish some general conditions under which the 1/r^2 accuracy can actually be attained, and under those conditions prove a 1/r^2 upper error bound. For a particular kind of structured expansions, which includes many popular frame classes, we propose reconstruction algorithms which attain the 1/r^2 accuracy at low numerical complexity. These structured expansions, moreover, facilitate efficient encoding of quantized coefficients in a manner which requires only a logarithmic bit-rate increase in redundancy, resulting in an exponential error decay in the bit rate. Results presented in this paper are immediately applicable to oversampled A/D conversion of periodic bandlimited signals.
Published: 2003
Full Text: View/download PDF

36. Efficient Synthesis Of Room Acoustics Via Scattering Delay Networks

Author: Zoran Cvetkovic, Hüseyin Hacıhabiboğlu, Enzo De Sena, and Julius O. Smith
Subjects: Physics, FOS: Computer and information sciences, Reverberation, Absorption (acoustics), Sound (cs.SD), Acoustics and Ultrasonics, Computational complexity theory, Scattering, Acoustics, Time evolution, Room acoustics, Computer Science - Sound, Multimedia (cs.MM), Computational Mathematics, Computer Science::Sound, Computer Science (miscellaneous), Electrical and Electronic Engineering, Order of magnitude, Energy (signal processing), Computer Science - Multimedia
Abstract: An acoustic reverberator consisting of a network of delay lines connected via scattering junctions is proposed. All parameters of the reverberator are derived from physical properties of the enclosure it simulates. It allows for simulation of unequal and frequency-dependent wall absorption, as well as directional sources and microphones. The reverberator renders the first-order reflections exactly, while making progressively coarser approximations of higher-order reflections. The rate of energy decay is close to that obtained with the image method (IM) and consistent with the predictions of Sabine and Eyring equations. The time evolution of the normalized echo density, which was previously shown to be correlated with the perceived texture of reverberation, is also close to that of the IM. However, its computational complexity is one to two orders of magnitude lower, comparable to the computational complexity of a feedback delay network and its memory requirements are negligible.
Published: 2015
Full Text: View/download PDF

37. On simple oversampled A/D conversion in L^2(R)

Author: Martin Vetterli and Zoran Cvetkovic
Subjects: Discrete mathematics, Logarithmic growth, Library and Information Sciences, Computer Science Applications, Quantization (physics), Analog signal, Sampling (signal processing), Control theory, Bit rate, Oversampling, Exponential decay, Sampling interval, Information Systems, Mathematics
Abstract: The accuracy of oversampled analog-to-digital (A/D) conversion and its dependence on the sampling interval τ and on the bit rate R are characteristics fundamental to A/D conversion but not completely understood. These characteristics are studied for oversampled A/D conversion of band-limited signals in L^2(R). We show that the digital sequence obtained in the process of oversampled A/D conversion describes the corresponding analog signal with an error which tends to zero as τ^2 in energy, provided that the quantization threshold crossings of the signal constitute a sequence of stable sampling in the respective space of band-limited functions. Further, we show that the sequence of quantized samples can be represented in a manner which requires only a logarithmic increase in the bit rate with the sampling frequency, R = O(|log τ|), and hence that the error of oversampled A/D conversion actually exhibits an exponential decay in the bit rate as the sampling interval tends to zero.
Published: 2001
Full Text: View/download PDF
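
The step from a logarithmic bit rate to exponential error decay in entry 37 can be written out in one line. The constants c and a below are placeholders introduced for illustration (the abstract only states the orders of growth), and the sampling interval is assumed small enough that 0 < τ < 1.

```latex
\varepsilon(\tau) \le c\,\tau^{2},
\qquad
R \le a\,\lvert \log \tau \rvert = a \log(1/\tau)
\;\Longrightarrow\;
\tau \le e^{-R/a}
\;\Longrightarrow\;
\varepsilon \le c\, e^{-2R/a}.
```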

38. On discrete short-time Fourier analysis

Author: Zoran Cvetkovic
Subjects: Discrete mathematics, Signal processing, Computational complexity theory, Quantization (signal processing), Filter bank, symbols.namesake, Nonlinear system, Fourier transform, Discrete time and continuous time, Fourier analysis, Signal Processing, symbols, Electrical and Electronic Engineering, Algorithm, Mathematics
Abstract: Weyl-Heisenberg frames are a principal tool of short-time Fourier analysis. We present a comprehensive study of Weyl-Heisenberg frames in l^2(Z), with a focus on frames that are tight. A number of properties of these frames are derived. A complete parameterization of finite-length windows for tight Weyl-Heisenberg frames in l^2(Z) is described. Design of windows for tight Weyl-Heisenberg frames requires optimization of their frequency characteristics under nonlinear constraints. We propose an efficient design method based on expansions with respect to prolate spheroidal sequences. The advantages of the proposed method over standard optimization procedures include a reduction in computational complexity and the ability to provide long windows that can be specified concisely using only a few parameters; these advantages become increasingly pronounced as the frame redundancy increases. The resilience of overcomplete Weyl-Heisenberg expansions to additive noise and quantization is also studied. We show that manifestations of degradation due to uncorrelated zero-mean additive noise are inversely proportional to the expansion redundancy, whereas the quantization error is for a given quantization step inversely proportional to the square of the expansion redundancy.
Published: 2000
Full Text: View/download PDF

39. Tight Weyl-Heisenberg frames in ℓ²(ℤ)

Author: Martin Vetterli and Zoran Cvetkovic
Subjects: Pure mathematics, Finite impulse response, Filter bank, Matrix decomposition, Time–frequency analysis, symbols.namesake, Factorization, Fourier analysis, Signal Processing, symbols, Polyphase system, Oversampling, Electrical and Electronic Engineering, Mathematics
Abstract: Tight Weyl-Heisenberg frames in ℓ²(ℤ) are the tool for short-time Fourier analysis in discrete time. They are closely related to paraunitary modulated filter banks and are studied here using techniques of the filter bank theory. Good resolution of short-time Fourier analysis in the joint time-frequency plane is not attainable unless some redundancy is introduced. That is the reason for considering overcomplete Weyl-Heisenberg expansions. The main result of this correspondence is a complete parameterization of finite length tight Weyl-Heisenberg frames in ℓ²(ℤ) with arbitrary rational oversampling ratios. This parameterization follows from a factorization of polyphase matrices of paraunitary modulated filter banks, which is introduced first.
Published: 1998
Full Text: View/download PDF

40. High-dimensional Discriminant Analysis of Human Cardiac Arrhythmias

Author: Yaqub Alwan, Zoran Cvetkovic, and Michael Curtis
Abstract: Publication in the conference proceedings of EUSIPCO, Marrakech, Morocco, 2013
Published: 2013
Full Text: View/download PDF

41. Effects of domain-specific SVM kernel design on the robustness of automatic speech recognition

Author: Peter Sollich, Zoran Cvetkovic, and Jibran Yousafzai
Subjects: Polynomial, business.industry, Speech recognition, Pattern recognition, Invariant (physics), Support vector machine, ComputingMethodologies_PATTERNRECOGNITION, Computer Science::Sound, Polynomial kernel, Problem domain, Radial basis function kernel, Radial basis function, Artificial intelligence, Linear combination, business, Mathematics
Abstract: We consider the effects of incorporating prior knowledge of features which correlate with phoneme identity as well as perceptual invariances into the design of SVM kernels for phoneme classification in high-dimensional spaces of acoustic waveforms of speech. To this end we explore products and linear combinations of polynomial and radial basis function kernels to design composite kernels which are invariant to waveform sign and time shift, and capture the dynamics of energy evolution in the time-frequency plane. Experiments show marked improvements in phoneme classification as a result of this custom kernel design. This demonstrates that even in high-dimensional feature spaces, careful kernel design based on prior knowledge of the problem domain can have significant payback.
Published: 2013
Full Text: View/download PDF
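
The idea of a kernel that is invariant to waveform sign, mentioned in the abstract above, can be illustrated with a minimal sketch. It builds a sign-invariant kernel by averaging a base kernel over sign flips of both arguments. The function names, the RBF base kernel and the `gamma` value are assumptions made for illustration; the record does not state how the paper constructs its composite kernels.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel exp(-gamma * ||x - y||^2) on equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sign_invariant(kernel):
    """Return a kernel whose value is unchanged when either argument is negated.

    Hypothetical helper: averages the base kernel over the four sign
    combinations (+x, +y), (+x, -y), (-x, +y), (-x, -y).
    """
    def wrapped(x, y):
        neg_x = [-a for a in x]
        neg_y = [-b for b in y]
        return 0.25 * (
            kernel(x, y) + kernel(x, neg_y) + kernel(neg_x, y) + kernel(neg_x, neg_y)
        )
    return wrapped


# Usage: a waveform segment and its sign-flipped copy get the same kernel value.
k = sign_invariant(rbf_kernel)
x = [1.0, -2.0, 0.5]
y = [0.3, 0.1, -0.7]
same = abs(k(x, y) - k([-v for v in x], y)) < 1e-12
print(same)
```

Averaging over a group of transformations is one generic way to obtain invariance; time-shift invariance could be approached in the same spirit by also averaging over shifted copies, at a higher computational cost.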

42. A computational model for the estimation of localisation uncertainty

Author: Zoran Cvetkovic and Enzo De Sena
Subjects: Correlation, Audio signal, Time difference, Speech recognition, Experimental data, Point (geometry), Psychoacoustics, Free field, Algorithm, Imaging phantom, Mathematics
Abstract: A computational model for prediction of localisation uncertainty of phantom auditory sources is proposed. The interaural level and time difference pairs due to point sources in free field are used as a reference. The mismatch between these “natural” pairs and interaural time and level difference pairs elicited by phantom sources is quantified by means of the 0.5-norm distance, which is justified on psychoacoustic grounds. The model is validated by results of subjective listening tests, achieving a high level of correlation with experimental data.
Published: 2013
Full Text: View/download PDF
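
The 0.5-norm distance named in the abstract above can be sketched as follows. This is only an illustration of the distance itself: the function name, the way the (time difference, level difference) pairs are scaled, and the example values are assumptions, since the record does not give those details.

```python
def lp_distance(u, v, p=0.5):
    """Distance (sum_i |u_i - v_i|^p)^(1/p) between equal-length sequences.

    For p < 1 this is a quasi-norm rather than a true norm, because the
    triangle inequality no longer holds.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


# Hypothetical, already-normalised (time difference, level difference) pairs.
natural_pair = (0.0, 0.0)
phantom_pair = (1.0, 1.0)
mismatch = lp_distance(natural_pair, phantom_pair)
print(mismatch)
```

With p = 0.5, a mismatch spread over both cues is penalised more heavily than a mismatch of the same total size concentrated in one cue: the pair (1, 1) above gives 4, whereas (2, 0) gives approximately 2.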

43. Redundancy in speech signals and robustness of automatic speech recognition

Author: Peter Sollich, Jibran Yousafzai, Zoran Cvetkovic, and Matthew Ager
Subjects: Human auditory system, ComputingMethodologies_PATTERNRECOGNITION, Voice activity detection, Signal classification, Robustness (computer science), business.industry, Computer science, Speech recognition, Pattern recognition, Mel-frequency cepstrum, Artificial intelligence, Speech processing, business
Abstract: Automatic speech recognition (ASR) systems are yet to achieve the level of robustness inherent to speech recognition by the human auditory system. The primary goal of this paper is to argue that exploiting the redundancy in speech signals could be the key to solving the problem of the lack of robustness. This view is supported by our recent results on phoneme classification and recognition in the presence of noise which are surveyed in this paper.
Published: 2012
Full Text: View/download PDF

44. Frequency-Domain Scattering Delay Networks for Simulating Room Acoustics in Virtual Environments

Author: Zoran Cvetkovic, Huseyin Hacihabiboglu, and Enzo De Sena
Subjects: Reverberation, Architectural acoustics, Computer simulation, Computer science, Microphone, Frequency domain, Scalability, Virtual reality, Room acoustics, Simulation
Abstract: Modelling, simulation and auralisation of room acoustics play an important role in computer games and virtual reality applications by increasing the level of realism. Accurate simulation of room acoustics is a computationally costly process which is often substituted with artificial reverberators that provide a computationally simpler alternative. However, such systems lack accuracy and are not in general able to simulate important aspects of room acoustics such as early reflections, source/microphone directivity, and frequency-dependent absorption. A new type of interactive and scalable room simulator named the scattering delay network (SDN) was recently proposed by the authors. A frequency-domain analysis and implementation of that simulator is presented in this paper. Numerical simulation examples which demonstrate the utility of the proposed system are provided.
Published: 2011
Full Text: View/download PDF

45. A generalized design method for directivity patterns of spherical microphone arrays

Author: Huseyin Hacihabiboglu, Zoran Cvetkovic, and Enzo De Sena
Subjects: Sound recording and reproduction, Beamforming, Harmonic analysis, Optimization problem, Computer science, Microphone, Speech recognition, Acoustics, Convex combination, Omnidirectional antenna, Directivity
Abstract: Spherical microphone arrays provide a flexible solution to obtaining higher-order directivity patterns, which are useful in audio recording and reproduction. A general systematic approach to the design of directivity patterns for spherical microphone arrays is introduced in this paper. The directivity patterns are obtained by optimizing a cost function which is a convex combination of a front-back energy ratio and a smoothness term. Most of the standard directivity patterns - i.e. omnidirectional, cardioid, subcardioid, hypercardioid and supercardioid - are particular solutions of this optimization problem with specific values of two free parameters: the angle of the frontal sector, and the convex combination factor. By varying these two parameters, more general solutions of practical use are obtained.
Published: 2011
Full Text: View/download PDF

46. Rectification of the EMG is an unnecessary and inappropriate step in the calculation of Corticomuscular coherence

Author: Kerry R. Mills, Verity M. McClelland, and Zoran Cvetkovic
Subjects: musculoskeletal diseases, Adult, Male, Speech recognition, Action Potentials, Electroencephalography, Young Adult, Rectification, Distortion, medicine, Confidence Intervals, Coherence (signal processing), Humans, Muscle, Skeletal, Digital signal processing, Mathematics, medicine.diagnostic_test, Hand Strength, business.industry, Electromyography, General Neuroscience, technology, industry, and agriculture, Motor Cortex, Somatosensory Cortex, Middle Aged, musculoskeletal system, Hand, Power (physics), body regions, Motor unit, Frequency domain, Data Interpretation, Statistical, Female, business, Algorithms
Abstract: Corticomuscular coherence (CMC) estimation is a frequency domain method used to detect a linear coupling between rhythmic activity recorded from sensorimotor cortex (EEG or MEG) and the electromyogram (EMG) of active muscles. In motor neuroscience, rectification of the surface EMG is a common pre-processing step prior to calculating CMC, intended to maximize information about action potential timing, whilst suppressing information relating to motor unit action potential (MUAP) shape. Rectification is believed to produce a general shift in the EMG spectrum towards lower frequencies, including those around the mean motor unit discharge rate. However, there are no published data to support the claim that EMG rectification enhances the detection of CMC. Furthermore, performing coherence analysis after the non-linear procedure of rectification, which results in a significant distortion of the EMG spectrum, is considered fundamentally flawed in engineering and digital signal processing. We calculated CMC between sensorimotor cortex EEG and EMG of two hand muscles during a key grip task in 14 healthy subjects. CMC calculated using unrectified and rectified EMG was compared. The use of rectified EMG did not enhance the detection of CMC, nor was there any evidence that MUAP shape information had an adverse effect on the CMC estimation. EMG rectification had inconsistent effects on the power and coherence spectra and obscured the detection of CMC in some cases. We also provide a comprehensive theoretical analysis, which, along with our empirical data, demonstrates that rectification is neither necessary nor appropriate in the calculation of CMC.
Published: 2011
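
For orientation, the quantity discussed in the abstract above is commonly defined as the magnitude-squared coherence between two signals. The formula below is the standard textbook definition, given as a sketch; the symbols x (EEG), y (EMG) and S are notation assumed here, and the record does not state which spectral estimator the authors used.

```latex
% S_{xx}, S_{yy}: auto-spectra of x and y;  S_{xy}: cross-spectrum.
\[
C_{xy}(f) \;=\; \frac{\lvert S_{xy}(f)\rvert^{2}}{S_{xx}(f)\,S_{yy}(f)},
\qquad 0 \le C_{xy}(f) \le 1 .
\]
% Rectification replaces y(t) by |y(t)|, a non-linear operation,
% before the spectra are estimated.
```

Coherence measures linear coupling at each frequency, which is why applying a non-linear operation such as rectification to one of the signals beforehand is the point at issue in the abstract.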

47. Subband acoustic waveform front-end for robust speech recognition using support vector machines

Author: Jibran Yousafzai, Peter Sollich, and Zoran Cvetkovic
Subjects: Computer science, Linguistics, Spoken language
Published: 2010
Full Text: View/download PDF

48. Towards robust phoneme classification with hybrid features

Author: Peter Sollich, Zoran Cvetkovic, and Jibran Yousafzai
Subjects: Computer science, business.industry, Speech recognition, Pattern recognition, Speech processing, Support vector machine, ComputingMethodologies_PATTERNRECOGNITION, Computer Science::Sound, Robustness (computer science), Cepstrum, Waveform, Mel-frequency cepstrum, Artificial intelligence, business
Abstract: In this paper, we investigate the robustness of phoneme classification to additive noise with hybrid features using support vector machines (SVMs). In particular, the cepstral features are combined with short term energy features of acoustic waveform segments to form a hybrid representation. The energy features are then taken into account separately in the SVM kernel, and a simple subtraction method allows them to be adapted effectively in noise. This hybrid representation contributes significantly to the robustness of phoneme classification and narrows the performance gap to the ideal baseline of classifiers trained under matched noise conditions.
Published: 2010
Full Text: View/download PDF

49. ARMA regularization of cardiac perfusion modeling

Author: Philip Batchelor, Niloufar Zarinabad Nooralipour, Amedeo Chiribiri, and Zoran Cvetkovic
Subjects: Noise, Mathematical optimization, Hardware_MEMORYSTRUCTURES, Robustness (computer science), Generalization, Singular value decomposition, Applied mathematics, Deconvolution, Regularization (mathematics), Mathematics, Exponential function, Convolution
Abstract: Cardiac perfusion modelling using ARMA systems is studied. ARMA is a generalization of a recently proposed exponential approximation technique, which was shown to exhibit better performance than the widely used truncated singular value decomposition method. Experiments demonstrate that ARMA achieves results as accurate as those obtained using the exponential approximation, but is at the same time less sensitive to additive noise and model order selection.
Published: 2010
Full Text: View/download PDF

50. High-dimensional linear representations for robust speech recognition

Author: Zoran Cvetkovic, Matthew Ager, and Peter Sollich
Subjects: Computer science, business.industry, Dimensionality reduction, Speech recognition, Feature extraction, Pattern recognition, High dimensional, Nonlinear system, Computer Science::Sound, Robustness (computer science), Cepstrum, Artificial intelligence, Hidden Markov model, business, Classifier (UML)
Abstract: Phoneme classification is investigated in linear feature domains with the aim of improving the robustness to additive noise. Linear feature domains allow for exact noise adaptation and so should result in more accurate classification than representations involving nonlinear processing and dimensionality reduction. We develop a generative framework for phoneme classification using linear features. We first show results for a representation consisting of concatenated frames from the centre of the phoneme, each containing f frames. As no single f is optimal for all phonemes, we further average over models with a range of values of f. Next we improve results by including information from the entire phoneme. In the presence of additive noise, classification in this framework performs better than an analogous PLP classifier, adapted to noise using cepstral mean and variance normalisation, below 18 dB SNR.
Published: 2010
Full Text: View/download PDF

78 results on '"Zoran Cvetkovic"'