
Audio coding via EMD

Authors :: Thierry Chonavel
Kais Khaldi
Mounia Turki Hadj-Alouane
Ali Komaty
Abdel-Ouahab Boudraa
Institut de Recherche de l'Ecole Navale (IRENAV)
Université de Bordeaux (UB)-Institut Polytechnique de Bordeaux-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE)-Arts et Métiers Sciences et Technologies
HESAM Université (HESAM)-HESAM Université (HESAM)
Département Signal et Communications (IMT Atlantique - SC)
IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)
Lab-STICC_IMTA_CID_TOMS
Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance (Lab-STICC)
École Nationale d'Ingénieurs de Brest (ENIB)-Université de Bretagne Sud (UBS)-Université de Brest (UBO)-École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne)-Institut Mines-Télécom [Paris] (IMT)-Centre National de la Recherche Scientifique (CNRS)-Université Bretagne Loire (UBL)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique)
Institut Mines-Télécom [Paris] (IMT)-École Nationale d'Ingénieurs de Brest (ENIB)-Université de Bretagne Sud (UBS)-Université de Brest (UBO)-École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne)-Institut Mines-Télécom [Paris] (IMT)-Centre National de la Recherche Scientifique (CNRS)-Université Bretagne Loire (UBL)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique)
Institut Mines-Télécom [Paris] (IMT)
Source :: Digital Signal Processing, Elsevier, 2020, 104, pp.102770. ⟨10.1016/j.dsp.2020.102770⟩
Publication Year :: 2020
Publisher :: HAL CCSD, 2020.
Abstract: This paper presents an audio coding scheme based on empirical mode decomposition (EMD) combined with a psychoacoustic model. The method adaptively decomposes the audio signal into intrinsic oscillatory components, called Intrinsic Mode Functions (IMFs), that are fully described by their local extrema, and it is these extrema that are encoded. Coding is carried out frame by frame, and no assumption is made about the signal to be coded. The number of allocated bits varies from mode to mode and is chosen so that the coding error remains inaudible. Owing to the symmetry of an IMF, only the extrema (maxima or minima) of one of its interpolating envelopes are perceptually coded. In addition, to handle rapidly changing audio signals, a stationarity index is used; when a transient is detected, the frame is split into two overlapping sub-frames. At the decoder, the IMFs are recovered from the associated coded maxima, and the original signal is reconstructed by summing the IMFs. The performance of the proposed scheme is analyzed and compared with that of the MP3 and AAC codecs and with a wavelet-based coding approach. On the mono audio signals analyzed, the proposed scheme outperforms MP3 and the wavelet-based method and performs slightly better than AAC, which shows the potential of EMD for data-driven audio coding.
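
The idea of coding an oscillatory component through its maxima can be illustrated with a minimal Python sketch. It is not the authors' codec: the test signal is a synthetic amplitude-modulated sine standing in for one IMF (no EMD is performed), the uniform quantizer and the fixed 6-bit budget are assumptions replacing the psychoacoustic bit allocation, and the names local_maxima and quantize are hypothetical helpers introduced here.

```python
import math

def local_maxima(x):
    """Return (index, value) pairs for the interior local maxima of x."""
    return [(i, x[i]) for i in range(1, len(x) - 1)
            if x[i - 1] < x[i] >= x[i + 1]]

def quantize(values, n_bits, peak):
    """Uniform quantizer on [-peak, peak]; returns codes, decoded values, step."""
    levels = 2 ** n_bits
    step = 2.0 * peak / levels
    # Clip the code so that a value equal to +peak maps to the top level.
    codes = [min(levels - 1, max(0, int((v + peak) / step))) for v in values]
    decoded = [(c + 0.5) * step - peak for c in codes]
    return codes, decoded, step

# Toy oscillatory component standing in for one IMF (assumption, not EMD output).
n = 200
imf = [math.sin(2 * math.pi * 5 * t / n) * (1 + 0.5 * math.sin(2 * math.pi * t / n))
       for t in range(n)]

maxima = local_maxima(imf)
positions = [i for i, _ in maxima]      # where the maxima occur
amplitudes = [v for _, v in maxima]     # what would be quantized and sent

# Assumed fixed budget of 6 bits per maximum; the paper adapts this per mode.
codes, decoded, step = quantize(amplitudes, n_bits=6, peak=1.5)
max_err = max(abs(a - d) for a, d in zip(amplitudes, decoded))
print(len(maxima), "maxima coded; worst amplitude error:", max_err)
```

In this sketch the amplitude error of each coded maximum is at most half the quantizer step, so lowering n_bits for a given mode trades bit rate against reconstruction error; in the paper that trade-off is governed by the inaudibility constraint rather than a fixed budget.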

Subjects :: Computer science
Audio coding
02 engineering and technology
Sub-band coding
Hilbert–Huang transform
Wavelet
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing
Artificial Intelligence
0202 electrical engineering, electronic engineering, information engineering
Codec
Psychoacoustics
Electrical and Electronic Engineering
Empirical mode decomposition
Audio signal
Applied Mathematics
020206 networking & telecommunications
Stationarity index
Maxima and minima
Computational Theory and Mathematics
Signal Processing
020201 artificial intelligence & image processing
Empirical mode compression
Computer Vision and Pattern Recognition
Statistics, Probability and Uncertainty
Algorithm
[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
Psychoacoustic model
Coding (social sciences)

Details

Language :: English
ISSN :: 1051-2004 and 1095-4333
Database :: OpenAIRE
Journal :: Digital Signal Processing, Elsevier, 2020, 104, pp.102770. ⟨10.1016/j.dsp.2020.102770⟩
Accession number :: edsair.doi.dedup.....5902cc1055bc9403ffbc8b4496f4469f
Full Text :: https://doi.org/10.1016/j.dsp.2020.102770