1. Neurally Optimized Decoder for Low Bitrate Speech Codec.
- Author
Kim, Hyung Yong; Yoon, Ji Won; Cho, Won Ik; Kim, Nam Soo
- Subjects
VIDEO coding, AUTOMATIC speech recognition, SPEECH processing systems, GENERATIVE adversarial networks, BINARY sequences, CODECS
- Abstract
Recently, conventional neural decoders for speech codecs have shown promising performance. However, they typically require prior knowledge of the decoding process, such as the bit allocation or dequantization scheme, and therefore do not offer a universal solution across different kinds of speech codecs. To address this limitation, we propose a neurally optimized decoder based on a generative model that can reconstruct speech directly from the bitstream without prior knowledge. The proposed decoder consists mainly of two components: 1) a dequantization model that groups and dequantizes related bits from the bitstream and 2) a generative model that restores the speech conditioned on the output of the dequantization model. In experiments with mixed excitation linear prediction (MELP), advanced multi-band excitation (AMBE), and SPEEX at around 2.4 kb/s, the proposed model outperformed the conventional speech codecs in most objective and subjective evaluations. [ABSTRACT FROM AUTHOR]
- Published
2022