
Multi-band excitation based vocoders and their real-time implementation

Authors :: Ma, Wei
Publication Year :: 1994
Publisher :: University of Surrey, 1994.
Abstract: Vocoders compress speech signals into a very efficient representation, a set of vocal parameters, which makes very low bit rate speech communication possible. However, the reproduced speech quality has remained low for a long time, since vocoders traditionally employ very simple speech production models. With speech coding research now focusing on bit rates below 4 kb/s, after CELP-based hybrid coders achieved a remarkable breakthrough at medium to low rates (16 to 4.8 kb/s) in the last decade, vocoder techniques have regained their importance. The most attractive vocoder developed in recent years is the Multi-Band Excitation (MBE) vocoder, which uses multiple voiced/unvoiced (V/UV) decisions in the frequency domain. The MBE vocoder can produce high speech quality at rates around 4 kb/s. However, this newly emerged scheme has several problems: high implementation complexity, a bit rate rather higher than the 2.4 kb/s typical of vocoders, and low acoustic robustness. This thesis reports on a study of this new vocoder scheme. The first part of the thesis examines three key aspects of vocoders: the vocal tract filter, pitch determination and V/UV decisions. First, a comprehensive review is given for each of these topics, covering formulation and classification. Various attempts at improvement are then discussed for each of these essential functions. The relationship between different vocal tract filter descriptions reveals the possibility of designing a new type of MBE-based vocoder, the MBE-LPC vocoder. A pitch-synchronized sinusoidal synthesizer demonstrates a very efficient way of using sinusoidal-based vocal tract filters. A new pitch determination method is designed for the MBE vocoder in order to correct the bias toward low pitch. In network operation, vocoders encounter not only speech signals but also network control signals, such as tones. Two essential functions, silence detection and DTMF detection, have therefore been designed together with the V/UV decisions.
In recent years, as speech coding algorithms have employed more and more complicated techniques, their real-time implementation has become a critical problem in speech coding research. Research progress is frequently delayed by real-time implementation. Therefore, the second part of this thesis investigates MBE-based vocoders from the point of view of real-time implementation and system application. A general study of real-time implementation is included, which analyses the problems associated with modern DSP solutions. Fast implementation and efficient computation methods are proposed that take the peculiarities of both vocoding and DSPs into account. Two complete MBE-based vocoders are then examined, along with their objective verification and performance enhancements in real-time operation. The efficiency and correctness of these real-time implementations are reported; the real-time implementation of the 4.15 kb/s MBE vocoder was the first to pass the objective type approval test required by the INMARSAT-M system. A proposed 2.7 kb/s MBE-LPC vocoder achieves slightly better quality than this standard MBE vocoder.