Bionic optimization of MFCC features based on speaker fast recognition

Authors :: Xiong Chen
Changan Di
Zhaodong Lin
Source :: Applied Acoustics. 173:107682
Publication Year :: 2021
Publisher :: Elsevier BV, 2021.
Abstract: Recognizing a speaker quickly and accurately from speech in low signal-to-noise ratio (SNR) conditions has become an active research topic. The human auditory system can accurately capture the characteristics of acoustic events in complex or low-SNR noisy environments, which makes it an important reference for research on bionic hearing. The output response curve of the human ear is obtained by bionic techniques and, as the best response curve for sound enhancement, is used to modify the Mel filter. Adaptive threshold selection is then used to integrate the Mel features, achieving reduction and dynamic extraction of low-SNR speech features. This method not only overcomes the poor robustness and complexity of parametric models, but also captures dynamic and comprehensive speech information from different speakers in different scenes. Finally, an improved CNN and an i-vector system are used to reduce the dimensionality of the data and to verify recognition, achieving optimal frequency-selective amplification and simplification of the acoustic signal. In the SNR-5dB case, the model is reduced by 15% and the recognition accuracy is improved by 3%.
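
The abstract does not give the ear response curve or the threshold rule, so the following is only a rough sketch of the general idea, not the authors' method. It builds a standard triangular Mel filterbank, optionally scales each filter by a per-band weight (the hypothetical `weights` argument stands in for an ear-derived response curve), and keeps only the bands whose log energy reaches a threshold. The mean log energy is used as the threshold purely as an assumption. All function names are illustrative.

```python
import numpy as np


def hz_to_mel(f):
    """Convert frequency in Hz to the Mel scale (common 2595*log10 form)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate, weights=None):
    """Triangular Mel filters with shape (n_filters, n_fft // 2 + 1).

    `weights` is a hypothetical per-filter gain standing in for an
    ear-derived response curve; the source does not specify its values.
    """
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges = mel_to_hz(
        np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    )
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        # A triangle is the lower of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    if weights is not None:
        bank *= np.asarray(weights, dtype=float)[:, None]
    return bank


def select_bands(power_spectrum, bank):
    """Keep Mel bands whose log energy reaches the mean log energy.

    The mean is an assumed stand-in for an adaptive threshold.
    Returns the kept log energies and the boolean mask of kept bands.
    """
    log_energy = np.log(bank @ power_spectrum + 1e-10)
    keep = log_energy >= log_energy.mean()
    return log_energy[keep], keep
```

For a single frame, `select_bands(np.abs(np.fft.rfft(frame)) ** 2, mel_filterbank(26, 512, 16000))` returns a reduced feature vector together with the mask of retained bands. Because the mask changes from frame to frame, a real system would need a fixed-length representation before feeding a CNN or i-vector back end.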