Audio Classification Based on MPEG-7 Spectral Basis Representations
- Source :
- IEEE Transactions on Circuits and Systems for Video Technology. 14:716-725
- Publication Year :
- 2004
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2004.
Abstract
- In this paper, we present an MPEG-7-based audio classification and retrieval technique targeted at the analysis of film material. The technique consists of low-level descriptors and high-level description schemes. For the low-level descriptors, low-dimensional features such as the audio spectrum projection, based on audio spectrum basis descriptors, are produced in order to find a balanced tradeoff between reducing dimensionality and retaining maximum information content. High-level description schemes are used to describe the modeling of the reduced-dimension features and the procedures of audio classification and retrieval. A classifier based on continuous hidden Markov models is applied. The sound model state path, which is selected according to the maximum-likelihood model, is stored in an MPEG-7 sound database and used as an index for query applications. Various experiments are presented in which the speaker- and sound-recognition rates are compared for different feature extraction methods. Using independent component analysis, we achieved better results than with the normalized audio spectrum envelope and principal component analysis in a speaker recognition system. In audio classification experiments, audio sounds are classified into selected sound classes in real time with an accuracy of 96%.
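The feature-extraction chain named in the abstract (normalized audio spectrum envelope, a reduced spectral basis, and the projection onto that basis) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes linear FFT bins rather than the logarithmic frequency bands used by MPEG-7, it uses an uncentered SVD as a PCA-style basis and omits the independent component analysis step, and the function names, frame length, and number of components are all hypothetical choices made for illustration.

```python
import numpy as np


def normalized_spectrum_envelope(frames, eps=1e-12):
    """Return L2-normalized dB-scale spectra and the per-frame norms.

    frames: array of shape (n_frames, frame_len) holding time-domain samples.
    Assumption: linear FFT bins stand in for MPEG-7 log-frequency bands.
    """
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    log_env = 10.0 * np.log10(power + eps)  # dB-scale spectral envelope
    norms = np.linalg.norm(log_env, axis=1, keepdims=True)
    return log_env / np.maximum(norms, eps), norms.ravel()


def spectrum_basis(nase, n_components):
    """PCA-style basis from an uncentered SVD; columns are basis functions."""
    _, _, vt = np.linalg.svd(nase, full_matrices=False)
    return vt[:n_components].T


def spectrum_projection(nase, basis):
    """Project each normalized frame onto the reduced basis."""
    return nase @ basis


# Synthetic noise frames stand in for real audio (illustration only).
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 256))

nase, norms = normalized_spectrum_envelope(frames)
basis = spectrum_basis(nase, n_components=10)
features = spectrum_projection(nase, basis)
print(features.shape)  # (200, 10)
```

In a pipeline of the kind the abstract describes, the reduced features (together with the per-frame norms) would be the observations passed to one hidden Markov model per sound class, with the class chosen by maximum likelihood.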
- Subjects :
- Computer science
Speech recognition
Feature extraction
Pattern recognition
Filter bank
Independent component analysis
Computer Science::Sound
Computer Science::Multimedia
Principal component analysis
Media Technology
Discrete cosine transform
Artificial intelligence
Mel-frequency cepstrum
Electrical and Electronic Engineering
Hidden Markov model
Audio signal processing
Decorrelation
Audio frequency
Details
- ISSN :
- 1051-8215
- Volume :
- 14
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Circuits and Systems for Video Technology
- Accession number :
- edsair.doi...........a51ac8b4eb0237476ccc22710a4d8831
- Full Text :
- https://doi.org/10.1109/tcsvt.2004.826766