Data Augmentation and D-vector Representation Methods for Speaker Change Detection

Authors :: Jeon Gue Park
Shin Cha
Jisu Park
Seongbae Eun
Young-Sun Yun
Source :: RACS
Publication Year :: 2020
Publisher :: ACM, 2020.
Abstract: Speaker Change Detection (SCD) is the process of detecting speaker changes during a conversation. A conversation can be divided into homogeneous segments, partitioned by speaker identity, using a typical SCD system or a speaker diarization system. Although d-vectors are used to identify or verify speakers with deep neural network models, they are often considered insufficient for training a model that detects speaker changes from acoustic information alone. Few dedicated datasets exist for training such systems, so progress in SCD research is slow and performance is poor. We therefore present a data augmentation method based on the TIMIT dataset that is suited to SCD systems, and we propose several methods for representing d-vectors for SCD systems, along with preliminary results. In the proposed data augmentation method, the speaker boundary information is transformed into a probability according to its offset within a given frame and collected over the segment. To model speaker boundaries, we concatenate two random speech sentences originally intended for speech recognition systems. The preliminary experimental results, specifically the recall percentage, show the potential of the proposed approaches. In future work, we will add linguistic information to the proposed classification system, or improve the system by using a hybrid of d-vectors and frame vectors, or convolutional networks.
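
The augmentation idea in the abstract (concatenate two utterances, then turn the known boundary into a per-frame probability) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the offset-to-probability mapping or how labels are collected per segment, so the triangular mapping (1.0 when the boundary falls at the frame center, falling linearly to 0 at the frame edges), the use of the maximum over a segment, and all function names below are assumptions made for illustration.

```python
import random

def concat_with_boundary(utt_a, utt_b):
    """Concatenate two utterances; return the samples and the boundary index."""
    return utt_a + utt_b, len(utt_a)

def frame_change_probs(n_samples, boundary, frame_len):
    """Soft speaker-change label for each non-overlapping frame.

    ASSUMPTION: a triangular mapping of the boundary offset inside the
    frame; the abstract does not specify the actual function.
    """
    center = frame_len / 2
    probs = []
    for start in range(0, n_samples - frame_len + 1, frame_len):
        offset = boundary - start
        if 0 <= offset < frame_len:
            probs.append(1.0 - abs(offset - center) / center)
        else:
            probs.append(0.0)  # the boundary is not inside this frame
    return probs

def segment_labels(probs, frames_per_segment):
    """Collect frame labels per segment (ASSUMPTION: take the maximum)."""
    return [max(probs[i:i + frames_per_segment])
            for i in range(0, len(probs), frames_per_segment)]

# Toy usage: two random "utterances" drawn from a hypothetical corpus of
# sample lists; real input would be sentences from a corpus such as TIMIT.
random.seed(0)
corpus = [[random.random() for _ in range(n)] for n in (250, 350, 420)]
utt_a, utt_b = random.sample(corpus, 2)
mixed, boundary = concat_with_boundary(utt_a, utt_b)
frame_probs = frame_change_probs(len(mixed), boundary, frame_len=100)
print(segment_labels(frame_probs, frames_per_segment=3))
```

Because the boundary position is known by construction, every concatenated pair yields a labeled training example without manual annotation, which is the point of using recognition-oriented sentences as raw material.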

Subjects :: Artificial neural network
Computer science
Speech recognition
Frame (networking)
TIMIT
01 natural sciences
Speaker diarisation
030507 speech-language pathology & audiology
03 medical and health sciences
Recurrent neural network
Rule-based machine translation
Hybrid system
0103 physical sciences
0305 other medical science
010301 acoustics
Change detection

Details

Database :: OpenAIRE
Journal :: Proceedings of the International Conference on Research in Adaptive and Convergent Systems
Accession number :: edsair.doi...........7e4784f0c9902d78dba0d734fb22126e
Full Text :: https://doi.org/10.1145/3400286.3418270
