Multisensor user authentication

Authors :: D. Krepp
Steven K. Rogers
Mark E. Oxley
John M. Colombi
Dennis W. Ruck
Source :: SPIE Proceedings.
Publication Year :: 1993
Publisher :: SPIE, 1993.
Abstract: User recognition is examined using neural and conventional techniques for processing speech and face images. This article is a first attempt to overcome the significant problem of distortions inherently introduced when data are captured over multiple sessions (days). Speaker recognition uses both Linear Predictive Coding (LPC) cepstral and auditory neural model representations with speaker-dependent codebook designs. For facial imagery, recognition is built on a multilayer perceptron with a single hidden layer, trained by backpropagation, whose inputs are either the raw data or principal components of the raw data computed using the Karhunen-Loeve Transform. The data come from 10 subjects, each of whom recorded utterances and had images collected over 10 days. The utterances collected were 400 phonetically rich sentences (4 sec), 200 subject name recordings (3 sec), and 100 imposter name recordings (3 sec). The face data consist of over 2000 32 × 32 pixel, 8-bit grayscale images of the 10 subjects. Each subsystem individually attains over 90% verification accuracy on test data gathered on days following the training data.
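
The speaker-dependent codebook idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the codebooks were designed or how a claim was scored, so everything below is an assumption made for illustration: the codebook is trained with plain k-means (Lloyd's algorithm), the score is the average squared quantization distortion, the feature vectors are synthetic 2-D points standing in for LPC cepstral frames, and the acceptance threshold of 0.5 is hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(vector, codebook):
    """Index of the codeword closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def train_codebook(vectors, size, iterations=20, seed=0):
    """Build a codebook with plain k-means (assumed; not the paper's stated method)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iterations):
        cells = [[] for _ in range(size)]
        for v in vectors:
            cells[nearest_codeword(v, codebook)].append(v)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                dim = len(cell[0])
                codebook[i] = [sum(v[d] for v in cell) / len(cell)
                               for d in range(dim)]
    return codebook


def average_distortion(vectors, codebook):
    """Mean squared distance from each vector to its nearest codeword."""
    total = sum(squared_distance(v, codebook[nearest_codeword(v, codebook)])
                for v in vectors)
    return total / len(vectors)


def verify(vectors, codebook, threshold):
    """Accept the identity claim when the distortion is at or below the threshold."""
    return average_distortion(vectors, codebook) <= threshold


def synthetic_features(center, count, rng, spread=0.1):
    """Hypothetical stand-in for per-frame cepstral features."""
    return [[c + rng.gauss(0, spread) for c in center] for _ in range(count)]


rng = random.Random(1)
# Training session for the claimed speaker: two synthetic feature clusters.
train = (synthetic_features([0.0, 0.0], 50, rng)
         + synthetic_features([1.0, 1.0], 50, rng))
codebook = train_codebook(train, size=2)

# Later-session test data: same speaker versus an imposter.
genuine = (synthetic_features([0.0, 0.0], 20, rng)
           + synthetic_features([1.0, 1.0], 20, rng))
imposter = synthetic_features([3.0, -2.0], 40, rng)

THRESHOLD = 0.5  # hypothetical value chosen for this synthetic data
print("genuine accepted:", verify(genuine, codebook, THRESHOLD))
print("imposter accepted:", verify(imposter, codebook, THRESHOLD))
```

In a real system the threshold would be tuned on held-out sessions, which is where the session-to-session distortions discussed in the abstract would show up as increased distortion for genuine speakers.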