
SSTE: Syllable-Specific Temporal Encoding to FORCE-learn audio sequences with an associative memory approach.

Authors :
Jannesar N
Akbarzadeh-Sherbaf K
Safari S
Vahabie AH
Source :
Neural networks : the official journal of the International Neural Network Society [Neural Netw] 2024 Sep; Vol. 177, pp. 106368. Date of Electronic Publication: 2024 May 07.
Publication Year :
2024

Abstract

The circuitry and pathways in the brains of humans and other species have long inspired researchers and system designers to develop accurate and efficient systems capable of solving real-world problems and responding in real time. We propose Syllable-Specific Temporal Encoding (SSTE) to learn vocal sequences in a reservoir of Izhikevich neurons, by forming associations between exclusive input activities and their corresponding syllables in the sequence. Our model converts audio signals to cochleograms using the CAR-FAC model to simulate a brain-like auditory learning and memorization process. The reservoir is trained with a hardware-friendly approach to FORCE learning. Reservoir computing can yield associative memory dynamics with far less computational complexity than RNNs. SSTE-based learning achieves competitive accuracy and stable recall of spatiotemporal sequences with fewer reservoir inputs than existing encodings in the literature for a similar purpose, offering resource savings. The encoding marks syllable onsets and allows recall from any desired point in the sequence, making it particularly suitable for recalling subsets of long vocal sequences. SSTE can learn new signals without forgetting previously memorized sequences and is robust against occasional noise, a characteristic of real-world scenarios. The components of this model are configured to reduce resource consumption and computational intensity, addressing some of the cost-efficiency issues that might arise in future implementations aiming for compactness and real-time, low-power operation. Overall, this model proposes a brain-inspired pattern-generation network for vocal sequences that can be extended with other bio-inspired computations to explore their potential for brain-like auditory perception.
Future designs could draw inspiration from this model to implement embedded devices that learn vocal sequences and recall them on demand in real time. Such systems could acquire language and speech, operate as artificial assistants, and convert text to speech in the presence of natural noise and corruption in the audio data.

Competing Interests: Declaration of competing interest. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

(Copyright © 2024 Elsevier Ltd. All rights reserved.)
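The abstract names two well-known building blocks: a reservoir of Izhikevich spiking neurons and a readout trained by FORCE learning (recursive least squares, RLS). The following is a minimal sketch of those standard components, not the paper's actual model: the regular-spiking neuron parameters follow Izhikevich's published values, the RLS update follows the standard FORCE form, and the reservoir size, weight scale, constant drive, and sine target are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Izhikevich reservoir (standard regular-spiking parameters) ---
N = 200                      # reservoir size (illustrative assumption)
a, b, c, d = 0.02, 0.2, -65.0, 8.0
v = np.full(N, c)            # membrane potentials
u = b * v                    # recovery variables
J = rng.normal(0.0, 1.5 / np.sqrt(N), (N, N))  # recurrent weights (assumed scale)

# --- FORCE readout trained by recursive least squares (RLS) ---
w = np.zeros(N)              # readout weights
P = np.eye(N)                # running estimate of the inverse correlation matrix

dt = 0.5                     # ms, integration step
tau_r = 20.0                 # ms, time constant for filtering spike trains
r = np.zeros(N)              # filtered firing rates seen by the readout

T = 2000                     # simulation steps (1000 ms)
target = np.sin(2 * np.pi * np.arange(T) * dt / 100.0)  # toy 10 Hz target

errors = []
for t in range(T):
    # spike detection and reset
    spiked = v >= 30.0
    v[spiked] = c
    u[spiked] += d

    # Izhikevich dynamics: v' = 0.04 v^2 + 5 v + 140 - u + I,  u' = a(bv - u)
    I = 10.0 + J @ r         # constant drive plus recurrent input (assumed)
    v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
    u += dt * a * (b * v - u)
    v = np.minimum(v, 30.0)  # clamp overshoot for numerical safety

    # leaky filter of the spike trains
    r += dt * (-r / tau_r)
    r[spiked] += 1.0 / tau_r

    # FORCE / RLS update of the linear readout
    z = w @ r                # readout output
    e = z - target[t]        # instantaneous error
    Pr = P @ r
    k = Pr / (1.0 + r @ Pr)
    P -= np.outer(k, Pr)
    w -= e * k
    errors.append(e * e)
```

The RLS step is what makes FORCE "force" the output onto the target: errors are corrected immediately at every step rather than accumulated into gradients, which is also why the paper can consider hardware-friendly approximations of this update.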

Details

Language :
English
ISSN :
1879-2782
Volume :
177
Database :
MEDLINE
Journal :
Neural networks : the official journal of the International Neural Network Society
Publication Type :
Academic Journal
Accession number :
38761415
Full Text :
https://doi.org/10.1016/j.neunet.2024.106368