A2P-MANN: Adaptive Attention Inference Hops Pruned Memory-Augmented Neural Networks

Authors :
Ahmadzadeh, Mohsen
Kamal, Mehdi
Afzali-Kusha, Ali
Pedram, Massoud
Source :
IEEE Transactions on Neural Networks and Learning Systems; November 2023, Vol. 34, Issue 11, pp. 8284-8296, 13p
Publication Year :
2023

Abstract

In this work, to limit the number of required attention inference hops in memory-augmented neural networks (MANNs), we propose an online adaptive approach called A²P-MANN. A small neural network classifier determines an adequate number of attention inference hops for each input query, eliminating a large number of computations that are unnecessary for extracting the correct answer. To further lower the computation count of A²P-MANN, we also suggest pruning the weights of the final fully connected (FC) layers. To this end, two pruning approaches are developed: one with negligible accuracy loss and the other with a controllable loss in final accuracy. The efficacy of the technique is assessed by applying it to two different MANN structures and two question-answering (QA) datasets. The analytical assessment reveals, for the two benchmarks, on average 50% fewer computations than the corresponding baseline MANNs at the cost of less than 1% accuracy loss. In addition, when used along with the previously published zero-skipping technique, a computation-count reduction of approximately 70% is achieved. Finally, when the proposed approach (without zero skipping) is implemented on CPU and GPU platforms, an average runtime reduction of 43% is achieved.
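As a minimal sketch of the mechanism the abstract describes, the PyTorch code below wires a small hop-count classifier into a MemN2N-style memory network and magnitude-prunes the final FC layer. All names (HopClassifier, AdaptiveMemN2N, prune_fc_by_magnitude) and the specific classifier architecture and pruning rule are illustrative assumptions, not the authors' implementation; the paper's two pruning approaches are approximated here by plain magnitude pruning.

```python
# Hypothetical sketch of adaptive attention hops + FC pruning (not the
# authors' code): a small classifier predicts how many hops a query needs,
# and the network runs only that many attention hops.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HopClassifier(nn.Module):
    """Small MLP mapping a query embedding to a hop count in [1, max_hops]."""
    def __init__(self, embed_dim: int, max_hops: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 32), nn.ReLU(), nn.Linear(32, max_hops)
        )

    def forward(self, q: torch.Tensor) -> torch.Tensor:
        # Predicted class k means "use k + 1 attention hops".
        return self.net(q).argmax(dim=-1) + 1


class AdaptiveMemN2N(nn.Module):
    """MemN2N-style network that stops after the predicted number of hops."""
    def __init__(self, vocab_size: int, embed_dim: int, max_hops: int):
        super().__init__()
        self.embed_A = nn.Embedding(vocab_size, embed_dim)  # memory keys
        self.embed_C = nn.Embedding(vocab_size, embed_dim)  # memory values
        self.embed_B = nn.Embedding(vocab_size, embed_dim)  # query
        self.fc = nn.Linear(embed_dim, vocab_size)          # final FC (prunable)
        self.hop_clf = HopClassifier(embed_dim, max_hops)

    def forward(self, story: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
        m = self.embed_A(story).sum(dim=2)   # (batch, sentences, dim)
        c = self.embed_C(story).sum(dim=2)
        u = self.embed_B(query).sum(dim=1)   # (batch, dim)
        # For simplicity this sketch runs the batch maximum of the
        # per-query predictions; skipped hops are the saved computation.
        n_hops = int(self.hop_clf(u).max())
        for _ in range(n_hops):
            p = F.softmax(torch.einsum("bd,bsd->bs", u, m), dim=-1)
            u = u + torch.einsum("bs,bsd->bd", p, c)
        return self.fc(u)


def prune_fc_by_magnitude(fc: nn.Linear, sparsity: float) -> None:
    """Zero out the smallest-magnitude weights of the final FC layer."""
    with torch.no_grad():
        threshold = torch.quantile(fc.weight.abs().flatten(), sparsity)
        fc.weight.mul_((fc.weight.abs() > threshold).float())


if __name__ == "__main__":
    model = AdaptiveMemN2N(vocab_size=100, embed_dim=20, max_hops=3)
    story = torch.randint(0, 100, (4, 10, 6))   # (batch, sentences, words)
    query = torch.randint(0, 100, (4, 6))       # (batch, words)
    logits = model(story, query)
    prune_fc_by_magnitude(model.fc, sparsity=0.5)
```

A per-query hop count (running each sample for exactly its predicted number of hops) would save more computation than the batch-maximum shortcut used above; the batch maximum is chosen here only to keep the sketch vectorized and short.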

Details

Language :
English
ISSN :
2162-237X and 2162-2388
Volume :
34
Issue :
11
Database :
Supplemental Index
Journal :
IEEE Transactions on Neural Networks and Learning Systems
Publication Type :
Periodical
Accession number :
ejs64405032
Full Text :
https://doi.org/10.1109/TNNLS.2022.3148818