SAPIENS: A 64-kb RRAM-Based Non-Volatile Associative Memory for One-Shot Learning and Inference at the Edge
- Author
- Yue-Der Chih, Weier Wan, Ching-Hua Wang, Harry Chuang, Hongjie Wang, Po-Han Chen, Wei-Chen Chen, Haitong Li, Priyanka Raina, Akash Levy, Win-San Khwa, H.-S. Philip Wong, and Meng-Fan Chang
- Subjects
Hardware_MEMORYSTRUCTURES, Artificial neural network, Computer science, Pattern recognition, Content-addressable memory, Chip, One-shot learning, Electronic, Optical and Magnetic Materials, Resistive random-access memory, Feature (machine learning), Enhanced Data Rates for GSM Evolution, Artificial intelligence, Electrical and Electronic Engineering, Quantization (image processing)
- Abstract
Learning from a few examples (one/few-shot learning) on the fly is a key challenge for on-device machine intelligence. We present the first chip-level demonstration of one-shot learning with Stanford Associative memory for Programmable, Integrated Edge iNtelligence via life-long learning and Search (SAPIENS), a resistive random access memory (RRAM)-based non-volatile associative memory (AM) chip that serves as the backend for memory-augmented neural networks (MANNs). The 64-kb fully integrated RRAM-CMOS AM chip performs long-term feature embedding and retrieval, demonstrated on a 32-way one-shot learning task on the Omniglot dataset. Using only one example per class for 32 unseen classes during on-chip learning, SAPIENS achieves 79% measured inference accuracy on Omniglot, comparable to edge software model accuracy using five-level quantization (82%). It achieves an energy efficiency of 118 GOPS/W at 200 MHz for in-memory L1 distance computation and prediction. Multi-bank measurements on the same chip show that increasing the capacity from three banks (24 kb) to eight banks (64 kb) improves the chip accuracy from 73.5% to 79%, while minimizing the accuracy excursion due to bank-to-bank variability.
- Published
- 2021
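
The abstract describes an associative memory that stores one feature embedding per class and classifies a query by L1 distance to the stored entries. The following is a minimal software sketch of that idea, not the chip's circuit or the authors' code: the class name `AssociativeMemory`, the helpers `quantize` and `l1_distance`, the uniform quantizer, and the assumed feature range [0, 1] are all hypothetical choices made for illustration. The default of five quantization levels mirrors the five-level software model mentioned in the abstract.

```python
import math
from typing import Hashable, List, Sequence, Tuple


def quantize(vec: Sequence[float], levels: int = 5,
             lo: float = 0.0, hi: float = 1.0) -> List[int]:
    """Uniformly quantize each feature to an integer code in [0, levels - 1].

    Assumption: features lie in [lo, hi]; values outside are clipped.
    """
    step = (hi - lo) / (levels - 1)
    codes = []
    for x in vec:
        x = min(max(x, lo), hi)                 # clip to the assumed range
        codes.append(math.floor((x - lo) / step + 0.5))  # round half up
    return codes


def l1_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute element-wise differences between two code vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


class AssociativeMemory:
    """Hypothetical software model: store (key, label) pairs, search by L1."""

    def __init__(self, levels: int = 5) -> None:
        self.levels = levels
        self.keys: List[List[int]] = []
        self.labels: List[Hashable] = []

    def write(self, embedding: Sequence[float], label: Hashable) -> None:
        """One-shot learning step: store a single quantized example."""
        self.keys.append(quantize(embedding, self.levels))
        self.labels.append(label)

    def search(self, embedding: Sequence[float]) -> Tuple[Hashable, int]:
        """Return the label and distance of the nearest stored key."""
        if not self.keys:
            raise ValueError("associative memory is empty")
        query = quantize(embedding, self.levels)
        dists = [l1_distance(query, key) for key in self.keys]
        best = min(range(len(dists)), key=dists.__getitem__)
        return self.labels[best], dists[best]


if __name__ == "__main__":
    mem = AssociativeMemory()
    mem.write([0.0, 0.0, 1.0, 1.0], "class_a")   # one example per class
    mem.write([1.0, 1.0, 0.0, 0.0], "class_b")
    print(mem.search([0.1, 0.2, 0.9, 0.8]))
```

In this sketch the search is a linear scan over all stored keys; the chip described in the abstract instead performs the distance computation in memory, which is where the reported energy efficiency comes from.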