Abstract
A multi-functional in-memory inference processor integrated circuit (IC) in a 65-nm CMOS process is presented. The prototype employs a deep in-memory architecture (DIMA), which enhances both energy efficiency and throughput over conventional digital architectures by accessing multiple rows of a standard 6T bitcell array (BCA) simultaneously per precharge and by embedding column pitch-matched low-swing analog processing at the BCA periphery. In doing so, DIMA exploits the synergy between the dataflow of machine learning (ML) algorithms and the SRAM architecture to reduce the dominant energy cost due to data movement. The prototype IC incorporates a 16-kB SRAM array and supports four commonly used ML algorithms: the support vector machine, template matching, k-nearest neighbor, and the matched filter. Measured silicon results in dot product mode demonstrate simultaneous gains of 10× in energy efficiency and 5.3× in throughput, leading to a 53× reduction in the energy-delay product with negligible (≤ 1%) degradation in decision-making accuracy, compared with conventional 8-b fixed-point single-function digital implementations.
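The energy-delay product figure follows from the two reported gains. As a minimal sketch, assuming the delay per decision scales as the inverse of throughput (an assumption made here for illustration, not stated in the abstract), the reductions in energy and in delay multiply. The function name `edp_reduction` is hypothetical.

```python
def edp_reduction(energy_gain: float, throughput_gain: float) -> float:
    """Return the factor by which the energy-delay product (EDP) shrinks.

    EDP = (energy per decision) * (delay per decision).
    Assumption: delay per decision is proportional to 1 / throughput,
    so a throughput gain of g reduces delay by the same factor g.
    """
    return energy_gain * throughput_gain


# Gains reported in the abstract for dot product mode.
energy_gain = 10.0      # energy efficiency gain
throughput_gain = 5.3   # throughput gain

reduction = edp_reduction(energy_gain, throughput_gain)
print(f"EDP reduction: {reduction:.0f}x")
```

Under that assumption, the product of the two gains matches the 53× reduction quoted in the abstract.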
Original language | English (US) |
---|---|
Article number | 8246704 |
Pages (from-to) | 642-655 |
Number of pages | 14 |
Journal | IEEE Journal of Solid-State Circuits |
Volume | 53 |
Issue number | 2 |
State | Published - Feb 2018 |
Keywords
- Accelerator
- analog processing
- associative memory
- in-memory processing
- inference
- machine learning (ML)
ASJC Scopus subject areas
- Electrical and Electronic Engineering