Optical deep learning (DL) accelerators have attracted significant interests due to their latency and power advantages. In this article, we focus on incoherent optical designs. A significant challenge is that there is no known solution to perform single-wavelength accumulation (a key operation required for DL workloads) using incoherent optical signals efficiently. Therefore, we devise a hybrid approach, where accumulation is done in the electrical domain, and multiplication is performed in the optical domain. The key technology enabler of our design is the transistor laser, which performs electrical-to-optical and optical-to-electrical conversions efficiently. Through detailed design and evaluation of our design, along with a comprehensive benchmarking study against state-of-the-art RRAM-based designs, we derive the following key results:(1) For a four-layer multilayer perceptron network, our design achieves 115× and 17.11× improvements in latency and energy, respectively, compared to the RRAM-based design. We can take full advantage of the speed and energy benefits of the optical technology because the inference task can be entirely mapped onto our design.(2) For a complex workload (Resnet50), weight reprogramming is needed, and intermediate results need to be stored/re-fetched to/from memories. In this case, for the same area, our design still outperforms the RRAM-based design by 15.92× in inference latency, and 8.99× in energy.
|ACM Journal on Emerging Technologies in Computing Systems
|Published - May 3 2023
- Additional Key Words and PhrasesOptical computing
- deep learning accelerator
ASJC Scopus subject areas
- Hardware and Architecture
- Electrical and Electronic Engineering