HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness

Yixin Luo, Saugata Ghose, Yu Cai, Erich F. Haratsch, Onur Mutlu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

NAND flash memory density continues to scale to keep up with the increasing storage demands of data-intensive applications. Unfortunately, as a result of this scaling, the lifetime of NAND flash memory has been decreasing. Each cell in NAND flash memory can endure only a limited number of writes, due to the damage caused by each program and erase operation on the cell. This damage can be partially repaired on its own during the idle time between program or erase operations (known as the dwell time), via a phenomenon known as the self-recovery effect. Prior works study the self-recovery effect for planar (i.e., 2D) NAND flash memory, and propose to exploit it to improve flash lifetime, by applying high temperature to accelerate self-recovery. However, these findings may not be directly applicable to 3D NAND flash memory, due to significant changes in the design and manufacturing process that are required to enable practical 3D stacking for NAND flash memory. In this paper, we perform the first detailed experimental characterization of the effects of self-recovery and temperature on real, state-of-the-art 3D NAND flash memory devices. We show that these effects influence two major factors of NAND flash memory reliability: (1) retention loss speed (i.e., the speed at which a flash cell leaks charge), and (2) program variation (i.e., the difference in programming speed across flash cells). We find that self-recovery and temperature affect 3D NAND flash memory quite differently than they affect planar NAND flash memory, rendering prior models of self-recovery and temperature ineffective for 3D NAND flash memory. Using our characterization results, we develop a new model for 3D NAND flash memory reliability, which predicts how retention, wearout, self-recovery, and temperature affect raw bit error rates and cell threshold voltages. We show that our model is accurate, with an error of only 4.9%. Based on our experimental findings and our model, we propose HeatWatch, a new mechanism to improve 3D NAND flash memory reliability. The key idea of HeatWatch is to optimize the read reference voltage, i.e., the voltage applied to the cell during a read operation, by adapting it to the dwell time of the workload and the current operating temperature. HeatWatch (1) efficiently tracks flash memory temperature and dwell time online, (2) sends this information to our reliability model to predict the current voltages of flash cells, and (3) predicts the optimal read reference voltage based on the current cell voltages. Our detailed experimental evaluations show that HeatWatch improves flash lifetime by 3.85× over a baseline that uses a fixed read reference voltage, averaged across 28 real storage workload traces, and comes within 0.9% of the lifetime of an ideal read reference voltage selection mechanism.

Original languageEnglish (US)
Title of host publicationProceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018
PublisherIEEE Computer Society
Pages504-517
Number of pages14
ISBN (Electronic)9781538636596
DOIs
StatePublished - Mar 27 2018
Externally publishedYes
Event24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018 - Vienna, Austria
Duration: Feb 24 2018Feb 28 2018

Publication series

NameProceedings - International Symposium on High-Performance Computer Architecture
Volume2018-February
ISSN (Print)1530-0897

Other

Other24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018
Country/TerritoryAustria
CityVienna
Period2/24/182/28/18

ASJC Scopus subject areas

  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness'. Together they form a unique fingerprint.

Cite this