Adaptive cache bypass and insertion for many-core accelerators

Xuhao Chen, Shengzhao Wu, Li Wen Chang, Wei Sheng Huang, Carl Pearson, Zhiying Wang, Wen Mei W. Hwu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many-core accelerators, e.g. GPUs, are widely used for accelerating general-purpose compute kernels. With the SIMT execution model, GPUs can hide memory latency through massive multithreading for many regular applications. To support more applications with irregular memory access patterns, a cache hierarchy has been introduced to GPU architectures to capture input data sharing and mitigate the effect of irregular accesses. However, GPU caches suffer from poor efficiency due to severe contention, which makes it difficult to adopt heuristic management policies and also limits system performance and energy efficiency. We propose an adaptive cache management policy specifically for many-core accelerators. The tag array of the L2 cache is enhanced with extra bits to track memory access history, and thus the locality information is captured and provided to the L1 cache as heuristics to guide its run-time bypass and insertion decisions. By preventing un-reused data from polluting the cache and alleviating contention, cache efficiency is significantly improved. As a result, system performance is improved by 31% on average for cache-sensitive benchmarks, compared to the baseline GPU architecture.
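
The abstract does not specify how the history bits are encoded or which thresholds drive the L1 decisions, so the following Python sketch is only a toy illustration of the general idea, not the authors' design. It assumes a hypothetical 2-bit saturating access counter per line in the L2 tag array, a fully associative LRU L1, and made-up thresholds: a line never seen before bypasses L1, a line seen once is inserted at the LRU position, and a line seen more often is inserted at the MRU position. All class and function names (`L2Tags`, `L1Cache`, `run`) are invented for this example.

```python
from collections import OrderedDict


class L2Tags:
    """Toy L2 tag array: one saturating access counter per line (assumed encoding)."""

    def __init__(self, bits=2):
        self.max_count = (1 << bits) - 1
        self.history = {}  # line address -> saturated number of accesses so far

    def access(self, line):
        """Record an access and return the count observed before it (the hint)."""
        prior = self.history.get(line, 0)
        self.history[line] = min(prior + 1, self.max_count)
        return prior


class L1Cache:
    """Toy fully associative L1 with LRU order, guided by the L2 hint."""

    def __init__(self, l2, capacity):
        self.l2 = l2
        self.capacity = capacity
        self.lines = OrderedDict()  # first entry = LRU, last entry = MRU

    def access(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)  # promote to MRU on a hit
            return "hit"
        hint = self.l2.access(line)  # an L1 miss goes to L2 and returns the hint
        if hint == 0:
            return "bypass"  # no observed reuse: do not allocate in L1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the LRU line
        self.lines[line] = True
        if hint == 1:
            self.lines.move_to_end(line, last=False)  # weak reuse: insert at LRU
            return "insert_lru"
        return "insert_mru"  # stronger reuse: new line stays at MRU


def run(trace, capacity=1):
    """Replay a list of line addresses through a fresh L1/L2 pair."""
    l1 = L1Cache(L2Tags(), capacity)
    return [l1.access(line) for line in trace]
```

With a one-line L1, `run(["A", "A", "A", "B", "C", "A"])` returns `["bypass", "insert_lru", "hit", "bypass", "bypass", "hit"]`: the streaming lines B and C are bypassed, so they never displace the reused line A.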

Original language: English (US)
Title of host publication: 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014 - In Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014
Publisher: Association for Computing Machinery
Pages: 1-8
Number of pages: 8
ISBN (Print): 9781450328227
DOIs: https://doi.org/10.1145/2613908.2613909
State: Published - Jan 1 2014
Event: 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014, Held in Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014 - Minneapolis, MN, United States
Duration: Jun 14 2014 → Jun 15 2014

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014, Held in Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014
Country: United States
City: Minneapolis, MN
Period: 6/14/14 → 6/15/14

Keywords

  • Bypass
  • Cache management
  • GPGPU
  • Insertion

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

  • Cite this

    Chen, X., Wu, S., Chang, L. W., Huang, W. S., Pearson, C., Wang, Z., & Hwu, W. M. W. (2014). Adaptive cache bypass and insertion for many-core accelerators. In 2nd ACM International Workshop on Many-Core Embedded Systems, MES 2014 - In Conjunction with the 41st International Symposium on Computer Architecture, ISCA 2014 (pp. 1-8). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/2613908.2613909