Performance considerations

Wen-mei W. Hwu, David B. Kirk, Izzat El Hajj

Research output: Chapter in Book/Report/Conference proceeding › Chapter


Abstract

In this chapter, we briefly present the off-chip memory (DRAM) architecture and discuss related performance considerations, such as memory coalescing and memory latency hiding. We then discuss an important optimization, thread granularity coarsening, that may target any of the different aspects of the compute and memory architecture, depending on the application. Finally, we wrap up this part of the book with a checklist of common performance optimizations that will serve as a guide for optimizing the performance of the parallel patterns that are discussed in the second and third parts of the book.

Original language: English (US)
Title of host publication: Programming Massively Parallel Processors
Subtitle of host publication: A Hands-on Approach, Fourth Edition
Number of pages: 25
ISBN (Electronic): 9780323912310
ISBN (Print): 9780323984638
State: Published - Jan 1 2022


Keywords

  • DRAM burst
  • memory bandwidth
  • corner turning
  • latency hiding
  • memory bank
  • memory channel
  • memory coalescing
  • performance bottleneck
  • performance optimization
  • thread coarsening
  • thread granularity

ASJC Scopus subject areas

  • General Computer Science

