Wen-mei W. Hwu, David B. Kirk, Izzat El Hajj

Research output: Chapter in Book/Report/Conference proceeding › Chapter


In this chapter we dive into stencil sweep computation, which seems to be just convolution with special filter patterns. However, because the stencils come from discretization and numerical approximation of derivatives in solving differential equations, they have two characteristics that motivate and enable new optimizations. This chapter focuses on these new optimization opportunities and challenges. First, stencil sweeps are typically done on three-dimensional (3D) grids, whereas convolution is typically done on two-dimensional (2D) images or a small number of time slices of 2D images. This makes the tiling considerations different between the two and motivates thread coarsening for 3D stencils to enable larger input tiles and more data reuse. Second, the stencil patterns can sometimes enable register tiling of input data to further improve data access throughput and alleviate shared memory pressure.
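
As a rough illustration of the ideas in the abstract, the following plain C sketch sweeps a 7-point stencil over a 3D grid. It is a CPU analogue under stated assumptions, not the chapter's kernel: the names `stencil_sweep` and `idx`, the coefficients, and the cubic `n x n x n` layout are all hypothetical. Each (y, x) column is walked along z the way a coarsened thread might be, and the z-neighbors are carried in the local variables `prev`, `curr`, and `next`, which mimics register tiling in the z direction.

```c
#include <stddef.h>

/* Hypothetical coefficients for a 7-point stencil (assumed values, chosen so
   the result is the discrete Laplacian with unit grid spacing). */
#define C_CENTER   (-6.0f)
#define C_NEIGHBOR (1.0f)

/* Flattened index into an n x n x n grid stored with x varying fastest. */
static size_t idx(size_t z, size_t y, size_t x, size_t n) {
    return (z * n + y) * n + x;
}

/* One stencil sweep over the interior points of the grid.
   Boundary cells of out are left untouched. */
void stencil_sweep(const float *in, float *out, size_t n) {
    if (n < 3) return; /* no interior points */
    for (size_t y = 1; y + 1 < n; ++y) {
        for (size_t x = 1; x + 1 < n; ++x) {
            /* Walk this column along z, keeping the z-neighbors in locals
               so each input value along z is loaded only once. */
            float prev = in[idx(0, y, x, n)];
            float curr = in[idx(1, y, x, n)];
            for (size_t z = 1; z + 1 < n; ++z) {
                float next = in[idx(z + 1, y, x, n)];
                out[idx(z, y, x, n)] =
                    C_CENTER * curr +
                    C_NEIGHBOR * (prev + next +
                                  in[idx(z, y - 1, x, n)] +
                                  in[idx(z, y + 1, x, n)] +
                                  in[idx(z, y, x - 1, n)] +
                                  in[idx(z, y, x + 1, n)]);
                /* Slide the window one step along z. */
                prev = curr;
                curr = next;
            }
        }
    }
}
```

In a GPU version the x and y neighbors would typically come from a shared memory tile instead of global memory; that part is omitted here to keep the sketch short.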

Original language: English (US)
Title of host publication: Programming Massively Parallel Processors
Subtitle of host publication: a Hands-on Approach, Fourth Edition
Number of pages: 17
ISBN (Electronic): 9780323912310
ISBN (Print): 9780323984638
State: Published - Jan 1 2022

Keywords


  • Stencil
  • boundary condition problems
  • data reuse
  • differential equations
  • discretization
  • domain decomposition
  • finite difference method
  • ghost cells
  • halo cells
  • numerical methods
  • register tiling
  • shared memory tiling
  • stencil sweep
  • thread coarsening
  • tiling
  • tiling efficiency

ASJC Scopus subject areas

  • General Computer Science

