Abstract
In this chapter we dive into stencil sweep computation, which at first glance appears to be just convolution with special filter patterns. However, because stencils come from the discretization and numerical approximation of derivatives when solving differential equations, they have two characteristics that motivate and enable new optimizations. This chapter focuses on these new optimization opportunities and challenges. First, stencil sweeps are typically done on three-dimensional (3D) grids, whereas convolution is typically done on two-dimensional (2D) images or a small number of time slices of 2D images. As a result, tiling considerations differ between the two, which motivates thread coarsening for 3D stencils to enable larger input tiles and more data reuse. Second, the stencil patterns can sometimes enable register tiling of input data to further improve data access throughput and alleviate shared memory pressure.
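To make the idea of a stencil sweep concrete, the following is a minimal sequential sketch in C of one sweep of a seven-point stencil over a 3D grid. It is not taken from the chapter: the function names `idx` and `stencil_sweep_7pt`, the cubic `n * n * n` row-major layout, and the two-coefficient form (`c0` for the center, `c1` for each of the six face neighbors) are assumptions chosen for illustration. With `c0 = -6` and `c1 = 1` it computes the standard second-order finite-difference approximation of the Laplacian on a unit-spaced grid.

```c
#include <stddef.h>

/* Hypothetical helper: flatten (i, j, k) into a row-major offset
   for an n * n * n grid. */
static size_t idx(size_t i, size_t j, size_t k, size_t n) {
    return (i * n + j) * n + k;
}

/* Illustrative sketch: one sweep of a seven-point stencil over the
   interior of an n * n * n grid. c0 weights the center point and c1
   weights each of the six face neighbors. Boundary cells of out are
   left untouched. */
void stencil_sweep_7pt(const float *in, float *out, size_t n,
                       float c0, float c1) {
    for (size_t i = 1; i + 1 < n; ++i) {
        for (size_t j = 1; j + 1 < n; ++j) {
            for (size_t k = 1; k + 1 < n; ++k) {
                out[idx(i, j, k, n)] =
                    c0 * in[idx(i, j, k, n)] +
                    c1 * (in[idx(i - 1, j, k, n)] + in[idx(i + 1, j, k, n)] +
                          in[idx(i, j - 1, k, n)] + in[idx(i, j + 1, k, n)] +
                          in[idx(i, j, k - 1, n)] + in[idx(i, j, k + 1, n)]);
            }
        }
    }
}
```

Each interior output point reads seven inputs, and neighboring output points share most of them. That overlap is the data reuse that the tiling, thread coarsening, and register tiling optimizations named in the abstract are meant to exploit.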
Original language | English (US) |
---|---|
Title of host publication | Programming Massively Parallel Processors |
Subtitle of host publication | A Hands-on Approach, Fourth Edition |
Publisher | Elsevier |
Pages | 173-189 |
Number of pages | 17 |
ISBN (Electronic) | 9780323912310 |
ISBN (Print) | 9780323984638 |
State | Published - Jan 1 2022 |
Keywords
- Stencil
- boundary condition problems
- data reuse
- differential equations
- discretization
- domain decomposition
- finite difference method
- ghost cells
- halo cells
- numerical methods
- register tiling
- shared memory tiling
- stencil sweep
- thread coarsening
- tiling
- tiling efficiency
ASJC Scopus subject areas
- General Computer Science