Compute unified device architecture application suitability

Wen Mei Hwu, Christopher Rodrigues, Shane Ryoo, John Stratton

Research output: Contribution to journal › Article

Abstract

Graphics processing units (GPUs) can provide excellent speedups on some, but not all, general-purpose workloads. Using a set of computational GPU kernels as examples, the authors show how to adapt kernels to utilize the architectural features of a GeForce 8800 GPU and what finally limits the achievable performance.

Original language: English (US)
Article number: 4814979
Pages (from-to): 16-26
Number of pages: 11
Journal: Computing in Science and Engineering
Volume: 11
Issue number: 3
DOI: 10.1109/MCSE.2009.48
State: Published - May 1 2009

Fingerprint

Graphics processing unit

Keywords

  • Benchmarks
  • CUDA
  • Compute unified device architecture
  • Computer architecture
  • GPGPU
  • General-purpose computing on GPU
  • Software optimization

ASJC Scopus subject areas

  • Computer Science(all)
  • Engineering(all)

Cite this

Compute unified device architecture application suitability. / Hwu, Wen Mei; Rodrigues, Christopher; Ryoo, Shane; Stratton, John.

In: Computing in Science and Engineering, Vol. 11, No. 3, 4814979, 01.05.2009, p. 16-26.

Hwu, Wen Mei ; Rodrigues, Christopher ; Ryoo, Shane ; Stratton, John. / Compute unified device architecture application suitability. In: Computing in Science and Engineering. 2009 ; Vol. 11, No. 3. pp. 16-26.
@article{17b96f53cdab4b59a4c1d07f2fb67d4c,
title = "Compute unified device architecture application suitability",
abstract = "Graphics processing units (GPUs) can provide excellent speedups on some, but not all, general-purpose workloads. Using a set of computational GPU kernels as examples, the authors show how to adapt kernels to utilize the architectural features of a GeForce 8800 GPU and what finally limits the achievable performance.",
keywords = "Benchmarks, CUDA, Compute unified device architecture, Computer architecture, GPGPU, General-purpose computing on GPU, Software optimization",
author = "Hwu, {Wen Mei} and Christopher Rodrigues and Shane Ryoo and John Stratton",
year = "2009",
month = "5",
day = "1",
doi = "10.1109/MCSE.2009.48",
language = "English (US)",
volume = "11",
pages = "16--26",
journal = "Computing in Science and Engineering",
issn = "1521-9615",
publisher = "IEEE Computer Society",
number = "3",

}

TY - JOUR
T1 - Compute unified device architecture application suitability
AU - Hwu, Wen Mei
AU - Rodrigues, Christopher
AU - Ryoo, Shane
AU - Stratton, John
PY - 2009/5/1
Y1 - 2009/5/1
N2 - Graphics processing units (GPUs) can provide excellent speedups on some, but not all, general-purpose workloads. Using a set of computational GPU kernels as examples, the authors show how to adapt kernels to utilize the architectural features of a GeForce 8800 GPU and what finally limits the achievable performance.
AB - Graphics processing units (GPUs) can provide excellent speedups on some, but not all, general-purpose workloads. Using a set of computational GPU kernels as examples, the authors show how to adapt kernels to utilize the architectural features of a GeForce 8800 GPU and what finally limits the achievable performance.
KW - Benchmarks
KW - CUDA
KW - Compute unified device architecture
KW - Computer architecture
KW - GPGPU
KW - General-purpose computing on GPU
KW - Software optimization
UR - http://www.scopus.com/inward/record.url?scp=65349159175&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=65349159175&partnerID=8YFLogxK
U2 - 10.1109/MCSE.2009.48
DO - 10.1109/MCSE.2009.48
M3 - Article
AN - SCOPUS:65349159175
VL - 11
SP - 16
EP - 26
JO - Computing in Science and Engineering
JF - Computing in Science and Engineering
SN - 1521-9615
IS - 3
M1 - 4814979
ER -