Program optimization carving for GPU computing

Shane Ryoo, Christopher I. Rodrigues, Sam S. Stone, John A. Stratton, Sain-Zee Ueng, Sara S. Baghsorkhi, Wen-mei W. Hwu

Research output: Contribution to journal (Article)


Contemporary many-core processors such as the GeForce 8800 GTX enable application developers to utilize various levels of parallelism to enhance the performance of their applications. However, iterative optimization for such a system may lead to a local performance maximum, due to the complexity of the system. We propose program optimization carving, a technique that begins with a complete optimization space and prunes it down to a set of configurations that is likely to contain the global maximum. The remaining configurations can then be evaluated to determine the one with the best performance. The technique can reduce the number of configurations to be evaluated by as much as 98% and is successful at finding a near-best configuration. For some applications, we show that this approach is significantly superior to random sampling of the search space.
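The carve-then-evaluate flow described in the abstract can be illustrated with a small sketch. The abstract does not say how configurations are pruned, so everything specific here is an assumption made for illustration: the parameter space `SPACE`, the two static metrics in `toy_metrics`, the synthetic score in `toy_measure`, and the choice of a Pareto filter as the pruning rule. On real hardware, `toy_measure` would be replaced by an actual timed kernel run.

```python
from itertools import product


def pareto_front(configs, metrics):
    """Keep configurations that no other configuration beats on every metric.

    `metrics(cfg)` returns a tuple of numbers where higher is better.
    """
    scored = [(cfg, metrics(cfg)) for cfg in configs]
    front = []
    for cfg, m in scored:
        dominated = any(
            other != m and all(o >= v for o, v in zip(other, m))
            for _, other in scored
        )
        if not dominated:
            front.append(cfg)
    return front


def carve_and_search(space, metrics, measure):
    """Enumerate the full space, prune it, then measure only the survivors."""
    configs = [dict(zip(space, values)) for values in product(*space.values())]
    candidates = pareto_front(configs, metrics)
    best = max(candidates, key=measure)
    return best, len(configs), len(candidates)


# Hypothetical optimization space: every name and value is illustrative only.
SPACE = {"tile": [8, 16, 32], "unroll": [1, 2, 4], "threads": [64, 128, 256]}


def toy_metrics(cfg):
    """Made-up static metrics that trade off against each other."""
    work = cfg["tile"] * cfg["unroll"]      # proxy for per-thread efficiency
    utilization = cfg["threads"] / work     # proxy for how busy the chip stays
    return (work, utilization)


def toy_measure(cfg):
    """Synthetic stand-in for an expensive timed run (higher is better)."""
    work, utilization = toy_metrics(cfg)
    return min(work, 64) * min(utilization, 8)


best, total, kept = carve_and_search(SPACE, toy_metrics, toy_measure)
print(f"evaluated {kept} of {total} configurations; best found: {best}")
```

In this toy setup the filter keeps 9 of the 27 configurations, and the best survivor scores as well as the best configuration in the full space. That outcome follows from how the synthetic metrics and score were constructed and says nothing about the reductions reported in the article.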

Original language: English (US)
Pages (from-to): 1389-1401
Number of pages: 13
Journal: Journal of Parallel and Distributed Computing
Issue number: 10
State: Published - Oct 1 2008


Keywords

  • GPU computing
  • Optimization space exploration
  • Parallel computing

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Artificial Intelligence

  • Cite this

    Ryoo, S., Rodrigues, C. I., Stone, S. S., Stratton, J. A., Ueng, S.-Z., Baghsorkhi, S. S., & Hwu, W.-M. W. (2008). Program optimization carving for GPU computing. Journal of Parallel and Distributed Computing, 68(10), 1389-1401.