Optimizing matrix transposes using a POWER7 cache model and explicit prefetching

Gabriel Mateescu, Gregory H. Bauer, Robert A. Fiedler

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We develop a matrix transpose approach for the POWER7 architecture: we first model the memory access latency and the cache, and then design cache blocking, data alignment, and prefetching techniques that enhance performance.
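To make the terms in the abstract concrete, here is a minimal sketch of a cache-blocked transpose with a software prefetch hint. It is not the authors' implementation: the function name `transpose_blocked`, the tile-size parameter `bs`, and the choice to prefetch the next row of the current tile are all assumptions made for illustration. `__builtin_prefetch` is a GCC/Clang builtin. The record does not say which prefetch mechanism or tile sizes the paper uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the paper): transpose the n x n row-major
 * matrix src into dst, working on one bs x bs tile at a time so that the
 * rows being read and the columns being written can stay cache-resident. */
void transpose_blocked(const double *src, double *dst, size_t n, size_t bs)
{
    assert(bs > 0);
    for (size_t ii = 0; ii < n; ii += bs) {
        size_t i_end = (ii + bs < n) ? ii + bs : n;   /* clip edge tiles */
        for (size_t jj = 0; jj < n; jj += bs) {
            size_t j_end = (jj + bs < n) ? jj + bs : n;
            for (size_t i = ii; i < i_end; ++i) {
                /* Assumed prefetch policy: hint a read of the next source
                 * row of this tile while the current row is being copied. */
                if (i + 1 < i_end)
                    __builtin_prefetch(&src[(i + 1) * n + jj], 0, 3);
                for (size_t j = jj; j < j_end; ++j)
                    dst[j * n + i] = src[i * n + j];
            }
        }
    }
}
```

In a sketch like this, `bs` would be chosen from a cache model so that a source tile and a destination tile fit in the targeted cache level together. Data alignment, the third technique named in the abstract, is not shown here.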

Original language: English (US)
Title of host publication: PMBS'11 - Proceedings of the 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, Co-located with SC'11
Pages: 5-6
Number of pages: 2
State: Published - Dec 1 2011
Event: 2nd Int. Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, PMBS'11, Held as Part of the 24th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC'11 - Seattle, WA, United States
Duration: Nov 13 2011 → Nov 13 2011

Publication series

Name: PMBS'11 - Proceedings of the 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, Co-located with SC'11

Other

Other: 2nd Int. Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, PMBS'11, Held as Part of the 24th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC'11
Country/Territory: United States
City: Seattle, WA
Period: 11/13/11 → 11/13/11

Keywords

  • Cache
  • Matrix transpose
  • POWER7
  • Prefetching

ASJC Scopus subject areas

  • Hardware and Architecture
  • Modeling and Simulation
