Decoupled architectures as a low-complexity alternative to out-of-order execution

Neal C. Crago, Sanjay Jeram Patel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper we present OUTRIDERHP, a novel implementation of a decoupled architecture that approaches the performance of contemporary out-of-order processors on parallel benchmarks while maintaining low hardware complexity. OUTRIDERHP leverages the compiler to separate a single thread of execution into memory-accessing and memory-consuming streams, which we call strands, that can be executed concurrently. We identify loss-of-decoupling events, which cripple performance on traditional decoupled architectures, and design OUTRIDERHP to enable extraction of multiple strands and control speculation, which together provide superior tolerance of memory and functional unit latency. OUTRIDERHP outperforms a baseline in-order architecture by 26-220% and Decoupled Access/Execute by 7-172% when executing parallel benchmarks on an 8-core CMP configuration. OUTRIDERHP performs within 15% of higher-complexity out-of-order cores despite not using large physical register files, dynamic scheduling, or register renaming hardware.
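
The strand idea in the abstract can be illustrated with a small software analogy. The sketch below is not the OUTRIDERHP compiler output or hardware; it only assumes the general decoupled access/execute pattern, in which one stream issues loads into a FIFO and another stream consumes the loaded values. All function names, and the use of an unbounded Python deque in place of a bounded hardware queue, are hypothetical choices made for illustration.

```python
from collections import deque


def dot_single_thread(a, b):
    """Reference version: one instruction stream both loads and computes."""
    total = 0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def access_strand(a, b, queue):
    """Memory-accessing strand (hypothetical): only indexes and loads."""
    for i in range(len(a)):
        # Loaded operands are pushed into the FIFO for the consumer.
        queue.append((a[i], b[i]))


def execute_strand(queue, n):
    """Memory-consuming strand (hypothetical): only pops and does arithmetic."""
    total = 0
    for _ in range(n):
        x, y = queue.popleft()
        total += x * y
    return total


def dot_decoupled(a, b):
    """Run the access strand ahead of the execute strand through a FIFO."""
    queue = deque()  # assumption: unbounded; a hardware queue would be finite
    access_strand(a, b, queue)
    return execute_strand(queue, len(a))
```

In this example the access strand never needs a value produced by the execute strand, so it can run arbitrarily far ahead and hide load latency. When the accessing stream does depend on a computed value (for instance, a data-dependent branch or address), it has to wait for the consumer; that is the kind of situation the abstract refers to as a loss-of-decoupling event.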

Original language: English (US)
Title of host publication: Proceedings - 2011 International Conference on Parallel Architectures and Compilation Techniques, PACT 2011
Pages: 179-180
Number of pages: 2
DOIs
State: Published - 2011
Event: 20th International Conference on Parallel Architectures and Compilation Techniques, PACT 2011 - Galveston, TX, United States
Duration: Oct 10 2011 → Oct 14 2011

Publication series

Name: Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
ISSN (Print): 1089-795X

Other

Other: 20th International Conference on Parallel Architectures and Compilation Techniques, PACT 2011
Country/Territory: United States
City: Galveston, TX
Period: 10/10/11 → 10/14/11

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
