Memory Performance and Bottlenecks in Multicore and GPU Architectures

Matheus S. Serpa, Francis B. Moreira, Philippe O.A. Navaux, Eduardo H.M. Cruz, Matthias Diener, Dalvan Griebler, Luiz Gustavo Fernandes

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Several different architectures are now available, not only to industry but also to ordinary consumers. Traditional multicore processors, GPUs, accelerators such as the Sunway SW26010, and energy-efficiency-driven processors such as the ARM family have very different architectural characteristics. This wide range of characteristics is a challenge for application developers, who must deal with different instruction sets, memory hierarchies, and even programming paradigms when targeting these architectures. As a result, the same application can perform well on one architecture but poorly on another. To optimize an application, it is important to understand in depth how it behaves on different architectures. Related work in this area mostly focuses on a limited analysis encompassing execution time and energy. In this paper, we perform a detailed investigation of the impact of the memory subsystem of different architectures, which is one of the most important aspects to be considered. For this study, we performed experiments on a Broadwell CPU and a Pascal GPU, using applications from the Rodinia benchmark suite. In this way, we were able to understand why an application performs well on one architecture and poorly on others.

Original language: English (US)
Title of host publication: Proceedings - 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 233-236
Number of pages: 4
ISBN (Electronic): 9781728116440
State: Published - Mar 19 2019
Event: 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2019 - Pavia, Italy
Duration: Feb 13 2019 to Feb 15 2019

Publication series

Name: Proceedings - 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2019

Conference

Conference: 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2019
Country/Territory: Italy
City: Pavia
Period: 2/13/19 to 2/15/19

Keywords

  • Cache memory
  • HPC
  • Manycore systems
  • Memory subsystem
  • Performance evaluation

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
