TY - JOUR
T1 - Fine-grain parallelism using multi-core, Cell/BE, and GPU systems
AU - Pratas, Frederico
AU - Trancoso, Pedro
AU - Sousa, Leonel
AU - Stamatakis, Alexandros
AU - Shi, Guochun
AU - Kindratenko, Volodymyr
N1 - Funding Information:
The authors gratefully acknowledge partial funding support from the following institutions: FCT (INESC-ID multi-annual funding) through the PIDDAC Program funds (F. Pratas and L. Sousa); HiPEAC European Network of Excellence on High Performance and Embedded Architecture and Compilation (F. Pratas, L. Sousa, and P. Trancoso); and the German Science Foundation (DFG) under the auspices of the Emmy-Noether program (A. Stamatakis). The authors also acknowledge the following groups and institutions for the resources that have contributed to this research: IT, FCT at U Coimbra for the access to the NVIDIA C2050; SiPS and L2F at INESC-ID, for the use of Intel based systems, NVIDIA GPUs and Sony PS3; CASPER at UCY, for the use of IBM x3650; Bernard Moret for the use of Dell Power Edge M905; The Exelixis Lab, Bioinformatics Unit (I12) at TU Munich for the use of Sun x4600; and Georgia Institute of Technology, its Sony–Toshiba–IBM Center of Competence, and the National Science Foundation, for the use of Cell/BE QS20 and QS22.
PY - 2012/8
Y1 - 2012/8
N2 - Currently, we are facing a situation where applications exhibit increasing computational demands and where a large variety of parallel processor systems are available. In this paper we focus on exploiting fine-grain parallelism for three applications with distinct characteristics: a Bioinformatics application (MrBayes), a Molecular Dynamics application (NAMD), and a database application (TPC-H). We assess, side-by-side, the performance of the three applications on general-purpose multi-core processors, the Cell Broadband Engine (Cell/BE), and Graphics Processing Units (GPU). Our results indicate that application performance depends on the characteristics of the parallel architectures and on the computational requirements of the core functions of the respective applications. For MrBayes the best overall performance is achieved on general-purpose multi-core processors, for NAMD on the Cell/BE, and for TPC-H on GPUs.
AB - Currently, we are facing a situation where applications exhibit increasing computational demands and where a large variety of parallel processor systems are available. In this paper we focus on exploiting fine-grain parallelism for three applications with distinct characteristics: a Bioinformatics application (MrBayes), a Molecular Dynamics application (NAMD), and a database application (TPC-H). We assess, side-by-side, the performance of the three applications on general-purpose multi-core processors, the Cell Broadband Engine (Cell/BE), and Graphics Processing Units (GPU). Our results indicate that application performance depends on the characteristics of the parallel architectures and on the computational requirements of the core functions of the respective applications. For MrBayes the best overall performance is achieved on general-purpose multi-core processors, for NAMD on the Cell/BE, and for TPC-H on GPUs.
KW - Database workloads
KW - Fine-grain parallelism
KW - Multi-core accelerators
KW - Multi-core processors
KW - Performance evaluation
KW - Scientific workloads
UR - http://www.scopus.com/inward/record.url?scp=84862697881&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862697881&partnerID=8YFLogxK
U2 - 10.1016/j.parco.2011.08.002
DO - 10.1016/j.parco.2011.08.002
M3 - Article
AN - SCOPUS:84862697881
SN - 0167-8191
VL - 38
SP - 365
EP - 390
JO - Parallel Computing
JF - Parallel Computing
IS - 8
ER -