TY - JOUR
T1 - Workload-Aware Optimal Power Allocation on Single-Chip Heterogeneous Processors
AU - Jang, Jae Young
AU - Wang, Hao
AU - Kwon, Euijin
AU - Lee, Jae W.
AU - Kim, Nam Sung
N1 - Funding Information:
This work is supported by the Ministry of Science, ICT & Future Planning (MSIP) under the "IT R&D program of MSIP/KEIT" (KI001810041244), "Research Project on High Performance and Scalable Manycore Operating System" (#14-824-09-011), and "Basic Science Research Program" (NRF-2014R1A1A1005894), and US National Science Foundation (NSF) CNS-1217102. Jae W. Lee is the corresponding author of this paper.
Publisher Copyright:
© 2015 IEEE.
PY - 2016/6/1
Y1 - 2016/6/1
AB - As technology scaled below 32 nm, manufacturers began to integrate both CPU and GPU cores on a single chip, i.e., a single-chip heterogeneous processor (SCHP), to improve the throughput of emerging applications. In SCHPs, the CPU and the GPU share the total chip power budget while each satisfies its own power constraint. Consequently, to maximize the overall throughput and/or power efficiency, both power budget and workload should be judiciously allocated to the CPU and the GPU. In this paper, we first demonstrate that optimal allocation of power budget and workload to the CPU and the GPU can provide 13 percent higher throughput than the optimal allocation of workload alone for a single-program workload scenario. Second, we demonstrate that asymmetric power allocation considering per-program characteristics for a multi-programmed workload scenario can provide 9 percent higher throughput or 24 percent higher power efficiency than even per-program power allocation, depending on the optimization objective. Last, we propose effective runtime algorithms that can determine near-optimal or optimal combinations of workload and power budget partitioning for both single- and multi-programmed workload scenarios; the runtime algorithms can achieve 96 and 99 percent of the maximum achievable throughput within 5-8 and 3-5 kernel invocations for single- and multi-programmed workload cases, respectively.
KW - GPU
KW - Single-chip heterogeneous processor
KW - dynamic voltage and frequency scaling
KW - multicores
KW - runtime system
UR - http://www.scopus.com/inward/record.url?scp=84969921307&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84969921307&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2015.2453965
DO - 10.1109/TPDS.2015.2453965
M3 - Article
AN - SCOPUS:84969921307
SN - 1045-9219
VL - 27
SP - 1838
EP - 1851
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 6
M1 - 7152947
ER -