TY - GEN
T1 - QoS-aware dynamic resource allocation for spatial-multitasking GPUs
AU - Aguilera, Paula
AU - Morrow, Katherine
AU - Kim, Nam Sung
N1 - Copyright:
Copyright 2014 Elsevier B.V., All rights reserved.
PY - 2014
Y1 - 2014
N2 - General-purpose computing on GPUs (GPGPU computing) is becoming widely adopted; however, some GPGPU applications fail to fully utilize GPU resources. In these cases, spatial multitasking better exploits the parallelism offered by GPUs by partitioning the GPU resources among simultaneously-running applications. When one or more such applications have quality-of-service (QoS) requirements, enough resources must be allocated for those applications to satisfy their requirements. Remaining resources can be either disabled to reduce power consumption or used to accelerate other applications. However, we observe that the amount of resources for a QoS application to satisfy its performance requirement is dependent in part upon the co-executing applications. In this paper, we propose a runtime technique to dynamically partition GPU resources between concurrently running applications - at least one of which has a QoS requirement. We demonstrate that the proposed technique can satisfy a 100% QoS requirement while also achieving either a 7W power consumption reduction or a 17.57% performance improvement for co-executing best-effort applications.
AB - General-purpose computing on GPUs (GPGPU computing) is becoming widely adopted; however, some GPGPU applications fail to fully utilize GPU resources. In these cases, spatial multitasking better exploits the parallelism offered by GPUs by partitioning the GPU resources among simultaneously-running applications. When one or more such applications have quality-of-service (QoS) requirements, enough resources must be allocated for those applications to satisfy their requirements. Remaining resources can be either disabled to reduce power consumption or used to accelerate other applications. However, we observe that the amount of resources for a QoS application to satisfy its performance requirement is dependent in part upon the co-executing applications. In this paper, we propose a runtime technique to dynamically partition GPU resources between concurrently running applications - at least one of which has a QoS requirement. We demonstrate that the proposed technique can satisfy a 100% QoS requirement while also achieving either a 7W power consumption reduction or a 17.57% performance improvement for co-executing best-effort applications.
UR - http://www.scopus.com/inward/record.url?scp=84897831907&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84897831907&partnerID=8YFLogxK
U2 - 10.1109/ASPDAC.2014.6742976
DO - 10.1109/ASPDAC.2014.6742976
M3 - Conference contribution
AN - SCOPUS:84897831907
SN - 9781479928163
T3 - Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC
SP - 726
EP - 731
BT - 2014 19th Asia and South Pacific Design Automation Conference, ASP-DAC 2014 - Proceedings
T2 - 2014 19th Asia and South Pacific Design Automation Conference, ASP-DAC 2014
Y2 - 20 January 2014 through 23 January 2014
ER -