General-purpose computing on GPUs (GPGPU computing) is becoming widely adopted; however, some GPGPU applications fail to fully utilize GPU resources. In these cases, spatial multitasking better exploits the parallelism offered by GPUs by partitioning the GPU resources among simultaneously running applications. When one or more of these applications have quality-of-service (QoS) requirements, enough resources must be allocated to them to satisfy those requirements. The remaining resources can either be disabled to reduce power consumption or be used to accelerate other applications. However, we observe that the amount of resources a QoS application needs to satisfy its performance requirement depends in part on the co-executing applications. In this paper, we propose a runtime technique that dynamically partitions GPU resources between concurrently running applications, at least one of which has a QoS requirement. We demonstrate that the proposed technique can satisfy a 100% QoS requirement while also achieving either a 7 W reduction in power consumption or a 17.57% performance improvement for co-executing best-effort applications.