Impact of loop granularity and self-preemption on the performance of loop parallel applications on a multiprogrammed shared-memory multiprocessor

C. Natarajan, S. Sharma, Ravishankar K Iyer

Research output: Contribution to journal › Conference article

Abstract

This study uses real system measurements to investigate the relationships between loop granularity, parallel loop distribution and barrier wait times, and their impact on the multiprogramming performance of loop parallel applications on the CEDAR shared-memory multiprocessor. The overhead due to multiprogramming varies from 5% for applications with large loop granularity to 140% for applications with very fine-grain loops. This is because applications with fine-grain loops have unequal parallel work distribution among the clusters in multiprogrammed environments, while the parallel work in applications with large loop granularity is equally distributed. Moreover, increased barrier wait times of the main task and wait-for-work times of the helper tasks also contribute to the multiprogramming performance degradation of the fine-grain loop parallel applications. We propose and implement a self-preemption technique to address the problem of increased barrier wait times and wait-for-work times. Using this technique, the overhead due to multiprogramming is reduced by as much as 100%, and speedups of 1.1 to 1.7 are obtained.
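For illustration only, the following is a minimal sketch (in C with POSIX threads, not the authors' CEDAR runtime) of the self-preemption idea described in the abstract: a helper task that finds no parallel loop work voluntarily yields the processor via sched_yield() instead of spin-waiting, so that in a multiprogrammed setting the time slice can go to another application's task. The chunk-based loop dispatch and names such as get_loop_chunk and run_chunk are assumptions made for the sketch.

/*
 * Sketch of self-preemption for a helper task (assumed names and
 * work-dispatch scheme; not the paper's CEDAR implementation).
 * Compile with: cc -std=c11 -pthread self_preempt.c
 */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

#define N_ITERS 1000
#define CHUNK   16          /* loop granularity: iterations claimed per dispatch */

static atomic_int next_iter = 0;   /* shared parallel-loop index */
static atomic_int done      = 0;   /* set by the main task when the loop is finished */

/* Try to claim a chunk of loop iterations; return the starting index
 * or -1 if no work is currently available. */
static int get_loop_chunk(void) {
    int start = atomic_fetch_add(&next_iter, CHUNK);
    return (start < N_ITERS) ? start : -1;
}

static void run_chunk(int start) {
    int end = (start + CHUNK < N_ITERS) ? start + CHUNK : N_ITERS;
    for (int i = start; i < end; i++) {
        /* body of the parallel loop would go here */
    }
}

/* Helper task: self-preempt (yield the CPU) whenever there is no work,
 * rather than busy-waiting for the next parallel loop. */
static void *helper_task(void *arg) {
    (void)arg;
    while (!atomic_load(&done)) {
        int start = get_loop_chunk();
        if (start >= 0)
            run_chunk(start);
        else
            sched_yield();   /* self-preemption: let another task run */
    }
    return NULL;
}

int main(void) {
    pthread_t helper;
    pthread_create(&helper, NULL, helper_task, NULL);

    /* Main task also executes chunks, then signals completion. */
    for (int start; (start = get_loop_chunk()) >= 0; )
        run_chunk(start);
    atomic_store(&done, 1);

    pthread_join(helper, NULL);
    printf("loop finished\n");
    return 0;
}

The sketch also shows where loop granularity enters: CHUNK controls how many iterations a task claims per dispatch, so coarser chunks mean tasks reach the wait-for-work path less often, which is consistent with the smaller multiprogramming overhead the abstract reports for large-granularity loops.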

Original language: English (US)
Article number: 5727781
Journal: Proceedings of the International Conference on Parallel Processing
Volume: 2
DOIs: 10.1109/ICPP.1994.117
State: Published - Jan 1 1994
Event: 23rd International Conference on Parallel Processing, ICPP 1994 - Raleigh, NC, United States
Duration: Aug 15 1994 - Aug 19 1994


ASJC Scopus subject areas

  • Software
  • Mathematics (all)
  • Hardware and Architecture

Cite this

@article{86ed5ed00b0a4ce58845d51826404a32,
title = "Impact of loop granularity and self-preemption on the performance of loop parallel applications on a multiprogrammed shared-memory multiprocessor",
abstract = "This study uses real system measurements to investigate the relationships between loop granularity, parallel loop distribution and barrier wait times, and their impact on the multiprogramming performance of loop parallel applications on the CEDAR shared-memory multiprocessor. The overhead due to multiprogramming varies from 5{\%} for applications with large loop granularity to 140{\%} for applications with very fine-grain loops. This is because applications with fine-grain loops have unequal parallel work distribution among the clusters in multiprogrammed environments, while the parallel work in applications with large loop granularity is equally distributed. Moreover, increased barrier wait times of the main task and wait-for-work times of the helper tasks also contribute to the multiprogramming performance degradation of the fine-grain loop parallel applications. We propose and implement a self-preemption technique to address the problem of increased barrier wait times and wait-for-work times. Using this technique, the overhead due to multiprogramming is reduced by as much as 100{\%}, and speedups of 1.1 to 1.7 are obtained.",
author = "C. Natarajan and S. Sharma and Iyer, {Ravishankar K}",
year = "1994",
month = "1",
day = "1",
doi = "10.1109/ICPP.1994.117",
language = "English (US)",
volume = "2",
journal = "Proceedings of the International Conference on Parallel Processing",
issn = "0190-3918",

}

TY - JOUR

T1 - Impact of loop granularity and self-preemption on the performance of loop parallel applications on a multiprogrammed shared-memory multiprocessor

AU - Natarajan, C.

AU - Sharma, S.

AU - Iyer, Ravishankar K

PY - 1994/1/1

Y1 - 1994/1/1

N2 - This study uses real system measurements to investigate the relationships between loop granularity, parallel loop distribution and barrier wait times, and their impact on the multiprogramming performance of loop parallel applications on the CEDAR shared-memory multiprocessor. The overhead due to multiprogramming varies from 5% for applications with large loop granularity to 140% for applications with very fine-grain loops. This is because applications with fine-grain loops have unequal parallel work distribution among the clusters in multiprogrammed environments, while the parallel work in applications with large loop granularity is equally distributed. Moreover, increased barrier wait times of the main task and wait-for-work times of the helper tasks also contribute to the multiprogramming performance degradation of the fine-grain loop parallel applications. We propose and implement a self-preemption technique to address the problem of increased barrier wait times and wait-for-work times. Using this technique, the overhead due to multiprogramming is reduced by as much as 100%, and speedups of 1.1 to 1.7 are obtained.

AB - This study uses real system measurements to investigate the relationships between loop granularity, parallel loop distribution and barrier wait times, and their impact on the multiprogramming performance of loop parallel applications on the CEDAR shared-memory multiprocessor. The overhead due to multiprogramming varies from 5% for applications with large loop granularity to 140% for applications with very fine-grain loops. This is because applications with fine-grain loops have unequal parallel work distribution among the clusters in multiprogrammed environments, while the parallel work in applications with large loop granularity is equally distributed. Moreover, increased barrier wait times of the main task and wait-for-work times of the helper tasks also contribute to the multiprogramming performance degradation of the fine-grain loop parallel applications. We propose and implement a self-preemption technique to address the problem of increased barrier wait times and wait-for-work times. Using this technique, the overhead due to multiprogramming is reduced by as much as 100%, and speedups of 1.1 to 1.7 are obtained.

UR - http://www.scopus.com/inward/record.url?scp=33747759353&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33747759353&partnerID=8YFLogxK

U2 - 10.1109/ICPP.1994.117

DO - 10.1109/ICPP.1994.117

M3 - Conference article

AN - SCOPUS:33747759353

VL - 2

JO - Proceedings of the International Conference on Parallel Processing

JF - Proceedings of the International Conference on Parallel Processing

SN - 0190-3918

M1 - 5727781

ER -