The communication and synchronization overhead inherent in parallel processing can lead to situations where adding processors to the solution method actually increases execution time. Problem type, problem size, and architecture type all affect the optimal number of processors to employ. In this paper we examine the numerical solution of an elliptic partial differential equation in order to study the relationship between problem size and architecture. The equation's domain is discretized into n2 grid points which are divided into partitions and mapped onto the individual processor memories. We analytically quantify the relationships among grid size, stencil type, partitioning strategy processor execution time, and communication network type. In doing so, we determine the optimal number of processors to assign to the solution (and hence the optimal speedup), and identify (i) the smallest grid size which fully benefits from using all available processors, (ii) the leverage on performance given by increasing processor speed or communication network speed, and (iii) the suitability of various architectures for large numerical problems. Finally, we compare the predictions of our analytic model with measurements from a multiprocessor and find that the model accurately predicts performance.
ASJC Scopus subject areas
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence