Coordinating large numbers of processors to execute a parallel program as fast as possible is of key importance to the design and efficient use of parallel processor systems. Issues in program parallelism and scheduling for such systems are examined, with particular emphasis on processor assignment to parallel loops. Optimal processor assignment algorithms are presented for simple and complex nested parallel loops. These algorithms can be applied at compile time, or can be implemented as hardware modules to solve the processor assignment problem optimally at run time. Speedup measurements for EISPACK and IEEE DSP subroutines that result from the optimal assignment of processors to parallel loops are also presented. These measurements indicate that optimal assignments result in almost linear speedups on parallel processor machines with a few tens of processors, and significant speedups on machines with hundreds or thousands of processors.