TY - GEN
T1 - An empirical study on the vectorization of multimedia applications for multimedia extensions
AU - Ren, Gang
AU - Wu, Peng
AU - Padua, David
PY - 2005
Y1 - 2005
N2 - Multimedia extensions (MME) are architectural extensions to general-purpose processors to boost the performance of multimedia workloads. Today, in-line assembly code, intrinsic functions and library routines are the most common means to program these extensions. A promising alternative is to exploit vectorization technology to automatically generate MME instructions from programs written in standard high-level languages. However, despite the early success of automatic vectorization for traditional vector supercomputers, state-of-the-art vectorizing compilers for multimedia extensions have yet to demonstrate their effectiveness, especially on multimedia workloads. In this paper, we conducted an empirical study on the vectorization of media processing programs for multimedia extensions. Our study identified several new issues that are not handled by traditional vectorizers. These issues arise partly as the result of the unique features of MME architectures, partly due to the characteristics of media processing applications. We proposed several techniques to address some of these issues. We further assessed the effectiveness of our techniques by manually applying them to a set of multimedia programs. In addition, we found that further optimizations after vectorization are essential to benefit from multimedia extensions. In our experiments, 23 of 34 core procedures from the Berkeley Multimedia Workload (BMW) were manually vectorized and 14 procedures achieved speedups of 1.10 to 3.39 on a Pentium 4 processor.
AB - Multimedia extensions (MME) are architectural extensions to general-purpose processors to boost the performance of multimedia workloads. Today, in-line assembly code, intrinsic functions and library routines are the most common means to program these extensions. A promising alternative is to exploit vectorization technology to automatically generate MME instructions from programs written in standard high-level languages. However, despite the early success of automatic vectorization for traditional vector supercomputers, state-of-the-art vectorizing compilers for multimedia extensions have yet to demonstrate their effectiveness, especially on multimedia workloads. In this paper, we conducted an empirical study on the vectorization of media processing programs for multimedia extensions. Our study identified several new issues that are not handled by traditional vectorizers. These issues arise partly as the result of the unique features of MME architectures, partly due to the characteristics of media processing applications. We proposed several techniques to address some of these issues. We further assessed the effectiveness of our techniques by manually applying them to a set of multimedia programs. In addition, we found that further optimizations after vectorization are essential to benefit from multimedia extensions. In our experiments, 23 of 34 core procedures from the Berkeley Multimedia Workload (BMW) were manually vectorized and 14 procedures achieved speedups of 1.10 to 3.39 on a Pentium 4 processor.
UR - http://www.scopus.com/inward/record.url?scp=33746294477&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33746294477&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2005.94
DO - 10.1109/IPDPS.2005.94
M3 - Conference contribution
AN - SCOPUS:33746294477
SN - 0769523129
SN - 9780769523125
T3 - Proceedings - 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005
SP - 89b
BT - Proceedings - 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005
T2 - 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005
Y2 - 4 April 2005 through 8 April 2005
ER -