TY - JOUR
T1 - Massively Parallel Algorithms for Trace-Driven Cache Simulations
AU - Nicol, David M.
AU - Lubachevsky, Boris D.
N1 - Funding Information:
Manuscript received November 6, 1991; revised July 23, 1993. This work was supported in part by the National Aeronautics and Space Administration (NASA) under Grants NAG-1-1132 and NAS-1-18605, and in part by the National Science Foundation (NSF) under Grant ASC-8819373, and it was initiated during a visit to AT&T Bell Laboratories. Portions of this paper are reprinted by permission from the 6th Workshop on Parallel and Distributed Simulation (PADS92), @ 1992, Simulation Councils, Inc.
PY - 1994/8
Y1 - 1994/8
N2 - This paper considers the use of massively parallel architectures to execute a trace-driven simulation of a single cache set. A method is presented for the Least-Recently-Used (LRU) policy, which, regardless of the set size C, runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/logN processors. We present timings of the second algorithm’s implementation on the MasPar MP-1, a machine with 16 384 processors. A broad class of reference-based line replacement polices are considered, which includes LRU as well as the Least-Frequently-Used and Random replacement policies. A simulation method is presented for any such policy that on any trace of length N directed to a C line set runs in time O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
AB - This paper considers the use of massively parallel architectures to execute a trace-driven simulation of a single cache set. A method is presented for the Least-Recently-Used (LRU) policy, which, regardless of the set size C, runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/logN processors. We present timings of the second algorithm’s implementation on the MasPar MP-1, a machine with 16 384 processors. A broad class of reference-based line replacement polices are considered, which includes LRU as well as the Least-Frequently-Used and Random replacement policies. A simulation method is presented for any such policy that on any trace of length N directed to a C line set runs in time O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
KW - Parallel algorithms
KW - SIMD architecture trace-driven simulation
KW - cache simulation. Leastrecently-used (LRU)
UR - http://www.scopus.com/inward/record.url?scp=0028483610&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0028483610&partnerID=8YFLogxK
U2 - 10.1109/71.298211
DO - 10.1109/71.298211
M3 - Article
AN - SCOPUS:0028483610
SN - 1045-9219
VL - 5
SP - 849
EP - 859
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 8
ER -