TY - GEN
T1 - Micro-Armed Bandit
T2 - 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023
AU - Gerogiannis, Gerasimos
AU - Torrellas, Josep
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/10/28
Y1 - 2023/10/28
N2 - Online Reinforcement Learning (RL) has been adopted as an effective mechanism in various decision-making problems in microarchitecture. Its high adaptability and ability to learn at runtime are attractive characteristics in microarchitecture settings. However, although hardware RL agents are effective, they suffer from two main problems. First, they have high complexity and storage overhead. This complexity stems from decomposing the environment into a large number of states and then, for each of these states, bookkeeping many action values. Second, many RL agents are engineered for a specific application and are not reusable. In this work, we tackle both of these shortcomings by designing an RL agent that is both lightweight and reusable across different microarchitecture decision-making problems. We find that, in some of these problems, only a small fraction of the action space is useful in a given time window. We refer to this property as temporal homogeneity in the action space. Motivated by this property, we design an RL agent based on Multi-Armed Bandit algorithms, the simplest form of RL. We call our agent Micro-Armed Bandit. We showcase our agent in two use cases: data prefetching and instruction fetch in simultaneous multithreaded (SMT) processors. For prefetching, our agent outperforms the non-RL prefetchers Bingo and MLOP by 2.6% and 2.3% (geometric mean), respectively, and attains performance similar to that of the state-of-the-art RL prefetcher Pythia, with a dramatically lower storage requirement of only 100 bytes. For SMT instruction fetch, our agent outperforms the Hill Climbing method by 2.2% (geometric mean).
KW - Machine Learning for Architecture
KW - Microarchitecture
KW - Multi-Armed Bandits
KW - Prefetching
KW - Reinforcement Learning
KW - Simultaneous Multithreading
UR - http://www.scopus.com/inward/record.url?scp=85183458826&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85183458826&partnerID=8YFLogxK
U2 - 10.1145/3613424.3623780
DO - 10.1145/3613424.3623780
M3 - Conference contribution
AN - SCOPUS:85183458826
T3 - Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023
SP - 698
EP - 713
BT - Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023
PB - Association for Computing Machinery
Y2 - 28 October 2023 through 1 November 2023
ER -