Abstract
An adversarial multi-user multi-armed bandit framework is used to develop algorithms for uncoordinated spectrum access. It is assumed that the number of users is unknown, and that users receive zero reward on collision. The users do not coordinate with each other, and an adversary chooses different rewards for different users on the same channel. The proposed algorithm combines the Exp3.P algorithm developed in prior work for single user adversarial bandits with a collision resolution mechanism to achieve sub-linear regret. It is shown that if every user employs the proposed algorithm, the system wide regret is of the order Oleft( {Tfrac{3}{4}} right) over a horizon of time T. The algorithm is then extended to the dynamic case where the number of users in the system evolves over time, and it is shown to lead to sub-linear regret.
Original language | English (US) |
---|---|
Title of host publication | 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 4514-4518 |
Number of pages | 5 |
ISBN (Electronic) | 9781479981311 |
DOIs | |
State | Published - May 2019 |
Event | 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom Duration: May 12 2019 → May 17 2019 |
Publication series
Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
---|---|
Volume | 2019-May |
ISSN (Print) | 1520-6149 |
Conference
Conference | 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 |
---|---|
Country | United Kingdom |
City | Brighton |
Period | 5/12/19 → 5/17/19 |
Keywords
- Cognitive radio
- dynamic spectrum access
- multi-armed bandits
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering
Cite this
Adversarial Multi-user Bandits for Uncoordinated Spectrum Access. / Bande, Meghana; Veeravalli, Venugopal Varadachari.
2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. p. 4514-4518 8682263 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May).Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
}
TY - GEN
T1 - Adversarial Multi-user Bandits for Uncoordinated Spectrum Access
AU - Bande, Meghana
AU - Veeravalli, Venugopal Varadachari
PY - 2019/5
Y1 - 2019/5
N2 - An adversarial multi-user multi-armed bandit framework is used to develop algorithms for uncoordinated spectrum access. It is assumed that the number of users is unknown, and that users receive zero reward on collision. The users do not coordinate with each other, and an adversary chooses different rewards for different users on the same channel. The proposed algorithm combines the Exp3.P algorithm developed in prior work for single user adversarial bandits with a collision resolution mechanism to achieve sub-linear regret. It is shown that if every user employs the proposed algorithm, the system wide regret is of the order Oleft( {Tfrac{3}{4}} right) over a horizon of time T. The algorithm is then extended to the dynamic case where the number of users in the system evolves over time, and it is shown to lead to sub-linear regret.
AB - An adversarial multi-user multi-armed bandit framework is used to develop algorithms for uncoordinated spectrum access. It is assumed that the number of users is unknown, and that users receive zero reward on collision. The users do not coordinate with each other, and an adversary chooses different rewards for different users on the same channel. The proposed algorithm combines the Exp3.P algorithm developed in prior work for single user adversarial bandits with a collision resolution mechanism to achieve sub-linear regret. It is shown that if every user employs the proposed algorithm, the system wide regret is of the order Oleft( {Tfrac{3}{4}} right) over a horizon of time T. The algorithm is then extended to the dynamic case where the number of users in the system evolves over time, and it is shown to lead to sub-linear regret.
KW - Cognitive radio
KW - dynamic spectrum access
KW - multi-armed bandits
UR - http://www.scopus.com/inward/record.url?scp=85069003374&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85069003374&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8682263
DO - 10.1109/ICASSP.2019.8682263
M3 - Conference contribution
AN - SCOPUS:85069003374
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4514
EP - 4518
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
ER -