TY - GEN
T1 - Arterial traffic control using reinforcement learning agents and information from adjacent intersections in the state and reward structure
AU - Medina, Juan C.
AU - Hajbabaie, Ali
AU - Benekohal, Rahim F.
PY - 2010
Y1 - 2010
N2 - An application that uses reinforcement learning (RL) agents for traffic control along an arterial under high traffic volumes is presented. RL agents were trained using Q-learning and a modified state representation that included information on the occupancy of the links from neighboring intersections. The proposed structure also includes a reward that considers potential blockage from downstream intersections (due to saturated conditions), as well as pressure to coordinate the signal response with the future arrival of traffic from upstream intersections. Experiments using microscopic simulation software were conducted for an arterial with 5 intersections under high conflicting volumes, and results were compared with the best settings of coordinated pre-timed phasing. Data showed lower delays and fewer stops with RL agents, as well as a more balanced distribution of the delay among all vehicles in the system. Evidence of coordinated-like behavior was found: the number of stops to traverse the 5 intersections was on average below 1.5, and the distribution of green times was very similar across all intersections. As traffic approached capacity, however, delays with the pre-timed phasing were lower than with RL agents, but the agents produced lower maximum delay times and a lower maximum number of stops per vehicle. Future research will analyze variable coefficients in the state and reward structures so that the system can better cope with a wide variety of traffic volumes, including transitions from oversaturation to undersaturation and vice versa.
AB - An application that uses reinforcement learning (RL) agents for traffic control along an arterial under high traffic volumes is presented. RL agents were trained using Q-learning and a modified state representation that included information on the occupancy of the links from neighboring intersections. The proposed structure also includes a reward that considers potential blockage from downstream intersections (due to saturated conditions), as well as pressure to coordinate the signal response with the future arrival of traffic from upstream intersections. Experiments using microscopic simulation software were conducted for an arterial with 5 intersections under high conflicting volumes, and results were compared with the best settings of coordinated pre-timed phasing. Data showed lower delays and fewer stops with RL agents, as well as a more balanced distribution of the delay among all vehicles in the system. Evidence of coordinated-like behavior was found: the number of stops to traverse the 5 intersections was on average below 1.5, and the distribution of green times was very similar across all intersections. As traffic approached capacity, however, delays with the pre-timed phasing were lower than with RL agents, but the agents produced lower maximum delay times and a lower maximum number of stops per vehicle. Future research will analyze variable coefficients in the state and reward structures so that the system can better cope with a wide variety of traffic volumes, including transitions from oversaturation to undersaturation and vice versa.
UR - http://www.scopus.com/inward/record.url?scp=78650502288&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650502288&partnerID=8YFLogxK
U2 - 10.1109/ITSC.2010.5624977
DO - 10.1109/ITSC.2010.5624977
M3 - Conference contribution
AN - SCOPUS:78650502288
SN - 9781424476572
T3 - IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC
SP - 525
EP - 530
BT - 13th International IEEE Conference on Intelligent Transportation Systems, ITSC 2010
T2 - 13th International IEEE Conference on Intelligent Transportation Systems, ITSC 2010
Y2 - 19 September 2010 through 22 September 2010
ER -