TY - GEN
T1 - Sensitivity of reinforcement learning agents to aggregated sensor data in congested traffic networks
AU - Medina, J. C.
AU - Benekohal, R. F.
N1 - Publisher Copyright: © 2014 American Society of Civil Engineers.
PY - 2014
Y1 - 2014
AB - Flexible signal timing operation with cycle-free and sequence-free strategies using reinforcement learning (RL) has been studied in several fields and applied to transportation networks. Such techniques naturally rely on accurate incoming data for optimal operation. However, the effect of imperfect information received by RL agents in a traffic environment has not been explored in detail, and it may further indicate whether such agents are truly suitable for real-world applications. This paper studies this topic in the context of a congested traffic network, where RL agents receive aggregated loop detector data to make decisions instead of directly observing activations from all vehicles. A case study shows the sensitivity of the agents' performance when data are aggregated to different levels. Aggregation levels are used to represent imperfect information, and system performance is used as an indicator to determine the acceptable level of aggregation for the system to remain operational in oversaturated conditions.
UR - http://www.scopus.com/inward/record.url?scp=84933567410&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84933567410&partnerID=8YFLogxK
U2 - 10.1061/9780784413586.069
DO - 10.1061/9780784413586.069
M3 - Conference contribution
AN - SCOPUS:84933567410
T3 - T and DI Congress 2014: Planes, Trains, and Automobiles - Proceedings of the 2nd Transportation and Development Institute Congress
SP - 719
EP - 726
BT - T and DI Congress 2014
A2 - Varma, Amiy
A2 - Gosling, Geoffrey D.
PB - American Society of Civil Engineers
T2 - 2nd Transportation and Development Institute Congress - Planes, Trains, and Automobiles: Connections to Future Developments, T and DI 2014
Y2 - 8 June 2014 through 11 June 2014
ER -