Traffic signal control using reinforcement learning and the max-plus algorithm as a coordinating strategy

Juan C. Medina, Rahim F. Benekohal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper explores the performance of decentralized reinforcement learning agents with communication capabilities for operating traffic signals in an oversaturated network. An explicit coordinating mechanism, based on the max-plus algorithm, is implemented as part of each agent's reward structure with the aim of improving network-wide performance. Results from a simulated network with realistic features showed that Q-learning agents could process more vehicles than optimized signal timings from the state-of-practice simulation software TRANSYT7F, even under varying oversaturation conditions. Adding the max-plus algorithm had a limited effect, but that effect was in the direction of improved performance: higher total throughput and fewer stops per vehicle. Ongoing research evaluates the conditions under which coordination should be emphasized to further enhance results, as well as alternative implementations of the max-plus algorithm.
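
The abstract names two building blocks, Q-learning and max-plus, without giving their details. The sketch below is a minimal, generic illustration of each and is not the authors' implementation. The state encoding, the two signal phases ("NS", "EW"), the payoff tables, the learning parameters, and the three-intersection chain are all hypothetical. How the paper folds the max-plus result into the agent's reward is not stated here, so the two pieces are shown separately.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    alpha and gamma are illustrative defaults, not values from the paper."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def max_plus(agents, edges, actions, local, pair, iterations=10):
    """Max-plus message passing on a coordination graph.
    local[i][a] is agent i's own payoff for action a;
    pair[(i, j)][(a_i, a_j)] is the payoff shared along edge (i, j).
    Returns one action per agent. Exact on trees; approximate on graphs with cycles."""
    neighbors = {i: [] for i in agents}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def edge_payoff(i, j, a_i, a_j):
        # Edge tables are stored once per undirected edge.
        if (i, j) in pair:
            return pair[(i, j)][(a_i, a_j)]
        return pair[(j, i)][(a_j, a_i)]

    # msg[(i, j)][a_j]: what agent i tells neighbor j about j's action a_j.
    msg = {(i, j): {a: 0.0 for a in actions} for i in agents for j in neighbors[i]}
    for _ in range(iterations):
        new = {}
        for (i, j) in msg:
            vals = {
                a_j: max(
                    local[i][a_i]
                    + edge_payoff(i, j, a_i, a_j)
                    + sum(msg[(k, i)][a_i] for k in neighbors[i] if k != j)
                    for a_i in actions
                )
                for a_j in actions
            }
            mean = sum(vals.values()) / len(vals)  # normalize so messages stay bounded
            new[(i, j)] = {a: v - mean for a, v in vals.items()}
        msg = new

    # Each agent picks the action maximizing its local payoff plus incoming messages.
    return {
        i: max(actions, key=lambda a: local[i][a] + sum(msg[(k, i)][a] for k in neighbors[i]))
        for i in agents
    }


# Hypothetical example: three intersections in a chain 0 - 1 - 2.
ACTIONS = ("NS", "EW")
LOCAL = {
    0: {"NS": 2.0, "EW": 1.0},
    1: {"NS": 1.0, "EW": 1.5},
    2: {"NS": 0.5, "EW": 0.8},
}
# Neighbors earn a bonus of 1.0 when they serve the same phase.
SAME_PHASE = {(a, b): (1.0 if a == b else 0.0) for a in ACTIONS for b in ACTIONS}
PAIR = {(0, 1): SAME_PHASE, (1, 2): SAME_PHASE}

joint_action = max_plus([0, 1, 2], [(0, 1), (1, 2)], ACTIONS, LOCAL, PAIR)
```

In this toy chain, intersection 1 prefers "EW" on its own payoff alone, so the joint choice returned by `max_plus` can differ from what each agent would pick in isolation. That gap is the kind of network-level effect a coordinating mechanism is meant to capture.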

Original language: English (US)
Title of host publication: 2012 15th International IEEE Conference on Intelligent Transportation Systems, ITSC 2012
Pages: 596-601
Number of pages: 6
State: Published - 2012
Event: 2012 15th International IEEE Conference on Intelligent Transportation Systems, ITSC 2012 - Anchorage, AK, United States
Duration: Sep 16 2012 to Sep 19 2012

Publication series

Name: IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC

ASJC Scopus subject areas

  • Automotive Engineering
  • Mechanical Engineering
  • Computer Science Applications
