TY - JOUR
T1 - Nash equilibrium seeking in noncooperative games
AU - Frihauf, Paul
AU - Krstic, Miroslav
AU - Basar, Tamer
N1 - Funding Information:
Manuscript received August 30, 2010; revised May 15, 2011; accepted September 23, 2011. Date of publication October 25, 2011; date of current version April 19, 2012.This work was supported by the Department of Defense (DoD), Air Force Office of Scientific Research (AFOSR), National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a, and by grants from National Science Foundation, Department of Energy (DOE), and AFOSR. Recommended by Associate Editor J. Grizzle.
PY - 2012/5
Y1 - 2012/5
N2 - We introduce a non-model based approach for locally stable convergence to Nash equilibria in static, noncooperative games with $N$ players. In classical game theory algorithms, each player employs the knowledge of the functional form of his payoff and the knowledge of the other players' actions, whereas in the proposed algorithm, the players need to measure only their own payoff values. This strategy is based on the extremum seeking approach, which has previously been developed for standard optimization problems and employs sinusoidal perturbations to estimate the gradient. We consider static games with quadratic payoff functions before generalizing our results to games with non-quadratic payoff functions that are the output of a dynamic system. Specifically, we consider general nonlinear differential equations with $N$ inputs and $N$ outputs, where in the steady state, the output signals represent the payoff functions of a noncooperative game played by the steady-state values of the input signals. We employ the standard local averaging theory and obtain local convergence results for both quadratic payoffs, where the actual convergence is semi-global, and non-quadratic payoffs, where the potential existence of multiple Nash equilibria precludes semi-global convergence. Our convergence conditions coincide with conditions that arise in model-based Nash equilibrium seeking. However, in our framework the user is not meant to check these conditions because the payoff functions are presumed to be unknown. For non-quadratic payoffs, convergence to a Nash equilibrium is not perfect, but is biased in proportion to the perturbation amplitudes and the higher derivatives of the payoff functions. We quantify the size of these residual biases and confirm their existence numerically in an example noncooperative game. In this example, we present the first application of extremum seeking with projection to ensure that the players' actions remain in a given closed and bounded action set.
AB - We introduce a non-model based approach for locally stable convergence to Nash equilibria in static, noncooperative games with $N$ players. In classical game theory algorithms, each player employs the knowledge of the functional form of his payoff and the knowledge of the other players' actions, whereas in the proposed algorithm, the players need to measure only their own payoff values. This strategy is based on the extremum seeking approach, which has previously been developed for standard optimization problems and employs sinusoidal perturbations to estimate the gradient. We consider static games with quadratic payoff functions before generalizing our results to games with non-quadratic payoff functions that are the output of a dynamic system. Specifically, we consider general nonlinear differential equations with $N$ inputs and $N$ outputs, where in the steady state, the output signals represent the payoff functions of a noncooperative game played by the steady-state values of the input signals. We employ the standard local averaging theory and obtain local convergence results for both quadratic payoffs, where the actual convergence is semi-global, and non-quadratic payoffs, where the potential existence of multiple Nash equilibria precludes semi-global convergence. Our convergence conditions coincide with conditions that arise in model-based Nash equilibrium seeking. However, in our framework the user is not meant to check these conditions because the payoff functions are presumed to be unknown. For non-quadratic payoffs, convergence to a Nash equilibrium is not perfect, but is biased in proportion to the perturbation amplitudes and the higher derivatives of the payoff functions. We quantify the size of these residual biases and confirm their existence numerically in an example noncooperative game. In this example, we present the first application of extremum seeking with projection to ensure that the players' actions remain in a given closed and bounded action set.
KW - Extremum seeking
KW - Nash equilibria
KW - learning
KW - noncooperative games
UR - http://www.scopus.com/inward/record.url?scp=84860441142&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84860441142&partnerID=8YFLogxK
U2 - 10.1109/TAC.2011.2173412
DO - 10.1109/TAC.2011.2173412
M3 - Article
AN - SCOPUS:84860441142
SN - 0018-9286
VL - 57
SP - 1192
EP - 1207
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
IS - 5
M1 - 6060862
ER -