Nash Equilibrium Seeking in Noncooperative Games

P Frihauf,M Krstic,T Basar

doi:10.1109/tac.2011.2173412

Abstract

We introduce a non-model based approach for locally stable convergence to Nash equilibria in static, noncooperative games with <formula formulatype="inline" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex Notation="TeX">$N$</tex></formula> players. In classical game theory algorithms, each player employs the knowledge of the functional form of his payoff and the knowledge of the other players' actions, whereas in the proposed algorithm, the players need to measure only their own payoff values. This strategy is based on the extremum seeking approach, which has previously been developed for standard optimization problems and employs sinusoidal perturbations to estimate the gradient. We consider static games with quadratic payoff functions before generalizing our results to games with non-quadratic payoff functions that are the output of a dynamic system. Specifically, we consider general nonlinear differential equations with <formula formulatype="inline" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex Notation="TeX">$N$</tex> </formula> inputs and <formula formulatype="inline" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex Notation="TeX">$N$</tex> </formula> outputs, where in the steady state, the output signals represent the payoff functions of a noncooperative game played by the steady-state values of the input signals. We employ the standard local averaging theory and obtain local convergence results for both quadratic payoffs, where the actual convergence is semi-global, and non-quadratic payoffs, where the potential existence of multiple Nash equilibria precludes semi-global convergence. Our convergence conditions coincide with conditions that arise in model-based Nash equilibrium seeking. However, in our framework the user is not meant to check these conditions because the payoff functions are presumed to be unknown. For non-quadratic payoffs, convergence to a Nash equilibrium is not perfect, but is biased in proportion to the perturbation amplitudes and the higher derivatives of the payoff functions. We quantify the size of these residual biases and confirm their existence numerically in an example noncooperative game. In this example, we present the first application of extremum seeking with projection to ensure that the players' actions remain in a given closed and bounded action set.

Full Text