Discrete-time mean field games with risk-averse agents

We propose and investigate a discrete-time mean field game model involving risk-averse agents. The model under study is a coupled system of dynamic programming equations and a Kolmogorov equation. The agents' risk aversion is modeled by composite risk measures. The existence of a solution to the coupled system is obtained with a fixed point approach. The corresponding feedback control allows us to construct an approximate Nash equilibrium for a related dynamic game with finitely many players.


Introduction
The class of mean field game problems was introduced by J-M. Lasry and P-L. Lions in [17,18,19] and M. Huang, R. Malhamé, and P. Caines in [15], to study interactions among a large population of players. Many developments and applications have been proposed over the last decade, in particular in economic modeling and finance; one can refer for example to Y. Achdou et al. [1], O. Guéant, J-M. Lasry and P-L. Lions [13], and P. Cardaliaguet and C.-H. Lehalle [7]. Economic models "à la Cournot", considering interactions between the agents via a price variable, have recently received particular attention; let us mention the works of A. Bensoussan and P.J. Graber [11], J. F. Bonnans, S. Hadikanloo, and L. Pfeiffer [6], Z. Kobeissi [16], and P. J. Graber, V. Ignazio, and A. Neufeld [12].
The specificity of the mean field game of this article is the risk aversion of the involved agents. Here risk aversion is modeled with the help of composite risk measures (also called dynamic risk measures). Mathematically, a risk measure ρ is a map that assigns to a random variable U a real number, which is typically high when U is very volatile. In this way, ρ can be used to model the reluctance of a player to face highly uncertain expenses. We refer to the seminal work of P. Artzner, F. Delbaen, J-M. Eber and D. Heath [3]. We will make use of composite risk measures, the natural extension of risk measures to a multistage framework; see for example the article of A. Shapiro and A. Ruszczyński [24]. For an application to multistage portfolio selection, one can refer to A. Shapiro [26].
Let us describe more precisely our coupled system and the obtained results. The coupled system describes a population of identical agents which all optimize a discrete-time dynamical system (in a continuous state space). In the model, the associated cost function depends on a variable called belief, which is related to the behavior of the whole group, whence a coupling between a single agent and the population. Assuming that the population is very large, one can consider that an isolated representative agent has no impact on the belief. Therefore his/her behavior can be conveniently described by dynamic programming equations (in which the belief is a parameter). Mathematically, the belief is the probability distribution of the states and controls of all agents at the different time steps of the game; it is described via the Kolmogorov equation. Our first result is an existence result, obtained with a standard fixed point approach. In our second result, we show that an optimal feedback control for the mean field game yields an ε-Nash equilibrium for an N-player dynamic game, where ε → 0 as N → ∞. The proof of this result is based on an estimate of the expectation of the Wasserstein distance between the empirical measure of i.i.d. variables and the law of these variables, obtained by N. Fournier and A. Guillin [10, Theorem 1]. The approach that we follow was proposed by M. Huang, P. Caines, and R. Malhamé in [14].
Discrete-time and continuous-space mean field game models have been studied in different works. The framework that we propose in this article is close to the one of N. Saldi, T. Başar and M. Raginsky [25]; in particular, we make use of similar weighted spaces. A few works have already investigated the issue of risk aversion. Most of them model risk sensitivity via exponential utility functions, see for example H. Tembine, Q. Zhu and T. Başar [27]. The case of robust mean field games is investigated in problem (P2) in the work of J. Moon and T. Başar [20]. In many economic situations, risk modeling is of interest, in particular in the banking industry [21]. Our approach can also be relevant in situations where mean field games are used to design telecommunication systems or smart grids; see C. Bertucci et al. [5] and C. Alasseur, I. Ben Tahar and A. Matoussi [2]. For example, in the latter reference, it could be interesting to take into account the risk of individual energy shortages or collective blackout situations via robust control.
The article is structured as follows. In Section 1 we introduce notations, assumptions, and the system of coupled equations. In Section 2 we interpret this system as a mean field game system with risk averse agents. In Section 3 we establish general technical results that will be helpful in Section 4, where we prove the existence of a solution to the coupled system. Finally in Section 5 we investigate the connection between the coupled system and an N -player game.

Functions
Let C-Lip denote the set of Lipschitz continuous functions of modulus C on R^d. Given p ∈ [1, +∞), we define the p-polynomially weighted space (where the dimension d depends on the context)
G_p := { f : R^d → R measurable : sup_{x ∈ R^d} |f(x)| / (1 + |x|^p) < +∞ },
with associated norm
‖f‖_{G_p} := sup_{x ∈ R^d} |f(x)| / (1 + |x|^p),
and we set, for any C > 0, G_p^C := { f ∈ G_p : ‖f‖_{G_p} ≤ C }. Let
Q_p^C := { f ∈ G_p^C : f is convex } ⊂ G_p^C   (1.1)
denote the set of convex mappings f : R^d → R belonging to G_p^C.

Probability measures
Let P(R^d) denote the set of probability measures on R^d. Given p ∈ [1, +∞), we define the set of probability measures with a finite p-th order moment
P_p(R^d) := { m ∈ P(R^d) : ∫_{R^d} |x|^p dm(x) < +∞ },
which we endow with the Kantorovich-Rubinstein distance, defined by
d_1(μ, ν) := sup_{φ ∈ 1-Lip} ∫_{R^d} φ(x) d(μ − ν)(x),
for any μ and ν ∈ P_1(R^d) (see [28, Particular case 5.15] for more details). We recall that by the Hölder inequality, P_p(R^d) ⊆ P_1(R^d) for any p > 1. Given C > 0, we define
P_p^C(R^d) := { m ∈ P_p(R^d) : ∫_{R^d} |x|^p dm(x) ≤ C }.
We also consider the following sets of beliefs,
B_p := (P_p(R^{2d}))^T and B_p^C := (P_p^C(R^{2d}))^T,
endowed with the Kantorovich-Rubinstein distance for the product topology, also denoted d_1.
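As a numerical illustration (ours, not part of the model), the distance d_1 between two empirical measures on R can be computed directly: for equally weighted samples of the same size, it is the L^1 distance between the sorted samples. The helper name below is hypothetical.

```python
import numpy as np

def d1_empirical(xs, ys):
    """Kantorovich-Rubinstein (W1) distance between two empirical measures on R.
    For equally weighted samples of equal size, W1 is the mean absolute
    difference of the sorted samples (L^1 distance between quantile functions)."""
    assert len(xs) == len(ys), "sketch restricted to equal sample sizes"
    return np.mean(np.abs(np.sort(xs) - np.sort(ys)))

rng = np.random.default_rng(0)
mu_samples = rng.normal(loc=0.0, scale=1.0, size=10_000)  # samples from mu
nu_samples = rng.normal(loc=0.5, scale=1.0, size=10_000)  # samples from nu
# For two Gaussians with the same variance, W1 equals the distance between the means.
print(d1_empirical(mu_samples, nu_samples))  # ~ 0.5
```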
For any m and ν ∈ P(R^d), we define the convolution product ν * m by
∫_{R^d} h(x) d(ν * m)(x) := ∫_{R^d} ∫_{R^d} h(x + y) dν(y) dm(x),   (1.2)
for any bounded Borel map h : R^d → R. For any m ∈ P(R^d) and for any Borel map g : R^d → R^d, we define the image measure g♯m ∈ P(R^d) by
∫_{R^d} h(x) d(g♯m)(x) := ∫_{R^d} h(g(x)) dm(x),   (1.3)
for any bounded Borel map h : R^d → R.

Coupled system
Let us first introduce the data of the problem: the time horizon T, the initial distribution m̄, the noise distributions (ν(t))_{t∈T}, the running cost, and the terminal cost. For modeling risk aversion, we consider a family of subsets (Z_t)_{t∈T} of random variables and, for any t ∈ T, an associated subset M_t of probability measures. For any t ∈ T, Z_t is assumed to be nonempty and convex, thus M_t is a nonempty and convex subset of P(R^d).
We propose to study a risk averse mean field game (MFG), taking the form of a coupled system of five equations, (MFG,i)-(MFG,v), holding for any (t, x) ∈ T × R^d. The five unknowns of the system are:
• the value function u ∈ (G_2)^{T+1},
• the feedback control α ∈ (G_1 ∩ 1-Lip)^T,
• the distribution of states m ∈ (P_2)^{T+1},
• the joint distribution of states and controls μ ∈ (P_2(R^{2d}))^T,
• the belief b ∈ B_2.
Let us briefly describe the coupled system; we will justify it in more detail in Section 2. Equation (MFG,i) is a dynamic programming equation associated with a discrete-time optimal control problem for a representative agent. The belief b appears as a parameter of the equation, since a single agent has no impact on it. The corresponding optimal feedback control α is then given by (MFG,ii). Now, assuming that all agents make use of the feedback control α, the distribution of their states m is described by the Kolmogorov equation (MFG,iii), with initial condition m̄.
Our approach for proving the existence of a solution consists in formulating the system (MFG) as a fixed point equation. For this purpose, we consider two mappings. The first one, which we call the dynamic programming mapping, assigns to a belief b the solutions u(b) and α(b) to equations (MFG,i) and (MFG,ii), respectively. The second one, the Kolmogorov mapping, assigns to a feedback control α the triplet (m(α), μ(α), b(α)), where m(α), μ(α), and b(α) are the solutions to (MFG,iii), (MFG,iv), and (MFG,v), respectively. These two mappings will be investigated in Section 4. They allow us to reformulate the system (MFG) as an equivalent fixed point equation on the belief variable: b solves the system if and only if it is a fixed point of the composition of the two mappings.
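To fix ideas, the following Python sketch shows how such a fixed point could be approximated by iterating the two mappings. The functions dynamic_programming_map and kolmogorov_map are hypothetical placeholders, and the damped iteration is only an illustration: the paper establishes existence via a fixed point theorem, not the convergence of this scheme.

```python
def solve_mfg_fixed_point(dynamic_programming_map, kolmogorov_map,
                          b0, distance, max_iter=100, tol=1e-6, damping=0.5):
    """Schematic fixed-point iteration on the belief variable b.

    dynamic_programming_map: belief b -> feedback control alpha   (MFG,i-ii)
    kolmogorov_map:          control alpha -> induced belief       (MFG,iii-v)
    distance:                metric between beliefs (e.g. d_1)
    """
    b = b0
    for _ in range(max_iter):
        alpha = dynamic_programming_map(b)   # best response to the belief b
        b_new = kolmogorov_map(alpha)        # belief generated by that response
        gap = distance(b, b_new)
        b = (1 - damping) * b + damping * b_new   # relaxed update of the belief
        if gap < tol:
            break
    return b, alpha, gap

# Toy usage with one-dimensional "beliefs" (purely illustrative):
dp = lambda b: -0.5 * b     # hypothetical best-response map
ko = lambda a: a + 1.0      # hypothetical Kolmogorov map
b_star, _, _ = solve_mfg_fixed_point(dp, ko, b0=0.0,
                                     distance=lambda x, y: abs(x - y))
print(b_star)  # ~ 2/3, the fixed point of b -> -0.5 * b + 1
```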

Assumptions
We now state the assumptions on the data of the problem, which are in force throughout the article. Note that for the results of Section 5 (dealing with the N-player dynamic game), we will need a slightly stronger assumption on the mapping F.
We make use of the same constant C to formulate the different assumptions. In the sequel, the constant C denotes a generic constant depending only on the constants involved in the assumptions and on T; its value can change from one inequality to the next.

Assumption 1.1. There exists C > 0 such that m̄ ∈ P_2^C(R^d) and such that for any t ∈ T, ν(t) ∈ P_2^C(R^d).
Assumption 1.2. There exists C > 0 such that, for any t ∈ T, every Z ∈ Z_t satisfies an upper bound with constant C, and there exists Z ∈ Z_t satisfying the corresponding lower bound.

Remark 1.3. Assumption 1.2 implies the existence of C > 0 such that (1.5) holds. The results obtained in Section 4 only require (1.5) to hold. The full Assumption 1.2 will be used in Section 5.
Assumption 1.4. There exists C > 0 such that for any t ∈ T and for any b_1 and b_2 ∈ B_2, conditions (i)-(iv) hold.

Remark 1.5. In economics or in finance, prices typically depend on the aggregated demand or supply. One could consider, for example, a price obtained by applying a mapping ψ to an aggregate quantity of the belief. In this case, if ψ is a C-Lipschitz mapping, then for any b_1 and b_2 ∈ B_2, one obtains an estimate which implies Assumption 1.4 (iii). Assumption 1.4 (iv) also holds if |ψ| ≤ C.

Interpretation of the coupled system
In Subsection 2.1 we describe the risk averse optimal control problem associated with (MFG,i-ii). In Subsection 2.2 we justify the Kolmogorov equation (MFG,iii).

Risk measures
Let X_0 and (Y_t)_{t∈T} be (T + 1) independent random variables defined on a probability space (Ω, F, P), with L(X_0) = m̄ and L(Y_t) = ν(t). We define the filtration (F_t), where F_0 := σ(X_0) is the sigma-algebra generated by X_0, and F_{t+1} := σ(X_0, Y_{[t]}). We denote, for any t ∈ T̄ and any p ∈ [1, +∞), by L^p_t(Ω, R^d) the space of F_t-measurable random variables with a finite p-th order moment and values in R^d. When the dimension is d = 1, we simplify the notation: L^p_t := L^p_t(Ω, R).

Definition 2.1. Given t ∈ T, we say that a mapping ρ_t : L^1_{t+1} → L^1_t is a conditional risk mapping if it satisfies the following conditions:
• (TI) Translation Invariance: for any U ∈ L^1_{t+1} and for any V ∈ L^1_t, we have ρ_t(U + V) = ρ_t(U) + V;
• (PH) Positive Homogeneity: for any α ≥ 0 and for any U ∈ L^1_{t+1}, we have ρ_t(αU) = αρ_t(U).
Quoting [23], the conditional risk mapping ρ_t(U_{t+1}) can be interpreted as a fair one-time F_t-measurable charge we would be willing to incur at time t instead of the random future cost U_{t+1}.
We now fix a family of conditional risk mappings (ρ_t)_{t∈T}, ρ_t : L^1_{t+1} → L^1_t, where the random variables U_{t+1} and ρ_t(U_{t+1}) are explicitly represented as measurable functions of (x_0, y_{[t]}) and (x_0, y_{[t−1]}), respectively, so that ρ_t can be expressed in a pointwise form. A more general approach would be close to the one developed in [24]; however, in the dynamic programming principle, the state should then include x_0 and y_{[t−1]}, and therefore its dimension might be too large for practical computations.
Remark 2.3. Given a probability space (Ω′, F′, P′) and given α ∈ (0, 1], the conditional value at risk (also called expected shortfall or average value at risk) of a random variable U ∈ L^1(Ω′, F′, P′) is defined by
CVaR_α(U) := inf_{s ∈ R} { s + (1/α) E[(U − s)_+] },
where x_+ = max{0, x} denotes the positive part of any x ∈ R. It has the following dual representation (see [9, Lemma 4.51 and Theorem 4.52]):
CVaR_α(U) = sup { E[Z U] : Z ∈ L^∞(Ω′, F′, P′), 0 ≤ Z ≤ 1/α, E[Z] = 1 }.
Therefore, a natural extension of the conditional value at risk to the framework of the article is obtained by taking for Z_t the set of random variables satisfying the analogous constraints conditionally on F_t. This particular definition of Z_t satisfies Assumption 1.2. We refer to [9, Definition 11.8] and [8, Subsection 2.3.1] for extensions of the conditional value at risk to general filtrations in a discrete-time setting.
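The two formulas above can be checked numerically on a sample; the short Python sketch below (with helper names of our own) computes the conditional value at risk of a simulated loss both via the infimum formula and as a tail average.

```python
import numpy as np

def cvar_primal(losses, alpha):
    """CVaR_alpha(U) = inf_s { s + E[(U - s)_+] / alpha }; the (1 - alpha)-quantile
    of the losses is a minimizer of the right-hand side, so we plug it in."""
    s = np.quantile(losses, 1.0 - alpha)
    return s + np.mean(np.maximum(losses - s, 0.0)) / alpha

def cvar_tail_average(losses, alpha):
    """For an equally weighted sample with alpha * n integer, CVaR_alpha is the
    average of the alpha * n largest losses (dual formula, Z = 1/alpha on the tail)."""
    k = int(round(alpha * len(losses)))
    return np.sort(losses)[::-1][:k].mean()

rng = np.random.default_rng(1)
losses = rng.normal(size=100_000)
print(cvar_primal(losses, 0.05))        # ~ 2.06 for a standard normal loss
print(cvar_tail_average(losses, 0.05))  # same value, from the worst 5% of scenarios
```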

Control problem
We consider, for any t ∈ T, the set of controls A_t := L^2_t(Ω, R^d), and we set A := A_0 × ⋯ × A_{T−1}. Given a control A ∈ A, the evolution of the state of the representative player is given by
X_{t+1} = X_t + A_t + Y_t, for all t ∈ T.
The initial condition is the random variable X_0 fixed previously. In the notation, we do not make explicit the dependence of (X_t)_{t∈T} with respect to A, which is always clear from the context. Note that by induction, X_t ∈ L^2_t(Ω, R^d) for any t ∈ T̄. For a given belief b ∈ B_2, the risk averse multistage cost of the representative agent is obtained by composing the conditional risk mappings (ρ_t)_{t∈T} along the stages of the problem; the corresponding problem (P) consists in minimizing this cost over A ∈ A. As a consequence of the translation invariance property (TI), the problem (P) can be expressed in a nested form (2.3). The solution of problem (P) can be characterized by a dynamic programming approach, which we briefly describe; we refer to [23] for a rigorous presentation. The dynamic programming approach consists, in short, in investigating a family of control problems of the same nature as (P). The problems are parameterized by their initial time t and initial condition x; their value is denoted u(t, x). The function u is characterized by equation (MFG,i), which itself derives from the nested form (2.3). Consider now the function α, solution to (MFG,ii) (we will justify the existence and uniqueness of the "argmin" in this equation in Lemma 4.1). Let (X̄_t)_{t∈T̄} be the solution to the closed-loop system
X̄_{t+1} = X̄_t + α_t(X̄_t, b) + Y_t,  X̄_0 = X_0,   (2.4)
and let Ā be defined by
Ā_t := α_t(X̄_t, b), for all t ∈ T.   (2.5)
Let us briefly justify that Ā ∈ A. Since X̄_t is adapted to F_t, the control Ā_t is F_t-measurable. In addition, we show in Lemma 4.2 that α_t is 1-Lipschitz. Since the random variables X_0 and (Y_t)_{t∈T} have a bounded second-order moment, Ā_t also has a bounded second-order moment and thus Ā ∈ A.
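To illustrate the nested structure of a composite risk measure, the following Python sketch evaluates ρ_0(c_1 + ρ_1(c_2)) on a two-stage scenario tree, with the conditional value at risk playing the role of each conditional risk mapping. The tree, the costs, and the helper cvar_discrete are toy data and names of our own, not objects from the paper.

```python
import numpy as np

def cvar_discrete(values, probs, alpha):
    """CVaR_alpha of a discrete random variable (values v_i with probabilities p_i):
    average of the worst outcomes carrying a total probability mass alpha."""
    order = np.argsort(values)[::-1]                 # worst outcomes first
    v, p = np.asarray(values, float)[order], np.asarray(probs, float)[order]
    remaining, total = alpha, 0.0
    for vi, pi in zip(v, p):
        w = min(pi, remaining)                       # tail mass taken from this outcome
        total += w * vi
        remaining -= w
        if remaining <= 0.0:
            break
    return total / alpha

# Two-stage composite risk rho_0( c1 + rho_1( c2 ) ), with rho_t = CVaR_alpha.
alpha = 0.5
p = np.array([0.5, 0.5])          # law of each noise Y_t (two outcomes)
c1 = np.array([1.0, 3.0])         # stage-1 cost for each realization of Y_0
c2 = np.array([[0.0, 2.0],        # stage-2 cost for each pair (Y_0, Y_1)
               [1.0, 5.0]])

inner = np.array([cvar_discrete(c2[i], p, alpha) for i in range(2)])  # rho_1(c2 | Y_0)
print(cvar_discrete(c1 + inner, p, alpha))   # composite (nested) risk, here 8.0
```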
The following proposition states the optimality of the control Ā.
Proposition 2.4. The value of problem (P) is equal to E[u(0, X_0)], where u solves the dynamic programming equation (MFG,i). Moreover, the control Ā (defined by (2.4)-(2.5)) is the unique solution to Problem (P).

Kolmogorov equation
Let (X̄_t)_{t∈T̄} be the solution to the closed-loop system (2.4). Then for any t ∈ T,
L(X̄_{t+1}) = ν(t) * ((id + α_t(·, b))♯ L(X̄_t)),
which is precisely the Kolmogorov equation (MFG,iii) satisfied by m.
Proof. Let φ be a bounded Borel test function. For any t ∈ T, by independence of X̄_t and Y_t, we have
E[φ(X̄_{t+1})] = E[φ(X̄_t + α_t(X̄_t, b) + Y_t)] = ∫_{R^d} ∫_{R^d} φ(x + α_t(x, b) + y) dL(X̄_t)(x) dν(t)(y).
By definition of the push-forward (1.3), we obtain
E[φ(X̄_{t+1})] = ∫_{R^d} ∫_{R^d} φ(x + y) d((id + α_t(·, b))♯ L(X̄_t))(x) dν(t)(y).
By definition of the convolution (1.2), we have
E[φ(X̄_{t+1})] = ∫_{R^d} φ(x) d(ν(t) * ((id + α_t(·, b))♯ L(X̄_t)))(x),
as was to be proved.
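Numerically, this recursion can be simulated with particles: each particle is pushed through x ↦ x + α_t(x) (the push-forward) and then perturbed by an independent noise sample (the convolution). The feedback and noise law in the Python sketch below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def kolmogorov_step(particles, feedback, noise_sampler):
    """One step of (MFG,iii) on a particle approximation of m(t):
    push-forward by id + feedback, then convolution with the noise law nu(t)."""
    pushed = particles + feedback(particles)        # (id + alpha_t) # m(t)
    return pushed + noise_sampler(len(particles))   # nu(t) * ( ... )

alpha_t = lambda x: -0.5 * x                        # toy 1-Lipschitz feedback (d = 1)
nu_t = lambda n: rng.normal(scale=0.1, size=n)      # toy noise law

m_t = rng.normal(size=50_000)                 # particles approximating m(t)
m_next = kolmogorov_step(m_t, alpha_t, nu_t)  # particles approximating m(t+1)
print(m_t.std(), m_next.std())                # the variance contracts under this feedback
```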

Technical lemmas
This section contains independent technical lemmas. The reader only interested in the main results of the article can skip it.
Lemma 3.1. Let p ∈ [1, +∞) and let C > 0. For any m_1 and m_2 in P_p^C(R^d), the probability measure m_1 * m_2 lies in P_p^{2^p C}(R^d). In addition, given ν ∈ P_1(R^d),
d_1(ν * m_1, ν * m_2) ≤ d_1(m_1, m_2).
Proof. The first assertion follows from the inequality |x + y|^p ≤ 2^{p−1}(|x|^p + |y|^p). For the second one, let φ ∈ 1-Lip; then
∫_{R^d} φ d(ν * m_1 − ν * m_2) = ∫_{R^d} ( ∫_{R^d} φ(y + z) dν(y) ) d(m_1 − m_2)(z).
Since the mapping z ↦ ∫_{R^d} φ(y + z) dν(y) is non-expansive, we further obtain that
∫_{R^d} φ d(ν * m_1 − ν * m_2) ≤ d_1(m_1, m_2),
which concludes the proof.
Lemma 3.2. Let p ∈ [1, +∞) and let C > 0. For any m ∈ P_p^C(R^d) and for any Borel map g ∈ G_1^C, the probability measure g♯m lies in P_p^q(R^d), with q = 2^{p−1} C^p (1 + C). In addition, inequality (3.1) holds for any m_1 and m_2 in P_p^C(R^d) and for any Borel maps g_1 and g_2 in G_1^C ∩ C-Lip.
Proof. Let m ∈ P_p^C(R^d) and let g ∈ G_1^C be a Borel map. Since |g(x)| ≤ C(1 + |x|), we have
∫_{R^d} |g(x)|^p dm(x) ≤ 2^{p−1} C^p ( 1 + ∫_{R^d} |x|^p dm(x) ) ≤ 2^{p−1} C^p (1 + C) = q.
Consider now (g_1, m_1) and (g_2, m_2) in G_1^C × P_p^C(R^d), with g_1 and g_2 in C-Lip, and let φ ∈ 1-Lip. Observing that C^{−1} φ ∘ g_1 ∈ 1-Lip, we deduce inequality (3.1).
Given a convex function u : R^d → R, we define the Moreau envelope V_u and the proximal operator prox_u of u as follows:
V_u(x) := min_{y ∈ R^d} { (1/2)|x − y|^2 + u(y) },   prox_u(x) := argmin_{y ∈ R^d} { (1/2)|x − y|^2 + u(y) }.   (3.2)
In the proofs, we will occasionally consider the map g_u : (x, y) ∈ R^d × R^d ↦ (1/2)|x − y|^2 + u(y).
Proposition 3.3. For any convex function u : R^d → R, the proximal operator prox_u is non-expansive (that is, 1-Lipschitz).
Lemma 3.4. Let R > 0 and let u ∈ Q_2^R (the set was defined in (1.1)). Then |prox_u|^2 ∈ G_2^{C_1(R)}, for a constant C_1(R) depending only on R.
Proof. Let u ∈ Q_2^R. By Proposition 3.3, the map prox_u is non-expansive. Thus
|prox_u(x)| ≤ |prox_u(0)| + |x|.   (3.3)
In addition, from the definition of the proximal operator (3.2), we have
(1/2)|prox_u(0)|^2 + u(prox_u(0)) ≤ u(0).
Since u ∈ Q_2^R, we deduce that |prox_u(0)|^2 ≤ 4R. We further obtain with (3.3) that
|prox_u(x)|^2 ≤ 2|prox_u(0)|^2 + 2|x|^2 ≤ 8R + 2|x|^2,   (3.4)
as was to be proved. Taking the square root of (3.4), we infer that |prox_u| ∈ G_1.
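As an illustration (ours, not from the paper), the proximal operator and Moreau envelope of the absolute value have closed forms (soft-thresholding and the Huber function), and the non-expansiveness stated in Proposition 3.3 can be checked numerically.

```python
import numpy as np

def prox_abs(x, w=1.0):
    """Proximal operator of u(y) = w * |y| (soft-thresholding):
    argmin_y 0.5 * (x - y)**2 + w * |y|."""
    return np.sign(x) * np.maximum(np.abs(x) - w, 0.0)

def moreau_abs(x, w=1.0):
    """Moreau envelope of u(y) = w * |y| (the Huber function)."""
    y = prox_abs(x, w)
    return 0.5 * (x - y) ** 2 + w * np.abs(y)

xs = np.linspace(-3.0, 3.0, 7)
print(prox_abs(xs))     # every point is shrunk towards 0 by at most 1
print(moreau_abs(xs))   # smooth approximation of |x| from below

# Non-expansiveness: |prox(x1) - prox(x2)| <= |x1 - x2| for all pairs.
x1, x2 = np.random.default_rng(3).normal(size=(2, 1000))
assert np.all(np.abs(prox_abs(x1) - prox_abs(x2)) <= np.abs(x1 - x2) + 1e-12)
```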
Lemma 3.5. Let R > 0. For any u ∈ Q_2^R, we have V_u ∈ Q_2^{C_2(R)}, where C_2(R) := (R + 1)(1 + C_1(R)).
Proof. Let u ∈ Q_2^R. Clearly V_u is convex, as the infimum with respect to y ∈ R^d of the jointly convex map (x, y) ↦ g_u(x, y). For any x ∈ R^d, we have, by definition of V_u and prox_u,
V_u(x) = (1/2)|x − prox_u(x)|^2 + u(prox_u(x)).
Since u ∈ Q_2^R, we further obtain that |V_u(x)| ≤ (1/2)|x − prox_u(x)|^2 + R(1 + |prox_u(x)|^2). Applying Lemma 3.4, we finally obtain that V_u ∈ Q_2^{C_2(R)}.

Lemma 3.6. Let R > 0. For any u and v in Q_2^R, inequality (3.5) holds, where C_3(R) := 2(1 + C_1(R)).
Proof. Let u and v be in Q_2^R. Observing that g_u and g_v are 1-strongly convex with respect to their second argument, we obtain two inequalities. Summing them up, and then combining (3.6) and (3.7) and taking the square root, we obtain (3.5).
Proof. Let u and v be in Q_2^R. The result follows directly from the definitions of g_u and g_v.
Lemma 3.8. Let R > 0. For any u ∈ Q_2^R and for any (x, y) ∈ R^d × R^d, inequality (3.9) holds.
Combining the two obtained inequalities and exchanging x and y, we obtain (3.9).

Lemma 3.9. Let R > 0 and let M be a subset of P_2^R(R^d). Given u ∈ Q_2^R, consider the mapping Υ[u] defined for any x ∈ R^d by
Υ[u](x) := sup_{ξ ∈ M} ∫_{R^d} u(x + y) dξ(y).
Then Υ[u] ∈ Q_2^{C_6(R)}, for a constant C_6(R) depending only on R, and the mapping u ∈ Q_2^R ↦ Υ[u] is Lipschitz continuous with modulus 2(1 + R).
Proof. Let u ∈ Q_2^R. For any ξ ∈ M, the map x ∈ R^d ↦ ∫_{R^d} u(x + y) dξ(y) is convex, as can be easily verified. Thus Υ[u](x) is convex with respect to x, as a supremum of convex maps. Moreover, for any x ∈ R^d, we have
|Υ[u](x)| ≤ R sup_{ξ ∈ M} ∫_{R^d} (1 + |x + y|^2) dξ(y) ≤ 2R(1 + R)(1 + |x|^2).
This proves that Υ[u] ∈ Q_2^{C_6(R)}. Consider now v ∈ Q_2^R. We have (3.10), and for any ξ ∈ M, we further have (3.11). Combining (3.10) and (3.11), we deduce that the mapping u ↦ Υ[u] is Lipschitz continuous with modulus 2(1 + R), as was to be proved.

Existence result
In this section we prove the main existence result. We first investigate the continuity of the dynamic programming mapping and the continuity of the Kolmogorov mapping introduced in Subsection 1.2.

Dynamic Programming mapping
Given a belief b ∈ B_2, let us recall that u(·, ·, b) is the solution to (MFG,i) and α_·(·, b) the solution to (MFG,ii). For convenience, we introduce an intermediate mapping ū(·, ·, b), defined by
ū(t + 1, x, b) := sup_{ξ ∈ M_t} ∫_{R^d} u(t + 1, x + y, b) dξ(y),
for any t ∈ T. This allows us to rewrite equations (MFG,i-ii) in the form (4.1)-(4.2). The first step consists in rewriting these equations in a functional form, with the help of the Moreau envelope and the proximal operator (introduced in (3.2)). In addition, we justify the existence and uniqueness of the minimizer in the right-hand side of (4.2).
Lemma 4.1. For any t ∈ T, the map u(t, ·, b) is convex and, for any x ∈ R^d, the minimizer in (MFG,ii) is unique. Moreover, the representations (4.3) and (4.4) hold.
Proof. We proceed by backward induction. The terminal condition u(T, ·, b) = F(T, ·, b) and Assumption 1.4 (i) yield the convexity of u(T, ·, b). Let t ∈ T. Suppose that u(t + 1, ·, b) is convex. Then by Lemma 3.9, ū(t + 1, ·, b) is also convex. By the change of variable y = x + a, the dynamic programming equation (MFG,i) can be written in the form (4.5), where the minimization is performed with respect to y ∈ R^d. This proves (4.3). Moreover, u(t, ·, b) is convex, as a consequence of Assumption 1.4 (i) and Lemma 3.5. Besides, the unique minimizer in (4.5) is
y* = prox_{ū(t+1,·,b)}(x − P(t, b)),
and therefore the unique minimizer in (MFG,ii) is y* − x, which proves (4.4). The lemma is proved.
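The change of variable can be checked numerically: for a convex u, the minimizer of a ↦ (1/2)|a|^2 + u(x + a) is prox_u(x) − x. The Python sketch below uses a one-dimensional quadratic u and ignores the shift by P(t, b); it is only a sanity check of this identity, not a reproduction of the paper's setting.

```python
from scipy.optimize import minimize_scalar

u = lambda y: 2.0 * (y - 1.0) ** 2            # arbitrary convex function
prox_u = lambda x: minimize_scalar(lambda y: 0.5 * (x - y) ** 2 + u(y)).x

x = 3.0
a_star = minimize_scalar(lambda a: 0.5 * a ** 2 + u(x + a)).x  # minimizer over actions
print(a_star, prox_u(x) - x)  # both are approximately -1.6: the two formulations agree
```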
Lemma 4.2. For any t ∈ T̄, we have
u(t, ·, b) ∈ Q_2^{C_u},   (4.6)
and for any t ∈ T, we have
α_t(·, b) ∈ G_1^{C_α} ∩ 1-Lip,   (4.7)
for some positive constants C_α and C_u independent of t and b.
Proof. In the proof, all constants C are independent of b. Let us prove (4.6) by backward induction. The terminal condition u(T, ·, b) = F(T, ·, b) and Assumption 1.4 (i) imply that u(T, ·, b) ∈ Q_2^C, for some constant C > 0 (independent of b). Let t ∈ T. Suppose that u(t + 1, ·, b) ∈ Q_2^C. Then by Lemma 3.9 and relation (1.5), we have ū(t + 1, ·, b) ∈ Q_2^C. Recall that by Lemma 4.1, we have (4.8). By Assumptions 1.4 (i) and (iv), F(t, ·, b) ∈ Q_2^C. Using again Assumption 1.4 (iv) and Lemma 3.5, we obtain that V_{ū(t+1,·,b)}(· − P(t, b)) ∈ Q_2^C. Therefore, the right-hand side of (4.8) lies in Q_2^C and finally, u(t, ·, b) ∈ Q_2^C, where C is independent of b.
Let us prove (4.7). By Lemma 4.1, we have
α_t(·, b) = prox_{ū(t+1,·,b)}(· − P(t, b)) − id.   (4.9)
We already know that ū(t + 1, ·, b) ∈ Q_2^C. Moreover, by Assumption 1.4 (iv), P(t, b) is bounded. Therefore, by Lemma 3.4, prox_{ū(t+1,·,b)}(· − P(t, b)) ∈ G_1^C. Then it is easy to show that α_t(·, b) ∈ G_1^C, where again, C does not depend on b. Finally, α_t(·, b) is non-expansive as a consequence of (4.9) and Proposition 3.3. The lemma is proved.

Lemma 4.3. There exists C > 0 such that for any (t, b_1, b_2) ∈ T̄ × B_2 × B_2, inequality (4.10) holds.
Proof. In the proof, all constants C are independent of b_1 and b_2. We proceed by backward induction. By Assumption 1.4 (iii) and by the terminal condition u(T, ·, b) = F(T, ·, b), inequality (4.10) holds true for t = T. Let t ∈ T. Suppose that (4.10) holds at time t + 1, for some positive constant C > 0 independent of b_1 and b_2. By (1.5) and Lemma 3.9, we deduce a corresponding estimate (4.12) for the difference ū(t + 1, ·, b_1) − ū(t + 1, ·, b_2). By Lemma 4.1, the difference u(t, ·, b_1) − u(t, ·, b_2) can be decomposed as the sum of three terms a_1(t, ·, b_1, b_2), a_2(t, ·, b_1, b_2), and a_3(t, ·, b_1, b_2), see (4.13). It remains to bound a_1(t, ·, b_1, b_2), a_2(t, ·, b_1, b_2), and a_3(t, ·, b_1, b_2) in G_2^C. We deduce from Lemma 3.7, Assumption 1.4 (iv), and estimate (4.12) that a_1(t, ·, b_1, b_2) ∈ G_2^C. Then by Lemma 3.8 and Assumption 1.4 (iv), a_2(t, ·, b_1, b_2) ∈ G_2^C. Finally, by Assumption 1.4 (ii-iv), a_3(t, ·, b_1, b_2) ∈ G_2^C. Combining (4.13) and the three estimates of a_1, a_2, and a_3, we obtain (4.10) at time t, which concludes the proof.

Kolmogorov mapping
We now study the Kolmogorov mapping α ↦ (m(α), μ(α), b(α)), where (m(α), μ(α), b(α)) is the solution to (MFG,iii-v).
Lemma 4.5. There exists C_b > 0 such that for any α ∈ (G_1^{C_α} ∩ 1-Lip)^T, we have b(α) ∈ B_2^{C_b}. In addition, the three mappings m, μ, and b are continuous.
Proof. Let α ∈ (G_1^{C_α} ∩ 1-Lip)^T. All constants C in the proof are independent of α. Let us first prove by induction that for any t ∈ T̄, there exists a constant C > 0 independent of α such that m(t, ·, α) ∈ P_2^C(R^d) and such that m(t, ·, α) is continuous with respect to α. The claim is clear for t = 0, since m(0, ·, α) = m̄ ∈ P_2^C(R^d), by Assumption 1.1. Now, let us assume that the claim holds true for some t ∈ T. We recall that
m(t + 1, ·, α) = ν(t) * ((id + α_t)♯ m(t, ·, α)).
Since ν(t) ∈ P_2^C(R^d) (by Assumption 1.1) and since α_t ∈ G_1^{C_α} ∩ 1-Lip, we obtain with Lemma 3.1 and Lemma 3.2 that m(t + 1, ·, α) ∈ P_2^C(R^d) and that m(t + 1, ·, α) is a continuous function of α, by composition.
We deduce from Lemma 3.2 that µ (t, ·, α) ∈ P C 2 (R 2d ) and that µ (t, ·, α) is a continuous function of α, by composition. It immediately follows that b (α) ∈ B C 2 and that b is continuous.

Existence of equilibrium
We are ready to prove the existence of a solution of system (MFG). The proof relies on the Schauder fixed point theorem, that we first recall. Theorem 4.6. (Schauder) Let C be a convex and compact set in a Banach space X, and let T : C → C be a continuous mapping. Then T has a fixed point, i.e. there exists x ∈ C such that T (x) = x.
Theorem 4.7. The coupled system (MFG) admits a solution (u, α, m, μ, b) such that, for any t, u(t, ·) ∈ Q_2^{C_u}, α_t ∈ G_1^{C_α} ∩ 1-Lip, and b ∈ B_2^{C_b}, where C_u, C_α, and C_b are the constants obtained in Lemma 4.2 and Lemma 4.5.

Connection with a finite player game
In this section we establish a connection between the coupled system (MFG) and a dynamic game with N players. More precisely, we fix a solution (ū,ᾱ,m,μ,b) of system (MFG) and consider the situation where each of the N players adopts the feedbackᾱ. We show that this situation is an ε-Nash equilibrium for the N -player game and we quantify the rate of convergence of ε to 0 as N goes to infinity.
To show this, the following restriction on Assumption 1.4 (ii) will be required, in particular to prove Lemma 5.12.
Assumption 5.1. There exists C > 0 such that for any t ∈ T and for any b_1 and b_2 in B_2, a strengthened version of the estimate of Assumption 1.4 (ii) holds for F.

We have already fixed a solution to system (MFG); we now also fix the number of players N. All constants C appearing in the sequel are independent of N.

Formulation of the game
Consider a probability space (Ω, F, P). Let (X^i_0)_{i∈N} be i.i.d. random variables with law L(X^i_0) = m̄. Let (Y^i_t)_{i∈N, t∈T} be independent random variables, independent of (X^i_0)_{i∈N}, with law L(Y^i_t) = ν(t). We denote by ν^{⊗N}(t) := ⊗_{i=1}^N ν(t) the corresponding product measure. We define the filtration (F_t) as follows: F_0 := σ((X^i_0)_{i∈N}) is the sigma-algebra generated by the initial conditions, and F_{t+1} := σ((X^i_0)_{i∈N}, (Y^i_{[t]})_{i∈N}). In this section we denote L^p_t(Ω, R^d) := L^p(Ω, F_t, P, R^d), the space of F_t-measurable random variables with a finite p-th order moment and values in R^d. When the dimension is d = 1, we simplify the notation L^p_t := L^p_t(Ω, R). For any t ∈ T, we consider the control set A_t := L^2_t(Ω, R^d) and we set A := A_0 × ⋯ × A_{T−1}.
For any t ∈ T and for any constant C > 0, we denote by A^C_t the set of controls A ∈ A_t such that ∫_Ω |A(ω)|^2 dP(ω) ≤ C, and we set A^C := A^C_0 × ⋯ × A^C_{T−1}. The control of player i ∈ N is an adapted stochastic process A^i ∈ A, whose associated trajectory (X^i_t[A^i])_{t∈T̄} is defined by the following state equation:
X^i_{t+1}[A^i] = X^i_t[A^i] + A^i_t + Y^i_t,  X^i_0[A^i] = X^i_0,  for all t ∈ T.
There exists C > 0 (depending on R) such that for any i ∈ N and for any A^i ∈ A^R, the associated trajectory has a second-order moment bounded by C. Given A ∈ A^N, we define the random empirical measure of the positions and the random empirical joint measure of the positions and actions of the players by
m^N_t[A] := (1/N) Σ_{i=1}^N δ_{X^i_t[A^i]}  and  μ^N_t[A] := (1/N) Σ_{i=1}^N δ_{(X^i_t[A^i], A^i_t)},
where δ denotes the Dirac measure. We set b^N[A] := (μ^N_t[A])_{t∈T}. For any i ∈ N and for any t ∈ T, we define the individual conditional risk measure ρ^i_t by analogy with Section 2, using the set Z_t and the individual noise Y^i_t only; ρ^i_t can then be expressed in a dual form and, in addition, admits a representation involving the product measure ν^{−i}(t) := ⊗_{j∈N∖{i}} ν(t). Then (ρ^i_t)_{t∈T} is a family of conditional risk mappings. We define the associated individual composite risk measure ρ^i by composition of the mappings (ρ^i_t)_{t∈T}. Here players are risk averse with respect to their individual noise only. For any A ∈ A^N, the cost J^{i,N}(A) of the player i ∈ N is given by the composite risk measure ρ^i applied to the total cost along the trajectory (X^i_t[A^i])_{t∈T̄}, evaluated with the empirical belief b^N[A].

Definition 5.3. Let ε ≥ 0. We say that an N-tuple A ∈ A^N is an ε-Nash equilibrium for the N-player game if, for any i ∈ N,
J^{i,N}(A) ≤ inf_{B ∈ A} J^{i,N}(B, A^{−i}) + ε.
For ε = 0, we recover the usual definition of a Nash equilibrium.

An approximate Nash equilibrium
For any player i ∈ N, we denote by (X̄^i_t)_{t∈T̄} the solution to the closed-loop system
X̄^i_{t+1} = X̄^i_t + ᾱ_t(X̄^i_t) + Y^i_t,  X̄^i_0 = X^i_0,  for all t ∈ T.
We define the control Ā^i ∈ A by Ā^i_t := ᾱ_t(X̄^i_t).
Since X̄^i_t is adapted to F_t, the control Ā^i_t is also F_t-measurable. Moreover, ᾱ_t is 1-Lipschitz and the random variables X^i_0 and (Y^i_t)_{t∈T} have a bounded second-order moment; thus Ā^i ∈ A. In addition, by Proposition 2.4, Ā^i minimizes the cost J^i, defined as the risk averse cost of Subsection 2.2 associated with the fixed belief b̄ and the noise variables of player i. Finally, we set Ā = (Ā^1, . . . , Ā^N). The following result states that Ā is an ε-Nash equilibrium.
Theorem 5.4. Let ξ ∈ (0, 1/2). There exists a constant C > 0, independent of N, such that the N-tuple Ā defined above is an ε-Nash equilibrium, for an explicit ε converging to 0 as N → ∞. In addition, a related estimate on the empirical measures holds. The proof of the theorem can be found at the end of Subsection 5.3, which contains technical intermediate lemmas; they rely on the following result.
Proof. All constants C in the proof are independent of U. Recall the definition of π^i_t, introduced in the proof of Lemma 5.6. We prove by backward induction that for any t ∈ T̄, there exists C > 0 such that for any U ∈ L^1_T,
π^i_t(|U|) ≤ C E[ |U| | F_t ],  a.s.
The claim is trivial for t = T. Let t ∈ T. Assume that the claim holds true for t + 1. We first observe that for any U ∈ L^1_{t+1}, a.s.,
ρ^i_t(|U|) ≤ C E[ |U| | F_t ],   (5.9)
as a direct consequence of Assumption 1.2. It follows with the monotonicity of ρ^i_t that the claim holds at time t. Similarly, we prove that π^i_t(|U|) ≥ (1/C) E[ |U| | F_t ] a.s. Recalling that ρ^i(U) = E[π_0(U)], we finally obtain (5.8).
The following lemma is an estimate of the second-order moment of suboptimal controls (for problem (5.2)).
Lemma 5.9. There exists C > 0 such that for any i ∈ N, if Â^i satisfies (5.10), then E[ Σ_{t=0}^{T−1} |Â^i_t|^2 ] ≤ C.
Proof. Let i ∈ N and let Â^i satisfy (5.10). All constants C in the proof are independent of Â^i. We first bound J^{i,N}(Â^i, Ā^{−i}) from above, which yields (5.11). We now need to bound J^{i,N}(Â^i, Ā^{−i}) from below. We obtain, by using successively Lemma 5.8, Assumptions 1.4 (i) and (iv), and Young's inequality, the lower bound (5.12). We then deduce from (5.11) and (5.12) that E[ Σ_{t=0}^{T−1} |Â^i_t|^2 ] ≤ C, which concludes the proof.
In the following we fix a constant c > 0 such that the result of Lemma 5.9 holds and such that Ā^i ∈ A^c for any i ∈ N. Let b and b′ be in B_2 and let (t, t′, x) ∈ T × T̄ × R^d. In the following lemma we study the convergence of the empirical belief to the reference belief b̄ ∈ B_2.
Lemma 5.11. There exists C > 0 such that for any i ∈ N and for any A^i ∈ A^c, the estimate (5.13) holds.

Conclusion
This paper has studied a mean field game model with risk averse agents and provided a framework under which an equilibrium exists, for general composite risk measures and congestion terms. The specific structure of the integral cost of the agents has been exploited in order to rewrite the dynamic programming equations in a functional form (using the Moreau envelope and the proximal operator). In that way, the coupled system could be formulated as an equivalent fixed point equation, yielding the existence of a solution. Regularity properties have been obtained for risk averse agents. This has allowed us to show that an optimal feedback control (for the mean field game) results in an ε-Nash equilibrium for a related dynamic game with N players. Future work could focus on the uniqueness of the Nash equilibrium, with contraction arguments and smallness assumptions on the coupling terms. In this work, we have considered agents that are risk averse with respect to their own noise; investigating a mean field game model with common noise and risk averse agents would be of particular interest.