DYNAMIC OPTIMIZATION PROBLEMS FOR MEAN-FIELD STOCHASTIC LARGE-POPULATION SYSTEMS

This paper considers dynamic optimization problems for a class of control average mean-field stochastic large-population systems. For each agent, the state system is governed by a linear mean-field stochastic differential equation with individual noise and common noise, and the weight coefficients in the corresponding cost functional can be indefinite. The decentralized optimal strategies are characterized by a stochastic Hamiltonian system, which consists of an algebraic equation and a mean-field forward-backward stochastic differential equation. By the decoupling method, the feedback representation of the decentralized optimal strategies is further obtained through two Riccati equations. The solvability of the stochastic Hamiltonian system and of the Riccati equations under the indefinite condition is also derived. The explicit structure of the control average limit and the related mean-field Nash certainty equivalence equation systems are obtained by some separation techniques. Moreover, the decentralized optimal strategies are proved to satisfy the approximate Nash equilibrium property. The theoretical results are illustrated by a practical example from the engineering field.


Introduction
In recent years, the modeling and analysis of dynamic systems with a large number of agents, also known as large-population systems or large-scale systems, has gained intensive and consistent attention from various fields, including social science, operational research, mathematical finance, economics and engineering. The agents in large-population systems are individually negligible at the microscopic level, but their collective behaviors are significant at the macroscopic level and cannot be ignored. The most prominent feature of large-population systems is that there exists a coupling structure in the state system and cost functional. Since and cost functional, which can vary with respect to admissible controls. Furthermore, it will be proved later that the decentralized optimal strategies satisfy the $\varepsilon$-Nash equilibrium property, while the counterpart in mean-field type control results in a franchise equilibrium; see Carmona and Delarue [6] for more details.
To guarantee the well-posedness of the problems, all the LQ MFGs and mean-field type control problems mentioned above assume a standard positive definite condition. Different from the deterministic LQ problem, the state and control weight coefficients in the cost functional of the stochastic LQ setting are allowed to be indefinite. This interesting and surprising phenomenon quickly attracted widespread attention. Chen, Li, and Zhou [7] were among the first to use the Riccati equation to study a class of indefinite stochastic LQ problems. Since then, the solvability of the Riccati equation has been the core difficulty in the study of indefinite problems; see, for example, Qian and Zhou [31] and Huang and Yu [20]. A new method called the equivalent cost functional was proposed by Yu [35] to study an indefinite stochastic LQ problem with random coefficients. Li, Li, and Yu [28] proposed the relaxed compensator to solve indefinite mean-field LQ stochastic optimal control problems directly. The above studies have enriched the development of indefinite stochastic LQ control problems. However, to the best of our knowledge, there exists little work related to indefinite MFG problems.
This paper first studies indefinite large-population problems for a class of mean-field stochastic systems, where a large number of agents are coupled via the control average. Here, each agent is not only affected by its own noise but also disturbed by the common noise, which describes some exogenous factors that are identical for all agents. Moreover, the weight coefficients for state and control in the cost functional can be indefinite. In fact, the co-existence of the mean-field term and the control average structure in our large-population problems results in a highly coupled system, which is difficult to decouple. To make matters worse, the weight matrices of the cost functional in our setting are allowed to be indefinite. For these reasons, we need to develop new methods to solve our problems.
With the help of the MFG approach, the decentralized strategies for the control average mean-field stochastic large-population problem are derived. Firstly, we introduce a limiting problem, which is an indefinite LQ optimal control problem for a stochastic mean-field system with nonhomogeneous terms. Secondly, the decentralized optimal strategies are characterized by the corresponding stochastic Hamiltonian system, which consists of an algebraic equation and a mean-field forward-backward stochastic differential equation (MF-FBSDE). By the decoupling method, the feedback representation of the decentralized optimal strategies is further obtained through two Riccati equations. The solvability of the stochastic Hamiltonian system and of the Riccati equations under the indefinite condition is also derived. Thirdly, the explicit structure of the control average limit and the corresponding mean-field NCE equation systems are obtained by some separation techniques. Moreover, the decentralized optimal strategies are verified to satisfy the $\varepsilon$-Nash equilibrium property. For illustration, an example stemming from the engineering field is further discussed.
In most of the existing literature, the dynamic optimization of a large-population system is formulated with state average dynamics and a cost functional without mean-field terms. Besides, the positive definite assumption is compulsory. However, many realistic situations (e.g., performance evaluation and decision-making problems of some financial models, cooperative control problems of unmanned aerial vehicles) suggest that the problem should be formulated in a general control average mean-field setup, where the weight coefficients in the cost functional can be indefinite. Motivated by these observations, a new class of control average large-population problems for mean-field stochastic systems without the positive definite condition is considered in this paper. By introducing a relaxed compensator, the existence and uniqueness of solutions to the MF-FBSDE and the Riccati equations under the indefinite condition are discussed, and thus the decentralized optimal strategies are designed in feedback form. Distinguished from the conventional state average large-population setup, some separation techniques are proposed to determine the explicit structure of the frozen control average limit and the corresponding mean-field NCE equation systems.
The rest of this paper is organized as follows. Section 2 gives some basic notations and preliminaries and formulates the large-population problem of mean-field type. Section 3 studies the corresponding limiting control problem and establishes the characterization of the decentralized optimal strategies by the stochastic Hamiltonian system and the Riccati equations under the indefinite condition. The structure of the control average limit and the corresponding mean-field NCE equation systems are also discussed. Section 4 proves the $\varepsilon$-Nash equilibrium property. In Section 5, an illustrative example is discussed and the decentralized optimal strategies are solved explicitly. Section 6 concludes this work.

Preliminaries and problem formulation
We consider a large-population system with $N$ individual agents $\{\mathcal{A}_i\}_{1 \le i \le N}$. For a fixed time $T > 0$, let $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ be a complete filtered probability space on which a standard $(d + N)$-dimensional Brownian motion $\{W^0_t, W^i_t, 1 \le i \le N\}_{0 \le t \le T}$ is defined, and the filtration $\mathbb{F} = \{\mathcal{F}_t, 0 \le t \le T\}$ is assumed to be the natural filtration of $\{W^0, W^i, 1 \le i \le N\}$ augmented by all $\mathbb{P}$-null sets $\mathcal{N}$. Here, $W^0$ signifies the common noise, which is identical for all agents. For $1 \le i \le N$, $W^i$ denotes the noise of the $i$th agent only, which varies from agent to agent. For brevity in the following sections, we let $\langle \cdot, \cdot \rangle$ denote the inner product, and the superscript $\top$ represents the transpose of a matrix or vector. $\mathbb{S}^d$ stands for the set of all $d \times d$ symmetric matrices and $\mathbb{S}^d_+$ denotes the set of positive semidefinite matrices in $\mathbb{S}^d$. $I$ denotes the identity matrix with appropriate dimension. The following spaces will be used throughout the paper.
We consider a mean-field stochastic large-population system with $N$ individual agents $\{\mathcal{A}_i\}_{1 \le i \le N}$. The state dynamics of $\mathcal{A}_i$ is governed by the MF-SDE (2.1), where $a \in \mathbb{R}^n$ is the initial state, and $x^i_\cdot$ and $u^i_\cdot$ are the state process and control process, respectively. For any $t \in [0, T]$, the average of the agents' controls at time $t$ is the control average term. Now, we specify the strategy set of the large-population system.
Then the cost functional of $\mathcal{A}_i$ is given by (2.2), which is well-defined.
Remark 2.1. The motivations for studying the control average mean-field large-population problem are the following. In various financial and engineering problems, the mean-field large-population problem has been extensively studied for two reasons. On one hand, the MF-SDE can be used to describe particle systems at the microscopic level, which is valuable in some applications. On the other hand, the given agent hopes that the optimal state process and/or control process is not too sensitive to possible variations of the random events. To achieve this, one may keep the variances of the state and/or control small; then the problem consisting of (2.1) and (2.2) is actually a mean-field large-population problem. Another motivation is that in some decision-making problems, the input or control of a given agent has an immediate and transient impact on its own state and on the states of others, and thus the control average term arises.
Now, we formulate dynamic optimization problems for mean-field large-population (MFL) systems.
Problem (MFL). For $1 \le i \le N$, find a strategy set $\bar u_\cdot = (\bar u^1_\cdot, \ldots, \bar u^N_\cdot)$ such that $\mathcal{J}_i(\bar u^i_\cdot, \bar u^{-i}_\cdot) = \inf_{u^i_\cdot \in \mathcal{U}^c_i} \mathcal{J}_i(u^i_\cdot, \bar u^{-i}_\cdot)$, where $\bar u^{-i}_\cdot = (\bar u^1_\cdot, \ldots, \bar u^{i-1}_\cdot, \bar u^{i+1}_\cdot, \ldots, \bar u^N_\cdot)$.
In particular, the control strategy $\bar u_\cdot$ is the so-called Nash equilibrium for Problem (MFL). For the sake of comparison, we also present the definition of the $\varepsilon$-Nash equilibrium, which will be applied in a later section; for more details, one can refer to Carmona and Delarue [6] and Brezis [1].
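For $\bar u^i_\cdot \in \mathcal{U}^c_i$, $1 \le i \le N$, the strategy set $\bar u_\cdot = (\bar u^1_\cdot, \ldots, \bar u^N_\cdot)$ is called an $\varepsilon$-Nash equilibrium with respect to the costs $\mathcal{J}_i$ if there exists an $\varepsilon = \varepsilon(N) \ge 0$ with $\lim_{N \to \infty} \varepsilon(N) = 0$ such that $\mathcal{J}_i(\bar u^i_\cdot, \bar u^{-i}_\cdot) \le \mathcal{J}_i(u^i_\cdot, \bar u^{-i}_\cdot) + \varepsilon$, where any alternative strategy $u^i_\cdot \in \mathcal{U}^c_i$ is applied by $\mathcal{A}_i$. Obviously, if $\varepsilon = 0$ in the above definition, it reduces to the exact Nash equilibrium.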

Mean-field Nash certainty equivalence equation systems
Due to the highly complex interactions among peers, it is neither implementable nor efficient for a given agent to collect global information on all other agents in the framework of noncooperative games. Consequently, centralized strategies based on global information are intractable to realize. An alternative is to determine an approximate equilibrium depending on local information only, which is known as the decentralized strategies. In this section, we study the limiting problem, which is an indefinite LQ optimal control problem for a stochastic mean-field system with nonhomogeneous terms. By introducing a relaxed compensator, we first prove the solvability of the stochastic Hamiltonian system and of the Riccati equations under the indefinite condition. Then the decentralized optimal strategies are designed in feedback form. The explicit structure of the control average limit and the corresponding mean-field NCE equation systems are also obtained by some separation techniques.
As the agent number $N$ tends to infinity, we suppose that the control average is approximated by $m_\cdot$. Some subtle analysis for $m_\cdot$ will be given later. Then the limiting state of $\mathcal{A}_i$ is governed by (3.1), and the limiting cost functional becomes (3.2).
Remark 3.1. One should pay attention to distinguishing the two cost functionals. The cost functional $\mathcal{J}_i(u^i_\cdot, u^{-i}_\cdot)$ of Problem (MFL) depends on the controls of all agents due to the coupling structure in the state equation. By contrast, $J_i(u^i_\cdot)$ only involves the $i$th agent and $m_\cdot$.
Now, the limiting mean-field large-population problem (LMFL) can be introduced as follows.
Problem (LMFL). For $1 \le i \le N$, find $\bar u^i_\cdot \in \mathcal{U}^d_i$ such that $J_i(\bar u^i_\cdot) = \inf_{u^i_\cdot \in \mathcal{U}^d_i} J_i(u^i_\cdot)$. Then $\bar u^i_\cdot$ is called the decentralized optimal strategy and $\bar x^i_\cdot$ is called the corresponding decentralized optimal state trajectory with respect to $\bar u^i_\cdot$. Moreover, $(\bar x^i_\cdot, \bar u^i_\cdot)$ is called the decentralized optimal pair for Problem (LMFL).
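As a rough numerical illustration of why the limit $m_\cdot$ of an average over agents can only depend on the common noise, the following sketch may help. It is not the model of this paper: the scalar quantities $z_i = \sigma W^i_T + \sigma_0 W^0_T$, the function name and all parameter values are assumptions made for illustration. The variance of the average over $N$ agents is $\sigma^2 T / N + \sigma_0^2 T$, so the individual noise averages out while the common noise persists.

```python
import numpy as np


def average_variance(n_agents, sigma=1.0, sigma0=0.5, horizon=1.0,
                     n_paths=5000, seed=0):
    """Empirical and theoretical variance of the population average.

    Hypothetical toy quantity (not the paper's state equation):
        z_i = sigma * W^i_T + sigma0 * W^0_T,  i = 1, ..., n_agents.
    """
    rng = np.random.default_rng(seed)
    # One common shock per scenario, shared by all agents.
    w0 = rng.normal(0.0, np.sqrt(horizon), size=(n_paths, 1))
    # Independent individual shocks, one per agent and scenario.
    wi = rng.normal(0.0, np.sqrt(horizon), size=(n_paths, n_agents))
    z = sigma * wi + sigma0 * w0
    empirical = float(z.mean(axis=1).var())
    theoretical = sigma**2 * horizon / n_agents + sigma0**2 * horizon
    return empirical, theoretical


if __name__ == "__main__":
    for n in (1, 20, 400):
        emp, theo = average_variance(n)
        print(f"N={n:4d}  empirical={emp:.4f}  theoretical={theo:.4f}")
```

As $N$ grows, the empirical variance settles near $\sigma_0^2 T$ instead of zero, which matches the intuition that the frozen limit is adapted to the filtration generated by $W^0$.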
To reveal the essence of the problem in more depth, we give another version of (3.1) and (3.2), similar to Yong [34]. To ease the presentation, we introduce some notations, in which 0 denotes zero matrices with appropriate dimensions.
After taking expectation on both sides of (3.1), we obtain (3.3); the difference between (3.1) and (3.3) then reads (3.4). With these notations, the cost functional (3.2) can be rewritten as (3.5). Now, we introduce the following positive definite (PD) assumption: Assumption (PD).
If Assumption (PD) holds, it is easy to verify that Problem (LMFL) is well-posed. Inspired by the results in Li, Li, and Yu [28], we are interested in studying Problem (LMFL) under the indefinite condition in this paper. It should be emphasized that the relaxed compensator plays a key role in this process.
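To make the indefinite phenomenon concrete, the following sketch treats a classical scalar stochastic LQ problem in the spirit of Chen, Li, and Zhou [7]. It is not the mean-field system or the Riccati equations of this paper. It assumes the dynamics $dx = (Ax + Bu)\,dt + (Cx + Du)\,dW$ with cost $E[\int_0^T (Qx^2 + Ru^2)\,dt + Gx_T^2]$, whose standard Riccati equation is $\dot P + (2A + C^2)P + Q - (B + CD)^2 P^2 / (R + D^2 P) = 0$, $P(T) = G$. All coefficient values and the function name are hypothetical. With $R = -0.25 < 0$, the quantity $R + D^2 P$ still stays positive, so the problem remains well-posed.

```python
import numpy as np


def solve_scalar_riccati(A, B, C, D, Q, R, G, horizon, n_steps=2000):
    """Integrate the scalar stochastic LQ Riccati equation backward in time.

    Uses the time-to-go variable s = T - t, so that
        dP/ds = Q + (2A + C^2) P - (B + C D)^2 P^2 / (R + D^2 P),  P(s=0) = G,
    and a classical fourth-order Runge-Kutta scheme.
    Returns P on a uniform grid ordered from t = 0 to t = T.
    """
    def rhs(p):
        denom = R + D * D * p
        if denom <= 0.0:
            raise ValueError("R + D^2 P must stay positive")
        return Q + (2.0 * A + C * C) * p - ((B + C * D) * p) ** 2 / denom

    h = horizon / n_steps
    p = np.empty(n_steps + 1)
    p[0] = G  # s = 0 corresponds to the terminal time t = T
    for k in range(n_steps):
        k1 = rhs(p[k])
        k2 = rhs(p[k] + 0.5 * h * k1)
        k3 = rhs(p[k] + 0.5 * h * k2)
        k4 = rhs(p[k] + h * k3)
        p[k + 1] = p[k] + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p[::-1]  # reverse so that index 0 is t = 0


if __name__ == "__main__":
    R = -0.25  # negative control weight (indefinite case)
    P = solve_scalar_riccati(A=0.0, B=1.0, C=0.0, D=1.0,
                             Q=1.0, R=R, G=1.0, horizon=2.0)
    print("P(0) =", P[0], " min of R + D^2 P =", (R + P).min())
```

For these assumed coefficients the right-hand side reduces to $-(P - 0.5)^2 / (P - 0.25)$ in the time-to-go variable, so $P$ decreases from $G = 1$ toward $0.5$ and never reaches the singular value $0.25$.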
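To start with, we define the space $\Upsilon[0, T]$. For any given $(H_\cdot, K_\cdot) \in \Upsilon[0, T] \times \Upsilon[0, T]$, we introduce the transformed weight coefficients and the associated cost functional $J_i^{H,K}(u^i_\cdot)$. For further study, we introduce an auxiliary problem with $H$ and $K$, denoted by Problem (LMFL)$^{H,K}$. The following lemma shows that $J_i(u^i_\cdot)$ and $J_i^{H,K}(u^i_\cdot)$ are equivalent.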
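Lemma 3.2. For any given quantity $m_\cdot \in L^2_{\mathbb{F}^0}(0, T; \mathbb{R}^n)$ and any $(H_\cdot, K_\cdot) \in \Upsilon[0, T] \times \Upsilon[0, T]$, the cost functionals $J_i(u^i_\cdot)$ and $J_i^{H,K}(u^i_\cdot)$ are equivalent.
Proof. Applying Itô's formula and then taking the integral and expectation on both sides, we obtain a relationship in which $H_\cdot$, $\sigma_\cdot$ and $\sigma^0_\cdot$ are deterministic functions. Combining this relationship with (3.5) and (3.6), we obtain the desired result.
Definition 3.3. If there exists a pair of functions $(H_\cdot, K_\cdot)$ such that Assumption (PD) is satisfied by the transformed coefficients, then $(H_\cdot, K_\cdot)$ is called a relaxed compensator of Problem (LMFL).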
It is clear that if there exists a relaxed compensator, Problem (LMFL) is well-posed. Next, we provide the open-loop decentralized optimal strategies of Problem (LMFL) under the indefinite condition.
Theorem 3.4. For any given quantity $m_\cdot \in L^2_{\mathbb{F}^0}(0, T; \mathbb{R}^n)$, if there exists a relaxed compensator $(H_\cdot, K_\cdot) \in \Upsilon[0, T] \times \Upsilon[0, T]$, then the stochastic Hamiltonian system (3.7), which consists of an algebraic equation and an MF-FBSDE, admits a unique solution, and this solution gives the unique decentralized optimal pair of Problem (LMFL).
Proof. Since $(H_\cdot, K_\cdot)$ is a relaxed compensator, the transformed coefficients satisfy Assumption (PD). Then by Theorem 3.4 in [33], the stochastic Hamiltonian system (3.8) linked to Problem (LMFL)$^{H,K}$ admits a unique solution, and the corresponding pair $(\bar x^{i,H,K}_\cdot, \bar u^{i,H,K}_\cdot)$ is the unique decentralized optimal pair of Problem (LMFL)$^{H,K}$.
We next prove the solvability equivalence between (3.7) and (3.8). In fact, a solution to (3.8) can be transformed into a solution to (3.7). On the other hand, since the above relationships are invertible, the inverse transformation of a solution to (3.7) yields a solution to (3.8).
Therefore, due to the existence and uniqueness of the solution to (3.8) and the solvability equivalence between (3.7) and (3.8), we conclude that (3.7) admits a unique solution. Moreover, by Lemma 3.2, the unique decentralized optimal pair $(\bar x^i_\cdot, \bar u^i_\cdot)$ of Problem (LMFL)$^{H,K}$ is also the unique decentralized optimal pair of Problem (LMFL), which completes the proof.
Note that the decentralized optimal strategies are characterized by the unique solution of (3.7). In what follows, we focus on the feedback representation of the decentralized strategies.
Theorem 3.5. For any given quantity $m_\cdot \in L^2_{\mathbb{F}^0}(0, T; \mathbb{R}^n)$, if there exists a relaxed compensator $(H_\cdot, K_\cdot) \in \Upsilon[0, T] \times \Upsilon[0, T]$, then the associated system of Riccati equations admits a unique solution. Moreover, Problem (LMFL) admits a unique feedback representation of the decentralized optimal strategies, where the decentralized optimal state trajectory $\bar x^i_\cdot$ is determined by an MF-SDE.
Proof. Since $(H_\cdot, K_\cdot)$ is a relaxed compensator, the transformed coefficients satisfy Assumption (PD). Inspired by Yong [34] and starting from (3.8), we decouple the adjoint process $\bar y^{i,H,K}_\cdot$ from the state, which leads to the system of Riccati equations and to its unique solution.
In the following, we investigate the frozen control average limit and the related mean-field NCE equation systems. Firstly, the decentralized optimal state trajectory $\bar x^i_\cdot$ can be rewritten as (3.20). For further study, we introduce the two equations (3.21) and (3.22), where $a = a_1 + a_2$. Note that (3.11) can be rewritten in terms of expectations such as $E[\Phi_\cdot]$, which are all deterministic functions. Based on these facts, we conclude that the individual noise $W^i$ and the common noise $W^0$ are completely separated.
In addition, since the state $\bar x^i_\cdot$ satisfies (3.20), by comparing the three equations (3.20)-(3.22), it is easy to verify that $\bar x^i_\cdot$ is the sum of the solutions to (3.21) and (3.22).
To start with, we introduce the following ordinary differential equation (ODE)
and $\bar x^i_\cdot$ as the decentralized one with respect to $\bar u^i_\cdot$ in (3.1), which follows
For simplicity, we suppose that $C_0$ is a positive constant which may vary from line to line. Firstly, we have the following results about the estimates of two different states.
Proof. By (4.1) and (4.2), we have
According to (3.12) and Proposition 3.6, we obtain
Moreover, we can verify
Noting that for $1 \le j, k \le N$, $j \ne k$, $\bar x^j_t$ and $\bar x^k_t$ are independent identically distributed under
Based on these properties, we can check
$< \infty$, from (4.6) and (4.7), we have
Combining (4.5) with (4.8), we get
Proof. Recalling (2.2) and (3.2), we have
Now, we provide the main results in this section.
Proof. By the optimality of $\bar u^i_\cdot$, it follows that $J_i(\bar u^i_\cdot) \le J_i(u^i_\cdot)$ for any alternative $u^i_\cdot$. Based on (4.9) and (4.15), we obtain the desired inequality, which completes the proof with $\varepsilon = O(1/\sqrt{N})$.
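The order $O(1/\sqrt{N})$ is the usual mean-square rate at which an average of independent, identically distributed terms approaches its mean. The following sketch checks this rate numerically under simplifying assumptions: the samples are standard Gaussian stand-ins rather than the decentralized states of this paper, and the function names and agent counts are hypothetical.

```python
import numpy as np


def rms_error_of_average(n_agents, n_paths=2000, seed=0):
    """Root-mean-square distance between the average of n_agents i.i.d.
    mean-zero, unit-variance samples and their true mean (zero)."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(0.0, 1.0, size=(n_paths, n_agents))
    avg = samples.mean(axis=1)
    return float(np.sqrt(np.mean(avg ** 2)))


def fitted_rate(agent_counts=(25, 100, 400, 1600)):
    """Least-squares slope of log(error) against log(N); a value close
    to -0.5 corresponds to an O(1/sqrt(N)) rate."""
    errors = [rms_error_of_average(n) for n in agent_counts]
    slope, _ = np.polyfit(np.log(agent_counts), np.log(errors), 1)
    return float(slope)


if __name__ == "__main__":
    print("fitted exponent:", fitted_rate())
```

A fitted exponent near $-0.5$ is consistent with the rate in the proof above; the sketch says nothing about the constants appearing in the estimates (4.9) and (4.15).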

Example and simulation
In recent years, unmanned aerial vehicles (UAVs) have received extensive attention for their high mobility and low cost in military and civilian domains, with typical applications such as weather monitoring, forest fire detection, traffic evacuation, cargo transport, emergency search and rescue, and communication relaying. In particular, UAVs are considered a promising solution for handling complex communication scenarios, since they can be used as airborne mobile base stations to enhance terrestrial wireless communication systems. Because the serviced ground users are scattered across a wide range, ultra-dense UAVs need to be deployed for better communication quality, which means that the number of UAVs is large. However, the communication performance is severely limited by the poor battery life of UAVs. Therefore, how to control the energy consumption of the transmit power is a challenging but appealing topic in sophisticated communication scenarios. In this section, the energy control problem for a large number of UAV air-to-ground wireless communication systems is studied.
We consider an ultra-dense UAV wireless communication system composed of $N$ autonomous drones. In the system, each UAV acts as a base station conducting data transmission from air to ground. We also assume that all the UAVs are randomly distributed and share an identical channel at the same time. Although a single drone may be requested by multiple ground users, we suppose that it can only serve one user at a fixed time. As shown in Figure 1, user $i$, which is served by UAV $i$, can be influenced by other UAVs. Moreover, UAV $i$ may also disturb the signal transmission for other users. For this reason, there are complicated interactions among the dynamics of the $N$ drones. For a given time $T$, since the controls are affected by other UAVs, we consider a large-population system made up of $N$ UAVs in which there exists a coupling structure among peers. In the state equation of the $i$th UAV, $x^i_t$ is the emission energy state of the $i$th UAV at time $t$, and $u^i_t$ is the transmit power level, which varies from one UAV to another and is the input affecting the energy state. $E[x^i_\cdot]$ characterizes the average energy state of the $i$th UAV. The control average term indicates that the $i$th UAV is interfered with by its peers. Moreover, $a$, $\tilde a$, $b$, $\tilde b$ are the weight coefficients, which are uniformly bounded deterministic functions. Apart from the individual random noise $W^i$, the dynamics of the emission energy is also influenced by the common noise $W^0$, such as the temperature of the region, cloud cover and so on. Thus, we take these external factors into account in our controlled system.
In fact, the energy storage of each UAV is limited. For better communication performance, we assume that each UAV needs to choose a transmit power level to minimize the following cost functional, where $\eta_t \ge 0$ and $r_t > 0$ are uniformly bounded deterministic functions. We notice that the integrand consists of two parts: the first is the least-squares criterion on the energy state, which indicates that the current energy of the UAV should not deviate from the average level in order to maintain the basic performance of the UAV; the second is the running cost of the signal transmission process.
In the simulation, the parameters are $\sigma_0 = 5$, $\eta = 0.8$, $r = 1$, $a_i = 2, 4, 6, \ldots, 40$, $\sigma_i = 0.5, 1, 1.5, \ldots, 10$. Figure 2 presents the decentralized optimal states of different UAVs. We can observe that the overall trend of the different UAVs' states is consistent; the difference between them mainly comes from the different initial values $a_i$ and volatilities $\sigma_i$. Figure 3 shows the corresponding decentralized optimal control strategies of the 20 UAVs. The control strategies of the 20 UAVs fluctuate around zero, which is consistent with our theoretical results. In fact, by (5.7), we have $E[\bar u^i_\cdot] = 0$ for $1 \le i \le N$.
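For readers who wish to produce trajectories of this kind, the following Euler-Maruyama sketch shows how paths driven by both individual and common noise can be generated. It is only an illustration under stated assumptions: the drift $a x^i_t + \tilde a E[x^i_t]$ with constant $a = -0.5$, $\tilde a = 0.2$, the absence of a control term, the horizon and the step count are all hypothetical and are not the state equation of this section. Only the values of $\sigma_0$, $a_i$ and $\sigma_i$ in the usage lines are taken from the text.

```python
import numpy as np


def simulate_states(x0, sigma, sigma0, a=-0.5, a_tilde=0.2,
                    horizon=1.0, n_steps=200, seed=0):
    """Euler-Maruyama paths of the hypothetical uncontrolled dynamics
        dx_i = (a x_i + a_tilde E[x_i]) dt + sigma_i dW^i + sigma0 dW^0.

    Returns (x, mean), both of shape (n_steps + 1, n_agents), where mean
    is the Euler approximation of E[x_i], solving dm = (a + a_tilde) m dt.
    """
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dt = horizon / n_steps
    x = np.empty((n_steps + 1, x0.size))
    mean = np.empty_like(x)
    x[0] = x0
    mean[0] = x0  # deterministic initial states, so E[x_0] = x_0
    for k in range(n_steps):
        dw0 = rng.normal(0.0, np.sqrt(dt))            # shock shared by all agents
        dwi = rng.normal(0.0, np.sqrt(dt), x0.size)   # one shock per agent
        drift = a * x[k] + a_tilde * mean[k]
        x[k + 1] = x[k] + drift * dt + sigma * dwi + sigma0 * dw0
        mean[k + 1] = mean[k] + (a + a_tilde) * mean[k] * dt
    return x, mean


if __name__ == "__main__":
    initial_states = np.arange(2, 41, 2)       # a_i = 2, 4, ..., 40
    volatilities = np.linspace(0.5, 10, 20)    # sigma_i = 0.5, 1, ..., 10
    paths, means = simulate_states(initial_states, volatilities, sigma0=5.0)
    print(paths.shape, paths[-1][:3])
```

Because every agent receives the same increment of $W^0$ at each step, the simulated paths share a common trend, while the spread between them is driven by the initial values and the individual volatilities.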

Conclusions
In this paper, a class of indefinite control average mean-field large-population problems has been studied. The solvability of the stochastic Hamiltonian system and of the Riccati equations under the indefinite condition has been proved. The decentralized strategies in open-loop form and feedback form have been obtained. The explicit structure of the control average limit and the mean-field NCE equation systems have been determined by some separation techniques. Moreover, the decentralized optimal strategies have been verified to satisfy the $\varepsilon$-Nash equilibrium property. For illustration, a practical example from the engineering field has been solved by the theoretical results.

We emphasize that (3.21) is driven by the individual noise $W^i$ while (3.22) is driven by the common noise $W^0$. Equation (3.22) includes the frozen limit term while (3.21) does not. Moreover, the diffusion term of (3.21) only depends on E[x 2
The state system of the $i$th UAV is governed by the following MF-SDE