EXTENDED MEAN-FIELD CONTROL PROBLEM WITH PARTIAL OBSERVATION

Abstract. We study an extended mean-field control problem with partial observation, where the dynamics of the state are given by a forward-backward stochastic differential equation of McKean-Vlasov type. The cost functional, the state and the observation all depend on the joint distribution of the state and the control process. Our problem is motivated by the recently popular subject of mean-field games and the related control problems of McKean-Vlasov type. We first establish a necessary condition for optimality in the form of Pontryagin's maximum principle. Then a verification theorem is obtained for optimal controls under suitable convexity conditions on the Hamiltonian function. The results are also applied to a linear-quadratic mean-field control problem with scalar interaction.


Introduction
The stochastic differential equations (SDEs) of McKean-Vlasov type were introduced by Kac [23] in 1956 as a stochastic model for the Vlasov kinetic equation of plasma. In recent years, mean-field games have become a very popular subject since the pioneering work of Lasry and Lions [24, 25, 26] and, simultaneously, of Caines, Huang and Malhamé [20]. Since then, research on mean-field models has found wide applications in many fields such as finance and economics. The related McKean-Vlasov type stochastic control problems have attracted the attention of many researchers; see for example Andersson and Djehiche [3], Buckdahn, Djehiche and Li [10], Carmona and Delarue [16], Li and Liu [29], Meyer-Brandis et al. [33], Shen et al. [40], Tembine et al. [42] and the references therein. The reader is referred to the monographs of Carmona and Delarue [17] and Bensoussan et al. [6] for an overview of McKean-Vlasov type control problems.
The above-mentioned literature on McKean-Vlasov type stochastic control problems assumes that the stochastic noises are completely observed. In the real world, however, controllers usually have access only to partial information. Stochastic control problems with partial observation have therefore been studied extensively; see e.g. Bensoussan [4], Tang [41], Wu [48], Xiong [49] as well as Wang, Wu and Xiong [45, 46]. We refer the reader to Caines and Kizilkale [14], Huang, Caines and Malhamé [21], Şen and Caines [38, 39] for the investigation of mean-field games with partial observation. Concerning mean-field control problems with partial observation, the reader is referred to Hafayed, Abbas and Abba [19], Li and Fu [28], Ma and Liu [31], Wang et al. [43], Wang et al. [47] and the references therein. These papers all focus on mean-field interaction of scalar type. Recently, Buckdahn et al. [12] studied mean-field non-Markovian stochastic optimal control problems with partial observation, where the coefficients depend on the conditional law of the state. Moreover, in the continuation of that work [9], Buckdahn, Chen and Li introduced the partial derivative with respect to the measure and considered a general mean-field stochastic optimal control problem with partial observation, in which no regularity of the coefficients is required either in the control variable or with respect to the law of the control process.
To the best of our knowledge, the control problem for partially observed forward-backward stochastic differential equations (FBSDEs) of mean-field type is quite a new topic, and only some special cases have been solved. For example, Li and Liu [29] considered an optimal control problem for fully coupled FBSDEs of mean-field type but without partial observation; Meherrem and Hafayed [32] studied a stochastic optimal control problem for general McKean-Vlasov type FBSDEs driven by Teugels martingales, associated with some Lévy process having moments of all orders, and an independent Brownian motion. Ma and Liu [31] introduced a linear-quadratic optimal control problem for partially observed FBSDEs of mean-field type; Wang et al. [43] investigated an optimal control problem driven by a mean-field FBSDE with noisy observation, where the drift coefficients of the state equation and the observation equation are linear with respect to the state and its expectation. Partially observed optimal control problems for FBSDEs with scalar-type mean-field interaction were studied by Li and Fu [28].
We mention that, except for the paper by Buckdahn et al. [9], in the above works the mean-field interaction enters only through the distribution of the state. However, in many practical applications it is necessary to study the extended case where the interaction enters through the joint distribution of state and control. For example, motivated by problems involving the minimization of variance, Yong [50] studied linear-quadratic optimal control problems for mean-field SDEs in which both the state and the cost functional are allowed to depend on the expected value of the control; a feedback optimal control is then obtained through two Riccati equations. Motivated by applications in economics such as the production of an exhaustible resource, Graber [18] extended the work of Yong [50] to linear-quadratic control problems for mean-field SDEs with common noise involving the expected value of the control. Along this direction, Li et al. [30] investigated linear-quadratic control problems for mean-field backward stochastic differential equations (BSDEs). Let us mention that the works of Graber [18], Li et al. [30] and Yong [50] all focus on linear-quadratic problems involving the expected value of the control, but not the distribution of the control. The case with nonlinear dynamics and dependence on the joint distribution of state and control was recently studied by Acciaio, Backhoff-Veraguas and Carmona [1] and Pham and Wei [37]. In fact, Pham and Wei [37] studied closed-loop feedback controls for such mean-field SDEs through the dynamic programming principle and the related Bellman equations, and gave applications to mean-variance portfolio selection and a systemic risk model. Without the restriction to closed-loop feedback controls, Acciaio et al. [1] established a stochastic maximum principle for the extended control problem of mean-field SDEs via a probabilistic approach, and also studied the weak formulation and applications to optimal liquidation with market impact.
In view of the wide applications in finance and economics of the above extended mean-field control systems, the purpose of this paper is to establish a maximum principle for the extended mean-field control problem with partial observation, where both the state and the observation depend on the joint distribution of the state and the control process. More precisely, the state dynamics are given by the following mean-field type forward-backward system
\[
\begin{cases}
dx_t = f(t, x_t, v_t, \mathcal{L}(x_t, v_t))\,dt + \sigma(t, x_t, v_t, \mathcal{L}(x_t, v_t))\,dW_t + \tilde{\sigma}(t, x_t, v_t, \mathcal{L}(x_t, v_t))\,d\widetilde{W}^v_t,\\
-dy_t = g(t, x_t, y_t, z_t, \tilde{z}_t, v_t, \mathcal{L}(x_t, y_t, z_t, \tilde{z}_t, v_t))\,dt - z_t\,dW_t - \tilde{z}_t\,dY_t,\\
x_0 = x_0, \qquad y_T = \Phi(x_T, \mathcal{L}(x_T)),
\end{cases}
\tag{1.1}
\]
and Y(•) is the observation process given by
\[
dY_t = h(t, x_t, v_t, \mathcal{L}(x_t, v_t))\,dt + d\widetilde{W}^v_t, \qquad Y_0 = 0,
\tag{1.2}
\]
where v(•) is a control process adapted to the filtration generated by the observation process Y(•). Throughout the paper, for any given process {v_t}_{t∈[0,T]}, we denote it briefly by v(•). In light of nonlinear filtering theory (see [4, 12]), we assume that there exists a reference probability space (Ω, F, F, P) on which (W(•), Y(•)) is a multi-dimensional standard Brownian motion. The symbol \mathcal{L} stands for the law of the given random element under P. By inserting (1.2) into (1.1), we obtain the system (1.3) on the reference probability space.
Here ρ(•) is the density process given by (1.4) below (see also (2.3)), and we define a probability measure P^v by dP^v = ρ_T dP. Under suitable assumptions on h (e.g. h bounded), Girsanov's theorem then shows that the reference space (Ω, F, F, P^v), together with the state and observation processes, provides a weak solution of system (1.1)-(1.2). The associated cost functional J(v(•)) is then defined under P^v as in (1.5) (see e.g. [41, 46] for the case without mean-field terms), where E^v stands for the expectation with respect to (Ω, F, F, P^v). Our partially observed optimal control problem is to seek an admissible control u(•) minimizing J(v(•)) over the admissible set. The aim is to establish Pontryagin's maximum principle and a verification theorem, which give respectively a necessary and a sufficient condition for optimality.
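For orientation, we record a schematic form of the density ρ(•) and of the cost functional; the precise argument lists of h, l and χ are assumptions here, consistent with the references to (1.4)-(1.5) above and with the linear-quadratic case of Section 5:
\[
\rho_t = \exp\Big\{ \int_0^t h(s, x_s, v_s, \mathcal{L}(x_s, v_s))\,dY_s - \frac{1}{2}\int_0^t |h(s, x_s, v_s, \mathcal{L}(x_s, v_s))|^2\,ds \Big\}, \qquad dP^v = \rho_T\,dP,
\]
\[
J(v(\cdot)) = E^v\Big[ \int_0^T l\big(t, \rho_t, x_t, y_t, z_t, \tilde{z}_t, v_t, \mathcal{L}(x_t, y_t, z_t, \tilde{z}_t, v_t)\big)\,dt + \chi\big(\rho_T, x_T, \mathcal{L}(x_T)\big) + \gamma(y_0) \Big].
\]
In particular, ρ(•) solves dρ_t = ρ_t h(t, x_t, v_t, \mathcal{L}(x_t, v_t))\,dY_t with ρ_0 = 1, so that ρ_T is the Girsanov density of P^v with respect to P.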
Let us summarize the difficulties of the above problem and our contributions. In this paper, we study an extended mean-field control problem with partial observation, where the state dynamics are given by an FBSDE of extended McKean-Vlasov type and the state is partially observed via a process whose dynamics are also of extended McKean-Vlasov type. The model of our paper is thus novel: it combines the partial observation structure with the dependence on the joint distribution of the state and the control, which leads to several difficulties. The main difficulties and innovations of this paper are as follows. (I) The first difficulty in obtaining the maximum principle for our problem is the partial observation structure. Inspired by Wang, Wu and Xiong [45, 46], we consider the state and observation as defined on a reference probability space (Ω, F, F, P) (see (1.1) and (1.2)), while the cost functional is defined under the probability measure P^v (see (1.5)). In this kind of model, because of the interdependence of the control and observation processes, we cannot use the classical method to construct the adjoint processes and variational equations.
To overcome this, we adopt the method of Tang [41]. On the one hand, we transform the original partially observed problem into a classical problem on the reference probability space by Girsanov's transformation and a dimensional extension, and then construct new adjoint processes and variational equations for the state and the observation. On the other hand, due to the Girsanov transformation, the coefficients l and χ in the cost functional (1.5) are multiplied by ρ (see (2.8)), which makes higher-order estimates and higher-order convergence results for the variational equations necessary when deriving the variational inequality; see Lemma 3.2, Lemma 3.4 and the proof of Lemma 3.6. (II) The second difficulty is the dependence of the state dynamics, a mean-field FBSDE, on the joint distribution. When we perturb the control u(•), the joint distribution of u(•) and (x(•), y(•), z(•), z̃(•)) changes accordingly, so the classical variational method no longer applies. To handle this, we use the L-derivative with respect to probability measures, and in particular the partial L-derivatives, because of the dependence on the joint distribution. In this way we obtain new adjoint equations and variational equations, which are both mean-field FBSDEs (see (3.11) and (3.13)), and we prove the existence and uniqueness of solutions for the variational and adjoint equations; see Theorem 3.1 and Remark 3.8.
On the other hand, as mentioned in (I), we need higher-order estimates for the variational equations, which together form a mean-field FBSDE; see (3.1). Fortunately, it is possible to obtain higher-order estimates for (3.1) since this mean-field FBSDE is not fully coupled. However, due to the mean-field terms, this requires subtle calculations, especially for the higher-order estimates of the mean-field backward equation; see the explanation in the first paragraph of the proof of Lemma 3.4, Appendix A.3. Indeed, we first establish the L^2 estimates of the variational equations and then use them to obtain the desired L^p estimates. Finally, with the help of the higher-order estimates, we obtain the related variational inequality, which allows us to establish Pontryagin's maximum principle under the reference probability space. (III) The third difficulty is how to obtain the verification theorem. We emphasize that, due to the partial observation structure, when transferring the original control problem to the associated equivalent problem, the coefficients l and χ in the original cost functional (1.5) are multiplied by ρ. As a consequence, the convexity assumptions on the Hamiltonian function cannot be satisfied if l and χ in the original cost functional (1.5) do not depend on ρ; see Remark 4.2 and Remark 5.1. However, if l and χ are allowed to depend on ρ, then the convexity assumptions (see (H.3) and (H.4)) may hold. This is the reason why l and χ depend on ρ in (1.5). Moreover, due to the presence of the joint distribution, we introduce a new convexity assumption on the Hamiltonian function, see (H.4), under which we establish the verification theorem for the extended mean-field control problem. Furthermore, to illustrate the verification theorem, we also give a linear-quadratic example which provides an optimal control.
The organization of this paper is as follows. In Section 2, we formulate the extended mean-field problem with partial observation; we also review some preliminaries about the L-derivative. In Section 3, we establish a new type of Pontryagin stochastic maximum principle. Section 4 provides a verification theorem under the new convexity assumption. Section 5 considers two kinds of examples, a scalar interaction model and a linear-quadratic model. In the appendix, we give the detailed proofs of some lemmas of Section 3.

Formulation of the problem
Let (Ω, F, F, P) be a filtered probability space satisfying the usual conditions, where we denote F = {F_t}_{0≤t≤T}. Suppose that (W(•), Y(•)) is a standard R^m × R^d-valued Brownian motion defined on this reference probability space, where Y(•) is the observation process. Let F^W := {F^W_t}_{0≤t≤T} and F^Y := {F^Y_t}_{0≤t≤T} be the natural filtrations generated by W(•) and Y(•) respectively, augmented by all P-null sets, and let F := {F^{W,Y}_t}_{0≤t≤T} be the natural filtration generated by (W(•), Y(•)), augmented by all P-null sets.
For a given filtration G = {G_t}_{0≤t≤T}, we denote by H^{2,n}_G the space of all R^n-valued, G-progressively measurable processes η(•) on [0, T] such that E∫_0^T |η_t|^2 dt < +∞. We shall also denote by S^{2,n}_G the set of all continuous processes η(•) ∈ H^{2,n}_G such that E[sup_{t∈[0,T]} |η_t|^2] < +∞. Let (E, d) be a Polish space. The σ-field E equipping E is assumed to be the Borel σ-field B(E). We use the notation P_2(E) for the space of probability measures on E with finite second moment. Then we can define the 2-Wasserstein distance W_2(µ, µ') on P_2(E) by
\[
W_2(\mu, \mu') := \inf\Big\{ \Big(\int_{E\times E} d(x, y)^2\,\pi(dx, dy)\Big)^{1/2} :\ \pi \in \mathcal{P}_2(E \times E) \text{ with marginals } \mu \text{ and } \mu' \Big\}.
\]
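As a simple illustration of this metric (our own example, not taken from the text): for two one-dimensional Gaussian laws the 2-Wasserstein distance is explicit,
\[
W_2\big(\mathcal{N}(m_1, \sigma_1^2), \mathcal{N}(m_2, \sigma_2^2)\big)^2 = (m_1 - m_2)^2 + (\sigma_1 - \sigma_2)^2,
\]
and for two Dirac masses one has W_2(δ_a, δ_b) = |a − b|.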
Then (P_2(E), W_2) is a Polish space. Moreover, one can systematically equip P_2(E) with its Borel σ-field and characterize real-valued Borel measurable functions on P_2(E); for more details, see [17, Chap. 5]. Note that if ξ, ξ' are E-valued random variables of order 2 (an E-valued random variable ξ is of order 2 if E[d(ξ_0, ξ)^2] < ∞ for one, and hence for all, ξ_0 ∈ E), we have
\[
W_2\big(\mathcal{L}(\xi), \mathcal{L}(\xi')\big) \le \big( E[d(\xi, \xi')^2] \big)^{1/2},
\]
where we recall that \mathcal{L}(ξ) stands for the law of ξ under P. Moreover, if E is a Euclidean space, a further estimate follows by applying Corollary 5.4 of [17] and Hölder's inequality.

We consider the extended mean-field type FBSDE (2.1), of the same form as (1.1), where the coefficients f, σ, σ̃, g and Φ are given measurable maps. In our problem, the state process (x(•), y(•), z(•), z̃(•)) cannot be observed directly. Instead, we can observe a related process Y(•), which is governed by the SDE (2.2), of the form (1.2), where h is a given measurable map. The cost functional is given by (2.4), where E^v is the expectation with respect to the probability measure P^v. Note that, in the above framework, y_0 is deterministic, so it is not necessary to include \mathcal{L}(y_0) in γ. Now, let us formulate our extended mean-field partially observed control problem.
A control u(•) ∈ U_ad is called an optimal partially observed control of Problem (EMFPOC) if it satisfies J(u(•)) = inf_{v(•)∈U_ad} J(v(•)).

Partial L-differentiability of functions of measures
Due to the appearance of the joint distribution of the control and state processes in our extended mean-field partially observed control problem, we use the concept of L-derivative with respect to probability measures introduced by Lions; see e.g. [15, 16, 17]. For the convenience of the reader, in this subsection we briefly recall the definition of the L-derivative and the concept of joint differentiability for functions depending on a point in R^q and a probability measure in P_2(R^p). We refer the reader to [17] for more details.
Let (Ω, F, P) be a probability space which is rich enough in the sense that for every µ ∈ P_2(R^p) there is a random variable X ∈ L^2(Ω; R^p) with law µ (i.e. P_X = µ). Let us consider a function f : R^q × P_2(R^p) → R. Then we can introduce the partial derivatives of f in x and µ, denoted respectively by ∂_x f(x, P_X) and ∂_µ f(x, P_X)(·), which are valued in R^q and R^p respectively; we call them the partial L-derivatives of f at (x, P_X). We often use the fact that joint continuous differentiability in the two arguments is equivalent to partial differentiability in each of the two arguments together with joint continuity of the partial derivatives (see e.g. assumption (H.1) in the next subsection). Here, the joint continuity of ∂_x f means joint continuity with respect to the Euclidean distance on R^q and the 2-Wasserstein distance on P_2(R^p). The joint continuity of ∂_µ f is understood as the joint continuity of the mapping (x, X) ↦ ∂_µ f(x, P_X)(X). The above discussion can be applied to the coefficients of Problem (EMFPOC) (see [1] for similar discussions). For example, letting ξ = (ξ_1, ξ_2) be a generic element of R^n × R^k, the partial L-derivative of a coefficient with respect to the joint law \mathcal{L}(x_t, v_t) splits into components acting on the state and the control coordinates, with the partial derivatives valued respectively in R^q, R^n and R^k. Similar discussions apply to the coefficients σ, σ̃, g, h, l, m and γ.
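For the reader's convenience, we recall the standard way of making the L-derivative precise via the lift of f to the Hilbert space L^2 (following [17], Chapter 5); the notation below is ours and is only a sketch of the standard construction:
\[
F(x, X) := f(x, \mathbb{P}_X), \qquad X \in L^2(\Omega; \mathbb{R}^p).
\]
The function f is said to be L-differentiable in µ at (x, P_X) if the lift F is Fréchet differentiable in X, in which case the Fréchet derivative admits the representation
\[
D_X F(x, X)(Y) = E\big[\, \partial_\mu f(x, \mathbb{P}_X)(X) \cdot Y \,\big], \qquad Y \in L^2(\Omega; \mathbb{R}^p),
\]
for a Borel function ∂_µ f(x, \mathbb{P}_X)(\cdot) : \mathbb{R}^p \to \mathbb{R}^p, which is uniquely determined P_X-a.e.; this function is the L-derivative of f with respect to µ.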
Finally, we introduce the following notation. Let (Ω̃, F̃, P̃) be a copy of the probability space (Ω, F, P). For any random variable (x, v) on (Ω, F, P), we denote by (x̃, ṽ) an independent copy of (x, v), defined on (Ω̃, F̃, P̃). The expectation Ẽ[•] = ∫_{Ω̃} (•) dP̃ acts only on the variables endowed with a tilde.
Remark 2.1. In assumption (H.1), the continuity of the mapping appearing there is understood in the sense described above; the other continuity assumptions on the related mappings can be understood similarly.
Remark 2.2. Under assumption (H.1), one can show that system (2.1)-(2.2) has a weak solution. Indeed, by inserting (2.2) into (2.1), we obtain the system (2.5). Note that the first equation in (2.5) is a forward SDE of McKean-Vlasov type. By applying Theorem 4.21 of [17] (see also Prop. 1.2 of [22]), we know that for each given v(•) ∈ U_ad it has a unique solution x(•) ∈ S^{2,n}_F. Moreover, recalling that sup_{t∈[0,T]} E|v_t|^4 < +∞ and the uniform boundedness of (f, σ, σ̃)(t, 0, 0, δ_0), one can show that E[sup_{t∈[0,T]} |x_t|^p] < +∞ for any 2 ≤ p ≤ 4 (see Prop. 1.2 of [22]). Once the first equation is solved, i.e. once x(•) is given, the second equation in (2.5) is a mean-field BSDE. By applying methods similar to the proof of Theorem 4.23 of [17] (the dependence on the distribution of z(•) does not create additional difficulties, see e.g. [7]), one can show that it has a unique solution (y(•), z(•), z̃(•)) in the appropriate spaces over F. Moreover, similarly to the proof of Lemma 3.4 in the appendix, one can establish the corresponding moment estimates. Finally, recalling the definition of ρ(•) (see (1.4) or (2.3)) and the boundedness of h, it follows that P^v is a well-defined probability measure and that we obtain a weak solution of (2.1)-(2.2) according to Girsanov's theorem.
Let us now reformulate the cost functional (2.4). According to Bayes' formula, and noticing that γ(y_0) is deterministic, the cost functional defined in (2.4) can be rewritten on the reference probability space as (2.6). We mention that, under assumptions (H.1)-(H.2), we have |J(v(•))| < +∞, i.e. the cost functional is well defined. This follows easily from the assumptions on the coefficients l, m, γ and the integrability properties of ρ(•), x(•), y(•), z(•), z̃(•) (see Remark 2.2). We then introduce notation for a dimensional extension, under which equations (2.1) and (2.3) can be compressed into the single system (2.7) and the cost functional (2.6) can be represented as (2.8); a schematic version of this reduction is sketched below. Problem (EMFPOC) then becomes the following equivalent minimization problem: minimize J(v(•)) over v(•) ∈ U_ad subject to (2.7) and (2.8).
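To fix ideas, we sketch a plausible form of this reduction; the precise displays (2.6)-(2.8) are not reproduced here, so the argument lists and the definition of the extended state below are assumptions consistent with the rest of the text (in particular with the appearance of ρ in Section 4 and with the linear-quadratic example of Section 5):
\[
J(v(\cdot)) = E\Big[ \int_0^T \rho_t\, l\big(t, \rho_t, x_t, y_t, z_t, \tilde{z}_t, v_t, \mathcal{L}(x_t, y_t, z_t, \tilde{z}_t, v_t)\big)\,dt + \rho_T\, \chi\big(\rho_T, x_T, \mathcal{L}(x_T)\big) + \gamma(y_0) \Big],
\]
which follows from E^v[\xi] = E[\rho_T \xi] and the martingale property of ρ(•) under P. The dimensional extension then collects the density and the forward state into one vector, say X_t := (ρ_t, x_t)^\top, whose dynamics combine dρ_t = ρ_t h(t, x_t, v_t, \mathcal{L}(x_t, v_t))\,dY_t with the x-equation rewritten in terms of dY_t = d\widetilde{W}^v_t + h\,dt, so that (2.7) is driven by the P-Brownian motion (W(\cdot), Y(\cdot)) only.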

Stochastic maximum principle
In this section, we derive a necessary condition for optimality in the form of Pontryagin's stochastic maximum principle. For simplicity, we set n = l = k = m = d = 1; the arguments carry over to the multi-dimensional case.
One can check that the corresponding relations hold for i = 1, 5, j = 1, 2, 3, 4, 5 and ψ = x, y, z, z̃, v. Recall that (Ω̃, F̃, P̃) is a copy of (Ω, F, P); for any random variable x on (Ω, F, P), x̃ denotes an independent copy of x defined on (Ω̃, F̃, P̃), and the expectation Ẽ[•] acts only on the variables endowed with a tilde. We can then introduce the variational equations (3.1), where we use the notation X^1_t for the variational process of the extended state, and α̃_t denotes an independent copy of α_t := (x_t, y_t, z_t, z̃_t, u_t), defined on (Ω̃, F̃, P̃).
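Although the display (3.1) is not reproduced here, the forward component of the variational system has, in standard McKean-Vlasov variational calculus, the following schematic form (a sketch under the assumption that the perturbation is u^ε = u + ε(v − u) and that ∂_{µ_1}, ∂_{µ_2} denote the partial L-derivatives with respect to the state and control coordinates of the joint law):
\[
dx^1_t = \Big\{ \partial_x f(t)\,x^1_t + \partial_v f(t)\,(v_t - u_t) + \tilde{E}\big[ \partial_{\mu_1} f(t)(\tilde{x}_t, \tilde{u}_t)\,\tilde{x}^1_t + \partial_{\mu_2} f(t)(\tilde{x}_t, \tilde{u}_t)\,(\tilde{v}_t - \tilde{u}_t) \big] \Big\}\,dt + \{\text{analogous terms for } \sigma,\ \tilde{\sigma}\}\,dW_t\ (\text{resp. } dY_t), \qquad x^1_0 = 0,
\]
where ∂_x f(t), ∂_v f(t), ∂_{µ_i} f(t) are evaluated along the optimal trajectory and its law, and the tilde variables are the independent copies introduced above. The backward components of (3.1) carry analogous L-derivative terms of g.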
Theorem 3.1. Let assumptions (H.1)-(H.2) hold. Then the mean-field FBSDE (3.1) admits a unique solution.

Proof. The first equation of (3.1) can be decomposed into two equations, where we use the assumption (see (H.1)) that, for j = 2, 3, 4, the corresponding derivatives are uniformly bounded. A careful inspection of the proof of Lemma 3.1 of [8], or of Theorem 3.1 of [13] and Theorem 4.23 of [17], shows that these boundedness properties allow us to use the arguments in [8, 13, 17] to prove that our mean-field BSDE has a unique solution. Indeed, this requires subtle analysis, but the procedure is almost the same as in the proof of Lemma 3.4, so we omit it here.

Now let us denote by (X^ε(•), y^ε(•), z^ε(•), z̃^ε(•)) the trajectory corresponding to u^ε(•), and introduce the notation that will be used in the sequel of the paper. The following expansion is useful for the proofs of the lemmas below: for given (x, v) ∈ L^2(Ω; R^n × R^k) and (x', v') ∈ L^2(Ω; R^n × R^k), the increment of a coefficient can be expanded along the segment joining the corresponding laws, by the L-differentiability recalled in Section 2.

We have the following lemmas for the variational equations, whose proofs are given in the appendix.

Lemma 3.2. Suppose assumptions (H.1) and (H.2) hold. Then the stated L^2 estimates hold; moreover, for any 2 ≤ p ≤ 4 and 0 < ε_0 ≤ p, the corresponding L^p estimates hold.

Lemma 3.3. Suppose assumptions (H.1) and (H.2) hold. Then the stated convergence holds; moreover, we have the following p-order convergence result.

Variational inequality and maximum principle
Let us first compute the Gâteaux derivative of the cost functional.
Lemma 3.6. The functional u(•) → J(u(•)) is Gâteaux differentiable in the direction v(•), and its derivative is given by (3.10).

Proof. Recall that the cost functional is defined by (2.8). Using the notation introduced above Lemma 3.2, we focus only on the term involving the time integral of L; the other terms can be tackled in a similar way. Using an expansion similar to (3.6), to prove (3.10) we only need to show that lim_{ε→0} E∫_0^T ∆^ε_L(t)\,dt = 0. In view of Lemmas 3.2 and 3.4, it suffices to show the uniform integrability of ∆^ε_L(t). Let us focus on the uniform integrability of the term ∂_X L(t, (Θ_t)_{λ,ε})(X^1_t + X^{ε,1}_t); the argument for the other terms is similar. From assumption (H.2), one can check that ∂_X L(t, (Θ_t)_{λ,ε})(X^1_t + X^{ε,1}_t) is dominated by an expression involving Λ := Σ_{ψ=ρ,x,y,z,z̃}(⋯). Then, from Lemmas 3.2 and 3.4, estimate (3.9) and E[sup_{t∈[0,T]} |ρ^1_t|^p] < +∞ for 2 ≤ p < 4, together with the corresponding inequalities for ψ = ρ, x, y, z, z̃ and φ = x, y, z, z̃ as ε → 0, we obtain the required uniform integrability. The proof is complete.
Remark 3.7. From the above proof, one can see that we need p-th order (p > 2) estimates of both the state and the variational processes; second-order estimates are not enough.
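To see why second-order estimates do not suffice, note that the integrands in (2.8) are multiplied by ρ, so the dominating quantities are products of the density (or its variational process) with squares of the variational states. A sketch of the kind of bound that is needed (our own illustration, with a generic exponent p > 2 and conjugate exponents p/(p−2) and p/2):
\[
E\big[\, |\rho^1_t|\, |X^1_t|^2 \,\big] \;\le\; \Big( E\big[|\rho^1_t|^{\frac{p}{p-2}}\big] \Big)^{\frac{p-2}{p}} \Big( E\big[|X^1_t|^{p}\big] \Big)^{\frac{2}{p}},
\]
by Hölder's inequality; controlling the right-hand side requires moments of the variational processes of order strictly larger than 2, which is exactly what Lemmas 3.2 and 3.4 provide.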
Proof. Let us first apply Itô's formula to ⟨X^1_t, p_t⟩. Noting that the tilde random variables are independent copies of the non-tilde variables, we apply Fubini's theorem, and then, recalling the terminal value p_T in (3.13), we obtain (3.15). Similarly, applying Itô's formula to ⟨y^1_t, q_t⟩ and using Fubini's theorem, we obtain (3.16). Now, from Lemma 3.6, (3.15), (3.16) and Fubini's theorem, we obtain the variational inequality. Since u(•) is optimal, we have J(u^ε(•)) ≥ J(u(•)) for any ε ≥ 0, which yields (3.17) for arbitrary admissible perturbations. Since U_ad is convex, for any given v(•) ∈ U_ad we may choose the perturbation u^ε(•) = u(•) + ε(v(•) − u(•)), which is still in U_ad, and then from (3.17) we obtain an inequality valid for any v(•) ∈ U_ad. For any given v ∈ U (deterministic), we define a suitable set A; by choosing a modification if necessary, we know that A is an F^Y-progressively measurable set. We then take a particular admissible control built from the indicator I_A, where we use the fact that I_A is F^Y_t-adapted. Thus, from the Lebesgue differentiation theorem, we obtain a relation which, together with the definition of A, indicates that E[I_A(t, ω)] = 0 for a.e. t. Consequently, we deduce that I_A = 0, a.s., a.e., which yields that (3.19) holds for any v ∈ U. Since the left-hand side of (3.19) is continuous with respect to v, we obtain (3.14).
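For orientation, the optimality condition (3.14) obtained above is, in partially observed settings of the type considered in [41, 46], typically of the following conditional-expectation form (this is a schematic statement only; the exact expression, in particular the additional mean-field terms produced by the partial L-derivatives of the Hamiltonian, is the one given in Theorem 3.9):
\[
E\Big[\, \partial_v \mathcal{H}\big(t, \Theta_t, u_t; p_t, q_t, k_t, \tilde{k}_t\big)\,(v - u_t) \,\Big|\, \mathcal{F}^Y_t \Big] \;\ge\; 0, \qquad \forall\, v \in U,\ \text{a.e. } t \in [0, T],\ P\text{-a.s.},
\]
where \mathcal{H} denotes the Hamiltonian associated with the extended state and (p, q, k, \tilde{k}) denotes the adjoint processes solving (3.13).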
Remark 3.10. Inspired by Proposition 4.6 of [16], our Pontryagin stochastic maximum principle, i.e. Theorem 3.9, can be generalized to the case where U is an open set which may be non-convex. In this case, following methods similar to those of [16], we can show that the analogous optimality condition holds.

Verification theorem
We have established Pontryagin's maximum principle, which gives a necessary condition for the optimal control; see (3.14) in Theorem 3.9. In this section we show that (3.14) is also a sufficient condition for optimality under the following convexity assumptions.
Remark 4.2. In this paper, we consider an optimal control problem with partial observation in which the state is governed by a nonlinear mean-field type FBSDE. The cost functional, which depends on the joint distribution of the state and the control, is defined under the probability measure P^v. The structure of our problem is inspired by Wang, Wu and Xiong [45], and we adopt the method of Tang [41]: the main feature of this method is that we reformulate the cost functional by Bayes' formula, which transforms it into one defined on the reference probability space (Ω, F, F, P) (see (2.6) or (2.8)); in addition, the factor ρ appears as a multiplier. This method is different from the one of Wang, Wu and Xiong [45].
In this formulation, the cost functional (2.6) contains ρ. If the coefficients l and χ in (2.6) do not depend on ρ, then one can check that the mappings (ρ, x, y, z, z̃, v, η) → ρ l(t, x, y, z, z̃, v, η) and (ρ, x) → ρ χ(x, µ_1) are in general not convex. Fortunately, if we allow l and χ to depend on ρ, then the convexity assumptions (H.3)-(H.4) can hold (one simple possibility is sketched below), and we can then obtain the verification theorem.
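The following is our own illustrative choice (not necessarily the example intended in the truncated display of the original text). Since ρ_t > 0 a.s., one may take
\[
l(t, \rho, x, y, z, \tilde{z}, v, \eta) := \frac{1}{\rho}\,\hat{l}(t, x, y, z, \tilde{z}, v, \eta), \qquad \chi(\rho, x, \mu_1) := \frac{1}{\rho}\,\hat{\chi}(x, \mu_1),
\]
for some convex functions \hat{l} and \hat{\chi}. Then ρ l = \hat{l} and ρ χ = \hat{\chi} no longer involve ρ at all, so the mappings (ρ, x, y, z, \tilde{z}, v, η) → ρ l and (ρ, x) → ρ χ inherit the convexity of \hat{l} and \hat{\chi}; this is also the mechanism at work in the linear-quadratic example of Section 5.2, where the reformulated coefficients L and M contain no ρ.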

Examples
In this section, we give two examples to illustrate our results. Comparing with the existing literature on McKean-Vlasov type stochastic control problems and partial observation problems, one can see that our maximum principle recovers the related results of [1, 3, 16, 18, 19, 28-31, 41, 46, 50] in the case of a convex control domain. In this sense, our results are an extension of the classical case.

Scalar interactions
The case of scalar interactions is of particular interest and has been widely investigated, see e.g. [3, 10, 27, 28, 29, 30, 31, 50]. Usually it can be dealt with by standard calculus, without using L-derivatives. In this subsection, we derive Pontryagin's maximum principle for scalar-type interactions by applying our Theorem 3.9.
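The reason standard calculus suffices in the scalar case is the following elementary fact about L-derivatives of scalar-interaction functionals (a standard computation, stated here in our own notation): if
\[
\varphi(\mu) = \psi\Big( \int_{\mathbb{R}} \theta(a)\, \mu(da) \Big)
\]
for smooth ψ and θ with suitable growth, then its L-derivative is
\[
\partial_\mu \varphi(\mu)(\xi) = \psi'\Big( \int_{\mathbb{R}} \theta(a)\, \mu(da) \Big)\, \theta'(\xi), \qquad \xi \in \mathbb{R}.
\]
In particular, when the coefficients depend on the law only through expectations such as E[x_t] or E[v_t] (so θ(a) = a and ψ is the identity or a smooth function thereof), the L-derivative terms in the variational and adjoint equations reduce to ordinary derivatives multiplied by expectations, which is how the equations of Section 5.2 below arise.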

Linear quadratic case
In this subsection, we consider a linear-quadratic (LQ) partially observed optimal control problem. Let us consider the following linear forward-backward system with scalar interaction:
\[
\begin{cases}
dx_t = \big(f_{1,t}\, x_t + f_{2,t}\, E[x_t] + f_{3,t}\, v_t + f_{4,t}\, E[v_t]\big)\,dt + c_t\,dW_t + \tilde{c}_t\,d\widetilde{W}_t,\\
-dy_t = \big(g_{1,t}\, x_t + g_{2,t}\, E[x_t] + g_{3,t}\, y_t + g_{4,t}\, E[y_t] + g_{5,t}\, z_t + g_{6,t}\, E[z_t] + g_{7,t}\, v_t + g_{8,t}\, E[v_t]\big)\,dt - z_t\,dW_t - \tilde{z}_t\,dY_t,
\end{cases}
\]
where the observation process Y is given by dY_t = h_t\,dt + d\widetilde{W}_t, with Y_0 = 0. We introduce ρ_t = \exp\{ \int_0^t h_s\,dY_s - \frac{1}{2}\int_0^t |h_s|^2\,ds \}, which is the solution of the SDE dρ_t = ρ_t h_t\,dY_t with ρ_0 = 1, and we define the probability measure P^v by dP^v = ρ_T\,dP.
We then consider a quadratic cost functional, which can be rewritten on the reference probability space. Here all the coefficients are uniformly bounded and deterministic, L_{i,t} is a positive, uniformly bounded function for i = 1, 2, 3, 4, 5, 6, and M_1, M_2, γ are positive constants. We notice that, in this case, the reformulated cost can be expressed in terms of X = ρx. It is easy to check that L = l_1 and M = χ_1 satisfy assumption (H.2), while l and χ do not satisfy assumption (H.2). However, one can check that Lemma 3.6 still holds (the proof even becomes very simple since there is no term ρ in L and M), and hence the maximum principle (see Theorem 3.9) applies.
To apply Theorem 3.9, we rewrite the state as
\[
\begin{cases}
dX_t = \big(F_{1,t}\, X_t + F_{2,t}\, E[X_t] + F_{3,t}\, v_t + F_{4,t}\, E[v_t] + F_t\big)\,dt + C_t\,dW_t + \tilde{C}_t\,dY_t,\\
-dy_t = \big(g_{1,t}\, x_t + g_{2,t}\, E[x_t] + g_{3,t}\, y_t + g_{4,t}\, E[y_t] + g_{5,t}\, z_t + g_{6,t}\, E[z_t] + g_{7,t}\, v_t + g_{8,t}\, E[v_t]\big)\,dt - z_t\,dW_t - \tilde{z}_t\,dY_t,
\end{cases}
\]
where the coefficients F_{i,t}, F_t, C_t, \tilde{C}_t are determined by f_{i,t}, c_t, \tilde{c}_t and h_t. In this setting, the Hamiltonian function takes a linear-quadratic form, and the adjoint processes solve the associated linear mean-field adjoint equations. According to Theorem 3.9, if U = R, the necessary condition for optimality (3.14) becomes an equality; taking expectations, we obtain the relation from which the optimal control (5.6) follows. Finally, one can also check that assumptions (H.3) and (H.4) hold, so the verification theorem yields that the control u(•) given by (5.6) is indeed an optimal control. Furthermore, inserting (5.6) into the following Hamiltonian system (5.7),
\[
\begin{cases}
dx_t = \big(f_{1,t}\, x_t + f_{2,t}\, E[x_t] + f_{3,t}\, u_t + f_{4,t}\, E[u_t] - \tilde{c}_t h_t\big)\,dt + c_t\,dW_t + \tilde{c}_t\,dY_t,\\
-dy_t = \big(g_{1,t}\, x_t + g_{2,t}\, E[x_t] + g_{3,t}\, y_t + g_{4,t}\, E[y_t] + g_{5,t}\, z_t + g_{6,t}\, E[z_t] + g_{7,t}\, u_t + g_{8,t}\, E[u_t]\big)\,dt - z_t\,dW_t - \tilde{z}_t\,dY_t,
\end{cases}
\]
together with the adjoint equations, one obtains a fully coupled mean-field FBSDE; once p_{2,t} and q_t are determined, the optimal control follows from (5.6). Moreover, one can also try to write the optimal control u(•) given in (5.6) as a feedback of the filtered states E[x_t | F^Y_t], E[y_t | F^Y_t], E[z_t | F^Y_t], E[z̃_t | F^Y_t] and of the expectations E[x_t], E[y_t], E[z_t], E[z̃_t] through Riccati equations. Due to the complicated coupled structure of (5.7), it is usually not easy to find such a feedback optimal control; the interested reader is referred to, e.g., [28, 30, 49, 50] for some special cases.
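In connection with the filtered-state feedback mentioned above, recall the Bayes (Kallianpur-Striebel) formula, which relates conditional expectations under P^v to conditional expectations under the reference measure P (a standard identity, stated in our notation):
\[
E^v\big[\xi \,\big|\, \mathcal{F}^Y_t\big] \;=\; \frac{E\big[\rho_t\, \xi \,\big|\, \mathcal{F}^Y_t\big]}{E\big[\rho_t \,\big|\, \mathcal{F}^Y_t\big]}, \qquad \xi \in L^1(\mathcal{F}_t, P^v).
\]
This is the identity through which the filters E[x_t | F^Y_t], E[y_t | F^Y_t], etc. enter any candidate feedback representation of the optimal control.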
Remark 5.1. If the cost functional is given by a variant of the above LQ functional, which can likewise be rewritten on the reference probability space, then one can also apply Theorem 3.9 to obtain the corresponding condition that the optimal control should satisfy.

(Fragment from the appendix proofs: there, the estimates (A.10)-(A.12) are combined with the Burkholder-Davis-Gundy inequality, the boundedness of σ̃ and h, and Gronwall's inequality to obtain the required bounds.)