Second-order analysis of an optimal control problem in a phase field tumor growth model with singular potentials and chemotaxis

This paper concerns a distributed optimal control problem for a tumor growth model of Cahn-Hilliard type including chemotaxis with possibly singular potentials, where the control and state variables are nonlinearly coupled. First, we discuss the weak well-posedness of the system under very general assumptions for the potentials, which may be singular and nonsmooth. Then, we establish the strong well-posedness of the system in a reduced setting, which however admits the logarithmic potential: this analysis will lay the foundation for the study of the corresponding optimal control problem. Concerning the optimization problem, we address the existence of minimizers and establish both first-order necessary and second-order sufficient conditions for optimality. The mathematically challenging second-order analysis is completely performed here, after showing that the solution mapping is twice continuously differentiable between suitable Banach spaces via the implicit function theorem. Then, we completely identify the second-order Fr\'echet derivative of the control-to-state operator and carry out a thorough and detailed investigation about the related properties.


Introduction
Lots of disclosures have been obtained in the past decades concerning tumor growth modeling: see, e.g., the pioneering works [12,13,46]. The main advantage of a mathematical approach is to be capable of predicting and analyzing tumor growth behavior without inflicting any harm to the patients, thus helping medical practitioners to plan the clinical medications.
The phase field approach to tumor modeling consists in describing the tumor fraction by means of an order parameter ϕ representing the concentration of the tumor, which usually is normalized to range between −1 and 1. Namely, the level sets {ϕ = 1} and {ϕ = −1} may describe the regions of pure phases: the tumorous phase and the healthy phase, respectively. Moreover, the diffuse interface approach postulates the existence of a thin transition layer {−1 < ϕ < 1} in which the phase variable passes rapidly, but continuously, from one phase to the other. We assume the growth and proliferation of the tumor to be driven by the absorption and consumption of some nutrient, so that the equation for the phase variable, which has a Cahn-Hilliard type structure, is coupled with a reaction-diffusion equation for the variable σ capturing the evolution of an unknown species nutrient (e.g., oxygen, glucose) in which the tissue under consideration is embedded.
Let α > 0, β > 0, and let Ω ⊂ R 3 denote some open and bounded domain having a smooth boundary Γ = ∂Ω and the unit outward normal n. We denote by ∂ n the outward normal derivative to Γ. Moreover, we fix some final time T > 0 and introduce for every t ∈ (0, T ] the sets Q t := Ω × (0, t), Q T t := Ω × (t, T ), and Σ t := Γ × (0, t), where we put, for the sake of brevity, Q := Q T and Σ := Σ T . We then consider the following optimal control problem: (CP) Minimize the cost functional subject to the state system β∂ t ϕ − ∆ϕ + F ′ (ϕ) = µ + χ σ in Q ,

4)
∂ n µ = ∂ n ϕ = ∂ n σ = 0 on Σ , (1.5) in Ω , (1.6) and to the control constraint u = (u 1 , u 2 ) ∈ U ad . (1.7) Here, the constants b 1 , b 2 are nonnegative, while b 0 is positive. Moreover, ϕ Q and ϕ Ω are given target functions, and the set of admissible controls U ad is a nonempty, closed and convex subset of the control space (1.8) The state system (1.2)-(1.6) constitutes a simplified and relaxed version of the fourspecies thermodynamically consistent model for tumor growth originally proposed by Hawkins-Daruud et al. in [31] that additionally includes chemotaxis effects. Let us briefly review the role of the occurring symbols. The primary variables ϕ, µ and σ denote the phase field, the associated chemical potential, and the nutrient concentration, respectively. Furthermore, we stress that the additional term α∂ t µ corresponds to a parabolic regularization for equation (1.2), whereas the term β∂ t ϕ is the viscosity contribution to the Cahn-Hilliard equation. The key idea behind these regularizations originates from the fact that their presence allows us to take into account more general potentials that may be singular and possibly nonregular. The nonlinearity P denotes a proliferation function, whereas the positive constant χ represents the chemotactic sensitivity. Lastly, as a common feature of phase field models, F is a nonlinearity which is assumed to possess a double-well shape. Typical examples are given by the regular, logarithmic, and double obstacle potentials, which are defined, in this order, by where k > 1 and k 2 > 0 so that F log and F obs are nonconvex. Observe that F log is very relevant in the applications, where F ′ log (r) becomes unbounded as r → ±1, and that in the case of (1.11) the second equation (1.3) has to be interpreted as a differential inclusion, where F ′ (ϕ) is understood in the sense of subdifferentials.
In this paper, we take two distributed controls that act in the phase equation and in the nutrient equation, respectively. The control variable u 1 , which is nonlinearly coupled to the state variable ϕ in the phase equation (1.2), models the application of a cytotoxic drug into the system; it is multiplied by a truncation function (·) in order to have the action only in the spatial region where the tumor cells are located. For instance, it can be assumed that (−1) = 0, (1) = 1, (ϕ) is in between if −1 < ϕ < 1; see [23,29,32,33] for some insights on possible choices of . On the other hand, the control u 2 can model either an external medication or some nutrient supply.
As far as well-posedness is concerned, the above model has already been investigated in the case χ = 0 in [4,[6][7][8], and in [19] with α = β = χ = 0. There the authors also pointed out how α and β can be set to zero, by providing the proper framework in which a limit system can be identified and uniquely solved. We also note that in [10] a version has been studied in which the Laplacian in the equations (1.2)-(1.4) has been replaced by fractional powers of a more general class of selfadjoint operators having compact resolvents.
For some nonlocal variations of the above model we refer to [21,22,36]. Moreover, in order to better emulate in-vivo tumor growth, it is possible to include in similar models the effects generated by the fluid flow development by postulating a Darcy's law or a Stokes-Brinkman's law. In this direction, we refer to [14,18,21,[23][24][25][26][27]29,46], and we also mention [30], where elastic effects are included. For further models, discussing the case of multispecies, we address the reader to [14,21].
The investigation of the associated optimal control problem also presents a wide number of results of which we mention [9][10][11]16,17,22,28,33,[37][38][39][40][41]43,44]. Notice that, despite the number of contributions, only [16] established second-order optimality conditions under suitable restrictions on the considered model. In particular, the authors of [16] avoid considering the chemotaxis effects and allow only regular potentials to be considered.
In this paper, first we discuss the weak well-posedness of the system (1.2)-(1.6) in a very general framework for the potentials, which includes all of the cases in (1.9)-(1.11). Then, we turn our attention to the strong well-posedness of (1.2)-(1.6) in the cases of the regular F reg and logarithmic F log potentials. This is done in Section 2, while the corresponding optimal control problem is investigated in the following sections. Section 3 is concerned with the existence of minimizers, then the intensive and crucial Section 4 establishes the differentiability properties of the control-to-state operator and contains a number of results on the concerned linearized problems and the basic stability estimates for the solutions. The last two Sections 5 and 6 treat in some detail the first-order necessary and second-order sufficient conditions for optimality, respectively. Let us point out that the second-order analysis is challenging from the mathematical viewpoint and demands to prove that the solution mapping is twice continuously differentiable between suitable Banach spaces. By taking advantage of the regularizing effect due to the aforementioned relaxation terms, we can deal with a complete study of the second-order analysis, still covering the case of singular potentials and chemotaxis. Moreover, we are able to identify the second-order Fréchet derivative of the control-to-state operator and investigate the related properties in a sharp and profound way.
Lastly, let us introduce a convention that will be tacitly employed in the rest of the paper: the symbol small-case c is used to indicate every constant that depends only on the structural data of the problem (such as T , Ω, α or β, the shape of the nonlinearities, and the norms of the involved functions), so that its meaning may change from line to line. When a parameter δ enters the computation, then the symbol c δ denotes constants that depend on δ in addition. On the contrary, precise constants we could refer to are treated in a different way.

General Setting and Properties of the State System
In this section, we introduce the general setting of our control problem and state some results on the state system (1.2)-(1.6). To begin with, for a Banach space X we denote by · X the norm in the space X or in a power thereof, and by X * its dual space. The only exeption from this rule applies to the norms of the L p spaces and of their powers, which we often denote by · p , for 1 ≤ p ≤ +∞. As usual, for Banach spaces X and Y we introduce the linear space X ∩ Y which becomes a Banach space when equipped with its natural norm u X∩Y := u X + u Y , for u ∈ X ∩ Y . Moreover, we recall the definition (1.8) of U and introduce the spaces Furthermore, by ( · , · ), · , and ·, · , we denote the standard inner product and related norm in H, as well as the dual product between V and its dual V * . For given final time T > 0, we introduce the spaces which are Banach spaces when endowed with their natural norms.
Some assumptions on the data are stated here.
For the sake of simplicity, we indicate with a common notation L as a Lipschitz constant for F ′ 2 , P, and . (2.4) Let us note that all of the choices (1.9)-(1.11) are admitted for the potentials. In fact, the assumption (W2) implies that the subdifferential ∂F 1 of F 1 is a maximal monotone graph in R×R with effective domain D(∂F 1 ) ⊂ D(F 1 ), and, since F 1 attains the minimum value 0 at 0, it turns out that 0 ∈ D(∂F 1 ) and 0 ∈ ∂F 1 (0). Now, in the general setting depicted by (W1)-(W4), we are able to provide a first well-posedness result for the system (1.2)-(1.6). First, let us present the notion of weak solution to (1.2)-(1.6).
and if (µ, ϕ, ξ, σ) satisfies the corresponding weak formulation given by for every v ∈ V and a.e. in (0, T ), (2.8) for every v ∈ V and a.e. in (0, T ), (2.10) as well as It is worth noticing that the homogeneous Neumann boundary conditions (1.5) are considered in the condition (2.5) for ϕ (cf. the definition of the space W 0 ) and incorporated in the variational equalities (2.8) and (2.10) for µ and σ, when using the forms Ω ∇µ · ∇v and Ω ∇σ · ∇v. Moreover, let us point out that, at this level, the control pair (u 1 , u 2 ) just yields two fixed forcing terms in (2.8) and (2.10). The initial conditions (2.11) make sense since (2.5) and (2.6) ensure that ϕ and µ, σ are continuous from [0, T ] to V and H, respectively.
Theorem 2.2 (Weak well-posedness). Assume that (W1)-(W4) hold. Moreover, let the initial data (µ 0 , ϕ 0 , σ 0 ) satisfy µ 0 , σ 0 ∈ L 2 (Ω), ϕ 0 ∈ H 1 (Ω), F 1 (ϕ 0 ) ∈ L 1 (Ω), (2.12) and suppose that the source terms u 1 , u 2 are such that Then there exists at least a solution (µ, ϕ, ξ, σ) in the sense of Definition 2.1. Moreover, if u 1 ∈ L ∞ (Q) in addition to (2.13), then the found solution is unique. Furthermore, let Then there is a positive constant C d , depending only on the data of the system, such that (2.14) Before entering the proof, let us remark that the above result is very general and includes also the cases of singular and nonsmooth potentials, such as the double obstacle potential defined by (1.11). For the dependencies of the constant C d , we invite the reader to follow the proof of the estimate (2.14) below.
Proof. For the proof of the existence of a solution, we point out that the arguments are quite standard, since similar procedures have already been used in previous contributions. Thus, for that part, we proceed rather formally, just employing the Yosida approximation of ∂F 1 for our estimates, without recurring to finite-dimensional approximation techniques like the Faedo-Galerkin scheme.
Next, we are going to prove a series of estimates for the solution to problem (2.8)-(2.10), where (∂F 1 )(r) is replaced by F ′ 1,ε and the inclusion in (2.9) reduces to an equality. Namely, we argue on For the sake of simplicity, we still denote by (µ, ϕ, ξ, σ), with ξ = F ′ 1,ε (ϕ), the solution to the approximated system in place of (µ ε , ϕ ε , ξ ε , σ ε ); the correct notation will be reintroduced at the end of each estimate.
First Estimate: We add the term ϕ to both sides of (2.18) and test by ∂ t ϕ. Then, we take v = µ in (2.8) and v = σ in (2.10). Moreover, we add the resulting equalities and, with the help of a cancellation, we deduce that almost everywhere in (0, T ) it holds the identity Note that the last term on the left-hand side is nonnegative due to (W3). Then, we can integrate the above inequality over [0, t] for t ∈ (0, T ], using the initial conditions (2.11). We point out that the quantity thanks to (2.12) and (2.15). Next, owing to the boundedness and regularity properties of P , and F ′ 2 , and by Young's inequality, it is straightforward to infer that so that it suffices to apply Gronwall's lemma to conclude that Second Estimate: Now, owing to (2.19), from a comparison of terms in the variational equalities (2.8) and (2.10) it follows that In fact, arguing for instance on (2.8), and taking advantage of (W3),(W4), (2.13) and (2.19), we have that for a.e. t ∈ (0, T ) and for every v ∈ V it holds Thus, dividing by v V and passing to the superior limit, we readily have the bound for ∂ t µ(t) V * in terms of c( ∂ t ϕ(t) + u 1 (t) + 1). Then, by squaring and integrating over (0, T ), we deduce (2.20) for ∂ t µ. The corresponding property for ∂ t σ can be obtained in a similar way from (2.10).
Continuous dependence estimate: Now, recalling the notation in the statement of the theorem, we set, for i = 1, 2, and consider the difference of the equations in (2.8)-(2.11) to infer that for every v ∈ V and a.e. in (0, T ), (2.27) for every v ∈ V and a.e. in (0, T ), (2.29) µ(0) = µ 0 , ϕ(0) = ϕ 0 , σ(0) = σ 0 , a.e. in Ω, (2.30) where We take v = αµ + ϕ in (2.27), test (2.28) by ( χ 2 + 1 α )ϕ, and let v = σ in (2.29). Then, we add the resulting equalities and integrate over (0, t) and by parts to obtain that for a.e. t ∈ (0, T ), where we also used that µ = 1 α (αµ + ϕ) − 1 α ϕ. Observe that all of the terms on the left-hand side are nonnegative; in particular, the fifth is nonnegative thanks to the monotonicity of ∂F 1 . Next, we denote by I 1 , ..., I 11 the eleven terms on the right-hand side of (2.31), in the above order, and estimate them individually. Using the Young inequality, we infer that and here the last three contributions on the right-hand side can be absorbed on the lefthand side of (2.31). We also immediately observe that I 8 ≤ 0. Moreover, with the help of (W2), (W4), and recalling (2.4), we deduce from Young's inequality that as well as It remains to estimate I 3 . Using the boundedness and Lipschitz continuity of P , the Hölder and Young inequalities, and the continuous embedding V ⊂ L 4 (Ω), we find that for a positive δ to be chosen, for instance, less than or equal to 1 4 ( χ 2 + 1 α ). Since (µ 1 , ϕ 1 , ξ 1 , σ 1 ) is a weak solution to (1.2)-(1.6) in the sense of Definition 2.1, it follows that the function . Hence, we can collect all the above inequalities and apply the Gronwall lemma to finally derive the estimate (2.14).
Observe that (S4) entails that P, P ′ , P ′′ , , ′ , ′′ are Lipschitz continuous on R. Moreover, let us remark that the above setting allows us to include the singular logarithmic potential (1.10) and the associated quartic approximation (1.9), but it excludes the double obstacle potential (1.11), which cannot be considered in the framework of (S2)-(S3). Furthermore, the prescribed regularity for the potential F entails that its derivative can be defined in the classical manner so that we no longer need considering a selection ξ in the notion of strong solution below. Moreover, it will be useful to set, for a fixed R > 0, Under these conditions, we have the following result concerning the well-posedness of the state system (1.2)-(1.6), where the equations and conditions have to be fulfilled almost everywhere in Q.

Theorem 2.3 (Strong well-posedness).
Suppose that the conditions (W1), (S1)-(S4), and (2.32) are fulfilled. Moreover, let the initial data fulfill as well as Then the state system (1.2)-(1.6) has for every u = (u 1 , u 2 ) ∈ U R a unique solution (µ, ϕ, σ) with the regularity Moreover, there is a constant K 1 > 0, which depends on Ω, T, R, α, β and the data of the system, but not on the choice of u ∈ U R , such that Furthermore, there exist two values r * , r * , depending on Ω, T, R, α, β and the data of the system, but not on the choice of u ∈ U R , such that Also, there is some constant K 2 > 0, with the same dependencies as K 1 , such that max i=0,1,2,3 Then, there is a positive constant C D , depending only on data, such that The separation property (2.39) is particularly important for the case of singular potentials such as F log . Indeed, it guarantees that the phase variable always stays away from the critical values r − , r + that may correspond to the pure phases. In this way, the singularity is no longer an obstacle for the analysis; indeed, the values of ϕ range in some interval in which F 1 is smooth.
Notice also that, owing to Theorem 2.1, the control-to-state operator is well defined as a mapping between U = L ∞ (Q) 2 and the Banach space specified by the regularity results (2.35)-(2.37). Actually, the control-to-state operator S may be well defined just after Theorem 2.2, but the notion of weak solutions proposed there (cf. Definition 2.1) would not suffice for the investigation of the optimal control problem (CP).
Proof. Again, we proceed formally, but still using the Yosida approximation F ′ 1,ε , at least in the first part of the proof. Of course, we take for granted all the estimates already done in the existence proof for Theorem 2.2, and start now with additional estimates independent of ε. To avoid a heavy notation, we proceed as in Theorem 2.2 and use the simpler notation (µ, ϕ, σ) for the variables of the approximated system instead of (µ ε , ϕ ε , σ ε ), while we will reintroduce the correct notation exhibiting the dependence of ε at the end of each estimate.
First estimate: We rewrite the variational equality (2.8) as , it follows from the regularity theory for parabolic problems (see, e.g., [35]) that and (2.42) can be equivalently rewritten as the equation (1.2) along with the Neumann boundary condition ∂ n µ = 0 a.e. on Σ. Next, recalling also (2.22) and arguing similarly for the variational equality (2.10), rewritten as for every v ∈ V and a.e. in (0, T ), we also deduce that Hence, (1.4) holds a.e. in Q, and all of the boundary conditions in (1.5) hold a.e. on Σ.
Second estimate: We formally differentiate (2.18) with respect to time, obtaining where g ϕ is bounded in L 2 (0, T ; H) independently of ε, on account of (2.43), (2.44), (S1), and (2.19) (indeed, F ′′ 2 is globally bounded on R). Then, multiplying (2.45) by ∂ t ϕ and integrating over Ω and by parts, we find that where the third term on the left-hand side is nonnegative owing to the monotonicity of F ′ 1,ε . Now, we aim to integrate (2.46) with respect to time. Note that taking t = 0 in (2.18) produces where the right-hand side is bounded in H by virtue of (2.17), (2.33),(2.34), and (S1). Hence, we can integrate (2.46) over [0, t], with t ∈ (0, T ], to conclude that Third estimate: We come back to the elliptic equation (2.21) and observe that now we have at hand that f ϕ is bounded in L ∞ (0, T ; H). Then, arguing similarly as in the proof of (2.22), using monotonicity and elliptic regularity theory, we easily infer that so that the continuity of the embedding W 0 ⊂ C 0 (Ω) entails that Fourth estimate: Next, we consider the parabolic equation (1.2), written as and observe that now, thanks to (2.47), we have that ∂ t ϕ, and consequently f µ , are bounded in L ∞ (0, T ; H). Moreover, we recall (2.33) and note that µ 0 ∈ L ∞ (Ω), in particular. Thus, we can apply the regularity result [34, Thm. 7.1, p. 181] to show that With similar arguments we can easily obtain the same property for the nutrient variable.
In fact, it suffices to rewrite (1.4) as a parabolic equation with forcing term and notice that (2.48) allows us to infer that ∆ϕ, and thus f σ , are bounded in L ∞ (0, T ; H). Hence, we can apply the same argument to conclude that Now, we collect the estimates (2.43)-(2.44), (2.47)-(2.51) and point out that they still hold for the real solution (µ, ϕ, σ) when passing to the limit as ε ց 0, because of the weak or weak star lower semicontinuity of norms. Then, we realize that indeed the global estimate (2.38) in the statement has been proved, with the observation that L ∞ (Q) for ϕ is replaced by C 0 (Q), since this continuity property is actually ensured by and the compact embedding W 0 ⊂ C 0 (Ω) (see, e.g., [42,Sect. 8,Cor. 4]).
Separation property: At this point, the equation (1.3) holds for the limit functions, with the datum F ′ = F ′ 1 + F ′ 2 as in (S1)-(S3) and with the right-hand side bounded in L ∞ (Q). Thus, there exists a positive constant C * for which (2.52) Moreover, the condition (2.34) for the initial ϕ 0 and the growth assumption (S3) imply the existence of some constants r * and r * such that r − < r * ≤ r * < r + and where the standard positive ( · ) + and negative ( · ) − parts are used here. Then, we integrate over Q t = Ω × (0, t), for t ∈ (0, T ], and with the help of (2.52) deduce that where we also applied (2.53) to have that v(0) = 0. Note that the right-hand side above is nonpositive due to (2.54), so that v = 0 almost everywhere, which in turn implies that Then, (2.39) is proven, and, at this point, the assumptions (S1)-(S4) enable us to directly deduce (2.40).
On account of the above regularity, the separation property, and the assumptions (S1)-(S4), we are now in a position to show the refined continuous dependence estimate given by (2.41). In this direction, we employ the notation introduced in the proof of Theorem 2.2 and consider the system of the differences (2.27)-(2.30). Notice that we now have that F is differentiable, so that ξ . Moreover, let us remark that due to the separation property (2.39) and to the reinforced assumptions (S1)-(S3), it follows that F ′ is Lipschitz continuous in the range of the occurring arguments.
First estimate: We test (2.27) by µ, (2.28) -to which we add the term ϕ on both sides -by ∂ t ϕ, as well as (2.29) by σ. Then, we sum up and integrate over Q t and by parts. With the help of the cancellation of two terms we deduce that Using the Young and Hölder inequalities, the Lipschitz continuity and boundedness of P along with the strong regularity (2.38) of the solutions, we infer that Recalling (2.32), we have that Moreover, it is easy to see that Hence, we can collect the inequalities and apply the Gronwall lemma to infer that the differences satisfy Second estimate: By exploiting the ellipticity of equation (2.28), the Lipschitz continuity of F ′ , along with the above estimates, it is straightforward to derive that Third estimate: We argue in a similar way as in (2.42) and rewrite (2.27) as a parabolic variational equality for µ = µ 1 − µ 2 with source term given by On account of (2.38) and the above estimates, we easily deduce that , we can readily infer from the parabolic regularity theory (see, e.g., [35]) that (2.58) Fourth estimate: Arguing in a similar manner for the equality (2.29), we infer that the right-hand side can be rewritten as Ω f σ v, with Note that f σ L 2 (0,T ;H) is again already bounded as above and σ(0 (2.59) By collecting (2.56)-(2.59), we finally conclude the proof of the assertion.

Existence of a Minimizer
Now that the well-posedness results for system (1.2)-(1.6) have been addressed, we can deal with a corresponding optimal control problem, where the source terms u 1 and u 2 act as controls. In this direction, we require that the cost functional J is defined by (1.1) and that the following assumptions are fulfilled: (C1) b 1 , b 2 are nonnegative constants, and b 0 is positive.
Notice that U ad is a nonempty, closed and convex subset of U = L ∞ (Q) 2 . In the following, it will sometimes be necessary to work with a bounded open superset of U ad . We therefore once and for all fix some R > 0 such that U R ⊃ U ad , where U R is defined by (2.32). The first result for (CP) concerns the existence of a minimizer, where the proof readily follows from the direct method of calculus of variations, along with weak and weak star compactness arguments. Proof. At first, let us notice that J is nonnegative, so that we can pick a minimizing sequence {u n } n∈N ⊂ U ad with the corresponding sequence of states {(µ n , ϕ n , σ n )} n∈N to have where S(v) denotes the state corresponding to the control v. Furthermore, by combining the estimates (2.38)-(2.40), which are uniform with respect to n, with the structure of U ad , it is a standard matter (upon extracting a subsequence that we do not relabel) to infer the existence of limits u ∈ U ad and µ, ϕ, σ such that, as n → ∞, and, by the compactness of the embedding W 0 ⊂ C 0 (Ω), also that (see, e.g., [42,Sect. 8,Cor. 4]), It is then a standard matter to pass to the limit in the formulation (1.2)-(1.6) written for (µ n , ϕ n , σ n ) and u n to conclude that (µ, ϕ, σ) = S(u), and then to exploit the weak lower semicontinuity of J to derive that u is a minimizer for the optimization problem (CP).

Differentiability Properties of the Solution Operator
We now discuss the Fréchet differentiability of S, considered as a mapping between suitable Banach spaces. To show such a result, it is favorable to employ the implicit function theorem, because, if applicable, it yields that the control-to-state operator automatically inherits the differentiability order from that of the involved nonlinearities. The technique employed here is an adaptation of that used recently in [43] in a similar context. For the reader's convenience, we give the details of the argument. For this, some functional analytic preparations are in order. We first define the linear spaces which are Banach spaces when endowed with their natural norms. Next, we introduce the linear space which becomes a Banach space when endowed with the norm Finally, we fix constantsr − ,r + such that with the constants introduced in (S2) and (2.39). We then consider the set which is obviously an open subset of the space Y.
We first prove an auxiliary result for the linear initial-boundary value problem which for λ 1 = λ 2 = 1 and λ 3 = λ 4 = 0 coincides with the linearization of the state equation at ((u 1 , u 2 ), (µ, ϕ, σ)). We will need this slightly more general version later for the application of the implicit function theorem. To shorten the exposition, we introduce the Banach space of the initial data satisfying (2.33), which is given by endowed with its natural norm.

Remark 4.2.
A careful inspection of the estimates (4.21)-(4.30) shows that the term M appearing on the right-hand side of the above estimates can be replaced by since, in particular, only the L 2 (Q) norms of the increments h i , i = 1, 2, and of the source terms f i , i = 1, 2, 3, enter the computations. This, along with (4.31), entails that the linearized variables (µ, ϕ, σ) (which correspond to the choices λ 1 = λ 2 = 1, with some positive constant c (cf. also Remark 4.6), being h = (h 1 , h 2 ).
Next, writing (4.7) for t = 0 and recalling (4.10), we have that Now, in view of (4.37) and (4.31), a comparison of terms in (4.7) and standard elliptic estimates yield that ϕ L ∞ (0,T ;H 2 (Ω)) ≤ C 20 M, (4.38) and the compactness of the embedding (W 1,∞ (0, T ; H) ∩ L ∞ (0, T ; H 2 (Ω))) ⊂ C 0 (Q) (see, e.g., [42,Sect. 8,Cor. 4]) then shows that also At this point, we observe that, by bringing the term ∂ t ϕ to the right-hand side, equation About the linear dependence of the right-hand sides of (4.40) and (4.41) on the constant M that is specified in (4.19), we point out that this dependence is a consequence of the linearity of the problem (4.6)-(4.10). Indeed, e.g., if we choose a full set of data for which M = 1 and prove the above estimates, then we obtain all of the bounds (4.31) and (4.37)-(4.41) with some particular constants -fully determined -and without specification of M. Next, we can take a generic set of data for which the constant M ( = 0) in (4.19) is different from 1. Thus, pick the corresponding solution (µ, ϕ, σ) of (4.6)-(4.10) and divide all components of the triplet (µ, ϕ, σ) by M; then, the scaled triplet (µ/M, ϕ/M, σ/M) solves another problem in which the data are all divided by M and satisfy (4.19) with constant 1. Hence, the previously found universal estimates work also for (µ/M, ϕ/M, σ/M) with the same constants as before. As a consequence, it is straightforward to finally obtain (4.40) and (4.41) (and previous estimates as well), simply by multiplying by M.
Remark 4.3. We point out that the above linearity argument has been implicitly used in the paper [11] (see, in particular, [11, (3.10) and (3.14)]). On the other hand, distinct approaches may be possible; in particular, we aim to mention the analysis performed in [43] where, for slightly more regular initial data, the authors can obtain continuity for all components of the solutions (µ, ϕ, σ), and boundedness estimates analogous to (4.40) and (4.41) are satisfied.
Having proved Lemma 4.1, we are in a position to prepare for the application of the implicit function theorem. For this purpose, let us consider two auxiliary linear initialboundary value problems. The first, ∂ n µ = ∂ n ϕ = ∂ n σ = 0 on Σ, (4.47) is obtained from (4.6)-(4.10) for λ 1 = λ 2 = λ 4 = 0, λ 3 = 1. Thanks to Lemma 4.1, this system has for each ( a unique solution (µ, ϕ, σ) ∈ Y, and the associated linear mapping is continuous. The second system reads
We are going to apply the implicit function theorem to the equation (4.59). To this end, we need the differentiability of the involved mappings.
With these denotations, we want to prove the differentiability of the control-to-state mapping u → y defined implicitly by the equation F(u, y) = 0, using the implicit function theorem. Now let u ∈ U R be given and y = S(u). We need to show that the linear and continuous operator D y F(u, y) is a topological isomorphism from Y into itself.
To this end, let v ∈ Y be arbitrary. Then the identity D y F(u, y)(y) = v just means that G 1 (D y G 3 (u, y)(y)) − y = v, which is equivalent to saying that The latter identity means that w is a solution to (4.6)-(4.10) for λ 1 = λ 3 = 1, λ 2 = λ 4 = 0, with the specification (f 1 , f 2 , f 3 ) = −G 1 (D y G 3 (u, y)(v)) ∈ Y. By Lemma 4.1, such a solution w ∈ Y exists and is uniquely determined. We thus can infer that D y F(u, y) is surjective. At the same time, taking v = 0, we see that the equation D y F(u, y)(y) = 0 means that y is the unique solution to (4.6)-(4.10) for λ 1 = 1, λ 2 = λ 3 = λ 4 = 0. Obviously, y = 0, which implies that D y F(u, y) is also injective and thus, by the open mapping principle, a topological isomorphism from Y into itself.
At this point, we may employ the implicit function theorem (cf., e.g., [3, Thms. 4.7.1 and 5.4.5] or [15, 10.2.1]) to conclude that the mapping S is twice continuously Fréchet differentiable from U R into Y and that the first Fréchet derivative DS(u) of S at u ∈ U R is given by the formula (4.64) Now let h = (h 1 , h 2 ) ∈ U be arbitrary and y = (µ, ϕ, σ) = DS(u)(h). Then, which is obviously equivalent to saying that This, in turn, means that y is the unique solution to the problem (4.6)-(4.10) for λ 1 = λ 2 = 1, λ 3 = λ 4 = 0.
In summary, we have shown the following result.
Motivated by the forthcoming analysis, we now present a stability estimate for the solutions to the linearized system. In abuse of notation, we will denote the linearized variable associated with ϕ by ξ, which up to now was devoted to indicate a selection of the subdifferential ∂F 1 evaluated at some point. Since the optimal control problem demands to work with strong solutions, we no longer have any selection to work with, so that from now on the variable ξ will play a different role (cf. Theorem 2.3). We have the following result, where we recall the definition (2.3) of the space V.
Furthermore, let (η i , ξ i , θ i ) denote the associated solutions to the linearized system (i.e., the system (4.6)-(4.10) with λ 1 = λ 2 = 1, λ 3 = λ 4 = 0). Then the mapping DS is Lipschitz continuous on U R in the sense that there exists a positive constant K d such that for all h ∈ U we have Proof. Due to Theorem 4.4, the proof of (4.65) reduces to showing that there exists a constant c > 0 such that which is the estimate we are going to check. Moreover, let us notice that (4.32) in Remark 4.2 guarantees the existence of a positive constant c such that Next, we set and observe that the triple (η, ξ, θ) solves the system obtained from taking the difference between the linearized system written for (η 1 , ξ 1 , θ 1 ) and (η 2 , ξ 2 , θ 2 ), which reads where now f 1 and f 2 are specified by Moreover, let us recall that due to (2.41), we have the stability estimate We now aim at deriving some a priori estimates for the differences (η, ξ, θ) which will entail (4.66). Prior to this, let us premise a general fact that will be employed several times later on. To this end, let f : R → R be a regular, bounded and Lipschitz continuous function with Lipschitz continuous and bounded derivative f ′ (in what follows the role of f will be played by P , , F , and possibly their derivatives). Let ϕ 1 and ϕ 2 be the second components of different strong solutions (µ 1 , ϕ 1 , σ 1 ) and (µ 2 , ϕ 2 , σ 2 ) associated with controls u 1 , u 2 ∈ U R according to Theorem 2.3. Then it follows from the continuity of the embeddings V ⊂ L p (Ω) and W 0 ⊂ W 1,p (Ω), p ∈ [1,6], and the estimate (2.38) that, a. e. in (0, T ), (4.74) In addition, for the sake of a shorter exposition, in the upcoming estimates we will avoid to explicitly write the integration variable s in the time integrals.
Using the Young and Hölder inequalities, the Lipschitz continuity of P, P ′ and ′ , the continuous embedding V ⊂ L 4 (Ω), the uniform bounds (2.38)-(2.40) for (µ i , ϕ i , σ i ), i = 1, 2, as well as (4.67), (4.73), and (4.74), we infer that Next, by similar computations, we deduce that, for any δ > 0 (to be chosen later), The terms involving the potentials can be easily handled by invoking the separation principle (2.39), which entails the Lipschitz continuity of F and of its derivatives. Using this, (4.67), (4.74), and the Young inequality, we obtain that Finally, we have that At this point, we collect the above estimates and adjust δ ∈ (0, 1) small enough. Gronwall's lemma then yields that Second estimate: Estimate (4.75) entails that the term ∂ t ξ is bounded in L 2 (0, T ; H) by the expression on the right-hand side of the above estimate. Hence, equation (4.69) can be expressed as an elliptic equation for ξ whose right-hand side is bounded in L 2 (0, T ; H) by the same expression. Therefore, we easily get from elliptic regularity theory that This concludes the proof of Theorem 4.5.
Remark 4.6. Let us point out that Theorem 4.4 establishes the Fréchet differentiability of S as a mapping from L ∞ (Q) 2 into Y, a space of very regular functions. However, the Fréchet differentiability can also be directly obtained as a mapping from L 2 (0, T ; H) 2 into a space of less regular functions. Indeed, a closer look at the proof of [37,Theorem 2.5] reveals that the line of argumentation employed there can straightforwardly be adapted to our present situation, yielding that in the notation used there it holds that which in turn entails that the control-to-state operator S is Fréchet differentiable as a mapping from L 2 (0, T ; H) 2 into Z ⊃ Y. We have chosen to not follow this approach because, although it will suffice to handle the first-order necessary conditions pointed out below in Section 5, it would not allow us to deal with the second-order sufficient conditions established in Section 6.
Motivated by the forthcoming analysis on the second-order sufficient optimality conditions, let us now also explicitly identify the second-order Fréchet derivative of the controlto-state operator S, which exists according to Theorem 4.4. To this end, let u ∈ U R be given. For arbitrary increments h, k ∈ U, we set where it is known that (η h , ξ h , θ h ), (η k , ξ k , θ k ) ∈ Y. Now, if one adapts the argumentation of the proof of [45,Theorem 5.16, to the present situation, starting from the identities (4.57)-(4.63), then one concludes that the second-order Fréchet derivative D 2 S(u)(k)(h) can be evaluated using the system (4.78)-(4.82) introduced below. Since we intend to give an independent proof, we do not give the details of the argument, here. We begin our analysis with the following result.
It remains to show (4.85). To this end, we first add the term ψ on both sides of (4.79) and then test (4.78) by ν, (4.79) by ∂ t ψ, and (4.80) by ρ. Adding the resulting equalities, and integrating over time and by parts, we infer that Qt ∇ψ · ∇ρ + Qt g 2 ν =: with obvious meaning. Now, using the global estimates (2.38) and (2.40), the continuity of the embedding V ⊂ L 4 (Ω), as well as Hölder's inequality and the estimate (4.32) for the linearized variables, we can easily check that Arguing similarly, where we also invoke Young's inequality, we obtain that as well as Finally, using also that u 1 is bounded, we see that Having shown this, we may bring β∂ t ψ to the right-hand side of (4.79). Elliptic regularity theory and the estimates shown above then yield that also which concludes the proof of the assertion.
We now provide the announced independent proof for the form of the second-order derivative.
Proof. By virtue of Theorem 4.4, D 2 S(u) exists as an element of L(U, L(U, Y)). Now, the embedding of Y in V is continuous. Therefore, S is also twice continuously Fréchet differentiable between U and V, and the expressions for the derivatives coincide. It thus suffices to work in the space V. To this end, we recall that U R is open in U. Hence, there is some K > 0 such that u + k ∈ U R whenever k U ≤ K. In the following, we always tacitly assume that the occurring increments k satisfy this condition.
To prove the claim, we proceed in a direct way, by showing that there exist some ε > 0 and C > 0 such that At this point, we introduce some additional notation: for arbitrary h, k ∈ U, we define the linearized variables Notice that (4.32) implies that where, here and in the remainder of this proof, c > 0 denote constants that may depend on the data, but not on the choice of k ∈ U with u + k ∈ U R .
Next, we fix some h ∈ U with h U = 1 and introduce the auxiliary variables where (ν, ψ, ρ) stands for the unique solution to the bilinearized system as obtained from Theorem 4.7. With this notation, we realize that (4.92) reduces to which is the estimate we are going to show for some ε > 0. To this end, we first observe that these new variables solve the initial-boundary value problem
Before estimating in detail, let us recall that, owing to the continuous dependence results obtained in Theorem 2.3 and in Theorem 4.5, and recalling that h U = 1, we have that In particular, we point out that (4.100) We will use this estimate in the following at several places without further reference.

Similar computations show that also
for a positive δ yet to be determined. As for I 3 , we use (2.40) and, for the first term in Λ 3 , we can argue as in (4.101)-(4.102) and the following lines. Then, we have that and, as for I 4 , we see that where the first term can be absorbed on the left-hand side of (4.103). Therefore, collecting the above estimates and choosing δ > 0 small enough, we obtain from Gronwall's lemma that With this estimate shown, we can move the term β∂ t φ to the right-hand side of (4.96), and it readily follows from elliptic regularity theory that also φ L 2 (0,T ;W 0 ) ≤ c k 2 L 2 (0,T ;H) 2 . In conclusion, the estimate (4.94) is valid with ε = 2. This finishes the proof of the assertion.
The last result we present in this section will be useful later on to handle the secondorder sufficient optimality conditions and establishes the Lipschitz continuity of D 2 S in a suitable sense. Theorem 4.9. Suppose that the conditions (W1), (S1)-(S4), (C3) and (2.32) are fulfilled. Moreover, let the initial data (µ 0 , ϕ 0 , σ 0 ) satisfy (2.33)-(2.34), and let u i = (u i 1 , u i 2 ) ∈ U R be controls with the associated states (µ i , ϕ i , σ i ) = S(u i ), i = 1, 2. Furthermore, let h, k ∈ U be admissible increments with the corresponding linearized variables with a constant K > 0 that does not depend on the choice of h, k ∈ U R .
Proof. By virtue of Theorem 4.8, (4.104) directly follows once we show the existence of some c > 0 such that which is the estimate we are going to check. To this end, we set and observe that the triple of differences (ν, ψ, ρ) solves the system in Ω, (4.110) with Moreover, due to (2.41) and (4.65), we have (4.73) as well as and the corresponding estimate for k. Also, owing to Theorems 2.3, 4.4 and 4.7, for we have (4.67) with the corresponding increment, and from Theorem 4.7, that (4.112) We are now ready to proceed by arguing in a similar fashion as in Theorem 4.5 in order to check (4.105).
First estimate: To begin with, we multiply (4.106) by ν, (4.107), to which we add to both sides the term ψ, by ∂ t ψ, and (4.108) by ρ. Then we add the resulting equalities and integrate Q t and by parts to obtain Qt ∇ψ · ∇ρ =: I 1 + I 2 + I 3 + I 4 .
Then, using the Young and Hölder inequalities, the boundedness and the Lipschitz continuity of P (i) and (i) for i = 1, 2, the continuous inclusion V ⊂ L p (Ω) for p ∈ [1, 6], (4.74), along with the uniform bounds indicated above and the stability estimates (4.73) and (4.111), we infer that Besides, similar computations on the second term allow us to find out that, for any δ > 0, Employing the separation property (2.39), which entails the Lipschitz continuity of F and of its derivatives, and Young's inequality, we obtain that Finally, by Young's inequality we deduce that Therefore, choosing δ > 0 small enough, and invoking Gronwall's lemma, we conclude that Second estimate: The above estimate entails that the norm of ∂ t ψ in L 2 (0, T ; H) is bounded by the expression on the right-hand side. Thus, in equation (4.107) the term ∂ t ψ can be absorbed on the right-hand side. A straightforward computation, which may be left to the reader, shows that the entire right-hand side is bounded in L 2 (0, T ; H) by the same expression. Hence, we can infer from the elliptic regularity theory that also This concludes the proof of the assertion.

First-order Necessary Optimality Conditions
We now derive first-order necessary optimality conditions. By the well-known characterization for differentiable maps on convex sets, it holds (see, e.g., [45]) that where DJ red denotes the derivative of the reduced cost functional given by where (η, ξ, θ) denotes the unique solution to the linearized system obtained from Lemma 4.1 with h = u − u and λ 1 = λ 2 = 1, λ 3 = λ 4 = 0.
As usual, we simplify (5.3) by means of the adjoint state variables (p, q, r), which are defined as the solution triple to the adjoint system whose strong form is given by the backward-in-time parabolic system where (p, q, r) satisfies for every v ∈ V and almost every t ∈ (0, T ), as well as the terminal conditions in Ω . Proof. For brevity, we again argue formally, thus avoiding the introduction of approximation schemes like in the proof of Lemma 4.1 and just providing the relevant a priori estimates. Moreover, let us notice that the adjoint system (5.4)-(5.8) is linear, so that the uniqueness part also follows from standard arguments as a consequence of the following estimates.
We now establish second-order sufficient optimality conditions. Since the control-to-state mapping S is only known to be Fréchet differentiable on U, we are faced with the so-called "two-norm discrepancy" (see also [45,Sec. 4.10.2]). In order to overcome this difficulty, we follow the approach taken in [45,Chap. 5]. Since many of the arguments developed here are rather similar to those employed in [45], we can afford to be sketchy here. For full details, we refer the reader to [45], noting that there the case of one control variable is treated while in our case we have to deal with a pair of controls. In order to simplify the analysis somewhat, we now make an additional assumption.
Notice that under the assumption (C4) we have a zero terminal condition for p + βq in (5.8). This easily leads to the conclusion that we have the additional regularity q ∈ Z, which, in turn, means that the adjoint system is satisfied in the strong form (5.4)-(5.8).
To prove this claim, we multiply (4.78) by p, (4.79) by q, (4.80) by r, add the resulting equalities, and integrate over Q, to obtain that with the functions g 1 and g 2 defined in (4.83)-(4.84). Then, we integrate by parts and make use of the initial and terminal conditions (4.82) and (5.8) to find that − P ′ (ϕ)ξ h (θ k − χ ξ k − η k )(p − r) + ′′ (ϕ)ξ k ξ h u 1 p + ′ (ϕ)ξ h k 1 p + ′ (ϕ)ξ k h 1 p + F (3) (ϕ)ξ h ξ k q , whence the claim follows by using the adjoint system. From this characterization, along with (6.2) and (6.3), we conclude that This explicit expression for the second-order derivative of J red allows us to establish sufficient conditions for optimality of u. We aim at showing that, under suitable assumptions, D 2 J red (u) is a positive definite operator on a suitable subset of L 2 (0, T ; H) 2 , meaning that for any admissible increment h it holds that D 2 J red (u)(h, h) > 0. (6.6) However, (6.6) is rather restrictive as we need such a condition just along some suitable directions. To this end, for every τ > 0, we introduce the sets of strongly active constraints, A 1 τ (u) := (x, t) ∈ Q : − (ϕ(x, t))p(x, t) + b 0 u 1 (x, t) > τ , A 2 τ (u) := (x, t) ∈ Q : r(x, t) + b 0 u 2 (x, t) > τ .
In particular, it follows that u is locally optimal for (CP) in the sense of U.
For the proof we follow the line of argumentation employed in the proof of [45,Thm. 5.17], where in our case we deal with a system of parabolic equations and a pair of controls, with state and control nonlinearly coupled. However, the techniques used in [45] can straightforwardly be adapted to our more complicated situation. We therefore merely sketch the arguments.
We thus can conclude that so that (6.11) directly follows. With this, the proof can be completed by adapting the argumentation of [45] correspondingly to our situation.