Continuity of the Value Function for Deterministic Optimal Impulse Control with Terminal State Constraint

A deterministic optimal impulse control problem with a terminal state constraint is considered. Due to the presence of the terminal state constraint, the value function may be discontinuous in general. The main contribution of this paper is to identify suitable and reasonable conditions under which the value function is continuous. The value function can then be characterized as the unique viscosity solution to the corresponding Hamilton-Jacobi-Bellman equation.

To measure the performance of the impulse control ξ(·), we introduce the following cost functional:
(1.4) J(t, x; ξ(·)) = ∫_t^T g(s, X(s)) ds + h(X(T)) + Σ_{k≥1} ℓ(τ_k, X(τ_k − 0), ξ_k),
where g : [0, T] × R^n → R, h : R^n → R and ℓ : [0, T] × R^n × K → R are suitable maps. Here, the terms on the right-hand side of (1.4) are called the running cost, the terminal cost and the impulse cost, respectively. Since we allow τ_k = τ_{k+1} for some k ≥ 1, X(τ_k − 0) stands for the state right before the impulse ξ_k is made. In the above, we may assume that g and h are merely bounded uniformly from below; by a possible translation, we can simply assume that they are non-negative, for convenience. This will be assumed throughout the paper. We emphasize that the impulse cost ℓ(t, x, ξ) is strictly positive. The dependence of ℓ(t, x, ξ) on x (as explained above) makes our problem a little different from the standard classical optimal impulse control problem. Similar to the standard case, we can pose the optimal impulse control problem as follows, which we still refer to as a classical one.
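To make the structure of (1.4) concrete, the following is a minimal numerical sketch that evaluates the cost functional for a given impulse control by an Euler scheme; the particular f, g, h and ℓ below are illustrative assumptions, not the data of this paper.

import numpy as np

def J(t, T, x0, f, g, h, ell, impulses, n_steps=2000):
    """Euler approximation of J(t, x; xi(.)) in (1.4); `impulses` is a
    list of pairs (tau_k, xi_k) with t <= tau_1 <= ... <= T."""
    dt = (T - t) / n_steps
    X, s = x0, t
    running, impulse_cost = 0.0, 0.0
    k = 0
    impulses = sorted(impulses)
    for _ in range(n_steps):
        while k < len(impulses) and impulses[k][0] < s + dt:
            tau, xi = impulses[k]
            impulse_cost += ell(tau, X, xi)   # cost is charged at X(tau - 0)
            X += xi                           # the state jumps by xi_k
            k += 1
        running += g(s, X) * dt               # running cost g
        X += f(s, X) * dt                     # Euler step for the dynamics (1.1)
        s += dt
    while k < len(impulses):                  # impulses made exactly at s = T
        tau, xi = impulses[k]
        impulse_cost += ell(tau, X, xi)
        X += xi
        k += 1
    return running + h(X) + impulse_cost      # terminal cost h added at the end

# Illustrative data (assumptions for this sketch only):
f = lambda s, x: 1.0
g = lambda s, x: 0.0
h = lambda x: 9.0 * (x - 0.4) ** 2
ell = lambda s, x, xi: 1.0 + abs(xi)          # strictly positive impulse cost
print(J(0.0, 1.0, -0.5, f, g, h, ell, impulses=[(0.5, 0.6)]))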
Different from the classical situation, we now introduce a terminal state constraint:
X(T; t, x, ξ(·)) ∈ D̄,
where D is a non-empty proper domain in R^n (a non-empty open and connected subset with D ≠ R^n), with D̄ being its closure. We may also call D̄ a target. For any initial pair (t, x) ∈ [0, T) × R^n, we introduce the following associated admissible impulse control set:
K_x[t, T] ≡ {ξ(·) ∈ K[t, T] : X(T; t, x, ξ(·)) ∈ D̄}.
We state the following optimal control problem.
An impulse control ξ̄(·) ∈ K_x[t, T] satisfying (1.10) is called an optimal impulse control of Problem (C); the corresponding X̄(·) ≡ X(·; t, x, ξ̄(·)) and (X̄(·), ξ̄(·)) are called an optimal state trajectory and an optimal pair, respectively. Also, V(·, ·) is called the value function. Recall the common convention that inf ∅ = +∞ for ∅ ⊂ R. Thus, when K_x[t, T] = ∅, we accept the convention V(t, x) = +∞. Therefore, it is convenient to make the following convention: we let
D(V) ≡ D(K; D̄) ≡ {(t, x) ∈ [0, T] × R^n : K_x[t, T] ≠ ∅},
which is called the domain of the value function V(·, ·). Clearly, recalling that g(·, ·), h(·) are non-negative and ℓ(·, ·, ·) is positive, one automatically has V(t, x) ≥ 0 on D(V). The notation D(K; D̄) emphasizes the compatibility of the set K, the target D̄, and the dynamics (1.1).
With the presence of the terminal state constraint, some natural questions arise: when is the admissible impulse control set non-empty, and is the value function continuous? Optimal control problems (with the usual continuous-time controls, not impulse controls) under (terminal) state constraints have been extensively studied in the literature, especially by the Pontryagin maximum principle approach (also referred to as the variational method); see, for example, [10,11,17]. However, such problems are difficult to treat by the dynamic programming method. For the time optimal control problem (with a target set, a terminal state constraint), which is comparable to our Problem (C), the so-called small time local controllability (STLC, for short) was introduced to ensure that the value function is continuous ([12,1]). This condition implies that when the state gets close to the boundary of the target set (from outside), only a small amount of time is needed to reach the target by a control action. This then leads to the continuity of the value function. Inspired by the STLC, for our Problem (C), we will introduce a suitable condition under which the difference of the best costs becomes small as the terminal state approaches the boundary of the constraint set, either from inside or from outside. This leads to the continuity of the value function for Problem (C). The major contribution of this paper is the discovery of such a condition, which can be regarded as a version of STLC for optimal impulse controls.
The rest of the paper is organized as follows. Section 2 is devoted to some preliminary results. In Section 3, we present some properties of the value function. The continuity of the value function is proved in Section 4, where an interesting example is also presented. In Section 5, we establish the dynamic programming principle, derive the HJB equation, and characterize the value function as the unique viscosity solution to the HJB equation.

Preliminary Results
Before going further, let us first introduce the following hypotheses.
(H1) K ⊆ R^n is a closed set satisfying (1.3), and D ⊂ R^n is a non-empty proper convex domain (a non-empty open and connected subset, different from R^n).
(H2) The map f : [0, T] × R^n → R^n is continuous and there exists a constant L > 0 such that
|f(t, x) − f(t, x̂)| ≤ L|x − x̂|, ∀t ∈ [0, T], x, x̂ ∈ R^n.
Note that condition (1.3) implies that K is a convex cone with vertex at the origin. Such a condition implies that if ξ and ξ′ are two admissible impulses, so is ξ + ξ′. In what follows, we call the impulse control that contains no impulses the trivial impulse control and denote it by ξ_0(·). Note that, due to the presence of the (strictly positive) impulse cost, the trivial impulse control is different from the zero impulse control (which contains impulses with ξ_k = 0). Let us first present the following result concerning the state trajectories.
From (2.4) above, we see that although s ↦ X(s) might have jumps, these jumps can be controlled in appropriate ways.
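Since estimate (2.4) is not reproduced above, the following sketch checks, as a stand-in, the standard Gronwall-type bound underlying such estimates, namely |X(s)| ≤ (|x| + Σ_{τ_k≤s} |ξ_k| + M(s − t)) e^{L(s−t)} for f Lipschitz in x with constant L and M = sup_s |f(s, 0)|; the dynamics and impulse data below are illustrative assumptions.

import numpy as np

f = lambda s, x: np.sin(x) + np.cos(s)        # Lipschitz with L = 1; |f(s, 0)| <= 1
L, M = 1.0, 1.0
t, T, x0 = 0.0, 2.0, 1.5
impulses = [(0.5, -2.0), (0.5, 1.0), (1.3, 0.7)]  # tau_1 = tau_2 = 0.5 is allowed

n = 4000
dt = (T - t) / n
X, s, jumps, ok, k = x0, t, 0.0, True, 0
for _ in range(n):
    while k < len(impulses) and impulses[k][0] <= s:
        jumps += abs(impulses[k][1])          # accumulate |xi_k| up to time s
        X += impulses[k][1]                   # the state jumps by xi_k
        k += 1
    bound = (abs(x0) + jumps + M * (s - t)) * np.exp(L * (s - t))
    ok = ok and abs(X) <= bound + 1e-8        # the a priori bound on |X(s)|
    X += f(s, X) * dt                         # Euler step between impulses
    s += dt
print(ok)  # expected: True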
Let us now look at the non-emptiness of the admissible impulse control set K_x[t, T]. Recalling D(K; D̄) (see (1.13)), we see that this non-emptiness amounts to (t, x) ∈ D(K; D̄). Note that even in the case K = R^n, due to the presence of the terminal state constraint, Problem (C) is still not trivial. One reason is that the continuity of the value function is not necessarily guaranteed; see later sections for details. Now, we present the following simple result.

Proposition 2.2. Suppose
(2.7) D̄ − K = R^n.
Then
(2.8) D(K; D̄) = [0, T] × R^n.

Proof. For any (t, x) ∈ [0, T] × R^n, under the trivial impulse control ξ_0(·), the state will arrive at X(T − 0; t, x, ξ_0(·)) ∈ R^n. By (2.7), we have some η ∈ D̄ and ξ ∈ K such that
X(T − 0; t, x, ξ_0(·)) = η − ξ.
Then, defining the impulse control ξ(·) that makes the single impulse ξ at time T, we obtain X(T; t, x, ξ(·)) = η ∈ D̄. Thus, K_x[t, T] ≠ ∅. This proves our conclusion.
One of the most interesting examples is the following: if K = R^n, then (2.7) holds. Consequently, for such a case, one has (2.8). The following is a refinement of the above Proposition 2.2.

Proposition 2.3. (i) Suppose D ⊂ R^n is a conic domain with vertex at the origin, and suppose that K ∩ D̄ contains a unit vector ξ_0. Then there exists a constant C > 0 depending on D such that for any x ∈ R^n, one can find an η ∈ D̄ and a ξ ∈ K such that
(2.10) x = η − ξ, |η| + |ξ| ≤ C|x|.
Consequently, (2.7) holds, and so does (2.8).
(ii) Suppose D is bounded and K ≠ R^n. Then (2.7) must fail.
Proof. (i) If x ∈ D̄, we may take η = x and ξ = 0. If x = −λξ_0 for some λ > 0, then by taking η = 0 ∈ D̄ and ξ = λξ_0, we have (2.10). Hence, we need only look at the case x ∉ D̄ with x and ξ_0 linearly independent. Consider the two-dimensional subspace H spanned by x and ξ_0. After a proper linear transformation, we may reduce to the following situation in R^2:
(2.11) D corresponds to the planar cone {(x_1, x_2) : |x_2| < αx_1} for some α > 0, with ξ_0 along its axis, so that x_2 > αx_1 whenever x_2 ≥ 0.
Likewise, in the case x_2 < 0, we must have x_2 < −αx_1. Take
(2.12) ξ = λξ_0, λ = |x_2|/α − x_1 > 0,
so that η = x + ξ ∈ ∂D ⊂ D̄ with |η| + |ξ| ≤ C|x|. Taking (2.11) and (2.12) into account, we see that (2.10) holds.
(ii) Let D be bounded. Since K ≠ R^n is a closed convex cone, there must be a unit vector ζ ∈ R^n such that ⟨ζ, ξ⟩ ≥ 0 for all ξ ∈ K. Now, we claim that λζ ∉ D̄ − K for large enough λ > 0. In fact, if there exist an η_λ ∈ D̄ and a ξ_λ ∈ K such that λζ = η_λ − ξ_λ, then
λ = ⟨η_λ, ζ⟩ − ⟨ξ_λ, ζ⟩ ≤ ⟨η_λ, ζ⟩ ≤ sup_{η∈D̄} |η| < ∞,
which is impossible for λ large, since D is bounded. Hence, (2.7) fails.
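Part (i) can be illustrated numerically. The sketch below realizes the decomposition (2.10) in a concrete planar case; the cone parameter α and the choice ξ_0 = (1, 0) along the axis of D are illustrative assumptions.

import numpy as np

# Illustrative planar data: D = {(x1, x2) : |x2| < alpha * x1}, K = {lam * xi0 : lam >= 0}.
alpha = 0.5
xi0 = np.array([1.0, 0.0])

def decompose(x):
    """Return (eta, xi) with eta in cl(D), xi in K and x = eta - xi."""
    lam = max(0.0, abs(x[1]) / alpha - x[0])  # push along xi0 until cl(D) is reached
    xi = lam * xi0
    eta = x + xi                              # eta = x + xi lies in cl(D)
    return eta, xi

rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.normal(size=2) * 10.0
    eta, xi = decompose(x)
    assert np.allclose(x, eta - xi)           # the decomposition x = eta - xi
    in_closure = abs(eta[1]) <= alpha * eta[0] + 1e-9
    ratio = (np.linalg.norm(eta) + np.linalg.norm(xi)) / np.linalg.norm(x)
    print(in_closure, round(ratio, 2))        # ratio stays bounded: the C in (2.10)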
Let us look at the following simple example to gain some more intuition.
We consider two cases.
This means that the admissible impulse control set K_x[t, T] could be empty for some initial pairs (t, x) ∈ [0, T] × R^n (that is, (2.8) could fail). Moreover, the computation above shows that even after modifying the data in seemingly favorable ways, (2.8) may still fail.
Let us now look at the general situation. First of all, under (H1)–(H2), for any x ∈ D, taking the initial time to be T, one trivially has X(T; T, x, ξ_0(·)) = x ∈ D̄. This means that ξ_0(·) ∈ K_x[T, T]. Hence, under (H1)–(H2), the following is always true:
(2.19) {T} × D ⊆ D(V),
so that, in particular, D(V) ≠ ∅. We now would like to get a more precise description of D(V). For the state equation (1.1), we consider the following "backward" system:
Y′(s) = f(s, Y(s)), s ∈ [t, T], Y(T) = η.
For any x ∈ Y(t; D̄), one has some η ∈ D̄ such that the solution of this backward system with Y(T) = η satisfies Y(t) = x. Then, with the trivial impulse control ξ_0(·), we have X(s; t, x, ξ_0(·)) = Y(s) for s ∈ [t, T], and in particular X(T; t, x, ξ_0(·)) = η ∈ D̄. Thus, Y(t; D̄) is the set of all initial states x such that if the system starts at (t, x), the state reaches D̄ at T under ξ_0(·). Next, according to the above, we see that Y(t; D̄ − K) is the set of all initial states x such that the state can reach D̄ by making a possible impulse at T.
This can be described by the following construction. Given a partition Π : t = t_0 < t_1 < · · · < t_N = T, let Y_Π(t) be the set obtained from D̄ by alternately applying the operation M ↦ M − K (allowing an impulse at each of the points t_N, t_{N−1}, · · · , t_0) and the backward flow of the system between consecutive partition points. Then Y_Π(t) is the set of all initial states x ∈ R^n such that if the system starts at (t, x), with possible impulses at t_0, t_1, · · · , t_N, the state will reach D̄ at T. Clearly, for any two partitions Π_1 and Π_2 with Π_1 ⊆ Π_2, one has Y_{Π_1}(t) ⊆ Y_{Π_2}(t). Hence, we may define
Y(t) ≡ lim_{∥Π∥→0} Y_Π(t) = ∪_Π Y_Π(t),
where ∥Π∥ is the mesh size of Π, defined by
∥Π∥ = max_{1≤k≤N} (t_k − t_{k−1}).
From the construction, we see that Y(t) is the set of all initial states x such that if the system starts from (t, x), then, with impulse controls, the state can reach D̄ at T, i.e.,
Y(t) = {x ∈ R^n : K_x[t, T] ≠ ∅}.
Hence, we have the following characterization of D(V):
D(V) = {(t, x) ∈ [0, T] × R^n : x ∈ Y(t)}.
The following example gives a concrete construction of Y(t).
Note that as s decreases from T, the vector (Y_1(s), Y_2(s))^⊤ turns counter-clockwise. We may keep making impulses to see how Y(t) expands; the resulting sets are described using co(M), the convex hull of the set M, i.e., the smallest convex set containing M. In the following illustrative figures, the blue arrow lines give the directions of impulses, and the dashed arcs give the directions in which the points turn. Thus, in the last figure (for the situation T − t > π/2 + ε), for any initial point (x_1, x_2) with x_2 > 1, one could first make a horizontal impulse ξ = (ξ_1, 0) so that (x_1 + ξ_1, x_2) is on the right of the dashed red line. Then the original system makes the point turn clockwise, and at t = T the point will be in Y(T). By making an impulse at t = T, the state will get into D̄.
From the above, we see that, in general, D(V) could be a proper subset of [0, T] × R^n. However, one has D(K; D̄) ⊇ [0, T] × D̄.
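The partition construction of Y_Π(t) above can also be sketched numerically. The following grid-based computation uses illustrative one-dimensional data (a drift f, target D̄ = [0, 1] and impulse cone K = [0, ∞), none of which come from the example above); it alternates the backward flow with the operation M ↦ M − K at the partition points, and confirms that refining the partition enlarges Y_Π(t).

import numpy as np

f = lambda s, x: np.sin(x) + 0.5              # illustrative dynamics
xs = np.linspace(-6.0, 6.0, 1201)             # state grid

def flow_back(mask, s1, s0, n=200):
    """x belongs to the result iff the flow started at (s0, x) lands in `mask` at s1."""
    y = xs.copy()
    dt = (s1 - s0) / n
    for i in range(n):                        # integrate x' = f(s, x) from s0 to s1
        y = y + f(s0 + i * dt, y) * dt
    return np.interp(y, xs, mask.astype(float), left=0.0, right=0.0) > 0.5

def minus_K(mask):
    """M - K for K = [0, inf): x is in M - K iff x + xi is in M for some xi >= 0."""
    out = mask.copy()
    for i in range(len(out) - 2, -1, -1):
        out[i] = out[i] or out[i + 1]         # fill leftward from M
    return out

def Y_Pi(partition, target):
    """Grid approximation of Y_Pi(t_0), impulses allowed at every partition point."""
    mask = target
    ts = list(partition)
    for s1, s0 in zip(ts[::-1], ts[-2::-1]):  # step back through the partition
        mask = flow_back(minus_K(mask), s1, s0)
    return minus_K(mask)                      # an impulse is allowed at t_0 as well

target = (xs >= 0.0) & (xs <= 1.0)            # the closure of D = (0, 1)
coarse = Y_Pi(np.linspace(0.0, 1.0, 3), target)
fine = Y_Pi(np.linspace(0.0, 1.0, 9), target)
print(bool(coarse.sum() <= fine.sum()))       # expected: True (Y_Pi grows under refinement)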

Properties of the Value Functions
In this section, we will present some properties of the value function V(·, ·). To this end, we introduce the following hypotheses.
As we indicated in the introduction, one may merely assume that g and h are uniformly bounded from below; here we directly assume them to be non-negative just for convenience. Condition (3.3) implies that as long as an impulse is made, no matter how small ξ is, there is a strictly positive fixed cost ℓ_0; also, the larger |ξ| is, the larger the cost. Condition (3.5) means that if at (t, x) an impulse of size ξ + ξ′ needs to be made, then one should make just one impulse of that size, instead of making an impulse of size ξ immediately followed by another of size ξ′. Hence, in an optimal impulse control, τ_k < τ_{k+1} if both are impulse moments. In the case that ℓ(t, x, ξ) is independent of x, this condition reduces to
ℓ(t, ξ + ξ′) ≤ ℓ(t, ξ) + ℓ(t, ξ′),
which is a classical condition assumed in optimal impulse control problems. Because of this condition, ξ ↦ ℓ(t, x, ξ) should be "sublinear". Hence, β ∈ (0, 1] and ξ ↦ ℓ(t, x, ξ) grows at most linearly (see (3.3)).
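For a quick sanity check of this subadditivity, one can test a concrete impulse cost of the form ℓ(ξ) = ℓ_0 + c|ξ|^β with β ∈ (0, 1]; this specific form is an illustration, not the paper's general ℓ.

import random

ell0, c, beta = 1.0, 2.0, 0.7                 # illustrative parameters
ell = lambda xi: ell0 + c * abs(xi) ** beta

random.seed(1)
ok = all(
    ell(xi + xip) <= ell(xi) + ell(xip) + 1e-12
    for xi, xip in ((random.uniform(0, 10), random.uniform(0, 10)) for _ in range(1000))
)
print(ok)  # True: one impulse of size xi + xi' never costs more than two separate ones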
Condition (3.6) means that if an impulse is going to be made, then the later, the better, which is essentially due to the discounting effect.
Our goal in this section is to obtain, under certain conditions including (H1)–(H4) and some additional ones, bounds on the value function, as well as a smaller class of impulse controls over which the value function is still the infimum of the cost functional, where each impulse control in this smaller class has no more than a fixed number of impulses, with the sizes of the impulses being bounded.
for some δ_k > 0. We may assume x_k → x_0. Now, for x_0, we can find ξ_0 ∈ K such that x_0 + ξ_0 ∈ D. Hence, for k large enough, we have (noting that D is open) x_k + ξ_0 ∈ D. This is a contradiction. Finally, we define the desired map as above; it satisfies the condition of the lemma.
For the value function V(·, ·) of Problem (C), we have the following result.

(iii) Suppose (3.7) holds. For any (t, x) ∈ [0, T] × R^n, Problem (C) admits an optimal impulse control. Moreover, there exists a natural number N_0 ≥ 1 (depending only on (t, x)) such that (3.10) holds.

Proof. (i) First, from (2.19), we know that D(V) is non-empty. Let (t, x) ∈ D(V); then there exists an impulse control ξ(·) ∈ K_x[t, T] such that X(T; t, x, ξ(·)) ∈ D̄. There are two cases.
Then it is reduced to Case 1.
This is the impulse control that makes only one impulse, at T, sending the state to η ∈ D̄. Clearly, the corresponding cost is bounded in terms of (t, x). Hence, we can find N_0 ≥ 1 such that (3.10) holds.

Continuity of the Value Function
In this section, we will establish the continuity of the value function V(·, ·). Note that, unlike the classical situation, when the terminal state constraint is present, the value function could be discontinuous; suitable conditions are needed to ensure its continuity. To be convincing, let us first look at a simple example.
Hence, during [t, T] an impulse has to be made. If x + T − t > 1, the most economical impulse will be
ξ_1 = 1 − (x + T − t),
where the choice of τ_1 ∈ [t, T] is irrelevant. Under such an impulse control, we have X(T) = 1 ∈ D̄. Apparently, such an impulse control is optimal. Finally, if x + T − t < 0, then we take
ξ_1 = −(x + T − t),
with an arbitrary τ_1 ∈ [t, T]. Again, this impulse control is optimal; with such a control, one has X(T) = 0 ∈ D̄. Consequently, the value function can be computed explicitly. Clearly, this value function V(·, ·) is discontinuous (along the lines x + T − t = 0 and x + T − t = 1).

Now, we modify the cost functional as follows. For any X ∈ R (a possible terminal state location), take ξ ∈ K ≡ R and look at the following:
h(X + ξ) + ℓ(T, X, ξ) = 9(X + ξ − 2/5)² + 1 + |ξ|, X + ξ ∈ [0, 1].
This is the cost at the terminal time T if the terminal state is X and an impulse ξ is made at T. Hence, let us consider the following function, which will help us decide whether we should make an impulse at T:
F(b, X) = 9(b − 2/5)² + 1 + |b − X|, b ∈ [0, 1], b = X + ξ.
For any given X ∈ R, we want to find the minimum of b ↦ F(b, X). To this end, we observe that
∂_b F(b, X) = 18(b − 2/5) + 1 on {b > X}, ∂_b F(b, X) = 18(b − 2/5) − 1 on {b < X}.
Clearly, the corresponding critical points are
b_0 = 2/5 − 1/18 = 31/90, b_1 = 2/5 + 1/18 = 41/90.
Further, for X ∈ (b_0, b_1), we have
min_{b∈[0,1]} F(b, X) = F(X, X) = h(X) + 1.
To summarize, we have
min_{b∈[0,1]} F(b, X) = 247/180 − X, X ≤ b_0; h(X) + 1, b_0 < X < b_1; 103/180 + X, X ≥ b_1,
and we compare h(X) with X ↦ min_{b∈[0,1]} F(b, X). On (0, b_0), we solve
9(X − 2/5)² = 247/180 − X,
whose unique solution is a_0 = 1/90. On (b_1, 1), we solve
9(X − 2/5)² = 103/180 + X,
whose unique solution is a_1 = 71/90. The above tells us that (recalling b = X + ξ)
h(X) ≤ min_{X+ξ∈[0,1]} [h(X + ξ) + ℓ(T, X, ξ)], X ∈ [a_0, a_1],
h(X) > min_{X+ξ∈[0,1]} [h(X + ξ) + ℓ(T, X, ξ)], X ∈ R \ [a_0, a_1].

This means that if the terminal state X(T − 0) ∈ (a_0, a_1), we should not make an impulse at T, and if X(T − 0) ∈ R \ (a_0, a_1), we should make an impulse as follows: jump to b_0 = 31/90 if X(T − 0) < a_0, and jump to b_1 = 41/90 if X(T − 0) > a_1. Combining the above analysis, we obtain the value function (writing X = x + T − t)
V(t, x) = 247/180 − X, if X < a_0; 9(X − 2/5)², if a_0 ≤ X ≤ a_1; 103/180 + X, if X > a_1,
which is continuous.
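The thresholds a_0 = 1/90 and a_1 = 71/90 can be confirmed numerically. The sketch below encodes the example's data as read off from the computations above (h(y) = 9(y − 2/5)², ℓ(T, X, ξ) = 1 + |ξ|, D = (0, 1), K = R; this reading is a reconstruction) and locates where "no impulse" beats the best terminal impulse.

import numpy as np

h = lambda y: 9.0 * (y - 0.4) ** 2
F = lambda b, X: h(b) + 1.0 + np.abs(b - X)   # cost of jumping to b = X + xi at T

bs = np.linspace(0.0, 1.0, 4001)              # candidate landing points in [0, 1]
min_jump = lambda X: F(bs, X).min()           # best cost among terminal impulses

Xs = np.linspace(0.0, 1.0, 8001)
no_jump_better = np.array([h(X) <= min_jump(X) for X in Xs])
a0, a1 = Xs[no_jump_better][0], Xs[no_jump_better][-1]
print(round(a0, 4), round(1 / 90, 4))         # both approximately 0.0111
print(round(a1, 4), round(71 / 90, 4))        # both approximately 0.7889

V = lambda X: h(X) if a0 <= X <= a1 else min_jump(X)   # value as a function of X = x + T - t
print(abs(V(a0 - 1e-4) - V(a0 + 1e-4)) < 1e-2)         # True: no jump of V at a0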
Now, let K = [0, ∞) and D = (0, 1). Then, from Example 2.4, we see that
D(V) = {(t, x) ∈ [0, T] × R : x + T − t ≤ 1},
and only positive impulses can be made. Hence, by the above computation, we see that if X(T − 0; t, x, ξ_0(·)) < a_0, we could make an impulse (jumping up to b_0); in all other cases, we could not/should not make impulses. Therefore, writing X = x + T − t,
V(t, x) = 247/180 − X, if X < a_0; 9(X − 2/5)², if a_0 ≤ X ≤ 1.
This value function is continuous over D(V), which is a closed set.
The above example shows that when the terminal cost function h(·) and the impulse cost are compatible, one can obtain the continuity of the value function V(·, ·). A careful observation shows that when the terminal state gets close to the boundary ∂D of the constraint set D from inside, an impulse should be made to reduce the cost. This essentially eliminates the possible jump of the best costs between the terminal state X(T − 0) being close to the boundary ∂D from outside and from inside of D. We now present the general results. The key condition reads: for every x ∈ ∂D, there exists a ξ ∈ K with x + ξ ∈ D such that
h(x + ξ) + ℓ(t, x, ξ) < h(x).
Proof. Let (t, x) ∈ [0, T] × R^n. For any ε > 0, there exists an impulse control ξ_ε(·) ∈ K_x[t, T] such that
J(t, x; ξ_ε(·)) ≤ V(t, x) + ε.
If X_ε(T) ∈ ∂D, then there exists a ζ ∈ K such that X_ε(T) + ζ ∈ D and h(X_ε(T) + ζ) + ℓ(T, X_ε(T), ζ) < h(X_ε(T)). Thus, by letting ξ̃_ε(·) be the impulse control obtained from ξ_ε(·) by adding the impulse ζ at T, we have X̃_ε(T) ∈ D and J(t, x; ξ̃_ε(·)) ≤ J(t, x; ξ_ε(·)) ≤ V(t, x) + ε. Hence, we may assume that X_ε(T) ∈ D. Now, for any x̄ ∈ R^n, let X̄_ε(·) = X(·; t, x̄, ξ_ε(·)); then |X̄_ε(s) − X_ε(s)| ≤ C|x̄ − x| for all s ∈ [t, T]. Hence, recalling that D is open, for |x̄ − x| small, one sees that ξ_ε(·) ∈ K_x̄[t, T]. Consequently, making use of Propositions 2.1 and 3.2, together with the Lipschitz continuity of x ↦ ℓ(t, x, ξ), we have
V(t, x̄) ≤ J(t, x̄; ξ_ε(·)) ≤ V(t, x) + ε + C(|x| ∨ |x̄|)|x̄ − x|,
for some constant C(|x| ∨ |x̄|) depending on |x| ∨ |x̄|. Since ε > 0 is arbitrary, we obtain
V(t, x̄) ≤ V(t, x) + C(|x| ∨ |x̄|)|x̄ − x|.
By symmetry, we obtain the continuity of x ↦ V(t, x).

Dynamic Programming Principle and HJB Equation
In this section, we are going to establish Bellman's principle of optimality for our Problem (C). Then the corresponding HJB equation for the value function V (· , ·) will be derived.
Note that in the case (t, x) = (t, X(t; t, x, ξ_0(·))) ∉ D(V), the above is trivially true (both sides being +∞). Hence, it suffices to consider the case (t, x) ∈ D(V).
Combining this with (5.2), the proof is complete.
The above result leads to the following Hamilton-Jacobi-Bellman equation for the value function V (· , ·). The proof is standard.
A continuous function V(·, ·) is called a viscosity solution of the HJB equation (5.8) if it is both a viscosity supersolution and a viscosity subsolution.
We now state the following result, whose proof is (almost) standard (see [15, Theorem 3.2.4]). (i) Under the assumptions ensuring the continuity of the value function, V(·, ·) is a viscosity solution to the HJB equation (5.8). (ii) If, in addition, (3.7) and (4.8) hold, then the value function V(·, ·) is the unique viscosity solution to the HJB equation (5.8).
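As a closing illustration, the HJB system can be discretized for the one-dimensional example of Section 4. The sketch below is a standard numerical scheme, not the paper's construction: it performs an upwind transport step for the no-impulse dynamics, imposes the impulse obstacle V ≤ MV after each step, and recovers the closed-form value function computed earlier (the data f = 1, g = 0, h(y) = 9(y − 2/5)², ℓ = 1 + |ξ|, D = (0, 1), K = R are the reconstructed reading used above).

import numpy as np

nx = 601
xs = np.linspace(-1.0, 2.0, nx)
dx = xs[1] - xs[0]
dt = dx                                       # dt = dx makes the upwind step an exact shift
nt = int(round(1.0 / dt))                     # horizon T = 1
BIG = 1e6                                     # numerical stand-in for +infinity

h = lambda y: 9.0 * (y - 0.4) ** 2

def intervention(V):
    """Obstacle operator: M V(x) = min over xi of [1 + |xi| + V(x + xi)] on the grid."""
    costs = 1.0 + np.abs(xs[None, :] - xs[:, None]) + V[None, :]
    return costs.min(axis=1)

# Terminal condition: h on cl(D) = [0, 1], +infinity outside, then enforce V <= M V.
V = np.where((xs >= 0.0) & (xs <= 1.0), h(xs), BIG)
V = np.minimum(V, intervention(V))

for _ in range(nt):
    Vx = np.empty_like(V)                     # upwind difference for V_t + V_x = 0
    Vx[:-1] = (V[1:] - V[:-1]) / dx
    Vx[-1] = Vx[-2]
    V = V + dt * Vx                           # backward step: V(t, x) = V(t + dt, x + dt)
    V = np.minimum(V, intervention(V))        # impulse obstacle V <= M V

def exact(X):                                 # closed form obtained in Section 4
    if X < 1 / 90: return 247 / 180 - X
    if X <= 71 / 90: return h(X)
    return 103 / 180 + X

for x in (-0.9, -0.5, 0.1):                   # compare at t = 0, where X = x + 1
    i = int(np.argmin(np.abs(xs - x)))
    print(x, round(float(V[i]), 2), round(exact(x + 1.0), 2))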