36:[["$","audio",null,{"id":"tts"}],["$","$L3b",null,{"paperID":"2001.03148","publisher":"arxiv","paperJSON":{"title":"Regularity and stability of feedback relaxed controls","paperID":"2001.03148","avgLineHeight":13.56,"imgScale":4,"sections":[{"heading":"Abstract","paragraphs":[[{"text":"$3c","element":"span"}],[{"text":"Key words. ","element":"span"},{"text":"exploration and exploitation, feedback relaxed control, Lipschitz stability, sensitivity equation, reinforcement learning, Hamilton-Jacobi-Bellman equation.","element":"span"}],[{"style":{"width":"64%"},"width":1185,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/0-0.png","element":"img"}]]},{"heading":"1 Introduction","paragraphs":[[{"text":"In this paper, we propose a relaxed control regularization with a class of exploration rewards to design robust feedback controls for multi-dimensional stochastic control problems in a continuous setting. ","element":"span"},{"text":"In particular, we shall rigorously demonstrate that the constructed optimal feedback control is Lipschitz stable with respect to perturbations in the underlying model.","element":"span"}],[{"text":"Since parameter uncertainty in a given model is practically inevitable, it is essential but challenging to ","element":"span"},{"text":"a priori ","element":"span"},{"text":"evaluate the performance of a pre-computed feedback control in a perturbed system, and to design feedback policies capable of handling model uncertainty. For instance, let us consider the following infinite-horizon stochastic control problem. 
Suppose (","element":"span"},{"style":{"height":18.29},"width":114.44,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/0-1.png","element":"img","alt":"αt)t≥0","inline":true,"padRight":true},{"text":"is an admissible control process taking values in a ","element":"span"},{"text":"finite ","element":"span"},{"text":"action space ","element":"span"},{"text":"A","element":"span"},{"text":", and the underlying state dynamics follows a controlled stochastic differential equation (SDE) defined as follows: ","element":"span"},{"style":{"height":19.23},"width":378.24,"height":48.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/0-2.png","element":"img","alt":" Xα,x0 = x ∈ Rn, and","inline":true}],[{"style":{"width":"51%"},"width":944,"height":48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/0-3.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":13.54},"width":925.32,"height":33.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/0-4.png","element":"img","alt":" b : Rn × A → Rn and σ : Rn × A → Rn×n ","inline":true,"padRight":true},{"text":"are given coefficients. ","element":"span"},{"text":"The aim of the controller is to maximize the total expected discounted reward over all admissible strategies. It is well-known that (see e.g. [","element":"span"},{"href":"#id-0","referenceIndex":19,"text":"19","element":"a"},{"text":", Corollary 5.1 on p. 
167] and Theorem ","element":"span"},{"href":"#id-1","text":"2.2 ","element":"a"},{"text":"for more precise statements), under certain regularity assumptions, the optimal control strategy can be represented as a deterministic function ","element":"span"},{"style":{"height":12.8},"width":294.8,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-0.png","element":"img","alt":" αu : Rn → A","inline":true},{"text":", called the optimal feedback control, which maps the current state space into the action space. ","element":"span"},{"text":"Moreover, one can construct such an optimal feedback control ","element":"span"},{"style":{"height":12.33},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-1.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"via a verification argument, which consists of solving a nonlinear Hamilton–Jacobi–Bellman (HJB) partial differential equation (PDE) arising from the dynamic programming principle for the optimal reward function ","element":"span"},{"text":"u","element":"span"},{"text":", and then performing a pointwise maximization of the associated Hamiltonian involving the function ","element":"span"},{"text":"u ","element":"span"},{"text":"and its derivatives (","element":"span"},{"style":{"height":20.4},"width":255.56,"height":51,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-2.png","element":"img","alt":"∂iu, ∂iju)ni,j=1 ","inline":true,"padRight":true},{"text":"as follows: for ","element":"span"},{"text":"any given ","element":"span"},{"style":{"height":15.2},"width":144.48,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-3.png","element":"img","alt":" x ∈ Rn,","inline":true}],[{"id":"id-3","style":{"width":"95%"},"width":1762,"height":129,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-4.png","element":"img"}],[{"text":"where 
","element":"span"},{"style":{"height":19.54},"width":500.56,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-5.png","element":"img","alt":" a(x, α) = σ(x, α)σT (x, α)/","inline":true},{"text":"2, the functions ","element":"span"},{"text":"c ","element":"span"},{"text":"and ","element":"span"},{"text":"f ","element":"span"},{"text":"denote the discount rate and the instantaneous reward, respectively. We refer the reader to Theorems ","element":"span"},{"href":"#id-1","text":"2.2 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-2","text":"3.5 ","element":"a"},{"text":"for rigorous arguments of the above procedure for control problems of our interest, and to [","element":"span"},{"href":"#id-0","referenceIndex":19,"text":"19","element":"a"},{"text":", Theorem 5.1 on p. 166] for a general statement.","element":"span"}],[{"text":"We observe, however, that the control strategy ","element":"span"},{"style":{"height":12.33},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-6.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"satisfying (","element":"span"},{"href":"#id-3","text":"1.1","element":"a"},{"text":") in general is difficult to implement and unstable to parameter perturbations, which in practice would result in numerical instability of learning algorithms. Due to the finiteness of the action space ","element":"span"},{"text":"A ","element":"span"},{"text":"and the fact that arg max is a set-valued mapping, a function ","element":"span"},{"style":{"height":12.8},"width":246.32,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-7.png","element":"img","alt":" αu : Rn → A","inline":true,"padRight":true},{"text":"satisfying (","element":"span"},{"href":"#id-3","text":"1.1","element":"a"},{"text":") in general is non-unique and merely measurable, and hence it is hard to follow such an irregular strategy in practice. 
More importantly, the discreteness of the set ","element":"span"},{"text":"A ","element":"span"},{"text":"implies that the arg max mapping is not continuous (in the sup-norm), which makes the feedback control ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-8.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"very sensitive to perturbations of the coefficients (","element":"span"},{"style":{"height":16.4},"width":144.04,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-9.png","element":"img","alt":"b, σ, c, f","inline":true},{"text":"). In other words, a slight change of the model parameters will result in a significant change of the feedback control, especially in the regions where two or more actions lead to similar performances based on the current model. Since it is difficult to determine the occurrence of such regions ","element":"span"},{"text":"a priori","element":"span"},{"text":", it is unclear how well the control strategy ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-10.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"will perform in a real system with the perturbed coefficients (","element":"span"},{"text":"˜","element":"span"},{"style":{"height":20.21},"width":154,"height":50.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-11.png","element":"img","alt":"b, ˜σ, ˜c, ˜f","inline":true},{"text":"), even if (","element":"span"},{"text":"˜","element":"span"},{"style":{"height":20.21},"width":154,"height":50.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-12.png","element":"img","alt":"b, ˜σ, ˜c, ˜f","inline":true},{"text":") is very close to 
(","element":"span"},{"style":{"height":17.6},"width":174.24,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/1-13.png","element":"img","alt":"b, σ, c, f).","inline":true,"padRight":true},{"text":"See the last paragraph of Section ","element":"span"},{"text":"2 ","element":"span"},{"text":"for more details on the instability of feedback controls and its practical impact on learning algorithms.","element":"span"}],[{"text":"A tremendous amount of effort has been made to overcome the above difficulties, particularly in the (discrete-time) Reinforcement Learning (RL) setting (see e.g. [","element":"span"},{"href":"#id-4","referenceIndex":38,"text":"39","element":"a"},{"text":"]), where the agent seeks (nearly) optimal decisions in a random environment with incomplete information. ","element":"span"},{"text":"Generally speaking, the controller must balance between greedily exploiting the available information to choose actions that maximize short-term rewards, and continuously exploring the environment to acquire more knowledge for long-term benefits. In particular, an entropy-regularized formulation has been proposed for solving (discrete-time) RL problems in [","element":"span"},{"href":"#id-5","referenceIndex":45,"text":"46","element":"a"},{"text":", ","element":"span"},{"href":"#id-6","referenceIndex":33,"text":"33","element":"a"},{"text":", ","element":"span"},{"href":"#id-7","referenceIndex":20,"text":"21","element":"a"},{"text":"], where the authors incorporate explorations by explicitly including the entropy of the exploration strategy in the optimization objective as a reward function, and balance exploitation and exploration by adjusting a weight imposed on this regularization term. ","element":"span"},{"text":"Empirical studies (e.g. 
","element":"span"},{"text":"[","element":"span"},{"href":"#id-5","referenceIndex":45,"text":"46","element":"a"},{"text":", ","element":"span"},{"href":"#id-8","referenceIndex":24,"text":"25","element":"a"},{"text":", ","element":"span"},{"href":"#id-6","referenceIndex":33,"text":"33","element":"a"},{"text":", ","element":"span"},{"href":"#id-7","referenceIndex":20,"text":"21","element":"a"},{"text":"]) show that such a regularized formulation leads to more robust decision making. Recently, the authors in [","element":"span"},{"href":"#id-9","referenceIndex":41,"text":"42","element":"a"},{"text":", ","element":"span"},{"href":"#id-10","referenceIndex":42,"text":"43","element":"a"},{"text":"] extended this entropy-regularized formulation to continuous-time RL problems by using the relaxed control framework, and studied the exploration/exploitation trade-off for one-dimensional linear-quadratic (LQ) control problems via explicit solutions. The relaxed control approach has then been extended to (discrete-time) RL problems with mean-field controls in [","element":"span"},{"href":"#id-11","referenceIndex":22,"text":"23","element":"a"},{"text":"].","element":"span"}],[{"text":"In this work, we propose an exploratory framework with general exploration rewards to design robust feedback controls for continuous-time stochastic exit time problems with continuous state space and discrete action space. Our formulation extends the relaxed control approach in [","element":"span"},{"href":"#id-9","referenceIndex":41,"text":"42","element":"a"},{"text":", ","element":"span"},{"href":"#id-10","referenceIndex":42,"text":"43","element":"a"},{"text":"] to multi-dimensional state dynamics and general exploration rewards, including Shannon’s differential entropy and other commonly used regularization functions in the optimization literature (see e.g. 
[","element":"span"},{"href":"#id-12","referenceIndex":15,"text":"15","element":"a"},{"text":", ","element":"span"},{"href":"#id-13","referenceIndex":44,"text":"45","element":"a"},{"text":"]); see the remark at the end of Section ","element":"span"},{"text":"3 ","element":"span"},{"text":"for a detailed comparison among different exploration reward functions.","element":"span"}],[{"text":"A major theoretical contribution of this work is a rigorous stability analysis of the regularized control problem and its associated feedback control strategy. Although the entropy-regularized RL formulation has demonstrated remarkable robustness in various empirical studies (e.g. [","element":"span"},{"href":"#id-5","referenceIndex":45,"text":"46","element":"a"},{"text":", ","element":"span"},{"href":"#id-8","referenceIndex":24,"text":"25","element":"a"},{"text":", ","element":"span"},{"href":"#id-6","referenceIndex":33,"text":"33","element":"a"},{"text":", ","element":"span"},{"href":"#id-7","referenceIndex":20,"text":"21","element":"a"},{"text":", ","element":"span"},{"href":"#id-11","referenceIndex":22,"text":"23","element":"a"},{"text":", ","element":"span"},{"href":"#id-10","referenceIndex":42,"text":"43","element":"a"},{"text":"]), to the best of our knowledge, there is no published theoretical work on the Lipschitz stability of ","element":"span"},{"text":"feedback relaxed controls ","element":"span"},{"text":"with respect to parameter uncertainty (even in a discrete-time setting) nor on the Lipschitz stability of the value functions for regularized continuous-time stochastic control problems with general multi-dimensional nonlinear state dynamics. In fact, most existing results on the Lipschitz stability of feedback controls are for LQ control problems with linear state dynamics and quadratic cost functions (see e.g. 
[","element":"span"},{"href":"#id-14","referenceIndex":30,"text":"31","element":"a"},{"text":"] for discrete-time LQ problems in an ergodic setting and [","element":"span"},{"href":"#id-15","referenceIndex":6,"text":"6","element":"a"},{"text":"] for finite-horizon continuous-time LQ problems). The stability analysis of such problems relies heavily on the linearity of optimal feedback controls and the associated Riccati equations, and hence cannot be directly extended to general nonlinear control problems. We refer the reader also to [","element":"span"},{"href":"#id-16","referenceIndex":2,"text":"2","element":"a"},{"text":", ","element":"span"},{"href":"#id-17","referenceIndex":29,"text":"30","element":"a"},{"text":", ","element":"span"},{"href":"#id-18","referenceIndex":3,"text":"3","element":"a"},{"text":", ","element":"span"},{"href":"#id-18","referenceIndex":3,"text":"4","element":"a"},{"text":", ","element":"span"},{"href":"#id-15","referenceIndex":6,"text":"7","element":"a"},{"text":", ","element":"span"},{"href":"#id-19","referenceIndex":8,"text":"8","element":"a"},{"text":", ","element":"span"},{"href":"#id-20","referenceIndex":27,"text":"27","element":"a"},{"text":"] for the continuity of various stochastic optimization problems, including stochastic control problems and optimal stopping problems, in the underlying processes with respect to the (extended) weak topology.","element":"span"}],[{"text":"In this work, we shall close the gap by providing a theoretical justification for recent RL heuristics that including an exploration reward in the optimization objective leads to more robust decision making. 
In particular, we shall demonstrate that the change in value functions of the regularized control problems (in the ","element":"span"},{"style":{"height":15.94},"width":80.96,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/2-0.png","element":"img","alt":" C2,β","inline":true},{"text":"-norm) depends Lipschitz-continuously on the perturbations of the model parameters, including the coefficients of the state dynamics and reward functions in the optimization objective. We shall also prove that the regularized control problem admits a Hölder continuous feedback control (cf. the original control ","element":"span"},{"href":"#id-3","style":{"height":17.6},"width":190.48,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/2-1.png","element":"img","alt":" αu in (1.1","inline":true},{"text":") is merely measurable), which is Lipschitz stable (in the ","element":"span"},{"style":{"height":15.94},"width":54.56,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/2-2.png","element":"img","alt":" Cβ","inline":true},{"text":"-norm) with respect to parameter perturbations; see Theorem ","element":"span"},{"href":"#id-21","text":"4.2","element":"a"},{"text":".","element":"span"}],[{"text":"Moreover, this is the first paper which precisely quantifies the performance of a feedback control pre-computed based on a given model in a new multi-dimensional controlled dynamics with perturbed coefficients. We will prove that the gap between the suboptimal reward function achieved by the pre-computed feedback relaxed control and the optimal reward function of the perturbed relaxed control problem depends Lipschitz-continuously on the magnitude of perturbations in the coefficients (see Theorem ","element":"span"},{"href":"#id-22","text":"4.4","element":"a"},{"text":"). 
We also establish a first-order sensitivity equation for the value function and feedback control of the perturbed relaxed control problem (see Theorem ","element":"span"},{"href":"#id-23","text":"5.2 ","element":"a"},{"text":"and Remark ","element":"span"},{"href":"#id-24","text":"5.1","element":"a"},{"text":"), which enables us to quantify the explicit dependence of the Lipschitz stability of feedback controls on the exploration parameter ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/2-3.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"(see Theorem ","element":"span"},{"href":"#id-25","text":"5.4","element":"a"},{"text":").","element":"span"}],[{"text":"Let us briefly comment on the two main difficulties encountered in the stability analysis of feedback relaxed controls beyond those encountered in the finite-dimensional RL setting (see e.g. [","element":"span"},{"href":"#id-7","referenceIndex":20,"text":"20","element":"a"},{"text":", ","element":"span"},{"href":"#id-26","referenceIndex":12,"text":"12","element":"a"},{"text":", ","element":"span"},{"href":"#id-7","referenceIndex":20,"text":"21","element":"a"},{"text":"]) and the LQ setting (see e.g. [","element":"span"},{"href":"#id-14","referenceIndex":30,"text":"31","element":"a"},{"text":", ","element":"span"},{"href":"#id-15","referenceIndex":6,"text":"6","element":"a"},{"text":"]). As we shall see in (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":"), the feedback relaxed control (in the present continuous setting) is defined as the pointwise maximizer of the associated Hamiltonian, which in general involves not only the value function of the regularized control problem, but also its first and second order derivatives. 
Hence, besides estimating the sup-norm of the value functions as in the finite-dimensional RL setting, we also need to quantify the impact of parameter uncertainty on the (first and second order) derivatives of the value functions, which are solutions to fully nonlinear HJB PDEs. For continuous-time LQ problems, such an analysis can be greatly simplified by taking advantage of the quadratic structure of the value function, which reduces the study of HJB PDEs to that of Riccati ordinary differential equations. Such a simplification is not possible for general nonlinear control problems, which requires us to derive a precise ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate for the derivatives of solutions to the associated fully nonlinear HJB equations.","element":"span"}],[{"text":"Moreover, the Lipschitz stability and the first-order sensitivity analysis of the feedback relaxed controls also require us to establish the regularity of the HJB operator and the arg max-mapping between suitable function spaces for regularized control problems. ","element":"span"},{"text":"As already pointed out in [","element":"span"},{"href":"#id-28","referenceIndex":39,"text":"40","element":"a"},{"text":", ","element":"span"},{"href":"#id-29","referenceIndex":25,"text":"26","element":"a"},{"text":"], the fact that the HJB operator is fully nonlinear (since we allow the diffusion coefficients to be controlled) poses a significant challenge for choosing proper function spaces to simultaneously ensure the differentiability of the fully nonlinear HJB operator and the bounded invertibility of its (Fréchet) derivative, which are essential for deriving the sensitivity equations of the value functions and feedback controls (see Theorem ","element":"span"},{"href":"#id-23","text":"5.2 ","element":"a"},{"text":"and Remark ","element":"span"},{"href":"#id-24","text":"5.1","element":"a"},{"text":"). 
Here, by taking advantage of the exploration reward functions, we demonstrate that the HJB operator and the arg max-mapping for the regularized control problem are sufficiently smooth between suitable Hölder spaces, which together with an elliptic regularity estimate leads us to the desired sensitivity results for the feedback relaxed controls; see Remark ","element":"span"},{"href":"#id-30","text":"4.1 ","element":"a"},{"text":"for more details.","element":"span"}],[{"text":"Finally, we establish that, as the exploration parameter tends to zero, the value function of the relaxed control problem converges monotonically to that of the classical stochastic control problem with a first-order accuracy (see Theorem ","element":"span"},{"href":"#id-31","text":"6.1","element":"a"},{"text":"). The convergence of value functions (in the ","element":"span"},{"style":{"height":15.93},"width":80.96,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/3-0.png","element":"img","alt":" C2,β","inline":true},{"text":"-norm) subsequently enables us to deduce a novel uniform convergence result (on compact sets) for the feedback relaxed control to a pure exploitation strategy of the original control problem. We further prove an exact regularization property for a class of reward functions, which allows us to recover the pure exploitation strategy based on the feedback relaxed control ","element":"span"},{"text":"without ","element":"span"},{"text":"sending the exploration parameter to 0 (see Theorem ","element":"span"},{"href":"#id-32","text":"6.4","element":"a"},{"text":").","element":"span"}],[{"text":"We organize this paper as follows. Section ","element":"span"},{"text":"2 ","element":"span"},{"text":"introduces the stochastic exit control problem, and establishes its connection to HJB equations. 
","element":"span"},{"text":"In Section ","element":"span"},{"text":"3","element":"span"},{"text":", we propose a relaxed control regularization involving general exploration reward functions for the stochastic control problem, and establish the Hölder regularity of the feedback relaxed control strategy. Then, for a fixed positive exploration parameter, we prove the Lipschitz stability of the value function and feedback relaxed control with respect to parameter perturbations in Section ","element":"span"},{"text":"4","element":"span"},{"text":", and derive their first-order sensitivity equations in Section ","element":"span"},{"text":"5","element":"span"},{"text":". We establish the convergence of value functions and relaxed control strategies for vanishing exploration parameters in Section ","element":"span"},{"text":"6","element":"span"},{"text":". Appendix ","element":"span"},{"text":"A ","element":"span"},{"text":"is devoted to the proofs of some technical results.","element":"span"}]]},{"heading":"2 Stochastic exit time problem and HJB equation","paragraphs":[[{"text":"In this section, we introduce the stochastic exit time problem of our interest, state the main assumptions on its coefficients, and recall its connection with HJB equations. We start with some useful notation which is needed frequently throughout this work.","element":"span"}],[{"text":"For any given multi-index ","element":"span"},{"style":{"height":17.6},"width":1263.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/3-1.png","element":"img","alt":" β = (β1, . . . , βn) with βi ∈ N ∪ {0}, i = 1, . . . 
, n, we define |β| =","inline":true},{"style":{"height":34.27},"width":600.96,"height":85.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/3-2.png","element":"img","alt":"∑ni=1 βi and Dβφ = ∂|β|φ∂xβ11 ...∂xβnn .","inline":true,"padRight":true},{"text":"For any given open subset ","element":"span"},{"style":{"height":17.6},"width":709.92,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/3-3.png","element":"img","alt":" O ⊂ Rn, k ∈ N ∪ {0}, θ ∈ (0, 1], and","inline":true}],[{"text":"function ","element":"span"},{"style":{"height":16.4},"width":49.92,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/3-4.png","element":"img","alt":" φ :","inline":true}],[{"style":{"width":"61%"},"width":1128,"height":243,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/3-5.png","element":"img"}],[{"text":"Then we shall denote by ","element":"span"},{"style":{"height":19.53},"width":106.52,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/3-6.png","element":"img","alt":" Ck(O","inline":true},{"text":") the space of ","element":"span"},{"text":"k","element":"span"},{"text":"-times continuously differentiable functions in ","element":"span"},{"text":"O ","element":"span"},{"text":"equipped with the norm ","element":"span"},{"style":{"height":24.83},"width":762.68,"height":62.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/3-7.png","element":"img","alt":" |φ|k;O = ∑km=0[φ]m,0;O, and by Ck,θ(O","inline":true},{"text":") the space consisting of all functions in ","element":"span"},{"style":{"height":19.53},"width":106.52,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-0.png","element":"img","alt":" Ck(O","inline":true},{"text":") satisfying 
[","element":"span"},{"style":{"height":21.17},"width":231.68,"height":52.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-1.png","element":"img","alt":"φ]k,θ;O < ∞","inline":true},{"text":", equipped with the norm ","element":"span"},{"style":{"height":21.17},"width":509.76,"height":52.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-2.png","element":"img","alt":" |φ|k,θ;O = |φ|k;O + [φ]k,θ;O.","inline":true,"padRight":true},{"text":"When ","element":"span"},{"text":"k ","element":"span"},{"text":"= 0, we use ","element":"span"},{"style":{"height":19.53},"width":105.08,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-3.png","element":"img","alt":" Cθ(O","inline":true},{"text":") to denote ","element":"span"},{"style":{"height":23.1},"width":802.2,"height":57.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-4.png","element":"img","alt":" C0,θ(O), and use | · |θ;O to denote | · |0,θ;O","inline":true},{"text":". 
We shall omit ","element":"span"},{"text":"the subscript ","element":"span"},{"text":"O ","element":"span"},{"text":"in the (semi-)norms if no confusion arises.","element":"span"}],[{"text":"Finally, we shall denote by [","element":"span"},{"style":{"height":19.14},"width":248.24,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-5.png","element":"img","alt":"aij] the n×n","inline":true,"padRight":true},{"text":"matrix whose ","element":"span"},{"text":"ij","element":"span"},{"text":"th-entries are given by ","element":"span"},{"style":{"height":19.34},"width":253.32,"height":48.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-6.png","element":"img","alt":" aij, by Sn, Sn0","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.42},"width":50.48,"height":43.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-7.png","element":"img","alt":" Sn>","inline":true},{"text":", respectively, the set of ","element":"span"},{"style":{"height":9.2},"width":102.8,"height":23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-8.png","element":"img","alt":" n × n","inline":true,"padRight":true},{"text":"symmetric, symmetric positive semi-definite and symmetric ","element":"span"},{"text":"positive definite matrices, by ","element":"span"},{"style":{"height":14.8},"width":244.2,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-9.png","element":"img","alt":" X ≥ Y in Sn ","inline":true,"padRight":true},{"text":"the fact that ","element":"span"},{"style":{"height":12},"width":127.12,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-10.png","element":"img","alt":" X − Y","inline":true,"padRight":true},{"text":"is positive semi-definite. 
For any given ","element":"span"},{"style":{"height":12.8},"width":125.6,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-11.png","element":"img","alt":" K ∈ N","inline":true},{"text":", we denote by ∆","element":"span"},{"style":{"height":8.8},"width":30,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-12.png","element":"img","alt":"K","inline":true,"padRight":true},{"text":"the probability simplex in ","element":"span"},{"style":{"height":18.34},"width":157.92,"height":45.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-13.png","element":"img","alt":" RK, i.e.,","inline":true}],[{"style":{"width":"78%"},"width":1455,"height":107,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-14.png","element":"img"}],[{"text":"Now we are ready to introduce the control problem of interest. In order to allow irregular feedback control strategies, we consider the following weak formulation of a control problem, which includes the underlying probability space as part of control strategies (see e.g. [","element":"span"},{"href":"#id-33","referenceIndex":43,"text":"44","element":"a"},{"text":", ","element":"span"},{"href":"#id-0","referenceIndex":19,"text":"19","element":"a"},{"text":"]). See Remark ","element":"span"},{"href":"#id-34","text":"2.2 ","element":"a"},{"text":"for possible extensions to stochastic control problems under strong formulation, for which the underlying probability reference system is fixed.","element":"span"}],[{"style":{"height":18.29},"width":982.52,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-15.png","element":"img","alt":"Definition 2.1. 
A 5-tuple π = (Ω, F, {Ft}t≥0, P, W","inline":true},{"text":") is said to be a reference probability system if (Ω","element":"span"},{"style":{"height":18.29},"width":267.48,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-16.png","element":"img","alt":", F, {Ft}t≥0, P","inline":true},{"text":") is a filtered probability space satisfying the usual condition","element":"span"},{"style":{"height":19.82},"width":389.48,"height":49.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-17.png","element":"img","alt":"1, and W = (Wt)t≥0","inline":true,"padRight":true},{"text":"is an ","element":"span"},{"style":{"height":18.29},"width":353.36,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-18.png","element":"img","alt":" {Ft}t≥0-adapted n","inline":true},{"text":"-dimensional Brownian motion. We denote by Π","element":"span"},{"style":{"height":8.8},"width":41.32,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-19.png","element":"img","alt":"ref","inline":true,"padRight":true},{"text":"the set of all reference probability systems.","element":"span"}],[{"text":"Now let ","element":"span"},{"text":"O ","element":"span"},{"text":"be a given bounded domain in ","element":"span"},{"style":{"height":12},"width":52.68,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-20.png","element":"img","alt":" Rn","inline":true},{"text":", i.e., a bounded connected open subset of ","element":"span"},{"style":{"height":12},"width":65.76,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-21.png","element":"img","alt":" Rn.","inline":true,"padRight":true},{"text":"The aim of the controller is to maximize the expected discounted reward up to the first exit time of a controlled dynamics from the domain ","element":"span"},{"text":"O","element":"span"},{"text":". 
More precisely, let ","element":"span"},{"style":{"height":18.29},"width":611.56,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-22.png","element":"img","alt":" π = (Ω, F, {Ft}t≥0, P, W) ∈ Πref","inline":true,"padRight":true},{"text":"be a given reference probability system, and ","element":"span"},{"style":{"height":15.49},"width":55.04,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-23.png","element":"img","alt":" Aπ","inline":true,"padRight":true},{"text":"be the set of ","element":"span"},{"style":{"height":18.29},"width":144.2,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-24.png","element":"img","alt":" {Ft}t≥0","inline":true},{"text":"-progressively measurable processes ","element":"span"},{"style":{"height":8.4},"width":28,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-25.png","element":"img","alt":" α","inline":true,"padRight":true},{"text":"taking values in a finite set ","element":"span"},{"text":"A","element":"span"},{"text":". 
For any given initial state ","element":"span"},{"style":{"height":12.8},"width":130.92,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-26.png","element":"img","alt":" x ∈ Rn","inline":true},{"text":", and control ","element":"span"},{"style":{"height":16},"width":150.72,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-27.png","element":"img","alt":" α ∈ Aπ,","inline":true,"padRight":true},{"text":"we consider the controlled dynamics ","element":"span"},{"style":{"height":12},"width":89.56,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-28.png","element":"img","alt":" Xα,x ","inline":true,"padRight":true},{"text":"satisfying the following SDE: ","element":"span"},{"style":{"height":19.04},"width":259.68,"height":47.6,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-29.png","element":"img","alt":" Xα,x0 = x and","inline":true}],[{"id":"id-36","style":{"width":"77%"},"width":1439,"height":49,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-30.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":13.54},"width":837.48,"height":33.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-31.png","element":"img","alt":" b : Rn × A → Rn and σ : Rn × A → Rn×n ","inline":true,"padRight":true},{"text":"are given Lipschitz continuous functions (see (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") for precise conditions), and denote by ","element":"span"},{"style":{"height":18.85},"width":605.2,"height":47.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-32.png","element":"img","alt":" τ α,x := inf{t ≥ 0 | Xα,xt ̸∈ O}","inline":true,"padRight":true},{"text":"the first exit time of the dynamics 
","element":"span"},{"style":{"height":12},"width":89.56,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-33.png","element":"img","alt":" Xα,x ","inline":true,"padRight":true},{"text":"from the domain ","element":"span"},{"style":{"height":21.58},"width":489,"height":53.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-34.png","element":"img","alt":" O,2 and by (Γα,xt )t∈[0,τ α,x]","inline":true,"padRight":true},{"text":"the controlled discount factor: Γ","element":"span"},{"style":{"height":31.6},"width":926.68,"height":79,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-35.png","element":"img","alt":"α,xt := exp�−� t0 c(Xα,xs , αs) ds�for all t ∈ [0, τ α,x","inline":true},{"text":"]. Then, for each given ","element":"span"},{"style":{"height":13.2},"width":113.24,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-36.png","element":"img","alt":" x ∈O","inline":true},{"text":", we shall consider the following value function:","element":"span"}],[{"id":"id-42","style":{"width":"83%"},"width":1539,"height":113,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/4-37.png","element":"img"}],[{"text":"where the functions ","element":"span"},{"text":"f ","element":"span"},{"text":"and ","element":"span"},{"text":"g ","element":"span"},{"text":"denote, respectively, the running reward and the exit reward. Throughout this work, we shall perform the analysis under the following assumptions on the coefficients:","element":"span"}],[{"id":"id-35","style":{"height":17.6},"width":770.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-0.png","element":"img","alt":"H.1. Let n, K ∈ N, K = {1, . . . 
, K}, A","inline":true,"padRight":true},{"text":"is a set of cardinality ","element":"span"},{"style":{"height":17.6},"width":633.48,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-1.png","element":"img","alt":" K, i.e., A = {ak}k∈K, and O be","inline":true,"padRight":true},{"text":"a bounded domain in ","element":"span"},{"style":{"height":12},"width":52.68,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-2.png","element":"img","alt":" Rn","inline":true},{"text":". There exist constants ","element":"span"},{"style":{"height":17.6},"width":366.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-3.png","element":"img","alt":" ν, Λ > 0, θ ∈ (0, 1]","inline":true,"padRight":true},{"text":"such that the boundary ","element":"span"},{"style":{"height":13.2},"width":60.44,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-4.png","element":"img","alt":" ∂O","inline":true,"padRight":true},{"text":"of ","element":"span"},{"text":"O ","element":"span"},{"text":"is of class ","element":"span"},{"style":{"height":19.53},"width":347.23,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-5.png","element":"img","alt":" C2,θ, g ∈ C2,θ(O)","inline":true},{"text":", and the functions ","element":"span"},{"style":{"height":16.33},"width":793.48,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-6.png","element":"img","alt":" b : Rn × A → Rn, σ : Rn × A → Rn×n,","inline":true}],[{"id":"id-46","style":{"width":"99%"},"width":1843,"height":256,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-7.png","element":"img"}],[{"id":"id-83","text":"Remark ","element":"span"},{"text":"2.1","element":"span"},{"text":". 
","element":"span"},{"text":"The Lipschitz continuity of ","element":"span"},{"style":{"height":12.8},"width":290.28,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-8.png","element":"img","alt":" b and σ on Rn ","inline":true,"padRight":true},{"text":"ensures that, for any given ","element":"span"},{"style":{"height":15.2},"width":179.52,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-9.png","element":"img","alt":" π ∈ Πref,","inline":true},{"style":{"height":15.49},"width":394.92,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-10.png","element":"img","alt":"α ∈ Aπ and x ∈ Rn","inline":true},{"text":", the controlled SDE (","element":"span"},{"href":"#id-36","text":"2.2","element":"a"},{"text":") admits a unique strong solution. Moreover, the non-degeneracy of ","element":"span"},{"style":{"height":12.4},"width":152.04,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-11.png","element":"img","alt":" σ on Rn ","inline":true,"padRight":true},{"text":"ensures that SDEs with non-Lipschitz feedback controls admit a weak solution (cf. 
Theorems ","element":"span"},{"href":"#id-1","text":"2.2 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-2","text":"3.5","element":"a"},{"text":"); see also Lemma ","element":"span"},{"href":"#id-37","text":"3.1","element":"a"},{"text":".","element":"span"}],[{"text":"As shown in [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Lemma 6.38], the fact that ","element":"span"},{"style":{"height":13.2},"width":60.44,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-12.png","element":"img","alt":" ∂O","inline":true,"padRight":true},{"text":"is of class ","element":"span"},{"style":{"height":15.54},"width":76.96,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-13.png","element":"img","alt":" C2,θ ","inline":true,"padRight":true},{"text":"ensures that a function in ","element":"span"},{"style":{"height":19.53},"width":131,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-14.png","element":"img","alt":"C2,θ(O","inline":true},{"text":") has boundary values in ","element":"span"},{"style":{"height":19.54},"width":156.92,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-15.png","element":"img","alt":" C2,θ(∂O","inline":true},{"text":"), and conversely, any function ","element":"span"},{"style":{"height":19.54},"width":421.28,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-16.png","element":"img","alt":" φ ∈ C2,θ(∂O) can be","inline":true,"padRight":true},{"text":"extended to a function in ","element":"span"},{"style":{"height":19.54},"width":131,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-17.png","element":"img","alt":" C2,θ(O","inline":true},{"text":"). 
Hence, one can introduce a boundary norm ","element":"span"},{"style":{"height":18.48},"width":305.12,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-18.png","element":"img","alt":" | · |2,θ;∂O for the","inline":true,"padRight":true},{"text":"space ","element":"span"},{"style":{"height":19.53},"width":156.92,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-19.png","element":"img","alt":" C2,θ(∂O","inline":true},{"text":"), such that for any given ","element":"span"},{"style":{"height":23.1},"width":1095.08,"height":57.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-20.png","element":"img","alt":" φ ∈ C2,θ(∂O), |φ|2,θ,∂O = infΦ |Φ|2,θ;O, where Φ ∈ C2,θ(O)","inline":true,"padRight":true},{"text":"is a global extension of ","element":"span"},{"style":{"height":16.4},"width":126.2,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-21.png","element":"img","alt":" φ toO","inline":true},{"text":". The space ","element":"span"},{"style":{"height":19.54},"width":156.44,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-22.png","element":"img","alt":" C2,θ(∂O","inline":true},{"text":") equipped with the norm ","element":"span"},{"style":{"height":18.48},"width":148.92,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-23.png","element":"img","alt":" | · |2,θ;∂O","inline":true,"padRight":true},{"text":"is a Banach space (see e.g. the discussions on page 94 in [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":"]).","element":"span"}],[{"text":"To simplify the presentation, we study exit time control problems with Hölder continuous coefficients in this work and analyze classical solutions of associated elliptic HJB equations. 
Similar results, including the characterization and Lipschitz stability of feedback relaxed controls in Sections ","element":"span"},{"text":"3 ","element":"span"},{"text":"and ","element":"span"},{"text":"4","element":"span"},{"text":", can be obtained for finite horizon control problems with measurable coefficients, whose corresponding parabolic HJB equations admit weak solutions in suitable Sobolev spaces (see [","element":"span"},{"href":"#id-39","referenceIndex":41,"text":"41","element":"a"},{"text":"] for the well-posedness of weak solutions to parabolic HJB equations and [","element":"span"},{"href":"#id-17","referenceIndex":29,"text":"29","element":"a"},{"text":", Theorem 1 on p. 122] for a generalized Itô’s formula). The first-order sensitivity analysis in Section ","element":"span"},{"text":"5 ","element":"span"},{"text":"in general can only be performed for classical solutions in Hölder spaces; see Remark ","element":"span"},{"href":"#id-30","text":"4.1 ","element":"a"},{"text":"for details.","element":"span"}],[{"text":"The rest of this section is devoted to the connection between the stochastic exit time problem and a Hamilton-Jacobi-Bellman (HJB) boundary value problem, which plays an essential role in the construction of feedback control strategies. 
More precisely, we now consider the following HJB equation with inhomogeneous Dirichlet boundary data:","element":"span"}],[{"id":"id-41","style":{"width":"76%"},"width":1407,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-24.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":17.82},"width":279.68,"height":44.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-25.png","element":"img","alt":" H0 : RK → R","inline":true,"padRight":true},{"text":"is the pointwise maximum function, i.e., ","element":"span"},{"style":{"height":17.6},"width":617.2,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-26.png","element":"img","alt":" H0(x) = maxk∈K xk for all x =","inline":true,"padRight":true},{"text":"(","element":"span"},{"style":{"height":19.54},"width":602.64,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-27.png","element":"img","alt":"x1, . . . 
, xK)T ∈ RK, f :O → RK ","inline":true,"padRight":true},{"text":"is the function satisfying ","element":"span"},{"style":{"height":17.6},"width":732.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-28.png","element":"img","alt":" f(x) = (f(x, ak))k∈K for all x ∈O, and","inline":true},{"style":{"height":17.6},"width":240.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-29.png","element":"img","alt":"L = (Lk)k∈K","inline":true,"padRight":true},{"text":"is a family of elliptic operators satisfying for all ","element":"span"},{"style":{"height":19.14},"width":575.72,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-30.png","element":"img","alt":" k ∈ K, φ ∈ C2(O), x ∈ O that","inline":true}],[{"id":"id-52","style":{"width":"86%"},"width":1605,"height":58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-31.png","element":"img"}],[{"text":"Above and hereafter, when there is no ambiguity, we shall denote by ","element":"span"},{"style":{"height":17.6},"width":75.36,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-32.png","element":"img","alt":" φk(·","inline":true},{"text":") a generic function ","element":"span"},{"style":{"height":17.6},"width":413.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-33.png","element":"img","alt":"φ(·, ak) for all k ∈ K","inline":true},{"text":", and adopt the summation convention as in [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", ","element":"span"},{"href":"#id-40","referenceIndex":16,"text":"16","element":"a"},{"text":"], i.e., repeated equal dummy indices indicate summation from 1 to ","element":"span"},{"text":"n","element":"span"},{"text":".","element":"span"}],[{"text":"Throughout this paper, we shall focus on the classical solution 
","element":"span"},{"href":"#id-41","style":{"height":19.14},"width":575.16,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-34.png","element":"img","alt":" u ∈ C(O) ∩ C2(O) to (2.6) es-","inline":true,"padRight":true},{"text":"tablished in the following theorem, which subsequently enables us to characterize optimal feedback controls for (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":").","element":"span"}],[{"id":"id-61","text":"Theorem 2.1. ","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") holds, and let ","element":"span"},{"style":{"height":24.82},"width":387.48,"height":62.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-35.png","element":"img","alt":" M = supi,j,k |σijk |0;O","inline":true},{"text":". Then the Dirichlet problem ","element":"span"},{"text":"(","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") ","element":"span"},{"text":"admits a unique solution ","element":"span"},{"style":{"height":19.13},"width":400.84,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-36.png","element":"img","alt":" u ∈ C(O) ∩ C2(O).","inline":true,"padRight":true},{"text":"Moreover, there exists a constant ","element":"span"},{"style":{"height":16.4},"width":102.16,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/5-37.png","element":"img","alt":" β0 =","inline":true}],[{"id":"id-43","style":{"width":"99%"},"width":1843,"height":193,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-0.png","element":"img"}],[{"text":"Proof. 
","element":"span"},{"text":"We shall only prove the uniqueness of solutions in ","element":"span"},{"style":{"height":19.14},"width":261.08,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-1.png","element":"img","alt":" C(O) ∩ C2(O","inline":true},{"text":"), since the existence of classical solutions in ","element":"span"},{"style":{"height":20.33},"width":259.16,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-2.png","element":"img","alt":" C2,min(β0,θ)(O","inline":true},{"text":") will be established constructively based on the relaxed control approximation in Theorem ","element":"span"},{"href":"#id-31","text":"6.1 ","element":"a"},{"text":"(see also [","element":"span"},{"href":"#id-40","referenceIndex":16,"text":"16","element":"a"},{"text":", Theorem 7.5] for a proof of existence based on the method of continuity), and the existence of a Borel measurable function satisfying (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") follows directly from the measurable selection theorem (see [","element":"span"},{"text":"1","element":"span"},{"text":", Theorem 18.19]).","element":"span"}],[{"text":"Let ","element":"span"},{"style":{"height":19.14},"width":418.04,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-3.png","element":"img","alt":" u1, u2 ∈ C(O) ∩ C2(O","inline":true},{"text":") be solutions to (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":"). 
Then for all ","element":"span"},{"style":{"height":13.2},"width":113.24,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-4.png","element":"img","alt":" x ∈ O","inline":true},{"text":", we can deduce from the fundamental theorem of calculus that","element":"span"}],[{"style":{"width":"95%"},"width":1756,"height":110,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-5.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":17.6},"width":413.52,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-6.png","element":"img","alt":" h : [0, T] × O → ∆K","inline":true,"padRight":true},{"text":"is a measurable function, and ","element":"span"},{"text":"˜","element":"span"},{"text":"L ","element":"span"},{"text":"denotes the elliptic operator satisfying for all ","element":"span"},{"style":{"height":23.57},"width":1572,"height":58.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-7.png","element":"img","alt":" φ ∈ C2(O) and x ∈ O that ˜Lφ(x) = ηT (x)Lφ(x) with η(x) = ∫ 10 h(s, x) ds. 
In","inline":true,"padRight":true},{"text":"particular, the function ","element":"span"},{"text":"h ","element":"span"},{"text":"can be chosen as the weak limit of the functions ([0","element":"span"},{"style":{"height":17.6},"width":361.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-8.png","element":"img","alt":", T] × O ∋ (s, x) ↦","inline":true,"padRight":true},{"text":"(","element":"span"},{"style":{"height":19.54},"width":1831.8,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-9.png","element":"img","alt":"∇Hε0)(Lu2(x) + f(x) + sL(u1 − u2)(x)) ∈ ∆K)ε>0 in L2([0, T] × O), where (Hε0)ε>0 is a se-","inline":true,"padRight":true},{"text":"quence of smooth approximations of ","element":"span"},{"style":{"height":14.69},"width":53.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-10.png","element":"img","alt":" H0","inline":true,"padRight":true},{"text":"obtained by using the standard mollification argument. Then we can easily show that ","element":"span"},{"style":{"height":20.42},"width":559.92,"height":51.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-11.png","element":"img","alt":" η(x) ∈ ∆K for all x ∈ O, ˜L","inline":true,"padRight":true},{"text":"is a uniformly elliptic operator, and ","element":"span"},{"style":{"height":22.05},"width":648.44,"height":55.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-12.png","element":"img","alt":"∑Kk=1 ηk(x)ck(x) ≥ 0 for all x ∈ O","inline":true},{"text":". Hence the classical maximum principle (see e.g. 
[","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem ","element":"span"},{"text":"3.7]) and ","element":"span"},{"style":{"height":15.49},"width":283.16,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-13.png","element":"img","alt":" u1 = u2 on ∂O","inline":true,"padRight":true},{"text":"imply that ","element":"span"},{"style":{"height":15.09},"width":257.72,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-14.png","element":"img","alt":" u1 = u2 onO","inline":true},{"text":", which shows that the Dirichlet problem (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") admits at most one solution in ","element":"span"},{"style":{"height":19.13},"width":287.52,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-15.png","element":"img","alt":" C(O) ∩ C2(O).","inline":true}],[{"text":"We now present a verification result, i.e., Theorem ","element":"span"},{"href":"#id-1","text":"2.2","element":"a"},{"text":", which shows that the classical solution to the HJB equation (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") is the value function (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":"), and the Borel measurable function ","element":"span"},{"style":{"height":12.33},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-16.png","element":"img","alt":" αu","inline":true,"padRight":true},{"text":"defined as in (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") is a feedback control of (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":"). 
The proof is postponed to Appendix ","element":"span"},{"text":"A","element":"span"},{"text":"; it essentially follows from Itô’s formula and the existence result for weak solutions to SDEs with non-degenerate diffusion coefficients (see [","element":"span"},{"href":"#id-44","referenceIndex":32,"text":"32","element":"a"},{"text":", Theorem 1]).","element":"span"}],[{"text":"We first recall the definition of optimal feedback control (see e.g. [","element":"span"},{"href":"#id-33","referenceIndex":43,"text":"44","element":"a"},{"text":", Definition 6.1]).","element":"span"}],[{"id":"id-63","text":"Definition 2.2. ","element":"span"},{"text":"A Borel measurable function ","element":"span"},{"style":{"height":12.8},"width":203.6,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-17.png","element":"img","alt":" h :O → A","inline":true,"padRight":true},{"text":"is said to be a feedback control of (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") if for all ","element":"span"},{"style":{"height":13.2},"width":115.16,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-18.png","element":"img","alt":" x ∈O","inline":true},{"text":", there exists ","element":"span"},{"style":{"height":18.29},"width":1040.36,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-19.png","element":"img","alt":" πx = (Ωx, Fx, {Fxt }t≥0, Px, W) ∈ Πref, and an {Fxt }t≥0","inline":true},{"text":"-progressively ","element":"span"},{"text":"measurable continuous process (","element":"span"},{"style":{"height":18.29},"width":133.16,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-20.png","element":"img","alt":"Xxt )t≥0","inline":true},{"text":", such that ","element":"span"},{"style":{"height":17.2},"width":542.12,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-21.png","element":"img","alt":" Xx0 = x, 
and for Px-a.s. that","inline":true}],[{"id":"id-98","style":{"width":"99%"},"width":1846,"height":179,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-22.png","element":"img"}],[{"text":"O}","element":"span"},{"text":". A feedback control ","element":"span"},{"text":"h ","element":"span"},{"text":"is said to be optimal if we have for all ","element":"span"},{"style":{"height":10.4},"width":65.96,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-23.png","element":"img","alt":" x ∈","inline":true}],[{"id":"id-99","style":{"width":"99%"},"width":1842,"height":221,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-24.png","element":"img"}],[{"id":"id-1","text":"Theorem 2.2. ","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") holds. Let ","element":"span"},{"style":{"height":12.8},"width":214.88,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-25.png","element":"img","alt":" v :O → R","inline":true,"padRight":true},{"text":"be the value function defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":")","element":"span"},{"text":", ","element":"span"},{"style":{"height":19.13},"width":368.84,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-26.png","element":"img","alt":"u ∈ C(O) ∩ C2(O)","inline":true,"padRight":true},{"text":"be the solution to the Dirichlet problem ","element":"span"},{"text":"(","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":")","element":"span"},{"text":", and ","element":"span"},{"style":{"height":12.8},"width":249.2,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-27.png","element":"img","alt":" αu :O → A","inline":true,"padRight":true},{"text":"be a Borel measurable function satisfying 
","element":"span"},{"text":"(","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":")","element":"span"},{"text":". Then we have ","element":"span"},{"style":{"height":17.6},"width":818.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/6-28.png","element":"img","alt":" u(x) = v(x) for all x ∈O, and αu is an","inline":true,"padRight":true},{"text":"optimal feedback control of ","element":"span"},{"text":"(","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":")","element":"span"},{"text":".","element":"span"}],[{"id":"id-34","text":"Remark ","element":"span"},{"text":"2.2","element":"span"},{"text":". ","element":"span"},{"text":"As shown in Theorem ","element":"span"},{"href":"#id-1","text":"2.2","element":"a"},{"text":", by considering a weak formulation of the stochastic control problem (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") with reference probability systems varying in Π","element":"span"},{"style":{"height":8.8},"width":41.32,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-0.png","element":"img","alt":"ref","inline":true},{"text":", we can rigorously demonstrate that a measurable function ","element":"span"},{"style":{"height":12.33},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-1.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"satisfying (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") is indeed an optimal feedback control strategy.","element":"span"}],[{"text":"One can also consider stochastic exit time problems under a strong formulation, for which we first fix a reference probability system ","element":"span"},{"style":{"height":18.29},"width":467,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-2.png","element":"img","alt":" π = (Ω, F, {Ft}t≥0, P, 
W","inline":true},{"text":"), and the agent only maximizes the reward functional over all admissible control processes in ","element":"span"},{"style":{"height":15.49},"width":55.04,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-3.png","element":"img","alt":" Aπ","inline":true},{"text":". It has been shown in [","element":"span"},{"href":"#id-45","referenceIndex":14,"text":"14","element":"a"},{"text":", Theorem 2.1] that, if we assume (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and ","element":"span"},{"href":"#id-41","style":{"height":19.22},"width":538,"height":48.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-4.png","element":"img","alt":" c > 0 on ¯O × A, then (2.6","inline":true},{"text":") satisfies the strong comparison principle, i.e., a comparison result for semicontinuous viscosity solutions. In particular, (H4) in [","element":"span"},{"href":"#id-45","referenceIndex":14,"text":"14","element":"a"},{"text":"] is satisfied since ","element":"span"},{"style":{"height":15.94},"width":191.68,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-5.png","element":"img","alt":" ∂O ∈ C2,θ ","inline":true,"padRight":true},{"text":"enjoys the exterior ball condition, and (H5) in [","element":"span"},{"href":"#id-45","referenceIndex":14,"text":"14","element":"a"},{"text":"] is satisfied with Γ","element":"span"},{"style":{"height":15.49},"width":169.4,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-6.png","element":"img","alt":"out = ∂O","inline":true,"padRight":true},{"text":"due to the uniform ellipticity condition (","element":"span"},{"href":"#id-46","text":"2.4","element":"a"},{"text":"). 
The strong comparison principle further enables us to show that the value function of the stochastic control problem (under the strong formulation) is the unique continuous viscosity solution to (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":"); see [","element":"span"},{"href":"#id-47","referenceIndex":4,"text":"5","element":"a"},{"text":", Theorem 3.1]. Since the classical solution ","element":"span"},{"text":"u ","element":"span"},{"text":"is a viscosity solution of (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":"), we see it is the value function of the stochastic control problem (under the strong formulation), and the strategy ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-7.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"defined in (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") will lead to the optimal reward. Hence, we can still view the function ","element":"span"},{"style":{"height":12.33},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-8.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"as an optimal feedback control.","element":"span"}],[{"text":"We reiterate that, since arg max is a set-valued mapping, the feedback control strategy (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") is in general non-unique, discontinuous, and sensitive to perturbations of the coefficients. 
For instance, let ","element":"span"},{"text":"K ","element":"span"},{"text":"= 2, and consider the set ","element":"span"},{"style":{"height":17.6},"width":814.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-9.png","element":"img","alt":" G = {x ∈ O | (L1−L2)u(x)+(f1−f2)(x) =","inline":true,"padRight":true},{"text":"0","element":"span"},{"text":"} ","element":"span"},{"text":"at whose boundary the optimal control ","element":"span"},{"href":"#id-43","style":{"height":17.6},"width":192.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-10.png","element":"img","alt":" αu in (2.8","inline":true},{"text":") could have a jump discontinuity. Except for the trivial case where ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-11.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"is a constant on ","element":"span"},{"text":"O","element":"span"},{"text":", one can easily deduce from the connectedness of ","element":"span"},{"text":"O","element":"span"},{"text":", the fact that ","element":"span"},{"style":{"height":19.14},"width":191.48,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-12.png","element":"img","alt":" u ∈ C2(O","inline":true},{"text":"), and the continuity of the coefficients that the set ","element":"span"},{"text":"G ","element":"span"},{"text":"is non-empty. 
Since the boundary of the level set ","element":"span"},{"text":"G ","element":"span"},{"text":"can have poor regularity, we see the feedback control ","element":"span"},{"style":{"height":12.34},"width":100.32,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-13.png","element":"img","alt":" αu in","inline":true,"padRight":true},{"text":"general is merely Borel measurable, which makes the optimal control substantially harder to follow in practice. Moreover, the discontinuity of ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-14.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"also implies that a small perturbation of the coefficients could lead to a significant change of ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-15.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"in the sup-norm, especially near the boundary of the set ","element":"span"},{"text":"G","element":"span"},{"text":". It is well-known (see e.g. 
[","element":"span"},{"href":"#id-48","referenceIndex":9,"text":"9","element":"a"},{"text":", Section 6.4.2] and [","element":"span"},{"href":"#id-49","referenceIndex":23,"text":"24","element":"a"},{"text":", Figure 4]) that such an instability of feedback controls would result in a numerical instability of the learning process, i.e., the approximate policies generated by an iterative learning algorithm may change subsequently from one iteration to the next, and eventually oscillate among several far-from-optimal policies.","element":"span"}]]},{"heading":"3 Relaxation of stochastic exit time problem","paragraphs":[[{"text":"In this section, we propose a relaxation of the stochastic exit time problem (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":"), which extends the ideas used in [","element":"span"},{"href":"#id-9","referenceIndex":41,"text":"42","element":"a"},{"text":"] to control problems with multi-dimensional controlled dynamics and general exploration reward functions. As we shall see shortly, the relaxed control problem has a Hölder continuous feedback control strategy, and enjoys better stability with respect to perturbation of the coefficients.","element":"span"}],[{"text":"The following technical lemma is essential for the formulation of relaxed control problems with multi-dimensional dynamics; its proof is included in Appendix ","element":"span"},{"text":"A","element":"span"},{"text":".","element":"span"}],[{"id":"id-37","text":"Lemma 3.1. ","element":"span"},{"text":"Suppose ","element":"span"},{"text":"(","element":"span"},{"text":"H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") ","element":"span"},{"text":"holds. 
Then there exist unique functions ","element":"span"},{"text":"˜","element":"span"},{"style":{"height":15.49},"width":469,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-16.png","element":"img","alt":"b : Rn × ∆K → Rn and","inline":true,"padRight":true},{"text":"˜","element":"span"},{"style":{"height":18.02},"width":357.68,"height":45.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-17.png","element":"img","alt":"σ : Rn × ∆K → Sn> ","inline":true,"padRight":true},{"text":"such that it holds for all ","element":"span"},{"style":{"height":16},"width":400.52,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-18.png","element":"img","alt":" x ∈ Rn, λ ∈ ∆K that","inline":true}],[{"style":{"width":"70%"},"width":1301,"height":131,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-19.png","element":"img"}],[{"text":"Moreover, it holds for all ","element":"span"},{"style":{"height":23.79},"width":1431.28,"height":59.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-20.png","element":"img","alt":" x ∈ Rn, λ ∈ ∆K that ˜σ(x, λ) ≥ √νIn and ∑i,j |˜σij(·, λ)|0,1+∑i |˜bi(·, λ)|0,1 <","inline":true},{"style":{"height":8},"width":56.68,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/7-21.png","element":"img","alt":"∞.","inline":true}],[{"text":"We now proceed to introduce the relaxation of the exit time problem (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":"). 
Roughly speaking, instead of seeking the optimal feedback action, which maps the current state to ","element":"span"},{"text":"a specific action ","element":"span"},{"text":"in the space ","element":"span"},{"text":"A","element":"span"},{"text":", we seek the optimal feedback control distribution, which is a deterministic mapping from the current state to ","element":"span"},{"text":"a probability measure ","element":"span"},{"text":"over the space ","element":"span"},{"style":{"height":17.6},"width":418.64,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-0.png","element":"img","alt":" A, i.e., λ∗ : O → P(A","inline":true},{"text":"). Once such a mapping is determined, at each given state, the agent will execute the control by sampling a control action based on the distribution ","element":"span"},{"style":{"height":17.6},"width":85.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-1.png","element":"img","alt":" λ∗(x","inline":true},{"text":"). We refer the reader to [","element":"span"},{"href":"#id-9","referenceIndex":41,"text":"42","element":"a"},{"text":"] for a more detailed derivation of the following regularized control problem (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") in a one-dimensional setting. 
Note that the fact that ","element":"span"},{"text":"A ","element":"span"},{"text":"has cardinality ","element":"span"},{"style":{"height":12.8},"width":142.4,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-2.png","element":"img","alt":" K < ∞","inline":true,"padRight":true},{"text":"enables us to identify the space of probability measures over ","element":"span"},{"text":"A ","element":"span"},{"text":"as the probability simplex ∆","element":"span"},{"style":{"height":8.8},"width":45.12,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-3.png","element":"img","alt":"K.","inline":true}],[{"text":"More precisely, let ","element":"span"},{"style":{"height":18.29},"width":625.48,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-4.png","element":"img","alt":" π = (Ω, F, {Ft}t≥0, P, W) ∈ Πref","inline":true,"padRight":true},{"text":"be a given reference probability system, and ","element":"span"},{"style":{"height":15.09},"width":72.32,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-5.png","element":"img","alt":" Mπ","inline":true,"padRight":true},{"text":"be the set of ","element":"span"},{"style":{"height":18.29},"width":144.2,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-6.png","element":"img","alt":" {Ft}t≥0","inline":true},{"text":"-progressively measurable processes ","element":"span"},{"style":{"height":12.8},"width":26,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-7.png","element":"img","alt":" λ","inline":true,"padRight":true},{"text":"taking values in the set ∆","element":"span"},{"style":{"height":8.8},"width":44.64,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-8.png","element":"img","alt":"K.","inline":true,"padRight":true},{"text":"Suppose that 
(H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") holds, for any given initial state ","element":"span"},{"style":{"height":12.8},"width":130.92,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-9.png","element":"img","alt":" x ∈ Rn","inline":true},{"text":", and control ","element":"span"},{"style":{"height":15.09},"width":151.04,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-10.png","element":"img","alt":" λ ∈ Mπ","inline":true},{"text":", we consider the controlled diffusion process ","element":"span"},{"style":{"height":15.14},"width":87.64,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-11.png","element":"img","alt":" Xλ,x ","inline":true,"padRight":true},{"text":"satisfying the following SDE: ","element":"span"},{"style":{"height":22.24},"width":257.28,"height":55.6,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-12.png","element":"img","alt":" Xλ,x0 = x and","inline":true}],[{"id":"id-53","style":{"width":"77%"},"width":1433,"height":56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-13.png","element":"img"}],[{"text":"where ","element":"span"},{"text":"˜","element":"span"},{"style":{"height":17.82},"width":762.32,"height":44.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-14.png","element":"img","alt":"b : Rn×∆K → Rn and ˜σ : Rn×∆K → Sn> ","inline":true,"padRight":true},{"text":"are the functions defined in Lemma ","element":"span"},{"href":"#id-37","text":"3.1","element":"a"},{"text":". 
We further ","element":"span"},{"text":"introduce the first exit time of ","element":"span"},{"style":{"height":15.14},"width":87.64,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-15.png","element":"img","alt":" Xλ,x ","inline":true,"padRight":true},{"text":"from the domain ","element":"span"},{"style":{"height":21.86},"width":816,"height":54.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-16.png","element":"img","alt":" O defined as τ λ,x := inf{t ≥ 0 | Xλ,xt ̸∈ O},","inline":true,"padRight":true},{"text":"and the controlled discount factor Γ","element":"span"},{"style":{"height":31.6},"width":1118.4,"height":79,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-17.png","element":"img","alt":"λ,xt := exp(−∫ t0 ∑Kk=1 c(Xλ,xs , ak)λks ds) for all t ∈ [0, τ λ,x].","inline":true}],[{"text":"Now let ","element":"span"},{"style":{"height":19.54},"width":398.32,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-18.png","element":"img","alt":" ρ : RK → R ∪ {∞}","inline":true,"padRight":true},{"text":"be a given exploration reward function satisfying ","element":"span"},{"style":{"height":13.6},"width":205.92,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-19.png","element":"img","alt":" ρ < ∞ on","inline":true,"padRight":true},{"text":"∆","element":"span"},{"style":{"height":8.8},"width":30,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-20.png","element":"img","alt":"K","inline":true,"padRight":true},{"text":"(precise conditions will be specified in (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":")). 
For any given relaxation parameter ","element":"span"},{"style":{"height":14.8},"width":180.32,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-21.png","element":"img","alt":" ε > 0, we","inline":true,"padRight":true},{"text":"consider the following value function: for each ","element":"span"},{"style":{"height":15.6},"width":126.24,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-22.png","element":"img","alt":" x ∈O,","inline":true}],[{"id":"id-51","style":{"width":"96%"},"width":1780,"height":132,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-23.png","element":"img"}],[{"text":"Note that the exploration reward function ","element":"span"},{"style":{"height":12},"width":23,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-24.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"plays a crucial role in the above relaxed control regularization. If we set the exploration reward function ","element":"span"},{"style":{"height":12.4},"width":69.04,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-25.png","element":"img","alt":" ρ ≡","inline":true,"padRight":true},{"text":"0 or the relaxation parameter ","element":"span"},{"style":{"height":14.8},"width":112.32,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-26.png","element":"img","alt":" ε = 0,","inline":true,"padRight":true},{"text":"then one can show that Dirac measures supported on the optimal strategies of the original control problem (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") (see ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-27.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"defined as in 
(","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":")) are optimal control distributions of the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":"), and the value function ","element":"span"},{"text":"v ","element":"span"},{"text":"in (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") will be equal to the value function ","element":"span"},{"href":"#id-51","style":{"height":17.6},"width":202.27,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-28.png","element":"img","alt":" vε in (3.2)","inline":true,"padRight":true},{"text":"(see Theorems ","element":"span"},{"href":"#id-31","text":"6.1 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-32","text":"6.4","element":"a"},{"text":"). Hence, to achieve the stability of the optimal control strategy for the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":"), we shall impose the following condition on the reward function ","element":"span"},{"style":{"height":12},"width":34.56,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-29.png","element":"img","alt":" ρ:","inline":true}],[{"id":"id-50","text":"H.2. 
","element":"span"},{"text":"There exists a convex function ","element":"span"},{"style":{"height":19.35},"width":247.4,"height":48.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-30.png","element":"img","alt":" H ∈ C2(RK)","inline":true,"padRight":true},{"text":"and a constant ","element":"span"},{"style":{"height":14.29},"width":120.88,"height":35.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-31.png","element":"img","alt":" c0 > 0","inline":true},{"text":", depending on ","element":"span"},{"text":"K","element":"span"},{"text":", such that for all ","element":"span"},{"style":{"height":21.12},"width":1633.48,"height":52.8,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-32.png","element":"img","alt":" x, y ∈ RK, we have H(x)−c0 ≤ maxk∈K xk ≤ H(x) and ρ(y) = supz∈RK(zT y−H(z)).","inline":true}],[{"text":"We remark that (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") is satisfied by most commonly used reward functions, including Shannon’s differential entropy proposed in [","element":"span"},{"href":"#id-5","referenceIndex":45,"text":"46","element":"a"},{"text":", ","element":"span"},{"href":"#id-8","referenceIndex":24,"text":"25","element":"a"},{"text":", ","element":"span"},{"href":"#id-6","referenceIndex":33,"text":"33","element":"a"},{"text":", ","element":"span"},{"href":"#id-7","referenceIndex":20,"text":"21","element":"a"},{"text":", ","element":"span"},{"href":"#id-9","referenceIndex":41,"text":"42","element":"a"},{"text":"]. 
We refer the reader to the discussion at the end of this section for a detailed comparison of different reward functions.","element":"span"}],[{"text":"Given a function ","element":"span"},{"style":{"height":15.54},"width":244.16,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-33.png","element":"img","alt":" H : RK → R","inline":true},{"text":", we define for each ","element":"span"},{"style":{"height":13.6},"width":67.6,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-34.png","element":"img","alt":" ε ≥","inline":true,"padRight":true},{"text":"0 the function ","element":"span"},{"style":{"height":17.82},"width":451.88,"height":44.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-35.png","element":"img","alt":" Hε : RK → R such that","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":19.54},"width":471.84,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-36.png","element":"img","alt":" x = (x1, . . . , xK)T ∈ RK,","inline":true}],[{"id":"id-58","style":{"width":"68%"},"width":1270,"height":132,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-37.png","element":"img"}],[{"text":"Note that (","element":"span"},{"style":{"height":18.29},"width":129.8,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/8-38.png","element":"img","alt":"Hε)ε≥0","inline":true,"padRight":true},{"text":"are convex functions if ","element":"span"},{"text":"H ","element":"span"},{"text":"is a convex function. 
The next lemma follows directly from (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") and standard arguments in convex analysis; its proof is given in Appendix ","element":"span"},{"text":"A ","element":"span"},{"text":"for completeness.","element":"span"}],[{"id":"id-56","style":{"width":"96%"},"width":1786,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-0.png","element":"img"}],[{"text":"(1) the function ","element":"span"},{"style":{"height":19.54},"width":379.12,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-1.png","element":"img","alt":" ρ : RK → R ∪ {∞}","inline":true,"padRight":true},{"text":"is convex on ","element":"span"},{"style":{"height":15.14},"width":61.68,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-2.png","element":"img","alt":" RK","inline":true},{"text":", continuous relative to ","element":"span"},{"text":"∆","element":"span"},{"style":{"height":8.8},"width":30,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-3.png","element":"img","alt":"K","inline":true},{"text":", and satisfies that ","element":"span"},{"style":{"height":17.6},"width":1201.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-4.png","element":"img","alt":" ρ(y) ∈ [−c0, 0] for all y ∈ ∆K and ρ(y) = ∞ for all y ∈ (∆K)c,","inline":true}],[{"id":"id-57","text":"(2) it holds for all ","element":"span"},{"style":{"height":20.93},"width":1497.04,"height":52.32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-5.png","element":"img","alt":" x ∈ RK and ε > 0 that Hε(x)−εc0 ≤ H0(x) ≤ Hε(x), Hε(x) = maxy∈∆K(yT x−","inline":true},{"style":{"height":20.8},"width":396.04,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-6.png","element":"img","alt":"ερ(y)), and (∇Hε)(x","inline":true},{"text":") = arg 
max","element":"span"},{"style":{"height":21.76},"width":364.64,"height":54.4,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-7.png","element":"img","alt":"y∈∆K(yT x−ερ(y))","inline":true},{"text":". Consequently, we have for all ","element":"span"},{"style":{"height":19.13},"width":182.16,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-8.png","element":"img","alt":" x, y ∈ RK","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.6},"width":702.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-9.png","element":"img","alt":" ε > 0 that |Hε(x) − Hε(y)| ≤ |x − y|.","inline":true}],[{"text":"We proceed to study the corresponding HJB equation of the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":"), which plays a crucial role in our subsequent analysis. For each ","element":"span"},{"style":{"height":19.54},"width":578.12,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-10.png","element":"img","alt":" λ = (λ1, . . . 
, λK)T ∈ ∆K, let","inline":true},{"style":{"height":19.13},"width":220.16,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-11.png","element":"img","alt":"f λ : O → R","inline":true,"padRight":true},{"text":"be the function satisfying for all ","element":"span"},{"style":{"height":22.05},"width":996,"height":55.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-12.png","element":"img","alt":" x ∈ O that f λ(x) = ∑Kk=1 f(x, ak)λk = λT f(x) (with","inline":true,"padRight":true},{"text":"f ","element":"span"},{"text":"defined as in (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":")), and ","element":"span"},{"style":{"height":15.54},"width":50.24,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-13.png","element":"img","alt":" Lλ ","inline":true,"padRight":true},{"text":"be the elliptic operator satisfying for all ","element":"span"},{"style":{"height":19.14},"width":509.48,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-14.png","element":"img","alt":" φ ∈ C2(O) and x ∈ O that","inline":true}],[{"id":"id-55","style":{"width":"98%"},"width":1813,"height":342,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-15.png","element":"img"}],[{"text":"where we have used the definition of the elliptic operators ","element":"span"},{"href":"#id-52","style":{"height":17.6},"width":440.56,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-16.png","element":"img","alt":" L = (Lk)k∈K (cf. (2.7","inline":true},{"text":")), and the definition of the functions ","element":"span"},{"text":"˜","element":"span"},{"style":{"height":12.8},"width":143.08,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-17.png","element":"img","alt":"b and ˜σ","inline":true,"padRight":true},{"text":"(cf. 
Lemma ","element":"span"},{"href":"#id-37","text":"3.1","element":"a"},{"text":").","element":"span"}],[{"text":"Since the diffusion coefficient of SDE (","element":"span"},{"href":"#id-53","text":"3.1","element":"a"},{"text":") is non-degenerate (see Lemma ","element":"span"},{"href":"#id-37","text":"3.1","element":"a"},{"text":") and all coefficients of the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") are continuous on ","element":"span"},{"style":{"height":15.49},"width":151.92,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-18.png","element":"img","alt":"O × ∆K","inline":true},{"text":", a formal application of the dynamic programming principle (see e.g. [","element":"span"},{"href":"#id-0","referenceIndex":19,"text":"19","element":"a"},{"text":", ","element":"span"},{"href":"#id-54","referenceIndex":13,"text":"13","element":"a"},{"text":"] and references therein) enables us to associate the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") with the following HJB equation:","element":"span"}],[{"style":{"width":"59%"},"width":1108,"height":76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-19.png","element":"img"}],[{"text":"Moreover, (","element":"span"},{"href":"#id-55","text":"3.4","element":"a"},{"text":") and Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":"(","element":"span"},{"href":"#id-57","text":"2","element":"a"},{"text":") imply that the above Dirichlet problem is equivalent to","element":"span"}],[{"id":"id-59","style":{"width":"77%"},"width":1432,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-20.png","element":"img"}],[{"text":"where the function 
","element":"span"},{"style":{"height":14.69},"width":52.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-21.png","element":"img","alt":" Hε","inline":true,"padRight":true},{"text":"is defined as in (","element":"span"},{"href":"#id-58","text":"3.3","element":"a"},{"text":"), and ","element":"span"},{"text":"L","element":"span"},{"text":", ","element":"span"},{"text":"f ","element":"span"},{"text":"are defined as those in (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":").","element":"span"}],[{"text":"In order to rigorously justify the connection between (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") and (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":"), we establish the well-posedness of classical solutions to (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") in Theorem ","element":"span"},{"href":"#id-60","text":"3.4","element":"a"},{"text":", and then prove a verification result in Theorem ","element":"span"},{"href":"#id-2","text":"3.5","element":"a"},{"text":".","element":"span"}],[{"text":"We need the following proposition, which gives an ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate of classical solutions to (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":"). We postpone the proof to Appendix ","element":"span"},{"text":"A","element":"span"},{"text":", which adapts the technique in [","element":"span"},{"href":"#id-40","referenceIndex":16,"text":"16","element":"a"},{"text":", Theorem 7.5 on p. 127] to HJB equations with compact control sets, and reduces the problem to an ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate for HJB equations involving only principal terms.","element":"span"}],[{"id":"id-62","text":"Proposition 3.3. 
","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") hold, and let ","element":"span"},{"style":{"height":24.82},"width":375.48,"height":62.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-22.png","element":"img","alt":" M = supi,j,k |σijk |0;O","inline":true},{"text":". Then there exists a ","element":"span"},{"text":"constant ","element":"span"},{"style":{"height":17.6},"width":462.92,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-23.png","element":"img","alt":" β0 = β0(n, ν, M) ∈ (0, 1)","inline":true},{"text":", such that it holds for all ","element":"span"},{"style":{"height":19.93},"width":739.4,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-24.png","element":"img","alt":" β ∈ (0, min(β0, θ)] that, if uε ∈ C2,β(O)","inline":true,"padRight":true},{"text":"is a solution to the Dirichlet problem ","element":"span"},{"text":"(","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") ","element":"span"},{"text":"with parameter ","element":"span"},{"style":{"height":15.6},"width":279.52,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-25.png","element":"img","alt":" ε > 0, then uε ","inline":true,"padRight":true},{"text":"satisfies the estimate that ","element":"span"},{"style":{"height":18.48},"width":524.36,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-26.png","element":"img","alt":" |uε|2,β ≤ C(|g|2,β + εc0 + 1)","inline":true},{"text":", where the constant ","element":"span"},{"text":"C ","element":"span"},{"text":"depends only on ","element":"span"},{"style":{"height":16.4},"width":344.2,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-27.png","element":"img","alt":" n, ν, Λ, β and 
O.","inline":true}],[{"id":"id-60","text":"Theorem 3.4. ","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") hold, let ","element":"span"},{"style":{"height":24.82},"width":843.24,"height":62.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-28.png","element":"img","alt":" ε > 0 and M = supi,j,k |σijk |0;O. Then the","inline":true,"padRight":true},{"text":"Dirichlet problem ","element":"span"},{"text":"(","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") ","element":"span"},{"text":"admits a unique solution ","element":"span"},{"style":{"height":19.13},"width":409.48,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/9-29.png","element":"img","alt":" uε ∈ C(O) ∩ C2(O).","inline":true,"padRight":true},{"text":"Moreover, there exists","element":"span"}],[{"id":"id-27","style":{"width":"99%"},"width":1843,"height":325,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-0.png","element":"img"}],[{"text":"Proof. ","element":"span"},{"text":"One can deduce by similar arguments as those for Theorem ","element":"span"},{"href":"#id-61","text":"2.1 ","element":"a"},{"text":"and the classical maximum principle that (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") admits a unique classical solution in ","element":"span"},{"style":{"height":19.13},"width":253.4,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-1.png","element":"img","alt":" C(O) ∩ C2(O","inline":true},{"text":"). 
Moreover, by using the ","element":"span"},{"text":"a priori ","element":"span"},{"text":"bound of classical solutions in Proposition ","element":"span"},{"href":"#id-62","text":"3.3","element":"a"},{"text":", we can establish the existence and regularity of the classical solution ","element":"span"},{"href":"#id-59","style":{"height":17.6},"width":187.6,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-2.png","element":"img","alt":" uε to (3.5","inline":true},{"text":") based on the method of continuity; see [","element":"span"},{"href":"#id-40","referenceIndex":16,"text":"16","element":"a"},{"text":", Theorem 5.1 on p. 116].","element":"span"}],[{"text":"Now let ","element":"span"},{"style":{"height":19.94},"width":234.2,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-3.png","element":"img","alt":" uε ∈ C2,β(O","inline":true},{"text":") be the solution to (","element":"span"},{"text":"3.5","element":"span"},{"text":") with some ","element":"span"},{"style":{"height":17.6},"width":161.64,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-4.png","element":"img","alt":" β ∈ (0, θ","inline":true},{"text":"]. 
The continuity of ","element":"span"},{"href":"#id-59","style":{"height":40.83},"width":1846.4,"height":102.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-5.png","element":"img","alt":" Lλ, f λand ρ on ∆K","inline":true},{"text":", and Lemma ","element":"span"},{"text":"3.2","element":"span"},{"text":"(","element":"span"},{"text":"2","element":"span"},{"text":") ensure that the function ","element":"span"},{"style":{"height":15.55},"width":59.12,"height":38.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-6.png","element":"img","alt":" λuε ","inline":true,"padRight":true},{"text":"is well-defined on ","element":"span"},{"text":"O","element":"span"},{"text":", and has the expression ","element":"span"},{"style":{"height":19.55},"width":428.96,"height":48.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-7.png","element":"img","alt":" λuε = (∇Hε)(Luε + f","inline":true},{"text":"). Note that it holds for any given ","element":"span"},{"style":{"height":19.94},"width":402.92,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-8.png","element":"img","alt":" φ1, φ2 ∈ Cβ(O) that","inline":true},{"style":{"height":19.94},"width":252.44,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-9.png","element":"img","alt":"φ1φ2 ∈ Cβ(O","inline":true},{"text":"). Hence the Hölder continuity of the coefficients (see (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":")) implies that ","element":"span"},{"style":{"height":14},"width":183.08,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-10.png","element":"img","alt":" Luε + f ∈","inline":true},{"style":{"height":19.94},"width":190.8,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-11.png","element":"img","alt":"Cβ(O, RK","inline":true},{"text":"). 
We can then easily deduce from the local Lipschitz continuity of ","element":"span"},{"style":{"height":17.82},"width":332.88,"height":44.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-12.png","element":"img","alt":" ∇Hε : RK → RK","inline":true,"padRight":true},{"text":"that ","element":"span"},{"style":{"height":19.94},"width":338.88,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-13.png","element":"img","alt":" λuε ∈ Cβ(O, RK).","inline":true}],[{"text":"The next theorem shows that the function (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") is an optimal feedback control of (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":"), which is defined similarly to Definition ","element":"span"},{"href":"#id-63","text":"2.2","element":"a"},{"text":". The proof of this statement is similar to that of Theorem ","element":"span"},{"href":"#id-1","text":"2.2 ","element":"a"},{"text":"and hence omitted.","element":"span"}],[{"id":"id-2","text":"Theorem 3.5. ","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") hold. 
Let ","element":"span"},{"style":{"height":15.6},"width":339.2,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-14.png","element":"img","alt":" ε > 0, vε :O → R","inline":true,"padRight":true},{"text":"be the value function defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":")","element":"span"},{"text":", ","element":"span"},{"style":{"height":19.14},"width":368.84,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-15.png","element":"img","alt":" uε ∈ C(O) ∩ C2(O)","inline":true,"padRight":true},{"text":"be the solution to the Dirichlet problem ","element":"span"},{"text":"(","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":")","element":"span"},{"text":", and ","element":"span"},{"style":{"height":18.03},"width":269.52,"height":45.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-16.png","element":"img","alt":" λuε :O → ∆K","inline":true,"padRight":true},{"text":"be the function defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":")","element":"span"},{"text":". Then ","element":"span"},{"style":{"height":19.75},"width":737.84,"height":49.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-17.png","element":"img","alt":" uε(x) = vε(x) for all x ∈O, and λuε ","inline":true,"padRight":true},{"text":"is an optimal feedback control of ","element":"span"},{"text":"(","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":")","element":"span"},{"text":".","element":"span"}],[{"text":"Remark ","element":"span"},{"text":"3.1","element":"span"},{"text":". 
","element":"span"},{"text":"Theorem ","element":"span"},{"href":"#id-60","text":"3.4 ","element":"a"},{"text":"shows that the feedback control ","element":"span"},{"style":{"height":15.55},"width":59.12,"height":38.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-18.png","element":"img","alt":" λuε ","inline":true,"padRight":true},{"text":"is uniquely defined and Hölder continuous. This improved regularity makes it easier to implement the relaxed control ","element":"span"},{"style":{"height":15.55},"width":118.56,"height":38.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-19.png","element":"img","alt":" λuε in","inline":true,"padRight":true},{"text":"practice, compared to the original (merely measurable) feedback control ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-20.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"(cf. Theorem ","element":"span"},{"href":"#id-61","text":"2.1","element":"a"},{"text":").","element":"span"}],[{"text":"We end this section with a remark about possible choices of reward functions. 
","element":"span"},{"text":"Generally speaking, we shall choose a reward function ","element":"span"},{"style":{"height":12},"width":23,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-21.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"whose generating function ","element":"span"},{"text":"H ","element":"span"},{"text":"and its gradient ","element":"span"},{"style":{"height":12.4},"width":75.48,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-22.png","element":"img","alt":" ∇H","inline":true,"padRight":true},{"text":"can be efficiently evaluated, such that one can design an efficient algorithm to solve the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") (see e.g. [","element":"span"},{"href":"#id-5","referenceIndex":45,"text":"46","element":"a"},{"text":", ","element":"span"},{"href":"#id-8","referenceIndex":24,"text":"25","element":"a"},{"text":", ","element":"span"},{"href":"#id-6","referenceIndex":33,"text":"33","element":"a"},{"text":", ","element":"span"},{"href":"#id-7","referenceIndex":20,"text":"21","element":"a"},{"text":", ","element":"span"},{"href":"#id-29","referenceIndex":25,"text":"26","element":"a"},{"text":"]). A common choice of reward functions in the literature is the following entropy-type reward function (see e.g. 
[","element":"span"},{"href":"#id-64","referenceIndex":28,"text":"28","element":"a"},{"text":", ","element":"span"},{"href":"#id-65","referenceIndex":35,"text":"35","element":"a"},{"text":", ","element":"span"},{"href":"#id-66","referenceIndex":36,"text":"36","element":"a"},{"text":", ","element":"span"},{"href":"#id-9","referenceIndex":41,"text":"42","element":"a"},{"text":"]):","element":"span"}],[{"style":{"width":"40%"},"width":747,"height":132,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-23.png","element":"img"}],[{"text":"whose generating function is ","element":"span"},{"style":{"height":22.05},"width":718.08,"height":55.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-24.png","element":"img","alt":" Hen(x) = ln ΣKk=1 exp(xk), x ∈ RK.","inline":true,"padRight":true},{"text":"One can show that ","element":"span"},{"style":{"height":14.69},"width":122.12,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-25.png","element":"img","alt":" Hen ∈","inline":true},{"style":{"height":19.54},"width":375.12,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-26.png","element":"img","alt":"C∞(RK) ∩ C2,1(RK","inline":true},{"text":"), and it satisfies (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") with ","element":"span"},{"style":{"height":15.09},"width":179.68,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-27.png","element":"img","alt":" c0 = ln K","inline":true,"padRight":true},{"text":"(see e.g. 
[","element":"span"},{"href":"#id-66","referenceIndex":36,"text":"36","element":"a"},{"text":"]).","element":"span"}],[{"text":"The advantage of the entropy reward function is that both ","element":"span"},{"style":{"height":15.09},"width":278.2,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-28.png","element":"img","alt":" Hen and ∇Hen","inline":true,"padRight":true},{"text":"are given in closed form, and they can be naturally extended to continuous action spaces ","element":"span"},{"text":"A ","element":"span"},{"text":"(see e.g. [","element":"span"},{"href":"#id-9","referenceIndex":41,"text":"42","element":"a"},{"text":"]). However, it is important to notice that the evaluation of ","element":"span"},{"style":{"height":15.09},"width":289.24,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-29.png","element":"img","alt":" Hen and ∇Hen","inline":true,"padRight":true},{"text":"involves exponentials. ","element":"span"},{"text":"Hence, when the relaxation parameter ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-30.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"is small, a naive implementation of iterative algorithms for solving (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":"), which in general involves evaluating the value and inverse of ","element":"span"},{"style":{"height":15.09},"width":381.52,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-31.png","element":"img","alt":" Hen and ∇Hen at a","inline":true,"padRight":true},{"text":"large argument ","element":"span"},{"style":{"height":19.54},"width":793.88,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/10-32.png","element":"img","alt":" z = (Luε(x) + f(x))/ε ∈ RK with x ∈ O","inline":true},{"text":", may lead to 
unreliable results due to unstable floating-point arithmetic; see [","element":"span"},{"href":"#id-67","referenceIndex":10,"text":"10","element":"a"},{"text":", Example 4.2] and [","element":"span"},{"href":"#id-68","referenceIndex":11,"text":"11","element":"a"},{"text":"] for more details. ","element":"span"},{"text":"Moreover, since ","element":"span"},{"style":{"height":19.53},"width":633.36,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-0.png","element":"img","alt":" ∇Hen(x) ∈ (0, 1)K for all x ∈ RK","inline":true},{"text":", the optimal relaxed control of (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") may converge to the optimal control of (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") with a very slow rate as the relaxation parameter ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-1.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"tends to zero.","element":"span"}],[{"text":"Alternatively, by virtue of the fact that only the generating function ","element":"span"},{"text":"H ","element":"span"},{"text":"and its gradient are involved in the HJB equation (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") and the feedback control (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":"), we can also obtain a reward function ","element":"span"},{"style":{"height":12},"width":23,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-2.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"by directly constructing a ","element":"span"},{"text":"K","element":"span"},{"text":"-dimensional function ","element":"span"},{"text":"H ","element":"span"},{"text":"based on a recursive application of smoothing functions for the two-dimensional max function. 
For instance, we can start with the following two-dimensional smoothing functions (see e.g. [","element":"span"},{"href":"#id-12","referenceIndex":15,"text":"15","element":"a"},{"text":", ","element":"span"},{"href":"#id-13","referenceIndex":44,"text":"45","element":"a"},{"text":"]): for ","element":"span"},{"style":{"height":19.54},"width":366.72,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-3.png","element":"img","alt":" x = (x1, x2)T ∈ R2,","inline":true}],[{"id":"id-69","style":{"width":"88%"},"width":1628,"height":306,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-4.png","element":"img"}],[{"text":"Then, for any given ","element":"span"},{"style":{"height":14.4},"width":86.32,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-5.png","element":"img","alt":" K ≥","inline":true,"padRight":true},{"text":"3, by using the fact that max","element":"span"},{"style":{"height":18.29},"width":840.96,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-6.png","element":"img","alt":"k∈K xk = max(maxi∈K1 xi, maxj∈K2 xj), with","inline":true},{"style":{"height":17.6},"width":1185.44,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-7.png","element":"img","alt":"K1 = {1, . . . , K0}, K2 = {K0+1, . . . , K} and K0 = ⌊(K+1)/2⌋","inline":true},{"text":", we can express the ","element":"span"},{"text":"K","element":"span"},{"text":"-dimensional max function as a nested application of the two-dimensional max function and one-dimensional identity function. Hence, by replacing the two-dimensional max function with the two-dimensional smoothing function (","element":"span"},{"href":"#id-69","text":"3.7","element":"a"},{"text":") (resp. 
(","element":"span"},{"href":"#id-69","text":"3.8","element":"a"},{"text":")) in the recursive expression, we can obtain the ","element":"span"},{"text":"K","element":"span"},{"text":"-dimensional smoothing function ","element":"span"},{"style":{"height":20.22},"width":1030.32,"height":50.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-8.png","element":"img","alt":" Hchks ∈ C∞(RK) ∩ C2,1(RK) (resp. Hzang ∈ C2,1(RK","inline":true},{"text":")). It has been shown in [","element":"span"},{"href":"#id-67","referenceIndex":10,"text":"10","element":"a"},{"text":", Lemma 3.3] that for any given ","element":"span"},{"style":{"height":14.4},"width":87.76,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-9.png","element":"img","alt":" K ≥","inline":true,"padRight":true},{"text":"2, both functions ","element":"span"},{"style":{"height":17.49},"width":307.88,"height":43.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-10.png","element":"img","alt":" Hchks and Hzang","inline":true,"padRight":true},{"text":"satisfy (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") with ","element":"span"},{"style":{"height":18.29},"width":1484.64,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-11.png","element":"img","alt":"c0 = (log2(K − 1) + 1)/2 for Hchks, and c0 = 3(log2(K − 1) + 1)/32 for Hzang.","inline":true}],[{"text":"Note that, the evaluation of ","element":"span"},{"style":{"height":17.09},"width":235.88,"height":42.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-12.png","element":"img","alt":" Hchks, Hzang","inline":true,"padRight":true},{"text":"and their gradients only involves square-roots and multiplications, hence they are numerically more stable than the entropy-type smoothing 
","element":"span"},{"style":{"height":14.69},"width":70.36,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-13.png","element":"img","alt":" Hen","inline":true,"padRight":true},{"text":"(see [","element":"span"},{"href":"#id-67","referenceIndex":10,"text":"10","element":"a"},{"text":"]). More importantly, since ","element":"span"},{"style":{"height":17.09},"width":103.88,"height":42.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-14.png","element":"img","alt":" Hzang","inline":true,"padRight":true},{"text":"only modifies the function ","element":"span"},{"style":{"height":14.69},"width":53.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-15.png","element":"img","alt":" H0","inline":true,"padRight":true},{"text":"locally near the non-differentiable points, we can determine the optimal control of (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") precisely from the optimal control of (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") without sending the relaxation parameter ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-16.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"to zero (see Theorem ","element":"span"},{"href":"#id-32","text":"6.4 ","element":"a"},{"text":"and Remark ","element":"span"},{"href":"#id-70","text":"6.2 ","element":"a"},{"text":"for details).","element":"span"}],[{"text":"Figure ","element":"span"},{"href":"#id-71","text":"1 ","element":"a"},{"text":"compares the functions ","element":"span"},{"style":{"height":19.82},"width":415.04,"height":49.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-17.png","element":"img","alt":" Hen, Hzang : R3 → R","inline":true,"padRight":true},{"text":"and the reward functions generated by them. 
One can clearly see from Figure ","element":"span"},{"href":"#id-71","text":"1 ","element":"a"},{"text":"(left) that ","element":"span"},{"style":{"height":14.69},"width":70.36,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-18.png","element":"img","alt":" Hen","inline":true,"padRight":true},{"text":"substantially modifies the pointwise maximum function ","element":"span"},{"style":{"height":14.69},"width":53.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-19.png","element":"img","alt":" H0","inline":true,"padRight":true},{"text":"everywhere, while ","element":"span"},{"style":{"height":17.09},"width":103.88,"height":42.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-20.png","element":"img","alt":" Hzang","inline":true,"padRight":true},{"text":"only performs a modification of ","element":"span"},{"style":{"height":14.69},"width":53.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-21.png","element":"img","alt":" H0","inline":true,"padRight":true},{"text":"locally near the kinks. For both functions, the difference from ","element":"span"},{"style":{"height":14.69},"width":53.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-22.png","element":"img","alt":" H0","inline":true,"padRight":true},{"text":"peaks around the points where arg max","element":"span"},{"style":{"height":13.02},"width":119.28,"height":32.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-23.png","element":"img","alt":"k∈K xk","inline":true,"padRight":true},{"text":"is not a singleton. 
Such points correspond to the regions where the agent of the control problem (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") cannot make a clear decision based on the current model, since two or more different actions would result in a very similar reward.","element":"span"}],[{"text":"Figure ","element":"span"},{"href":"#id-71","text":"1 ","element":"a"},{"text":"(right) depicts the reward functions ","element":"span"},{"style":{"height":18.29},"width":870.64,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-24.png","element":"img","alt":" ρen(y1, y2, y3) and ρzang(y1, y2, y3) with y3 =","inline":true,"padRight":true},{"text":"1 ","element":"span"},{"style":{"height":12},"width":185.48,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-25.png","element":"img","alt":" − y1 − y2","inline":true},{"text":", for all (","element":"span"},{"style":{"height":19.14},"width":1438.76,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-26.png","element":"img","alt":"y1, y2) ∈ C := {(y1, y2) ∈ R2 | 0 ≤ y1, y2 ≤ 1, y1 + y2 ≤ 1}. 
The point","inline":true,"padRight":true},{"text":"(1","element":"span"},{"text":"/","element":"span"},{"text":"3","element":"span"},{"text":", ","element":"span"},{"text":"1","element":"span"},{"text":"/","element":"span"},{"text":"3","element":"span"},{"text":", ","element":"span"},{"text":"1","element":"span"},{"text":"/","element":"span"},{"text":"3) corresponds to the pure exploration strategy, i.e., the uniform distribution on the action space ","element":"span"},{"style":{"height":17.6},"width":308.56,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-27.png","element":"img","alt":" A = {a1, a2, a3}","inline":true},{"text":", while the vertices of ","element":"span"},{"text":"C ","element":"span"},{"text":"correspond to the pure exploitation strategies, i.e., the Dirac measures supported on some ","element":"span"},{"style":{"height":15.09},"width":163.2,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-28.png","element":"img","alt":" ai ∈ A.","inline":true,"padRight":true},{"text":"Both functions achieve their minimum around the point (1","element":"span"},{"text":"/","element":"span"},{"text":"3","element":"span"},{"text":", ","element":"span"},{"text":"1","element":"span"},{"text":"/","element":"span"},{"text":"3","element":"span"},{"text":", ","element":"span"},{"text":"1","element":"span"},{"text":"/","element":"span"},{"text":"3), which indicates that the exploration reward functions encourage the controller of the relaxed control problem to explore further, especially when it is difficult to choose a unique optimal action based on the current model.","element":"span"}],[{"text":"Note that, by comparing the values of the reward functions near the point (1","element":"span"},{"text":"/","element":"span"},{"text":"3","element":"span"},{"text":", ","element":"span"},{"text":"1","element":"span"},{"text":"/","element":"span"},{"text":"3","element":"span"},{"text":", 
","element":"span"},{"text":"1","element":"span"},{"text":"/","element":"span"},{"text":"3) and near the vertices of ","element":"span"},{"text":"C","element":"span"},{"text":", we see that ","element":"span"},{"style":{"height":12},"width":56.44,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-29.png","element":"img","alt":" ρen","inline":true,"padRight":true},{"text":"in general gives more rewards for exploration than ","element":"span"},{"style":{"height":13.09},"width":89.96,"height":32.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-30.png","element":"img","alt":"ρzang","inline":true},{"text":". Consequently, to recover the value function and optimal control of (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":"), we have to take a smaller relaxation parameter for (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") with ","element":"span"},{"style":{"height":12},"width":56.44,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-31.png","element":"img","alt":" ρen","inline":true,"padRight":true},{"text":"than that for (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") with ","element":"span"},{"style":{"height":13.09},"width":89.96,"height":32.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-32.png","element":"img","alt":" ρzang","inline":true},{"text":", which could cause a numerical instability issue due to the exponentials in ","element":"span"},{"style":{"height":15.09},"width":278.68,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/11-33.png","element":"img","alt":" Hen and ∇Hen","inline":true,"padRight":true},{"text":"(see e.g. 
[","element":"span"},{"href":"#id-67","referenceIndex":10,"text":"10","element":"a"},{"text":"]).","element":"span"}],[{"id":"id-71","style":{"width":"77%"},"width":1438,"height":1271,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-0.png","element":"img"}],[{"text":"Figure 1: Comparison of ","element":"figcaption","subtype":"caption"},{"style":{"height":17.49},"width":275.72,"height":43.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-1.png","element":"img","alt":" Hen and Hzang","inline":true,"padRight":true},{"text":"and their corresponding reward functions for ","element":"figcaption","subtype":"caption"},{"text":"K ","element":"figcaption","subtype":"caption"},{"text":"= 3.","element":"figcaption","subtype":"caption"}]]},{"heading":"4 Lipschitz stability of optimal feedback relaxed control","paragraphs":[[{"text":"In this section, we shall fix a relaxation parameter ","element":"span"},{"style":{"height":10.4},"width":74.32,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-2.png","element":"img","alt":" ε >","inline":true,"padRight":true},{"text":"0 and study the robustness of the feedback control strategy (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") for a relaxed control problem associated with a perturbed model. In particular, we shall show that the control strategy (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") admits a (locally) Lipschitz continuous dependence on the perturbation of the coefficients, if the reward function is generated by a function ","element":"span"},{"text":"H ","element":"span"},{"text":"with locally Lipschitz continuous Hessian.","element":"span"}],[{"text":"We start by presenting two technical results, which are essential for our subsequent analysis. 
The first one is due to Nugari [","element":"span"},{"href":"#id-72","referenceIndex":34,"text":"34","element":"a"},{"text":"], which establishes the regularity of Nemytskij operators in Hölder spaces.","element":"span"}],[{"id":"id-73","style":{"height":17.6},"width":899.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-3.png","element":"img","alt":"Lemma 4.1. Let n, K ∈ N, α ∈ (0, 1], O ⊂ Rn ","inline":true,"padRight":true},{"text":"be an open bounded set, ","element":"span"},{"style":{"height":19.14},"width":226.88,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-4.png","element":"img","alt":" φ : RK → R","inline":true,"padRight":true},{"text":"be a continuously differentiable function, and ","element":"span"},{"text":"Φ : ","element":"span"},{"style":{"height":19.54},"width":620.36,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-5.png","element":"img","alt":" u ∈ Cα(O, RK) ↦ Φ[u] ∈ Cα(O)","inline":true,"padRight":true},{"text":"be the Nemytskij operator satisfying for all ","element":"span"},{"style":{"height":17.6},"width":1113.44,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-6.png","element":"img","alt":" u = (u1, . . . uK) that Φ[u](x) = φ(u(x)), x ∈O. Then Φ","inline":true,"padRight":true},{"text":"is well-defined, continuous and bounded. Moreover, if we further suppose ","element":"span"},{"style":{"height":16.4},"width":62.48,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-7.png","element":"img","alt":" ∇φ","inline":true,"padRight":true},{"text":"is locally Lipschitz continuous (resp. 
","element":"span"},{"style":{"height":16.4},"width":26,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-8.png","element":"img","alt":" φ","inline":true,"padRight":true},{"text":"is twice continuously differentiable), then ","element":"span"},{"text":"Φ ","element":"span"},{"text":"is locally Lipschitz continuous (resp. continuously differentiable with the Fréchet derivative ","element":"span"},{"text":"Φ","element":"span"},{"style":{"height":19.54},"width":768.52,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-9.png","element":"img","alt":"′[u] = (∇φ)T (u) for all u ∈ Cα(O, RK)).","inline":true}],[{"id":"id-30","text":"Remark ","element":"span"},{"text":"4.1","element":"span"},{"text":". ","element":"span"},{"text":"Lemma ","element":"span"},{"href":"#id-73","text":"4.1 ","element":"a"},{"text":"enables us to view the fully nonlinear HJB operator ","element":"span"},{"href":"#id-59","style":{"height":17.6},"width":300,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-10.png","element":"img","alt":" Fε in (3.5) and","inline":true,"padRight":true},{"text":"the value-to-action map ","element":"span"},{"style":{"height":15.55},"width":169.52,"height":38.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/12-11.png","element":"img","alt":" uε ↦ λuε ","inline":true,"padRight":true},{"text":"defined in (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") as differentiable maps between suitable Hölder spaces, which is essential for the sensitivity analysis on the value functions and feedback relaxed controls in Section ","element":"span"},{"text":"5","element":"span"},{"text":".","element":"span"}],[{"text":"Note that in general it is not possible to perform the same first-order sensitivity analysis by interpreting the HJB operator 
","element":"span"},{"style":{"height":14.69},"width":43.84,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-0.png","element":"img","alt":" Fε","inline":true,"padRight":true},{"text":"as a map between the Sobolev space ","element":"span"},{"style":{"height":19.14},"width":144.44,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-1.png","element":"img","alt":" W 2,p(O","inline":true},{"text":") and the Lebesgue space ","element":"span"},{"style":{"height":17.6},"width":99.8,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-2.png","element":"img","alt":" Lq(O","inline":true},{"text":"). In fact, since the operator ","element":"span"},{"style":{"height":19.13},"width":424.76,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-3.png","element":"img","alt":" Fε : W 2,p(O) → Lq(O","inline":true},{"text":") in general is only differentiable with ","element":"span"},{"text":"p > q ","element":"span"},{"text":"(see [","element":"span"},{"href":"#id-28","referenceIndex":39,"text":"40","element":"a"},{"text":", Theorem 13]), we see the derivative of ","element":"span"},{"style":{"height":14.69},"width":43.84,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-4.png","element":"img","alt":" Fε","inline":true},{"text":", which is a second-order linear elliptic operator, is not bijective between ","element":"span"},{"style":{"height":19.14},"width":368.6,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-5.png","element":"img","alt":" W 2,p(O) and Lq(O","inline":true},{"text":"). 
Consequently, we cannot apply the implicit function theorem to derive the sensitivity equation for the value function (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") as in Theorem ","element":"span"},{"href":"#id-23","text":"5.2","element":"a"},{"text":".","element":"span"}],[{"text":"If the operator ","element":"span"},{"style":{"height":14.69},"width":43.84,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-6.png","element":"img","alt":" Fε","inline":true,"padRight":true},{"text":"is only semilinear, i.e., the diffusion coefficient of (","element":"span"},{"href":"#id-36","text":"2.2","element":"a"},{"text":") is uncontrolled, then one can show that ","element":"span"},{"style":{"height":14.69},"width":43.84,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-7.png","element":"img","alt":" Fε","inline":true,"padRight":true},{"text":"is differentiable between ","element":"span"},{"style":{"height":19.14},"width":839.72,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-8.png","element":"img","alt":" W 2,p(O) and Lp(O) for 1 < p < ∞, and its","inline":true,"padRight":true},{"text":"derivative is a bijection between the same spaces (see [","element":"span"},{"href":"#id-29","referenceIndex":25,"text":"26","element":"a"},{"text":"] for the case with ","element":"span"},{"text":"p ","element":"span"},{"text":"= 2). 
In this case, we can extend Theorem ","element":"span"},{"href":"#id-23","text":"5.2 ","element":"a"},{"text":"and study ","element":"span"},{"style":{"height":12},"width":47.76,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-9.png","element":"img","alt":" Lp","inline":true},{"text":"-perturbation of the coefficients in (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":").","element":"span"}],[{"text":"Now we proceed to introduce a relaxed control problem with a set of perturbed coefficients satisfying the following conditions:","element":"span"}],[{"id":"id-74","style":{"height":17.6},"width":515.04,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-10.png","element":"img","alt":"H.3. Let ν > 0, θ ∈ (0, 1]","inline":true,"padRight":true},{"text":"be the constants in (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":"), and ","element":"span"},{"text":"Λ","element":"span"},{"style":{"height":12.4},"width":100.24,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-11.png","element":"img","alt":"′ > 0","inline":true,"padRight":true},{"text":"be a constant. The functions ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":21.41},"width":1176.48,"height":53.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-12.png","element":"img","alt":"b : Rn ×A → Rn, ˆσ : Rn ×A → Rn×n, ˆc :O ×A → [0, ∞), ˆf :","inline":true}],[{"style":{"width":"93%"},"width":1737,"height":178,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-13.png","element":"img"}],[{"text":"Let ","element":"span"},{"style":{"height":10.4},"width":72.4,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-14.png","element":"img","alt":" ε >","inline":true,"padRight":true},{"text":"0 be a fixed relaxation parameter. 
We shall consider a perturbed control problem (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") with the coefficients (","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":21.01},"width":191.44,"height":52.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-15.png","element":"img","alt":"b, ˆσ, ˆc, ˆf, ˆg","inline":true},{"text":"), and its relaxation (see (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":")) with parameter ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-16.png","element":"img","alt":" ε","inline":true},{"text":", whose value function is denoted as ˆ","element":"span"},{"style":{"height":12.73},"width":38.56,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-17.png","element":"img","alt":"vε","inline":true},{"text":". Then, by using Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":", Theorems ","element":"span"},{"href":"#id-60","text":"3.4 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-2","text":"3.5","element":"a"},{"text":", one can verify that, under (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-74","text":"3","element":"a"},{"text":"), the value function ˆ","element":"span"},{"style":{"height":12.74},"width":38.56,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-18.png","element":"img","alt":"vε ","inline":true,"padRight":true},{"text":"is the classical solution ˆ","element":"span"},{"style":{"height":19.14},"width":509.6,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-19.png","element":"img","alt":"uε ∈ C(O) ∩ C2(O) of the","inline":true,"padRight":true},{"text":"following Dirichlet 
problem:","element":"span"}],[{"id":"id-75","style":{"width":"88%"},"width":1635,"height":77,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-20.png","element":"img"}],[{"text":"where the function ","element":"span"},{"style":{"height":14.69},"width":52.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-21.png","element":"img","alt":" Hε","inline":true,"padRight":true},{"text":"is defined as in (","element":"span"},{"href":"#id-58","text":"3.3","element":"a"},{"text":"), ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":15.54},"width":262.32,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-22.png","element":"img","alt":"f : O → RK ","inline":true,"padRight":true},{"text":"is the function satisfying ","element":"span"},{"text":"ˆ","element":"span"},{"text":"f","element":"span"},{"text":"(","element":"span"},{"text":"x","element":"span"},{"text":") = ( ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":21.22},"width":879.92,"height":53.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-23.png","element":"img","alt":"f(x, ak))k∈K for all x ∈O, and ˆL = ( ˆLk)k∈K","inline":true,"padRight":true},{"text":"is a family of elliptic operators satisfying for all","element":"span"}],[{"style":{"width":"86%"},"width":1599,"height":128,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-24.png","element":"img"}],[{"text":"Moreover, we can deduce from (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") that the optimal feedback control of the perturbed relaxed control problem is given by","element":"span"}],[{"id":"id-82","style":{"width":"93%"},"width":1737,"height":86,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-25.png","element":"img"}],[{"text":"Note that Theorem ","element":"span"},{"href":"#id-60","text":"3.4 ","element":"a"},{"text":"shows 
that the classical solution ˆ","element":"span"},{"href":"#id-75","style":{"height":19.93},"width":768.96,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-26.png","element":"img","alt":"uε of (4.1) is in C2,β(O) for some β > 0,","inline":true,"padRight":true},{"text":"so the above function ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":15.75},"width":59.12,"height":39.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-27.png","element":"img","alt":"λˆuε ","inline":true,"padRight":true},{"text":"is well-defined on ","element":"span"},{"style":{"height":13.2},"width":73.44,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-28.png","element":"img","alt":" ∂O.","inline":true},{"text":"The following result shows the (local) Lipschitz dependence of ˆ","element":"span"},{"style":{"height":20.61},"width":600.12,"height":51.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-29.png","element":"img","alt":"uε −uε and ˆλˆuε −λuε on pertur-","inline":true,"padRight":true},{"text":"bation of the coefficients, which demonstrates the robustness of the relaxed control problem. 
For notational simplicity, given the functions (","element":"span"},{"style":{"height":21.41},"width":510.64,"height":53.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-30.png","element":"img","alt":"b, σ, c, f, g) and (ˆb, ˆσ, ˆc, ˆf, ˆg","inline":true},{"text":") satisfying (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-74","text":"3","element":"a"},{"text":") respectively, we shall introduce for each ","element":"span"},{"style":{"height":17.6},"width":158.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-31.png","element":"img","alt":" β ∈ (0, θ","inline":true},{"text":"] the following measurement of perturbations:","element":"span"}],[{"id":"id-77","style":{"width":"99%"},"width":1844,"height":183,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/13-32.png","element":"img"}],[{"id":"id-21","style":{"width":"101%"},"width":1867,"height":524,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-0.png","element":"img"}],[{"text":"Proof. 
","element":"span"},{"text":"Throughout this proof, we shall denote by ","element":"span"},{"text":"C ","element":"span"},{"text":"a generic constant, which depends only on ","element":"span"},{"style":{"height":11.2},"width":32.16,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-1.png","element":"img","alt":" ε,","inline":true},{"style":{"height":17.89},"width":610.04,"height":44.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-2.png","element":"img","alt":"n, K, ν, Λ, Λ′, β, c0, Mg and O","inline":true},{"text":", and may take a different value at each occurrence.","element":"span"}],[{"text":"The ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate in Proposition ","element":"span"},{"href":"#id-62","text":"3.3 ","element":"a"},{"text":"shows that there exists a constant ","element":"span"},{"style":{"height":17.6},"width":354.44,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-3.png","element":"img","alt":" β0 = β0(n, ν, M) ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"text":", ","element":"span"},{"text":"1), such that we have for all ","element":"span"},{"style":{"height":17.6},"width":311.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-4.png","element":"img","alt":" β ∈ (0, min(β0, θ","inline":true},{"text":")] the estimates ","element":"span"},{"style":{"height":18.48},"width":343.12,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-5.png","element":"img","alt":" |uε|2,β, |ˆuε|2,β ≤ C","inline":true},{"text":". 
Moreover, we have by the fundamental theorem of calculus that","element":"span"}],[{"id":"id-76","style":{"width":"98%"},"width":1828,"height":112,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-6.png","element":"img"}],[{"text":"in ","element":"span"},{"style":{"height":16.8},"width":414.96,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-7.png","element":"img","alt":" O, where η :O → ∆K","inline":true,"padRight":true},{"text":"is the function defined as ","element":"span"},{"style":{"height":23.57},"width":885.6,"height":58.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-8.png","element":"img","alt":" η := ∫_0^1 (∇Hε)(s(Luε + f) + (1−s)(ˆLˆuε + ˆf)) ds.","inline":true}],[{"text":"Now let ","element":"span"},{"style":{"height":17.6},"width":311.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-9.png","element":"img","alt":" β ∈ (0, min(β0, θ","inline":true},{"text":")] be a fixed constant. 
The fact that ","element":"span"},{"href":"#id-50","style":{"height":19.54},"width":611.52,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-10.png","element":"img","alt":" ∇Hε ∈ C1(RK, ∆K) (see (H.2)),","inline":true,"padRight":true},{"text":"the Hölder continuity of coefficients (see (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-74","text":"3","element":"a"},{"text":")), and the ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimates of ","element":"span"},{"style":{"height":18.48},"width":197.76,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-11.png","element":"img","alt":" |uε|2,β and","inline":true},{"style":{"height":18.48},"width":113.12,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-12.png","element":"img","alt":"|ˆuε|2,β","inline":true,"padRight":true},{"text":"yield the estimate that ","element":"span"},{"style":{"height":18.48},"width":162.64,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-13.png","element":"img","alt":" |η|β ≤ C","inline":true,"padRight":true},{"text":"(see Lemma ","element":"span"},{"href":"#id-73","text":"4.1","element":"a"},{"text":"). 
Then, by setting ","element":"span"},{"style":{"height":19.94},"width":438.24,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-14.png","element":"img","alt":" w = uε − ˆuε ∈ C2,β(O),","inline":true,"padRight":true},{"text":"we can deduce from (","element":"span"},{"href":"#id-76","text":"4.5","element":"a"},{"text":") that ","element":"span"},{"text":"w ","element":"span"},{"text":"is the classical solution to the following Dirichlet problem:","element":"span"}],[{"style":{"width":"66%"},"width":1221,"height":58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-15.png","element":"img"}],[{"text":"Hence the fact that ","element":"span"},{"style":{"height":19.94},"width":277.2,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-16.png","element":"img","alt":" η ∈ Cβ(O, ∆K","inline":true},{"text":") and the global Schauder estimate in [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 6.6] lead us to the estimate that","element":"span"}],[{"style":{"width":"57%"},"width":1054,"height":58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-17.png","element":"img"}],[{"text":"which, together with the maximum principle (see [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 3.7]) and the ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate of ","element":"span"},{"style":{"height":18.48},"width":113.12,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-18.png","element":"img","alt":"|ˆuε|2,β","inline":true},{"text":", enables us to conclude that:","element":"span"}],[{"style":{"width":"75%"},"width":1392,"height":58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-19.png","element":"img"}],[{"text":"with the constant 
","element":"span"},{"style":{"height":17.68},"width":100.64,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-20.png","element":"img","alt":" Eper,β","inline":true,"padRight":true},{"text":"defined as in (","element":"span"},{"href":"#id-77","text":"4.3","element":"a"},{"text":"). Now we show the stability of feedback controls. Note that (","element":"span"},{"href":"#id-21","text":"4.4","element":"a"},{"text":") implies that","element":"span"}],[{"style":{"width":"73%"},"width":1351,"height":58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-21.png","element":"img"}],[{"text":"The additional assumption that ","element":"span"},{"href":"#id-50","style":{"height":19.54},"width":391.12,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-22.png","element":"img","alt":" H : RK → R in (H.2","inline":true},{"text":") has a locally Lipschitz continuous Hessian implies that ","element":"span"},{"style":{"height":14.69},"width":88.96,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-23.png","element":"img","alt":" ∇Hε","inline":true,"padRight":true},{"text":"is differentiable with locally Lipschitz continuous derivatives, which along with Lemma ","element":"span"},{"href":"#id-73","text":"4.1 ","element":"a"},{"text":"shows that the Nemytskij operator ","element":"span"},{"style":{"height":19.94},"width":595.44,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-24.png","element":"img","alt":" ∇Hε : Cβ(O, RK) → Cβ(O, RK","inline":true},{"text":") is locally Lipschitz continuous. 
Hence there exists a constant ","element":"span"},{"text":"C","element":"span"},{"text":", such that for all perturbed coefficients (","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":21.41},"width":207.08,"height":53.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-25.png","element":"img","alt":"b, ˆσ, ˆc, ˆf, ˆg)","inline":true,"padRight":true},{"text":"satisfying (H.","element":"span"},{"href":"#id-74","text":"3","element":"a"},{"text":"), we have","element":"span"}],[{"style":{"width":"64%"},"width":1187,"height":132,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/14-26.png","element":"img"}],[{"text":"which finishes the desired (local) Lipschitz estimate.","element":"span"}],[{"id":"id-88","text":"Remark ","element":"span"},{"text":"4.2","element":"span"},{"text":". ","element":"span"},{"text":"The assumption that ","element":"span"},{"href":"#id-50","style":{"height":19.53},"width":443.44,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-0.png","element":"img","alt":" H : RK → R in (H.2","inline":true},{"text":") has a locally Lipschitz continuous Hessian is satisfied by most commonly used functions, including ","element":"span"},{"style":{"height":17.49},"width":585.6,"height":43.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-1.png","element":"img","alt":" Hen, Hchks and Hzang given in","inline":true,"padRight":true},{"text":"Section ","element":"span"},{"text":"3","element":"span"},{"text":". 
In general, if ","element":"span"},{"text":"H ","element":"span"},{"text":"is merely twice continuously differentiable as in (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":"), we can follow a similar argument and establish that the Hölder norm of the difference between two relaxed control strategies is continuously dependent on the Hölder norms of the perturbations in the coefficients.","element":"span"}],[{"text":"Note that the Lipschitz stability result (","element":"span"},{"href":"#id-21","text":"4.4","element":"a"},{"text":") in general does not hold for the original control problem (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") (or equivalently, ","element":"span"},{"href":"#id-51","style":{"height":17.6},"width":314.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-2.png","element":"img","alt":" ε = 0 in (3.2)).","inline":true,"padRight":true},{"text":"In fact, for any given ","element":"span"},{"href":"#id-78","referenceIndex":18,"style":{"height":17.6},"width":431.64,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-3.png","element":"img","alt":" β ∈ (0, 1), [18, Theo-","inline":true,"padRight":true},{"text":"rem 2] shows that the Nemytskij operator ","element":"span"},{"style":{"height":19.93},"width":630.2,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-4.png","element":"img","alt":" f ∈ (Cβ(O))K ↦ H0(f) ∈ Cβ(O","inline":true},{"text":") is not continuous, which implies that there exists (","element":"span"},{"style":{"height":22.38},"width":482.64,"height":55.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-5.png","element":"img","alt":"fm)m∈N∪{∞} ⊂ (Cβ(O))K ","inline":true,"padRight":true},{"text":"such that 
lim","element":"span"},{"style":{"height":18.48},"width":473.28,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-6.png","element":"img","alt":"m→∞ |fm − f∞|β = 0 and","inline":true},{"style":{"height":18.48},"width":768.8,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-7.png","element":"img","alt":"|H0(fm) − H0(f∞)|β ≥ 1 for all m ∈ N","inline":true},{"text":". Now for each ","element":"span"},{"style":{"height":17.6},"width":278.32,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-8.png","element":"img","alt":" m ∈ N ∪ {∞}","inline":true},{"text":", we consider the following simple HJB equation (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":"): ∆","element":"span"},{"style":{"height":17.6},"width":850.52,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-9.png","element":"img","alt":"um + H0(fm) = 0 in O and um = 0 on ∂O","inline":true},{"text":". Hence we have ","element":"span"},{"style":{"height":18.48},"width":1153.76,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-10.png","element":"img","alt":"|∆(um − u∞)|β = |H0(fm) − H0(f∞)|β ≥ 1 for all m ∈ N","inline":true},{"text":", which implies that the ","element":"span"},{"style":{"height":15.94},"width":198.28,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-11.png","element":"img","alt":" C2,β-norm","inline":true,"padRight":true},{"text":"of the value function (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") does not depend continuously on the ","element":"span"},{"style":{"height":15.94},"width":54.55,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-12.png","element":"img","alt":" Cβ","inline":true},{"text":"-perturbation of the model parameters. 
See Theorem ","element":"span"},{"href":"#id-25","text":"5.4 ","element":"a"},{"text":"for a precise quantification of ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-13.png","element":"img","alt":" ε","inline":true},{"text":"-dependence in (","element":"span"},{"href":"#id-21","text":"4.4","element":"a"},{"text":").","element":"span"}],[{"text":"The remaining part of this section is devoted to an important application of Theorem ","element":"span"},{"href":"#id-21","text":"4.2","element":"a"},{"text":",","element":"span"}],[{"text":"where we shall examine the performance of the control strategy ","element":"span"},{"style":{"height":15.55},"width":59.12,"height":38.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-14.png","element":"img","alt":" λuε","inline":true},{"text":", computed based on the relaxed control problem with the original coefficients (","element":"span"},{"href":"#id-27","style":{"height":17.6},"width":384.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-15.png","element":"img","alt":"b, σ, c, f, g) (see (3.6","inline":true},{"text":")), on a new relaxed","element":"span"}],[{"text":"control problem with perturbed coefficients satisfying (H.","element":"span"},{"href":"#id-74","text":"3","element":"a"},{"text":").","element":"span"}],[{"text":"We first observe that, if there exists a classical solution ","element":"span"},{"style":{"height":19.14},"width":360.92,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-16.png","element":"img","alt":" uε ∈ C(O) ∩ C2(O","inline":true},{"text":") to the following","element":"span"}],[{"id":"id-79","style":{"width":"99%"},"width":1843,"height":88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-17.png","element":"img"}],[{"text":"with 
","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":17.22},"width":163.52,"height":43.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-18.png","element":"img","alt":"L and ˆf","inline":true,"padRight":true},{"text":"defined as in (","element":"span"},{"href":"#id-75","text":"4.1","element":"a"},{"text":"), then by using Itˆo’s formula, one can easily show that the reward function ","element":"span"},{"style":{"height":12.74},"width":38.56,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-19.png","element":"img","alt":" vε","inline":true},{"text":", resulting by implementing the H¨older continous feedback control ","element":"span"},{"style":{"height":15.55},"width":195.68,"height":38.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-20.png","element":"img","alt":" λuε to the","inline":true,"padRight":true},{"text":"relaxed control problem with the coefficients (","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":21.01},"width":191.44,"height":52.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-21.png","element":"img","alt":"b, ˆσ, ˆc, ˆf, ˆg","inline":true},{"text":"), coincides with the function ","element":"span"},{"style":{"height":17.6},"width":136.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-22.png","element":"img","alt":" uε (see","inline":true,"padRight":true},{"text":"e.g. Theorems ","element":"span"},{"href":"#id-1","text":"2.2 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-2","text":"3.5","element":"a"},{"text":"). 
On the other hand, we have seen that the (optimal) value function ˆ","element":"span"},{"style":{"height":12.74},"width":38.56,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-23.png","element":"img","alt":"vε ","inline":true,"padRight":true},{"text":"of the perturbed relaxed control problem is the classical solution ˆ","element":"span"},{"href":"#id-75","style":{"height":17.6},"width":183.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-24.png","element":"img","alt":"uε to (4.1","inline":true},{"text":"). Hence it suffices to compare the classical solutions to (","element":"span"},{"href":"#id-79","text":"4.6","element":"a"},{"text":") and (","element":"span"},{"href":"#id-75","text":"4.1","element":"a"},{"text":").","element":"span"}],[{"text":"The following proposition shows that (","element":"span"},{"href":"#id-79","text":"4.6","element":"a"},{"text":") indeed admits a unique classical solution.","element":"span"}],[{"style":{"width":"99%"},"width":1844,"height":64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-25.png","element":"img"}],[{"style":{"height":15.49},"width":182.64,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-26.png","element":"img","alt":"O → ∆K","inline":true,"padRight":true},{"text":"be the function defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":")","element":"span"},{"text":", and ","element":"span"},{"style":{"height":17.6},"width":487.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-27.png","element":"img","alt":" β0 = β0(n, ν, M) ∈ (0, 1)","inline":true,"padRight":true},{"text":"be the constant in Proposition ","element":"span"},{"href":"#id-62","text":"3.3","element":"a"},{"text":". 
Then the Dirichlet problem ","element":"span"},{"text":"(","element":"span"},{"href":"#id-79","text":"4.6","element":"a"},{"text":") ","element":"span"},{"text":"admits a unique solution ","element":"span"},{"style":{"height":20.34},"width":385.96,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-28.png","element":"img","alt":" uε ∈ C2,min(β0,θ)(O).","inline":true}],[{"id":"id-80","style":{"width":"101%"},"width":1877,"height":792,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/15-29.png","element":"img"}],[{"text":"We are now ready to show that the difference between this suboptimal reward function ","element":"span"},{"style":{"height":12.8},"width":128.64,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-0.png","element":"img","alt":" vε and","inline":true,"padRight":true},{"text":"the (optimal) value function ˆ","element":"span"},{"style":{"height":12.74},"width":38.56,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-1.png","element":"img","alt":"vε ","inline":true,"padRight":true},{"text":"of the perturbed relaxed control problem depends Lipschitz continuously on the magnitude of perturbations in the coefficients.","element":"span"}],[{"id":"id-22","style":{"width":"101%"},"width":1867,"height":225,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-2.png","element":"img"}],[{"text":"Hessian, then there exists ","element":"span"},{"style":{"height":17.6},"width":466.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-3.png","element":"img","alt":" β0 = β0(n, ν, M) ∈ (0, 1)","inline":true},{"text":", such that for all ","element":"span"},{"style":{"height":17.6},"width":524.04,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-4.png","element":"img","alt":" β ∈ (0, min(β0, θ)], we have","inline":true,"padRight":true},{"text":"the estimate 
","element":"span"},{"style":{"height":18.48},"width":417.44,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-5.png","element":"img","alt":" |ˆuε − uε|2,β ≤ CEper,β","inline":true},{"text":", with the constant ","element":"span"},{"style":{"height":17.68},"width":100.64,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-6.png","element":"img","alt":" Eper,β","inline":true,"padRight":true},{"text":"defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-77","text":"4.3","element":"a"},{"text":")","element":"span"},{"text":", and a constant","element":"span"}],[{"style":{"width":"99%"},"width":1843,"height":99,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-7.png","element":"img"}],[{"text":"and ","element":"span"},{"text":"C ","element":"span"},{"text":"be a generic constant, which depends only on ","element":"span"},{"style":{"height":17.89},"width":840.44,"height":44.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-8.png","element":"img","alt":" ε, n, K, ν, Λ, Λ′, β, c0, Mg and O, and may","inline":true}],[{"style":{"width":"80%"},"width":1486,"height":262,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-9.png","element":"img"}],[{"text":"which, together with the fact that ˆ","element":"span"},{"style":{"height":16.4},"width":251.44,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-10.png","element":"img","alt":"uε = uε = ˆg","inline":true,"padRight":true},{"text":"and the classical maximum principle (see [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 3.7]), shows that ˆ","element":"span"},{"style":{"height":14.8},"width":266.88,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-11.png","element":"img","alt":"uε ≥ uε onO.","inline":true}],[{"text":"We now estimate 
ˆ","element":"span"},{"style":{"height":12.74},"width":124.96,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-12.png","element":"img","alt":"uε−uε","inline":true,"padRight":true},{"text":"by assuming the function ","element":"span"},{"href":"#id-50","style":{"height":19.54},"width":382.48,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-13.png","element":"img","alt":" H : RK → R in (H.2","inline":true},{"text":") has a locally Lipschitz continuous Hessian. By using the definition of the optimal control ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":15.55},"width":59.12,"height":38.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-14.png","element":"img","alt":"λˆuε","inline":true},{"text":", we have that","element":"span"}],[{"style":{"width":"39%"},"width":727,"height":54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-15.png","element":"img"}],[{"text":"By subtracting (","element":"span"},{"href":"#id-79","text":"4.6","element":"a"},{"text":") from the above equation, we have","element":"span"}],[{"style":{"width":"78%"},"width":1458,"height":132,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-16.png","element":"img"}],[{"text":"Note that, the ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate in Proposition ","element":"span"},{"href":"#id-62","text":"3.3 ","element":"a"},{"text":"shows that, under (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":"), (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-74","text":"3","element":"a"},{"text":"), there exists a constant ","element":"span"},{"style":{"height":17.6},"width":444,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-17.png","element":"img","alt":" β0 = β0(n, ν, M) ∈ 
(0,","inline":true,"padRight":true},{"text":"1), such that we have for all ","element":"span"},{"style":{"height":17.6},"width":355.2,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-18.png","element":"img","alt":" β ∈ (0, min(β0, θ)]","inline":true,"padRight":true},{"text":"the estimates ","element":"span"},{"style":{"height":18.48},"width":355.6,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-19.png","element":"img","alt":" |uε|2,β, |ˆuε|2,β ≤ C","inline":true},{"text":", which, along with the fact that ","element":"span"},{"style":{"height":19.54},"width":287.28,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-20.png","element":"img","alt":" ∇Hε ∈ C1(RK","inline":true},{"text":") and Lemma ","element":"span"},{"href":"#id-73","text":"4.1","element":"a"},{"text":", implies the ","element":"span"},{"style":{"height":22.29},"width":669.52,"height":55.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-21.png","element":"img","alt":" a priori bounds |ˆλˆuε|β, |λuε|β ≤ C","inline":true},{"text":". 
Hence, from any given ","element":"span"},{"style":{"height":17.6},"width":364.8,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-22.png","element":"img","alt":" β ∈ (0, min(β0, θ)],","inline":true,"padRight":true},{"text":"we can deduce from the Schauder theory in [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 6.6] and the maximum principle in [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 3.7] that","element":"span"}],[{"id":"id-81","style":{"width":"83%"},"width":1535,"height":132,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-23.png","element":"img"}],[{"text":"By using the additional assumption that ","element":"span"},{"text":"H ","element":"span"},{"text":"has a locally Lipschitz continuous Hessian, and the identity (","element":"span"},{"href":"#id-80","text":"4.7","element":"a"},{"text":"), we can deduce that ","element":"span"},{"style":{"height":19.54},"width":371.84,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-24.png","element":"img","alt":" ρ(∇Hε) : RK → R","inline":true,"padRight":true},{"text":"is continuously differentiable with a locally Lipschitz continuous gradient, from which, we can obtain from Lemma ","element":"span"},{"href":"#id-73","text":"4.1 ","element":"a"},{"text":"that for any ","element":"span"},{"style":{"height":17.6},"width":137.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-25.png","element":"img","alt":"α ∈ (0,","inline":true,"padRight":true},{"text":"1], the corresponding Nemytskij operator (","element":"span"},{"style":{"height":19.54},"width":674.25,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-26.png","element":"img","alt":"ερ)(∇Hε) : Cα(O, RK) → Cα(O, R","inline":true},{"text":") is locally Lipschitz continuous. 
Hence, we can obtain from (","element":"span"},{"href":"#id-81","text":"4.8","element":"a"},{"text":") and the definitions of ","element":"span"},{"style":{"height":21.41},"width":324.32,"height":53.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-27.png","element":"img","alt":" λuε and ˆλˆuε (see","inline":true,"padRight":true},{"text":"(","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") and (","element":"span"},{"href":"#id-82","text":"4.2","element":"a"},{"text":")) that","element":"span"}],[{"style":{"width":"83%"},"width":1534,"height":206,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/16-28.png","element":"img"}],[{"text":"from which, we can conclude from the ","element":"span"},{"style":{"height":18.48},"width":481.28,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-0.png","element":"img","alt":" a priori bound of |ˆuε|2,β","inline":true,"padRight":true},{"text":"and Theorem ","element":"span"},{"href":"#id-21","text":"4.2 ","element":"a"},{"text":"the desired estimate ","element":"span"},{"style":{"height":18.48},"width":419.52,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-1.png","element":"img","alt":" |ˆuε − uε|2,β ≤ CEper,β.","inline":true}]]},{"heading":"5 First-order sensitivity equations for relaxed control problems","paragraphs":[[{"text":"In this section, we proceed to derive a first-order Taylor expansion for the value function and the optimal control of the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") with perturbed coefficients, which subsequently leads us to a first-order approximation of the optimal strategy for the perturbed problem based on the pre-computed optimal control. 
The sensitivity equation further enables us to quantify the explicit dependence of the Lipschitz stability result (","element":"span"},{"href":"#id-21","text":"4.4","element":"a"},{"text":") on the relaxation parameter ","element":"span"},{"style":{"height":8.4},"width":32.16,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-2.png","element":"img","alt":" ε.","inline":true}],[{"text":"The following proposition establishes the Fréchet differentiability of the fully nonlinear HJB operator with inhomogeneous boundary conditions. For notational simplicity, for any given ","element":"span"},{"style":{"height":16.4},"width":70.76,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-3.png","element":"img","alt":" β ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"text":", ","element":"span"},{"text":"1], and bounded open subset ","element":"span"},{"style":{"height":16.33},"width":338.72,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-4.png","element":"img","alt":" O ⊂ Rn with C2,β ","inline":true,"padRight":true},{"text":"boundary, we shall introduce the Banach space Θ","element":"span"},{"style":{"height":11.6},"width":20,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-5.png","element":"img","alt":"β ","inline":true,"padRight":true},{"text":"for the coefficients:","element":"span"}],[{"id":"id-84","style":{"width":"83%"},"width":1546,"height":62,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-6.png","element":"img"}],[{"text":"equipped with the product norm ","element":"span"},{"style":{"height":17.6},"width":88.04,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-7.png","element":"img","alt":" |·|Θβ","inline":true},{"text":", and denote by 
","element":"span"},{"style":{"height":17.6},"width":470.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-8.png","element":"img","alt":" ϑ = ((ak, bk, ck, fk)k∈K, g","inline":true},{"text":") a generic element in Θ","element":"span"},{"style":{"height":11.6},"width":20,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-9.png","element":"img","alt":"β","inline":true},{"text":". We also denote by ","element":"span"},{"style":{"height":19.94},"width":160.28,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-10.png","element":"img","alt":" C2,β(∂O","inline":true},{"text":") the Banach space of ","element":"span"},{"style":{"height":15.94},"width":80.96,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-11.png","element":"img","alt":" C2,β ","inline":true,"padRight":true},{"text":"functions defined on ","element":"span"},{"style":{"height":13.2},"width":60.44,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-12.png","element":"img","alt":" ∂O","inline":true,"padRight":true},{"text":"(see Remark ","element":"span"},{"href":"#id-83","text":"2.1","element":"a"},{"text":"), and by ","element":"span"},{"style":{"height":19.94},"width":467.48,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-13.png","element":"img","alt":" τD : C2,β(O) → C2,β(∂O","inline":true},{"text":") the restriction operator on ","element":"span"},{"style":{"height":13.2},"width":60.44,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-14.png","element":"img","alt":" ∂O","inline":true},{"text":". 
Furthermore, for any given Banach spaces ","element":"span"},{"text":"X ","element":"span"},{"text":"and ","element":"span"},{"text":"Y ","element":"span"},{"text":", we denote by ","element":"span"},{"text":"B","element":"span"},{"text":"(","element":"span"},{"text":"X, Y ","element":"span"},{"text":") the Banach space of all continuous linear mappings from ","element":"span"},{"text":"X ","element":"span"},{"text":"into ","element":"span"},{"text":"Y ","element":"span"},{"text":", equipped with the operator norm.","element":"span"}],[{"id":"id-86","text":"Proposition 5.1. ","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") holds. Let ","element":"span"},{"style":{"height":17.6},"width":364.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-15.png","element":"img","alt":" ε > 0, β ∈ (0, 1], O","inline":true,"padRight":true},{"text":"be a bounded domain in ","element":"span"},{"style":{"height":12.8},"width":149.28,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-16.png","element":"img","alt":" Rn with","inline":true},{"style":{"height":19.54},"width":555.2,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-17.png","element":"img","alt":"C2,β boundary, Hε : RK → R","inline":true,"padRight":true},{"text":"be the function defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-58","text":"3.3","element":"a"},{"text":")","element":"span"},{"text":", ","element":"span"},{"text":"Θ","element":"span"},{"style":{"height":11.6},"width":20,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-18.png","element":"img","alt":"β ","inline":true,"padRight":true},{"text":"be the Banach space defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-84","text":"5.1","element":"a"},{"text":")","element":"span"},{"text":", and 
","element":"span"},{"style":{"height":19.94},"width":783.08,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-19.png","element":"img","alt":" F β : Θβ × C2,β(O) → Cβ(O) × C2,β(∂O)","inline":true,"padRight":true},{"text":"be the following HJB operator:","element":"span"}],[{"style":{"width":"95%"},"width":1762,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-20.png","element":"img"}],[{"text":"where for any given ","element":"span"},{"style":{"height":20.7},"width":1441,"height":51.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-21.png","element":"img","alt":" ϑ = ((ak, bk, ck, fk)k∈K, g) ∈ Θβ, f ϑ = (fk)k∈K ∈ Cβ(O)K, gϑ = g and","inline":true},{"style":{"height":21.49},"width":286.16,"height":53.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-22.png","element":"img","alt":"Lϑ = (Lϑk )k∈K","inline":true,"padRight":true},{"text":"is the family of elliptic operators satisfying ","element":"span"},{"style":{"height":22.61},"width":871.72,"height":56.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-23.png","element":"img","alt":" Lϑk φ = aijk ∂ijφ + bik∂iφ − ckφ for all k ∈ K,","inline":true},{"style":{"height":19.13},"width":215.08,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-24.png","element":"img","alt":"φ ∈ C2(O).","inline":true}],[{"style":{"height":19.93},"width":550.28,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-25.png","element":"img","alt":"C2,β(O), Cβ(O) × C2,β(∂O))","inline":true,"padRight":true},{"text":"satisfying for all ","element":"span"},{"text":"(","element":"span"},{"style":{"height":20.61},"width":932.36,"height":51.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-26.png","element":"img","alt":"ϑ, u) ∈ Θβ × C2,β(O), ˜ϑ ∈ Θβ and v ∈ 
C2,β(O)","inline":true,"padRight":true},{"text":"that","element":"span"}],[{"style":{"width":"73%"},"width":1354,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-27.png","element":"img"}],[{"text":"Proof. ","element":"span"},{"text":"We first write the HJB operator as ","element":"span"},{"style":{"height":19.94},"width":1019.24,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-28.png","element":"img","alt":" F β = (F1, F2), where F1 : Θβ × C2,β(O) → Cβ(O) is","inline":true,"padRight":true},{"text":"the composition of the Nemytskij operator ","element":"span"},{"style":{"height":19.94},"width":427.16,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-29.png","element":"img","alt":" Hε : Cβ(O)K → Cβ(O","inline":true},{"text":") and the mapping ","element":"span"},{"style":{"height":17.6},"width":221,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-30.png","element":"img","alt":" G : (ϑ, u) ∈","inline":true,"padRight":true},{"text":"Θ","element":"span"},{"style":{"height":20.51},"width":1814.32,"height":51.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-31.png","element":"img","alt":"β × C2,β(O) ↦ G[ϑ, u] := Lϑu + f ϑ ∈ Cβ(O)K, and F2 : (ϑ, u) ∈ Θβ × C2,β(O) ↦ F2[ϑ, u] :=","inline":true},{"style":{"height":19.94},"width":423.8,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-32.png","element":"img","alt":"τD(u − gϑ) ∈ C2,β(∂O","inline":true},{"text":") is the linear boundary operator.","element":"span"}],[{"text":"Since the function ","element":"span"},{"style":{"height":19.53},"width":314.16,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-33.png","element":"img","alt":" Hε is in C2(RK","inline":true},{"text":"), we can deduce from Lemma ","element":"span"},{"href":"#id-73","text":"4.1 ","element":"a"},{"text":"that the 
Nemytskij operator ","element":"span"},{"style":{"height":19.94},"width":432.92,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-34.png","element":"img","alt":" Hε : Cβ(O)K → Cβ(O","inline":true},{"text":") is well-defined and continuously differentiable with the Fréchet derivative (","element":"span"},{"style":{"height":19.94},"width":1219.68,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-35.png","element":"img","alt":"Hε)′[u] = (∇Hε)T (u) ∈ B(Cβ(O)K, Cβ(O)) for all u ∈ Cβ(O)K.","inline":true}],[{"text":"Moreover, since for any given (","element":"span"},{"style":{"height":19.93},"width":1171.2,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-36.png","element":"img","alt":"ϑ, u) ∈ Θβ × C2,β(O), G[·, u] : Θβ → Cβ(O)K and G[ϑ, ·] :","inline":true},{"style":{"height":19.94},"width":390.48,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-37.png","element":"img","alt":"C2,β(O) → Cβ(O)K ","inline":true,"padRight":true},{"text":"are affine mappings, one can easily compute the partial derivatives ","element":"span"},{"style":{"height":15.49},"width":109.44,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-38.png","element":"img","alt":" ∂uG :","inline":true,"padRight":true},{"text":"Θ","element":"span"},{"style":{"height":19.94},"width":1813.64,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-39.png","element":"img","alt":"β × C2,β(O) → B(C2,β(O), Cβ(O)K) and ∂ϑG : Θβ × C2,β(O) → B(Θβ, Cβ(O)K) of G as","inline":true,"padRight":true},{"text":"follows: (","element":"span"},{"style":{"height":23.36},"width":1662.24,"height":58.4,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/17-40.png","element":"img","alt":"∂uG)[ϑ, u](v) = Lϑv and (∂ϑG)[ϑ, u](˜ϑ) = L˜ϑu + f˜ϑ for all (ϑ, u) ∈ Θβ × 
C2,β(O),","inline":true,"padRight":true},{"text":"˜","element":"span"},{"style":{"height":19.94},"width":474.68,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-0.png","element":"img","alt":"ϑ ∈ Θβ and v ∈ C2,β(O","inline":true},{"text":"). Moreover, it is clear that ","element":"span"},{"style":{"height":16.08},"width":267.8,"height":40.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-1.png","element":"img","alt":" ∂uG and ∂ϑG","inline":true,"padRight":true},{"text":"are both continuous, which implies that ","element":"span"},{"style":{"height":19.94},"width":558,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-2.png","element":"img","alt":" G : Θβ × C2,β(O) → Cβ(O)K ","inline":true,"padRight":true},{"text":"is continuously differentiable with derivative","element":"span"}],[{"style":{"width":"69%"},"width":1277,"height":60,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-3.png","element":"img"}],[{"text":"for all (","element":"span"},{"href":"#id-85","referenceIndex":17,"style":{"height":20.61},"width":1061.2,"height":51.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-4.png","element":"img","alt":"ϑ, u) ∈ Θβ × C2,β(O), ˜ϑ ∈ Θβ and v ∈ C2,β(O) (see [17","inline":true},{"text":", Theorem 7.2-3]).","element":"span"}],[{"text":"Therefore, by using the chain rule (see [","element":"span"},{"href":"#id-85","referenceIndex":17,"text":"17","element":"a"},{"text":", Theorem 7.1-3]), we see the composite mapping ","element":"span"},{"style":{"height":19.94},"width":558.68,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-5.png","element":"img","alt":"F1 : Θβ × C2,β(O) → Cβ(O","inline":true},{"text":") is also continuously differentiable with the derivative 
","element":"span"},{"style":{"height":18},"width":199.6,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-6.png","element":"img","alt":" F ′1[ϑ, u] =","inline":true,"padRight":true},{"text":"(","element":"span"},{"style":{"height":17.6},"width":371.56,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-7.png","element":"img","alt":"Hε)′[G[ϑ, u]]G′[ϑ, u","inline":true},{"text":"] for all (","element":"span"},{"style":{"height":19.94},"width":385.88,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-8.png","element":"img","alt":"ϑ, u) ∈ Θβ × C2,β(O","inline":true},{"text":"). This, along with the fact that ","element":"span"},{"style":{"height":19.94},"width":278.32,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-9.png","element":"img","alt":" F2 : C2,β(O) ×","inline":true,"padRight":true},{"text":"Θ","element":"span"},{"style":{"height":19.94},"width":263,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-10.png","element":"img","alt":"β → C2,β(∂O","inline":true},{"text":") is a linear operator, enables us to conclude the desired differentiability of the operator ","element":"span"},{"style":{"height":19.94},"width":274.08,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-11.png","element":"img","alt":" F β = (F1, F2).","inline":true}],[{"text":"With the above proposition in hand, we are ready to derive the first-order sensitivity equation for the value function of the relaxed control problem with respect to the parameter perturbations.","element":"span"}],[{"id":"id-23","text":"Theorem 5.2. ","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") hold. 
Let ","element":"span"},{"style":{"height":22.19},"width":325.8,"height":55.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-12.png","element":"img","alt":" ε > 0, (Θβ)β∈(0,1]","inline":true,"padRight":true},{"text":"be the Banach spaces defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-84","text":"5.1","element":"a"},{"text":")","element":"span"},{"text":", ","element":"span"},{"style":{"height":20.51},"width":1013.48,"height":51.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-13.png","element":"img","alt":" ϑ0 = ((σkσTk /2, bk, ck, fk)k∈K, g), uε ∈ C(O) ∩ C2(O)","inline":true,"padRight":true},{"text":"be the solution to the Dirichlet problem ","element":"span"},{"text":"(","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") ","element":"span"},{"text":"(with the coefficients ","element":"span"},{"style":{"height":17.6},"width":373.64,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-14.png","element":"img","alt":" ϑ0), and β0 ∈ (0, 1)","inline":true,"padRight":true},{"text":"be the constant in Proposition ","element":"span"},{"href":"#id-62","text":"3.3","element":"a"},{"text":".","element":"span"}],[{"text":"Then it holds for each ","element":"span"},{"style":{"height":17.6},"width":348,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-15.png","element":"img","alt":" β ∈ (0, min(β0, θ)]","inline":true,"padRight":true},{"text":"that, there exists a neighborhood ","element":"span"},{"style":{"height":19.54},"width":334.08,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-16.png","element":"img","alt":" V of ϑ0 in Θβ, a","inline":true,"padRight":true},{"text":"neighborhood ","element":"span"},{"style":{"height":19.94},"width":378.44,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-17.png","element":"img","alt":" W of uε in 
C2,β(O)","inline":true},{"text":", and a mapping ","element":"span"},{"style":{"height":13.2},"width":210.16,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-18.png","element":"img","alt":" S : V → W","inline":true,"padRight":true},{"text":"satisfying the following properties:","element":"span"}],[{"style":{"width":"86%"},"width":1590,"height":253,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-19.png","element":"img"}],[{"text":"(2) ","element":"span"},{"style":{"height":13.2},"width":235.12,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-20.png","element":"img","alt":" S : V → W","inline":true,"padRight":true},{"text":"is continuously differentiable with ","element":"span"},{"style":{"height":17.6},"width":847,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-21.png","element":"img","alt":" S[ϑ0 + δϑ] = uε + S′[ϑ0]δϑ + o(|δϑ|Θβ) as","inline":true},{"style":{"height":17.6},"width":215.44,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-22.png","element":"img","alt":"|δϑ|Θβ → 0","inline":true},{"text":", and for each ","element":"span"},{"style":{"height":19.94},"width":672.2,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-23.png","element":"img","alt":" δϑ ∈ Θβ, δu = S′[ϑ0]δϑ ∈ C2,β(O)","inline":true,"padRight":true},{"text":"is the solution to the following Dirichlet problem:","element":"span"}],[{"id":"id-87","style":{"width":"95%"},"width":1761,"height":149,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-24.png","element":"img"}],[{"text":"Proof. ","element":"span"},{"text":"The desired result comes from a direct application of the implicit function theorem (see [","element":"span"},{"href":"#id-85","referenceIndex":17,"text":"17","element":"a"},{"text":", Theorem 7.13-1]). 
Theorem ","element":"span"},{"href":"#id-60","text":"3.4 ","element":"a"},{"text":"shows that the Dirichlet problem (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") with the coefficients ","element":"span"},{"style":{"height":14.69},"width":47.24,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-25.png","element":"img","alt":"ϑ0","inline":true,"padRight":true},{"text":"admits a solution ","element":"span"},{"style":{"height":19.94},"width":782.4,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-26.png","element":"img","alt":" uε ∈ C2,β(O) for each β ∈ (0, min(β0, θ)].","inline":true}],[{"text":"Let ","element":"span"},{"style":{"height":17.6},"width":311.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-27.png","element":"img","alt":" β ∈ (0, min(β0, θ","inline":true},{"text":")] be a fixed constant. We shall consider the mapping ","element":"span"},{"style":{"height":19.93},"width":394.88,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-28.png","element":"img","alt":" F β : Θβ×C2,β(O) →","inline":true},{"style":{"height":19.94},"width":340.76,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-29.png","element":"img","alt":"Cβ(O) × C2,β(∂O","inline":true},{"text":") defined as follows:","element":"span"}],[{"style":{"width":"95%"},"width":1760,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-30.png","element":"img"}],[{"text":"Due to the fact that ","element":"span"},{"style":{"height":19.94},"width":231.32,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-31.png","element":"img","alt":" uε ∈ C2,β(O","inline":true},{"text":") satisfies (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") with the coefficients 
","element":"span"},{"style":{"height":20.7},"width":574.96,"height":51.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-32.png","element":"img","alt":" ϑ0, we have Hε(Lϑ0uε+f ϑ0) =","inline":true,"padRight":true},{"text":"0 in ","element":"span"},{"style":{"height":20.71},"width":600.92,"height":51.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-33.png","element":"img","alt":" O and Hε(Lϑ0uε+f ϑ0) ∈ Cβ(O","inline":true},{"text":"), which subsequently implies that ","element":"span"},{"style":{"height":20.7},"width":506.4,"height":51.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-34.png","element":"img","alt":" Hε(Lϑ0uε+f ϑ0) = 0 onO.","inline":true,"padRight":true},{"text":"The boundary condition of (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") implies that ","element":"span"},{"style":{"height":19.94},"width":991.68,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-35.png","element":"img","alt":" τD(uε−gϑ0) = 0 in C2,β(∂O). 
Hence F β[ϑ0, uε] = 0.","inline":true}],[{"text":"Proposition ","element":"span"},{"href":"#id-86","text":"5.1 ","element":"a"},{"text":"shows that ","element":"span"},{"style":{"height":15.54},"width":54.08,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-36.png","element":"img","alt":" F β ","inline":true,"padRight":true},{"text":"is continuously differentiable on Θ","element":"span"},{"style":{"height":19.94},"width":215,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-37.png","element":"img","alt":"β × C2,β(O","inline":true},{"text":"), and for each (","element":"span"},{"text":"˜","element":"span"},{"style":{"height":19.93},"width":417.6,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-38.png","element":"img","alt":"ϑ, v) ∈ Θβ × C2,β(O),","inline":true}],[{"style":{"width":"98%"},"width":1815,"height":136,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/18-39.png","element":"img"}],[{"text":"where we have used the definition of ","element":"span"},{"href":"#id-27","style":{"height":19.93},"width":496.72,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-0.png","element":"img","alt":" λuε ∈ Cβ(O, ∆K) (see (3.6","inline":true},{"text":")). The classical maximum principle (see e.g. [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 3.7]) implies that the map ","element":"span"},{"style":{"height":19.94},"width":899.24,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-1.png","element":"img","alt":" ∂uF β[ϑ0, uε](·) : C2,β(O) → Cβ(O) × C2,β(∂O)","inline":true,"padRight":true},{"text":"is an injection. We now show it is also a surjection. 
Let ( ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":19.93},"width":493.4,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-2.png","element":"img","alt":"f, ˆg) ∈ Cβ(O) × C2,β(∂O","inline":true},{"text":") be given. Then the assumption that ","element":"span"},{"style":{"height":16.34},"width":205.76,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-3.png","element":"img","alt":" ∂O ∈ C2,β ","inline":true,"padRight":true},{"text":"enables us to apply [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Lemma 6.38] and extend ˆ","element":"span"},{"text":"g ","element":"span"},{"text":"to a function in ","element":"span"},{"style":{"height":19.94},"width":135.32,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-4.png","element":"img","alt":" C2,β(O","inline":true},{"text":"), which is still denoted by ˆ","element":"span"},{"text":"g","element":"span"},{"text":". 
The fact that ","element":"span"},{"style":{"height":19.94},"width":318.96,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-5.png","element":"img","alt":" λuε ∈ Cβ(O, ∆K","inline":true},{"text":") (see Theorem ","element":"span"},{"href":"#id-60","text":"3.4","element":"a"},{"text":") and the elliptic regularity theory (see [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 6.14]) ensure that the Dirichlet problem ","element":"span"},{"style":{"height":21.41},"width":455.92,"height":53.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-6.png","element":"img","alt":"∂uF β[ϑ0, uε](v) = ( ˆf, ˆg","inline":true},{"text":") admits a unique solution ","element":"span"},{"style":{"height":19.94},"width":259.68,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-7.png","element":"img","alt":" v ∈ C2,β(O).","inline":true,"padRight":true},{"text":"Hence we see ","element":"span"},{"style":{"height":19.94},"width":270.24,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-8.png","element":"img","alt":" ∂uF β[ϑ0, uε] :","inline":true},{"style":{"height":19.93},"width":561.56,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-9.png","element":"img","alt":"C2,β(O) → Cβ(O) × C2,β(∂O","inline":true},{"text":") is a bijection.","element":"span"}],[{"text":"Therefore, the implicit function theorem (see [","element":"span"},{"href":"#id-85","referenceIndex":17,"text":"17","element":"a"},{"text":", Theorem 7.13-1]) ensures the existence of ","element":"span"},{"style":{"height":19.13},"width":249.52,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-10.png","element":"img","alt":"S ∈ C1(V, W","inline":true},{"text":") with derivative 
","element":"span"},{"style":{"height":19.93},"width":1265.12,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-11.png","element":"img","alt":" S′[ϑ0] = −(∂uF β[ϑ0, uε])−1∂ϑF β[ϑ0, uε] ∈ B(Θβ, C2,β(O)). Hence","inline":true,"padRight":true},{"text":"we have ","element":"span"},{"style":{"height":19.94},"width":1684.32,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-12.png","element":"img","alt":" S[ϑ0 + δϑ] = uε + S′[ϑ0]δϑ + o(|δϑ|Θβ) as |δϑ|Θβ → 0. Let δϑ ∈ Θβ and δu = S′[ϑ0]δϑ,","inline":true,"padRight":true},{"text":"the characterization of partial derivatives of ","element":"span"},{"style":{"height":15.54},"width":54.08,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-13.png","element":"img","alt":" F β ","inline":true,"padRight":true},{"text":"enables us to conclude that ","element":"span"},{"style":{"height":12.8},"width":45.64,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-14.png","element":"img","alt":" δu","inline":true,"padRight":true},{"text":"satisfies (","element":"span"},{"href":"#id-87","text":"5.2","element":"a"},{"text":").","element":"span"}],[{"id":"id-24","text":"Remark ","element":"span"},{"text":"5.1","element":"span"},{"text":". ","element":"span"},{"text":"We can further obtain a first-order expansion of the optimal control ","element":"span"},{"style":{"height":15.74},"width":59.12,"height":39.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-15.png","element":"img","alt":" λuε ","inline":true,"padRight":true},{"text":"in terms of the perturbations of the coefficients. 
If ","element":"span"},{"style":{"height":10.4},"width":66.64,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-16.png","element":"img","alt":" ε >","inline":true,"padRight":true},{"text":"0 and the function ","element":"span"},{"href":"#id-50","style":{"height":19.54},"width":642.04,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-17.png","element":"img","alt":" H in (H.2) is in C3(RK) (c.f. Hen","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":14.88},"width":99.88,"height":37.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-18.png","element":"img","alt":" Hchks","inline":true,"padRight":true},{"text":"in Section ","element":"span"},{"text":"3","element":"span"},{"text":"), then Lemma ","element":"span"},{"href":"#id-73","text":"4.1 ","element":"a"},{"text":"shows that ","element":"span"},{"style":{"height":19.53},"width":829.92,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-19.png","element":"img","alt":" ∇Hε : Cα(O, RK) → Cα(O, RK), α ∈ (0, 1],","inline":true,"padRight":true},{"text":"is continuously differentiable with derivative (","element":"span"},{"style":{"height":19.54},"width":970.56,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-20.png","element":"img","alt":"∇Hε)′[u]h = (∇2Hε)(u)h for all u, h ∈ Cα(O, RK),","inline":true,"padRight":true},{"text":"where ","element":"span"},{"style":{"height":17.42},"width":107.68,"height":43.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-21.png","element":"img","alt":" ∇2Hε","inline":true,"padRight":true},{"text":"is the Hessian of ","element":"span"},{"style":{"height":14.69},"width":52.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-22.png","element":"img","alt":" Hε","inline":true},{"text":". 
Hence, by using the chain rule and Theorem ","element":"span"},{"href":"#id-23","text":"5.2","element":"a"},{"text":", we have for all ","element":"span"},{"style":{"height":17.6},"width":436.52,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-23.png","element":"img","alt":" β ∈ (0, min(β0, θ)] that","inline":true}],[{"style":{"width":"79%"},"width":1468,"height":58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-24.png","element":"img"}],[{"text":"as ","element":"span"},{"style":{"height":20.34},"width":558.12,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-25.png","element":"img","alt":" |δϑ|Θβ → 0, where λS[ϑ0+δϑ] ","inline":true,"padRight":true},{"text":"is the optimal feedback control of the relaxed control problem with the perturbed coefficients ","element":"span"},{"style":{"height":15.6},"width":311.08,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-26.png","element":"img","alt":" ϑ0 + δϑ, and δu","inline":true,"padRight":true},{"text":"is the classical solution to (","element":"span"},{"href":"#id-87","text":"5.2","element":"a"},{"text":").","element":"span"}],[{"text":"With the sensitivity equation (","element":"span"},{"href":"#id-87","text":"5.2","element":"a"},{"text":") in hand, we now estimate the precise dependence of ","element":"span"},{"style":{"height":12.8},"width":105.6,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-27.png","element":"img","alt":" δu on","inline":true,"padRight":true},{"text":"the relaxation parameter ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-28.png","element":"img","alt":" ε","inline":true},{"text":", which strengthens the Lipschitz stability result (","element":"span"},{"href":"#id-21","text":"4.4","element":"a"},{"text":") by quantifying the explicit 
","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-29.png","element":"img","alt":" ε","inline":true},{"text":"-dependence of the (local) Lipschitz constant. Note that Remark ","element":"span"},{"href":"#id-88","text":"4.2 ","element":"a"},{"text":"shows that the value function (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") (in the ","element":"span"},{"style":{"height":15.94},"width":80.96,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-30.png","element":"img","alt":" C2,β","inline":true},{"text":"-norm) does not depend continuously on the ","element":"span"},{"style":{"height":15.94},"width":54.56,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-31.png","element":"img","alt":" Cβ","inline":true},{"text":"-perturbation of the parameters, which suggests that for a fixed ","element":"span"},{"style":{"height":20.81},"width":609.16,"height":52.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-32.png","element":"img","alt":" δϑ ∈ Θβ, the | · |2,β-norm of δu","inline":true,"padRight":true},{"text":"will blow up as the parameter ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-33.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"tends to 0.","element":"span"}],[{"text":"Since the H¨older norm of the function ","element":"span"},{"href":"#id-87","style":{"height":19.55},"width":201.52,"height":48.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-34.png","element":"img","alt":" λuε in (5.2","inline":true},{"text":") tends to infinity as ","element":"span"},{"style":{"height":9.6},"width":76.64,"height":24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-35.png","element":"img","alt":" ε 
→","inline":true,"padRight":true},{"text":"0, we first present a precise ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate for the classical solutions to linear elliptic equations with ","element":"span"},{"style":{"height":16},"width":231.56,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-36.png","element":"img","alt":" ε-dependent","inline":true,"padRight":true},{"text":"coefficients. The proof will be postponed to Appendix ","element":"span"},{"text":"A","element":"span"},{"text":", where we first reduce the equation to a constant coefficient equation involving only second-order terms, and then apply the classical Schauder estimate.","element":"span"}],[{"id":"id-89","style":{"height":17.6},"width":1197.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-37.png","element":"img","alt":"Proposition 5.3. Let α ∈ [0, 1], β ∈ (0, 1), ν, Λ > 0, and O","inline":true,"padRight":true},{"text":"be a bounded domain in ","element":"span"},{"style":{"height":12.8},"width":152.64,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-38.png","element":"img","alt":" Rn with","inline":true},{"style":{"height":15.94},"width":80.96,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-39.png","element":"img","alt":"C2,β ","inline":true,"padRight":true},{"text":"boundary. For every ","element":"span"},{"style":{"height":17.6},"width":625.44,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-40.png","element":"img","alt":" ε ∈ (0, 1], let aε :O → Rn×n, bε :","inline":true}],[{"text":"functions satisfying ","element":"span"},{"style":{"height":15.09},"width":295.64,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-41.png","element":"img","alt":" aε ≥ νIn onO","inline":true},{"text":". 
Suppose that ","element":"span"},{"text":"[","element":"span"},{"style":{"height":22.13},"width":844.72,"height":55.32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-42.png","element":"img","alt":"aijε ]0, [biε]0, [cε]0 ≤ Λ and [aijε ]β, [biε]β, [cε]β ≤","inline":true,"padRight":true},{"text":"Λ","element":"span"},{"style":{"height":17.6},"width":741.68,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-43.png","element":"img","alt":"ε−α for all ε ∈ (0, 1] and i, j = 1, . . . , n","inline":true},{"text":". Then for every ","element":"span"},{"style":{"height":19.94},"width":743.08,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-44.png","element":"img","alt":" ε ∈ (0, 1], f ∈ Cβ(O) and g ∈ C2,β(O),","inline":true,"padRight":true},{"text":"the Dirichlet problem","element":"span"}],[{"style":{"width":"99%"},"width":1844,"height":326,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-45.png","element":"img"}],[{"text":"which applies to relaxed control problems with reward functions generated by ","element":"span"},{"style":{"height":15.6},"width":296.16,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-46.png","element":"img","alt":" Hen, Hchks and","inline":true},{"style":{"height":17.09},"width":118.08,"height":42.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/19-47.png","element":"img","alt":"Hzang.","inline":true}],[{"id":"id-25","text":"Theorem 5.4. 
","element":"span"},{"text":"Assume the setting of Theorem ","element":"span"},{"href":"#id-23","text":"5.2 ","element":"a"},{"text":"and in addition that the function ","element":"span"},{"style":{"height":15.53},"width":240.32,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-0.png","element":"img","alt":" H : RK → R","inline":true,"padRight":true},{"text":"in (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") has a Lipschitz continuous gradient. Let ","element":"span"},{"style":{"height":17.6},"width":200.36,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-1.png","element":"img","alt":" β0 ∈ (0, 1)","inline":true,"padRight":true},{"text":"be the constant in Proposition ","element":"span"},{"href":"#id-62","text":"3.3 ","element":"a"},{"text":"and ","element":"span"},{"text":"¯","element":"span"},{"style":{"height":17.6},"width":298.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-2.png","element":"img","alt":"β0 = min(β0, θ)","inline":true},{"text":". 
Then it holds for all ","element":"span"},{"style":{"height":19.93},"width":667.04,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-3.png","element":"img","alt":" ε ∈ (0, 1], β ∈ (0, ¯β0] and δϑ ∈ Θβ ","inline":true,"padRight":true},{"text":"that the classical solution ","element":"span"},{"style":{"height":12.8},"width":45.64,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-4.png","element":"img","alt":" δu","inline":true,"padRight":true},{"text":"to the Dirichlet problem ","element":"span"},{"text":"(","element":"span"},{"href":"#id-87","text":"5.2","element":"a"},{"text":") ","element":"span"},{"text":"satisfies the estimate ","element":"span"},{"style":{"height":22.67},"width":659.88,"height":56.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-5.png","element":"img","alt":" |δu|2,β ≤ Cε−(β+2)/¯β0|δϑ|Θβ, where","inline":true,"padRight":true},{"text":"C ","element":"span"},{"text":"is a constant independent of ","element":"span"},{"style":{"height":12.8},"width":184.84,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-6.png","element":"img","alt":" ε and δϑ.","inline":true}],[{"text":"Proof. ","element":"span"},{"text":"Throughout this proof, let ","element":"span"},{"text":"C ","element":"span"},{"text":"be a generic constant depending possibly on ","element":"span"},{"style":{"height":16.4},"width":272.36,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-7.png","element":"img","alt":" ϑ0 and β, but","inline":true,"padRight":true},{"text":"independent of ","element":"span"},{"style":{"height":12.8},"width":168.24,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-8.png","element":"img","alt":" ε and δϑ","inline":true},{"text":". 
Proposition ","element":"span"},{"href":"#id-62","text":"3.3 ","element":"a"},{"text":"shows that ","element":"span"},{"style":{"height":20.4},"width":485.28,"height":51,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-9.png","element":"img","alt":" |uε|2, ¯β0 ≤ C for all ε ∈ (0,","inline":true,"padRight":true},{"text":"1], which together with (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":"), the fact that ","element":"span"},{"href":"#id-58","style":{"height":19.54},"width":942.64,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-10.png","element":"img","alt":" ∇Hε(x) = ∇H(ε−1x) for all x ∈ RK (see (3.3","inline":true},{"text":")) and the Lipschitz continuity of ","element":"span"},{"style":{"height":12.4},"width":75.48,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-11.png","element":"img","alt":" ∇H","inline":true,"padRight":true},{"text":"implies that ","element":"span"},{"style":{"height":22.16},"width":852.96,"height":55.4,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-12.png","element":"img","alt":" |λuε|0 ≤ C and |λuε|¯β0 ≤ Cε−1 for all ε ∈ (0,","inline":true,"padRight":true},{"text":"1]. Consequently, we have for all ","element":"span"},{"style":{"height":29.97},"width":1410.72,"height":74.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-13.png","element":"img","alt":" β ∈ (0, ¯β0] and ε ∈ (0, 1] that |λuε|β ≤ C|λuε|β/¯β0¯β0 |λuε|(¯β0−β)/¯β00 ≤ Cε−β/¯β0.","inline":true}],[{"text":"Now let us fix ","element":"span"},{"style":{"height":19.94},"width":859.2,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-14.png","element":"img","alt":" β ∈ (0, ¯β0] and δϑ ∈ Θβ. 
Since λuε ∈ ∆K on","inline":true}],[{"text":"O","element":"span"},{"text":", we can apply Proposition ","element":"span"},{"href":"#id-89","text":"5.3 ","element":"a"},{"text":"(with ","element":"span"},{"href":"#id-87","style":{"height":19.41},"width":335.92,"height":48.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-15.png","element":"img","alt":" α = β/¯β0) to (5.2","inline":true},{"text":") and conclude the desired estimate from the following inequality:","element":"span"}],[{"style":{"width":"97%"},"width":1805,"height":139,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-16.png","element":"img"}]]},{"heading":"6 Convergence analysis for vanishing relaxation parameter","paragraphs":[[{"text":"In this section, we analyze the convergence of the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") to the original control problem (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") as the relaxation parameter tends to zero. ","element":"span"},{"text":"In particular, with the help of the HJB equations (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") and (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":"), we shall establish first-order monotone convergence of the value functions, and also uniform convergence of the feedback controls (in regions where a strict complementary condition is satisfied).","element":"span"}],[{"text":"We first study the convergence of the value functions of the relaxed control problems. 
The following theorem shows that, as the relaxation parameter ","element":"span"},{"style":{"height":8.4},"width":21,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-17.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"tends to zero, the value function (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") converges monotonically to the value function (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") in ","element":"span"},{"style":{"height":19.94},"width":134.84,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-18.png","element":"img","alt":" C2,β(O","inline":true},{"text":") with first order.","element":"span"}],[{"id":"id-31","text":"Theorem 6.1. ","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") hold. Let ","element":"span"},{"style":{"height":17.6},"width":209,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-19.png","element":"img","alt":" β0 ∈ (0, 1)","inline":true,"padRight":true},{"text":"be the constant in Proposition ","element":"span"},{"href":"#id-62","text":"3.3","element":"a"},{"text":", and ","element":"span"},{"style":{"height":19.14},"width":874.76,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-20.png","element":"img","alt":" u ∈ C(O) ∩ C2(O) (resp. uε ∈ C(O) ∩ C2(O)","inline":true},{"text":") be the solution to ","element":"span"},{"text":"(","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") ","element":"span"},{"text":"(resp. 
","element":"span"},{"text":"(","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":") ","element":"span"},{"text":"with parameter ","element":"span"},{"style":{"height":12.4},"width":116.56,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-21.png","element":"img","alt":" ε > 0","inline":true},{"text":"). Then we have ","element":"span"},{"style":{"height":16.4},"width":602.32,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-22.png","element":"img","alt":" uε1 ≥ uε2 for all ε1 ≥ ε2 > 0","inline":true},{"text":". Moreover, it holds for any ","element":"span"},{"style":{"height":17.6},"width":587.24,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-23.png","element":"img","alt":"β ∈ (0, min(β0, θ)) that (uε)ε>0","inline":true,"padRight":true},{"text":"converges to ","element":"span"},{"style":{"height":19.94},"width":429.04,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-24.png","element":"img","alt":" u in C2,β(O) as ε → 0","inline":true},{"text":", and satisfies the estimate:","element":"span"}],[{"id":"id-90","style":{"width":"82%"},"width":1528,"height":106,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-25.png","element":"img"}],[{"style":{"height":18.29},"width":354.44,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-26.png","element":"img","alt":"Proof. Let (Fε)ε≥0","inline":true,"padRight":true},{"text":"be defined as in (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") and (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":"), and ","element":"span"},{"style":{"height":13.89},"width":186.64,"height":34.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-27.png","element":"img","alt":" ε1 ≥ ε2 >","inline":true,"padRight":true},{"text":"0 be given constants. 
Lemma ","element":"span"},{"href":"#id-56","text":"3.2 ","element":"a"},{"text":"shows that ","element":"span"},{"style":{"height":21.12},"width":1313.52,"height":52.8,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-28.png","element":"img","alt":" ρ ≤ 0 on ∆K, and Hε(x) = maxy∈∆K�yT x − ερ(y)�for all x ∈ RK","inline":true},{"text":". Hence, we have ","element":"span"},{"style":{"height":16.81},"width":295.68,"height":42.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-29.png","element":"img","alt":" Hε1 ≥ Hε2, and","inline":true}],[{"style":{"width":"63%"},"width":1180,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-30.png","element":"img"}],[{"text":"where we write ","element":"span"},{"style":{"height":23.57},"width":1543.04,"height":58.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-31.png","element":"img","alt":" η :=� 10 (∇Hε2)(Luε2 + f + sL(uε1 − uε2)) ds. Since η(x) ∈ ∆K for all x ∈ O, we","inline":true,"padRight":true},{"text":"can deduce from the classical maximum principle (see e.g. 
[","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 3.7]) that inf","element":"span"},{"style":{"height":19.57},"width":192.4,"height":48.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-32.png","element":"img","alt":"x∈O(uε1 −","inline":true},{"style":{"height":17.6},"width":722.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-33.png","element":"img","alt":"uε2)(x) ≥ infx∈∂O(uε1 − uε2)−(x) = 0.","inline":true}],[{"text":"Similarly, for any given ","element":"span"},{"style":{"height":10.4},"width":66.16,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-34.png","element":"img","alt":" ε >","inline":true,"padRight":true},{"text":"0, we can obtain from Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":"(","element":"span"},{"href":"#id-57","text":"2","element":"a"},{"text":") that","element":"span"}],[{"style":{"width":"65%"},"width":1219,"height":124,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/20-35.png","element":"img"}],[{"text":"where we have ˜","element":"span"},{"href":"#id-46","style":{"height":23.57},"width":1553.76,"height":58.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-0.png","element":"img","alt":"η :=� 10 (∇Hε)(Lu + f + sL(uε − u)) ds. By using ak = σk(σk)T /2, (2.4) in (H.1),","inline":true,"padRight":true},{"text":"and the fact that ˜","element":"span"},{"style":{"height":16.8},"width":254.36,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-1.png","element":"img","alt":"η ∈ ∆K onO","inline":true},{"text":", we deduce that ","element":"span"},{"style":{"height":21.86},"width":927.68,"height":54.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-2.png","element":"img","alt":"�Kk=1 ˜ηkck ≥ 0 and �Kk=1 ˜ηkak ≥ (ν/2)In. 
Hence","inline":true,"padRight":true},{"text":"the classical maximum principle (see e.g. [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 3.7]) and the fact that ","element":"span"},{"style":{"height":16.8},"width":351.68,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-3.png","element":"img","alt":" uε = u on ∂O give","inline":true,"padRight":true},{"text":"us the estimate (","element":"span"},{"href":"#id-90","text":"6.1","element":"a"},{"text":").","element":"span"}],[{"style":{"width":"96%"},"width":1779,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-4.png","element":"img"}],[{"text":"any given ","element":"span"},{"style":{"height":17.6},"width":322.92,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-5.png","element":"img","alt":" β ∈ (0, min(β0, θ","inline":true},{"text":")), there exists a subsequence (","element":"span"},{"style":{"height":17.6},"width":711.36,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-6.png","element":"img","alt":"uεm)m∈N with limm→∞ εm = 0, such","inline":true,"padRight":true},{"text":"that (","element":"span"},{"style":{"height":17.6},"width":163.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-7.png","element":"img","alt":"uεm)m∈N","inline":true,"padRight":true},{"text":"converges in ","element":"span"},{"style":{"height":19.94},"width":134.84,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-8.png","element":"img","alt":" C2,β(O","inline":true},{"text":") to some function ¯","element":"span"},{"style":{"height":20.34},"width":462.2,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-9.png","element":"img","alt":"u and ¯u ∈ C2,min(β0,θ)(O","inline":true},{"text":"). 
Since the entire sequence (","element":"span"},{"style":{"height":17.6},"width":118.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-10.png","element":"img","alt":"uε)ε>0","inline":true,"padRight":true},{"text":"converges monotonically to ","element":"span"},{"style":{"height":17.6},"width":600.2,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-11.png","element":"img","alt":" u, we have u = ¯u and (uε)ε>0","inline":true,"padRight":true},{"text":"converges to ","element":"span"},{"text":"u ","element":"span"},{"text":"in ","element":"span"},{"style":{"height":19.94},"width":653.76,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-12.png","element":"img","alt":"C2,β(O) for all β ∈ (0, min(β0, θ)).","inline":true}],[{"text":"Remark ","element":"span"},{"text":"6.1","element":"span"},{"text":". ","element":"span"},{"text":"The estimate (","element":"span"},{"href":"#id-90","text":"6.1","element":"a"},{"text":") depends on ","element":"span"},{"style":{"height":19.92},"width":308.6,"height":49.8,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-13.png","element":"img","alt":" ε, c0, ν, bik and O","inline":true,"padRight":true},{"text":"in a rather intuitive way. Note that, ","element":"span"},{"text":"compared with the original control problem (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":"), the relaxed control problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") introduces additional randomness for exploration to achieve more robust decisions, especially at regions where two or more strategies lead to similar performances based on the given model (the points at which arg max in (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") is not a singleton). 
The relation (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") between feedback controls and the derivatives of value functions further suggests that such regions usually correspond to a sign change of derivatives of value functions.","element":"span"}],[{"text":"The exploration surplus in the value functions clearly increases as ","element":"span"},{"style":{"height":10.69},"width":120.68,"height":26.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-14.png","element":"img","alt":" ε or c0","inline":true,"padRight":true},{"text":"increase (see Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":"(","element":"span"},{"href":"#id-56","text":"1","element":"a"},{"text":") and Figure ","element":"span"},{"href":"#id-71","text":"1","element":"a"},{"text":"), since the same level of exploration will bring more rewards. It will also increase with diam(","element":"span"},{"text":"O","element":"span"},{"text":") as the dynamics will stay in ","element":"span"},{"text":"O ","element":"span"},{"text":"longer. Furthermore, due to the lack of regularization from the Laplacian operator, a small volatility or a large drift-to-volatility ratio of the underlying model usually leads to a more rapidly changing value function, which increases the occurrence of the uncertain regions and makes the relaxation approach more beneficial.","element":"span"}],[{"text":"Now we turn to investigate the convergence of the feedback relaxed control (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":"). 
To distinguish different convergence behaviours related to reward functions generated by ","element":"span"},{"style":{"height":17.49},"width":438.44,"height":43.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-15.png","element":"img","alt":" Hen and Hzang, we first","inline":true,"padRight":true},{"text":"introduce the following concept for functions which only modify the pointwise maximum function locally near the kinks.","element":"span"}],[{"id":"id-91","style":{"height":13.6},"width":524,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-16.png","element":"img","alt":"Definition 6.1. Let n ∈ N","inline":true},{"text":", we say a function ","element":"span"},{"style":{"height":16.4},"width":218.24,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-17.png","element":"img","alt":" φ : Rn → R","inline":true,"padRight":true},{"text":"satisfies (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-18.png","element":"img","alt":"Sloc","inline":true},{"text":") with constant ","element":"span"},{"style":{"height":15.6},"width":161.92,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-19.png","element":"img","alt":" ϑ ≥ 0, if","inline":true,"padRight":true},{"text":"it holds for all ","element":"span"},{"style":{"height":18.29},"width":1273.44,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-20.png","element":"img","alt":" k = 1, . . . 
, n and x ∈ Rn with xk ≥ xj + ϑ, ∀j ̸= k, that φ(x) = xk.","inline":true}],[{"text":"It is clear that the pointwise maximum function on ","element":"span"},{"style":{"height":17.6},"width":487.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-21.png","element":"img","alt":" Rn satisfies (Sloc) with ϑ","inline":true,"padRight":true},{"text":"= 0, and the two-dimensional function ","element":"span"},{"style":{"height":17.09},"width":103.88,"height":42.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-22.png","element":"img","alt":" Hzang","inline":true,"padRight":true},{"text":"defined in (","element":"span"},{"href":"#id-69","text":"3.8","element":"a"},{"text":") satisfies (","element":"span"},{"style":{"height":17.6},"width":398.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-23.png","element":"img","alt":"Sloc) with ϑ = 1/2.","inline":true,"padRight":true},{"text":"The following lemma shows that property (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-24.png","element":"img","alt":"Sloc","inline":true},{"text":") is preserved under function composition and scaling, which consequently implies that the recursively constructed ","element":"span"},{"text":"K","element":"span"},{"text":"-dimensional ","element":"span"},{"style":{"height":17.09},"width":103.88,"height":42.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-25.png","element":"img","alt":" Hzang","inline":true,"padRight":true},{"text":"and its corresponding scaled function (","element":"span"},{"href":"#id-58","style":{"height":18.29},"width":307.6,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-26.png","element":"img","alt":"Hzang)ε (cf. 
(3.3","inline":true},{"text":")) satisfy (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-27.png","element":"img","alt":"Sloc","inline":true},{"text":"). The proof follows directly from Definition ","element":"span"},{"href":"#id-91","text":"6.1","element":"a"},{"text":", and is included in Appendix ","element":"span"},{"text":"A","element":"span"},{"text":".","element":"span"}],[{"id":"id-109","text":"Lemma 6.2. ","element":"span"},{"text":"(1) For each ","element":"span"},{"style":{"height":23.81},"width":734.48,"height":59.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-28.png","element":"img","alt":" n ∈ N, let H(n)0 : Rn → R be the n","inline":true},{"text":"-dimensional pointwise maxi-","element":"span"}],[{"style":{"width":"95%"},"width":1771,"height":376,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-29.png","element":"img"}],[{"id":"id-94","text":"(2) If ","element":"span"},{"style":{"height":16.4},"width":230.24,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-30.png","element":"img","alt":" φ : Rn → R","inline":true,"padRight":true},{"text":"satisfies (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-31.png","element":"img","alt":"Sloc","inline":true},{"text":") with constant ","element":"span"},{"style":{"height":14.8},"width":112.72,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-32.png","element":"img","alt":" ϑ ≥ 0","inline":true},{"text":", then for each ","element":"span"},{"style":{"height":12.4},"width":107.44,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-33.png","element":"img","alt":" ε > 0","inline":true},{"text":", the scaled function 
","element":"span"},{"style":{"height":19.14},"width":536.48,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-34.png","element":"img","alt":"φε : x ∈ Rn ↦ εφ(ε−1x) ∈ R","inline":true,"padRight":true},{"text":"satisfies (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-35.png","element":"img","alt":"Sloc","inline":true},{"text":") with constant ","element":"span"},{"style":{"height":12.8},"width":59.08,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/21-36.png","element":"img","alt":" εϑ.","inline":true}],[{"text":"The following proposition presents several important convergence properties of the functions (","element":"span"},{"style":{"height":17.6},"width":166.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-0.png","element":"img","alt":"∇Hε)ε>0","inline":true},{"text":". In the sequel, we shall denote by ","element":"span"},{"style":{"height":18.34},"width":293.16,"height":45.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-1.png","element":"img","alt":" ek ∈ RK, k ∈ K","inline":true},{"text":", the unit vector from the ","element":"span"},{"text":"k","element":"span"},{"text":"-th column of the identity matrix ","element":"span"},{"style":{"height":14.69},"width":49.2,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-2.png","element":"img","alt":" IK","inline":true},{"text":", and by conv(","element":"span"},{"text":"S","element":"span"},{"text":") the convex hull of a given set ","element":"span"},{"style":{"height":15.93},"width":163.68,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-3.png","element":"img","alt":" S ⊂ RK.","inline":true}],[{"id":"id-97","text":"Proposition 6.3. 
","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") holds. Let ","element":"span"},{"text":"(","element":"span"},{"style":{"height":18.29},"width":129.8,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-4.png","element":"img","alt":"Hε)ε≥0","inline":true,"padRight":true},{"text":"be defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-58","text":"3.3","element":"a"},{"text":")","element":"span"},{"text":", ","element":"span"},{"text":"(","element":"span"},{"style":{"height":17.6},"width":423.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-5.png","element":"img","alt":"∂H0)(x) = conv({ek ∈","inline":true},{"style":{"height":19.53},"width":1366.12,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-6.png","element":"img","alt":"RK | xk = H0(x), k ∈ K}) for all x ∈ RK, and U = {x ∈ RK | (∂H0)(x","inline":true},{"text":") is a singleton","element":"span"},{"text":"}","element":"span"},{"text":". 
Then it holds for all ","element":"span"},{"style":{"height":15.94},"width":139.92,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-7.png","element":"img","alt":" x ∈ RK ","inline":true,"padRight":true},{"text":"and compact subset ","element":"span"},{"style":{"height":13.2},"width":209.48,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-8.png","element":"img","alt":" C ⊂ U that","inline":true}],[{"id":"id-92","style":{"height":19.33},"width":1815.88,"height":48.32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-9.png","element":"img","alt":"(1) limk→∞ dist((∇Hεk)(xk), (∂H0)(x)) = 0 provided that limk→∞ xk = x and limk→∞ εk = 0+,","inline":true}],[{"text":"(2) ","element":"span"},{"text":"(","element":"span"},{"style":{"height":17.6},"width":166.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-10.png","element":"img","alt":"∇Hε)ε>0","inline":true,"padRight":true},{"text":"converges uniformly to ","element":"span"},{"style":{"height":15.49},"width":395.92,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-11.png","element":"img","alt":" ∂H0 on C as ε → 0","inline":true},{"text":". 
If we further suppose the function ","element":"span"},{"href":"#id-50","style":{"height":19.54},"width":403.72,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-12.png","element":"img","alt":"H : RK → R in (H.2","inline":true},{"text":") satisfies (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-13.png","element":"img","alt":"Sloc","inline":true},{"text":") with constant ","element":"span"},{"style":{"height":14.8},"width":107.92,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-14.png","element":"img","alt":" ϑ ≥ 0","inline":true},{"text":", then there exists ","element":"span"},{"style":{"height":15.09},"width":313.16,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-15.png","element":"img","alt":" ε0 > 0 such that","inline":true,"padRight":true},{"text":"(","element":"span"},{"style":{"height":17.6},"width":946.12,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-16.png","element":"img","alt":"∇Hε)(x) = (∂H0)(x) for all x ∈ C and ε ∈ (0, ε0].","inline":true}],[{"text":"Proof. 
","element":"span"},{"text":"We first establish Property (","element":"span"},{"href":"#id-92","text":"1","element":"a"},{"text":") by considering the following function:","element":"span"}],[{"style":{"width":"52%"},"width":971,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-17.png","element":"img"}],[{"text":"Note that Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":"(","element":"span"},{"href":"#id-56","text":"1","element":"a"},{"text":") shows that the restriction of ","element":"span"},{"style":{"height":16.8},"width":162.48,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-18.png","element":"img","alt":" ρ on ∆K","inline":true,"padRight":true},{"text":"is continuous, which subsequently implies that ","element":"span"},{"style":{"height":16.4},"width":26,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-19.png","element":"img","alt":" φ","inline":true,"padRight":true},{"text":"is a continuous function. 
Then we can deduce from [","element":"span"},{"text":"1","element":"span"},{"text":", Theorem 17.31] that the set-valued mapping Ξ : (","element":"span"},{"style":{"height":21.76},"width":955.44,"height":54.4,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-20.png","element":"img","alt":"x, ε) ∈ RK × [0, 1] ⇒ arg maxy∈∆K φ(x, ε, y) ⊂ ∆K","inline":true,"padRight":true},{"text":"is upper hemicontinuous, which along with the fact that Ξ(","element":"span"},{"style":{"height":17.6},"width":305.8,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-21.png","element":"img","alt":"x, ε) = (∇Hε)(x","inline":true},{"text":") for all (","element":"span"},{"style":{"height":19.54},"width":301.44,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-22.png","element":"img","alt":"x, ε) ∈ RK × (0,","inline":true,"padRight":true},{"text":"1] (see Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":"(","element":"span"},{"href":"#id-57","text":"2","element":"a"},{"text":")) enables us to deduce lim","element":"span"},{"style":{"height":18.19},"width":507.36,"height":45.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-23.png","element":"img","alt":"k→∞ dist((∇Hεk)(xk), Ξ(x,","inline":true,"padRight":true},{"text":"0)) = 0 for any given lim","element":"span"},{"style":{"height":15.68},"width":331.2,"height":39.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-24.png","element":"img","alt":"k→∞ xk = x and","inline":true,"padRight":true},{"text":"lim","element":"span"},{"style":{"height":17.62},"width":250.64,"height":44.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-25.png","element":"img","alt":"k→∞ εk = 0+","inline":true},{"text":". 
Property (","element":"span"},{"href":"#id-92","text":"1","element":"a"},{"text":") now follows from the fact that Ξ(","element":"span"},{"style":{"height":17.6},"width":306.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-26.png","element":"img","alt":"x, 0) = (∂H0)(x","inline":true},{"text":") (see e.g. [","element":"span"},{"href":"#id-93","referenceIndex":37,"text":"37","element":"a"},{"text":", Theorem 2]).","element":"span"}],[{"text":"Now we shall prove Property (","element":"span"},{"href":"#id-92","text":"2","element":"a"},{"text":"). We first define the set ","element":"span"},{"style":{"height":20.23},"width":690.44,"height":50.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-27.png","element":"img","alt":" Uk = {x ∈ RK | xk > xj, ∀j ̸= k} for","inline":true,"padRight":true},{"text":"each ","element":"span"},{"style":{"height":13.2},"width":111.72,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-28.png","element":"img","alt":" k ∈ K","inline":true},{"text":". 
It is clear that (","element":"span"},{"style":{"height":17.6},"width":134.48,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-29.png","element":"img","alt":"Uk)k∈K","inline":true,"padRight":true},{"text":"are disjoint open convex sets, ","element":"span"},{"style":{"height":15.68},"width":240.24,"height":39.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-30.png","element":"img","alt":" U = ∪k∈KUk","inline":true},{"text":", and it holds for all ","element":"span"},{"style":{"height":15.28},"width":501.8,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-31.png","element":"img","alt":" k ∈ K and x ∈ Uk that H0","inline":true,"padRight":true},{"text":"is differentiable at ","element":"span"},{"style":{"height":17.6},"width":665.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-32.png","element":"img","alt":" x with (∇H0)(x) = ek = (∂H0)(x).","inline":true}],[{"text":"Let ","element":"span"},{"style":{"height":13.2},"width":133.84,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-33.png","element":"img","alt":" C ⊂ U","inline":true,"padRight":true},{"text":"be a compact set, then we have ","element":"span"},{"style":{"height":17.6},"width":893.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-34.png","element":"img","alt":" C = ∪k∈K(C ∩ Uk) due to U = ∪k∈KUk. 
Let","inline":true,"padRight":true},{"text":"us fix an arbitrary index ","element":"span"},{"style":{"height":13.2},"width":144,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-35.png","element":"img","alt":" k ∈ K.","inline":true,"padRight":true},{"text":"By using the fact that (","element":"span"},{"style":{"height":17.6},"width":134.48,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-36.png","element":"img","alt":"Uk)k∈K","inline":true,"padRight":true},{"text":"are disjoint open sets, we can deduce that ","element":"span"},{"style":{"height":15.28},"width":124.08,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-37.png","element":"img","alt":" C ∩ Uk","inline":true,"padRight":true},{"text":"is also compact. Since (","element":"span"},{"style":{"height":18.29},"width":129.8,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-38.png","element":"img","alt":"Hε)ε≥0","inline":true,"padRight":true},{"text":"are convex and differentiable on ","element":"span"},{"style":{"height":15.28},"width":136.8,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-39.png","element":"img","alt":" Uk and","inline":true,"padRight":true},{"text":"lim","element":"span"},{"style":{"height":17.6},"width":636.24,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-40.png","element":"img","alt":"ε→0 Hε(x) = H0(x) for all x ∈ Uk","inline":true},{"text":", we can deduce from the convexity of ","element":"span"},{"href":"#id-93","referenceIndex":37,"style":{"height":17.6},"width":405.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-41.png","element":"img","alt":" Uk and [38, Theorem","inline":true,"padRight":true},{"text":"25.7] that 
(","element":"span"},{"style":{"height":17.6},"width":166.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-42.png","element":"img","alt":"∇Hε)ε>0","inline":true,"padRight":true},{"text":"converges uniformly to ","element":"span"},{"style":{"height":15.68},"width":635.88,"height":39.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-43.png","element":"img","alt":" ∇H0 = ∂H0 on C ∩ Uk. Since K","inline":true,"padRight":true},{"text":"is a finite set, we have shown the desired uniform convergence on ","element":"span"},{"text":"C","element":"span"},{"text":".","element":"span"}],[{"text":"Moreover, for each ","element":"span"},{"style":{"height":13.2},"width":116.04,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-44.png","element":"img","alt":" k ∈ K","inline":true},{"text":", the compactness of ","element":"span"},{"style":{"height":15.28},"width":124.08,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-45.png","element":"img","alt":" C ∩ Uk","inline":true,"padRight":true},{"text":"implies that there exists ","element":"span"},{"style":{"height":17.68},"width":252.96,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-46.png","element":"img","alt":" ε0,k > 0 such","inline":true,"padRight":true},{"text":"that ","element":"span"},{"style":{"height":20.42},"width":1061.88,"height":51.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-47.png","element":"img","alt":" C ∩ Uk ⊂ {x ∈ RK | xk > xj + ε0,k, ∀j ̸= k}. 
Then, if H","inline":true,"padRight":true},{"text":"satisfies (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-48.png","element":"img","alt":"Sloc","inline":true},{"text":") with constant ","element":"span"},{"style":{"height":15.6},"width":118.56,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-49.png","element":"img","alt":" ϑ ≥ 0,","inline":true,"padRight":true},{"text":"then Lemma ","element":"span"},{"text":"6.2","element":"span"},{"text":"(","element":"span"},{"href":"#id-94","text":"2","element":"a"},{"text":") shows that for all ","element":"span"},{"style":{"height":10.4},"width":70.48,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-50.png","element":"img","alt":" ε >","inline":true,"padRight":true},{"text":"0 satisfying ","element":"span"},{"style":{"height":17.68},"width":551.72,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-51.png","element":"img","alt":" εϑ ≤ ε0,k, we have Hε = H0","inline":true,"padRight":true},{"text":"(and hence ","element":"span"},{"style":{"height":17.6},"width":456.24,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-52.png","element":"img","alt":"∇Hε = ∇H0) on C ∩ Uk","inline":true},{"text":". 
Hence, by setting ","element":"span"},{"style":{"height":12.29},"width":85.84,"height":30.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-53.png","element":"img","alt":" ε0 >","inline":true,"padRight":true},{"text":"0 to be a constant satisfying ","element":"span"},{"style":{"height":17.68},"width":352.8,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-54.png","element":"img","alt":" ε0ϑ ≤ mink∈K ε0,k,","inline":true,"padRight":true},{"text":"we can conclude for all ","element":"span"},{"style":{"height":17.6},"width":783.84,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-55.png","element":"img","alt":" ε ∈ (0, ε0] that ∇Hε = ∇H0 = ∂H0 on C.","inline":true}],[{"text":"Now we are ready to present the convergence of the feedback relaxed control (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":"). Note that the Hölder continuity of the relaxed controls (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") and the possible discontinuity of the feedback control (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":") suggest that the sequence (","element":"span"},{"style":{"height":19.55},"width":139.4,"height":48.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-56.png","element":"img","alt":"λuε)ε>0","inline":true,"padRight":true},{"text":"in general does not converge uniformly to ","element":"span"},{"style":{"height":12.34},"width":109.92,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-57.png","element":"img","alt":" αu on","inline":true},{"style":{"height":12.8},"width":192.8,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-58.png","element":"img","alt":"O as ε →","inline":true,"padRight":true},{"text":"0. 
Thus we shall show that the relaxed controls converge in terms of the Hausdorff metric everywhere in ","element":"span"},{"text":"O","element":"span"},{"text":", and converge uniformly on compact subsets of the following region:","element":"span"}],[{"id":"id-95","style":{"width":"79%"},"width":1476,"height":107,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-59.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":19.14},"width":344.6,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-60.png","element":"img","alt":" u ∈ C(O) ∩ C2(O","inline":true},{"text":") is the solution to (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") (or equivalently the value function (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") if the function ","element":"span"},{"style":{"height":17.01},"width":139.08,"height":42.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-61.png","element":"img","alt":" σ ∈ Sn0","inline":true},{"text":"; see Theorem ","element":"span"},{"href":"#id-1","text":"2.2","element":"a"},{"text":"), and (","element":"span"},{"style":{"height":17.6},"width":134.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-62.png","element":"img","alt":"Lk)k∈K ","inline":true,"padRight":true},{"text":"are the elliptic operators defined as in (","element":"span"},{"href":"#id-52","text":"2.7","element":"a"},{"text":"). 
","element":"span"},{"text":"Note that ","element":"span"},{"style":{"height":15.09},"width":61,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/22-63.png","element":"img","alt":" Ost","inline":true,"padRight":true},{"text":"contains the points at which a strict complementary condition is satisfied, i.e., the optimal feedback control strategy of (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") is uniquely determined.","element":"span"}],[{"id":"id-32","text":"Theorem 6.4. ","element":"span"},{"text":"Suppose (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") and (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") hold. Let ","element":"span"},{"text":"(","element":"span"},{"style":{"height":19.55},"width":139.4,"height":48.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-0.png","element":"img","alt":"λuε)ε>0","inline":true,"padRight":true},{"text":"be the functions defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") ","element":"span"},{"text":"for each ","element":"span"},{"style":{"height":19.14},"width":482.12,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-1.png","element":"img","alt":" ε > 0, u ∈ C(O) ∩ C2(O)","inline":true,"padRight":true},{"text":"be the solution to ","element":"span"},{"text":"(","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":")","element":"span"},{"text":", and ","element":"span"},{"style":{"height":15.09},"width":61,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-2.png","element":"img","alt":" Ost","inline":true,"padRight":true},{"text":"be the set defined as in ","element":"span"},{"text":"(","element":"span"},{"href":"#id-95","text":"6.2","element":"a"},{"text":")","element":"span"},{"text":". 
Then we have for all ","element":"span"},{"style":{"height":17.6},"width":909.8,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-3.png","element":"img","alt":" x ∈ O and (xε)ε>0 ⊂ O with limε→0 xε = x that","inline":true}],[{"id":"id-96","style":{"width":"88%"},"width":1634,"height":108,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-4.png","element":"img"}],[{"text":"Moreover, it holds for every compact subset ","element":"span"},{"style":{"height":19.74},"width":401.96,"height":49.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-5.png","element":"img","alt":" C ⊂ Ost that (λuε)ε>0","inline":true,"padRight":true},{"text":"converges uniformly to the function ","element":"span"},{"style":{"height":19.85},"width":1077.16,"height":49.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-6.png","element":"img","alt":"λ∗ : x ∈ Ost → eκu(x) ∈ ∆K on C as ε → 0, where κu(x","inline":true},{"text":") = arg max","element":"span"},{"style":{"height":20.8},"width":544.4,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-7.png","element":"img","alt":"k∈K(Lku(x) + fk(x)) for all","inline":true},{"style":{"height":15.09},"width":145.48,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-8.png","element":"img","alt":"x ∈ Ost","inline":true},{"text":". 
If we further suppose the function ","element":"span"},{"href":"#id-50","style":{"height":19.54},"width":413.79,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-9.png","element":"img","alt":" H : RK → R in (H.2","inline":true},{"text":") satisfies (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-10.png","element":"img","alt":"Sloc","inline":true},{"text":") with constant ","element":"span"},{"style":{"height":13.2},"width":106,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-11.png","element":"img","alt":"ϑ > 0","inline":true},{"text":", then there exists ","element":"span"},{"style":{"height":14.29},"width":119.44,"height":35.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-12.png","element":"img","alt":" ε0 > 0","inline":true,"padRight":true},{"text":"such that it holds for all ","element":"span"},{"style":{"height":19.55},"width":570.28,"height":48.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-13.png","element":"img","alt":" ε ∈ (0, ε0] that λuε ≡ λ∗ on C.","inline":true}],[{"text":"Proof. ","element":"span"},{"text":"For any given ","element":"span"},{"style":{"height":19.14},"width":560.12,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-14.png","element":"img","alt":" ε > 0, let uε ∈ C(O) ∩ C2(O","inline":true},{"text":") be the solution to (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":"). We first prove (","element":"span"},{"href":"#id-96","text":"6.3","element":"a"},{"text":") by fixing an arbitrary point ","element":"span"},{"style":{"height":13.2},"width":121.4,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-15.png","element":"img","alt":" x ∈ O","inline":true},{"text":". 
By using (","element":"span"},{"href":"#id-27","text":"3.6","element":"a"},{"text":") and Proposition ","element":"span"},{"href":"#id-97","text":"6.3","element":"a"},{"text":"(","element":"span"},{"href":"#id-92","text":"1","element":"a"},{"text":"), we see it suffices to show lim","element":"span"},{"style":{"height":17.6},"width":965.6,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-16.png","element":"img","alt":"ε→0(Luε(xε) + f(xε)) = Lu(x) + f(x), where L, f","inline":true,"padRight":true},{"text":"are defined as those in (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":"). Then the fact that (","element":"span"},{"style":{"height":17.6},"width":119.24,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-17.png","element":"img","alt":"uε)ε>0","inline":true,"padRight":true},{"text":"converges to ","element":"span"},{"text":"u ","element":"span"},{"text":"uniformly in ","element":"span"},{"style":{"height":19.14},"width":105.08,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-18.png","element":"img","alt":" C2(O","inline":true},{"text":") (see Theorem ","element":"span"},{"href":"#id-31","text":"6.1","element":"a"},{"text":") and the continuity of coefficients enable us to conclude (","element":"span"},{"href":"#id-96","text":"6.3","element":"a"},{"text":").","element":"span"}],[{"text":"We now proceed to demonstrate the uniform convergence of (","element":"span"},{"style":{"height":19.74},"width":265.96,"height":49.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-19.png","element":"img","alt":"λuε)ε>0 in Ost","inline":true},{"text":". 
Note that for all ","element":"span"},{"style":{"height":21.66},"width":905.12,"height":54.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-20.png","element":"img","alt":"x ∈ Ost, we have eκu(x) = (∂H0)(Lu(x) + f(x))","inline":true},{"text":", where the set-valued mapping ","element":"span"},{"style":{"height":17.82},"width":322.32,"height":44.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-21.png","element":"img","alt":" ∂H0 : RK ⇒ ∆K","inline":true,"padRight":true},{"text":"is defined as in Proposition ","element":"span"},{"href":"#id-97","text":"6.3","element":"a"},{"text":". We further define for any given ","element":"span"},{"style":{"height":13.2},"width":254.6,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-22.png","element":"img","alt":" k ∈ K the set","inline":true}],[{"style":{"width":"61%"},"width":1134,"height":47,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-23.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":19.14},"width":343.17,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-24.png","element":"img","alt":" u ∈ C(O) ∩ C2(O","inline":true},{"text":") is the solution to (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":"), and (","element":"span"},{"style":{"height":17.6},"width":134.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-25.png","element":"img","alt":"Lk)k∈K","inline":true,"padRight":true},{"text":"are the elliptic operators defined as in (","element":"span"},{"href":"#id-52","text":"2.7","element":"a"},{"text":"). 
The continuity of the coefficients in (","element":"span"},{"href":"#id-35","style":{"height":17.6},"width":325.84,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-26.png","element":"img","alt":"Lk)k∈K (see (H.1","inline":true},{"text":")) implies that (","element":"span"},{"style":{"height":18.48},"width":252.8,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-27.png","element":"img","alt":"Ost,k)k∈K are","inline":true,"padRight":true},{"text":"disjoint open sets satisfying ","element":"span"},{"style":{"height":17.68},"width":323.04,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-28.png","element":"img","alt":" Ost = ∪k∈KOst,k.","inline":true}],[{"style":{"width":"96%"},"width":1779,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-29.png","element":"img"}],[{"style":{"height":17.68},"width":165.84,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-30.png","element":"img","alt":"C ∩ Ost,k","inline":true,"padRight":true},{"text":"is a compact set for each ","element":"span"},{"style":{"height":13.2},"width":356.52,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-31.png","element":"img","alt":" k ∈ K. Let k ∈ K","inline":true,"padRight":true},{"text":"be a fixed index. 
Then the continuity of the coefficients in (","element":"span"},{"style":{"height":17.6},"width":134.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-32.png","element":"img","alt":"Lk)k∈K","inline":true},{"text":", the fact that ","element":"span"},{"style":{"height":19.14},"width":183.32,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-33.png","element":"img","alt":" u ∈ C2(O","inline":true},{"text":"), and the compactness of ","element":"span"},{"style":{"height":17.68},"width":158.16,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-34.png","element":"img","alt":" C ∩ Ost,k","inline":true,"padRight":true},{"text":"imply that there exist constants ","element":"span"},{"style":{"height":17.6},"width":286.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-35.png","element":"img","alt":" C1, C2 ∈ (0, ∞","inline":true},{"text":") such that we have for all ","element":"span"},{"style":{"height":17.68},"width":597.6,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-36.png","element":"img","alt":" x ∈ C ∩ Ost,k and j ∈ K that,","inline":true}],[{"style":{"width":"81%"},"width":1498,"height":151,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-37.png","element":"img"}],[{"text":"Now by using the fact that (","element":"span"},{"style":{"height":17.6},"width":118.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-38.png","element":"img","alt":"uε)ε>0","inline":true,"padRight":true},{"text":"converges to ","element":"span"},{"text":"u ","element":"span"},{"text":"uniformly in ","element":"span"},{"style":{"height":19.13},"width":105.08,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-39.png","element":"img","alt":" C2(O","inline":true},{"text":"), we can deduce that there exist 
","element":"span"},{"style":{"height":15.6},"width":224.56,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-40.png","element":"img","alt":" ε0, C1, C2 >","inline":true,"padRight":true},{"text":"0 such that the same estimates hold for all (","element":"span"},{"style":{"height":19.86},"width":178.92,"height":49.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-41.png","element":"img","alt":"uε)ε∈(0,ε0]","inline":true},{"text":". In other words, letting ","element":"span"},{"text":"U ","element":"span"},{"text":"be the set defined as in Proposition ","element":"span"},{"href":"#id-97","text":"6.3","element":"a"},{"text":", we can introduce the compact set","element":"span"}],[{"style":{"width":"71%"},"width":1319,"height":77,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-42.png","element":"img"}],[{"text":"and conclude for all ","element":"span"},{"style":{"height":18.48},"width":1391.04,"height":46.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-43.png","element":"img","alt":" ε ∈ (0, ε0], x ∈ C ∩ Ost,k that Luε(x) + f(x) ∈ Gk and Lu(x) + f(x) ∈ Gk.","inline":true}],[{"style":{"width":"96%"},"width":1779,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-44.png","element":"img"}],[{"href":"#id-97","text":"6.3","element":"a"},{"text":"(","element":"span"},{"href":"#id-92","text":"2","element":"a"},{"text":")) ensures that there exists ","element":"span"},{"style":{"height":15.28},"width":95.92,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-45.png","element":"img","alt":" δk >","inline":true,"padRight":true},{"text":"0, such that we have for all ","element":"span"},{"style":{"height":16.4},"width":500.84,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-46.png","element":"img","alt":" y ∈ Gk and ε < δk 
that","inline":true},{"style":{"height":17.6},"width":525.52,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-47.png","element":"img","alt":"|(∇Hε)(y) − (∂H0)(y)| ≤ η","inline":true},{"text":". Hence, by using the fact that ","element":"span"},{"style":{"height":17.6},"width":367.44,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-48.png","element":"img","alt":" ∂H0 = {ek} on Gk","inline":true},{"text":", we have for all","element":"span"}],[{"style":{"width":"83%"},"width":1548,"height":202,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/23-49.png","element":"img"}],[{"text":"which shows the uniform convergence of (","element":"span"},{"style":{"height":20.43},"width":1041.8,"height":51.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-0.png","element":"img","alt":"λuε)ε>0 to λ∗ on C ∩ Ost,k. Since C = ∪k∈K(C ∩ Ost,k)","inline":true,"padRight":true},{"text":"and ","element":"span"},{"text":"K ","element":"span"},{"text":"is a finite set, we can conclude the desired uniform convergence on ","element":"span"},{"text":"C","element":"span"},{"text":".","element":"span"}],[{"text":"Finally, if we further suppose ","element":"span"},{"text":"H ","element":"span"},{"text":"satisfies (","element":"span"},{"style":{"height":15.28},"width":69.24,"height":38.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-1.png","element":"img","alt":"Sloc","inline":true},{"text":") with constant ","element":"span"},{"style":{"height":14.8},"width":71.92,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-2.png","element":"img","alt":" ϑ ≥","inline":true,"padRight":true},{"text":"0, Proposition ","element":"span"},{"href":"#id-97","text":"6.3","element":"a"},{"text":"(","element":"span"},{"href":"#id-92","text":"2","element":"a"},{"text":") ensures that 
","element":"span"},{"style":{"height":15.68},"width":361.2,"height":39.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-3.png","element":"img","alt":" ∇Hε ≡ ∂H0 on Gk","inline":true,"padRight":true},{"text":"for all small enough ","element":"span"},{"style":{"height":10.4},"width":68.08,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-4.png","element":"img","alt":" ε >","inline":true,"padRight":true},{"text":"0, which leads to the fact that ","element":"span"},{"style":{"height":15.74},"width":297.12,"height":39.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-5.png","element":"img","alt":" λuε ≡ λ∗ for all","inline":true,"padRight":true},{"text":"small enough ","element":"span"},{"style":{"height":13.2},"width":199.68,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-6.png","element":"img","alt":" ε > 0 on C","inline":true,"padRight":true},{"text":"and finishes our proof.","element":"span"}],[{"id":"id-70","text":"Remark ","element":"span"},{"text":"6.2","element":"span"},{"text":". 
","element":"span"},{"text":"One can identify the unit vector ","element":"span"},{"style":{"height":16},"width":310.92,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-7.png","element":"img","alt":" ek ∈ ∆K, k ∈ K","inline":true},{"text":", as the Dirac measure supported on ","element":"span"},{"style":{"height":17.6},"width":88.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-8.png","element":"img","alt":" {ak}","inline":true},{"text":", which shows that, as the relaxation parameter tends to zero, the agent of the relaxed control problem places more emphasis on exploitation, and the relaxed control distribution collapses to a pure exploitation strategy for the classical control problem.","element":"span"}],[{"text":"Note that Theorem ","element":"span"},{"href":"#id-32","text":"6.4 ","element":"a"},{"text":"demonstrates an exact regularization feature of the reward function ","element":"span"},{"style":{"height":13.09},"width":89.96,"height":32.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-9.png","element":"img","alt":"ρzang","inline":true,"padRight":true},{"text":"generated by ","element":"span"},{"style":{"height":17.09},"width":103.88,"height":42.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-10.png","element":"img","alt":" Hzang","inline":true},{"text":", which means that we can recover the original control strategy in the region ","element":"span"},{"style":{"height":15.09},"width":61,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-11.png","element":"img","alt":" Ost","inline":true,"padRight":true},{"text":"based on the feedback relaxed control ","element":"span"},{"text":"without ","element":"span"},{"text":"sending the relaxation parameter 
","element":"span"},{"style":{"height":11.2},"width":75.76,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-12.png","element":"img","alt":" ε to","inline":true,"padRight":true},{"text":"0. The main intuition of the proof is that the region ","element":"span"},{"style":{"height":15.09},"width":61,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-13.png","element":"img","alt":" Ost","inline":true,"padRight":true},{"text":"can be mapped into a finite number of convex sets (i.e., the sets (","element":"span"},{"style":{"height":17.6},"width":134.48,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-14.png","element":"img","alt":"Uk)k∈K","inline":true,"padRight":true},{"text":"in the proof of Proposition ","element":"span"},{"href":"#id-97","text":"6.3","element":"a"},{"text":"). Hence, if a reward function only modifies the pointwise maximum function locally near the kinks, then one can employ the local compactness and local convexity structure of ","element":"span"},{"style":{"height":15.09},"width":61,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-15.png","element":"img","alt":" Ost","inline":true,"padRight":true},{"text":"and the finiteness of the action set ","element":"span"},{"text":"A","element":"span"},{"text":", and deduce the local exact regularization property in the region ","element":"span"},{"style":{"height":15.09},"width":75.36,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-16.png","element":"img","alt":" Ost.","inline":true}],[{"text":"The exact regularization feature of ","element":"span"},{"style":{"height":13.09},"width":89.96,"height":32.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-17.png","element":"img","alt":" ρzang","inline":true,"padRight":true},{"text":"helps avoid the possible numerical instability for solving the relaxed control 
problem (","element":"span"},{"href":"#id-51","text":"3.2","element":"a"},{"text":") with an extremely small relaxation parameter. In contrast, the feedback relaxed control ","element":"span"},{"style":{"height":15.55},"width":59.12,"height":38.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-18.png","element":"img","alt":" λuε ","inline":true,"padRight":true},{"text":"based on the entropy reward function ","element":"span"},{"style":{"height":12},"width":56.44,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-19.png","element":"img","alt":" ρen","inline":true,"padRight":true},{"text":"is always in (0","element":"span"},{"style":{"height":19.54},"width":103.2,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/24-20.png","element":"img","alt":", 1)K,","inline":true,"padRight":true},{"text":"and the convergence rate to the original control strategy can be arbitrarily slow.","element":"span"}]]},{"heading":"7 Conclusions","paragraphs":[[{"text":"To the best of our knowledge, this is the first paper that constructs Lipschitz stable feedback control strategies for general multi-dimensional continuous-time stochastic control problems, and rigorously analyzes the performance of a pre-computed feedback control for a perturbed problem in a continuous setting. ","element":"span"},{"text":"We also perform a novel first-order sensitivity analysis for the value function and feedback relaxed control with respect to perturbations in the model parameters, and quantify the explicit dependence of the Lipschitz stability of feedback controls on the exploration parameter. 
","element":"span"},{"text":"These stability results provide a theoretical justification for the recent reinforcement learning heuristic that including an exploration reward in the optimization objective leads to more robust decision-making.","element":"span"}],[{"text":"A natural next step would be to extend the stability analysis to finite-horizon stochastic control problems and mean-field control problems with continuous action spaces (see e.g. [","element":"span"},{"href":"#id-11","referenceIndex":22,"text":"23","element":"a"},{"text":", ","element":"span"},{"href":"#id-9","referenceIndex":41,"text":"42","element":"a"},{"text":"]). The infinite cardinality of action spaces implies that the corresponding relaxed controls take values in an infinite-dimensional space of probability measures, which poses additional challenges for the analysis of the regularized control problems. For example, infinite-dimensional convex analysis on spaces of measures must be employed to analyze the regularity of the modified Hamiltonians and the well-posedness of the associated HJB equations. 
Moreover, one must endow the action space of relaxed controls with a suitable metric structure (such as the Wasserstein metric) in order to study the spatial regularity and Lipschitz stability of feedback relaxed controls.","element":"span"}],[{"text":"Another interesting direction is to design efficient numerical algorithms for solving the regularized control problems in a continuous setting.","element":"span"}]]},{"heading":"A Proofs of technical results","paragraphs":[[{"style":{"width":"99%"},"width":1834,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-0.png","element":"img"}],[{"text":"be the strong solution to (","element":"span"},{"href":"#id-36","text":"2.2","element":"a"},{"text":") with control ","element":"span"},{"style":{"height":8.4},"width":28,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-1.png","element":"img","alt":" α","inline":true},{"text":", and for all ","element":"span"},{"style":{"height":23.17},"width":723.56,"height":57.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-2.png","element":"img","alt":" t ≥ 0, let Zα,xt = ∫ t0 c(Xα,xs , αs) ds. 
It","inline":true,"padRight":true},{"text":"is shown in [","element":"span"},{"href":"#id-54","referenceIndex":13,"text":"13","element":"a"},{"text":", Lemma 3.1] that ","element":"span"},{"style":{"height":17.6},"width":373.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-3.png","element":"img","alt":" E[exp(µτ α,x)] < ∞","inline":true,"padRight":true},{"text":"for some constant ","element":"span"},{"style":{"height":13.6},"width":80.08,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-4.png","element":"img","alt":" µ >","inline":true,"padRight":true},{"text":"0, which implies that ","element":"span"},{"style":{"height":12.74},"width":178.4,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-5.png","element":"img","alt":" τ α,x < ∞","inline":true,"padRight":true},{"text":"with probability 1. Applying Itô’s formula to the function ","element":"span"},{"style":{"height":17.6},"width":440.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-6.png","element":"img","alt":" φ(y, z) = u(y) exp(−z),","inline":true,"padRight":true},{"text":"(","element":"span"},{"style":{"height":17.6},"width":274.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-7.png","element":"img","alt":"y, z) ∈ Rn × R","inline":true},{"text":", gives us that","element":"span"}],[{"id":"id-102","style":{"width":"88%"},"width":1629,"height":236,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-8.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":15.09},"width":105.8,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-9.png","element":"img","alt":" LXα,x","inline":true,"padRight":true},{"text":"is the generator of the controlled dynamics 
","element":"span"},{"style":{"height":23.17},"width":797.12,"height":57.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-10.png","element":"img","alt":" Xα,x, and Γα,xt = exp(−∫ t0 c(Xα,xs , αs) ds)","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":210.52,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-11.png","element":"img","alt":" t ∈ [0, τ α,x","inline":true},{"text":"]. The fact that ","element":"span"},{"text":"u ","element":"span"},{"text":"is a solution to (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") implies that for ","element":"span"},{"style":{"height":15.6},"width":367.2,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-12.png","element":"img","alt":" P-a.s. ω ∈ Ω, and","inline":true}],[{"id":"id-101","style":{"width":"99%"},"width":1843,"height":244,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-13.png","element":"img"}],[{"text":"Then, by rearranging the terms, using the fact that ","element":"span"},{"style":{"height":18.66},"width":598.28,"height":46.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-14.png","element":"img","alt":" φ(Xα,xτ α,x, Zα,xτ α,x) = g(Xα,xτ α,x)Γα,xτ α,x","inline":true,"padRight":true},{"text":"and taking ","element":"span"},{"text":"the supremum over all ","element":"span"},{"style":{"height":15.68},"width":392.2,"height":39.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-15.png","element":"img","alt":" α ∈ Aπ and π ∈ Πref","inline":true},{"text":", we can deduce that ","element":"span"},{"style":{"height":17.6},"width":491.52,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-16.png","element":"img","alt":" u(x) ≥ v(x) for all x ∈O.","inline":true}],[{"text":"We proceed to show 
","element":"span"},{"style":{"height":12.33},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-17.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"is a feedback control of (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":") (cf. Definition ","element":"span"},{"href":"#id-63","text":"2.2","element":"a"},{"text":"). Let ","element":"span"},{"style":{"height":12.8},"width":290.24,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-18.png","element":"img","alt":" αu :O → A be","inline":true,"padRight":true},{"text":"a Borel measurable function satisfying (","element":"span"},{"href":"#id-43","text":"2.8","element":"a"},{"text":"), and ˜","element":"span"},{"style":{"height":12.8},"width":257.36,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-19.png","element":"img","alt":"αu : Rn → A","inline":true,"padRight":true},{"text":"be an extension of ","element":"span"},{"style":{"height":12.8},"width":245.96,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-20.png","element":"img","alt":" αu such that","inline":true,"padRight":true},{"text":"˜","element":"span"},{"style":{"height":18.66},"width":653.88,"height":46.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-21.png","element":"img","alt":"αu = αu onO and ˜αu = a1 onOc","inline":true},{"text":". 
We shall consider the functions ","element":"span"},{"style":{"height":17.01},"width":543.72,"height":42.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-22.png","element":"img","alt":" bα : Rn → Rn, σα : Rn → Sn0","inline":true,"padRight":true},{"text":"such that ","element":"span"},{"style":{"height":17.6},"width":1063.56,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-23.png","element":"img","alt":" bα(x) = b(x, ˜αu(x)), σα(x) = σ(x, ˜αu(x)) for all x ∈ Rn","inline":true},{"text":". The measurability of ","element":"span"},{"style":{"height":12.8},"width":135.84,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-24.png","element":"img","alt":" αu and","inline":true,"padRight":true},{"text":"the continuity of ","element":"span"},{"style":{"height":15.6},"width":68.68,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-25.png","element":"img","alt":" b, σ","inline":true,"padRight":true},{"text":"imply that ","element":"span"},{"style":{"height":15.6},"width":259.04,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-26.png","element":"img","alt":" bα, σα and ˜αu ","inline":true,"padRight":true},{"text":"are Borel measurable. 
Then, for any given ","element":"span"},{"style":{"height":15.2},"width":144.48,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-27.png","element":"img","alt":" x ∈ Rn,","inline":true,"padRight":true},{"text":"by using the boundedness of functions ","element":"span"},{"href":"#id-44","referenceIndex":32,"style":{"height":17.6},"width":288.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-28.png","element":"img","alt":" bα, σα, and [32","inline":true},{"text":", Theorem 1], we can deduce that there exists ","element":"span"},{"style":{"height":18.29},"width":1026.92,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-29.png","element":"img","alt":" πx = (Ωx, Fx, {Fxt }t≥0, Px, W) ∈ Πref, and an {Fxt }t≥0","inline":true},{"text":"-progressively measurable continuous ","element":"span"},{"text":"process (","element":"span"},{"style":{"height":18.29},"width":133.16,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-30.png","element":"img","alt":"Xxt )t≥0","inline":true},{"text":", such that ","element":"span"},{"style":{"height":17.01},"width":240.48,"height":42.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-31.png","element":"img","alt":" Xx0 = x, and","inline":true}],[{"id":"id-100","style":{"width":"87%"},"width":1609,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-32.png","element":"img"}],[{"text":"Thus we can obtain from the definition of ˜","element":"span"},{"href":"#id-98","style":{"height":18.29},"width":817.76,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-33.png","element":"img","alt":"αu that (Xxt )t≥0 satisfies (2.9) with h = αu","inline":true},{"text":". Moreover, ","element":"span"},{"text":"[","element":"span"},{"href":"#id-17","referenceIndex":29,"text":"29","element":"a"},{"text":", Theorem 2.2.4 on p. 
54] implies that ","element":"span"},{"style":{"height":28.02},"width":1045.36,"height":70.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-34.png","element":"img","alt":" EPx[∫ τ αu,x0 (|b(Xxs , αu(Xxs ))| + |σ(Xxs , αu(Xxs ))|2)ds] <","inline":true},{"style":{"height":8},"width":44,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-35.png","element":"img","alt":"∞","inline":true},{"text":", which shows that ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-36.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"is a feedback control of (","element":"span"},{"href":"#id-42","text":"2.3","element":"a"},{"text":").","element":"span"}],[{"text":"It remains to show ","element":"span"},{"style":{"height":12.33},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-37.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"is an optimal feedback control. If ","element":"span"},{"style":{"height":13.6},"width":150.2,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-38.png","element":"img","alt":" x ∈ ∂O","inline":true},{"text":", we can deduce from the definition that ","element":"span"},{"style":{"height":15.74},"width":94.36,"height":39.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-39.png","element":"img","alt":" τ ˜αu,x ","inline":true,"padRight":true},{"text":"= 0, which shows that ","element":"span"},{"style":{"height":20.11},"width":819.2,"height":50.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-40.png","element":"img","alt":" g(x) = g(Xxτ ˜αu,x) = J(x, αu), where J(x, αu","inline":true},{"text":") is defined ","element":"span"},{"text":"as in (","element":"span"},{"href":"#id-99","text":"2.10","element":"a"},{"text":"). 
Similarly, we have for all ","element":"span"},{"style":{"height":16},"width":495.8,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-41.png","element":"img","alt":" π ∈ Πref, α ∈ Aπ, x ∈ ∂O","inline":true,"padRight":true},{"text":"that the first exit time of ","element":"span"},{"style":{"height":12},"width":89.56,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-42.png","element":"img","alt":" Xα,x","inline":true,"padRight":true},{"text":"from ","element":"span"},{"text":"O ","element":"span"},{"text":"is 0, i.e., ","element":"span"},{"style":{"height":12.34},"width":74.2,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-43.png","element":"img","alt":" τ α,x ","inline":true,"padRight":true},{"text":"= 0, which implies that ","element":"span"},{"text":"v","element":"span"},{"text":"(","element":"span"},{"text":"x","element":"span"},{"text":") = ","element":"span"},{"text":"g","element":"span"},{"text":"(","element":"span"},{"text":"x","element":"span"},{"text":"). 
Hence, we can deduce from the fact that ","element":"span"},{"text":"u ","element":"span"},{"text":"satisfies the boundary condition of (","element":"span"},{"href":"#id-41","text":"2.6","element":"a"},{"text":") that ","element":"span"},{"style":{"height":17.6},"width":863.04,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-44.png","element":"img","alt":" u(x) = g(x) = v(x) = J(x, αu) for all x ∈ ∂O.","inline":true}],[{"text":"For each ","element":"span"},{"style":{"height":15.6},"width":267.64,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-45.png","element":"img","alt":" x ∈ O, let Xx ","inline":true,"padRight":true},{"text":"be a progressively measurable continuous process satisfying the SDE (","element":"span"},{"href":"#id-100","text":"A.3","element":"a"},{"text":"), defined on the reference probability system ","element":"span"},{"style":{"height":14.88},"width":174.76,"height":37.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-46.png","element":"img","alt":" πx ∈ Πref","inline":true},{"text":". The assumption that ","element":"span"},{"href":"#id-43","style":{"height":17.6},"width":305.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-47.png","element":"img","alt":" αu satisfies (2.8)","inline":true,"padRight":true},{"text":"ensures that ˜","element":"span"},{"style":{"height":17.6},"width":303.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-48.png","element":"img","alt":"αu(Xx) and Xx ","inline":true,"padRight":true},{"text":"obtain the equality in (","element":"span"},{"href":"#id-101","text":"A.2","element":"a"},{"text":") for ","element":"span"},{"style":{"height":19.55},"width":662.4,"height":48.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-49.png","element":"img","alt":" P-a.s. 
ω ∈ Ω, and t ∈ [0, τ ˜αu,x(ω)],","inline":true,"padRight":true},{"text":"from which, by arguments similar to those for (","element":"span"},{"href":"#id-102","text":"A.1","element":"a"},{"text":"), we can obtain that ","element":"span"},{"href":"#id-99","style":{"height":17.6},"width":539.04,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-50.png","element":"img","alt":" u(x) = J(x, αu) (cf. (2.10)).","inline":true,"padRight":true},{"text":"On the other hand, owing to the fact that ˜","element":"span"},{"style":{"height":17.6},"width":281.48,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-51.png","element":"img","alt":"αu(Xx) ∈ Aπx","inline":true},{"text":", we have by the definition of ","element":"span"},{"text":"v ","element":"span"},{"text":"that ","element":"span"},{"style":{"height":17.6},"width":491,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-52.png","element":"img","alt":"u(x) ≤ v(x) for all x ∈ O","inline":true},{"text":". 
Combining this with the fact that ","element":"span"},{"style":{"height":17.6},"width":650.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-53.png","element":"img","alt":" u(x) ≥ v(x) for all x ∈ O, we can","inline":true,"padRight":true},{"text":"conclude that ","element":"span"},{"style":{"height":17.6},"width":549.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-54.png","element":"img","alt":" u(x) = v(x) = J(x, αu) in O","inline":true},{"text":", which shows that ","element":"span"},{"style":{"height":12.34},"width":47.84,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-55.png","element":"img","alt":" αu ","inline":true,"padRight":true},{"text":"is an optimal feedback control and ","element":"span"},{"style":{"height":12.8},"width":228.96,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/25-56.png","element":"img","alt":" u ≡ v onO.","inline":true}],[{"text":"Proof of Lemma ","element":"span"},{"href":"#id-37","text":"3.1","element":"a"},{"text":". ","element":"span"},{"text":"The definition of ∆","element":"span"},{"href":"#id-35","style":{"height":17.6},"width":226.48,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-0.png","element":"img","alt":"K and (H.1","inline":true},{"text":") clearly imply that the function ","element":"span"},{"text":"˜","element":"span"},{"text":"b ","element":"span"},{"text":"is well-defined and enjoys the desired estimates. 
Hence we shall focus on establishing the properties of the function ˜","element":"span"},{"style":{"height":8},"width":38.4,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-1.png","element":"img","alt":"σ.","inline":true}],[{"text":"It has been shown in [","element":"span"},{"href":"#id-85","referenceIndex":17,"text":"17","element":"a"},{"text":", Theorem 7.14-3] that for any given ","element":"span"},{"style":{"height":17.82},"width":144.08,"height":44.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-2.png","element":"img","alt":" A ∈ Sn>","inline":true},{"text":", there exists a unique ","element":"span"},{"text":"matrix ","element":"span"},{"style":{"height":21.79},"width":1327.56,"height":54.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-3.png","element":"img","alt":" A1/2 ∈ Sn> such that A1/2(A1/2)T = A, A1/2 ≥ √µIn if A ≥ µIn","inline":true},{"text":", and the mapping Φ : ","element":"span"},{"style":{"height":21.15},"width":580.4,"height":52.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-4.png","element":"img","alt":" A ∈ Sn> ↦ Φ(A) = A1/2 ∈ Sn> ","inline":true,"padRight":true},{"text":"is infinitely differentiable. 
Note that (","element":"span"},{"href":"#id-46","text":"2.4","element":"a"},{"text":") and (","element":"span"},{"href":"#id-46","text":"2.5","element":"a"},{"text":") in (H.","element":"span"},{"href":"#id-35","text":"1","element":"a"},{"text":") ","element":"span"},{"text":"ensure that there exists a constant ","element":"span"},{"style":{"height":17.6},"width":188.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-5.png","element":"img","alt":" C ∈ (0, ∞","inline":true},{"text":"), such that it holds for all ","element":"span"},{"style":{"height":16},"width":401.48,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-6.png","element":"img","alt":" x ∈ Rn, λ ∈ ∆K that","inline":true}],[{"id":"id-103","style":{"width":"93%"},"width":1729,"height":136,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-7.png","element":"img"}],[{"text":"We now define the function ˜","element":"span"},{"style":{"height":31.6},"width":1299.56,"height":79,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-8.png","element":"img","alt":"σ : Rn × ∆K → Sn> by ˜σ(x, λ) = Φ��Kk=1 σ(x, ak)σ(x, ak)T λk�for","inline":true,"padRight":true},{"text":"all ","element":"span"},{"style":{"height":16},"width":320.88,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-9.png","element":"img","alt":" x ∈ Rn, λ ∈ ∆K","inline":true},{"text":". The facts that Φ is a smooth function and ","element":"span"},{"text":"G ","element":"span"},{"text":"is a compact subset of ","element":"span"},{"style":{"height":17.42},"width":50.48,"height":43.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-10.png","element":"img","alt":" Sn>","inline":true,"padRight":true},{"text":"imply that Φ is bounded and Lipschitz continuous on ","element":"span"},{"text":"G","element":"span"},{"text":". 
Therefore, we can conclude from (","element":"span"},{"href":"#id-46","text":"2.4","element":"a"},{"text":"), (","element":"span"},{"href":"#id-46","text":"2.5","element":"a"},{"text":"), (","element":"span"},{"href":"#id-103","text":"A.4","element":"a"},{"text":") and the definition of ˜","element":"span"},{"style":{"height":8},"width":25,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-11.png","element":"img","alt":"σ","inline":true,"padRight":true},{"text":"that it holds for all ","element":"span"},{"style":{"height":17.68},"width":792,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-12.png","element":"img","alt":" x ∈ Rn, λ ∈ ∆K that ˜σ(x, λ) ≥ √νIn and","inline":true},{"style":{"height":22.32},"width":421.44,"height":55.8,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-13.png","element":"img","alt":"�i,j |˜σij(·, λ)|0,1 < ∞.","inline":true}],[{"text":"Proof of Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":". ","element":"span"},{"text":"We start by establishing Property (","element":"span"},{"href":"#id-56","text":"1","element":"a"},{"text":"). 
Since ","element":"span"},{"style":{"height":15.54},"width":250.88,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-14.png","element":"img","alt":" H : RK → R","inline":true,"padRight":true},{"text":"is a continuous convex function, the representation of ","element":"span"},{"href":"#id-50","style":{"height":17.6},"width":341.68,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-15.png","element":"img","alt":" ρ in (H.2) and [38","inline":true},{"text":", Theorem 12.2] ensure that ","element":"span"},{"style":{"height":12},"width":23,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-16.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"is a closed convex proper function satisfying","element":"span"}],[{"id":"id-104","style":{"width":"69%"},"width":1282,"height":88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-17.png","element":"img"}],[{"text":"The assumption that ","element":"span"},{"style":{"height":19.54},"width":889.68,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-18.png","element":"img","alt":" H(x) − c0 ≤ maxk∈K xk ≤ H(x) for all x ∈ RK ","inline":true,"padRight":true},{"text":"implies that for all ","element":"span"},{"style":{"height":19.14},"width":152.64,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-19.png","element":"img","alt":" y ∈ RK,","inline":true}],[{"style":{"width":"89%"},"width":1650,"height":107,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-20.png","element":"img"}],[{"text":"which together with the fact that","element":"span"}],[{"style":{"width":"41%"},"width":772,"height":132,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-21.png","element":"img"}],[{"text":"shows that 
","element":"span"},{"style":{"height":17.6},"width":234.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-22.png","element":"img","alt":" ρ(y) ∈ [−c0,","inline":true,"padRight":true},{"text":"0] for all ","element":"span"},{"style":{"height":17.6},"width":788.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-23.png","element":"img","alt":" y ∈ ∆K and ρ(y) = ∞ for all y ∈ (∆K)c","inline":true},{"text":". Finally, since ","element":"span"},{"style":{"height":15.6},"width":106.48,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-24.png","element":"img","alt":" ρ is a","inline":true,"padRight":true},{"text":"closed convex function satisfying ","element":"span"},{"style":{"height":19.54},"width":532.08,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-25.png","element":"img","alt":" {y ∈ RK | ρ(y) < ∞} = ∆K","inline":true},{"text":", we can deduce from [","element":"span"},{"href":"#id-93","referenceIndex":37,"text":"38","element":"a"},{"text":", Theorem 10.2] (∆","element":"span"},{"style":{"height":8.8},"width":30,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-26.png","element":"img","alt":"K","inline":true,"padRight":true},{"text":"is the standard simplex and hence locally simplicial) that the restriction of ","element":"span"},{"style":{"height":16.8},"width":202.28,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-27.png","element":"img","alt":" ρ to ∆K is","inline":true,"padRight":true},{"text":"a continuous function.","element":"span"}],[{"text":"We now show Property (","element":"span"},{"href":"#id-57","text":"2","element":"a"},{"text":"). 
It is clear from (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") and (","element":"span"},{"href":"#id-58","text":"3.3","element":"a"},{"text":") that ","element":"span"},{"style":{"height":17.6},"width":559.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-28.png","element":"img","alt":" Hε(x)−c0ε ≤ H0(x) ≤ Hε(x)","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":15.93},"width":139.92,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-29.png","element":"img","alt":" x ∈ RK","inline":true},{"text":". Note that (","element":"span"},{"href":"#id-104","text":"A.5","element":"a"},{"text":") and the fact that ","element":"span"},{"style":{"height":16.8},"width":266.16,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-30.png","element":"img","alt":" ρ = ∞ on ∆K","inline":true,"padRight":true},{"text":"imply that for all ","element":"span"},{"style":{"height":13.2},"width":265.28,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-31.png","element":"img","alt":" ε > 0 we have","inline":true}],[{"style":{"width":"71%"},"width":1330,"height":79,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-32.png","element":"img"}],[{"text":"which shows the function ","element":"span"},{"style":{"height":12},"width":43.16,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-33.png","element":"img","alt":" ερ","inline":true,"padRight":true},{"text":"is the convex conjugate of ","element":"span"},{"style":{"height":17.6},"width":367.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-34.png","element":"img","alt":" Hε, i.e., (Hε)∗ = ερ","inline":true},{"text":". 
Hence, we can further deduce from [","element":"span"},{"href":"#id-93","referenceIndex":37,"text":"38","element":"a"},{"text":", Theorem 23.5], the differentiability and convexity of ","element":"span"},{"style":{"height":15.09},"width":148.04,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-35.png","element":"img","alt":" Hε that","inline":true}],[{"style":{"width":"83%"},"width":1550,"height":88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-36.png","element":"img"}],[{"text":"Consequently, we can obtain from the fundamental theorem of calculus and the Cauchy-Schwarz inequality that ","element":"span"},{"style":{"height":14.69},"width":52.48,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-37.png","element":"img","alt":" Hε","inline":true,"padRight":true},{"text":"is Lipschitz continuous with constant ","element":"span"},{"style":{"height":18.29},"width":813.6,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/26-38.png","element":"img","alt":" LHε = supx∈RK |(∇Hε)(x)| ≤ maxy∈∆K |y|.","inline":true,"padRight":true},{"text":"Note that ∆","element":"span"},{"style":{"height":8.8},"width":30,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-0.png","element":"img","alt":"K","inline":true,"padRight":true},{"text":"is the convex hull of ","element":"span"},{"style":{"height":17.6},"width":421.2,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-1.png","element":"img","alt":" {e1, . . . , eK}, where ek","inline":true,"padRight":true},{"text":"is the unit vector from the ","element":"span"},{"text":"k","element":"span"},{"text":"-th column of the identity matrix ","element":"span"},{"href":"#id-93","referenceIndex":37,"style":{"height":17.6},"width":290.32,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-2.png","element":"img","alt":" IK. 
Hence [38","inline":true},{"text":", Theorem 32.2] ensures that max","element":"span"},{"style":{"height":18.29},"width":153.12,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-3.png","element":"img","alt":"y∈∆K |y|","inline":true,"padRight":true},{"text":"is attained at ","element":"span"},{"style":{"height":17.6},"width":233.2,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-4.png","element":"img","alt":"{e1, . . . , eK}","inline":true},{"text":", which implies that ","element":"span"},{"style":{"height":16.23},"width":121.36,"height":40.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-5.png","element":"img","alt":" LHε ≤","inline":true,"padRight":true},{"text":"1, and finishes the proof of Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":".","element":"span"}],[{"text":"Before establishing Proposition ","element":"span"},{"href":"#id-62","text":"3.3","element":"a"},{"text":", we first present an ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate for solutions of fully nonlinear equations involving only the second order term.","element":"span"}],[{"id":"id-105","text":"Lemma A.1. ","element":"span"},{"text":"[","element":"span"},{"href":"#id-40","referenceIndex":16,"text":"16","element":"a"},{"text":", Theorem 7.2 on p. 125] Let ","element":"span"},{"text":"O ","element":"span"},{"text":"be a bounded connected open subset of ","element":"span"},{"style":{"height":15.6},"width":154.12,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-6.png","element":"img","alt":" Rn, and","inline":true},{"style":{"height":12.8},"width":320.48,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-7.png","element":"img","alt":"F : O × Sn → R","inline":true,"padRight":true},{"text":"be a given function. 
Suppose the function ","element":"span"},{"text":"F ","element":"span"},{"text":"is differentiable and convex in its second component, and there exist constants ","element":"span"},{"style":{"height":26.18},"width":984.08,"height":65.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-8.png","element":"img","alt":" λ, Λ > 0 such that λIn ≤� ∂F∂rij (x, r)�≤ ΛIn for all","inline":true,"padRight":true},{"text":"(","element":"span"},{"style":{"height":17.6},"width":269.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-9.png","element":"img","alt":"x, r) ∈ O × Sn","inline":true},{"text":". Then there exists a constant ","element":"span"},{"style":{"height":17.6},"width":420.68,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-10.png","element":"img","alt":" α = α(n, Λ/λ) ∈ (0, 1)","inline":true,"padRight":true},{"text":"such that for any ","element":"span"},{"style":{"height":17.6},"width":195.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-11.png","element":"img","alt":" β ∈ (0, α),","inline":true,"padRight":true},{"text":"if we have in addition that ","element":"span"},{"style":{"height":19.94},"width":461.48,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-12.png","element":"img","alt":" ∂O ∈ C2,β, g ∈ C2,β(O)","inline":true},{"text":", and there exist constants ","element":"span"},{"style":{"height":16.4},"width":344.84,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-13.png","element":"img","alt":" γ, µ > 0 such that","inline":true,"padRight":true},{"text":"it holds for all ","element":"span"},{"style":{"height":19.94},"width":1178.72,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-14.png","element":"img","alt":" x, y ∈ O, r ∈ Sn that |F(x, r) − F(y, r)| ≤ γ(µ + |r|)|x − y|β","inline":true},{"text":", then the Dirichlet 
problem","element":"span"}],[{"style":{"width":"42%"},"width":794,"height":47,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-15.png","element":"img"}],[{"text":"admits a unique solution ","element":"span"},{"style":{"height":19.94},"width":234.44,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-16.png","element":"img","alt":" u ∈ C2,β(O)","inline":true,"padRight":true},{"text":"satisfying the estimate ","element":"span"},{"text":"[","element":"span"},{"style":{"height":20.8},"width":654.6,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-17.png","element":"img","alt":"u]2,β ≤ C�|u|0 + |g|2,β + µ�, where","inline":true,"padRight":true},{"text":"the constant ","element":"span"},{"text":"C ","element":"span"},{"text":"depends only on ","element":"span"},{"style":{"height":19.94},"width":917.32,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-18.png","element":"img","alt":" n, Λ/λ, γ, (α − β)−1 and the C2,β-norm of ∂O.","inline":true}],[{"text":"Now we proceed to prove the ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate for solutions to (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":").","element":"span"}],[{"text":"Proof of Proposition ","element":"span"},{"href":"#id-62","text":"3.3","element":"a"},{"text":". ","element":"span"},{"text":"Throughout this proof, we shall denote by ","element":"span"},{"text":"C ","element":"span"},{"text":"a generic constant, which may take a different value at each occurrence. 
Let ","element":"span"},{"style":{"height":19.14},"width":336.92,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-19.png","element":"img","alt":" φ ∈ C(O) ∩ C2(O","inline":true},{"text":") be a given function, and consider the Dirichlet problem","element":"span"}],[{"id":"id-106","style":{"width":"73%"},"width":1353,"height":53,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-20.png","element":"img"}],[{"text":"where we define ","element":"span"},{"style":{"height":19.82},"width":476.04,"height":49.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-21.png","element":"img","alt":" D2u(x) = [∂iju(x)] ∈ Sn","inline":true},{"text":", and the function ","element":"span"},{"style":{"height":17.68},"width":344.96,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-22.png","element":"img","alt":" Fφ : O × Sn → R","inline":true,"padRight":true},{"text":"such that for all","element":"span"}],[{"style":{"width":"84%"},"width":1561,"height":157,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-23.png","element":"img"}],[{"text":"It follows from (H.","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":") that ","element":"span"},{"style":{"height":17.28},"width":47.84,"height":43.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-24.png","element":"img","alt":" Fφ","inline":true,"padRight":true},{"text":"is differentiable and convex in ","element":"span"},{"text":"r","element":"span"},{"text":". 
","element":"span"},{"text":"Moreover, a straightforward computation shows for all (","element":"span"},{"style":{"height":28.67},"width":989.32,"height":71.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-25.png","element":"img","alt":"x, r) ∈ O × Sn that� ∂Fφ∂rij (x, r)�= �Kk=1 ηk(x, r)ak(x","inline":true},{"text":"), where we have","element":"span"}],[{"style":{"width":"85%"},"width":1589,"height":79,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-26.png","element":"img"}],[{"text":"Note that for each ","element":"span"},{"style":{"height":13.2},"width":114.12,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-27.png","element":"img","alt":" k ∈ K","inline":true},{"text":", the fact that ","element":"span"},{"href":"#id-35","style":{"height":19.54},"width":711.28,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-28.png","element":"img","alt":" ak = σσT /2 and (H.1) (see (2.4)-(2.5","inline":true},{"text":")) imply that there exists a constant ","element":"span"},{"text":"C","element":"span"},{"text":", depending only on ","element":"span"},{"text":"n","element":"span"},{"text":", such that for all ","element":"span"},{"style":{"height":15.6},"width":126.24,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-29.png","element":"img","alt":" x ∈ O,","inline":true}],[{"style":{"width":"57%"},"width":1060,"height":117,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-30.png","element":"img"}],[{"text":"which, along with the fact that (","element":"span"},{"style":{"height":19.54},"width":1225.84,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-31.png","element":"img","alt":"η1(x, r), . . . 
, ηK(x, r))T ∈ ∆K for all (x, r) ∈ O × Sn (see Lemma","inline":true,"padRight":true},{"href":"#id-56","text":"3.2","element":"a"},{"text":"(","element":"span"},{"href":"#id-57","text":"2","element":"a"},{"text":")), shows that ","element":"span"},{"style":{"height":28.67},"width":471.24,"height":71.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-32.png","element":"img","alt":"ν2In ≤� ∂Fφ∂rij (x, r)�≤ CIn","inline":true},{"text":", for some constant ","element":"span"},{"text":"C ","element":"span"},{"text":"depending only on ","element":"span"},{"text":"n ","element":"span"},{"text":"and the constant ","element":"span"},{"text":"M ","element":"span"},{"text":"defined in the statement of Proposition ","element":"span"},{"href":"#id-62","text":"3.3","element":"a"},{"text":".","element":"span"}],[{"style":{"width":"96%"},"width":1779,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-33.png","element":"img"}],[{"href":"#id-56","text":"3.2","element":"a"},{"text":"(","element":"span"},{"href":"#id-57","text":"2","element":"a"},{"text":")) imply that, if the function ","element":"span"},{"style":{"height":19.14},"width":457.32,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-34.png","element":"img","alt":" φ ∈ C2,η(O), 0 < η ≤ θ","inline":true},{"text":", then the function ","element":"span"},{"style":{"height":17.28},"width":47.84,"height":43.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-35.png","element":"img","alt":" Fφ","inline":true,"padRight":true},{"text":"satisfies for all","element":"span"}],[{"style":{"width":"75%"},"width":1400,"height":130,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/27-36.png","element":"img"}],[{"text":"for some constant ","element":"span"},{"text":"C ","element":"span"},{"text":"depending only on ","element":"span"},{"text":"n","element":"span"},{"text":". 
Consequently, we can deduce from Lemma ","element":"span"},{"href":"#id-105","text":"A.1 ","element":"a"},{"text":"that, there exists a constant ","element":"span"},{"style":{"height":17.6},"width":441.6,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-0.png","element":"img","alt":" β0 = β0(n, ν, M) ∈ (0,","inline":true,"padRight":true},{"text":"1), such that for all ","element":"span"},{"style":{"height":17.6},"width":534.44,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-1.png","element":"img","alt":" β ∈ (0, min(β0, θ)] and φ ∈","inline":true},{"style":{"height":19.93},"width":135.32,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-2.png","element":"img","alt":"C2,β(O","inline":true},{"text":"), the Dirichlet problem (","element":"span"},{"href":"#id-106","text":"A.6","element":"a"},{"text":") admits a unique solution ","element":"span"},{"style":{"height":19.93},"width":235.16,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-3.png","element":"img","alt":" uφ ∈ C2,β(O","inline":true},{"text":"), and satisfies [","element":"span"},{"style":{"height":20.41},"width":154.48,"height":51.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-4.png","element":"img","alt":"uφ]2,β ≤","inline":true},{"style":{"height":20.93},"width":537.36,"height":52.32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-5.png","element":"img","alt":"C�|uφ|0 + |g|2,β + |φ|1,β + 1�","inline":true},{"text":", where the constant ","element":"span"},{"text":"C ","element":"span"},{"text":"depends only on ","element":"span"},{"style":{"height":16.4},"width":347.04,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-6.png","element":"img","alt":" n, ν, Λ, β, and O.","inline":true}],[{"text":"Now let 
","element":"span"},{"style":{"height":19.94},"width":608.52,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-7.png","element":"img","alt":" uε ∈ C2,β(O), β ∈ (0, min(β0, θ","inline":true},{"text":")] be a solution to (","element":"span"},{"href":"#id-59","text":"3.5","element":"a"},{"text":"). Then it is clear that ","element":"span"},{"style":{"height":12.74},"width":88.52,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-8.png","element":"img","alt":" uε is","inline":true,"padRight":true},{"text":"a solution to the Dirichlet problem: ","element":"span"},{"style":{"height":19.14},"width":843.32,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-9.png","element":"img","alt":" Fuε(x, D2u(x)) = 0 in O and u = g on ∂O","inline":true},{"text":". We can then deduce from the above arguments that, there exists a constant ","element":"span"},{"text":"C","element":"span"},{"text":", depending only on ","element":"span"},{"style":{"height":16.4},"width":191.12,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-10.png","element":"img","alt":" n, ν, Λ, β","inline":true,"padRight":true},{"text":"and ","element":"span"},{"text":"O","element":"span"},{"text":", such that [","element":"span"},{"style":{"height":20.8},"width":562.32,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-11.png","element":"img","alt":"uε]2,β ≤ C�|g|2,β + |uε|1,β + 1�","inline":true},{"text":". Hence by using the interpolation inequality (see [","element":"span"},{"href":"#id-40","referenceIndex":16,"text":"16","element":"a"},{"text":", Theorem 1.2 on p. 
18]), we have ","element":"span"},{"style":{"height":20.8},"width":566.4,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-12.png","element":"img","alt":" |uε|2,β ≤ C(|g|2,β + |uε|0 + 1).","inline":true}],[{"style":{"width":"99%"},"width":1846,"height":227,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-13.png","element":"img"}],[{"text":"from which, by using the classical maximum principle (see e.g. [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 3.7]) and the fact that ","element":"span"},{"style":{"height":15.49},"width":211.44,"height":38.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-14.png","element":"img","alt":"∇Hε ∈ ∆K","inline":true,"padRight":true},{"text":"(see Lemma ","element":"span"},{"href":"#id-56","text":"3.2","element":"a"},{"text":"(","element":"span"},{"href":"#id-57","text":"2","element":"a"},{"text":")), we can deduce that there exists a constant ","element":"span"},{"style":{"height":17.6},"width":374.8,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-15.png","element":"img","alt":" C = C(n, Λ, O) > 0","inline":true,"padRight":true},{"text":"such that","element":"span"}],[{"style":{"width":"87%"},"width":1626,"height":106,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-16.png","element":"img"}],[{"text":"which together with the fact that ","element":"span"},{"style":{"height":20.8},"width":550.32,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-17.png","element":"img","alt":" |uε|2,β ≤ C(|g|2,β + |uε|0 + 1)","inline":true},{"text":"leads to the desired estimate.","element":"span"}],[{"text":"Proof of Proposition ","element":"span"},{"href":"#id-89","text":"5.3","element":"a"},{"text":". 
","element":"span"},{"text":"The well-posedness of the classical solution ","element":"span"},{"style":{"height":12.73},"width":48.16,"height":31.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-18.png","element":"img","alt":" wε ","inline":true,"padRight":true},{"text":"follows from the standard elliptic regularity theory (see [","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 6.14]), hence it suffices to prove the ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate for a fixed ","element":"span"},{"style":{"height":12.4},"width":111.84,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-19.png","element":"img","alt":" ε > 0.","inline":true}],[{"text":"Let ","element":"span"},{"style":{"height":13.6},"width":69.04,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-20.png","element":"img","alt":" ρ >","inline":true,"padRight":true},{"text":"0 be a constant whose value will be specified later, and (","element":"span"},{"style":{"height":19.74},"width":141.32,"height":49.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-21.png","element":"img","alt":"ξm)Mm=1 ","inline":true,"padRight":true},{"text":"be a partition of unity ","element":"span"},{"text":"in a domain containing ","element":"span"},{"text":"O ","element":"span"},{"text":"such that the following properties hold: (1) the support of each function ","element":"span"},{"style":{"height":16.4},"width":49.2,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-22.png","element":"img","alt":"ξm","inline":true,"padRight":true},{"text":"is contained in a ball ","element":"span"},{"style":{"height":18.69},"width":843.24,"height":46.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-23.png","element":"img","alt":" Bρ(xm) for some xm ∈ Rn; (2) ξm ∈ 
C∞(Rn","inline":true},{"text":") satisfies for all ","element":"span"},{"style":{"height":16},"width":200.84,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-24.png","element":"img","alt":" γ ≥ 0 that","inline":true},{"style":{"height":20.05},"width":632.96,"height":50.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-25.png","element":"img","alt":"|ξm|⌊γ⌋,γ−⌊γ⌋ ≤ Cγρ−γ, where ⌊γ⌋","inline":true,"padRight":true},{"text":"is the integer part of ","element":"span"},{"style":{"height":17.89},"width":175.96,"height":44.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-26.png","element":"img","alt":" γ and Cγ","inline":true,"padRight":true},{"text":"is a constant independent of ","element":"span"},{"text":"m ","element":"span"},{"text":"and ","element":"span"},{"style":{"height":11.6},"width":24,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-27.png","element":"img","alt":" γ","inline":true},{"text":"; (3) for each ","element":"span"},{"style":{"height":22.05},"width":367.72,"height":55.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-28.png","element":"img","alt":" x ∈O, �Mm=1 ξm(x","inline":true},{"text":") = 1 and the number of intersected supports of (","element":"span"},{"style":{"height":19.74},"width":140.84,"height":49.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-29.png","element":"img","alt":"ξm)Mm=1","inline":true,"padRight":true},{"text":"at ","element":"span"},{"text":"x ","element":"span"},{"text":"is bounded by a constant ","element":"span"},{"style":{"height":14.69},"width":63.24,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-30.png","element":"img","alt":" Mn","inline":true,"padRight":true},{"text":"depending only on the dimension ","element":"span"},{"text":"n","element":"span"},{"text":". 
In the following, we shall denote by ","element":"span"},{"text":"w ","element":"span"},{"text":"the solution ","element":"span"},{"style":{"height":16.4},"width":255.76,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-31.png","element":"img","alt":" wε, and by C","inline":true,"padRight":true},{"text":"a generic constant independent of ","element":"span"},{"style":{"height":15.6},"width":224.64,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-32.png","element":"img","alt":" α, m and ε.","inline":true}],[{"text":"For each ","element":"span"},{"text":"m ","element":"span"},{"text":"= 1","element":"span"},{"text":", . . . , M","element":"span"},{"text":", we define the function ","element":"span"},{"style":{"height":16.4},"width":207.6,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-33.png","element":"img","alt":" wm = wξm","inline":true},{"text":", which satisfies ","element":"span"},{"style":{"height":16.8},"width":337.4,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-34.png","element":"img","alt":" wm = gξm on ∂O","inline":true,"padRight":true},{"text":"and","element":"span"}],[{"id":"id-107","style":{"width":"99%"},"width":1846,"height":561,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/28-35.png","element":"img"}],[{"text":"which together with the fact that ","element":"span"},{"style":{"height":17.89},"width":255.36,"height":44.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-0.png","element":"img","alt":" ∂ijwm = 0 on","inline":true}],[{"style":{"width":"92%"},"width":1714,"height":215,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-1.png","element":"img"}],[{"text":"Then we can deduce from the interpolation inequality (see [","element":"span"},{"href":"#id-40","referenceIndex":16,"text":"16","element":"a"},{"text":", Theorem 1.3 on p. 
19]) and (","element":"span"},{"href":"#id-107","text":"A.7","element":"a"},{"text":") that","element":"span"}],[{"style":{"width":"97%"},"width":1797,"height":59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-2.png","element":"img"}],[{"text":"Note that for all ","element":"span"},{"style":{"height":14.8},"width":84.88,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-3.png","element":"img","alt":" γ ≥","inline":true,"padRight":true},{"text":"0, we can obtain from property (2) of (","element":"span"},{"style":{"height":21.98},"width":560.56,"height":54.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-4.png","element":"img","alt":"ξm)Mm=1 that |ξm|⌊γ⌋,γ−⌊γ⌋ ≤","inline":true},{"style":{"height":21.42},"width":299.36,"height":53.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-5.png","element":"img","alt":"Cγ(2CΛε−α)γ/β","inline":true},{"text":". Hence by repeatedly applying interpolation inequalities, we can simplify (","element":"span"},{"href":"#id-108","text":"A.8","element":"a"},{"text":") into","element":"span"}],[{"id":"id-108","style":{"width":"67%"},"width":1244,"height":58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-6.png","element":"img"}],[{"text":"which along with properties (2) and (3) of (","element":"span"},{"style":{"height":19.75},"width":140.84,"height":49.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-7.png","element":"img","alt":"ξm)Mm=1 ","inline":true,"padRight":true},{"text":"leads to the estimate that","element":"span"}],[{"style":{"width":"84%"},"width":1560,"height":69,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-8.png","element":"img"}],[{"text":"Finally, we can conclude from the classical maximum principle (see e.g. 
[","element":"span"},{"href":"#id-38","referenceIndex":21,"text":"22","element":"a"},{"text":", Theorem 3.7]) that ","element":"span"},{"style":{"height":17.6},"width":370.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-9.png","element":"img","alt":"|w|0 ≤ C(|f|0 + |g|0","inline":true},{"text":"), which finishes the proof of the desired ","element":"span"},{"text":"a priori ","element":"span"},{"text":"estimate.","element":"span"}],[{"text":"Proof of Lemma ","element":"span"},{"text":"6.2","element":"span"},{"text":". ","element":"span"},{"text":"We first establish Property (","element":"span"},{"href":"#id-109","text":"1","element":"a"},{"text":"). ","element":"span"},{"text":"For any given ","element":"span"},{"style":{"height":19.55},"width":482.12,"height":48.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-10.png","element":"img","alt":" x = (x1, . . . , xn2+n3)T ∈","inline":true},{"style":{"height":20.35},"width":1557.6,"height":50.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-11.png","element":"img","alt":"Rn2+n3, we write x(1) = (x1, . . . , xn2) ∈ Rn2 and x(2) = (xn2+1, . . . , xn2+n3) ∈ Rn3.","inline":true}],[{"text":"Let ","element":"span"},{"style":{"height":15.13},"width":226.68,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-12.png","element":"img","alt":" x ∈ Rn2+n3 ","inline":true,"padRight":true},{"text":"satisfy for some ","element":"span"},{"style":{"height":18.88},"width":1119.28,"height":47.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-13.png","element":"img","alt":" k ∈ {1, . . . 
, n2 + n3} that xk ≥ maxj≠k xj + c with c =","inline":true,"padRight":true},{"text":"max(","element":"span"},{"style":{"height":15.6},"width":423.08,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-14.png","element":"img","alt":"ϑ2, ϑ3, c2 + ϑ1, c3 + ϑ1","inline":true},{"text":"). We assume without loss of generality that ","element":"span"},{"style":{"height":15.09},"width":132.2,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-15.png","element":"img","alt":" k ≤ n2","inline":true},{"text":". Then since ","element":"span"},{"style":{"height":16.4},"width":42.92,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-16.png","element":"img","alt":" φ2","inline":true,"padRight":true},{"text":"satisfies (","element":"span"},{"style":{"height":17.6},"width":480.68,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-17.png","element":"img","alt":"Sloc) with ϑ2 and c ≥ ϑ2","inline":true},{"text":", we have that ","element":"span"},{"style":{"height":23.81},"width":891.36,"height":59.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-18.png","element":"img","alt":" φ2(x(1)) = xk and φ3(x(2)) ≤ H(n3)0 (x(2)) + c3.","inline":true,"padRight":true},{"text":"Moreover, since ","element":"span"},{"style":{"height":23.81},"width":1529.76,"height":59.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-19.png","element":"img","alt":" xk ≥ H(n3)0 (x(2)) + c and c ≥ c3 + ϑ1, we see φ2(x(1)) ≥ φ3(x(2)) + ϑ1, which,","inline":true,"padRight":true},{"text":"along with the assumption that ","element":"span"},{"style":{"height":16.4},"width":42.92,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-20.png","element":"img","alt":" φ1","inline":true,"padRight":true},{"text":"satisfies 
(","element":"span"},{"style":{"height":20.34},"width":990.92,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-21.png","element":"img","alt":"Sloc) with ϑ1, implies φ(x) = φ2(x(1)) = xk. Similar","inline":true,"padRight":true},{"text":"arguments show that the same conclusion holds if ","element":"span"},{"style":{"height":15.09},"width":126.92,"height":37.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-22.png","element":"img","alt":" k ≥ n2","inline":true,"padRight":true},{"text":"+ 1, which enables us to conclude that ","element":"span"},{"style":{"height":16.4},"width":26,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-23.png","element":"img","alt":"φ","inline":true,"padRight":true},{"text":"satisfies (","element":"span"},{"style":{"height":17.6},"width":233.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-24.png","element":"img","alt":"Sloc) with c.","inline":true}],[{"text":"Now let ","element":"span"},{"style":{"height":15.13},"width":219.96,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-25.png","element":"img","alt":" x ∈ Rn2+n3 ","inline":true,"padRight":true},{"text":"be an arbitrary given point. We have by assumptions that ","element":"span"},{"style":{"height":20.33},"width":201.04,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-26.png","element":"img","alt":" φ2(x(1)) ≤","inline":true},{"style":{"height":23.81},"width":888.68,"height":59.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-27.png","element":"img","alt":"H(n2)0 (x(1))+c2 and φ3(x(2)) ≤ H(n3)0 (x(2))+c3","inline":true},{"text":". 
Hence, by using the fact that ","element":"span"},{"style":{"height":23.81},"width":82.6,"height":59.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-28.png","element":"img","alt":" H(2)0","inline":true,"padRight":true},{"text":"is componentwise increasing and subadditive on ","element":"span"},{"style":{"height":17.94},"width":227.36,"height":44.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-29.png","element":"img","alt":" R2, we have","inline":true}],[{"style":{"width":"93%"},"width":1723,"height":140,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-30.png","element":"img"}],[{"text":"which finishes the proof of Property (","element":"span"},{"href":"#id-109","text":"1","element":"a"},{"text":"). Property (","element":"span"},{"href":"#id-94","text":"2","element":"a"},{"text":") follows directly from the definition of ","element":"span"},{"style":{"height":16.4},"width":55.68,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-31.png","element":"img","alt":" φε.","inline":true}]]},{"heading":"References","paragraphs":[[{"text":"[1] C. D. Aliprantis and K. C. Border, ","element":"span"},{"style":{"height":16.8},"width":1095.84,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/29-32.png","element":"img","alt":" Infinite Dimensional Analysis: A Hitchhiker’s Guide, 3rd","inline":true,"padRight":true},{"text":"ed., Springer-Verlag, Berlin, 2006.","element":"span"}],[{"id":"id-16","text":"[2] D. Aldous, ","element":"span"},{"text":"Weak convergence and the general theory of processes","element":"span"},{"text":", manuscript, 1981. Available online at https://www.stat.berkeley.edu/~aldous/Papers/weak-gtp.pdf","element":"span"}],[{"id":"id-18","text":"[3] J. Backhoff-Veraguas, D. Bartl, M. Beiglböck, and M. Eder, ","element":"span"},{"text":"All adapted topologies are equal","element":"span"},{"text":", Probab. Theory Relat. 
Fields, 178 (2020), pp. 1125–1172.","element":"span"}],[{"id":"id-47","text":"[4] J. Backhoff-Veraguas, D. Bartl, M. Beiglböck, and J. Wiesel, ","element":"span"},{"text":"Estimating processes in adapted Wasserstein distance","element":"span"},{"text":", preprint, ","element":"span"},{"href":"http://arxiv.org/abs/2002.07261","text":"arXiv:2002.07261","element":"a"},{"text":", 2020.","element":"span"}],[{"text":"[5] G. Barles and E. Rouy, ","element":"span"},{"text":"A strong comparison result for the Bellman equation arising in stochastic exit time control problems and its applications","element":"span"},{"text":", Comm. Partial Differential Equations, 23 (1998), pp. 1945–2033.","element":"span"}],[{"id":"id-15","text":"[6] M. Basei, X. Guo, and A. Hu, ","element":"span"},{"text":"Linear quadratic reinforcement learning: Sublinear regret in the episodic continuous-time framework","element":"span"},{"text":", preprint, ","element":"span"},{"href":"http://arxiv.org/abs/2006.15316","text":"arXiv:2006.15316","element":"a"},{"text":", 2020.","element":"span"}],[{"text":"[7] E. Bayraktar, Y. Dolinsky, and J. Guo, ","element":"span"},{"text":"Continuity of utility maximization under weak convergence","element":"span"},{"text":", Math. Financ. Econ., 14 (2020), pp. 725–757.","element":"span"}],[{"id":"id-19","text":"[8] E. Bayraktar, L. Dolinskyi, and Y. Dolinsky, ","element":"span"},{"text":"Extended weak convergence and utility maximisation with proportional transaction costs","element":"span"},{"text":", Finance Stoch., 24 (2020), pp. 1013–1034.","element":"span"}],[{"id":"id-48","text":"[9] D. P. Bertsekas and J. N. Tsitsiklis, ","element":"span"},{"text":"Neuro-Dynamic Programming","element":"span"},{"text":", Athena Scientific, Belmont, MA, 1996.","element":"span"}],[{"id":"id-67","text":"[10] S. I. Birbil, S.-C. Fang, J. Frenk, and S. Zhang, ","element":"span"},{"text":"Recursive approximation of the high dimensional max function","element":"span"},{"text":", Oper. 
Res. Lett., 33 (2005), pp. 450–458.","element":"span"}],[{"id":"id-68","text":"[11] P. Blanchard, D. J. Higham, and N. J. Higham, ","element":"span"},{"text":"Accurate computation of the log-sum-exp and softmax functions","element":"span"},{"text":", preprint (2019) ","element":"span"},{"href":"http://arxiv.org/abs/1909.03469","text":"arXiv:1909.03469","element":"a"},{"text":". Accepted in IMA J. Numer. Anal., https://doi.org/10.1093/imanum/draa038.","element":"span"}],[{"id":"id-26","text":"[12] O. Bokanowski, S. Maroso, and H. Zidani, ","element":"span"},{"style":{"height":16.4},"width":942.24,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/30-0.png","element":"img","alt":" Some convergence results for Howard’s algorithm,","inline":true,"padRight":true},{"text":"SIAM J. Numer. Anal., 47 (2009), pp. 3001–3026.","element":"span"}],[{"id":"id-54","text":"[13] R. Buckdahn and T. Y. Nie, ","element":"span"},{"text":"Generalized Hamilton-Jacobi-Bellman equations with Dirichlet boundary condition and stochastic exit time optimal control problem","element":"span"},{"text":", SIAM J. Control Optim., 54 (2016), pp. 602–631.","element":"span"}],[{"id":"id-45","text":"[14] S. Chaumont, ","element":"span"},{"text":"Uniqueness to elliptic and parabolic Hamilton–Jacobi–Bellman equations with non-smooth boundary","element":"span"},{"text":", C.R. Math. Acad. Sci. Paris, 339 (2004), pp. 555–560.","element":"span"}],[{"id":"id-12","text":"[15] C. Chen and O. L. Mangasarian, ","element":"span"},{"text":"Smoothing methods for convex inequalities and linear complementarity problems","element":"span"},{"text":", Math. Program., 71 (1995), pp. 51–69.","element":"span"}],[{"id":"id-40","text":"[16] Y.-Z. Chen and L.-C. Wu, ","element":"span"},{"text":"Second Order Elliptic Equations and Elliptic Systems","element":"span"},{"text":", Transl. Math. Monogr. 174, AMS, Providence, RI, 1998.","element":"span"}],[{"id":"id-85","text":"[17] P. 
Ciarlet, ","element":"span"},{"text":"Linear and Nonlinear Functional Analysis with Applications","element":"span"},{"text":", Appl. Math. 130, SIAM, Philadelphia, 2013.","element":"span"}],[{"id":"id-78","text":"[18] P. Drábek, ","element":"span"},{"style":{"height":16.4},"width":978.52,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/30-1.png","element":"img","alt":" Continuity of Nemyckij’s operator in Hölder spaces","inline":true},{"text":", Comm. Math. Univ. Carolinae, 16 (1975), pp. 37–57.","element":"span"}],[{"id":"id-0","text":"[19] W. H. Fleming and H. M. Soner, ","element":"span"},{"text":"Controlled Markov Processes and Viscosity Solutions","element":"span"},{"text":", 2nd ed., Springer, New York, 2006.","element":"span"}],[{"id":"id-7","text":"[20] P. Forsyth and G. Labahn, ","element":"span"},{"text":"Numerical methods for controlled Hamilton-Jacobi-Bellman PDEs in finance","element":"span"},{"text":", J. Comput. Finance, 11 (2007/2008, Winter), pp. 1–43.","element":"span"}],[{"id":"id-38","text":"[21] M. Geist, B. Scherrer, and O. Pietquin, ","element":"span"},{"text":"A theory of regularized Markov decision processes","element":"span"},{"text":", preprint, ","element":"span"},{"href":"http://arxiv.org/abs/1901.11275","text":"arXiv:1901.11275","element":"a"},{"text":", 2019.","element":"span"}],[{"id":"id-11","text":"[22] D. Gilbarg and N. Trudinger, ","element":"span"},{"text":"Elliptic Partial Differential Equations of Second Order","element":"span"},{"text":", 2nd ed., Springer-Verlag, Berlin, New York, 1985.","element":"span"}],[{"id":"id-49","text":"[23] H. Gu, X. Guo, X. Wei, and R. Xu, ","element":"span"},{"text":"Dynamic programming principles for learning MFGs","element":"span"},{"text":", preprint, ","element":"span"},{"href":"http://arxiv.org/abs/1911.07314","text":"arXiv:1911.07314","element":"a"},{"text":", 2019.","element":"span"}],[{"text":"[24] X. Guo, A. Hu, R. Xu, and J. 
Zhang, ","element":"span"},{"text":"A general framework for learning mean-field games","element":"span"},{"text":", ","element":"span"},{"id":"id-8","text":"preprint, ","element":"span"},{"href":"http://arxiv.org/abs/2003.06069","text":"arXiv:2003.06069","element":"a"},{"text":", 2020.","element":"span"}],[{"text":"[25] T. Haarnoja, H. Tang, P. Abbeel, and S. Levine, ","element":"span"},{"text":"Reinforcement learning with deep energy-","element":"span"},{"id":"id-29","text":"based policies","element":"span"},{"text":", preprint, ","element":"span"},{"href":"http://arxiv.org/abs/1702.08165","text":"arXiv:1702.08165","element":"a"},{"text":", 2017.","element":"span"}],[{"text":"[26] K. Ito, C. Reisinger, and Y. Zhang, ","element":"span"},{"text":"A neural network based policy iteration algorithm ","element":"span"},{"style":{"height":18.73},"width":1442.68,"height":46.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/31-0.png","element":"img","alt":"with global H2-superlinear convergence for stochastic games on domains","inline":true},{"text":", preprint (2019) ","element":"span"},{"href":"http://arxiv.org/abs/1906.02304","text":"arXiv:1906.02304","element":"a"},{"text":". Accepted in Found. Comput. Math., https://doi.org/10.1007/s10208-020-09460-1.","element":"span"}],[{"id":"id-20","text":"[27] A. D. Kara and S. Yüksel, ","element":"span"},{"text":"Robustness to incorrect system models in stochastic control","element":"span"},{"text":", SIAM J. Control Optim., 58 (2020), pp. 1144–1182.","element":"span"}],[{"id":"id-64","text":"[28] B. W. Kort and D. P. Bertsekas, ","element":"span"},{"text":"A new penalty function algorithm for constrained minimization","element":"span"},{"text":", in Proceedings of the 1972 IEEE Conference on Decision and Control, New Orleans, Louisiana, 1972.","element":"span"}],[{"id":"id-17","text":"[29] N. V. 
Krylov, ","element":"span"},{"text":"Controlled Diffusion Processes","element":"span"},{"text":", Springer-Verlag, Berlin, 1980.","element":"span"}],[{"id":"id-14","text":"[30] H. J. Langen, ","element":"span"},{"text":"Convergence of dynamic programming models","element":"span"},{"text":", Math. Oper. Res., 6 (1981), pp. 493–512.","element":"span"}],[{"text":"[31] H. Mania, S. Tu, and B. Recht, ","element":"span"},{"text":"Certainty equivalence is efficient for linear quadratic control","element":"span"},{"text":", in Advances in Neural Information Processing Systems, 2019, pp. 10154–10164.","element":"span"}],[{"id":"id-44","text":"[32] Y. S. Mishura and A. Y. Veretennikov, ","element":"span"},{"text":"Existence and uniqueness theorems for solutions of McKean-Vlasov stochastic equations","element":"span"},{"text":", preprint, ","element":"span"},{"href":"http://arxiv.org/abs/1603.02212","text":"arXiv:1603.02212","element":"a"},{"text":", 2016.","element":"span"}],[{"id":"id-6","text":"[33] O. Nachum, M. Norouzi, K. Xu, and D. Schuurmans, ","element":"span"},{"text":"Bridging the gap between value and policy based reinforcement learning","element":"span"},{"text":", preprint, ","element":"span"},{"href":"http://arxiv.org/abs/1702.08892","text":"arXiv:1702.08892","element":"a"},{"text":", 2017.","element":"span"}],[{"id":"id-72","text":"[34] R. Nugari, ","element":"span"},{"style":{"height":16},"width":1170.52,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2001.03148/images/31-1.png","element":"img","alt":" Further remarks on the Nemitskii operator in Hölder spaces","inline":true},{"text":", Comment. Math. Univ. Carolin. 34 (1993) pp. 89–95.","element":"span"}],[{"id":"id-65","text":"[35] J. M. Peng, ","element":"span"},{"text":"A smoothing function and its applications","element":"span"},{"text":", in Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods, M. Fukushima and L. Qi, eds., Kluwer, Dordrecht, 1998, pp. 
293–316.","element":"span"}],[{"id":"id-66","text":"[36] J. Peng and Z. Lin, ","element":"span"},{"text":"A non-interior continuation method for generalized linear complementarity problems","element":"span"},{"text":", Math. Program., 86 (1999), pp. 533–563.","element":"span"}],[{"id":"id-93","text":"[37] R. A. Poliquin and R. T. Rockafellar, ","element":"span"},{"text":"Proto-derivative formulas for basic subgradient mappings in mathematical programming","element":"span"},{"text":", Set-Valued Anal., 2 (1994), pp. 275–290.","element":"span"}],[{"id":"id-4","text":"[38] R. T. Rockafellar, ","element":"span"},{"text":"Convex Analysis","element":"span"},{"text":", Princeton University Press, Princeton, NJ, 1970.","element":"span"}],[{"text":"[39] R. S. Sutton and A. G. Barto, ","element":"span"},{"text":"Reinforcement Learning: An Introduction","element":"span"},{"text":", MIT Press, ","element":"span"},{"id":"id-28","text":"Cambridge, MA, 1998.","element":"span"}],[{"text":"[40] I. Smears and E. Süli, ","element":"span"},{"text":"Discontinuous Galerkin finite element approximation of Hamilton-Jacobi-Bellman equations with Cordes coefficients","element":"span"},{"text":", SIAM J. Numer. Anal., 52 (2014), pp. 993–1016.","element":"span"}],[{"id":"id-39","text":"[41] I. Smears and E. Süli, ","element":"span"},{"text":"Discontinuous Galerkin finite element methods for time-dependent ","element":"span"},{"id":"id-9","text":"Hamilton-Jacobi-Bellman equations with Cordes coefficients","element":"span"},{"text":", Numer. Math., (2015), pp. 1–36.","element":"span"}],[{"text":"[42] H. Wang, Z. T. Zariphopoulou, and X. Zhou, ","element":"span"},{"text":"Exploration versus exploitation in reinforcement ","element":"span"},{"id":"id-10","text":"learning: a stochastic control approach","element":"span"},{"text":", J. Mach. Learn. Res., 21 (2020), pp. 1–34.","element":"span"}],[{"text":"[43] H. Wang and X. 
Zhou, ","element":"span"},{"text":"Continuous-time mean-variance portfolio selection: A reinforcement ","element":"span"},{"id":"id-33","text":"learning framework","element":"span"},{"text":", Math. Finance, 30 (2020), pp. 1273–1308.","element":"span"}],[{"text":"[44] J. Yong and X. Zhou, ","element":"span"},{"text":"Stochastic Controls: ","element":"span"},{"text":"Hamiltonian Systems and HJB Equations","element":"span"},{"text":", ","element":"span"},{"id":"id-13","text":"Springer, New York, 1999.","element":"span"}],[{"text":"[45] I. Zang, ","element":"span"},{"text":"A smoothing-out technique for min-max optimization","element":"span"},{"text":", Math. Program., 19 (1980), ","element":"span"},{"id":"id-5","text":"pp. 61–77.","element":"span"}],[{"text":"[46] B. D. Ziebart, A. L. Maas, J. A. Bagnell, and A. K. Dey, ","element":"span"},{"text":"Maximum entropy inverse reinforcement learning","element":"span"},{"text":", In AAAI, volume 8, pp. 1433–1438. Chicago, IL, USA, 2008.","element":"span"}]]}],"_version":"3.3.2"},"paperNode":"$28:props:children:props:children:0:props:product"}]]