A LINEAR FINITE-DIFFERENCE SCHEME FOR APPROXIMATING RANDERS DISTANCES ON CARTESIAN GRIDS

Abstract. Randers distances are an asymmetric generalization of Riemannian distances, and arise in optimal control problems subject to a drift term, among other applications. We show that the Randers eikonal equation can be approximated by a logarithmic transformation of an anisotropic second order linear equation, generalizing Varadhan's formula for Riemannian manifolds. Based on this observation, we establish the convergence of a numerical method for computing Randers distances, from point sources or from a domain's boundary, on Cartesian grids of dimension 2 and 3, which is consistent at order 2/3, and uses tools from low-dimensional algorithmic geometry for best efficiency. We also propose a numerical method for optimal transport problems whose cost is a Randers distance, exploiting the linear structure of our discretization and generalizing previous works in the Riemannian case. Numerical experiments illustrate our results.


Introduction
A Randers metric is the sum of a Riemannian metric and of an anti-symmetric perturbation, suitably bounded and defined by a linear form. By construction, a Randers metric is in general non-symmetric, and so is the associated path-length distance, see Remark 1.3 on terminology. Such metrics can account, in a very simple manner, for the fact that moving a vehicle uphill, or advancing a boat against water currents, costs more than the opposite operation. The asymmetry embedded in Randers metrics opens up numerous applications which cannot be addressed with the simpler Riemannian metrics, ranging from general relativity [46] to image segmentation [18], through quantum vortices [1] and path curvature penalization [14], see Remark 1.1.
In this paper, we present a numerical scheme for computing Randers distances by solving a linear second order Partial Differential Equation (PDE). Our approach is based on a generalization of Varadhan's formula [52], which is commonly used to compute Riemannian distances [23]. Let us emphasize that Randers distances also obey a non-linear first order PDE [3], which can be solved directly numerically [39,42], yet the linear structure of the PDE formulation considered in this paper has a number of computational advantages, including easier numerical implementation, faster computation in some cases, and smoothness of the numerical solution, see Remark 1.2. Some of our results, such as the identification of the optimal scaling of the relaxation parameter ε w.r.t. the grid scale h, and the proof of convergence in the case of point sources, are also new in the special cases of isotropic and Riemannian metrics. We present an application to numerical optimal transportation, enabled by the linear structure of the discretization, with an asymmetric cost function defined as the Randers distance between the source and target, generalizing previous works limited to Riemannian costs [25].
In order to make our statements more precise, we need to introduce some notation. Throughout this paper, Ω ⊂ R^d denotes a smooth bounded and connected open domain, equipped with a metric F : Ω × R^d → [0, ∞[, whose explicit form is discussed below (1.2). The corresponding path-length and distance are defined by
length_F(γ) := ∫_0^1 F_{γ(t)}(γ'(t)) dt, dist_F(x, y) := min_{γ ∈ Γ_x^y} length_F(γ). (1.1)
We denoted by γ an element of the collection Γ := Lip([0, 1], Ω) of locally Lipschitz paths within the domain closure, and by Γ_x^y ⊂ Γ the subset of paths from x ∈ Ω to y ∈ Ω. We assume in this paper that F has the structure of a Randers metric: there exists a field M : Ω → S_d^{++} of symmetric positive definite matrices, and a vector field ω : Ω → R^d, both having Lipschitz regularity, and such that for all x ∈ Ω and all v ∈ R^d one has
F_x(v) := |v|_{M(x)} + ⟨ω(x), v⟩, where |ω(x)|_{M(x)^{-1}} < 1. (1.2)
We denoted by ⟨·, ·⟩ the standard Euclidean scalar product, and by |v|_M := √⟨v, M v⟩ the anisotropic (but symmetric) norm on R^d defined by a symmetric positive definite matrix M. The smallness constraint (1.2, right) ensures that F_x(v) > 0 for all v ≠ 0, x ∈ Ω. Randers metrics include Riemannian metrics as a special case, when the vector field ω vanishes identically over the domain. See Figure 3 for an illustration of their unit balls, distance maps, and minimal paths.
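As a concrete illustration of (1.2), the following minimal sketch (not from the paper) evaluates a Randers norm for hypothetical constant parameters (M, ω), checks the smallness constraint, and exhibits the asymmetry F(v) ≠ F(−v) together with the triangle inequality:

```python
import numpy as np

def randers_norm(v, M, omega):
    """F(v) = |v|_M + <omega, v>, the Randers norm of (1.2)."""
    return np.sqrt(v @ M @ v) + omega @ v

# Hypothetical constant parameters; the constraint |omega|_{M^{-1}} < 1 must hold.
M = np.array([[2.0, 0.5], [0.5, 1.0]])
omega = np.array([0.3, -0.2])
assert omega @ np.linalg.solve(M, omega) < 1.0   # smallness constraint (1.2, right)

v = np.array([1.0, 2.0])
w = np.array([-1.0, 0.5])
# Asymmetry: F(v) != F(-v) in general, since the linear part is odd.
print(randers_norm(v, M, omega), randers_norm(-v, M, omega))
# Triangle inequality: F(v + w) <= F(v) + F(w), by convexity and 1-homogeneity.
assert randers_norm(v + w, M, omega) <= randers_norm(v, M, omega) + randers_norm(w, M, omega) + 1e-12
```

The asymmetry is what encodes the drift: moving along v and along −v have different costs.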
Our approach to the computation of Randers distances goes through the solution to a linear second order PDE, depending on a small parameter ε > 0, and some boundary data g ∈ C^0(∂Ω, R):
u_ε + 2ε⟨b, ∇u_ε⟩ − ε² Tr(A_b ∇²u_ε) = 0 in Ω, u_ε = exp(−g/ε) on ∂Ω, (1.3)
where A_b is a field of symmetric positive definite matrices, and b is a vector field, depending in a simple algebraic manner on the Randers metric parameters M and ω, see (2.2) and (2.3). In the Riemannian special case one has A_b = M^{-1} and b = ω = 0, consistently with [52]. We establish in Theorem 2.12, following [4], that for all x ∈ Ω the limit
u(x) := lim_{ε→0} −ε ln u_ε(x) exists and satisfies u(x) = min_{p∈∂Ω} {g(p) + dist_F(p, x)}. (1.4)
In other words, u is the Randers distance from the boundary ∂Ω, with initial time penalty g, see Section 4 for the case of point sources. Equation (1.4) is a generalization to Randers metrics of Varadhan's formula [52], and uses a logarithmic change of variables often referred to as the Hopf-Cole transformation [31], see Section 2.4 for more discussion. Note that one often considers the opposite problem, of reaching a boundary point p ∈ ∂Ω from x, which is equivalent up to replacing the vector field ω with its opposite in (1.2), see Definition 2.3 and the discussion below. The distance map u also obeys the first order non-linear Hamilton-Jacobi-Bellman equation
|∇u − ω|_{M^{-1}} = 1 in Ω, u = g on ∂Ω, (1.5)
in the sense of viscosity solutions (possibly with discontinuous boundary conditions) [3], which is numerically tractable [39,42] as well. The point of this paper is however to study the linear approach (1.3), which has a number of advantages from the computational point of view, see Remark 1.2.
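The limit (1.4) can be observed numerically in the simplest one-dimensional Euclidean setting (M = 1, ω = 0, hence A_b = 1, b = 0, and the distance to the boundary of Ω = (0, 1) is min(x, 1 − x)). The following sketch, with hypothetical values of ε and h, solves the obvious tridiagonal discretization of (1.3) with g = 0 and applies the logarithmic transformation:

```python
import numpy as np

# Solve (1.3) in 1D on Omega = (0, 1) with A_b = 1, b = 0, g = 0:
#   u_eps - eps^2 u_eps'' = 0 in (0, 1),  u_eps = exp(-g/eps) = 1 on {0, 1}.
eps, h, n = 0.05, 0.005, 199
x = np.linspace(h, 1 - h, n)                 # interior grid points
a = eps**2 / h**2
# Tridiagonal system: u_i - eps^2 (u_{i+1} - 2 u_i + u_{i-1}) / h^2 = 0.
L = (1 + 2*a) * np.eye(n) - a * (np.eye(n, k=1) + np.eye(n, k=-1))
rhs = np.zeros(n)
rhs[0] += a * 1.0                            # boundary value u_eps(0) = 1
rhs[-1] += a * 1.0                           # boundary value u_eps(1) = 1
u_eps = np.linalg.solve(L, rhs)
u = -eps * np.log(u_eps)                     # logarithmic transformation, cf. (1.4)
dist = np.minimum(x, 1 - x)                  # exact distance to the boundary
err = np.abs(u - dist).max()
assert err < 2 * eps                         # O(eps) error, largest near x = 1/2
```

The error concentrates near the "cut locus" x = 1/2, where it behaves like ε ln 2, consistently with the smoothing effect of the relaxation.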
For that purpose we present a finite difference discretization of (1.3) on the Cartesian grid Ω_h := Ω ∩ hZ^d, of dimension d ∈ {2, 3}, denoting by h > 0 the grid scale, and reading as follows:
u(x) + 2ε Σ_{1≤i≤I} ρ_i(x) ⟨e_i(x), A_b(x)^{-1} b(x)⟩ δ_h^{e_i(x)} u(x) − ε² Σ_{1≤i≤I} ρ_i(x) Δ_h^{e_i(x)} u(x) = 0, for all x ∈ Ω_h, (1.6)
where δ_h^e and Δ_h^e denote the standard centered and second order finite differences (3.2), modified close to ∂Ω to account for the Dirichlet boundary conditions, see (3.4) and (3.3). We denoted by ρ_i(x) ≥ 0 and e_i(x) ∈ Z^d, 1 ≤ i ≤ I = d(d+1)/2, the weights and offsets of Selling's decomposition [20,47] of the matrix A_b(x), a tool from lattice geometry which is convenient for the design of anisotropic finite difference schemes [10,29,41,42] in dimension d ∈ {2, 3}, see Appendix B. Denoting by u_ε^h the solution of (1.6), we prove in Theorem 3.19 that −ε ln u_ε^h → u as (ε, h/ε) → 0. The case of point sources requires the additional assumption ε ln h → 0, and its proof involves fine properties of Selling's decomposition, see Theorem 4.1. The optimal consistency order is achieved when ε = h^{2/3}. We also propose a numerical method for the optimal transport problem
min_{π ∈ Π(µ,ν)} ∫_{Ω×Ω} c(x, y) dπ(x, y), with cost c(x, y) := dist_F(x, y), (1.7)
where µ and ν are given probability measures on Ω, and Π(µ, ν) is the set of probability measures on Ω × Ω whose first and second marginals coincide respectively with µ and ν. The proposed implementation relies on Sinkhorn's matrix scaling algorithm [49], and the linear structure of (1.3). We emphasize that the matrix of costs (c(x, y))_{x,y ∈ Ω_h} is never numerically constructed, and would not fit in computer memory; instead, the adequate matrix-vector products are evaluated by solving finite difference equations similar to (1.6), in an efficient manner thanks to a preliminary sparse matrix factorization. Let us acknowledge here that, in contrast with the Riemannian case [50], our approach does not extend to the quadratic cost c(x, y) = dist_F(x, y)².
Indeed, the natural route to address the quadratic cost is through short time asymptotic estimates of the diffusion equation [50], which is a linear PDE in the Riemannian setting but a non-linear PDE in the Randers setting [44], see Remark 4.5. In contrast, the linear cost (1.7, right) is addressed using the Poisson equation (1.3), which is linear in both the Riemannian and the Randers setting. There are alternative numerical approaches to Monge's optimal transport problem, such as [7], which uses a second order cone program, see Remark 5.2.
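Sinkhorn's matrix scaling algorithm [49], referenced above, can be illustrated on a deliberately tiny explicit cost matrix. This is a generic sketch, not the paper's implementation (which never assembles the cost matrix, and instead applies the kernel through finite difference linear solves); the cost, marginals, and parameter values below are hypothetical:

```python
import numpy as np

# Entropic optimal transport by Sinkhorn matrix scaling, on a toy cost matrix.
rng = np.random.default_rng(0)
n = 5
mu = np.full(n, 1.0 / n)                      # source marginal
nu = rng.random(n); nu /= nu.sum()            # target marginal
C = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) / n  # toy linear cost
epsilon = 0.5                                 # entropic regularization
K = np.exp(-C / epsilon)                      # Gibbs kernel
u = np.ones(n)
for _ in range(1000):                         # alternate marginal rescalings
    v = nu / (K.T @ u)
    u = mu / (K @ v)
pi = u[:, None] * K * v[None, :]              # approximate optimal coupling
assert np.allclose(pi.sum(axis=1), mu, atol=1e-10)
assert np.allclose(pi.sum(axis=0), nu, atol=1e-10)
```

In the method of this paper the products K @ v and K.T @ u are replaced by solutions of finite difference systems akin to (1.6), reusing a sparse factorization across iterations.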

Contributions
We establish that the solution to a linear second order PDE converges, as a relaxation parameter ε → 0 and after a logarithmic transformation, to the Randers distance from the domain boundary. We propose a finite difference discretization of that linear PDE, on a Cartesian grid of scale h, and establish convergence of the numerical solutions as ε → 0 and h/ε → 0, with optimal consistency when ε = h^{2/3}. We extend the approach to the case of point sources, under the additional condition ε ln h → 0. Such a convergence analysis, for geodesic distance computation methods based on the heat or Poisson PDEs, was presented as an open problem in earlier works [24], which were in addition limited to the special case of Riemannian metrics. We propose a computational method for optimal transport with a Randers distance as cost. Numerical experiments illustrate our results.

Outline
We recall in Section 2 the definition of Randers distances and introduce an extension of Varadhan's formula to Randers manifolds. We describe the coefficients of (1.3) in terms of the Randers metric (1.2), and prove the convergence result (1.4).

Conventions and notations
We denote by |·| the Euclidean norm on R^d, and by S_d, S_d^+, and S_d^{++} the sets of symmetric, symmetric positive semidefinite, and symmetric positive definite matrices of size d × d respectively. Consistently with this notation, we let R_+ := [0, ∞[ and R_{++} := ]0, ∞[. From now on, we consider an open, bounded, connected, and nonempty domain Ω ⊂ R^d with a W^{3,∞} boundary. The unknowns to the partial differential equations, and to their discretization schemes, are distinguished by typography: u for the linear second order PDEs (1.3) or numerical scheme (1.6) and variants, and u for the non-linear PDE (1.5) and related.

Elements of Randers geometry
A Randers metric is defined as the sum of a Riemannian metric, and of a suitably bounded linear term (1.2). We present in Section 2.1 these geometrical objects in more detail, making connections with Zermelo's navigation problem in Section 2.2, see [16] for a review. The eikonal equation (1.5) is discussed in Section 2.3, and its linear variant (1.3) in Section 2.4. We establish in Theorem 2.12 the existence of a solution u ε to the linear PDE (1.3), and the convergence of u ε := −ε ln u ε to the value function of the arrival time problem (1.4) as the relaxation parameter ε > 0 vanishes. The proof, based on the theory of viscosity solutions to degenerate elliptic PDEs, is postponed to Appendix A.
Before specializing to the case of Randers geometry, we briefly recall here the generic or Finslerian definition of a non-symmetric norm, dual-norm, metric, and path-length distance, and some of their elementary properties.
Definition 2.1. A function F : R^d → R_+ is a norm iff it is convex, positively 1-homogeneous, and vanishes only at the origin. The dual norm F* : R^d → R_+ is defined for all v ∈ R^d by F*(v) := max{⟨v, w⟩; w ∈ R^d, F(w) ≤ 1}. (2.1) Note that Definition 2.1 implies the triangle inequality, since (1/2)F(v + w) = F((v + w)/2) ≤ (1/2)(F(v) + F(w)) for any v, w ∈ R^d, using successively the positive 1-homogeneity and the convexity of F. The dual norm is equivalently characterized as F*(v) := max{⟨v, w⟩/F(w); |w| = 1}, by homogeneity of F. Conventionally, Definition 2.1 defines a quasi-norm, whereas a norm is subject to the additional symmetry axiom F(v) = F(−v) for all v ∈ R^d. However the prefix "quasi" before norms, metrics and distances is dropped in this paper for readability, as already mentioned in Remark 1.3. The following facts, stated without proof, are standard results of convex analysis and Finsler geometry. Lemma 2.2 (Norm duality). Any norm F on R^d is Lipschitz continuous on R^d, and as a result the extremum in (2.1) is indeed attained, for any v ∈ R^d. The dual norm F* is also a norm, and furthermore one has F** = F identically on R^d.
The dual metric F * is defined pointwise from the dual norms. The related path length and distance are defined from (1.1) and denoted length F and dist F .
Let us emphasize that dist_F(x, y) ≠ dist_F(y, x) in general, for x, y ∈ Ω, since norms and metrics are not assumed here to be symmetric. However the triangle inequality dist_F(x, z) ≤ dist_F(x, y) + dist_F(y, z), and the separation axiom (dist_F(x, y) = 0 iff x = y), hold for all x, y, z ∈ Ω under the conditions of Lemma 2.4 below. In the special case where F_x = F is a constant metric, and [x, y] ⊂ Ω, one has dist_F(x, y) = F(y − x).

Lemma 2.4 (Path-length distance).
Let Ω ⊂ R d be a bounded connected domain with smooth boundary and equipped with a metric F. Then the extremum (1.1) defining dist F (x, y) is attained, for any x, y ∈ Ω, and defines a distance over Ω. Furthermore there exists 0 < c ≤ C such that c|x − y| ≤ dist F (x, y) ≤ C|x − y| for all x, y ∈ Ω.

Algebraic structure of Randers metrics
Randers norms are defined by analogy with Randers metrics (1.2), as the sum of a symmetric part defined from a symmetric positive definite matrix, and of an anti-symmetric linear part, subject to a compatibility condition.

Definition 2.5. A Randers norm of parameters M ∈ S_d^{++} and ω ∈ R^d, where |ω|_{M^{-1}} < 1, is the norm defined for all v ∈ R^d by F(v) := |v|_M + ⟨ω, v⟩.
The dual to a Randers norm also is a Randers norm, as shown in the following lemma, whose proof follows from the explicit expression established in ([39], Prop. 4.1) and the formula for the inverse of a 2 × 2 block matrix [37]. Lemma 2.6 (Randers dual norm [39]). The dual to a Randers norm F of parameters (M, ω) is also a Randers norm, of parameters (A, b) characterized by the following relation: denoting α := 1 − ⟨ω, M^{-1}ω⟩ > 0,
( αA b ; b^T α^{-1} ) = ( M ω ; ω^T 1 )^{-1}, equivalently A = α^{-1}(M − ωω^T)^{-1} and b = −α^{-1}M^{-1}ω. (2.2)
In the special case where ω = 0, one obtains A = M^{-1}, b = 0, and α = 1, recovering the well known fact that the dual to a Riemannian norm is also a Riemannian norm, defined by the inverse symmetric matrix. The positive definiteness, hence the invertibility, of the block matrix in (2.2, rhs) follows from the equivalences (2.4) below applied to (M, ω). The duality formula (2.2) is only the first of a family of algebraic identities associated with Randers norms, presented in Lemma 2.7 below, and used to reformulate the PDEs (1.3) and (1.5). For that purpose, we need to introduce some notation. For any A ∈ S_d, and any b ∈ R^d, we let
A_b := A − bb^T, and likewise M_ω := M − ωω^T. (2.3)
The Schur complement formula yields the following positive-definiteness equivalences:
A ∈ S_d^{++} and |b|_{A^{-1}} < 1 ⟺ A_b ∈ S_d^{++} ⟺ ( A b ; b^T 1 ) ∈ S_{d+1}^{++}. (2.4)
If M : Ω → S_d^{++} and ω : Ω → R^d are Lipschitz fields obeying |ω|_{M^{-1}} < 1 pointwise on Ω (which is compact by assumption), then the fields A : Ω → S_d^{++} and b : Ω → R^d defined by Lemma 2.6 as the dual Randers parameters are also Lipschitz, since matrix inversion is differentiable, and obey the equivalent properties (2.4) pointwise on Ω. The following lemma provides several equivalent characterizations of the unit ball associated with a Randers norm, and ends this subsection.

Lemma 2.7. Let F be a Randers norm of parameters (M, ω), with dual norm F* of parameters (A, b). Then for all v, w ∈ R^d one has
F(v) ≤ 1 ⟺ |v|²_{M_ω} + 2⟨ω, v⟩ ≤ 1 ⟺ |v − b|_{A^{-1}} ≤ 1, (2.5)
F*(w) ≤ 1 ⟺ |w|²_{A_b} + 2⟨b, w⟩ ≤ 1 ⟺ |w − ω|_{M^{-1}} ≤ 1. (2.6)
Proof. Note that the second line can be deduced from the first one, by exchanging the roles of the Randers norm and of its dual norm. The positive definiteness of A_b and M_ω follows from (2.4) and Definition 2.5. Under the assumptions of the lemma, one has the equivalences F(v) ≤ 1 ⟺ |v|_M ≤ 1 − ⟨ω, v⟩ ⟺ |v|²_{M_ω} + 2⟨ω, v⟩ ≤ 1, and likewise with strict inequalities, which implies (2.5, left equivalence). The only difficulty lies in the reverse implication of the second equivalence: we must exclude the case where |v|_M ≤ ⟨ω, v⟩ − 1, and indeed this is impossible, since the Cauchy-Schwarz inequality yields ⟨ω, v⟩ ≤ |ω|_{M^{-1}}|v|_M < |v|_M + 1. Denoting by F the Randers norm of parameters M, ω, and by F* the dual norm, one has F*(w) ≤ 1 ⟺ ⟨v, w⟩ ≤ 1 whenever F(v) ≤ 1. A similar equivalence can be obtained with strict inequalities for any w ≠ 0, which concludes the proof of (2.5, right equivalence) and of this lemma.
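Lemma 2.6 can be cross-checked numerically against the definition (2.1) of the dual norm. The sketch below computes candidate dual parameters using the explicit formulas A = α^{-1}(M − ωω^T)^{-1} and b = −α^{-1}M^{-1}ω (our reading of (2.2); treat them as an assumption), and compares |v|_A + ⟨b, v⟩ with a brute-force maximization of ⟨v, w⟩/F(w) over sampled unit directions w:

```python
import numpy as np

def dual_randers_params(M, omega):
    """Candidate dual parameters (A, b) of F = |.|_M + <omega, .>,
    with alpha := 1 - <omega, M^{-1} omega> (our reading of Lemma 2.6)."""
    Minv_om = np.linalg.solve(M, omega)
    alpha = 1.0 - omega @ Minv_om
    A = np.linalg.inv(M - np.outer(omega, omega)) / alpha
    b = -Minv_om / alpha
    return A, b

M = np.array([[2.0, 0.5], [0.5, 1.0]])       # hypothetical parameters
omega = np.array([0.3, -0.2])
A, b = dual_randers_params(M, omega)

# Brute-force dual norm: F*(v) = max over |w| = 1 of <v, w> / F(w), cf. (2.1).
th = np.linspace(0, 2*np.pi, 200000, endpoint=False)
W = np.stack([np.cos(th), np.sin(th)])       # unit directions, shape (2, N)
F_W = np.sqrt(np.einsum('in,ij,jn->n', W, M, W)) + omega @ W
v = np.array([1.0, -0.7])
Fstar_sampled = (v @ W / F_W).max()
Fstar_formula = np.sqrt(v @ A @ v) + b @ v
assert abs(Fstar_sampled - Fstar_formula) < 1e-6
```

The agreement holds for any v, confirming that the dual of a Randers norm is again a Randers norm.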

Zermelo's navigation problem
Zermelo [2] considers a vehicle able to move at speed at most c(x) ∈ R_{++} relative to a given medium, which itself is subject to a drift η(x) ∈ R^d, where x ∈ Ω is the position. Typically, the vehicle is described as a boat subject to water currents, or as a flying object subject to air currents. The set of admissible absolute velocities v at the point x is thus characterized by the following relation:
|v − η(x)| ≤ c(x). (2.7)
Given two endpoints x, y ∈ Ω, Zermelo's navigation problem asks for the smallest time T = T_c^η(x, y) ≥ 0 such that there exists γ ∈ Lip([0, T], Ω) obeying |γ'(t) − η(γ(t))| ≤ c(γ(t)) for a.e. t ∈ [0, T], and γ(0) = x, γ(T) = y. In other words, T_c^η(x, y) is the minimal time from x to y subject to the velocity constraint (2.7). The vehicle described by Zermelo's problem is locally controllable at x ∈ Ω iff |η(x)| < c(x), in other words iff the drift velocity norm is smaller than the maximum relative vehicle speed. The following classical result [2] shows that, under this assumption, Zermelo's problem can be reformulated as a Randers minimal path problem.

Proposition 2.8. Let c : Ω → R and η : Ω → R^d be continuous and obey c > 0 and |η| < c pointwise on Ω. Consider the Randers metric F* of parameters A = c²I_d and b = η on Ω, as well as its dual F** = F. Then T_c^η(x, y) = dist_F(x, y) for all x, y ∈ Ω.

Proof. Let M : Ω → S_d^{++} and ω : Ω → R^d be the parameters of the Randers metric F. The distance dist_F(x, y) is the smallest time T for which there exists a path γ ∈ Lip([0, T], Ω) obeying 1 ≥ F_{γ(t)}(γ'(t)) := |γ'(t)|_{M(γ(t))} + ⟨ω(γ(t)), γ'(t)⟩ for a.e. t ∈ [0, T], and γ(0) = x, γ(T) = y. Indeed, this follows from the definition (1.1) and by reparametrization of any Lipschitz path at unit speed w.r.t. the metric F. From this point, the announced result follows from the equivalence of 1 ≥ F_x(v) := |v|_{M(x)} + ⟨ω(x), v⟩ with (2.7), established in (2.5).
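Proposition 2.8 can be checked numerically: applying the dual-parameter map of Lemma 2.6 to (A, b) = (c²I_d, η) yields (M, ω), since F** = F, and every velocity on the Zermelo boundary |v − η| = c should then satisfy F(v) = 1. The constants c, η below are hypothetical, and the explicit dual formulas are our reading of (2.2):

```python
import numpy as np

def dual_randers_params(P, p):
    """Dual Randers parameters (our reading of Lemma 2.6); applying the map
    to (A, b) recovers (M, omega), since F** = F."""
    Pinv_p = np.linalg.solve(P, p)
    alpha = 1.0 - p @ Pinv_p
    return np.linalg.inv(P - np.outer(p, p)) / alpha, -Pinv_p / alpha

c, eta = 1.0, np.array([0.4, 0.2])        # speed and drift, with |eta| < c
A, b = c**2 * np.eye(2), eta              # parameters of F* (Proposition 2.8)
M, omega = dual_randers_params(A, b)      # parameters of F = F**

# Points on the Zermelo boundary |v - eta| = c must satisfy F(v) = 1.
th = np.linspace(0, 2*np.pi, 1000, endpoint=False)
V = eta[:, None] + c * np.stack([np.cos(th), np.sin(th)])
F_V = np.sqrt(np.einsum('in,ij,jn->n', V, M, V)) + omega @ V
assert np.abs(F_V - 1.0).max() < 1e-10
```

The Randers unit ball is thus exactly the ball of radius c centered at the drift η, in agreement with (2.7).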

The Eikonal equation
Consider a domain Ω, equipped with a Randers metric F with Lipschitz coefficients on Ω, and a penalty function g ∈ C^0(∂Ω, R). We are interested in the following value function u : Ω → R, corresponding to the minimal time to reach x ∈ Ω from a boundary point p ∈ ∂Ω, with initial time penalty g(p):
u(x) := min_{p∈∂Ω} {g(p) + dist_F(p, x)}. (2.8)
We prove in Theorem A.9 that (2.8) is a viscosity solution, see Definition A.2, to the first order non-linear PDE
F*_x(∇u(x)) = 1 for x ∈ Ω, u = g on ∂Ω. (2.9)
The boundary condition u = g on ∂Ω is satisfied in a strong sense if g(x) ≤ g(p) + dist_F(p, x) for all x, p ∈ ∂Ω, but in the weak sense of Definition A.2 otherwise. The comparison principle Theorem A.8 implies that the viscosity solution is uniquely determined in Ω.
Corollary 2.9. If F is a Randers metric of parameters M, ω, and dual parameters A, b, then the eikonal PDE (2.9, left) admits the following three equivalent formulations in Ω, in the sense of viscosity solutions:
|∇u|_A + ⟨b, ∇u⟩ = 1, |∇u|²_{A_b} + 2⟨b, ∇u⟩ = 1, |∇u − ω|_{M^{-1}} = 1. (2.10)
Proof. The equation F*_x(∇u(x)) = 1 is a shorthand for (2.10, left) at x ∈ Ω, see Definition 2.5 of a Randers norm. It is equivalent to (2.10, center and right) by (2.6).
In applications, computing the value function (2.8) is often only a means to obtain the globally optimal path γ from ∂Ω to an arbitrary point x* ∈ Ω. This path can be extracted by solving, backwards in time, the following Ordinary Differential Equation (ODE), see e.g. ([27], Appendix C):
γ'(t) = V(γ(t)), where V(x) := dF*_x(∇u(x)) (2.11)
for all x ∈ Ω. The ODE needs to be solved on the interval [0, T] where T = u(x*), with terminal condition γ(T) = x*. By dF*_x(w) we denote the derivative of F*_x w.r.t. the variable w, where x ∈ Ω is fixed. Corollary 2.10. The following expressions are positively proportional to the geodesic flow V defined by (2.11, right), at all points where u is differentiable:
dF*_x(∇u) ∝ A_b ∇u + b ∝ M^{-1}(∇u − ω). (2.12)
Proof. Fix a point x ∈ Ω where u is differentiable, and denote v := ∇u(x). Introduce the Randers norm F* = F*_x, whose parameters are denoted A ∈ S_d^{++} and b ∈ R^d, in such a way that F*(v) = 1 by (2.9). The three expressions in (2.6), with their right-hand side subtracted, vanish at v, and their respective gradients w.r.t. v are g₁ := Av/|v|_A + b, g₂ := 2(A_b v + b), g₃ := 2M^{-1}(v − ω). Since g₁, g₂ and g₃ are orthogonal to the same level set, and point outward of it, they are positively proportional. The result follows.
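Corollary 2.10 admits a simple numerical sanity check: at a point v with F*(v) = 1, the candidate directions Av/|v|_A + b, A_b v + b and M^{-1}(v − ω) (our reading of (2.12); treat the explicit forms as an assumption) should be pairwise parallel, with positive mutual inner products. The parameters below are hypothetical:

```python
import numpy as np

M = np.array([[1.5, 0.2], [0.2, 1.0]])
omega = np.array([0.25, -0.1])
Minv_om = np.linalg.solve(M, omega)
alpha = 1.0 - omega @ Minv_om
A = np.linalg.inv(M - np.outer(omega, omega)) / alpha   # dual parameters,
b = -Minv_om / alpha                                    # our reading of (2.2)
A_b = A - np.outer(b, b)

Fstar = lambda v: np.sqrt(v @ A @ v) + b @ v
v = np.array([0.7, 0.4])
v = v / Fstar(v)                                        # now F*(v) = 1

g1 = A @ v / np.sqrt(v @ A @ v) + b       # dF*(v)
g2 = A_b @ v + b                          # from the quadratic reformulation
g3 = np.linalg.solve(M, v - omega)        # M^{-1} (v - omega)
cross = lambda p, q: p[0]*q[1] - p[1]*q[0]
assert abs(cross(g1, g2)) < 1e-10 and abs(cross(g1, g3)) < 1e-10
assert g1 @ g2 > 0 and g1 @ g3 > 0        # positively proportional
```

The last two expressions avoid differentiating the non-smooth norm F*, which is convenient for geodesic backtracking.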

Varadhan's formula
Varadhan's formula is based on a logarithmic transformation of the unknown [52], also known as the Hopf-Cole transformation 2 see [31] and Remark 2.14, which turns the linear PDE (2.13) into the non-linear PDE (2.14). The point of this transformation is that, with a proper scaling of the unknown and of the PDE coefficients, a relaxation parameter ε > 0 is eliminated from the boundary conditions and from all the PDE coefficients except one, of principal order. Lemma 2.11 (Logarithmic transformation). Let ε > 0, and let u ε be a viscosity solution to where Ω ⊂ R d is a smooth bounded domain, A b : Ω → S ++ d and b : Ω → R d are Lipschitz, and ε > 0. Then u ε := −ε ln u ε is a viscosity solution to the PDE (2.14) Lemma 2.11 is an immediate consequence of Corollary A.5 established in Appendix A. For later convenience, we introduce the following PDE operators on the domain Ω 15) and observe that, formally, one has S ε u = −e u ε L ε (e − u ε ). The following result relies on the framework of viscosity solutions to take the limit ε → 0 in S ε , letting the second order "viscous" term −ε Tr(A b ∇ 2 u) vanish, and recovering in the limit a first order non-linear equation equivalent to the Randers eikonal equation, see Corollary 2.9.
Theorem 2.12 (Vanishing viscosity limit). The PDE (2.13) admits a unique viscosity solution in Ω. In addition u ε := −ε ln u ε converges locally uniformly in Ω to the value function (2.8), associated with the Randers metric F whose dual metric F * has parameters (A, b).
The elements of proof relying on the concept of viscosity solutions are postponed to Appendix A. In particular, uniqueness of the solution to (2.13) follows from the comparison principle Proposition A.7, see [4]. Convergence as ε → 0 is established in Theorem A.12. We limit our attention here to the existence of a solution to (2.13), which is based on the interpretation of u_ε as an expectation of a cost associated with a stochastic process, see Remark 2.13 for an alternative proof. Fix ε > 0, and introduce the stochastic process (X_t^{x,ε})_{t≥0} defined as
dX_t^{x,ε} = −2ε b(X_t^{x,ε}) dt + ε√2 A_b^{1/2}(X_t^{x,ε}) dW_t, X_0^{x,ε} = x, (2.16)
where (W_t)_{t≥0} is a d-dimensional Wiener process. Define also the exit time τ^{x,ε} by
τ^{x,ε} := inf{t ≥ 0; X_t^{x,ε} ∉ Ω}. (2.17)
Since Ω is bounded, and A_b is positive definite, the exit time τ^{x,ε} is almost surely finite. Thus X_t^{x,ε} is a Brownian motion starting at x, with drift −2εb, and whose fluctuations are scaled by ε√(2A_b). According to the Feynman-Kac formula, see Theorem A.11 in Appendix A, the following expectation is the unique solution to the PDE (2.13):
u_ε(x) = E[exp(−τ^{x,ε} − g(X_{τ^{x,ε}}^{x,ε})/ε)]. (2.18)
In particular, u_ε is positive. In the framework of the stochastic approach, Theorem 2.12 expresses the convergence as ε → 0 of the following soft-minimum:
u_ε(x) = −ε ln E[exp(−(g(X_{τ^{x,ε}}^{x,ε}) + ε τ^{x,ε})/ε)]. (2.19)
Remark 2.13 (Divergence form operator). One may replace in (2.13) the operator Tr(A_b ∇²u) with the divergence form variant div(A_b ∇u) = Tr(A_b ∇²u) + ⟨div(A_b), ∇u⟩, where div(A_b) denotes column-wise divergence, assuming that A_b is continuously differentiable. Indeed, this amounts to replacing in (2.13) the vector field b defining the first order term with b_ε := b − (ε/2) div(A_b). This small perturbation is easily handled in the setting of viscosity solutions, and the same limit (1.4) is obtained as ε → 0. The divergence form Laplacian is often preferred in applications [23] since it is simpler to implement numerically on some geometries, such as triangulated surfaces using finite elements.
We discuss in the following how the divergence form variant leads, under suitable assumptions, to an alternative and perhaps more elementary proof of the well-posedness of (2.13). On the other hand, the proof based on the stochastic representation formula (2.18) benefits from an illuminating interpretation in terms of random paths which concentrate along geodesics as ε → 0, and fits in the framework of viscosity solutions, which is ideally suited to study the vanishing viscosity limit.
Remark 2.14 (Burgers equation and the Hopf-Cole transformation). The non-linear PDE of Burgers arises in various contexts, ranging from fluid mechanics to traffic flow, and is also a model equation to study the interaction of the advection and diffusion operators, and the formation of shock waves. The d-dimensional instantiation of Burgers PDE asks for a vector field v : R_+ × Ω → R^d, defined on a domain Ω ⊂ R^d, such that
∂_t v + (v · ∇)v = ν∆v, (2.20)
with suitable initial and boundary conditions, and where ν ≥ 0 is known as the inverse Reynolds number. If ν = 0 then discontinuous solutions can be obtained in finite time from smooth initial conditions, and entropy conditions must be imposed to single out the correct physical solution [26]. The Hopf-Cole linearization of Burgers equation assumes that (i) ν > 0, and (ii-a) the dimension is d = 1 [31], or (ii-b) the flow v is irrotational [15] (this property is formally conserved over time by solutions of Burgers equation (2.20)). In that case there exists a potential u : R_+ × Ω → R such that v = ∇u, obeying the following time-dependent Hamilton-Jacobi equation:
∂_t u + (1/2)|∇u|² = ν∆u. (2.21)
This evolution equation is closely related to the static PDE (2.14) of interest in this paper. The PDE (2.21) is formally equivalent to the linear heat PDE, similarly to Lemma 2.11 and under the assumption that ν > 0, using the following logarithmic change of variables:
u = −2ν ln φ, where ∂_t φ = ν∆φ. (2.22)
Conversely, any solution to the heat equation (2.22) yields an irrotational solution to Burgers equation, thus producing a variety of explicit analytical solutions to this non-linear PDE [15]. The Hopf-Cole transformation can also be used as a numerical tool to solve (2.20), either addressing the heat equation (2.22) via the Fast Fourier transform, or using more elaborate schemes intended to achieve stability in the vanishing viscosity limit as ν → 0 [33,45].
The PDE solution obtained in the vanishing viscosity limit ν → 0 is studied in detail in the seminal paper [31], and can be characterized by entropy conditions whose analysis [26] often relies on the HJB reformulation (2.21) and on the concept of viscosity solution.
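The Hopf-Cole transformation of Remark 2.14 is easy to verify numerically in dimension d = 1: a superposition of heat kernels solves the heat equation exactly, and its logarithmic transform should satisfy the viscous Hamilton-Jacobi equation ∂_t u + |∂_x u|²/2 = ν ∂_xx u (our sign conventions; treat them as an assumption) up to finite difference error:

```python
import numpy as np

nu = 0.1  # viscosity (inverse Reynolds number)

def phi(t, x):
    """Superposition of two heat kernels: solves phi_t = nu * phi_xx exactly."""
    g = lambda a, t0: np.exp(-(x - a)**2 / (4.0 * nu * (t + t0))) / np.sqrt(t + t0)
    return g(-0.5, 1.0) + 0.7 * g(0.8, 2.0)

def u(t, x):
    """Hopf-Cole potential u = -2 nu log(phi)."""
    return -2.0 * nu * np.log(phi(t, x))

# Verify u_t + u_x^2 / 2 = nu * u_xx at a point, via centered differences.
t0, x0, d = 0.3, 0.15, 1e-4
u_t  = (u(t0 + d, x0) - u(t0 - d, x0)) / (2 * d)
u_x  = (u(t0, x0 + d) - u(t0, x0 - d)) / (2 * d)
u_xx = (u(t0, x0 + d) - 2 * u(t0, x0) + u(t0, x0 - d)) / d**2
residual = u_t + 0.5 * u_x**2 - nu * u_xx
assert abs(residual) < 1e-5
```

Since the log of a sum of Gaussians is not itself quadratic, this is a nontrivial check of the transformation, not merely of a single heat kernel.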

The numerical scheme
We present a numerical implementation of the linear second order PDE (1.3) based on discrete degenerate elliptic finite differences, on a Cartesian discretization grid. This approach is chosen for the simplicity of its implementation and of the convergence analysis. Alternative discretizations may also be considered, for instance using finite elements on triangulated manifolds, see [23] and Remark 2.13; however, we are not aware of a convergence analysis under mesh refinement in this setting.
Throughout this section, we denote by h > 0 the grid scale of the Cartesian discretization grid, which is fixed unless otherwise specified, and we define the discrete domain as
Ω_h := Ω ∩ hZ^d. (3.1)
In our application, the values of u on ∂Ω are given by the Dirichlet boundary conditions, and the numerical implementation does not treat them as unknowns. For any u : Ω_h → R, any x ∈ Ω_h and any e ∈ Z^d, we define the first order and second order centered finite difference operators as follows:
δ_h^e u(x) := (u(x + he) − u(x − he))/(2h), Δ_h^e u(x) := (u(x + he) − 2u(x) + u(x − he))/h². (3.2)
If x is adjacent to ∂Ω, then (3.2) may involve values outside the domain Ω_h, and thus be ill-defined. In order to address this issue, we consider u : Ω_h ∪ ∂Ω → R, which is also defined on the domain boundary. The following finite difference expressions make sense for arbitrary x ∈ Ω_h, e ∈ Z^d, and they reduce to (3.2) if [x − he, x + he] ⊂ Ω:
δ_h^e u(x) := (u(x + h_x^e e) − u(x − h_x^{−e} e))/(h_x^e + h_x^{−e}), (3.3)
Δ_h^e u(x) := (2/(h_x^e + h_x^{−e})) ((u(x + h_x^e e) − u(x))/h_x^e + (u(x − h_x^{−e} e) − u(x))/h_x^{−e}), (3.4)
where we denoted
h_x^e := min({h} ∪ {η > 0; x + ηe ∈ ∂Ω}). (3.5)
Note that h_x^e ∈ ]0, h] by construction. If u ∈ C⁴(Ω) then one has the consistency relations δ_h^e u(x) = ⟨∇u(x), e⟩ + O(h^r) and Δ_h^e u(x) = ⟨e, ∇²u(x) e⟩ + O(h^r), with r = 2 if [x − he, x + he] ⊂ Ω, and r = 1 otherwise. In the next proposition we obtain, by linear combination, consistent finite differences approximations of linear PDE operators of order one and two.
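The interior operators (3.2) are straightforward to implement and test. The sketch below (with a hypothetical test function and offset) checks their second order consistency by verifying that the error is divided by about 4 when h is halved:

```python
import numpy as np

def delta_h(u, x, e, h):
    """Centered first order finite difference along offset e, cf. (3.2)."""
    return (u(x + h*e) - u(x - h*e)) / (2*h)

def Delta_h(u, x, e, h):
    """Centered second order finite difference along offset e, cf. (3.2)."""
    return (u(x + h*e) - 2*u(x) + u(x - h*e)) / h**2

u = lambda p: np.sin(p[0]) * np.exp(-0.5 * p[1])           # smooth test function
x = np.array([0.4, 0.3])
e = np.array([1.0, -1.0])                                  # integer offset, as in the scheme
E = np.exp(-0.15)
grad = np.array([np.cos(0.4), -0.5*np.sin(0.4)]) * E       # exact gradient at x
hess = np.array([[-np.sin(0.4), -0.5*np.cos(0.4)],         # exact Hessian at x
                 [-0.5*np.cos(0.4), 0.25*np.sin(0.4)]]) * E

for op, exact in [(delta_h, e @ grad), (Delta_h, e @ hess @ e)]:
    e1 = abs(op(u, x, e, 1e-2) - exact)
    e2 = abs(op(u, x, e, 5e-3) - exact)
    assert 3.5 < e1 / e2 < 4.5      # O(h^2) consistency in the interior
```

Near the boundary, the truncated variants lose one order of consistency, which is harmless since they are only used on a lower-dimensional set of points.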
Proposition 3.1. Let D ∈ S_d, and let ω ∈ R^d. Consider weights ρ_i and offsets e_i ∈ Z^d, for all 1 ≤ i ≤ I, such that
D = Σ_{1≤i≤I} ρ_i e_i e_i^T. (3.6)
Then for u ∈ C⁴(Ω) and x ∈ Ω_h one has
Σ_{1≤i≤I} ρ_i ⟨e_i, ω⟩ δ_h^{e_i} u(x) = ⟨Dω, ∇u(x)⟩ + O(h^r), Σ_{1≤i≤I} ρ_i Δ_h^{e_i} u(x) = Tr(D∇²u(x)) + O(h^r), (3.7)
with r = 2 if [x − he_i, x + he_i] ⊂ Ω for all 1 ≤ i ≤ I, and r = 1 otherwise.
As an immediate application, we define a finite difference discretization L_ε^h of the linear operator L_ε defined in (2.15). For any u : Ω_h → R, extended to ∂Ω by the boundary condition u = exp(−g/ε), we let
L_ε^h u(x) := u(x) + 2ε Σ_{1≤i≤I} ρ_i(x) ⟨e_i(x), A_b(x)^{-1} b(x)⟩ δ_h^{e_i(x)} u(x) − ε² Σ_{1≤i≤I} ρ_i(x) Δ_h^{e_i(x)} u(x), (3.8)
where the weights ρ_i = ρ_i(x) and offsets e_i = e_i(x), 1 ≤ i ≤ I, provide a decomposition of the matrix A_b = A_b(x) in the sense of (3.6). Note that for the schemes (3.7) to be well-defined, it is crucial that the offsets involved in (3.6) have integer coordinates, and therefore the similar looking eigenvalue-eigenvector decomposition typically cannot be used, since it involves arbitrary unit vectors. Obtaining a suitable decomposition is thus non-trivial in general, and it is also not unique. We rely in this paper on Selling's decomposition, which is defined in dimension d ∈ {2, 3}, and has the additional benefit of producing non-negative weights (ρ_i)_{1≤i≤I}, and thus a discrete degenerate elliptic scheme, see Section 3.1 below.
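For reference, one standard implementation of Selling's decomposition in dimension d = 2 proceeds by reduction of the superbase (e_0, e_1, e_2) = ((1,0), (0,1), (−1,−1)) until all pairwise products ⟨e_i, D e_j⟩ are nonpositive, the weights being ρ_k = −⟨e_i, D e_j⟩ and the offsets the perpendiculars e_k^⊥. This sketch follows [20,47] as we understand them, for a hypothetical anisotropic matrix D:

```python
import numpy as np

def selling_2d(D, maxiter=100):
    """Selling's decomposition of a 2x2 SPD matrix D: returns weights
    rho_i >= 0 and integer offsets e_i with D = sum_i rho_i e_i e_i^T."""
    b = [np.array([1, 0]), np.array([0, 1]), np.array([-1, -1])]  # superbase
    for _ in range(maxiter):
        for (i, j, k) in [(0, 1, 2), (0, 2, 1), (1, 2, 0)]:
            if b[i] @ D @ b[j] > 1e-14:
                # Superbase update; preserves b[0] + b[1] + b[2] = 0 and
                # strictly decreases the energy sum_i |b_i|_D^2.
                b[i], b[j], b[k] = -b[i], b[j], b[i] - b[j]
                break
        else:
            break  # obtuse superbase reached: all pairwise products <= 0
    perp = lambda v: np.array([-v[1], v[0]])
    rhos = [-(b[i] @ D @ b[j]) for (i, j) in [(1, 2), (0, 2), (0, 1)]]
    offsets = [perp(b[0]), perp(b[1]), perp(b[2])]
    return rhos, offsets

D = np.array([[2.0, 1.2], [1.2, 1.0]])   # hypothetical anisotropic matrix
rhos, offsets = selling_2d(D)
rec = sum(r * np.outer(e, e) for r, e in zip(rhos, offsets))
assert all(r >= 0 for r in rhos)
assert np.allclose(rec, D)               # exact reconstruction, I = 3 = d(d+1)/2 terms
```

The nonnegativity of the weights is precisely what makes the resulting scheme discrete degenerate elliptic for h/ε small enough.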
Remark 3.2 (Approximation of the gradient, improved reconstruction, following [23]). The proof techniques used in this paper establish the uniform convergence of u_ε^h := −ε ln u_ε^h as (ε, h/ε) → 0, but unfortunately say nothing of its gradient. Therefore, the gradient reconstruction techniques presented below should only be regarded as heuristics. The geodesic backtracking method used in the numerical experiments of Section 6, which is based on (2.11) and this gradient estimation, likewise does not come with theoretical guarantees.
An approximate gradient V_ε^h : Ω_h → R^d of the solution u_ε^h of (3.8) can be estimated using (3.7, left), see (3.9). The vector field V_ε^h is meant to approximate the gradient of the Randers distance u from the boundary (1.4): it is negatively proportional to the estimated gradient of u_ε^h, reflecting the fact that the logarithmic transformation is decreasing, and is normalized consistently with the Randers eikonal equation (2.9). An empirical observation of [23], in the context of isotropic and Riemannian metrics which are special cases of Randers metrics (and using a different discretization), is that V_ε^h is for suitable parameters h, ε an excellent approximation of ∇u. In particular, it can be used for geodesic backtracking via (2.11) and (2.12). Following [23] we may also obtain an empirically improved reconstruction v_ε^h : Ω_h → R of the Randers distance by minimizing (3.10), where ρ_{−i} := ρ_i and e_{−i} := −e_i for all 1 ≤ i ≤ I, and where the first order upwind finite difference δ_h^e is defined in (3.11). Equations (3.9, left) and (3.10) also make sense if one replaces the weights and offsets (ρ_i, e_i)_{i=1}^I and matrix A_b used in the numerical scheme (3.8) with unit weights, the canonical basis, and the identity matrix. However, the latter (and simpler) choice yields slightly less accurate results empirically, as evidenced in our numerical experiments of Section 6. In Figures 4 and 5 we refer to these post-processed distance maps as u_h^{A_b} and u_h^{I_2} respectively. Remark 3.3 (Alternative discretizations, including finite elements). At the foundation of this paper is a second order linear PDE (1.3), an anisotropic variant of the static convection-diffusion equation, which can be discretized using a variety of methods. Those eventually lead to a linear system of equations Lu = r, where L is a matrix of shape N × N, denoting by N the number of discretization points. The r.h.s.
r ∈ R N is non-negative and accounts for the boundary condition exp(−g/ε) on ∂Ω as in (1.3), or for an arbitrary point source in Ω as in Section 4. For our application to geodesic distance computation using the Hopf-Cole logarithmic change of variables, a basic necessity is that the solution u ∈ R N is positive. This is best ensured if L −1 has positive entries, which itself is ensured if L is a nonsingular M -matrix. The numerical scheme proposed in this paper uses Selling's decomposition of the matrix field A b , see Theorem 3.9, to achieve discrete degenerate ellipticity (DDE), see Definition 3.4, which implies the M -matrix property as desired. Keeping in mind the M -matrix requirement, let us consider some alternative discretization methods.
-Finite elements yield an M-matrix, for the two dimensional usual Laplacian operator, if the triangulation has acute interior angles. For the anisotropic Laplacian div(A_b∇u), the geometric condition becomes an anisotropic acuteness condition, involving the vertices v_0, v_1, v_2 of an arbitrary triangle of the mesh and its barycenter v* := (v_0 + v_1 + v_2)/3. Generating a mesh of a domain Ω obeying this acuteness condition w.r.t. a varying metric M : Ω → S_d^{++} is a challenging problem. The established guarantees are far from this objective [35], and there are likely obstructions to the corresponding three dimensional problem. Interestingly, in the special case where the set of vertices is a two dimensional Cartesian grid, the discretized Dirichlet energy defined by finite elements on the anisotropic triangulation of [35] is equivalent to the energy associated to a finite differences scheme using Selling's offsets ([29], Thm. 1), as in this paper, showing that the two adaptive discretization procedures are closely related.
-Narrow stencil finite difference schemes use a fixed set of neighbors on the Cartesian grid. These methods in general cannot produce DDE discretizations of the anisotropic Laplacian div(A_b ∇u) or Tr(A_b ∇²u) when the anisotropy of A_b is pronounced and is not aligned with the coordinate axes. This obstruction holds even if A_b is constant: for instance, the DDE property cannot be achieved with the common two dimensional eight point stencil under a general anisotropy orientation. In addition, Theorem 1.3 of [40] shows that Selling's decomposition yields the narrowest possible stencil (in two dimensions, and in the sense of convex hull inclusion) ensuring the DDE property.
-Wide stencil semi-Lagrangian schemes [6] use finite differences over an intermediate scale k such that h ≪ k ≪ 1, often k ≈ √h. They obey the DDE property, and bypass Selling's decomposition. However, these methods suffer from a low consistency order with the PDE, and from slow convergence rates of the solution, making them ill suited for applications.
Finally, let us acknowledge that the requirement that L^{-1} has positive entries can empirically be alleviated if one is willing to replace the point source with a small Gaussian distribution, or to use the heat method of Remark 4.5, which only requires that L^{-n} has positive entries for a sufficiently large number n of time steps. However the latter is more costly due to the time dependency, is limited to Riemannian metrics, and no convergence analysis under mesh refinement has been published to our knowledge. Numerical experiments using the heat method, based on finite elements on surface meshes with some badly shaped triangles, are presented in [23].
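The M-matrix requirement discussed above can be illustrated on the simplest instance of our setting. The following sketch is our illustration (not one of the paper's numerical experiments), under the simplifying assumption of the one dimensional isotropic case A = 1, b = 0, where the operator of (1.3) reduces to u − ε²u″; it checks that the resulting finite difference matrix has an entrywise positive inverse.

```python
import numpy as np

# Minimal illustration of the M-matrix requirement (an assumption for
# exposition: the 1-D isotropic case A = 1, b = 0, where the operator of
# (1.3) reduces to u - eps^2 u'').  The finite difference matrix
#   L = (1 + 2c) I - c (S_+ + S_-),   with c = (eps/h)^2,
# has positive diagonal, non-positive off-diagonal entries, and is
# diagonally dominant: a nonsingular M-matrix, hence L^{-1} > 0 entrywise.
N, h, eps = 20, 1.0 / 20, 0.2
c = (eps / h) ** 2
L = (1 + 2 * c) * np.eye(N) - c * (np.eye(N, k=1) + np.eye(N, k=-1))

Linv = np.linalg.inv(L)
assert (Linv > 0).all()          # positive inverse, as required

# Hence a non-negative right-hand side (boundary data exp(-g/eps), or a
# point source) yields a positive solution u, so -eps*log(u) is well defined.
r = np.zeros(N); r[0] = c        # Dirichlet value 1 folded into the r.h.s.
u = Linv @ r
assert (u > 0).all()
```

The same mechanism underlies the anisotropic schemes of this paper: discrete degenerate ellipticity makes L a nonsingular M-matrix, so the numerical solution remains positive and its logarithm is well defined.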

Discrete degenerate ellipticity
Discrete degenerate ellipticity is a counterpart to the degenerate ellipticity property of Hamilton-Jacobi-Bellman PDE operators [21,43], which is at the foundation of the theory of viscosity solutions, see Definition A.1.
Definition 3.4 (Discrete degenerate ellipticity [43]). Let X be a finite set, and let U := R^X. A (finite difference) scheme on X is a function F : U → U. Such a function can be written in the form Fu(x) = F(x, u(x), (u(x) − u(y))_{y ∈ X∖{x}}) for some function F, and the scheme is said to be discrete degenerate elliptic (DDE) if F is non-decreasing w.r.t. the second variable, and w.r.t. the third variable componentwise. The scheme is said to be elliptic if u ↦ Fu − λu is degenerate elliptic for some λ > 0.
Similarly to its continuous counterpart, discrete ellipticity implies a comparison principle, which is used in the proofs of the existence and uniqueness of solutions to the discretized PDEs, and of their convergence to the continuous solutions as the grid scale is refined, see Section 3.3. For completeness, we present the proof of two basic but fundamental properties of discrete elliptic operators, see e.g. [43] for additional discussion.
Lemma 3.5 (Discrete comparison principle). Let F be an elliptic finite difference scheme on a finite set X, and let u, v : X → R be such that Fu ≤ Fv on X. Then u ≤ v on X. Proof. Let x_* ∈ X be a maximizer of u − v over the finite set X. Assume for contradiction that u(x_*) > v(x_*), otherwise the result is proved. Then, by discrete degenerate ellipticity of F − λ Id, and since u(x_*) − u(y) ≥ v(x_*) − v(y) for all y ∈ X, one obtains Fu(x_*) ≥ Fv(x_*) + λ(u(x_*) − v(x_*)) > Fv(x_*), which proves the result by contradiction.
We say that u is a sub-solution (resp. super-solution, resp. solution) of the scheme F , if F u ≤ 0 (resp. F u ≥ 0, resp. F u = 0) on X.
Corollary 3.6 (Solution to elliptic linear operators). If F is an affine (i.e. linear plus constant) and elliptic scheme on a finite set X, then there exists a unique solution u : X → R to F u = 0.
Proof. If Fu = Fv on X then u = v, by Lemma 3.5. Thus F : R^X → R^X is injective; since F is affine on the finite dimensional space R^X, its linear part is an injective endomorphism, hence F is bijective, and there exists a unique solution to Fu = 0.
The finite difference schemes (3.3), (3.4), and (3.8) considered in this paper formally involve a function defined on the uncountable set Ω̄_h := Ω_h ∪ ∂Ω, which does not comply with the finiteness assumption in Definition 3.4. However this obstruction is only superficial, since only finitely many boundary values of u are actually involved in these schemes, for any given h > 0. Alternatively, one may consider the Dirichlet boundary values of u as given constants rather than unknown variables of the scheme.
The simplest DDE operator is the opposite −δ_h^e of the upwind finite difference δ_h^e on Ω_h, where h > 0 and e ∈ Z^d, which is defined in (3.11). The operator δ_h^e is modified similarly to (3.3) and (3.4) if [x, x + he] ⊄ Ω, and is first order consistent with a directional derivative. The opposite −Δ_h^e of the second order finite difference operator Δ_h^e is also DDE. The centered finite difference operator is not DDE, but linear combinations with Δ_h^e whose coefficients have suitable signs and obey suitable bounds satisfy this property, as shown in the next lemma. For that purpose, we observe the relations (3.13). Proof. In view of (3.13), one has an equality of schemes, expressing the considered combination of the centered finite difference and of Δ_h^e as a non-negative linear combination of the elementary DDE operators above. We conclude by observing that DDE schemes form a cone: linear combinations with non-negative coefficients of DDE schemes are DDE.
By Lemma 3.7, which applies regardless of the fact that ρ_i and e_i depend on the point x ∈ Ω_h, each of these elementary operators is DDE when ρ_i ≥ 0 and h|μ_i| ≤ ε. Hence L_h^ε − Id is DDE, and therefore L_h^ε is elliptic with λ = 1 in the sense of Definition 3.4.
As announced in the introduction of this section, and in order to benefit from Lemma 3.5 and Corollary 3.6, we do want the discrete operator L_h^ε to be DDE. For that purpose, we introduce Selling's decomposition [20,47] of a positive definite matrix D ∈ S_d^{++}, where d ∈ {2, 3}, which is efficiently computable numerically via Selling's algorithm. In view of their key role in our numerical scheme, Selling's constructions and some of their properties are presented in more detail in Appendix B. Theorem 3.9 (Selling [47], this version [41]). Let D ∈ S_d^{++}, where d ∈ {2, 3}. Then there exist non-negative weights ρ_i ≥ 0 and offsets e_i ∈ Z^d, 1 ≤ i ≤ I, such that D = Σ_{i=1}^I ρ_i e_i e_i^⊤. In the rest of this section, we assume that the weights and offsets (ρ_i(x), e_i(x))_{i=1}^I used to define the scheme L_h^ε, see (3.8), are obtained from Selling's decomposition of the matrix A_b(x), for all x ∈ Ω_h. For the sake of readability, the dependency of ρ_i and e_i w.r.t. the base point x is often left implicit in the equations. The following proposition, stated without proof, immediately follows from Corollary 3.8 and Theorem 3.9.
Proposition 3.10. The scheme L_h^ε is elliptic provided that Ch ≤ ε, for a suitable constant C depending on the problem data. The construction of finite difference schemes for linear and semi-linear PDEs using Selling's algorithm, and the compatibility conditions ensuring the DDE property, are discussed in more detail in [11]. More precisely, it is shown there that Selling's decomposition yields an optimal finite difference discretization of linear second order operators such as L^ε, in the sense that the stencil envelope Hull{±e_i ; 1 ≤ i ≤ I} is the smallest possible one (in dimension d = 2), and that the restriction on the grid scale h in Corollary 3.8 is the weakest possible (up to a fixed multiplicative constant, in dimension d ∈ {2, 3}), among all possible DDE and second order consistent discretizations. In contrast to [11], the main focus of the present paper is not the characterization of the minimal conditions under which the DDE property holds: we rather provide a simple sufficient condition in Proposition 3.10, and then use the DDE property to establish the convergence of a numerical method.
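In dimension d = 2, Selling's decomposition can be computed by the classical obtuse superbase reduction. The following sketch is our illustration (variable names and the tolerance are ours): it iterates Selling's reduction step until the superbase is obtuse, then reads off the weights and offsets.

```python
import numpy as np

def selling_2d(D, tol=1e-12, maxiter=100):
    """Selling's decomposition of a symmetric positive definite 2x2 matrix D:
    returns weights rho_i >= 0 and integer offsets e_i in Z^2 such that
    D = sum_i rho_i * e_i * e_i^T.  Two dimensional case only (a sketch)."""
    # Superbase of Z^2: v[0] + v[1] + v[2] = 0.
    v = np.array([[1, 0], [0, 1], [-1, -1]])
    for _ in range(maxiter):
        for i in range(3):
            j, k = (i + 1) % 3, (i + 2) % 3
            if v[i] @ D @ v[j] > tol:   # superbase not obtuse: reduce
                v[k] = v[i] - v[j]      # new superbase, smaller D-energy
                v[i] = -v[i]
                break
        else:
            break                       # all pairwise products <= 0: done
    perp = lambda e: np.array([-e[1], e[0]])  # rotation by 90 degrees
    weights = [-float(v[(i + 1) % 3] @ D @ v[(i + 2) % 3]) for i in range(3)]
    offsets = [perp(v[i]) for i in range(3)]
    return weights, offsets

# Example: an anisotropic matrix which is not aligned with the grid axes.
th = 0.3
R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
D = R @ np.diag([1.0, 10.0]) @ R.T
weights, offsets = selling_2d(D)
assert np.allclose(sum(w * np.outer(e, e) for w, e in zip(weights, offsets)), D)
assert min(weights) >= -1e-9
```

The non-negative weights and integer offsets produced this way determine the stencil of a scheme such as (3.8), which is then discrete degenerate elliptic under the grid scale restriction discussed above.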
Finally, let us mention an alternative discretization (3.14) of the PDE operator L^ε defined in (2.15), using upwind finite differences for the first order term, which is unconditionally stable but has a lower consistency order. In (3.14), (f_j)_{j=1}^d denotes the canonical basis of R^d, and σ_j is the sign of ⟨b, f_j⟩.

Logarithmic transformation
We use a logarithmic transformation of the unknown to study the convergence of the solutions to the discrete schemes (3.8) and (3.14) as the relaxation parameter ε and the grid scale h tend to zero suitably, mimicking the approach used in the continuous case, see Section 2.4. Our first step is to describe the effect of the logarithmic/exponential transformation on a finite difference scheme.
We define the exponentially transformed scheme F^ε as follows: for any x ∈ Ω_h, F^ε u(x) is given by (3.15), with boundary condition u = g on ∂Ω, where u : Ω̄_h → R. Proposition 3.11. The scheme F^ε is DDE, and furthermore if u is a sub-solution (resp. super-solution) of F^ε, then 𝐮 := exp(−u/ε) is a super-solution (resp. sub-solution) of F.
Proof. The two expressions of F^ε u(x) given in (3.15), where x ∈ Ω_h, are equivalent in view of the linearity of F. The discrete degenerate ellipticity of F^ε follows from the same property of F, and from the fact that t ∈ R ↦ exp(t/ε) − 1 is non-decreasing.
Proposition 3.11 uses the unknown transformation 𝐮 = exp(−u/ε), in other words the inverse of the Hopf-Cole logarithmic transformation [31], which is classical in the study of the relations between the heat, Poisson, and eikonal equations [23,52]. Beware however that, since the mapping t ↦ exp(−t/ε) is decreasing, it exchanges the notions of sub-solutions and super-solutions, as in the final statement of Proposition 3.11. The exponentially transformed upwind finite difference is denoted δ_h^{e,ε}, and reads as in (3.16), where x ∈ Ω_h, e ∈ Z^d, and assuming [x, x + he] ⊂ Ω; otherwise replace h with h_x^e in this expression, see (3.5). The next lemma approximates (3.16) in terms of the derivatives of u. Lemma 3.12. Let u ∈ C³(Ω̄) and 0 < h ≤ ε ≤ 1. Then (3.17) holds for any x ∈ Ω_h and bounded e ∈ Z^d, assuming [x, x + he] ⊂ Ω; otherwise, replace h with h_x^e in this expression. Proof. The announced result immediately follows from (3.16) and a Taylor expansion of u, with consistency exponent r = 2 if [x, x + he] ⊂ Ω, and r = 1 otherwise.
Proof. The operators Δ_h^{e,ε} and δ_h^{e,ε} are treated similarly, using Lemma 3.12.

Consistently with the continuous case (2.15), we denote by S_h^ε the exponential transformation of the finite difference scheme L_h^ε defined by (3.8). In other words, following Proposition 3.11, S_h^ε u is obtained by applying the exponential transformation (3.15) to L_h^ε, on Ω_h, with boundary condition u = g on ∂Ω.
Proof. Denoting μ_i := ρ_i ⟨A_b^{-1} b, e_i⟩, we obtain the announced expansion, using Corollary 3.13 in the second line, where ≈ denotes equality up to an O(h + h^r/ε^r) error.
We obtain a consistency order of 2/3 in the domain interior, and of 1/2 close to the boundary, by choosing ε as an optimal power of h (respectively ε = h^{2/3} and ε = h^{1/2}). Corollary 3.15 (Consistency with the eikonal equation). For any u ∈ C³(Ω̄), any 0 < h ≤ ε ≤ 1, and any x ∈ Ω_h, one has the announced estimate, where r is defined pointwise as in Proposition 3.14. Proof. Observing that α = r/(1 + r), and inserting ε = h^α in this expression, one obtains the announced result.
The choice of parameter ε suggested by Corollary 3.15, when the solution u is sufficiently smooth, is thus a conservative value ε ≈ √h on a boundary layer along ∂Ω, and a smaller value ε ≈ h^{2/3} elsewhere. Nevertheless, for simplicity, we use a single value of ε on the whole domain in the numerical experiments of Section 6. The theoretical analysis of the convergence rates of the method, and of the actual effect of such a differentiated choice of ε on those rates, is not developed in this paper and is an opportunity for future work.
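The overall pipeline, and the interior parameter scaling ε = h^{2/3}, can be illustrated in the simplest one dimensional isotropic case. The following sketch is our illustration, under the assumptions A = 1, b = 0 and g = 0, where the operator of (1.3) reduces to u − ε²u″ and the distance from the boundary of Ω = (0, 1) is min(x, 1 − x).

```python
import numpy as np

# 1-D sketch of the method (assumptions: A = 1, b = 0, g = 0 on the boundary
# of Omega = (0,1), so dist(x, boundary) = min(x, 1-x)).  Solve the linear
# scheme u - eps^2 * Delta_h u = 0 with boundary values exp(-g/eps) = 1,
# then recover the distance via the logarithmic transformation.
N = 1000
h = 1.0 / N
eps = h ** (2.0 / 3.0)            # interior scaling suggested by the analysis
x = np.linspace(h, 1 - h, N - 1)  # interior grid points

c = (eps / h) ** 2
L = (1 + 2 * c) * np.eye(N - 1) \
    - c * np.eye(N - 1, k=1) - c * np.eye(N - 1, k=-1)
r = np.zeros(N - 1)
r[0] = r[-1] = c                  # boundary values 1 folded into the r.h.s.

u = np.linalg.solve(L, r)
dist = -eps * np.log(u)           # approximate distance to the boundary
err = np.abs(dist - np.minimum(x, 1 - x)).max()
assert err < 0.05                 # rough accuracy check only
```

The observed error is dominated by a relaxation bias of order ε near the kink of the distance map at x = 1/2, consistent with the discussion above.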
The upwind scheme L_h^{ε,+} obeys Proposition 3.14 but with r = 1 over all of Ω_h, and likewise Corollary 3.15 but with α = 1/2 over all of Ω_h, leading to the parameter choice ε ≈ √h. Note that the choice ε = h^α with α = r/(1 + r), considered in Corollary 3.15, minimizes the error term σ(h, ε) := ε + (h/ε)^r, as can be checked using the concavity of the logarithm. The parameter scaling h = cε, where c > 0 is a small but fixed positive constant, is commonly considered in applications [23] and appears to produce usable results in practice, but is not consistent asymptotically, since σ(cε, ε) → c^r as ε → 0. This non-consistency leads to non-convergence, as illustrated by the following explicit solution: in the simplified setting where d = 1, A = 1 and b = 0, one easily checks that S_h^ε admits the solution u(x) = λx (with suitable boundary conditions), where the slope λ > 0 obeys 2(cosh(λc) − 1) = c² with c = h/ε. The correct slope |λ| = 1 is thus only obtained in the limit c = h/ε → 0.
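The relation obeyed by the slope λ in this one dimensional example can be worked out and checked numerically. The sketch below is our illustration, and presumes our reading of the scheme in this special case (an assumption): plugging u(x) = exp(−λx/ε) into u − ε²Δ_h u = 0 yields 2(cosh(λc) − 1) = c², whose positive root is λ(c) = arccosh(1 + c²/2)/c.

```python
import numpy as np

# Slope of the affine solution u(x) = lambda * x in the 1-D example
# (assumption: linear scheme u - eps^2 * Delta_h u = 0, u = exp(-u/eps)),
# obtained by solving 2*(cosh(lambda*c) - 1) = c^2 with c = h/eps.
def slope(c):
    return np.arccosh(1 + c ** 2 / 2) / c

# The slope is under-estimated for any fixed c = h/eps > 0 ...
assert slope(1.0) < 1 and slope(0.1) < 1
# ... with an error of order c^2 (a Taylor expansion gives 1 - c^2/24 + ...),
for c in [1.0, 0.1, 0.01]:
    assert abs(slope(c) - 1) <= c ** 2 / 10
# ... so the correct slope 1 is recovered only in the limit c -> 0.
assert abs(slope(1e-3) - 1) < 1e-6
```

This makes quantitative the non-convergence of the method under the scaling h = cε with fixed c > 0.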

Convergence
We establish the convergence of the logarithmically transformed solution of the numerical scheme L_h^ε towards the solution of Randers eikonal equation as ε → 0 and h/ε → 0, see Theorem 3.19, which was announced in the introduction. The proof follows the lines of Theorem 2.1 in [6], and requires some preliminary steps establishing the stability and consistency of the proposed scheme. The arguments apply without modification to the less accurate but unconditionally stable scheme L_h^{ε,+}. Note that, formally, the schemes S_h^ε and L_h^ε are defined over Ω̄_h = Ω_h ∪ ∂Ω. Lemma 3.16. For any p ∈ R^d with |p| sufficiently large, and for (ε, h/ε) small enough, the scheme S_h^ε admits a super-solution ū : Ω̄_h → R, defined as an affine map, and likewise a sub-solution u̲. Proof. Case of the sub-solution. One has S_h^ε u̲(x) = −1 for all x ∈ Ω_h, in view of (3.8) and (3.15). In addition, recall that the matrix field A_b : Ω̄ → S_d^{++} is pointwise positive definite (2.4), and continuous over the compact set Ω̄, hence the smallest eigenvalue of A_b(x) is positively bounded below over x ∈ Ω̄. Then by Proposition 3.14, one obtains the announced inequality. As a consequence of Lemma 3.16, we establish in the next result that the scheme S_h^ε admits a unique solution, uniformly bounded as (ε, h/ε) → 0.
Corollary 3.17 (Stability). For sufficiently small (ε, h/ε), the scheme L_h^ε admits a unique solution 𝐮_h^ε, which is positive, and S_h^ε admits a unique solution u_h^ε, which obeys u_h^ε = −ε ln 𝐮_h^ε and satisfies u̲ ≤ u_h^ε ≤ ū on Ω̄_h, where u̲ and ū are from Lemma 3.16.
Proof. By Proposition 3.11, the maps 𝐮̄^ε := exp(−u̲/ε) and 𝐮̲^ε := exp(−ū/ε), where u̲ and ū are from Lemma 3.16, are respectively a super-solution and a sub-solution of the scheme L_h^ε, which is elliptic by Proposition 3.10. Since that scheme is also linear, it admits a unique solution 𝐮_h^ε by Corollary 3.6, obeying 𝐮̲^ε ≤ 𝐮_h^ε ≤ 𝐮̄^ε by Lemma 3.5. Note that Corollary 3.6 and Lemma 3.5 apply here regardless of the fact that the domain Ω̄_h = Ω_h ∪ ∂Ω is infinite, because the finite difference scheme L_h^ε only uses finitely many boundary values. We conclude that 𝐮_h^ε is positive since 𝐮̲^ε is positive, that u_h^ε := −ε ln 𝐮_h^ε is the unique solution of S_h^ε by Proposition 3.11, and that u̲ ≤ u_h^ε ≤ ū on Ω̄_h by monotony of the logarithm. The result follows. Lemma 3.18 (Consistency up to the boundary). For any ϕ ∈ C³(Ω̄) and any x ∈ Ω̄, one has the announced lim sup estimate. Proof. For any h > 0, x ∈ Ω_h, and ξ ∈ R, one has by Proposition 3.14 an error term involving an exponent r ∈ {1, 2}. In particular r ≥ 1, and therefore ε + (h/ε)^r → 0 as (ε, h/ε) → 0. On the other hand, the remaining terms vanish as ξ → 0. The announced result follows from these observations, and from the uniform continuity of the mapping x ∈ Ω̄ ↦ Sϕ(x).

Randers distance from a point
In this section, we adapt the numerical scheme presented in Section 3 so as to compute Randers distance from a point source, instead of the distance from the boundary. Point sources appear to be the most common setting in applications [23,54,55] where the Poisson equation is used to numerically estimate a geodesic distance. However the convergence of the numerical method in this case does not appear to be backed by previous works, not least because the corresponding PDE is ill posed, see  [24].
We assume that the domain Ω is connected and contains the origin, which w.l.o.g. is the point source of interest, in addition to the previously assumed boundedness and W^{3,∞} boundary regularity. For all ε > 0 and h > 0, we consider the modified scheme L̃_h^ε defined by (4.1). The main result of this section, Theorem 4.1 below, justifies the use of the Poisson method, i.e. solving the linear scheme L̃_h^ε, to approximate the Randers geodesic distance from the origin.
Note that L̃_h^ε is a discrete elliptic operator when h/ε is sufficiently small, see Proposition 3.10, hence the equation L̃_h^ε 𝐮_h^ε = 0 does admit a unique solution by Corollary 3.6. Under the same conditions, the matrix of L̃_h^ε admits an inverse, whose coefficients are estimated in the following result.
As evidenced by the constraint ε ln h → 0, Theorems 4.1 and 4.2 have no immediate counterparts in the continuous setting where ε > 0 and h = 0 formally, see also Remark 4.4. Contrast this with the smooth boundary case, where Theorem 2.12 corresponds to Theorem 3.19 with h = 0. The proofs are presented in the rest of this section. In the case of Theorem 4.1, the proof consists in building sub-solutions and super-solutions of the operator L̃_h^ε, on disk or ring shaped domains around the origin, depending on the problem scales h, ε and r, where the radius r > 0 is fixed but small, see Sections 4.1 to 4.3 and Figure 1. Sub-solutions (resp. super-solutions) over these sub-domains are glued together using the following lemma, which immediately follows from the DDE property, see Definition 3.4. Lemma 4.3. Let F be a DDE scheme on a finite set X, let x ∈ X, and let u, v : X → R. If u(x) ≥ v(x), then F max(u, v)(x) ≤ F u(x). A rigorous study of the limiting PDE would rely on [32], which is beyond the scope of this work. We do not further discuss (4.2), which is understood in the sense of distributions, and thus belongs to a framework distinct from the setting of viscosity solutions considered in this paper.
Remark 4.5 (Heat method). In the Riemannian case (ω = 0), an alternative approach to geodesic distance computation from a point source relies on the short time asymptotics (4.3, left) of the heat kernel, where u solves the heat equation (4.3, right) with initial condition u(0, ·) = δ_{x_*}, the Dirac mass at the source point [52]. Numerically, the heat equation is solved over a short time interval, using a series of implicit time steps, each of which is equivalent to a Poisson equation [23]. To the extent of our knowledge, solving a single Poisson equation is preferred over the heat method in applications, since it is computationally less expensive, and less susceptible to raise floating point underflow errors, in addition to being more general in view of the extension to Randers metrics presented in this paper. An advantage of the heat equation is however that it allows efficient implementations of optimal transport with quadratic cost [50], in the spirit of Section 5. A natural generalization of (4.3, right) to manifolds [44] equipped with a Finsler metric F reads as (4.4), with again u(0, ·) = δ_{x_*}, and where F^* denotes the dual metric, see Definition 2.3. This PDE can be reformulated as a gradient flow, in two different manners [44]. In this setting and under suitable assumptions, the heat kernel asymptotics (4.3, left) extend to Finsler manifolds, see Example 5.5 of [44]. However, discretizing the non-linear and time dependent PDE (4.4) is non-trivial, and also defeats the purpose of this paper, which is to consider linear schemes for Randers distance computation. (If non-linear PDEs are considered, then one may as well solve Randers eikonal PDE (1.5) directly, see [39,42].)

Notations
The Euclidean ball, its boundary the sphere, and its intersection with the grid, are denoted B(x, r), S(x, r), and B_h(x, r) respectively, where the parameters are the center x ∈ R^d, the radius r > 0, and the grid scale h > 0. We use the conventions B(r) := B(0, r), S(r) := S(0, r), and B_h(r) := B_h(0, r). We introduce constants 0 < c_F ≤ C_F and R_F, which exist by Lemma 2.4, such that (4.5) holds for all x, y ∈ Ω̄. Recall that the numerical scheme L_h^ε is defined in terms of a Lipschitz symmetric matrix field A and vector field b, which are the parameters of the dual Randers metric. Selling's decomposition of A_b := A − bb^⊤, see (2.3), which is uniformly positive definite, is denoted as in (4.6), where the bound R_S on the offsets exists in view of Theorem 3.9, and I is a suitable integer. The shorthand "C = C(M_F)" means that a constant C, appearing in an estimate, can be expressed or bounded in terms of the problem parameters (4.7), where ‖A‖_∞ := sup{‖A(x)‖ ; x ∈ Ω̄}, and Lip(A_b) is the Lipschitz constant of the matrix field A_b.

Viscosity regime
We construct a solution to the scheme (4.1) far enough from the point source singularity, at points x ∈ Ω_h such that |x| ≥ r, where r is independent of ε and h, using the results developed in Section 3. For that purpose, a radius r > 0 such that B(6r) ⊂ Ω is fixed in the rest of this section, unless otherwise specified.
Then for (ε, h/ε) sufficiently small, and denoting u_h^ε := −ε ln 𝐮_h^ε, one has (4.8) with C = C(M_F). Proof. Applying Theorem 3.19 to the domain Ω \ B(r), we obtain the uniform convergence of u_h^ε over the relatively compact subset int(Ω, r) \ B(2r) as (ε, h/ε) → 0. More precisely, the limit u : Ω̄ \ B(r) → R is given by an explicit formula, whose second equality follows from (4.5, right).
where C_1 = C_0 + 3C_F. By construction one has 𝐮_h^ε(0) = 1, and 𝐮_h^ε(x) ≥ 0 on ∂Ω, so that L̃_h^ε 𝐮_h^ε ≥ 0 at these boundary points. By the choice of the constant C_1, and in view of (4.8), one has the announced inequality, which concludes the proof.

Taylor expansion regime
We construct explicit sub-solutions to the scheme (4.1), at points such that h ≲ |x| ≲ ε and ε ≲ |x| ≲ r respectively, which are radial functions with respectively a power and an exponential profile. For that purpose, we need to estimate the derivatives of such functions.
with absolute constants underlying the O notation.
Proof. The expression of ∇u(x) follows from the standard rules for the differentiation of an exponential, ∇(exp ∘ g) = (exp ∘ g)∇g, and of a radial function, ∇g(|x|) = g′(|x|) n(x). The announced estimate of ∇²u follows from the full expression u(x)^{-1} ∇²u(x) = μ² f′² nn^⊤ − μ f″ nn^⊤ − μ f′ (Id − nn^⊤)/|x|, which is obtained using the Leibniz rule for the differentiation of a product, and recalling that the Jacobian matrix of n(x) is (Id − nn^⊤)/|x|. Differentiating once more yields the expression of ∇³u, from which the announced estimate follows.

Finite neighborhood regime
We produce a sub-solution to the scheme L̃_h^ε which is useful in the immediate neighborhood of the origin, where |x| ≲ h. The construction is based neither on the approach of viscosity solutions, nor on a Taylor expansion, but on the discrete structure of the scheme. For that purpose, we establish additional properties of its coefficients (4.6), suitably normalized: the first d offsets form a basis of Z^d, and the corresponding weights are bounded below in a neighborhood of the source point. This implies that the stencils of our numerical scheme are locally connected, and allows us to construct a sub-solution in Corollary 4.13. The proof is based on the spanning property of Selling's decomposition, see Proposition B.8, which is used here for the first time in the context of PDE numerical analysis.
Proof. Up to grouping duplicates and opposites, we may assume that the vectors ±e_1, · · · , ±e_I are pairwise distinct. Thus by Proposition B.5, one has for all x, y ∈ Ω̄ and all 1 ≤ i ≤ I an estimate with a constant C = C(M_F). Then by Proposition B.8, and up to reordering (ρ_i, e_i)_{i=1}^I, one has det(e_1, · · · , e_d) = 1 together with a positive lower bound ρ_S on the corresponding weights. The announced result follows, by choosing r_S := ρ_S/C.
In the rest of this section, we assume that (ρ_i, e_i)_{i=1}^I are ordered in such a way that (4.12) holds. We also denote ρ_{−i} := ρ_i and e_{−i} := −e_i for all 1 ≤ i ≤ I. The scheme may then be rewritten in terms of the coefficients α_h^ε(x). In the next lemma, we denote by |x|_1 the sum of the absolute values of the coefficients of a vector x ∈ R^d. Proof. The matrix G has integer coefficients by construction, and det(G) = 1 by (4.12, left), hence its inverse is the adjugate matrix G^{-1} = co(G), which also has integer coefficients; thus G ∈ GL(Z^d) as announced. Since the coefficients of G are bounded by R_S, those of the adjugate matrix G^{-1} are bounded by (d − 1)! R_S^{d−1}, and the equivalence of N with the Euclidean norm follows.
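The unimodularity argument in the proof above can be checked on a concrete instance. The sketch below is our illustration, on a hypothetical matrix G: an integer matrix with unit determinant has an integer inverse (the adjugate), with entries bounded as in the proof.

```python
import math
import numpy as np

# A hypothetical unimodular matrix: integer entries, det(G) = 1.
G = np.array([[2, 1, 1],
              [1, 1, 0],
              [1, 1, 1]])
d = G.shape[0]
assert round(np.linalg.det(G)) == 1

# Since det(G) = 1, the inverse equals the adjugate co(G), hence is integer.
Ginv = np.linalg.inv(G)
assert np.allclose(Ginv, np.round(Ginv))       # G belongs to GL(Z^d)

# Entries of the adjugate are (d-1)x(d-1) minors, hence bounded by
# (d-1)! * R_S^(d-1) when the entries of G are bounded by R_S.
R_S = np.abs(G).max()
assert np.abs(Ginv).max() <= math.factorial(d - 1) * R_S ** (d - 1) + 1e-9
```

This is exactly the mechanism making the norm N of Lemma 4.12 equivalent to the Euclidean norm, with explicit equivalence constants.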

Gluing the sub-solutions
In the previous subsections, we have produced four sub-solutions to the operator L̃_h^ε, on different subsets of the domain Ω̄_h defined according to the distance to the origin, see Lemma 4.6 and Corollaries 4.9, 4.10, and 4.13, and Figure 1. We glue here these partial sub-solutions using Lemma 4.3, to produce a global sub-solution on Ω̄_h and conclude the proof of Theorem 4.1. For that purpose, we introduce four mappings u_h^{ε,i} defined on adequate subdomains Ω_h^{ε,i} ⊂ Ω̄_h, 0 ≤ i ≤ 3, depending on the scale parameters (ε, h) as well as on constants (λ, μ, ν, ξ) specified later.
We next proceed to prove estimates of the form (4.17), where m_h^{ε,i} is a suitable function of the scale parameters, specified later in the proof; the global sub-solution then follows by Lemma 4.3. The estimates (4.17) follow from basic upper and lower bounds on the involved functions, and on the norms of the relevant points x. Namely, the upper bound on u_h^{ε,0} is derived from the maximum principle, and the lower bound from Lemma 4.6, with C = C(M_F) and for sufficiently small (ε, h/ε). This establishes (4.17, i = 0) with m_h^{ε,0} = exp(−3λr/ε), up to increasing λ so that λ ≥ C. Likewise, this establishes (4.17, i = 1) with m_h^{ε,1} = e^λ (2ε)^{−μ}, up to increasing μ so that (3/2)^μ ≥ e^{3λ}. Lastly, denoting by c_N and C_N the equivalence constants in Lemma 4.12, we define ξ by (ξ − R_S) c_N − 3 R_S C_N = 1. This establishes (4.17, i = 2) with m_h^{ε,2} = e^{−3 R_S C_N μ} (2 R_S h)^μ, up to increasing ν so that e^ν ≥ (ξ/(2R_S))^μ, in view of the expression of ξ, which concludes the proof.
Proof. We distinguish two cases. (i) If the maximum in (4.16) is attained by the last term, then the announced result follows from Lemma 4.6 and the expression of the multiplicative factor α_0 (h/ε)^μ exp(−3λr/ε). (ii) If the maximum in (4.16) is attained by one of the first three terms, then |x| ≤ 5r, and the announced result follows from the explicit expressions of u_h^{ε,1}, u_h^{ε,2}, u_h^{ε,3}, as well as from dist_F(0, x) ≤ 5 C_F r.

Convergence on Ω × Ω and inverse matrix
We establish Theorem 4.2, which relates the Randers distance to the inverse of the matrix of our finite difference scheme. For that purpose, we use the following convention: if U(x; x_*) is a bivariate discrete mapping, defined for all (x, x_*) ∈ Ω̄_h × Ω_h, and if F is a finite difference scheme of the form of Definition 3.4, then FU(x; x_*) is obtained by applying F to the mapping x ↦ U(x; x_*). In other words, the numerical scheme sees U as a function of its first variable x only.
and where the quantity is defined as above. Proof. The proofs of Proposition 4.14 and Corollary 4.7 adapt in a straightforward manner to a point source x_* sufficiently close to the origin, as considered here, rather than the origin itself.
Then, locally uniformly on Ω × Ω, one has −ε ln U_h^ε(x; x_*) → dist_F(x_*, x) as (ε, h/ε, ε ln h) → 0. Proof. First note that the map x ∈ Ω_h ↦ U(x; x_*), for any given x_* ∈ Ω_h, solves a linear problem which is elliptic when h/ε is sufficiently small, hence it has a unique solution, see Corollary 3.6 and Proposition 3.10.
Let r > 0 be such that B(6r) ⊂ Ω. Then for (ε, h/ε) sufficiently small, and for all (x, x_*) ∈ int_h(Ω, r) × B_h(r/2), one has by Corollary 4.7 and Lemma 4.16, for some constant C = C(M_F), the estimate (4.20), and therefore |U(x; x_*) − dist_F(x_*, x)| ≤ 2Cr on this set. Now let K_* ⊂ Ω be a compact set. Up to reducing r, one can find a finite cover K_* ⊂ ∪_{j=1}^J B(y_j, r/2) such that B(y_j, 6r) ⊂ Ω for all 1 ≤ j ≤ J. Applying the above reasoning to each ball B_h(y_j, r/2), 1 ≤ j ≤ J, instead of B_h(r/2), we obtain |U(x; x_*) − dist_F(x_*, x)| ≤ (2C + C_F) r for all (x, x_*) ∈ int_h(Ω, r) × (K_* ∩ hZ^d), when (ε, h/ε, ε ln h) is small enough. Since r can be chosen arbitrarily small, the result follows.

Application to regularized optimal transport
In this section, we describe a numerical approach to the 1-Wasserstein optimal transport problem, with cost defined as a Randers distance, and with entropic relaxation. The use of such asymmetric cost functions is motivated by various applications discussed in the introduction of Section 6, whereas the entropic relaxation can be regarded as a side effect of the chosen numerical approach. Given probability measures μ, ν ∈ P(Ω), the addressed problem reads (5.1), where ε ≥ 0 is the entropic relaxation parameter, and where Π(μ, ν) is the set of probability measures on Ω × Ω whose first and second marginals coincide respectively with μ and ν, known as transport plans between μ and ν. The transport cost and the entropy are defined as C(x, y) := dist_F(x, y), Ent(P) := −∫_{Ω×Ω} ln( dP(x, y) / (e dP_0(x, y)) ) dP(x, y), where F is a Randers metric on the domain Ω, subject to the well posedness assumptions listed in the last paragraph of Section 1, and P_0 is a reference measure on Ω × Ω. The Euler constant e appearing in Ent(P) only changes the entropy by an additive constant, since P has total mass one, and simplifies later calculations.
As mentioned in the introduction, our approach extends [25] from Riemannian to non-symmetric Randers metrics. However, the quadratic cost dist F (x, y) 2 corresponding to the 2-Wasserstein distance cannot be addressed in our setting, see Remark 4.5. Let us also acknowledge that the effect of entropic relaxation cannot be ignored in the numerical implementation of this class of methods: indeed, empirically, the transport plan is blurred over a radius O(ε), while ε itself must be substantially larger than the discretization grid scale, see Theorem 3.19. Nevertheless such a smoothing is not necessarily an issue in applications [25], and the estimation of the Wasserstein distance itself as ε → 0 can be accelerated by suitable techniques [17].
Finally, let us mention that the existence and uniqueness theory of optimal transport mappings T : X → Y does not apply to the Wasserstein-1 optimal transport problem [38], even in the simpler case C(x, y) = |x − y| where the cost function is Euclidean.
Remark 5.2 (Numerical methods for Monge's problem based on the flow interpretation). The Wasserstein-1 distance W_1(μ, ν) := W_0(μ, ν) admits several reformulations: following [7], one has the flow reformulation (5.4, left); another reformulation is proved equivalent in [30]. Using a standard finite difference or finite element discretization of the gradient and divergence operators, these two reformulations yield second order cone programs, a generalization of linear programming which allows certain types of convex quadratic constraints. They are numerically tractable, using augmented Lagrangian methods as in [7], or interior point methods. See [51] for other optimization methods, and for applications to geometric data processing.
In comparison, the proposed scheme solves a slightly different problem, namely Schrödinger's entropic relaxation W_ε for some ε > 0, which can be regarded as an advantage or an inconvenience depending on the application. It also requires solving a single sparse linear system, with several right-hand sides, see Section 5.2, which makes it very efficient and easy to scale up to 10^5 unknowns in a few seconds, as illustrated in the numerical experiments of Section 6.

Kantorovich duality
We assume in the following that μ and ν are supported on a finite set X ⊂ Ω, and that the support of P_0 is X × X. In this setting, we present Kantorovich's dual formulation of the optimal transport problem (5.1), and its numerical solution by Sinkhorn's algorithm. With a slight abuse of notation, we identify a measure μ on the finite set X (resp. P on X × X), which is a weighted sum of Dirac masses μ = Σ_{x∈X} μ_x δ_x, with the corresponding non-negative vector (μ_x)_{x∈X} (resp. matrix (P_{xy})_{x,y∈X}). With this convention, the sets of probability measures on X, and of transport plans between two such probabilities, are defined as P(X) := {μ ∈ R_+^X ; ⟨μ, 1⟩ = 1}, Π(μ, ν) := {P ∈ R_+^{X×X} ; P1 = μ, P^⊤1 = ν}, (5.5) where R_+ := [0, ∞[ denotes the set of non-negative reals, and 1 = (1, · · · , 1) ∈ R^X. In particular, ⟨μ, 1⟩ = Σ_{x∈X} μ_x, P1 = (Σ_{y∈X} P_{xy})_{x∈X}, and P^⊤1 = (Σ_{x∈X} P_{xy})_{y∈X}. In this discrete setting, the optimal transport problem (5.1) reads (5.6), where ⟨A, B⟩ := Tr(A^⊤B) = Σ_{x,y∈X} A_{xy} B_{xy}. In (5.6) and below, the fraction bar, the logarithm and the exponential function apply componentwise to vectors and matrices. We assume that the reference measure P_0 = (P^0_{xy}) has positive entries, and use the standard convention 0 × ∞ = 0 in the definition of the entropic term if some entries of P ∈ Π(μ, ν) vanish. Noting that s ∈ R_{++} ↦ s ln s is convex and has a vertical tangent at the origin, we find that the minimization problem (5.6) is convex, and that the optimal P has positive entries whenever ε > 0.
Kantorovich duality introduces potentials ϕ, ψ ∈ R^X to account for the equality constraints in (5.5), and uses Sion's minimax theorem [34] to re-order the sup and inf, yielding (5.7). The third line was obtained by solving, componentwise and in closed form, the minimization w.r.t. P. Namely, the convex one dimensional mapping p ∈ R_{++} ↦ p (C_{xy} + ε ln(p/(e P^0_{xy})) − ϕ_x − ψ_y) attains its minimum for p = P^0_{xy} exp((ϕ_x + ψ_y − C_{xy})/ε). Note that the objective function of the maximization problem (5.7) is strictly concave.

Sinkhorn's algorithm, and efficient computation
Sinkhorn's algorithm [49] is based on an exponential change of variables in the concave maximization problem (5.7): denoting $\Phi := \exp(\varphi/\varepsilon)$ and $\Psi := \exp(\psi/\varepsilon)$ we obtain
$$W^\varepsilon(\mu,\nu; C) = \max_{\Phi,\Psi \in \mathbb R^X_{++}} \varepsilon\big(\langle \ln \Phi, \mu\rangle + \langle \ln \Psi, \nu\rangle - \langle \Phi, K^\varepsilon \Psi\rangle\big), \qquad \text{where } K^\varepsilon_{xy} := P^0_{xy} \exp(-C_{xy}/\varepsilon), \tag{5.9}$$
and we denoted $K^\varepsilon = (K^\varepsilon_{xy})_{x,y\in X}$. We made the dependency of $W^\varepsilon(\mu,\nu)$ w.r.t. the ground cost function C explicit in (5.9), because a modified cost is considered at the end of this subsection and in Section 5.3. Note that (5.9) is concave w.r.t. Φ or Ψ separately, but not jointly, in contrast with (5.7). It can be numerically solved by alternate maximization, in other words by successively solving w.r.t. the unknown Φ with Ψ fixed (resp. w.r.t. Ψ with Φ fixed). This approach is known as Sinkhorn's algorithm [49], and is particularly simple and efficient since the optimal value w.r.t. either of these variables has a closed form when the other variable is fixed. More precisely, given an arbitrary $\Psi^0 \in \mathbb R^X_{++}$ one defines for all n ≥ 0
$$\Phi^{n+1} := \frac{\mu}{K^\varepsilon \Psi^n}, \qquad \Psi^{n+1} := \frac{\nu}{(K^\varepsilon)^\top \Phi^{n+1}}, \tag{5.10}$$
where, as in (5.6), the fraction bar denotes a componentwise division operation. Then the sequence $(\Phi^n, \Psi^n)_{n\geq 0}$ converges geometrically to a maximizer of (5.9), see [49]. The most computationally intensive part of Sinkhorn's algorithm (5.10) is to repeatedly compute the matrix-vector products $K^\varepsilon \Psi^n$ and $(K^\varepsilon)^\top \Phi^n$, since the matrix $K^\varepsilon$ is dense and large. An efficient way to approximate those products using Varadhan's formula was proposed in [50], when the ground cost function is defined as a Riemannian distance or its square. We adapt here this approach to a ground cost $C(x,y) := \operatorname{dist}_{\mathcal F}(x,y)$ defined as a Randers distance, on a Cartesian grid domain.
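The alternate maximization (5.10) can be sketched in a few lines on a small made-up problem (the sizes, marginals and cost below are arbitrary); after convergence, the coupling (5.8) recovers the prescribed marginals.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                            # illustrative problem size (made up)
mu = rng.random(n); mu /= mu.sum()
nu = rng.random(n); nu /= nu.sum()
C = rng.random((n, n))           # arbitrary ground cost
eps = 1.0
K = np.exp(-C / eps)             # kernel K^eps, here with P0 = 1

Psi = np.ones(n)                 # arbitrary positive initialization Psi^0
for _ in range(500):
    Phi = mu / (K @ Psi)         # closed-form maximization w.r.t. Phi
    Psi = nu / (K.T @ Phi)       # closed-form maximization w.r.t. Psi

P = Phi[:, None] * K * Psi[None, :]   # coupling (5.8)
assert np.allclose(P.sum(axis=1), mu, atol=1e-8)   # first marginal
assert np.allclose(P.sum(axis=0), nu, atol=1e-8)   # second marginal
```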
For that purpose, we define the modified cost function: for all points x, y of the domain X,
$$C^\varepsilon_h(x,y) := -\varepsilon \ln \big[(L^\varepsilon_h)^{-1}\big]_{xy}, \qquad C^\varepsilon_h(x,y) \to \operatorname{dist}_{\mathcal F}(x,y). \tag{5.11}$$
We denoted by $L^\varepsilon_h$ the matrix of our linear discretization scheme (3.8) with null Dirichlet boundary conditions. Note that the positivity of the inverse matrix $(L^\varepsilon_h)^{-1}$, and the convergence (5.11, right), which holds locally uniformly on Ω × Ω as (ε, h/ε, ε ln h) → 0, are established in Theorem 4.2. We also define $P^0 \equiv 1$ as the counting measure on $\Omega_h \times \Omega_h$.
The quantity $W^\varepsilon(\mu_h, \nu_h; C^\varepsilon_h)$, where $\mu_h, \nu_h \in \mathcal P(\Omega_h)$, is thus defined via (5.9) in terms of the kernel
$$K^\varepsilon := \exp(-C^\varepsilon_h/\varepsilon) = (L^\varepsilon_h)^{-1}. \tag{5.12}$$
Sinkhorn's algorithm (5.10) in this context involves repeated linear solves $(L^\varepsilon_h)^{-1} \Psi^n$ and $(L^\varepsilon_h)^{-\top} \Phi^n$, which are considerably less memory intensive than the dense matrix product with $K^\varepsilon$, and also have a lower computational complexity, especially if one uses a sparse pre-factorization of the matrix $L^\varepsilon_h$.
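The idea of replacing dense kernel products with sparse linear solves can be sketched as follows. The matrix below is a simple one-dimensional stand-in for the scheme matrix (the actual matrix in the paper comes from (3.8)); the point is that a single `splu` pre-factorization from scipy serves for every subsequent solve, including the transposed ones.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Stand-in for the scheme matrix: a 1D operator Id - t*Laplacian with
# Dirichlet conditions (illustrative only, parameters are made up).
n, t = 50, 1e-3
L = sp.diags([[-t] * (n - 1), [1 + 2 * t] * n, [-t] * (n - 1)],
             [-1, 0, 1], format="csc")

lu = splu(L)                       # single sparse factorization...
Psi = np.ones(n)
KPsi = lu.solve(Psi)               # ...then each product K Psi = L^{-1} Psi
KTPhi = lu.solve(Psi, trans="T")   # and K^T Phi = L^{-T} Phi is a cheap solve

dense = np.linalg.inv(L.toarray())  # dense check, for this small example only
assert np.allclose(KPsi, dense @ Psi)
assert np.allclose(KTPhi, dense.T @ Psi)
```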

Convergence
We establish that the transport cost W ε (µ h , ν h ; C ε h ), numerically evaluated by the implementation of Sinkhorn's algorithm presented in Section 5.2, converges as (ε, h/ε, ε ln h) → 0 to the 1-Wasserstein optimal transport cost associated with the Randers metric. Note that there are a number of classical and closely related questions that could be raised in this context, such as establishing the convergence of the Kantorovich potentials, but they are outside the scope of this paper. We refer the interested reader to [36] for a survey of the connections and convergence results between the optimal transport problem and Schrödinger's entropic relaxation in the continuous setting, and to [8] for convergence rates of the potentials of the discrete problems (under strong assumptions on the domain and the ground cost function, which are not satisfied in our setting).

Lemma 5.3 (Upper and lower bounds on the discrete entropy). Let X be a finite set with N elements, and let µ ∈ P(X). Then $0 \geq \sum_{x \in X} \mu_x \ln \mu_x \geq -\ln N$, with the usual convention 0 ln 0 = 0.
We omit the proof, which is a classical convexity argument, and note that the upper bound is attained for the Dirac mass concentrated at a single point, and the lower bound is attained for the uniform probability.
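Lemma 5.3 and its equality cases are easy to confirm numerically, as in the following sketch.

```python
import numpy as np

def entropy_sum(mu):
    # sum_x mu_x ln(mu_x), with the convention 0 ln 0 = 0
    mu = np.asarray(mu, dtype=float)
    pos = mu > 0
    return float(np.sum(mu[pos] * np.log(mu[pos])))

N = 8
dirac = np.eye(N)[0]             # Dirac mass: attains the upper bound 0
uniform = np.full(N, 1.0 / N)    # uniform law: attains the lower bound -ln N

assert entropy_sum(dirac) == 0.0
assert abs(entropy_sum(uniform) + np.log(N)) < 1e-12

rng = np.random.default_rng(1)
mu = rng.random(N); mu /= mu.sum()
assert -np.log(N) <= entropy_sum(mu) <= 0.0   # the bounds of Lemma 5.3
```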
Theorem. Let $\Omega \subset \mathbb R^d$, $d \in \{2,3\}$, be a bounded, open, connected domain, with a $W^{3,\infty}$ boundary. Let µ, ν ∈ P(Ω) be supported on a compact subset K of Ω. Let F be a Randers metric on Ω, and let $C := \operatorname{dist}_{\mathcal F} : \overline\Omega \times \overline\Omega \to \mathbb R$. For all h > 0 let $\Omega_h := \Omega \cap h \mathbb Z^d$. Let $\mu_h, \nu_h \in \mathcal P(\Omega_h)$ be supported on K and weakly-* converging, $\mu_h \rightharpoonup \mu$, $\nu_h \rightharpoonup \nu$, as h → 0. Consider the ground cost $C^\varepsilon_h$ (5.11) on $\Omega_h \times \Omega_h$, and define $P^0$ as the counting measure. Then
$$W^\varepsilon(\mu_h, \nu_h; C^\varepsilon_h) \to W^0(\mu, \nu; C), \quad \text{as } (\varepsilon, h/\varepsilon, \varepsilon \ln h) \to 0. \tag{5.13}$$

Proof. By Lemma 5.3, one has $|\operatorname{Ent}(P)| = O(\ln h)$ for all transport plans $P \in \Pi(\mu_h, \nu_h)$, since the finite set $\Omega_h \times \Omega_h$ has $O(h^{-2d})$ elements. Therefore
$$W^\varepsilon(\mu_h, \nu_h; C^\varepsilon_h) = W^0(\mu_h, \nu_h; C^\varepsilon_h) + O(\varepsilon \ln h). \tag{5.14}$$
For all $x, y \in \overline\Omega$ define $\tilde C^\varepsilon_h(x,y) := C^\varepsilon_h(x_h, y_h)$, where $x_h, y_h \in \Omega_h$ are the closest grid points to x and y respectively (chosen arbitrarily in case of a tie). Then $\tilde C^\varepsilon_h$ converges uniformly as (ε, h/ε, ε ln h) → 0 to the continuous cost function C on the compact set K × K, by Theorem 4.2. From this point, a direct application of Theorem 5.20 (Stability of optimal transport) in [53] shows that $W^0(\mu_h, \nu_h; C^\varepsilon_h) = W^0(\mu_h, \nu_h; \tilde C^\varepsilon_h) \to W^0(\mu, \nu; C)$ as (ε, h/ε, ε ln h) → 0. Combining this result with the estimate (5.14), we establish (5.13) as announced.

Numerical results
We illustrate the numerical methods presented in this paper, for Randers distance computation and numerical optimal transport, with synthetic numerical experiments in dimension d = 2. Geodesic distance computation based on solving the heat or Poisson PDEs already has numerous applications [23,54,55] and is part of established algorithmic geometry libraries such as CGAL ® . Likewise, Wasserstein distance computation based on entropic relaxation is an established numerical approach [17,25,50]. The contributions of this paper are thus mostly theoretical, see Section 7.
Randers distances are the simplest geometric model of an asymmetric distance, and have a number of applications discussed in Remark 1.1. The approach presented in this paper for Randers distance computation is applied in [54] to image segmentation problems, using numerical codes provided by the last author and with due acknowledgement 5 . In the optimal transport setting, Randers geometry is likewise relevant when the mass transport cost is naturally strongly asymmetric. As an artificial example, one may consider Monge's earth moving problem on a non-flat terrain, where moving mass uphill is more costly than downhill. Another example, which was a motivation for our analysis but is postponed to future work, is related to the deployment of a fleet of unmanned aerial vehicles (UAVs) for observation in the context of the prevention and monitoring of forest fires. This could be modeled as the optimal transport of a family of Dirac masses, located at the UAVs' initial positions, onto a probability distribution over the terrain, defined as the likelihood of a fire start inferred from the type of vegetation and the dryness conditions. When the terrain is mountainous, and the weather is windy, the UAVs' displacement costs are strongly asymmetric, and could be modeled by Randers distances.
In this numerical section, we compare on several occasions the results of the centered scheme L ε h (3.8) emphasized in this paper with those of the upwind scheme L ε,+ h (3.14), which is unconditionally stable but also less accurate. We limit our experiments to two-dimensional problems, consistently with the literature and despite the fact that our theoretical results also apply in dimension three, due to the overwhelming cost of solving three-dimensional Laplacian-like linear systems at the considered grid scales.
The PDE domain Ω for the experiments presented in this section is either the square (−1, 1)² or the two-dimensional unit ball {x ∈ R²; |x| < 1}. It is discretized on a regular Cartesian grid, using finite differences modified as in (3.4) to account for the (null) boundary conditions on ∂Ω. The grid scale h = 0.00625 commonly used in the experiments below corresponds to a grid of size 320 × 320 (intersected with Ω). In the first two problems we numerically approximate
$$u(x) := \min_{y \in Y} \operatorname{dist}_{\mathcal F}(x, y), \tag{6.1}$$
where Y is a finite set of target points, and F is a Randers metric on Ω which is described in terms of the parameters A, b of its dual, see Lemma 2.6. From the convergence analysis standpoint, the case of finitely many isolated point sources is a straightforward generalization of the case of a single one considered in Section 4, and considering targets instead of sources amounts to a change of sign in the asymmetric part of the metric, as discussed below (1.4).
In our experiments, the largest contributor to computation time is the factorization of the sparse linear systems, using the SuperLU routine provided with the scipy Python package. In contrast, the preliminary step of scheme construction (including Selling's algorithm to decompose the matrix A b (x) at each point x ∈ Ω h , and sparse matrix assembly) only accounts for a fraction of this cost, and the subsequent solve operation is approximately 10× faster than matrix factorization. In the application to optimal transport, which is based on Sinkhorn's algorithm (5.10), the same linear system needs to be solved multiple times, and thus a single matrix factorization is followed by 13 to 54 solve operations. The SuperLU factorization time when using a 320 × 320 discretization grid (thus ≈ 10⁵ unknowns) ranges from 1.3s to 2.8s depending on the test case, on a laptop equipped with a 2.3 GHz Intel Core i5 dual-core processor.
Remark 6.1 (Fast Fourier Transform (FFT)). A PDE which is (i) linear, (ii) has constant coefficients, (iii) has suitable boundary conditions, and (iv) is discretized on a Cartesian grid, can often be efficiently solved using the FFT. The Poisson equation (1.3) considered here fulfills (i) and (iv), whereas (iii) is debatable. However (ii) fails unless the Randers metric has constant coefficients. The extension of the FFT to PDEs with varying coefficients is an area of active research [12], which is outside the scope of this work. In any case, FFT schemes lack the discrete degenerate ellipticity property (Definition 3.4), which is a key ingredient in establishing the convergence of the proposed numerical method in the vanishing viscosity limit ε → 0.
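To illustrate point (ii), the following sketch solves a constant-coefficient equation of the form (Id − tΔ)u = f on a periodic one-dimensional grid by FFT diagonalization. This toy setting (periodic boundary conditions, made-up parameters) differs from the Dirichlet setting of the paper; it only illustrates why constant coefficients enable FFT solvers.

```python
import numpy as np

n, h, t = 64, 1.0 / 64, 1e-3
rng = np.random.default_rng(2)
f = rng.random(n)

# Eigenvalues (symbol) of Id - t*Laplacian on the periodic grid
k = np.arange(n)
symbol = 1.0 - t * (2.0 * np.cos(2.0 * np.pi * k / n) - 2.0) / h**2
u = np.fft.ifft(np.fft.fft(f) / symbol).real   # diagonal solve in Fourier space

# Check against the dense circulant matrix of the same operator
A = (1 + 2 * t / h**2) * np.eye(n)
A -= (t / h**2) * (np.roll(np.eye(n), 1, axis=0) + np.roll(np.eye(n), -1, axis=0))
assert np.allclose(A @ u, f)
```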
In applications of the heat method to geodesic distance computation [23,54,55], the metric (isotropic, Riemannian or Randers) always varies over the domain (otherwise the distance is given by an explicit formula), which rules out (ii) and the FFT. In applications to optimal transport via Sinkhorn's algorithm, constant metrics are often considered, and in this special case the FFT approach may be preferred [25].

Randers metric with constant coefficients
We consider a finite set Y of target points and a Randers metric whose dual F * is defined by the coefficients (6.2). Since the metric is constant and the domain is convex, the geodesic distance is explicit for all x ∈ Ω, and the minimal paths are straight lines, see the discussion below Definition 2.3. In particular (6.1) can be evaluated exactly, which allows estimating convergence rates. The exact Randers distance from Y , and the pointwise approximation errors achieved when approximating it using the centered scheme (3.8) and the upwind scheme (3.14), are illustrated on Figure 2. We present on Figure 3, top left, Tissot's indicatrix of the metric F, which is a representation of the sets
$$\{y \in \Omega;\ \operatorname{dist}_{\mathcal F}(x, y) \leq r\} \tag{6.3}$$
associated to a number of points x ∈ Ω and a suitable radius r > 0. In the Randers case, the set (6.3) is an ellipse which is not centered on the point x, and which admits several equivalent characterizations, see Lemma 2.7. The numerical approximation of Randers distance obtained with the centered scheme is illustrated on Figure 3, top right, while the numerical approximations of minimal paths from Y obtained by solving the ODE (2.11) are shown on Figure 3, top center.
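For a constant Randers metric of the form F(v) = √⟨v, Mv⟩ + ⟨ω, v⟩ on a convex domain, the distance between two points is simply F applied to their difference. The following sketch uses made-up parameters M, ω (not the coefficients (6.2) of our experiments) and checks positivity, asymmetry and the triangle inequality.

```python
import numpy as np

# Illustrative constant Randers metric F(v) = sqrt(<v, M v>) + <w, v>;
# the parameters below are made up for the example.
M = np.array([[2.0, 0.5], [0.5, 1.0]])
w = np.array([0.3, -0.2])
assert w @ np.linalg.inv(M) @ w < 1.0   # |w|_{M^{-1}} < 1: F is positive

def F(v):
    return np.sqrt(v @ M @ v) + w @ v

def dist(x, y):
    # Constant metric, convex domain: minimal paths are straight lines
    return F(np.asarray(y) - np.asarray(x))

x, y, z = np.array([0.0, 0.0]), np.array([0.6, -0.4]), np.array([0.2, 0.3])
assert dist(x, y) > 0 and dist(y, x) > 0
assert not np.isclose(dist(x, y), dist(y, x))          # asymmetry
assert dist(x, y) <= dist(x, z) + dist(z, y) + 1e-12   # triangle inequality
```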

Randers metric with variable coefficients
A single target point Y = {(0.8, 0)} is considered, and the dual metric parameters are defined at x = (x₁, x₂) ∈ Ω by (6.4).

[Figure caption] Left: the upwind scheme (3.14) works best with ε ≈ h^{1/2}. Center: the centered scheme is more accurate and works best with ε ≈ h^{2/3}. Right: the accuracy of the centered scheme solution is improved with a post-processing step, see Remark 3.2, which works best using the same stencil as the finite difference scheme (bottom), rather than an axis-aligned stencil (top).
To compare with the Riemannian case, we also consider, in Figure 3, bottom, the dual metric parameters (6.5), where A is as in (6.4) and b is chosen as zero, in order for those parameters to define a Riemannian metric.

Numerical convergence rates
We discuss the convergence of some approximations of the exact distance function u, defined by the metric parameters and target points (6.2), on the square (−1, 1)². The l^∞ and l¹ errors (6.6) between u and one of its approximations $u^\varepsilon_h$ are respectively defined as the maximum and the renormalized sum of the pointwise errors $|u(x) - u^\varepsilon_h(x)|$ over the grid points of [−0.8, 0.8]²; we thus excluded a boundary layer (−1, 1)² \ [−0.8, 0.8]² from the PDE domain Ω = (−1, 1)², consistently with the fact that Theorem 4.1 only guarantees uniform convergence on compact subsets of Ω. We display on Figure 4 the convergence curves for the centered scheme $L^\varepsilon_h$ (3.8) and the (unconditionally stable but less accurate) upwind scheme $L^{\varepsilon,+}_h$ (3.14), and for $\varepsilon = \frac 1 2 h^\alpha$ where α ∈ {1/2, 2/3}. Empirically, the centered scheme works best when α = 2/3, and the upwind scheme when α = 1/2. This experiment illustrates and empirically confirms Corollary 3.15, which establishes that the minimal consistency error with the eikonal equation is achieved when ε ≈ h^α, where α = 2/3 for the centered scheme, and α = 1/2 for the upwind scheme. Note however that the empirical solution error appears to be higher than the scheme consistency error, which is O(h^α), see Corollary 3.15. This is likely a consequence of the singularity of the solution at the source points, since the expected convergence rate O(h^{2/3}) is obtained in the case of the distance to a closed curve, see the next paragraph and Figure 6.
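The convergence orders reported here are estimated in the usual way, by comparing errors on successively refined grids; a minimal sketch with synthetic errors e(h) ∝ h^{2/3}:

```python
import numpy as np

# Empirical order between grid scales h and h/2: log2(e(h) / e(h/2)).
# Synthetic errors e(h) = 2*h^(2/3), mimicking the observed 2/3 rate.
hs = np.array([0.1, 0.05, 0.025, 0.0125])
errors = 2.0 * hs ** (2.0 / 3.0)

orders = np.log2(errors[:-1] / errors[1:])
assert np.allclose(orders, 2.0 / 3.0)
```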
The post-processing step discussed in Remark 3.2, and adapted from [23], improves the accuracy of our numerical solution of the Randers eikonal equation, as illustrated on Figures 4 and 5. This post-processing works best when using the stencil of the finite difference scheme, as opposed to a basic axis-aligned stencil, see Figure 5 and the last sentence of Remark 3.2. It allows achieving the expected convergence rate O(h^{2/3}).

Distance from a closed curve
In Figure 6, we consider the problem of approximating the Randers distance from the boundary of a domain Ω := (0, 1)² ∪ {x ∈ R² | |x| < 1}, in order to illustrate Theorem 3.19, rather than Theorem 4.1 which is about the case of isolated source or target points. The boundary of the domain Ω is not smooth, as assumed in Theorem 3.19, but this does not cause any problems in practice. We consider a Randers metric defined by the dual parameters (6.7), with ρ := 0.7 and with b(0) := 0. In other words, following Zermelo's interpretation in Section 2.2, we compute the travel time of a vehicle whose speed is 1 in all directions, relative to a medium whose local speed at x ∈ Ω is defined as b(x). Note that ‖b‖_∞ = ρ < 1, so the model is locally controllable. This is a variation on the classical test case of Figs. 4 and 5 of [48], where we replaced the point source boundary condition with a Dirichlet boundary condition, and the square domain with a partly rounded domain, so as to illustrate the applicability of our numerical scheme in this context.

Figure 6. Randers distance from the boundary of a domain, with parameters (6.7). Left: reference solution, computed using the eikonal solver [39]. Center: error between the solution obtained using the centered scheme L ε h and the reference solution, for h = 0.00625 and ε = 0.5h^{2/3}. Right: error depending on h, for ε = 0.5h^{2/3}.
While the exact solution to the considered problem is not known, we compare our results with the ones obtained using the eikonal solver [39], which is a fast marching method for two-dimensional Finsler metrics based on completely different discretization principles, on a finer grid of the form Ω h , h = 0.003125. The l^∞ and l¹ errors in Figure 6 are defined as in (6.6), but without excluding a boundary layer from Ω. At small grid scales, we observe convergence at the order 2/3, consistently with the consistency order 2/3 stated in Corollary 3.15.
Optimal transport problems

We apply Sinkhorn's algorithm (5.10) to numerically approximate the exponential Kantorovich potentials $\Phi, \Psi \in \mathbb R^{\Omega_h}_+$ maximizing (5.9), using the efficient approximation (5.12) of the product with the kernel $K^\varepsilon = (\exp(-\operatorname{dist}_{\mathcal F}(x, y)/\varepsilon))_{x,y}$. The arrows on the figure follow Randers geodesics and illustrate a numerical approximation of the mapping $\sigma : \Omega_h \to \Omega$ defined by
$$\sigma(x) := \frac{1}{\mu_x} \sum_{y \in \Omega_h} P_{xy}\, y, \tag{6.8}$$
where $(P_{xy})_{x,y \in \Omega_h}$ is the optimal coupling measure (5.8) for the optimal transport problem (5.6). Thus σ(x) is the barycenter of the image by the transport plan of the Dirac mass at x. The numerical evaluation of σ involves a product with the kernel $K^\varepsilon$, which again is efficiently approximated using (5.12). Note that the coupling measure P is typically not supported on a graph, not even approximately, and that σ is not a one-to-one mapping. In particular, σ does not approximate a translation in Figure 7, top left. This behavior reflects the specific properties of the 1-Wasserstein distance, as opposed to the p-Wasserstein distance for p > 1, and it is not related to our numerical approximation procedure. Figure 7, top right, displays the error between the approximation $W^\varepsilon_h(\mu, \nu)$ of the Wasserstein distance obtained with grid scale h > 0 and entropic relaxation $\varepsilon = \frac 1 2 h^{2/3}$, and the exact optimal transport cost corresponding to the continuous problem without relaxation, ε = h = 0.
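The barycentric map (6.8) is a simple matrix-vector operation once a coupling is available; the following sketch uses a small made-up coupling matrix (not an optimizer of (5.6)) purely to illustrate the computation.

```python
import numpy as np

Y = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # grid point coordinates
P = np.array([[0.2, 0.1, 0.0],
              [0.0, 0.3, 0.1],
              [0.1, 0.0, 0.2]])                      # toy coupling, sums to 1
mu = P.sum(axis=1)                                   # first marginal

# Row x of sigma: barycenter of the mass sent from x by the coupling P
sigma = (P @ Y) / mu[:, None]
assert sigma.shape == Y.shape
assert np.allclose(sigma[0], (0.2 * Y[0] + 0.1 * Y[1]) / 0.3)
```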

Conclusions
In this paper, we introduced and studied a numerical scheme for approximating geodesic distances by solving a linear finite-difference scheme, with an application to Schrödinger's entropic relaxation of the optimal transport problem. The approach builds on previous works [23,25,50,52,54,55], and brings the following contributions: (i) justification of the distance computation method in the case of point sources, which is a common setting in applications, (ii) identification of the optimal scaling ε ≈ h^{2/3} of the relaxation parameter w.r.t. the grid scale. Our numerical scheme obeys the discrete degenerate ellipticity property, and thus benefits from comparison principles, numerical stability, and a convergence proof in the setting of viscosity solutions. For that purpose we use adaptive finite differences whose offsets depend on the PDE parameters and are obtained via a tool from discrete geometry known as Selling's decomposition of positive definite matrices [20,47]. Our convergence proof (in the case of a point source) exploits fine properties of Selling's decomposition: uniqueness, Lipschitz regularity, and the spanning property (which implies the local connectivity of the stencils derived from it), for the first time in the context of PDE analysis, whereas previous works only relied on the positivity and consistency properties of this decomposition [10,11,29,41,42]. Future work will be devoted to investigating their relevance in other applications to numerical analysis, and possible substitutes in dimension d ≥ 4, where Selling's decomposition does not apply.

Appendix A. Viscosity solutions
In this appendix, we establish the existence, uniqueness, comparison principles and convergence properties announced in Section 2 for three PDEs: the linear second order PDE (A.1), Randers eikonal equation (A.2), and its viscous relaxation (A.3). This content is presented in the appendix because it often mirrors similar results presented in the discrete setting of Section 3, which we have chosen to emphasize, and because several key results are obtained by specialization of [3, 4, 5, 21]. We present in Appendix A.1 the concepts of degenerate elliptic operator and of viscosity solution to a PDE, and we justify the change of unknown known as the logarithmic transformation. The comparison principle, established in Appendix A.2 for the PDEs of interest, implies the uniqueness and boundedness of their solutions in Ω. We prove in Appendix A.3 the validity of the explicit solutions to (A.1) and (A.2), defined as a distance map (2.8) and as the expectation (2.17) of the stochastic process (2.16), and we establish convergence as ε → 0.

A.1 Degenerate ellipticity, change of unknowns
The PDEs considered in this appendix (A.1) to (A.3) benefit from a common structure, known as degenerate ellipticity [21,43], introduced in Definition A.1 below and whose discrete counterpart is presented in Definition 3.4.
Definition A.1 (Degenerate ellipticity). An operator F : Ω × R × R d × S d → R, denoted F (x, t, p, X), is said degenerate elliptic 6 if it is (i) non-decreasing w.r.t. the second variable t, and (ii) non-increasing w.r.t. the last variable X for the Loewner order. The operator F is said elliptic if F (x, t, p, X) − δt is degenerate elliptic for some constant δ > 0.
The Dirichlet problem for a degenerate elliptic operator F writes as
$$F(x, u(x), \nabla u(x), \nabla^2 u(x)) = 0 \ \text{ in } \Omega, \qquad u = \psi \ \text{ on } \partial\Omega, \tag{A.4}$$
where ψ : ∂Ω → R. For example, when considering equation (A.2), one should choose
$$F(x, t, p, X) := |p|^2_{A_b(x)} + 2 \langle p, b(x)\rangle - 1.$$
This specific operator F is degenerate elliptic, since F (x, t, p, X) does not depend on either t or X, and thus trivially obeys the required monotony conditions.

Definition A.2 (Viscosity solution). Let $F : \overline\Omega \times \mathbb R \times \mathbb R^d \times S^d \to \mathbb R$ be a continuous degenerate elliptic operator and let ψ ∈ C(∂Ω). A bounded function u : Ω → R is a viscosity sub-solution to (A.4) if for any $\varphi \in C^2(\overline\Omega)$ and any local maximum $x \in \overline\Omega$ of $u^* - \varphi$ one has $F(x, u^*(x), \nabla\varphi(x), \nabla^2\varphi(x)) \leq 0$ if x ∈ Ω, and $\min\{F(x, u^*(x), \nabla\varphi(x), \nabla^2\varphi(x)),\ u^*(x) - \psi(x)\} \leq 0$ if x ∈ ∂Ω. It is a viscosity super-solution if for any $\varphi \in C^2(\overline\Omega)$ and any local minimum $x \in \overline\Omega$ of $u_* - \varphi$ one has $F(x, u_*(x), \nabla\varphi(x), \nabla^2\varphi(x)) \geq 0$ if x ∈ Ω, and $\max\{F(x, u_*(x), \nabla\varphi(x), \nabla^2\varphi(x)),\ u_*(x) - \psi(x)\} \geq 0$ if x ∈ ∂Ω. It is a viscosity solution if it is both a viscosity sub-solution and super-solution.
Definition A.2 encompasses discontinuous solutions u, obeying the boundary conditions in a weak sense, which allows implementing outflow boundary conditions in the case of the eikonal equation (A.2) by using large enough boundary data g. A well-known property of viscosity solutions is their stability under monotone changes of variables.

Proposition A.3. Let $F : \overline\Omega \times \mathbb R \times \mathbb R^d \times S^d \to \mathbb R$ be a continuous degenerate elliptic operator, let ψ ∈ C(∂Ω), let I, J ⊂ R be open intervals, let η : I → J be a strictly increasing C²-diffeomorphism, and let v : Ω → I be bounded away 7 from ∂I. Define the continuous degenerate elliptic operator G and boundary condition χ : ∂Ω → R by
$$G(x, t, p, X) := F\big(x, \eta(t), \eta'(t)\, p, \eta''(t)\, p \otimes p + \eta'(t)\, X\big), \qquad \chi(x) := \eta^{-1}(\psi(x)).$$
Then v is a viscosity sub-solution (resp. super-solution) to the Dirichlet problem associated with (G, χ) if and only if u := η ∘ v is a viscosity sub-solution (resp. super-solution) to (A.4).
Proof. We only show the result for sub-solutions, since the case of super-solutions is similar. We assume that v is a sub-solution to the Dirichlet problem associated with (G, χ) and prove that u is a sub-solution to (A.4); the proof of the converse implication is the same, exchanging the roles of (F, ψ, η) and (G, χ, η⁻¹). The assumption that v is bounded away from ∂I implies that $v^*$ and $v_*$ are valued in I, hence $u^* = (\eta \circ v)^* = \eta \circ v^*$ is valued in J, and likewise for $u_*$, by continuity of η. Let $\varphi \in C^2(\overline\Omega)$ and let $x \in \overline\Omega$ be a local maximum of $u^* - \varphi$. Without loss of generality, we may assume that $\varphi(\overline\Omega) \subset J$. Let $\tilde\varphi := \eta^{-1} \circ \varphi$. Using that η is strictly increasing, and $\varphi = \eta \circ \tilde\varphi$, we deduce that x is a local maximum of $v^* - \tilde\varphi$. We conclude the proof by noticing that, for all $x \in \overline\Omega$,
$$G\big(x, \tilde\varphi(x), \nabla\tilde\varphi(x), \nabla^2\tilde\varphi(x)\big) = F\big(x, \varphi(x), \nabla\varphi(x), \nabla^2\varphi(x)\big),$$
since $\varphi = \eta \circ \tilde\varphi$, $\nabla\varphi = \eta'(\tilde\varphi)\nabla\tilde\varphi$ and $\nabla^2\varphi = \eta''(\tilde\varphi)\nabla\tilde\varphi \otimes \nabla\tilde\varphi + \eta'(\tilde\varphi)\nabla^2\tilde\varphi$. In addition, if x ∈ ∂Ω, then $u^*(x) - \psi(x)$ and $v^*(x) - \eta^{-1}(\psi(x))$ have the same sign.

Combining Proposition A.3 and Remark A.4 allows addressing the decreasing change of unknown v = exp(−u/ε) considered by Varadhan [52], see Lemma 2.11. Note the discrete counterpart, Proposition 3.11, of this result.
Proof. The PDE (A.1) corresponds to (A.4) with a suitable choice of operator and boundary conditions, and the result follows from Proposition A.3.

A.2 Comparison principle

The comparison principles, presented in Proposition A.7 and Theorem A.8 below, are obtained as a specialization of [5]. For that purpose, we reformulate the first order term of (A.2) in Bellman form, based on the following identity: for all x ∈ Ω and all w ∈ R^d,
$$|w|_{A_b(x)} = \max_{\alpha \in B^d} \langle A_b(x)^{1/2} \alpha,\, w\rangle,$$
where $B^d := \{x \in \mathbb R^d;\ |x| \leq 1\}$ denotes the closed unit ball.
Lemma A.6. The mappings $A_b^{1/2}$ and b are Lipschitz continuous on $\overline\Omega$.

The comparison principle established in Theorem 2.1 of [5] encompasses both the second order linear PDE (A.1) and the first order non-linear PDE (A.2) considered in this paper, although a reformulation is needed in the latter case.
Proposition A.7. Let $\underline u$ and $\overline u$ be respectively a sub-solution and a super-solution of the linear PDE (A.1), for some ε > 0. Then $\underline u^* \leq \overline u_*$ in Ω.
Proof. The announced result is a direct application of Theorem 2.1 in [5], using that A 1/2 b : R d → S ++ d and b : R d → R d are Lipschitz continuous, ∂Ω is of class W 3,∞ , and g ∈ C(∂Ω).
Theorem A.8. Let $\underline u, \overline u : \Omega \to \mathbb R$ be respectively a sub-solution and a super-solution of (A.2). Then $\underline u^* \leq \overline u_*$ in Ω.
Proof. Since (A.2) involves an operator which is degenerate elliptic but not elliptic, see Definition A.1, we perform the Kruzhkov exponential change of variables and define $\underline v := -\exp(-\underline u)$ and $\overline v := -\exp(-\overline u)$. By Proposition A.3, $\underline v$ and $\overline v$ are respectively a viscosity sub-solution and super-solution to the corresponding transformed equation. The boundary ∂Ω is of class $W^{3,\infty}$, and the boundary data −exp(−g) ∈ C(∂Ω), consistently with the framework of [5]. Furthermore, the PDE can be rewritten as
$$\sup_{\alpha \in B^d}\ \langle -\tilde b(x, \alpha), \nabla v(x)\rangle + v(x) = 0 \quad \text{in } \Omega,$$
and the required regularity properties of $\tilde b$ are established in Lemma A.6, as well as the additional condition which amounts to a local controllability property. Then, by Theorem 2.1 of [5], we obtain $\underline v^* \leq \overline v_*$ in Ω, and therefore $\underline u^* \leq \overline u_*$ in Ω, as announced. 8 More directly, if the eigenvalues of A ∈ S ++ d lie in ]0, 2r[, then one has the series expansion

A.3 Explicit solutions, and convergence
We establish that viscosity solutions to Randers eikonal equation (A.2) and to the linear PDE (A.1) may be explicitly obtained as the distance from the boundary (1.4) with a suitable penalty term, and as the expectation of a stochastic process (2.17). We also prove bounds for these solutions, see Theorems A.9 and A.11, and conclude the proof of Varadhan's formula for Randers metrics in Theorem A.12.
Noting that any Lipschitz path can be reparametrized at constant speed w.r.t. the metric F, and have its orientation reversed (from x to ∂Ω), we obtain that v(x) = u(x), which concludes the proof.
We obtain a sub-solution and a super-solution to the PDE (A.3), independent of the relaxation parameter, similarly to the discrete case in Lemma 3.16.

Lemma A.10. The PDE (A.3) admits, for any ε ≥ 0, the constant sub-solution $\underline u : x \in \Omega \mapsto g_{\min}$, where $g_{\min} := \min\{g(y);\ y \in \partial\Omega\}$. It also admits the affine super-solution $\overline u : x \in \Omega \mapsto \langle p, x\rangle + c_{\max}$, for any $p \in \mathbb R^d$ such that |p| is sufficiently large, where $c_{\max} := \max\{g(y) - \langle p, y\rangle;\ y \in \partial\Omega\}$.
Proof. Denote by $S^\varepsilon u := |\nabla u|^2_{A_b} + 2 \langle \nabla u, b\rangle - \varepsilon \operatorname{Tr}(A_b \nabla^2 u) - 1$ the operator of (A.3). Clearly $S^\varepsilon \underline u = -1 < 0$ in Ω, whereas $S^\varepsilon \overline u(x) = |p|^2_{A_b(x)} + 2 \langle p, b(x)\rangle - 1 \geq c_0 > 0$ for all x ∈ Ω, provided |p| is sufficiently large, since $A_b$ and b are bounded over $\overline\Omega$, and $A_b$ is uniformly positive definite. The constants $g_{\min}$ and $c_{\max}$ are chosen so as to comply with the boundary conditions.

Theorem A.11. For any ε > 0, the function $u^\varepsilon : \Omega \to \mathbb R$ defined by (2.17) is a viscosity solution to (A.1). In addition, $u^\varepsilon$ is positive, and $\underline u \leq u_\varepsilon \leq \overline u$ in Ω, where $u_\varepsilon := -\varepsilon \ln(u^\varepsilon)$ and $\underline u$ and $\overline u$ are from Lemma A.10.
Proof. Since $A_b^{1/2} : \mathbb R^d \to S^{++}_d$ and $b : \mathbb R^d \to \mathbb R^d$ are Lipschitz continuous, ∂Ω is of class $W^{3,\infty}$, and g ∈ C(∂Ω), Theorem 3.1 of [5] implies that $u^\varepsilon$ is a viscosity solution to (A.1).
We are now able to complete the proof of formula (2.18), by making rigorous the passage to the limit between problems (A.3) and (A.2). Note that we follow a standard sketch of proof, already used in Proposition II.6 of [4], for example.
Theorem A.12. With the notations of Theorem A.11, and denoting by u the solution to (2.9), one has u ε → u uniformly on compact subsets of Ω, as ε → 0.
If $D \in S^{++}_d$ and $(v_0, \cdots, v_d)$ is D-obtuse, then (B.2) is known as Selling's decomposition of D. Selling's algorithm provides a constructive proof of existence of such a D-obtuse superbase, in dimension d ∈ {2, 3}.

Proposition B.3 (Selling's algorithm). Let $b = (v_0, \cdots, v_d)$ be a superbase of $\mathbb Z^d$, d ∈ {2, 3}, and let $D \in S^{++}_d$. If b is not D-obtuse, permute it so that $\langle v_0, D v_1\rangle > 0$ and update it as follows:
$$b' := (-v_0,\ v_1,\ v_0 - v_1) \ \text{ if } d = 2, \qquad b' := (-v_0,\ v_1,\ v_2 + v_0,\ v_3 + v_0) \ \text{ if } d = 3. \tag{B.3}$$
Repeating this operation yields a D-obtuse superbase in finitely many steps.
Proof. If the superbase b is such that $\delta := \langle v_0, D v_1\rangle > 0$, and if b' is defined by (B.3), then one easily checks that b' also is a superbase and that $E(b') = E(b) - C_d \delta$, where $C_2 = 4$, $C_3 = 2$, and $E(b) := \sum_{0 \leq i \leq d} \langle v_i, D v_i\rangle$ denotes the energy of the superbase. There are only finitely many superbases of $\mathbb Z^d$ whose energy E is below any given bound, since their elements have integer coordinates and since D is positive definite. Hence Selling's algorithm must terminate, which happens when the iteration condition fails, i.e. when a D-obtuse superbase b is obtained. This concludes the proof.

Proof. The bounds $|v_i| \leq C \mu(D)$ and $|e_{ij}| \leq 2 C \mu(D)$ are established in Proposition 4.8 and Theorem 4.11 of [41].
Inspecting the proof of these results, one obtains the other announced estimates. Specifically, $|v_i|_D \leq C \|D\|^{1/2}$ is established in the last line of Proposition 4.8 in [41]. Using this refined estimate (instead of $|v_i| \leq C\mu(D)$) in the proof of Theorem 4.11 in [41] yields $|e_{ij}|_{D^{-1}} \leq 2 C \|D^{-1}\|^{1/2}$ (instead of $|e_{ij}| \leq 2C\mu(D)$). The result follows.
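In dimension d = 2, Selling's algorithm and the resulting decomposition can be sketched as follows. The update rule implements (B.3) with d = 2, and the offsets are taken perpendicular to the superbase vectors, which is the standard two-dimensional form of Selling's formula (B.2); the test matrix D below is made up.

```python
import numpy as np

def perp(v):
    # Rotation of v by 90 degrees
    return np.array([-v[1], v[0]])

def selling_2d(D, max_iter=100):
    # Obtain a D-obtuse superbase of Z^2 by Selling's algorithm, then read off
    # weights rho_k >= 0 and integer offsets e_k with D = sum_k rho_k e_k e_k^T.
    b = [np.array([1, 0]), np.array([0, 1]), np.array([-1, -1])]  # superbase
    for _ in range(max_iter):
        pairs = [(i, j) for i in range(3) for j in range(3) if i < j]
        bad = next(((i, j) for i, j in pairs if b[i] @ D @ b[j] > 1e-14), None)
        if bad is None:
            break  # b is D-obtuse: <v_i, D v_j> <= 0 for all i != j
        i, j = bad
        b = [-b[i], b[j], b[i] - b[j]]  # update (B.3) with d = 2
    weights, offsets = [], []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        k = 3 - i - j  # index of the remaining superbase vector
        weights.append(float(-b[i] @ D @ b[j]))  # Selling's weights
        offsets.append(perp(b[k]))               # integer offsets
    return weights, offsets

D = np.array([[2.0, 1.2], [1.2, 3.0]])
weights, offsets = selling_2d(D)
recon = sum(w * np.outer(e, e) for w, e in zip(weights, offsets))
assert min(weights) >= -1e-12   # non-negative weights at termination
assert np.allclose(recon, D)    # the decomposition reconstructs D
```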
Selling's decomposition of a matrix $D \in S^{++}_d$, d ∈ {2, 3}, is obtained by applying Selling's formula, Lemma B.2, to a D-obtuse superbase, whose existence is ensured by Selling's algorithm, Proposition B.3. This description is constructive and is used in all our numerical experiments, since it is efficient enough for the moderately ill-conditioned matrices encountered in our applications. We normalize Selling's decomposition as follows, up to replacing some offsets with their opposites.

Proof. Let $b = (v_0, \cdots, v_d)$ be a superbase of $\mathbb Z^d$, and define $S_b := \{D \in S^{++}_d;\ b \text{ is } D\text{-obtuse}\}$. For each 0 ≤ i < j ≤ d let $\tilde e_{ij} := \pm e_{ij}$, where the sign is chosen so that $\tilde e_{ij} \in \mathbb Z^d$. By (B.2) one has $\rho(D; \tilde e_{ij}) = -\langle v_i, D v_j\rangle$ for all $D \in S_b$, which is a linear function of D with Lipschitz constant at most $|v_i| |v_j| \leq C^2 \mu(D)^2$ by Proposition B.4. In addition, ρ(D; e) = 0 for all $D \in S_b$ and all $e \in \mathbb Z^d \setminus \{\tilde e_{ij};\ 0 \leq i < j \leq d\}$, thus $D \mapsto \rho(D; e)$ is Lipschitz with the announced constant over the set $S_b$. The announced result follows since $S^{++}_d$ is the union of the closed and convex sets $S_b$ associated to superbases b of $\mathbb Z^d$, by Proposition B.3, and since this union is locally finite by Proposition B.4.

We conclude this appendix by establishing, in Proposition B.8, that some offsets of Selling's decomposition, associated with weights suitably bounded below, span the integer lattice $\mathbb Z^d$ by linear combinations with integer coefficients. This implies that the stencils of our numerical scheme (3.8) define a locally connected graph, a property used in Section 4.3 to control its solution in the neighborhood of a point source.
Proof. By Definition B.1, $(v_1, \cdots, v_d)$ is a basis of $\mathbb Z^d$. We may thus assume that $(v_1, \cdots, v_d)$ is the canonical basis of $\mathbb Z^d$, up to a change of basis, so that $v_0 = (-1, \cdots, -1)^\top$. Then $e_{0j} = -v_j$ for all 1 ≤ j ≤ d, and $e_{ij} = v_i - v_j$ for all 1 ≤ i < j ≤ d. Each of the vectors $e_{ij}$, 0 ≤ i < j ≤ d, thus features at most once the coefficient 1, and at most once the coefficient −1, the other coefficients being 0. The announced result then follows from Proposition 2.37 of [9], which is a classical characterization, due to Poincaré, of certain unimodular matrices.