Reconstruction of manifold embeddings into Euclidean spaces via intrinsic distances

We consider the problem of reconstructing an embedding of a compact connected Riemannian manifold in a Euclidean space, up to an almost isometry, given the intrinsic distances between points of a ``sufficiently large'' subset. This is one of the classical manifold learning problems. It turns out that the most popular methods for this problem, with a long history in data science, namely the classical Multidimensional scaling (MDS) and the Maximum variance unfolding (MVU), actually miss the goal and may provide results very far from an isometry; moreover, they may even fail to give a bi-Lipschitz embedding. We provide an easy variational formulation of this problem, which leads to an algorithm always providing an almost isometric embedding with distortion of the original distances as small as desired (the upper bound for the admissible distortion is an input parameter of the algorithm).


Introduction
Let M be a smooth, connected, compact Riemannian manifold endowed with its intrinsic (geodesic) distance d M . Further, we consider M to be embedded in some Euclidean space R n . Assume that we are given a sample {d ij } of pairwise distances between points of some point cloud {y i } ⊂ M , i.e. d ij := d M (y i , y j ). Our goal is to reconstruct an almost isometric embedding of M , or just of its subset {y i }, into R n based on the observed sample. In other words, we are interested in an algorithm which, based on the input {d ij }, produces a set {x i } ⊂ Σ, with Σ ⊂ R n being some other embedded manifold endowed with its intrinsic distance d Σ , so that d Σ (x i , x j ) ≈ d ij , where the approximate equality means that the distortion does not exceed a desired level. Note that in data science applications the set {y i } ⊂ M is of course finite, though its cardinality N ∈ N is usually quite large.
Existing results and methods. There is a vast literature, both in statistics and computational geometry, on manifold reconstruction. The majority of existing methods are based directly on the finite point cloud {y i } N i=1 , which is assumed to be known up to some errors. In particular, in [1][2][3][4][5][6][7][8], the authors consider the problem of C 2 manifold reconstruction based on a finite sample possibly corrupted with small zero-mean additive noise. In applications, one usually employs the respective methods to reduce the dimensionality of the known high-dimensional data. Note that this setup is much simpler than the one we consider, where, instead of the point cloud itself, we have only information on the respective distance matrix. Access to the point cloud allows one to construct estimates of projectors onto tangent spaces to the manifold and then use them to reconstruct the manifold itself. For instance, [2] and [4] used tangential Delaunay complexes. The approach of [3,7] relied on local PCA estimates. In [8], one iteratively uses a PCA-like procedure to successively improve projector estimates. In [5,6], the so-called putative manifold is used, that is, a set of points solving a nonlinear system of equations. Another class of manifold reconstruction methods from a point cloud is based on random projections, similarly to the classical Johnson-Lindenstrauss lemma (see, e.g., [9][10][11]). Finally, in [12,13] one studies the case when M is a C m submanifold of R n : the authors used minimizers of a weighted sum of square errors to estimate not only the projectors but also higher order tensors up to order m. As a result, the guarantees on the Hausdorff distance between M and the constructed estimate are much stronger in this case than those for the methods using only the first-order expansions.
The problem we are considering, when only pairwise distances {d ij } N i,j=1 are given, is somewhat less studied. It is worth mentioning, though, that the two problems of manifold reconstruction, the one directly from the point cloud and the other from just a distance matrix, are inherently related. In fact, many methods to solve the former actually contain, as a core part, some method to solve the latter. As an example, the Isomap manifold embedding algorithm [14], often used in applications for the purpose of data dimension reduction, contains as a core the classical multidimensional scaling (MDS) algorithm that deals only with distance matrices.
Note that we are interested in reconstructing the embedding of the original manifold M into a Euclidean space (e.g., for the purpose of data visualization), as opposed to the problem of reconstructing an abstract manifold M (e.g., determined by its metric tensor). The latter is solved in [15], but its solution does not provide any explicit finite-dimensional embedding. Of course, once the metric tensor is reconstructed, one might also reconstruct an embedding by, say, some computational version of the Nash embedding theorem, but such a double-step procedure is unreasonably complicated. Therefore, it is tempting to search for a direct algorithm to solve the posed problem. Such algorithms have already been proposed and are quite widely used in applications. However, we will show that two basic and widely used algorithms, multidimensional scaling (MDS) and maximum variance unfolding (MVU), may infinitely distort the original distances even in simple situations. As a consequence, the methods relying on the classical MDS (e.g., Isomap [14]) may inherit this undesired property. Note that some newer heuristic methods of dimension reduction with steadily growing popularity, like SNE or t-SNE [16,17], also perform reconstruction of data points just from the distances. Unfortunately, they do not have any rigorous guarantees on the distortion of pairwise distances (and, anyhow, it is clear one might expect at most some bounds on distance distortion "on average", but not uniform ones). An attempt to understand t-SNE was made in a recent work [18], but the authors only managed to show that t-SNE is able to keep the cluster structure in the data. Finally, the method proposed in [19] also requires just a distance matrix as an input and reconstructs a manifold homeomorphic to the original one, but there are no upper bounds on the distance distortion for this method or its modifications in the literature.
Our contribution. On the contrary, in the present paper, we suggest a quite simple direct algorithm performing a manifold embedding in polynomial time and provide non-asymptotic upper bounds on the relative distortion of pairwise distances that are as small as desired (the required smallness of the distortion is itself an input datum). It is also worth mentioning that, in fact, the algorithm we provide works not only with smooth Riemannian manifolds but with a far more general class of compact subsets of a Euclidean space connected by rectifiable arcs and satisfying some curvature estimate (e.g., having a positive reach); this estimate (or the lower bound for the reach) and the intrinsic diameter of the set have to be a priori known, as they are also input parameters of the algorithm.
Plan of the paper. The rest of the paper is organized as follows. Section 3 is dedicated to the analysis of MDS and MVU. In particular, relying on the result from [23], we show that MDS, applied to a unit circumference, produces a snowflake-like closed curve which is just Hölder continuous, and hence infinitely distorts the original distances. Our algorithm is provided in Section 5. It is based on a semidefinite programming problem and, consequently, runs in polynomial time. The analysis of our approach is based on a variational setting proposed in Section 4 and on a simple Γ-convergence result (Thm. 4.1). As an application, in Section 6 we show that this method can also be used for topological data reconstruction, i.e. for computing Čech cohomologies, and provide explicit estimates on the input parameters for this purpose. Finally, in Section 7 we provide some numerical experiments to illustrate the performance of the proposed algorithm.

Notation and preliminaries
For a metric space E equipped with distance d and a curve θ : [0, 1] → E we denote by | θ̇| its metric derivative and by ℓ(θ) := ∫ 0 1 | θ̇|(t) dt its parametric length. The notation B r (x) ⊂ E stands for the open ball of radius r > 0 with center x ∈ E; the Euclidean norm is denoted by | • |. For a set M ⊂ R n and ε > 0 we let (M ) ε ⊂ R n be its open ε-neighborhood, i.e. (M ) ε := ∪ x∈M B ε (x). We recall the notion of reach of M introduced by Federer in [20] and defined by Reach (M ) := sup {ε > 0 : every point of (M ) ε has a unique projection on M } .
We further assume that the function space C(M ; R n ) of continuous functions on M with values in R n is equipped with the usual uniform norm; for a set S ⊂ C(M ; R n ) we use the corresponding induced notions. For an m × n matrix X we denote as usual by X T its transpose. Vectors are silently identified with columns. The notation diag (α 1 , . . ., α r ) stands for the diagonal matrix with entries (α 1 , . . ., α r ) on the diagonal. By x • y we denote the usual scalar product of vectors x ∈ R n and y ∈ R n . For any real numbers a and b the notation a ∧ b stands for min{a, b}.
For the general theory of Γ-convergence we refer the reader to [21], from which we also borrow the respective notation.

Main existing methods
The existing algorithms in manifold learning aimed at manifold reconstruction from intrinsic distances are quite numerous, but many of them are closely related to just two basic ones, multidimensional scaling (MDS) and maximum variance unfolding (MVU), which aim at reconstructing the locations of the points y j up to an isometric (or almost isometric) embedding. This would be the case if the algorithm, with an input {d ij } N i,j=1 , produced a set of points {x N i } N i=1 ⊂ R n such that the functions f N , defined by f N (y i ) := x N i , tend to some f : M → R n with Σ := f (M ) (almost) isometric to M as N → ∞. Unfortunately, as we show below, the existing methods in general miss this point.

Multidimensional scaling (MDS)
The classical multidimensional scaling (MDS), introduced by Torgerson and further developed by many authors (see chapter 6 of [22] and references therein), has been formulated for the situation when the distance d M is Euclidean (which happens, e.g., when M is a convex subset of R m ). In practice, however, the MDS method is quite frequently applied when the distance d M is not necessarily Euclidean. In general, this does not allow one to reconstruct the embedding of the original manifold M up to an (almost) isometry, as the following example shows.
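For concreteness, the classical MDS pipeline (double centering of the squared distance matrix followed by a spectral decomposition) can be sketched as follows; this is a minimal illustration, not the implementation used in the paper. On genuinely Euclidean distance data it recovers the configuration up to a rigid motion:

```python
import numpy as np

def classical_mds(D, dim):
    """Classical (Torgerson) MDS: double-center the squared distance
    matrix and keep the top eigenvectors of the resulting Gram matrix."""
    k = D.shape[0]
    J = np.eye(k) - np.ones((k, k)) / k      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # candidate Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]          # top `dim` eigenvalues
    w = np.clip(w[idx], 0.0, None)           # drop small negative eigenvalues
    return V[:, idx] * np.sqrt(w)

# Sanity check: for a truly Euclidean distance matrix, the pairwise
# distances of the MDS output coincide with the input distances.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = classical_mds(D, 3)
D_rec = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
```

When d M is not Euclidean, the same pipeline is still well defined but no longer distance-preserving, which is exactly the failure mode exhibited next.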
Example 3.1. We follow the calculations from [23] of the MDS embedding of finite uniform samples of the unit circumference M . Namely, if {y i } N i=1 is a set of equally spaced points in M , then proposition 7.2.6 of [23] shows that the MDS embedding of {y i } N i=1 in R n lies, up to a rigid motion, on the closed curve γ N : [0, 2π] → R n defined by
γ N (t) := (a N 1 cos t, a N 1 sin t, a N 3 cos 3t, a N 3 sin 3t, . . .),
where lim N a N j = a j := √2/j (with j odd). Clearly, in the limit N → ∞ and n → ∞ this gives a closed snowflake-like curve γ homeomorphic but not isometric (nor even bi-Lipschitz) to M ; in fact,
|γ(t) − γ(s)|² = π d M (θ(t), θ(s)),  where θ(t) := (cos t, sin t).  (3.1)
To prove (3.1), we calculate
|γ(t) − γ(s)|² = Σ j odd a j ² ((cos jt − cos js)² + (sin jt − sin js)²) = 4 Σ j odd (1 − cos j(t − s))/j².  (3.2)
But from the Fourier expansion of τ ↦ |τ | over [−π, π] one has
Σ j odd (1 − cos jτ )/j² = (π/4)|τ |,  |τ | ≤ π,
which plugged in (3.2) gives (3.1) as claimed.
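Assuming, as stated in the example above, the limiting amplitudes a j = √2/j over the odd frequencies, the square-root (snowflake) behavior of the limiting curve can be verified numerically by truncating the series: the squared displacement grows linearly in the parameter gap.

```python
import numpy as np

# Truncated check that sum over odd j of 4*(1 - cos(j*h))/j^2 equals pi*h
# for 0 <= h <= pi, i.e. the squared displacement of the limiting MDS
# curve of the circle is linear in the gap (amplitudes a_j = sqrt(2)/j).
j = np.arange(1, 20001, 2, dtype=float)   # odd frequencies up to 19999
gaps = [0.01, 0.1, 1.0, 3.0]
errors = []
for h in gaps:
    sq_dist = np.sum(4.0 * (1.0 - np.cos(j * h)) / j**2)
    errors.append(abs(sq_dist - np.pi * h))
```

The truncation error of the tail is of order 1/j_max, well below the tolerance used here.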
Deeper results on what is in fact reconstructed by applying MDS to a generic metric measure space, together with further examples such as MDS on a multidimensional sphere and on a flat torus, can be found in [24] and in [25].

Maximum variance unfolding (MVU)
The method of maximum variance unfolding (MVU), alternatively called semidefinite embedding (SDE), has been introduced by Weinberger and Saul, see chapter 9.1 of [22], and amounts to finding the points {x i } N i=1 ⊂ R n by maximizing the total variance functional
Σ N i=1 |x i − x̄|²,  where x̄ := (1/N ) Σ N i=1 x i ,
subject to the set of constraints
|x i − x j | = d ij whenever y i is close to y j .  (3.3)
The condition of "y i close to y j " is understood differently in different versions of MVU, but most commonly as d ij ≤ ε for some fixed ε > 0, so that (3.3) becomes
|x i − x j | = d ij whenever d ij ≤ ε.  (3.4)
The constraints (3.3) (or, in particular, (3.4)) are reformulated in an equivalent way in terms of the Gram matrix K with entries K ij := x i • x j , so that the above maximization becomes a semidefinite programming problem.
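The Gram-matrix reformulation rests on the elementary identity |x i − x j |² = K ii − 2K ij + K jj , which turns the quadratic distance constraints into linear constraints on the positive semidefinite matrix K. A quick sanity check:

```python
import numpy as np

# Verify |x_i - x_j|^2 = K_ii - 2*K_ij + K_jj for the Gram matrix
# K = X X^T, the identity that makes MVU a semidefinite program.
rng = np.random.default_rng(1)
X = rng.standard_normal((10, 4))
K = X @ X.T
sq_from_gram = np.diag(K)[:, None] - 2 * K + np.diag(K)[None, :]
sq_direct = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
```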
Similarly to Example 3.1, it is easy to show that MVU (and, in fact, any method trying to locally preserve the distances as Euclidean ones) in general not only fails to reconstruct the embedding of the original manifold M up to an (almost) isometry, but, even worse, the constraints (3.4) may prevent reconstructing even something vaguely similar to M , as the example below shows.
Example 3.2. Taking again M to be the unit circumference S 1 and {y i } N i=1 to be a set of equally spaced points in M , suppose that {x N i } N i=1 ⊂ R n satisfy the constraints (3.4), i.e. |x N i − x N j | = d ij whenever d ij ≤ ε, for some fixed ε > 0, and that the continuous functions f N defined by f N (y i ) := x N i converge uniformly to some f : M → R n . Parameterizing M in a natural way over [0, 2π] by the curve θ : [0, 2π] → M , θ(t) := (cos t, sin t), for the curve γ : [0, 2π] → R n defined by γ(t) := f (θ(t)) we then have |γ(t) − γ(s)| = |t − s| whenever |t − s| ≤ ε. Hence γ is locally an isometry between an interval and a subset of R n , thus locally affine, and therefore γ([0, 2π]) is a nondegenerate line segment. But on the other hand one must have γ(0) = γ(2π), which is a contradiction.
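A small numeric check makes the mechanism of Example 3.2 concrete: on the unit circle the chord between nearby samples is strictly shorter than their geodesic distance, so the constraints |x i − x j | = d ij are already violated by the original circular shape and force a local straightening of the image.

```python
import numpy as np

# On the unit circle, the chord between consecutive samples equals
# 2*sin(d/2), which is strictly less than the geodesic gap d.  Hence
# exact local-distance constraints cannot be met by any curved image.
N = 100
t = np.linspace(0, 2 * np.pi, N, endpoint=False)
pts = np.column_stack([np.cos(t), np.sin(t)])
geo = 2 * np.pi / N                        # geodesic gap between neighbours
chord = np.linalg.norm(pts[1] - pts[0])    # Euclidean gap between neighbours
```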

Variational setting
From now on we assume M ⊂ R n to be a compact set connected by rectifiable arcs and equipped with the geodesic distance
d M (x, y) := inf {ℓ(θ) : θ : [0, 1] → M rectifiable, θ(0) = x, θ(1) = y},
where ℓ(θ) denotes the Euclidean length of θ. Let Σ k ⊂ M be a sequence of closed sets.
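When only a finite sample of M is available, the geodesic distance just defined is typically approximated by shortest paths in a neighborhood graph (the device used by Isomap and in the experiments of Section 7). A minimal sketch, with the threshold `eps` and the Floyd-Warshall recursion chosen purely for illustration:

```python
import numpy as np

def geodesic_estimates(points, eps):
    """Approximate intrinsic distances by shortest paths in the
    eps-neighborhood graph of the sample (the Isomap idea)."""
    k = len(points)
    euclid = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    graph = np.where(euclid <= eps, euclid, np.inf)
    np.fill_diagonal(graph, 0.0)
    # Floyd-Warshall; adequate for the small samples used here.
    for m in range(k):
        graph = np.minimum(graph, graph[:, m:m + 1] + graph[m:m + 1, :])
    return graph

# 60 equally spaced points on the unit circle: the shortest-path
# estimate of the distance between antipodal samples is close to pi.
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
pts = np.column_stack([np.cos(t), np.sin(t)])
d_hat = geodesic_estimates(pts, eps=0.25)
```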
Given an ε > 0 and a k ∈ N, we define the functionals F ε and F ε,k . The aim of this section is to prove the following easy result.
Theorem 4.1. Let M ⊂ R n be compact and connected by rectifiable arcs, and suppose there exist ε 0 > 0 and C 1 > 0 such that the curvature-type estimate (4.2) holds. For an x 0 ∈ M , a sequence of closed sets Σ k ⊂ M satisfying Σ k → M as k → ∞ in the sense of the Hausdorff distance, and a C 2 ∈ (0, C̄ 2 ], consider the variational problems (P k ) and (P ). Then the following assertions hold true.
(i) The variational problems (P k ) have solutions for all k ≥ k̄, where k̄ ∈ N depends only on ε.
(ii) If f k is a solution to (P k ), then there is a subsequence of {f k } (not relabeled) such that lim k f k = f in the sense of uniform convergence, where f solves (P ); moreover, f k (M ) → Σ, where Σ := f (M ), in the sense of the Hausdorff distance as k → +∞.
Before proving the above theorem, we make a series of remarks.
Remark 4.2. Under the conditions of the above Theorem 4.1 one necessarily has a bound involving the intrinsic diameter D of M . In fact, since |x − y| ≤ d M (x, y), the estimate (4.2) ensures it.
Remark 4.3. The conditions on M of the above Theorem 4.1 are automatically satisfied if M ⊂ R n is a C 1,1 smooth compact Riemannian submanifold. The condition (4.2) can then be seen as a bound on the curvatures of M . In particular, in this case α := Reach M > 0, and therefore (4.2) is satisfied according to Lemma A.1 for ε 0 := α, with C 1 as in that Lemma. Moreover, in this case the constant C̄ 2 may be estimated in terms of α and the intrinsic diameter of M according to Lemma A.2.
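As an illustration of the curvature-type estimate (4.2) on the simplest example: for the unit circle, whose reach equals 1, the geodesic distance is controlled linearly by the chord, d M (x, y) = 2 arcsin(|x − y|/2) ≤ (π/2)|x − y|. The constant π/2 is specific to the circle and serves only as a sanity check of the kind of bound Lemma A.1 provides.

```python
import numpy as np

# Chord vs geodesic on the unit circle (reach = 1): the ratio
# geodesic/chord = 2*arcsin(c/2)/c grows from 1 (nearby points)
# to pi/2 (antipodal points), so d_M <= (pi/2)*|x - y| everywhere.
c = np.linspace(1e-6, 2.0, 1000)    # chord lengths
d = 2.0 * np.arcsin(c / 2.0)        # corresponding geodesic distances
ratio = d / c
```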
Remark 4.4. Clearly, problem (P ), as well as the approximating problems (P k ), has many solutions. This is in the very nature of the problem statement: the given data are just intrinsic distances (which themselves do not contain any information on the embedding) and only very weak structural information on the embedding given by the constants C 1 and C 2 (and also by ε 0 ). Thus, if M is, say, a unit line segment, among the solutions to (P ) there are infinitely many other embeddings of M in the given Euclidean space as curves of unit length. However, they must be "twisted not too much", since any map f solving (P ) is required to satisfy (4.6) for all (x, y) ∈ M × M . This, in particular, yields that the Euclidean diameter of Σ := f (M ) cannot be arbitrarily small and thus excludes "pathological" embeddings like those provided by the Nash-Kuiper theorem. The fact that one requests the information on the structural constant C 2 to be retained by the embedding f solving (P ) via the requirement (4.6), besides avoiding such pathologies, is also used to force the injectivity of f , which is in a certain sense unavoidable (see Rem. 4.10). On the other hand, we do not force the embedding f to satisfy the curvature-type estimate (4.2). The reason is that in this way we are able to obtain a particularly simple algorithm for solving the approximating problems (P k ) (and hence for approximating the embeddings solving (P )) based on a semidefinite programming problem. One might of course request more from the embedding a priori, but this would result in more constraints in the optimization problems and hence in substantially more complicated algorithms.
Proof of Theorem 4.1. The proof will be divided into several steps.
Step 1. We first show that the sublevels of the functionals F ε,k are equicompact, that is, there is some k̄ ∈ N depending only on ε such that the set D K,ε is compact for every K > 0. In fact, by Lemma 4.6, for every ε > 0 there is a k̄ ∈ N depending only on ε such that the bound of that Lemma holds for every k ≥ k̄. Hence, up to redefining each f over M \ Σ k as an extension from Σ k to M with minimum Lipschitz constant, one has that the Lipschitz constants Lip f over M are equibounded for all such f . Therefore, the set D K,ε is compact by the Ascoli-Arzelà theorem, as claimed.
Step 2. Note that the classes C ⊂ C k ⊂ C(M ; R n ) are closed. Since each functional F ε,k is lower semicontinuous (as a supremum of a family of continuous functionals), the claim (i) of the theorem being proven (i.e. the existence of solutions to problems (P k )) follows from the compactness of the sublevels of each F ε,k with k ∈ N sufficiently large (depending only on ε).
Step 3. Denote now by F̄ ε,k the corresponding extended functionals. Since the classes C ⊂ C(M ; R n ) and C k ⊃ C are closed, the functionals F̄ ε,k also have equicompact sublevels.
Observe that the equality f = lim k f k , where f k ∈ C k , yields that f ∈ C. In fact, for every {x, y} ⊂ M there are {x k , y k } ⊂ Σ k such that lim k x k = x, lim k y k = y; passing to the limit, we get f ∈ C. Thus by Lemma 4.5 one has that Γ-lim k F̄ ε,k = F̄ ε . Let now, for each sufficiently large k ∈ N, the function f k ∈ C k stand for a minimizer of F ε,k over C k , i.e. a solution to (P k ). Clearly, f k is also a minimizer of F̄ ε,k over the whole space C(M ; R n ). Hence, by the main property of Γ-convergence (Thm. 2.10 from [21]), one has that f k converge, up to a subsequence (not relabeled), to a minimizer f of F̄ ε , hence a solution to (P ). Note that when lim k x k = x in M , then lim k f k (x k ) = f (x) in view of the uniform convergence of f k , which implies that f k (M ) → Σ in the sense of the Hausdorff distance as k → +∞. This proves claim (ii) of the theorem.
The following technical assertions have been used in the above proof.
for every ρ > 0, the latter inequality being due to the fact stated above. Thus, combining (4.7) and (4.9), we get the desired bound for every ρ > 0. Taking in the above estimate the supremum with respect to ρ, ε̄ such that 0 < ρ < ε̄ < ε, we obtain the claim. The last inequality, together with the fact that C k ⊃ C, yields one of the two Γ-limit inequalities; the other follows since C ⊂ C k . This concludes the proof.
Lemma 4.6. If M ⊂ R n is compact and connected by rectifiable arcs, then for every ε > 0 there is a k̄ ∈ N (depending only on ε) such that for every k ≥ k̄ the stated Lipschitz bound holds for all {x, y} ⊂ Σ k , x ̸ = y, where C > 0 depends only on K.
Finally, let V ij stand for the set of corresponding points on θ ij (including 0 and 1, so that V ij contains both x i and x j ). Note that there is a natural order of points in V ij : namely, for {θ ij (s), θ ij (t)} ⊂ V ij we write θ ij (s) ≤ θ ij (t) if s ≤ t. We will further say that u ∈ V ij and v ∈ V ij are two consecutive points if there are no points between u and v in the sense of the introduced order.
Choose a k̄ ∈ N such that for every k ≥ k̄, every pair of indices i, j = 1, . . ., N and every l = 1, . . ., m(i, j) there exists a v l k ∈ Σ k at distance (in d M ) at most ε/2 from the l-th point of V ij , when l < m(i, j).
The latter inequality implies, in particular, that joining the consecutive points v l k by geodesic segments, we get a "polygonal line" σ ij made of geodesic segments with vertices at most ε/2 close (in d M ) to the respective points of V ij , with geodesic distance between consecutive vertices at most ε, and, finally, with length controlled as in (4.12). Finally, for arbitrary {x, y} ⊂ Σ k with d M (x, y) ≥ ε, we find a couple {x i , x j } with x i close to x and x j close to y, and estimate d M (x, y) in view of (4.12), arguing analogously for both endpoints.
Lemma 4.7. If f is a minimizer of F ε (resp. F ε,k ) with ε ≤ ε 0 over some class containing some rigid translation, then the value F ε (f ) is bounded by the value of F ε over the identity map, which in turn is estimated via (4.2).
Remark 4.8. In view of Remark 4.3, the conditions on M are automatically satisfied if M ⊂ R n is a C 1,1 smooth compact submanifold, with ε 0 and C 1 in this case being as in Lemma A.1.
Proof of Lemma 4.7. Note that F ε (f ) is invariant with respect to compositions of f with rigid translations, and in particular the value of F ε over any rigid translation is equal to that over the identity map id. The relationship (4.2) provides the chord-geodesic comparison whenever d M (x, y) < ε; one therefore has, for such couples (x, y) ∈ M × M , the estimate showing the statement for F ε . The proof for F ε,k is identical.
Lemma 4.9. Suppose that for some C > 0 the bounds (4.18) and (4.20) hold, where Σ := f (M ). If, moreover, f is an injective function with a continuous inverse f −1 : Σ → M (which is the case, e.g., when f is proper), and for some c > 0 the lower bound (4.19) holds whenever d M (x, y) < ε, then also (4.21) holds.
Remark 4.10. The estimate (4.21) cannot hold for f just satisfying (4.18) and (4.20) only for d M (x, y) < ε, unless f is injective, as can be seen from the example of M a line segment of length 2π (identified with [0, 2π]) and f : [0, 2π] → R 2 defined by f (t) := (cos t, sin t).
Proof of Lemma 4.9. If f is injective, take an arbitrary δ > 0 and consider a rectifiable curve σ in Σ joining f (x) and f (y) with length within δ of d Σ (f (x), f (y)); set θ := f −1 ◦ σ. Since f −1 is continuous, then so is θ, and hence for every t ∈ [0, 1] one has d M (θ(t), θ(t + s)) < ε once s is sufficiently small. Thus from (4.19) we get the corresponding length bound, and taking the limit in the above inequality as δ → 0 + , we arrive at (4.21).

Discrete variational setting and algorithm
Let {y i } ⊂ M be a dense set in M , and denote for the sake of brevity d ij := d M (y i , y j ). Given an ε > 0 and a k ∈ N, we define the functional F ε,k on k-tuples of points of R n accordingly. The following statement is just a direct application of Theorem 4.1 to the sequence Σ k := {y j } k j=1 ⊂ M , once we denote x j := f (y j ) for all j ∈ N, where f is an embedding provided by Theorem 4.1(ii).
Proposition 5.1. Let M , C 1 , C 2 and ε 0 be as in Theorem 4.1, and assume that (x k 1 , . . ., x k k ) minimizes F ε,k . Then, up to a subsequence, one has x k i → x i as k → ∞, with the distances |x i − x j | approximating d ij in the sense of Theorem 4.1 for all {i, j} ⊂ N. In particular, the statement is valid when M is a C 1,1 smooth compact Riemannian submanifold of R n with ε 0 := Reach M and C 1 , C 2 as in Lemmata A.1 and A.2.
Reduction to a semidefinite programming problem. The problem of minimizing F ε,k over the set X k ⊂ (R n ) k can be written as minimizing a convex function G ε,k of a matrix argument over the set of positive semidefinite matrices K satisfying a set of convex constraints. The solution K of the latter problem is the Gram matrix of a set of vectors {x i }, i.e. K ij = x i • x j .
Adding a new scalar variable t ∈ R, one reduces the above problem to the following semidefinite programming problem (i.e. a problem of minimizing a linear function subject to linear constraints over the cone of positive semidefinite matrices): minimize t over the pairs (t, K) subject to the distortion constraints for all i, j = 1, . . ., k, i ̸ = j, with K a positive semidefinite k × k matrix. (5.3)
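A small numpy sanity check of the Gram-matrix formulation. The exact constraint set of (5.3) is abbreviated above, so the two families used below (distance matching at scales d ij ≥ ε, and the lower bound C 2 d ij below ε, in the spirit of (4.6)) are an illustrative stand-in: for a sample of the unit segment, the identity embedding has zero distortion, hence its Gram matrix is feasible with t = 0.

```python
import numpy as np

# Hypothetical feasibility check: for points on a straight segment, the
# identity embedding distorts nothing, so its Gram matrix satisfies
# distortion-type constraints of the kind described above with t = 0.
k, eps, C2 = 20, 0.2, 2 / np.pi
y = np.linspace(0.0, 1.0, k)             # sample of the unit segment
d = np.abs(y[:, None] - y[None, :])      # intrinsic = Euclidean here
X = y[:, None]                           # identity embedding into R^1
K = X @ X.T                              # Gram matrix K_ij = x_i . x_j
sq = np.diag(K)[:, None] - 2 * K + np.diag(K)[None, :]
far = d >= eps
t_needed = np.max(np.abs(np.sqrt(sq[far]) - d[far]) / d[far])
lower_ok = bool(np.all(sq >= (C2 * d) ** 2 - 1e-12))
```

In the actual algorithm, K is the optimization variable of the semidefinite program and the embedded points are recovered from its factorization.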

How to compute Čech cohomologies
Since M and Σ := f (M ) are homeomorphic (even bi-Lipschitz equivalent) by Theorem 4.1, they have the same homologies and cohomologies for every reasonable (co)homology theory. In topological data analysis it is quite usual to consider the Čech cohomologies of M . To compute them when M is not observed directly, but is determined just by distance matrices, one has to construct its embedding into R n and build Čech complexes on Euclidean balls centered at samples from the image of such an embedding. We show here that for the embeddings f : M → R n provided by Theorem 4.1(ii) one can give explicit estimates on such complexes (how small the radii of the balls should be and how well fitted the set of their centers should be) so as to recover the cohomologies of M . Throughout this section we denote by B r (x) ⊂ R n the open Euclidean ball of radius r > 0 with center x ∈ R n .
Lemma 6.1. Assume that the conditions of Theorem 4.1 are satisfied. If x ∈ Σ ∩ B σ (x̄), x̄ ∈ Σ, with x = f (y) and x̄ = f (ȳ) for y, ȳ ∈ M , and σ ≤ C 2 ε 0 , then y ∈ B σ ′ (ȳ) with σ ′ := σ/C 2 . Vice versa, if y ∈ M ∩ B σ ′′ (ȳ), ȳ ∈ M , where σ ′′ > 0 is defined by the equation (6.1), then x ∈ B σ (x̄).
Proof. One has d M (y, ȳ) ≤ σ/C 2 in view of (4.6), so that the first claim follows from (4.2).
To prove the second claim, note that the inequality 0 < C 2 ≤ C̄ 2 and the definition (4.3) of C̄ 2 imply the required bound. Hence, it suffices to apply (4.4) and recall the definition of σ ′′ (see (6.1)).
Let Λ ⊂ N be a finite set of indices such that for all y ∈ M there is a λ ∈ Λ and an ȳ λ ∈ M satisfying d M (y, ȳ λ ) ≤ δ for some δ > 0. In other words, {ȳ λ } is a finite δ-net of M (equipped with d M ). Denote now by C Σ (r) the Čech complex built on the Euclidean balls B r (x̄ λ ), where x̄ λ := f (ȳ λ ), and by C M (r) the Čech complex built on the Euclidean balls B r (ȳ λ ). We note that the vertices of all these complexes may be considered the same (namely, the set of vertices of each of them may be identified with the index set Λ).
Lemma 6.3. Under the conditions of Theorem 4.1 one has the stated inclusions between the complexes when 0 < σ ≤ C 2 ε 0 and σ ′′ is defined by (6.1) with 0 < σ ′′ ≤ C 2 ε 0 .
Proof. This follows immediately from Lemma 6.1.
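As a toy illustration of the complexes just introduced, restricted to their 1-skeleton (which already captures b 0 and the cycle rank; this simplified computation is of course not the full Čech cohomology): for a sample of the unit circle with a suitable radius, the nerve of the balls is a single cycle.

```python
import numpy as np

def graph_betti(points, r):
    """1-skeleton of the Cech complex on balls B_r(x_lambda): two balls
    span an edge iff their centers are closer than 2r.  Returns b0
    (number of components) and the cycle rank E - V + b0."""
    k = len(points)
    D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    edges = [(a, b) for a in range(k) for b in range(a + 1, k) if D[a, b] < 2 * r]
    parent = list(range(k))
    def find(a):                      # union-find with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for a, b in edges:
        parent[find(a)] = find(b)
    b0 = len({find(a) for a in range(k)})
    return b0, len(edges) - k + b0

# 12 equally spaced points on the unit circle with r = 0.3: only
# consecutive balls overlap, so the nerve is a 12-cycle.
t = np.linspace(0, 2 * np.pi, 12, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t)])
b0, cycles = graph_betti(circle, r=0.3)
```

At this radius no three balls share a point, so the 1-skeleton already is the whole complex and its cohomology matches that of the circle.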
We now consider the particular case when M ⊂ R n is a C 1,1 smooth compact Riemannian submanifold.
Proposition 6.4. Let M ⊂ R n be a C 1,1 smooth compact Riemannian submanifold, so that α := Reach M > 0, and let the constant C 1 be as in Lemma A.1 and C̄ 2 as in Lemma A.2. Assume C 2 ∈ (0, C̄ 2 ] and 0 < σ ≤ C 2 α to be so small that σ ′′ defined by (6.1) satisfies 0 < σ ′′ ≤ C 2 α, and let δ ∈ (0, α) be sufficiently small. Then H * (C Σ (σ); R) ≃ H * (M ; R), where H * stands for the Čech cohomology.
Remark 6.5. One may take the points ȳ λ to be drawn by sampling M in an i.i.d. way according to the volume measure on M . In fact, by proposition 3.2 of [26], if #Λ > n(M, ρ, p), then with probability at least 1 − p the sample {ȳ λ } λ∈Λ is a ρ-net of M , and the number n(M, ρ, p) depends explicitly, besides ρ and p, also on the total volume and the dimension of M .
Proof of Proposition 6.4. Lemma A.1 implies (4.2), so that the hypotheses of Lemma 6.3 hold. Therefore, it suffices now to apply Lemma 6.3 with ε 0 := α to get the claim.

Numerical experiments
In this section, we present numerical experiments illustrating the performance of our procedure on four sample datasets: a line segment, a two-dimensional sphere, the Swiss Roll, and the flat torus embedded in R 4 as the Clifford torus. In all the experiments except the last one, the constant C 2 from (5.3) is set to 2/π; to reconstruct the Clifford torus, we chose C 2 = 0.2/π. As a quantitative measure of performance, we compute the error Err defined in (7.1), which reflects the average relative error in pairwise distances; here d̂ ij , 1 ≤ i, j ≤ k, stands for the pairwise distance between the recovered i-th and j-th elements of the sample.
We start with the example of a line segment. We took k = 100 equidistant points on the unit interval [0, 1] and embedded them into R 2 using our algorithm with n = 2 and ε = 0.2. The result is shown in Figure 1. Though the recovered points do not lie exactly on a segment, the error (7.1) equals Err = 9 • 10 −4 , which is quite small.
In the example of a two-dimensional sphere we have two different setups. In the first one, we took k = 100 points on a grid on the unit sphere S 2 , computed exact geodesic distances and applied the procedure with parameters n = 3 and ε = 0.6. After that, we computed approximate pairwise geodesic distances over the resulting point cloud, as is done in Isomap. As a result, we obtained Err = 0.13. In the second setup, we had k = 100 points drawn independently from the uniform distribution on the sphere and computed exact geodesic distances between them. After that, we performed the embedding into R 3 using our procedure with parameters n = 3 and ε = 0.6, and computed approximate pairwise geodesic distances between the embedded points, again using the same method as in Isomap. As a result, we obtained Err = 0.16. The results of the sphere embedding are displayed in Figure 2.
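The error measure (7.1) is described above only in words; assuming it is the mean over distinct pairs of the relative deviations |d̂ ij − d ij |/d ij (the display itself is abbreviated, so this normalization is an assumption), it can be computed as follows:

```python
import numpy as np

def err(d_true, d_rec):
    """Average relative error in pairwise distances over distinct pairs
    (assumed form of (7.1); the exact display is not reproduced here)."""
    iu = np.triu_indices_from(d_true, k=1)
    return np.mean(np.abs(d_rec[iu] - d_true[iu]) / d_true[iu])

# Example: 100 equidistant points on [0, 1]; a uniform 0.1% stretching
# of all distances yields Err = 1e-3.
y = np.linspace(0.0, 1.0, 100)
d_true = np.abs(y[:, None] - y[None, :])
d_rec = d_true * 1.001
```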
Next, we carried out experiments on the widely known synthetic Swiss Roll dataset from the Scikit-learn library in Python. Here we also have two different setups. In the first one, we generated k = 100 points, computed pairwise Euclidean distances and applied the procedure with parameters n = 3 and ε = 3. The results are shown in Figure 3. After that, we computed pairwise Euclidean distances between the recovered points. The resulting average relative error (7.1) was equal to Err = 0.4. In the second setup, we computed exact geodesic distances between the points and applied the procedure with parameters n = 2 and ε = 3. After that, we computed the Euclidean distances between the embedded points. The resulting average relative error (7.1) was equal to Err = 0.28.
Finally, for the Clifford torus, we took √k = 15 equally spaced points on each circumference, so the total number of samples was equal to k = 225. After that, we applied the embedding procedure with parameters n = 4, ε = 1.4 and C 2 = 0.2/π. The projections of the initial points and of the embedding into R 4 are displayed in Figure 4. The average relative distortion was equal to Err = 0.21. The distances between the embedded points were estimated in the same way as in Isomap. To justify this, let f (t) := arcsin t for t < 0.5. Since f (0) = 0 and f ′ (0) = 1, we have arcsin t = t + O(t 3 ), so that for nearby points the chordal distances approximate the geodesic ones up to higher-order terms.
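The Clifford torus embedding of the flat torus into R 4 , and the first-order accuracy of passing from chordal to geodesic distances via arcsin, can be sketched as follows (a minimal check, with the specific sample points chosen arbitrarily):

```python
import numpy as np

# Clifford torus: (cos u, sin u, cos v, sin v)/sqrt(2), a flat torus in
# R^4 whose geodesic distance is sqrt(du^2 + dv^2)/sqrt(2) for small
# angular offsets.  For nearby points arcsin(t) = t + t^3/6 + O(t^5),
# so chordal distances approximate geodesic ones to first order.
def clifford(u, v):
    return np.array([np.cos(u), np.sin(u), np.cos(v), np.sin(v)]) / np.sqrt(2)

p, q = clifford(0.0, 0.0), clifford(0.3, 0.4)
chord = np.linalg.norm(p - q)
geo = np.sqrt(0.3 ** 2 + 0.4 ** 2) / np.sqrt(2)   # flat intrinsic distance
```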


Figure 1. Reconstruction of a line segment from pairwise distances. The average distance error is 9 • 10 −4 .

Figure 2. Reconstruction of the unit sphere from pairwise distances. Top line, columns 1 and 2: points on a grid on the sphere. Top line, columns 3 and 4: the recovered points of the unit sphere. Bottom line, columns 1 and 2: points drawn from the uniform distribution on the unit sphere. Bottom line, columns 3 and 4: the recovered points from approximate geodesic distances.

Figure 3. Reconstruction of the Swiss Roll from pairwise distances. Top line: sample points in R 3 (left), projection of the sample points on the XOZ plane (center), reconstructed Swiss Roll (right). Bottom line: sample points in R 3 (left), projection of the sample points on the XOZ plane (center), embedding into R 2 (right).

Figure 4. Reconstruction of the Clifford torus from pairwise distances. Top line: projections of the original point cloud. Bottom line: projections of the points after embedding.