Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry

28/10/2015
Publication GSI2015
OAI : oai:www.see.asso.fr:11784:14264

Résumé

We generalize the O(dnϵ2)-time (1 + ε)-approximation algorithm for the smallest enclosing Euclidean ball [2,10] to point sets in hyperbolic geometry of arbitrary dimension. We guarantee a O(1/ϵ2) convergence time by using a closed-form formula to compute the geodesic α-midpoint between any two points. Those results allow us to apply the hyperbolic k-center clustering for statistical location-scale families or for multivariate spherical normal distributions by using their Fisher information matrix as the underlying Riemannian hyperbolic metric.

Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry

Collection

application/pdf Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry Frank Nielsen, Gaëtan Hadjeres

Média

Voir la vidéo

Métriques

151
8
842.06 Ko
 application/pdf
bitcache://0ffdb7dba004ff3a6288227e4c12d30b4770ba05

Licence

Creative Commons Attribution-ShareAlike 4.0 International

Sponsors

Organisateurs

logo_see.gif
logocampusparissaclay.png

Sponsors

entropy1-01.png
springer-logo.png
lncs_logo.png
Séminaire Léon Brillouin Logo
logothales.jpg
smai.png
logo_cnrs_2.jpg
gdr-isis.png
logo_gdr-mia.png
logo_x.jpeg
logo-lix.png
logorioniledefrance.jpg
isc-pif_logo.png
logo_telecom_paristech.png
csdcunitwinlogo.jpg
<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/11784/14264</identifier><creators><creator><creatorName>Frank Nielsen</creatorName></creator><creator><creatorName>Gaëtan Hadjeres</creatorName></creator></creators><titles>
            <title>Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2015</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Sat 7 Nov 2015</date>
	    <date dateType="Updated">Wed 31 Aug 2016</date>
            <date dateType="Submitted">Fri 20 Apr 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">0ffdb7dba004ff3a6288227e4c12d30b4770ba05</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>24657</version>
        <descriptions>
            <description descriptionType="Abstract">
We generalize the O(dnϵ2)-time (1 + ε)-approximation algorithm for the smallest enclosing Euclidean ball [2,10] to point sets in hyperbolic geometry of arbitrary dimension. We guarantee a O(1/ϵ2) convergence time by using a closed-form formula to compute the geodesic α-midpoint between any two points. Those results allow us to apply the hyperbolic k-center clustering for statistical location-scale families or for multivariate spherical normal distributions by using their Fisher information matrix as the underlying Riemannian hyperbolic metric.

</description>
        </descriptions>
    </resource>
.

Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry Frank Nielsen1 Ga¨etan Hadjeres2 ´Ecole Polytechnique 1 Sony Computer Science Laboratories, Inc 1,2 Conference on Geometric Science of Information c 2015 Frank Nielsen - Ga¨etan Hadjeres 1 The Minimum Enclosing Ball problem Finding the Minimum Enclosing Ball (or the 1-center) of a finite point set P = {p1, . . . , pn} in the metric space (X, dX (., .)) consists in finding c ∈ X such that c = argminc ∈X max p∈P dX (c , p) Figure : A finite point set P and its minimum enclosing ball MEB(P) c 2015 Frank Nielsen - Ga¨etan Hadjeres 2 The approximating minimum enclosing ball problem In a euclidean setting, this problem is well-defined: uniqueness of the center c∗ and radius R∗ of the MEB computationally intractable in high dimensions. We fix an > 0 and focus on the Approximate Minimum Enclosing Ball problem of finding an -approximation c ∈ X of MEB(P) such that dX (c, p) ≤ (1 + )R∗ ∀p ∈ P. c 2015 Frank Nielsen - Ga¨etan Hadjeres 3 The approximating minimum enclosing ball problem: prior work Approximate solution in the euclidean case are given by Badoiu and Clarkson’s algorithm [Badoiu and Clarkson, 2008]: Initialize center c1 ∈ P Repeat 1/ 2 times the following update: ci+1 = ci + fi − ci i + 1 where fi ∈ P is the farthest point from ci . How to deal with point sets whose underlying geometry is not euclidean ? c 2015 Frank Nielsen - Ga¨etan Hadjeres 4 The approximating minimum enclosing ball problem: prior work This algorithm has been generalized to dually flat manifolds [Nock and Nielsen, 2005] Riemannian manifolds [Arnaudon and Nielsen, 2013] Applying these results to hyperbolic geometry give the existence and uniqueness of MEB(P), but give no explicit bounds on the number of iterations assume that we are able to precisely cut geodesics. c 2015 Frank Nielsen - Ga¨etan Hadjeres 5 The approximating minimum enclosing ball problem: our contribution We analyze the case of point sets whose underlying geometry is hyperbolic. Using a closed-form formula to compute geodesic α-midpoints, we obtain a intrinsic (1 + )-approximation algorithm to the approximate minimum enclosing ball problem a O(1/ 2) convergence time guarantee a one-class clustering algorithm for specific subfamilies of normal distributions using their Fisher information metric c 2015 Frank Nielsen - Ga¨etan Hadjeres 6 Model of d-dimensional hyperbolic geometry: The Poincar´e ball model The Poincar´e ball model (Bd , ρ(., .)) consists in the open unit ball Bd = {x ∈ Rd : x < 1} together with the hyperbolic distance ρ (p, q) = arcosh 1 + 2 p − q 2 (1 − p 2) (1 − q 2) , ∀p, q ∈ Bd . This distance induces on the metric space (Bd , ρ) a Riemannian structure. c 2015 Frank Nielsen - Ga¨etan Hadjeres 7 Geodesics in the Poincar´e ball model Shorter paths between two points (geodesics) are exactly straight (euclidean) lines passing through the origin circle arcs orthogonal to the unit sphere Figure : “Straight” lines in the Poincar´e ball model c 2015 Frank Nielsen - Ga¨etan Hadjeres 8 Circles in the Poincar´e ball model Circles in the Poincar´e ball model look like euclidean circles but with different center Figure : Difference between euclidean MEB (in blue) and hyperbolic MEB (in red) for the set of blue points in hyperbolic Poincar´e disk (in black). The red cross is the hyperbolic center of the red circle while the pink one is its euclidean center. c 2015 Frank Nielsen - Ga¨etan Hadjeres 9 Translations in the Poincar´e ball model Tp (x) = 1 − p 2 x + x 2 + 2 x, p + 1 p p 2 x 2 + 2 x, p + 1 Figure : Tiling of the hyperbolic plane by squares c 2015 Frank Nielsen - Ga¨etan Hadjeres 10 Closed-form formula for computing α-midpoints A point m is the α-midpoint p#αq of two points p, q for α ∈ [0, 1] if m belongs to the geodesic joining the two points p, q m verifies ρ (p, mα) = αρ (p, q) . c 2015 Frank Nielsen - Ga¨etan Hadjeres 11 Closed-form formula for computing α-midpoints A point m is the α-midpoint p#αq of two points p, q for α ∈ [0, 1] if m belongs to the geodesic joining the two points p, q m verifies ρ (p, mα) = αρ (p, q) . For the special case p = (0, . . . , 0), q = (xq, 0, . . . , 0), we have p#αq := (xα, 0, . . . , 0) with xα = cα,q − 1 cα,q + 1 , where cα,q := eαρ(p,q) = 1 + xq 1 − xq α . c 2015 Frank Nielsen - Ga¨etan Hadjeres 11 Closed-form formula for computing α-midpoints Noting that p#αq = Tp (T−p (p) #αT−p (q)) ∀p, q ∈ Bd we obtain a closed-form formula for computing p#αq how to compute p#αq in linear time O(d) that these transformations are exact. c 2015 Frank Nielsen - Ga¨etan Hadjeres 12 (1+ )-approximation of an hyperbolic enclosing ball of fixed radius For a fixed radius r > R∗, we can find c ∈ Bd such that ρ (c, P) ≤ (1 + )r ∀p ∈ P with Algorithm 1: (1 + )-approximation of EHB(P, r) 1: c0 := p1 2: t := 0 3: while ∃p ∈ P such that p /∈ B (ct, (1 + ) r) do 4: let p ∈ P be such a point 5: α := ρ(ct ,p)−r ρ(ct ,p) 6: ct+1 := ct#αp 7: t := t+1 8: end while 9: return ct c 2015 Frank Nielsen - Ga¨etan Hadjeres 13 Idea of the proof By the hyperbolic law of cosines : ch (ρt) ≥ ch (h) ch (ρt+1) ch (ρ1) ≥ ch (h)T ≥ ch ( r)T . ct+1 ct c∗ pt h > r ρt+1 ρt r ≤ rr θ θ Figure : Update of ct c 2015 Frank Nielsen - Ga¨etan Hadjeres 14 (1+ )-approximation of an hyperbolic enclosing ball of fixed radius The EHB(P, r) algorithm is a O(1/ 2)-time algorithm which returns the center of a hyperbolic enclosing ball with radius (1 + )r in less than 4/ 2 iterations. c 2015 Frank Nielsen - Ga¨etan Hadjeres 15 (1+ )-approximation of an hyperbolic enclosing ball of fixed radius The EHB(P, r) algorithm is a O(1/ 2)-time algorithm which returns the center of a hyperbolic enclosing ball with radius (1 + )r in less than 4/ 2 iterations. Our error with the true MEHB center c∗ verifies ρ (c, c∗ ) ≤ arcosh ch ((1 + ) r) ch (R∗) c 2015 Frank Nielsen - Ga¨etan Hadjeres 15 (1 + + 2 /4)-approximation of MEHB(P) In fact, as R∗ is unknown in general, the EHB algorithm returns for any r: an (1 + )-approximation of EHB(P) if r ≥ R∗ the fact that r < R∗ if the result obtained after more than 4/ 2 iterations is not good enough. c 2015 Frank Nielsen - Ga¨etan Hadjeres 16 (1 + + 2 /4)-approximation of MEHB(P) In fact, as R∗ is unknown in general, the EHB algorithm returns for any r: an (1 + )-approximation of EHB(P) if r ≥ R∗ the fact that r < R∗ if the result obtained after more than 4/ 2 iterations is not good enough. This suggests to implement a dichotomic search in order to compute an approximation of the minimal hyperbolic enclosing ball. We obtain a O(1 + + 2/4)-approximation of MEHB(P) in O N 2 log 1 iterations. c 2015 Frank Nielsen - Ga¨etan Hadjeres 16 (1 + + 2 /4)-approximation of MEHB(P) algorithm Algorithm 2: (1 + )-approximation of MEHB(P) 1: c := p1 2: rmax := ρ (c, P); rmin = rmax 2 ; tmax := +∞ 3: r := rmax; 4: repeat 5: ctemp := Alg1 P, r, 2 , interrupt if t > tmax in Alg1 6: if call of Alg1 has been interrupted then 7: rmin := r 8: else 9: rmax := r ; c := ctemp 10: end if 11: dr := rmax−rmin 2 ; r := rmin + dr ; tmax := log(ch(1+ /2)r)−log(ch(rmin)) log(ch(r /2)) 12: until 2dr < rmin 2 13: return c c 2015 Frank Nielsen - Ga¨etan Hadjeres 17 Experimental results The number of iterations does not depend on d. Figure : Number of α-midpoint calculations as a function of in logarithmic scale for different values of d. c 2015 Frank Nielsen - Ga¨etan Hadjeres 18 Experimental results The running time is approximately O(dn 2 ) (vertical translation in logarithmic scale). Figure : execution time as a function of in logarithmic scale for different values of d. c 2015 Frank Nielsen - Ga¨etan Hadjeres 19 Applications Hyperbolic geometry arises when considering certain subfamilies of multivariate normal distributions. For instance, the following subfamilies N µ, σ2In of n-variate normal distributions with scalar covariance matrix (In is the n × n identity matrix), N µ, diag σ2 1, . . . , σ2 n of n-variate normal distributions with diagonal covariance matrix N(µ0, Σ) of d-variate normal distributions with fixed mean µ0 and arbitrary positive definite covariance matrix Σ are statistical manifolds whose Fisher information metric is hyperbolic. c 2015 Frank Nielsen - Ga¨etan Hadjeres 20 Applications In particular, our results apply to the two-dimensional location-scale subfamily: Figure : MEHB (D) of probability density functions (left) in the (µ, σ) superior half-plane (right). P = {A, B, C}. c 2015 Frank Nielsen - Ga¨etan Hadjeres 21 Openings Plugging the EHB and MEHB algorithms to compute clusters centers in the approximation algorithm by [Gonzalez, 1985], we obtain approximate algorithms for covering in hyperbolic spaces the k-center problem in O kNd 2 log 1 c 2015 Frank Nielsen - Ga¨etan Hadjeres 22 Algorithm 3: Gonzalez farthest-first traversal approximation algo- rithm 1: C1 := P, i = 0 2: while i ≤ k do 3: ∀j ≤ i, compute cj := MEB(Cj ) 4: ∀j ≤ i, set fj := argmaxp∈P ρ(p, cj ) 5: find f ∈ {fj } whose distance to its cluster center is maximal 6: create cluster Ci containing f 7: add to Ci all points whose distance to f is inferior to the distance to their cluster center 8: increment i 9: end while 10: return {Ci }i c 2015 Frank Nielsen - Ga¨etan Hadjeres 23 Openings The computation of the minimum enclosing hyperbolic ball does not necessarily involve all points p ∈ P. Core-sets in hyperbolic geometry the MEHB obtained by the algorithm is an -core-set differences with the euclidean setting: core-sets are of size at most 1/ [Badoiu and Clarkson, 2008] c 2015 Frank Nielsen - Ga¨etan Hadjeres 24 Thank you! c 2015 Frank Nielsen - Ga¨etan Hadjeres 25 Bibliography I Arnaudon, M. and Nielsen, F. (2013). On approximating the Riemannian 1-center. Computational Geometry, 46(1):93–104. Badoiu, M. and Clarkson, K. L. (2008). Optimal core-sets for balls. Comput. Geom., 40(1):14–22. Gonzalez, T. F. (1985). Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306. Nock, R. and Nielsen, F. (2005). Fitting the smallest enclosing Bregman ball. In Machine Learning: ECML 2005, pages 649–656. Springer. c 2015 Frank Nielsen - Ga¨etan Hadjeres 26