New metric and connections in statistical manifolds

28/10/2015
Publication GSI2015
OAI : oai:www.see.asso.fr:11784:14278
DOI : http://dx.doi.org/10.1007/978-3-319-25040-3_25

Abstract

We define a metric and a family of α-connections in statistical manifolds, based on ϕ-divergence, which emerges in the framework of ϕ-families of probability distributions. This metric and α-connections generalize the Fisher information metric and Amari’s α-connections. We also investigate the parallel transport associated with the α-connection for α = 1.

License

Creative Commons Attribution-ShareAlike 4.0 International

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/11784/14278</identifier><creators><creator><creatorName>Rui F. Vigelis</creatorName></creator><creator><creatorName>David de Souza</creatorName></creator><creator><creatorName>Charles Cavalcante</creatorName></creator></creators><titles>
            <title>New metric and connections in statistical manifolds</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2015</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Sat 7 Nov 2015</date>
	    <date dateType="Updated">Sun 16 Jul 2017</date>
            <date dateType="Submitted">Sat 17 Feb 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">398cea4fb659c82d176491b8c8ec6f533062b67e</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>33086</version>
        <descriptions>
            <description descriptionType="Abstract">
We define a metric and a family of α-connections in statistical manifolds, based on ϕ-divergence, which emerges in the framework of ϕ-families of probability distributions. This metric and α-connections generalize the Fisher information metric and Amari’s α-connections. We also investigate the parallel transport associated with the α-connection for α = 1.

</description>
        </descriptions>
    </resource>

2nd Conference on Geometric Science of Information, GSI2015
October 28–30, 2015 – Ecole Polytechnique, Paris-Saclay

New Metric and Connections in Statistical Manifolds

Rui F. Vigelis (1), David C. de Souza (2), and Charles C. Cavalcante (3)
(1, 3) Federal University of Ceará – Brazil
(2) Federal Institute of Ceará – Brazil

Session “Hessian Information Geometry”, October 28

Outline

Introduction
ϕ-Functions
ϕ-Divergence
Generalized Statistical Manifold
Connections
ϕ-Families
Discussion

Introduction

In the paper

R.F. Vigelis, C.C. Cavalcante. On ϕ-families of probability distributions. J. Theor. Probab., 26(3): 870–884, 2013,

the authors proposed the so-called ϕ-divergence $D_\varphi(p \parallel q)$, for $p, q \in \mathcal{P}_\mu$. The ϕ-divergence is defined in terms of a ϕ-function. The metric and connections that we propose are derived from the ϕ-divergence $D_\varphi(\cdot \parallel \cdot)$.

Proposing new geometric structures (metric and connections) in statistical manifolds is a recurrent research topic. To cite a few:

J. Zhang. Divergence function, duality, and convex analysis. Neural Computation, 16(1): 159–195, 2004.
J. Naudts. Estimators, escort probabilities, and φ-exponential families in statistical physics. JIPAM, 5(4): Paper No. 102, 15 p., 2004.
S.-i. Amari, A. Ohara, H. Matsuzoe. Geometry of deformed exponential families: invariant, dually-flat and conformal geometries. Physica A, 391(18): 4308–4319, 2012.
H. Matsuzoe. Hessian structures on deformed exponential families and their conformal structures. Differential Geom. Appl., 35(suppl.): 323–333, 2014.

Let $(T, \Sigma, \mu)$ be a measure space. All probability distributions will be considered in
\[
\mathcal{P}_\mu = \Bigl\{ p \in L^0 : p > 0 \text{ and } \int_T p \, d\mu = 1 \Bigr\},
\]
where $L^0$ denotes the set of all real-valued, measurable functions on $T$, with equality $\mu$-a.e.

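ϕ-Functions

A function $\varphi\colon \mathbb{R} \to (0, \infty)$ is said to be a ϕ-function if the following conditions are satisfied:

(a1) $\varphi(\cdot)$ is convex;
(a2) $\lim_{u \to -\infty} \varphi(u) = 0$ and $\lim_{u \to \infty} \varphi(u) = \infty$;
(a3) there exists a measurable function $u_0\colon T \to (0, \infty)$ such that
\[
\int_T \varphi(c(t) + \lambda u_0(t)) \, d\mu < \infty, \quad \text{for all } \lambda > 0,
\]
for each measurable function $c\colon T \to \mathbb{R}$ such that $\varphi(c) \in \mathcal{P}_\mu$.

Not all functions satisfying (a1) and (a2) admit such a function $u_0$. Condition (a3) is imposed so that ϕ-families parametrize $\mathcal{P}_\mu$ in the same manner as exponential families.

The κ-exponential function $\exp_\kappa\colon \mathbb{R} \to (0, \infty)$, for $\kappa \in [-1, 1]$, which is given by
\[
\exp_\kappa(u) =
\begin{cases}
\bigl(\kappa u + \sqrt{1 + \kappa^2 u^2}\bigr)^{1/\kappa}, & \text{if } \kappa \neq 0, \\
\exp(u), & \text{if } \kappa = 0,
\end{cases}
\]
is a ϕ-function.

The q-exponential function
\[
\exp_q(u) = [1 + (1 - q)u]_+^{1/(1-q)},
\]
where $q > 0$ and $q \neq 1$, is not a ϕ-function, since it is not strictly positive ($\exp_q(u) = 0$ for $u < -1/(1 - q)$ when $q < 1$).

A ϕ-function $\varphi(\cdot)$ may not be a φ-exponential function $\exp_\phi(\cdot)$, which is defined as the inverse of
\[
\ln_\phi(u) = \int_1^u \frac{1}{\phi(x)} \, dx, \quad u > 0,
\]
for some increasing function $\phi\colon [0, \infty) \to [0, \infty)$.

ϕ-Divergence

We define the ϕ-divergence as
\[
D_\varphi(p \parallel q) = \frac{\displaystyle\int_T \frac{\varphi^{-1}(p) - \varphi^{-1}(q)}{(\varphi^{-1})'(p)} \, d\mu}{\displaystyle\int_T \frac{u_0}{(\varphi^{-1})'(p)} \, d\mu},
\]
for any $p, q \in \mathcal{P}_\mu$. If $\varphi(\cdot) = \exp(\cdot)$ and $u_0 = 1$, then $D_\varphi(p \parallel q)$ coincides with the Kullback–Leibler divergence
\[
D_{KL}(p \parallel q) = \int_T p \log\Bigl(\frac{p}{q}\Bigr) \, d\mu.
\]

Generalized Statistical Manifold

A metric $(g_{ij})$ can be derived from the ϕ-divergence:
\[
g_{ij} = -\Bigl[ \Bigl(\frac{\partial}{\partial\theta^i}\Bigr)_p \Bigl(\frac{\partial}{\partial\theta^j}\Bigr)_q D_\varphi(p \parallel q) \Bigr]_{q=p}
= -E'_\theta\Bigl[ \frac{\partial^2 f_\theta}{\partial\theta^i \partial\theta^j} \Bigr],
\]
where $f_\theta = \varphi^{-1}(p_\theta)$ and
\[
E'_\theta[\cdot] = \frac{\int_T (\cdot)\, \varphi'(f_\theta) \, d\mu}{\int_T u_0 \varphi'(f_\theta) \, d\mu}.
\]
Taking the log-likelihood function $l_\theta = \log(p_\theta)$ in place of $f_\theta = \varphi^{-1}(p_\theta)$, we get the Fisher information matrix.

A family of probability distributions $\mathcal{P} = \{p_\theta : \theta \in \Theta\} \subseteq \mathcal{P}_\mu$ is said to be a generalized statistical manifold if the following conditions are satisfied:

(P1) $\Theta$ is a domain (an open and connected set) in $\mathbb{R}^n$.
(P2) $p(t; \theta) = p_\theta(t)$ is a differentiable function with respect to $\theta$.
(P3) The operations of integration with respect to $\mu$ and differentiation with respect to $\theta^i$ commute.
(P4) The matrix $g = (g_{ij})$, which is defined by
\[
g_{ij} = -E'_\theta\Bigl[ \frac{\partial^2 f_\theta}{\partial\theta^i \partial\theta^j} \Bigr],
\]
is positive definite at each $\theta \in \Theta$.

The matrix $(g_{ij})$ can also be expressed as
\[
g_{ij} = E''_\theta\Bigl[ \frac{\partial f_\theta}{\partial\theta^i} \frac{\partial f_\theta}{\partial\theta^j} \Bigr],
\quad \text{where} \quad
E''_\theta[\cdot] = \frac{\int_T (\cdot)\, \varphi''(f_\theta) \, d\mu}{\int_T u_0 \varphi'(f_\theta) \, d\mu}.
\]
As a consequence, the mapping
\[
X = \sum_i a^i \frac{\partial}{\partial\theta^i} \mapsto \widetilde{X} = \sum_i a^i \frac{\partial f_\theta}{\partial\theta^i}
\]
is an isometry between the tangent space $T_\theta\mathcal{P}$ at $p_\theta$ and
\[
\widetilde{T}_\theta\mathcal{P} = \operatorname{span}\Bigl\{ \frac{\partial f_\theta}{\partial\theta^i} : i = 1, \dots, n \Bigr\},
\]
equipped with the inner product $\langle \widetilde{X}, \widetilde{Y} \rangle_\theta = E''_\theta[\widetilde{X}\widetilde{Y}]$.

Connections

We use the ϕ-divergence $D_\varphi(\cdot \parallel \cdot)$ to define a pair of mutually dual connections $D^{(1)}$ and $D^{(-1)}$, whose Christoffel symbols are given by
\[
\Gamma^{(1)}_{ijk} = -\Bigl[ \Bigl(\frac{\partial^2}{\partial\theta^i \partial\theta^j}\Bigr)_p \Bigl(\frac{\partial}{\partial\theta^k}\Bigr)_q D_\varphi(p \parallel q) \Bigr]_{q=p}
\]
and
\[
\Gamma^{(-1)}_{ijk} = -\Bigl[ \Bigl(\frac{\partial}{\partial\theta^k}\Bigr)_p \Bigl(\frac{\partial^2}{\partial\theta^i \partial\theta^j}\Bigr)_q D_\varphi(p \parallel q) \Bigr]_{q=p}.
\]
The connections $D^{(1)}$ and $D^{(-1)}$ correspond to the exponential and mixture connections.

Expressions for the Christoffel symbols $\Gamma^{(1)}_{ijk}$ and $\Gamma^{(-1)}_{ijk}$ are given by
\[
\Gamma^{(1)}_{ijk} = E''_\theta\Bigl[ \frac{\partial^2 f_\theta}{\partial\theta^i \partial\theta^j} \frac{\partial f_\theta}{\partial\theta^k} \Bigr]
- E'_\theta\Bigl[ \frac{\partial^2 f_\theta}{\partial\theta^i \partial\theta^j} \Bigr] E''_\theta\Bigl[ u_0 \frac{\partial f_\theta}{\partial\theta^k} \Bigr]
\]
and
\[
\Gamma^{(-1)}_{ijk} = E''_\theta\Bigl[ \frac{\partial^2 f_\theta}{\partial\theta^i \partial\theta^j} \frac{\partial f_\theta}{\partial\theta^k} \Bigr]
+ E'''_\theta\Bigl[ \frac{\partial f_\theta}{\partial\theta^i} \frac{\partial f_\theta}{\partial\theta^j} \frac{\partial f_\theta}{\partial\theta^k} \Bigr]
- E''_\theta\Bigl[ \frac{\partial f_\theta}{\partial\theta^j} \frac{\partial f_\theta}{\partial\theta^k} \Bigr] E''_\theta\Bigl[ u_0 \frac{\partial f_\theta}{\partial\theta^i} \Bigr]
- E''_\theta\Bigl[ \frac{\partial f_\theta}{\partial\theta^i} \frac{\partial f_\theta}{\partial\theta^k} \Bigr] E''_\theta\Bigl[ u_0 \frac{\partial f_\theta}{\partial\theta^j} \Bigr],
\]
where
\[
E'''_\theta[\cdot] = \frac{\int_T (\cdot)\, \varphi'''(f_\theta) \, d\mu}{\int_T u_0 \varphi'(f_\theta) \, d\mu}.
\]
The terms containing a factor $E''_\theta[u_0 \, \partial f_\theta / \partial\theta^i]$ (shown in red on the slides) vanish if $\varphi(\cdot) = \exp(\cdot)$ and $u_0 = 1$.

Using the pair of mutually dual connections $D^{(1)}$ and $D^{(-1)}$, we can specify a family of α-connections $D^{(\alpha)}$ in generalized statistical manifolds, whose Christoffel symbols are
\[
\Gamma^{(\alpha)}_{ijk} = \frac{1 + \alpha}{2} \Gamma^{(1)}_{ijk} + \frac{1 - \alpha}{2} \Gamma^{(-1)}_{ijk}.
\]
The connections $D^{(\alpha)}$ and $D^{(-\alpha)}$ are mutually dual. For $\alpha = 0$, the connection $D^{(0)}$, which is clearly self-dual, corresponds to the Levi–Civita connection.

ϕ-Families

A parametric ϕ-family $\mathcal{F}_p = \{p_\theta : \theta \in \Theta\}$ centered at $p = \varphi(c)$ is defined by
\[
p_\theta(t) := \varphi\Bigl( c(t) + \sum_{i=1}^n \theta^i u_i(t) - \psi(\theta) u_0(t) \Bigr),
\]
where $\psi\colon \Theta \to [0, \infty)$ is a normalizing function. The functions $u_i$ satisfy some conditions, which imply $\psi \geq 0$. The domain $\Theta$ can be chosen to be maximal. If $\varphi(\cdot) = \exp(\cdot)$ and $u_0 = 1$, then $\mathcal{F}_p$ corresponds to an exponential family.
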
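As a small numerical check of the ϕ-divergence defined above, the following Python sketch evaluates $D_\varphi(p \parallel q)$ on a three-point sample space with the counting measure and $u_0 = 1$. The finite setting, the sample distributions, and all function names are assumptions made here for illustration; they do not come from the slides. The sketch uses the identity $1/(\varphi^{-1})'(p) = \varphi'(\varphi^{-1}(p))$, compares the case ϕ = exp with the Kullback–Leibler divergence, and then evaluates the κ-exponential case with κ = 0.5.

```python
import math

def kappa_log(x, kappa):
    """Inverse of the kappa-exponential (kappa != 0) on (0, inf)."""
    return (x ** kappa - x ** (-kappa)) / (2.0 * kappa)

def kappa_exp_prime(u, kappa):
    """Derivative of the kappa-exponential (kappa != 0) with respect to u."""
    root = math.sqrt(1.0 + (kappa * u) ** 2)
    return (kappa * u + root) ** (1.0 / kappa) / root

def phi_divergence(p, q, phi_inv, phi_prime, u0):
    """phi-divergence on a finite set with counting measure (assumption)."""
    fp = [phi_inv(x) for x in p]
    fq = [phi_inv(x) for x in q]
    # phi'(phi^{-1}(p)) equals 1 / (phi^{-1})'(p)
    w = [phi_prime(f) for f in fp]
    num = sum((a - b) * wi for a, b, wi in zip(fp, fq, w))
    den = sum(u * wi for u, wi in zip(u0, w))
    return num / den

def kl_divergence(p, q):
    """Kullback-Leibler divergence on a finite set."""
    return sum(a * math.log(a / b) for a, b in zip(p, q))

# Illustrative distributions (hypothetical values).
p = [0.2, 0.3, 0.5]
q = [0.4, 0.4, 0.2]
u0 = [1.0, 1.0, 1.0]

d_exp = phi_divergence(p, q, math.log, math.exp, u0)
d_kappa = phi_divergence(
    p, q,
    lambda x: kappa_log(x, 0.5),
    lambda u: kappa_exp_prime(u, 0.5),
    u0,
)
print(d_exp, kl_divergence(p, q))  # the two values should agree
print(d_kappa)                     # nonnegative by convexity of phi
```

Nonnegativity follows from convexity of ϕ: the tangent-line inequality gives $(\varphi^{-1}(p) - \varphi^{-1}(q))\,\varphi'(\varphi^{-1}(p)) \geq p - q$ pointwise, and the right-hand side integrates to zero.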
The normalizing function and the ϕ-divergence are related by
\[
\psi(\theta) = D_\varphi(p \parallel p_\theta).
\]
The matrix $(g_{ij})$ is the Hessian of the normalizing function $\psi$:
\[
g_{ij} = \frac{\partial^2 \psi}{\partial\theta^i \partial\theta^j}.
\]
As a result,
\[
\Gamma^{(0)}_{ijk} = \frac{1}{2} \frac{\partial g_{ij}}{\partial\theta^k} = \frac{1}{2} \frac{\partial^3 \psi}{\partial\theta^i \partial\theta^j \partial\theta^k}.
\]

In ϕ-families, the Christoffel symbols $\Gamma^{(1)}_{ijk}$ vanish identically, i.e., $(\theta^i)$ is an affine coordinate system, and the connection $D^{(1)}$ is flat (and $D^{(-1)}$ is also flat). Thus $\mathcal{F}_p$ admits a coordinate system $(\eta_j)$ that is dual to $(\theta^i)$, and there exist potential functions $\psi$ and $\psi^*$ such that
\[
\theta^i = \frac{\partial \psi^*}{\partial \eta_i}, \qquad \eta_j = \frac{\partial \psi}{\partial \theta^j}, \qquad \psi(p) + \psi^*(p) = \sum_i \theta^i(p) \eta_i(p).
\]

Discussion

Advantages of $(g_{ij})$, $\Gamma^{(1)}_{ijk}$ and $\Gamma^{(-1)}_{ijk}$ being derived from $D_\varphi(\cdot \parallel \cdot)$:

Duality.
Pythagorean relation.
Projection theorem.

Open questions:

An example of a generalized statistical manifold whose coordinate system is $D^{(-1)}$-flat.
Parallel transport with respect to $D^{(-1)}$.
Divergence or ϕ-function associated with α-connections.
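The relations stated for ϕ-families, namely $\psi(\theta) = D_\varphi(p \parallel p_\theta)$ and $g_{ij} = \partial^2\psi / \partial\theta^i \partial\theta^j$, can be checked numerically in a toy setting. The sketch below is an illustration under stated assumptions, not the authors' construction: it takes a three-point sample space with the counting measure, the κ-exponential with κ = 0.5 as ϕ, $u_0 = 1$, and a single direction $u_1$ centred so that $\sum_t u_1(t)\,\varphi'(c(t)) = 0$. The slides only say that the functions satisfy "some conditions"; this centring is one condition under which the first relation can be verified directly. All names are hypothetical.

```python
import math

KAPPA = 0.5  # illustrative value of kappa (assumption)

def phi(u):
    """kappa-exponential, used here as the phi-function."""
    return (KAPPA * u + math.sqrt(1.0 + (KAPPA * u) ** 2)) ** (1.0 / KAPPA)

def phi_inv(x):
    """kappa-logarithm, the inverse of phi on (0, inf)."""
    return (x ** KAPPA - x ** (-KAPPA)) / (2.0 * KAPPA)

def phi_d1(u):
    """First derivative of phi."""
    return phi(u) / math.sqrt(1.0 + (KAPPA * u) ** 2)

def phi_d2(u):
    """Second derivative of phi."""
    s = math.sqrt(1.0 + (KAPPA * u) ** 2)
    return phi(u) * (s - KAPPA ** 2 * u) / s ** 3

# Finite sample space with counting measure (assumption).
P = [0.2, 0.3, 0.5]              # centre p = phi(c)
C = [phi_inv(x) for x in P]
U0 = [1.0, 1.0, 1.0]

def centred(v):
    """Shift v along u0 so that sum_t u(t) * phi'(c(t)) = 0."""
    w = [phi_d1(ci) for ci in C]
    shift = sum(vi * wi for vi, wi in zip(v, w)) / sum(ui * wi for ui, wi in zip(U0, w))
    return [vi - shift * ui for vi, ui in zip(v, U0)]

U1 = centred([1.0, 0.0, -1.0])

def f_theta(theta, s):
    """c + theta * u1 - s * u0, evaluated pointwise."""
    return [ci + theta * ui - s * u0i for ci, ui, u0i in zip(C, U1, U0)]

def psi(theta):
    """Normalizing function: solve sum_t phi(f_theta(t)) = 1 by bisection."""
    lo, hi = -50.0, 50.0         # the total mass is decreasing in s
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(phi(f) for f in f_theta(theta, mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def divergence_from_centre(theta):
    """D_phi(p || p_theta), computed from the definition of the divergence."""
    p_theta = [phi(f) for f in f_theta(theta, psi(theta))]
    w = [phi_d1(ci) for ci in C]  # equals 1 / (phi^{-1})'(p)
    num = sum((ci - phi_inv(x)) * wi for ci, x, wi in zip(C, p_theta, w))
    return num / sum(ui * wi for ui, wi in zip(U0, w))

def metric(theta):
    """g_11 = E''[(d f_theta / d theta)^2] for the one-parameter family."""
    f = f_theta(theta, psi(theta))
    norm = sum(ui * phi_d1(fi) for ui, fi in zip(U0, f))
    dpsi = sum(ui * phi_d1(fi) for ui, fi in zip(U1, f)) / norm
    return sum((ui - dpsi * u0i) ** 2 * phi_d2(fi)
               for ui, u0i, fi in zip(U1, U0, f)) / norm

theta, h = 0.3, 1e-3
# Central finite-difference approximation of the second derivative of psi.
hessian_fd = (psi(theta + h) - 2.0 * psi(theta) + psi(theta - h)) / h ** 2
print(psi(theta), divergence_from_centre(theta))  # should agree
print(metric(theta), hessian_fd)                  # should agree approximately
```

Bisection is used for the normalizing function because, unlike in the exponential case, ψ has no closed form for a general ϕ; any one-dimensional root finder would serve equally well.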