Path connectedness on a space of probability density functions

28/10/2015
Publication GSI2015
OAI : oai:www.see.asso.fr:11784:14267

Abstract

We introduce a class of paths, or one-parameter models, connecting any two probability density functions (pdfs). The class is derived by applying the Kolmogorov-Nagumo average to the two pdfs. A variety of such path connections exists on the space of pdfs, since the Kolmogorov-Nagumo average is applicable for any convex and strictly increasing function. We provide information-geometric insight into the probabilistic properties of statistical methods associated with the path connectedness. The one-parameter model is extended to a multidimensional model, on which statistical inference is characterized by sufficient statistics.

Collection

Path connectedness on a space of probability density functions, by Shinto Eguchi and Osamu Komori (application/pdf)

Licence

Creative Commons Attribution-ShareAlike 4.0 International

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/11784/14267</identifier><creators><creator><creatorName>Shinto Eguchi</creatorName></creator><creator><creatorName>Osamu Komori</creatorName></creator></creators><titles>
            <title>Path connectedness on a space of probability density functions</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2015</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Sat 7 Nov 2015</date>
	    <date dateType="Updated">Wed 31 Aug 2016</date>
            <date dateType="Submitted">Thu 12 Jul 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">f3f77e5e3521cb13f0ecf73cba2505963ca4c958</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>24660</version>
        <descriptions>
            <description descriptionType="Abstract">
We introduce a class of paths, or one-parameter models, connecting any two probability density functions (pdfs). The class is derived by applying the Kolmogorov-Nagumo average to the two pdfs. A variety of such path connections exists on the space of pdfs, since the Kolmogorov-Nagumo average is applicable for any convex and strictly increasing function. We provide information-geometric insight into the probabilistic properties of statistical methods associated with the path connectedness. The one-parameter model is extended to a multidimensional model, on which statistical inference is characterized by sufficient statistics.

</description>
        </descriptions>
    </resource>

Path connectedness on a space of probability density functions

Osamu Komori (University of Fukui, Japan) and Shinto Eguchi (The Institute of Statistical Mathematics, Japan)
GSI2015, Ecole Polytechnique, Paris-Saclay (France), October 28, 2015

Contents

1. Kolmogorov-Nagumo (K-N) average
2. Parallel displacement $A_t^{(\phi)}$ characterizing the $\phi$-path
3. $U$-divergence and its associated geodesic

Setting

Terminology:

$\mathcal{X}$: data space
$P$: probability measure on $\mathcal{X}$
$\mathcal{F}_P$: space of probability density functions associated with $P$

We consider a path connecting $f$ and $g$, where $f, g \in \mathcal{F}_P$, and investigate its properties from the viewpoint of information geometry.

Kolmogorov-Nagumo (K-N) average

Let $\phi : (0, \infty) \to \mathbb{R}$ be a monotonically increasing, concave, and continuous function. Then, for $f$ and $g$ in $\mathcal{F}_P$, the Kolmogorov-Nagumo (K-N) average is

$$\phi^{-1}\big( (1 - t)\,\phi(f(x)) + t\,\phi(g(x)) \big) \quad \text{for } 0 \le t \le 1.$$

Remark 1. $\phi^{-1}$ is monotonically increasing, convex, and continuous on its domain (the range of $\phi$).

$\phi$-path

Based on the K-N average, we consider the $\phi$-path connecting $f$ and $g$ in $\mathcal{F}_P$:

$$f_t(x, \phi) = \phi^{-1}\big( (1 - t)\,\phi(f(x)) + t\,\phi(g(x)) - \kappa_t \big),$$

where $\kappa_t \le 0$ is a normalizing factor, with equality if $t = 0$ or $t = 1$.

Existence of $\kappa_t$

Theorem 1. There exists a unique $\kappa_t$ such that

$$\int_{\mathcal{X}} \phi^{-1}\big( (1 - t)\,\phi(f(x)) + t\,\phi(g(x)) - \kappa_t \big)\, dP(x) = 1.$$

Proof. From the convexity of $\phi^{-1}$, we have

$$0 \le \int \phi^{-1}\big( (1 - t)\,\phi(f(x)) + t\,\phi(g(x)) \big)\, dP(x) \le \int \{(1 - t) f(x) + t g(x)\}\, dP(x) = 1.$$

Moreover, $\lim_{c \to \infty} \phi^{-1}(c) = +\infty$, since $\phi^{-1}$ is monotonically increasing and convex. Hence the continuity of $\phi^{-1}$ yields the existence of $\kappa_t \le 0$ satisfying the equation above, and the strict monotonicity of $\phi^{-1}$ gives uniqueness.

Illustration of $\phi$-path

[Figure: illustration of the $\phi$-path]
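The existence argument of Theorem 1 can be sketched numerically. The following is a minimal illustration, assuming a finite data space with uniform probability weights (a setup chosen here for illustration, not taken from the slides) and the generator $\phi = \log$. The function name `phi_path`, the bisection scheme, and the sample values of `f` and `g` are all hypothetical.

```python
import math


def phi_path(f, g, t, phi, phi_inv, weights, tol=1e-12):
    """Return (f_t, kappa_t) for the phi-path on a finite data space.

    f, g    : density values at each point (sum(w * f) == 1 under `weights`)
    phi     : increasing, concave generator; phi_inv is its inverse
    weights : probabilities P({x}) of each point (assumed discrete setup)
    """
    # Pointwise K-N average in phi-coordinates, before normalization.
    mix = [(1 - t) * phi(a) + t * phi(b) for a, b in zip(f, g)]

    def total(kappa):
        # Integral of phi_inv(mix - kappa) with respect to P; decreasing in kappa.
        return sum(w * phi_inv(m - kappa) for w, m in zip(weights, mix))

    # total(0) <= 1 by convexity of phi_inv, so the root lies in [lo, 0]
    # once lo is small enough that total(lo) >= 1.
    lo, hi = -1.0, 0.0
    while total(lo) < 1.0:
        lo *= 2.0

    # Plain bisection on kappa.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    kappa = 0.5 * (lo + hi)
    return [phi_inv(m - kappa) for m in mix], kappa


# Four-point space with uniform weights; f and g both integrate to 1.
weights = [0.25, 0.25, 0.25, 0.25]
f = [0.4, 0.8, 1.2, 1.6]
g = [1.6, 1.2, 0.8, 0.4]

f_half, kappa_half = phi_path(f, g, 0.5, math.log, math.exp, weights)
mass = sum(w * v for w, v in zip(weights, f_half))
```

Bisection is used only because the total mass is monotone in $\kappa_t$; any one-dimensional root finder would serve equally well under the same assumptions.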
Examples of $\phi$-path

Example 1.

1. $\phi_0(x) = \log(x)$. The $\phi_0$-path is given by

$$f_t(x, \phi_0) = \exp\big( (1 - t)\log f(x) + t\log g(x) - \kappa_t \big),$$

where

$$\kappa_t = \log \int \exp\big( (1 - t)\log f(x) + t\log g(x) \big)\, dP(x).$$

2. $\phi_\eta(x) = \log(x + \eta)$ with $\eta \ge 0$. The $\phi_\eta$-path is given by

$$f_t(x, \phi_\eta) = \exp\big[ (1 - t)\log\{f(x) + \eta\} + t\log\{g(x) + \eta\} - \kappa_t \big],$$

where

$$\kappa_t = \log\Big[ \int \exp\big\{ (1 - t)\log\{f(x) + \eta\} + t\log\{g(x) + \eta\} \big\}\, dP(x) - \eta \Big].$$

3. $\phi_\beta(x) = (x^\beta - 1)/\beta$ with $\beta \le 1$. The $\phi_\beta$-path is given by

$$f_t(x, \phi_\beta) = \big\{ (1 - t) f(x)^\beta + t\, g(x)^\beta - \kappa_t \big\}^{1/\beta},$$

where $\kappa_t$ does not have an explicit form.

Extended expectation

For a function $a(x) : \mathcal{X} \to \mathbb{R}$, we consider the extended expectation

$$E_f^{(\phi)}\{a(X)\} = \frac{\int_{\mathcal{X}} \frac{1}{\phi'(f(x))}\, a(x)\, dP(x)}{\int_{\mathcal{X}} \frac{1}{\phi'(f(x))}\, dP(x)},$$

where $\phi : (0, \infty) \to \mathbb{R}$ is a generator function.

Remark 2. If $\phi(t) = \log t$, then $E^{(\phi)}$ reduces to the usual expectation.

Properties of extended expectation

We note that

1. $E_f^{(\phi)}(c) = c$ for any constant $c$.
2. $E_f^{(\phi)}\{c\, a(X)\} = c\, E_f^{(\phi)}\{a(X)\}$ for any constant $c$.
3. $E_f^{(\phi)}\{a(X) + b(X)\} = E_f^{(\phi)}\{a(X)\} + E_f^{(\phi)}\{b(X)\}$.
4. $E_f^{(\phi)}\{a(X)^2\} \ge 0$, with equality if and only if $a(x) = 0$ for $P$-almost every $x$ in $\mathcal{X}$.

Remark 3. If we define

$$f^{(\phi)}(x) = \frac{1/\phi'(f(x))}{\int_{\mathcal{X}} 1/\phi'(f(x))\, dP(x)},$$

then $E_f^{(\phi)}\{a(X)\} = E_{f^{(\phi)}}\{a(X)\}$.

Tangent space of $\mathcal{F}_P$

Let $H_f$ be a Hilbert space with the inner product defined by $\langle a, b \rangle_f = E_f^{(\phi)}\{a(X)\, b(X)\}$, and define the tangent space associated with the extended expectation by

$$T_f = \{ a \in H_f : \langle a, 1 \rangle_f = 0 \}.$$

For a statistical model $M = \{f_\theta(x)\}_{\theta \in \Theta}$ we have

$$E_{f_\theta}^{(\phi)}\{\partial_i \phi(f_\theta(X))\} = 0$$

for all $\theta \in \Theta$, where $\partial_i = \partial/\partial\theta_i$ with $\theta = (\theta_i)_{i = 1, \dots, p}$.
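The extended expectation and Remark 2 can be checked on a small example. This is a minimal sketch, again assuming a finite data space with uniform probability weights; the function name `extended_expectation`, the test function `a`, and the choice $\beta = 0.5$ are illustrative assumptions rather than anything specified in the slides.

```python
def extended_expectation(a, f, dphi, weights):
    """Extended expectation of a(X) under f on a finite data space.

    dphi is the derivative phi' of the generator function.
    """
    # Unnormalized weights 1/phi'(f(x)) integrated against P.
    escort = [w / dphi(fx) for w, fx in zip(weights, f)]
    norm = sum(escort)
    return sum(e * ax for e, ax in zip(escort, a)) / norm


weights = [0.25, 0.25, 0.25, 0.25]
f = [0.4, 0.8, 1.2, 1.6]      # density with respect to P: sum(w * f) == 1
a = [1.0, -2.0, 0.5, 3.0]     # arbitrary test function

# phi(t) = log t has phi'(t) = 1/t, so the weights 1/phi'(f) equal f itself.
e_log = extended_expectation(a, f, lambda x: 1.0 / x, weights)
usual = sum(w * fx * ax for w, fx, ax in zip(weights, f, a))

# phi_beta(t) = (t**beta - 1) / beta has phi'(t) = t**(beta - 1).
beta = 0.5
e_beta = extended_expectation(a, f, lambda x: x ** (beta - 1.0), weights)
e_const = extended_expectation([7.0] * 4, f, lambda x: x ** (beta - 1.0), weights)
```

Here `e_log` and `usual` should agree up to rounding, while `e_beta` is the ordinary expectation under the reweighted density of Remark 3, which for this generator is proportional to $f^{1-\beta}$.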
For the same statistical model $M = \{f_\theta(x)\}_{\theta \in \Theta}$, the second derivatives satisfy

$$E_{f_\theta}^{(\phi)}\{\partial_i \partial_j \phi(f_\theta(X))\} = E_{f_\theta}^{(\phi)}\left\{ \frac{\phi''(f_\theta(X))}{\phi'(f_\theta(X))^2}\, \partial_i \phi(f_\theta(X))\, \partial_j \phi(f_\theta(X)) \right\}.$$

Parallel displacement $A_t^{(\phi)}$

Define $A_t^{(\phi)}(x)$ in $T_{f_t}$ as the solution of the differential equation

$$\dot{A}_t^{(\phi)}(x) - E_{f_t}^{(\phi)}\left\{ A_t^{(\phi)}\, \dot{f}_t\, \frac{\phi''(f_t)}{\phi'(f_t)} \right\} = 0,$$

where $f_t$ is a path connecting $f$ and $g$ such that $f_0 = f$ and $f_1 = g$, and $\dot{A}_t^{(\phi)}(x)$ is the derivative of $A_t^{(\phi)}(x)$ with respect to $t$.

Theorem 2. The geodesic curve $\{f_t\}_{0 \le t \le 1}$ defined by the parallel displacement $A_t^{(\phi)}$ is the $\phi$-path.

$U$-divergence

Assume that $U(s)$ is a convex and increasing function of a scalar $s$, and let $\xi(t) = \operatorname{argmax}_s \{ st - U(s) \}$. Then the $U$-divergence is

$$D_U(f, g) = \int \{ U(\xi(g)) - f\, \xi(g) \}\, dP - \int \{ U(\xi(f)) - f\, \xi(f) \}\, dP.$$

In other words, the $U$-divergence is the difference between the cross entropy $C_U(f, g)$ and the diagonal entropy $C_U(f, f)$, where

$$C_U(f, g) = \int \{ U(\xi(g)) - f\, \xi(g) \}\, dP.$$

Connections based on $U$-divergence

For a finite-dimensional manifold $M = \{ f_\theta(x) : \theta \in \Theta \}$ and vector fields $X$, $Y$, and $Z$ on $M$, the Riemannian metric is

$$G^{(U)}(X, Y)(f) = \int Xf\; Y\xi(f)\, dP$$

for $f \in M$, and the linear connections $\nabla^{(U)}$ and $\nabla^{*(U)}$ are given by

$$G^{(U)}(\nabla_X^{(U)} Y, Z)(f) = \int XYf\; Z\xi(f)\, dP$$

and

$$G^{(U)}(\nabla_X^{*(U)} Y, Z)(f) = \int Zf\; XY\xi(f)\, dP.$$

See Eguchi (1992) for details.

Equivalence between $\nabla^*$-geodesic and $\xi$-path

Let $\nabla^{(U)}$ and $\nabla^{*(U)}$ be the linear connections associated with the $U$-divergence $D_U$, and let $C^{(\phi)} = \{ f_t(x, \phi) : 0 \le t \le 1 \}$ be the $\phi$-path connecting $f$ and $g$ in $\mathcal{F}_P$.
Then we have the following.

Theorem 3. A $\nabla^{(U)}$-geodesic curve connecting $f$ and $g$ is equal to $C^{(\mathrm{id})}$, where $\mathrm{id}$ denotes the identity function, while a $\nabla^{*(U)}$-geodesic curve connecting $f$ and $g$ is equal to $C^{(\xi)}$, where $\xi(t) = \operatorname{argmax}_s \{ st - U(s) \}$.

Summary

1. We consider the $\phi$-path based on the Kolmogorov-Nagumo average.
2. The relation between the $U$-divergence and the $\phi$-path was investigated ($\phi$ corresponds to $\xi$).
3. The idea of the $\phi$-path can be applied to probability density estimation as well as to classification problems.
4. A divergence associated with the $\phi$-path can be considered; a special case would be the Bhattacharyya divergence.
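As a concrete check of the $U$-divergence definition from the slides above, one can take $U(s) = \exp(s)$, for which $\xi(t) = \operatorname{argmax}_s \{ st - e^s \} = \log t$ and $D_U$ reduces to the Kullback-Leibler divergence when $f$ and $g$ are both normalized. The sketch below assumes a finite data space with uniform probability weights; the choice of $U$, the function names, and the sample densities are illustrative assumptions, not taken from the slides.

```python
import math


def u_divergence(f, g, U, xi, weights):
    """U-divergence D_U(f, g) = C_U(f, g) - C_U(f, f) on a finite data space."""

    def cross_entropy(p, q):
        # C_U(p, q) = integral of U(xi(q)) - p * xi(q) with respect to P.
        return sum(w * (U(xi(qx)) - px * xi(qx))
                   for w, px, qx in zip(weights, p, q))

    return cross_entropy(f, g) - cross_entropy(f, f)


def kl_divergence(f, g, weights):
    """Kullback-Leibler divergence between densities f and g w.r.t. P."""
    return sum(w * fx * math.log(fx / gx) for w, fx, gx in zip(weights, f, g))


weights = [0.25, 0.25, 0.25, 0.25]
f = [0.4, 0.8, 1.2, 1.6]
g = [1.6, 1.2, 0.8, 0.4]

# U = exp gives xi = log, so D_U should coincide with the KL divergence here.
d_u = u_divergence(f, g, math.exp, math.log, weights)
d_kl = kl_divergence(f, g, weights)
```

With this choice of $U$, the $\nabla^{*(U)}$-geodesic of Theorem 3 is the $\phi$-path with $\phi = \log$, that is, the $\phi_0$-path of Example 1.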