On the existence of paths connecting probability distributions

07/11/2017
Publication GSI2017
OAI : oai:www.see.asso.fr:17410:22602

Abstract

We introduce a class of paths defined in terms of two deformed exponential functions. Exponential paths correspond to a special case of this class of paths. Then we give necessary and sufficient conditions for any two probability distributions to be path connected.

Authors: Rui F. Vigelis, Luiza Helena Félix de Andrade, Charles Casimiro Cavalcante

License

Creative Commons: None (All rights reserved)

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/17410/22602</identifier><creators><creator><creatorName>Rui F. Vigelis</creatorName></creator><creator><creatorName>Charles Casimiro Cavalcante</creatorName></creator><creator><creatorName>Luiza Helena Félix de Andrade</creatorName></creator></creators><titles>
            <title>On the existence of paths connecting probability distributions</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2018</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Thu 8 Mar 2018</date>
	    <date dateType="Updated">Thu 8 Mar 2018</date>
            <date dateType="Submitted">Fri 20 Apr 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">f71acac458031b71ce35b40a72c4fde7791e837a</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>37328</version>
        <descriptions>
            <description descriptionType="Abstract">We introduce a class of paths defined in terms of two deformed exponential functions. Exponential paths correspond to a special case of this class of paths. Then we give necessary and sufficient conditions for any two probability distributions being path connected.
</description>
        </descriptions>
    </resource>

On the existence of paths connecting probability distributions

Rui F. Vigelis (1), Luiza H. F. de Andrade (2), and Charles C. Cavalcante (3)

(1) Computer Engineering, Campus Sobral, Federal University of Ceará, Sobral-CE, Brazil, rfvigelis@ufc.br.
(2) Center of Exact and Natural Sciences, Federal Rural University of the Semi-arid Region, Mossoró-RN, Brazil, luizafelix@ufersa.edu.br.
(3) Wireless Telecommunication Research Group, Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza-CE, Brazil, charles@ufc.br.

Abstract. We introduce a class of paths defined in terms of two deformed exponential functions. Exponential paths correspond to a special case of this class of paths. Then we give necessary and sufficient conditions for any two probability distributions to be path connected.

1 Introduction

In non-parametric Information Geometry, many geometric structures can be defined in terms of paths connecting probability distributions. It is shown in [5,1,6] that two probability distributions are connected by an open exponential path (or exponential arc) if and only if they belong to the same exponential family. Exponential paths are the auto-parallel curves w.r.t. the exponential connection. Using deformed exponential functions, we can define an analogue of exponential paths. A deformed exponential $\varphi\colon \mathbb{R} \to [0,\infty)$ is a convex function such that $\lim_{u\to-\infty}\varphi(u) = 0$ and $\lim_{u\to\infty}\varphi(u) = \infty$. Eguchi and Komori [3] introduced and investigated a class of paths defined in terms of deformed exponential functions. In the present paper we extend the definition of paths given in [3], and then we show equivalent conditions for any two probability distributions to be path connected.

Throughout the text, $(T, \Sigma, \mu)$ denotes the $\sigma$-finite measure space on which probability distributions (or probability density functions) are defined. All probability distributions are assumed to have positive density w.r.t. the underlying measure $\mu$.
In other words, they belong to the collection
\[ \mathcal{P}_\mu = \Bigl\{ p \in L^0 : \int_T p\,d\mu = 1 \text{ and } p > 0 \Bigr\}, \]
where $L^0$ is the space of all real-valued, measurable functions on $T$, with equality $\mu$-a.e.

Let us fix a positive, measurable function $u_0\colon T \to (0,\infty)$. Given two probability distributions $p$ and $q$ in $\mathcal{P}_\mu$, a $\varphi_1/\varphi_2$-path (or $\varphi_1/\varphi_2$-arc) is a curve in $\mathcal{P}_\mu$ defined by
\[ \alpha \mapsto \varphi_1(\alpha\varphi_2^{-1}(p) + (1-\alpha)\varphi_2^{-1}(q) + \kappa(\alpha)u_0). \]
The constant $\kappa(\alpha) := \kappa(\alpha; p, q) \in \mathbb{R}$ is introduced so that
\[ \int_T \varphi_1(\alpha\varphi_2^{-1}(p) + (1-\alpha)\varphi_2^{-1}(q) + \kappa(\alpha)u_0)\,d\mu = 1. \tag{1} \]
We use the term $\varphi$-path in place of $\varphi/\varphi$-path (i.e., if $\varphi_1 = \varphi_2 = \varphi$). The case where $\varphi_1 = \varphi_2 = \varphi$ and $u_0 = 1$ was analyzed by Eguchi and Komori in [3]. Exponential paths correspond to $\varphi_1(\cdot)$ and $\varphi_2(\cdot)$ equal to $\exp(\cdot)$, and $u_0 = 1$. A $\varphi_1/\varphi_2$-path can be seen as a $\varphi_1$-path connecting $\varphi_1(\varphi_2^{-1}(p) + \kappa(1)u_0)$ and $\varphi_1(\varphi_2^{-1}(q) + \kappa(0)u_0)$. Unless $\varphi_1 = \varphi_2 = \varphi$, a $\varphi_1/\varphi_2$-path does not connect $p$ and $q$.

We can use $\kappa(\alpha)$ to define the divergence
\[ D^{(\alpha)}(p \,\|\, q) = -\frac{1}{\alpha}\kappa(0) - \frac{1}{1-\alpha}\kappa(1) + \frac{1}{\alpha(1-\alpha)}\kappa(\alpha). \]
This divergence for $\varphi_1 = \varphi_2 = \varphi$ is related to a generalization of Rényi divergence, which was introduced by the authors in [2]. If $\varphi_1(\cdot)$ and $\varphi_2(\cdot)$ are equal to $\exp(\cdot)$, and $u_0 = 1$, then $D^{(\alpha)}(\cdot \,\|\, \cdot)$ reduces to Rényi divergence.

The main goal of these notes is to give necessary and sufficient conditions for the existence of $\kappa(\alpha)$ in (1) for every $p, q \in \mathcal{P}_\mu$ and $\alpha \in [0,1]$.

Proposition 1. Assume that the measure $\mu$ is non-atomic. Let $\varphi_1, \varphi_2\colon \mathbb{R} \to (0,\infty)$ be two positive, deformed exponential functions, and let $u_0\colon T \to (0,\infty)$ be a positive, measurable function. Fix any $\alpha \in (0,1)$. For every pair of probability distributions $p$ and $q$ in $\mathcal{P}_\mu$, there exists a constant $\kappa(\alpha) := \kappa(\alpha; p, q)$ satisfying (1) if, and only if,
\[ \int_T \varphi_1(c + \lambda u_0)\,d\mu < \infty, \quad \text{for all } \lambda \ge 0, \tag{2} \]
for each measurable function $c\colon T \to \mathbb{R}$ satisfying $\int_T \varphi_2(c)\,d\mu < \infty$.

A proof of this proposition is given in the next section.
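The normalizing constant κ(α) in (1) and the divergence D^(α) can be explored numerically. The sketch below is only an illustration under stated assumptions: it replaces (T, Σ, µ) by a finite set with weights `mu` (so the non-atomic hypothesis of Proposition 1 does not apply and existence of κ(α) is automatic), and it finds κ(α) by bisection, using that the left-hand side of (1) is non-decreasing in κ. The function names `solve_kappa`, `path_point` and `divergence`, the sample data, and the choice of the deformed exponential ϕ(u) = u + sqrt(1 + u²) (with inverse (y − 1/y)/2) are assumptions made for the example and are not taken from the paper.

```python
import math

def solve_kappa(phi1, phi2_inv, p, q, u0, mu, alpha, tol=1e-12):
    """Bisection for kappa with sum_i mu_i * phi1(c_i + kappa * u0_i) = 1 (finite-space analogue of (1))."""
    c = [alpha * phi2_inv(pi) + (1 - alpha) * phi2_inv(qi) for pi, qi in zip(p, q)]

    def mass(kappa):
        return sum(m * phi1(ci + kappa * ui) for m, ci, ui in zip(mu, c, u0))

    lo, hi = -1.0, 1.0
    while mass(lo) > 1.0:   # widen the bracket downwards if needed
        lo *= 2.0
    while mass(hi) < 1.0:   # widen the bracket upwards if needed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mass(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def path_point(phi1, phi2_inv, p, q, u0, mu, alpha):
    """Return kappa(alpha) and the density of the phi1/phi2-path at alpha."""
    kappa = solve_kappa(phi1, phi2_inv, p, q, u0, mu, alpha)
    dens = [phi1(alpha * phi2_inv(pi) + (1 - alpha) * phi2_inv(qi) + kappa * ui)
            for pi, qi, ui in zip(p, q, u0)]
    return kappa, dens

def divergence(phi1, phi2_inv, p, q, u0, mu, alpha):
    """D^(alpha)(p || q) built from kappa(0), kappa(1) and kappa(alpha)."""
    k = lambda a: solve_kappa(phi1, phi2_inv, p, q, u0, mu, a)
    return -k(0.0) / alpha - k(1.0) / (1 - alpha) + k(alpha) / (alpha * (1 - alpha))

# Illustrative deformed exponential (an assumption, not from the paper).
def phi_def(u):
    # u + sqrt(1 + u^2), written to avoid cancellation for negative u
    s = math.sqrt(1.0 + u * u)
    return u + s if u >= 0 else 1.0 / (s - u)

def phi_def_inv(y):
    return 0.5 * (y - 1.0 / y)

# Hypothetical data on a three-point space with counting measure.
mu = [1.0, 1.0, 1.0]
p = [0.2, 0.3, 0.5]
q = [0.5, 0.25, 0.25]
alpha = 0.3

# Exponential path: phi1 = phi2 = exp, u0 = 1.
k_exp, dens_exp = path_point(math.exp, math.log, p, q, [1.0] * 3, mu, alpha)
print("exponential path: kappa =", k_exp, "total mass =", sum(dens_exp))

# Deformed path with a non-constant u0.
k_def, dens_def = path_point(phi_def, phi_def_inv, p, q, [1.0, 2.0, 0.5], mu, alpha)
print("deformed path:    kappa =", k_def, "total mass =", sum(dens_def))
```

For the exponential case the bisection can be checked against the closed form κ(α) = −log Σ p^α q^(1−α), and κ(0) = κ(1) = 0, so D^(α) reduces to κ(α)/(α(1−α)). On spaces where condition (2) fails, no bracket for κ need exist, which is the situation Proposition 1 characterizes.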
Using some results involved in the proof of Proposition 1, we give an equivalent criterion for the existence of $u_0$ satisfying condition (2) for $\varphi_1 = \varphi_2 = \varphi$. As a consequence, there may exist functions $\varphi_1 = \varphi_2 = \varphi$ for which we cannot find $u_0$ satisfying (2), a result which was shown in [2] (Example 2).

2 Results

We begin by showing an equivalent criterion for condition (2).

Proposition 2. Two deformed exponential functions $\varphi_1, \varphi_2\colon \mathbb{R} \to [0,\infty)$ and a measurable function $u_0\colon T \to (0,\infty)$ satisfy condition (2) if, and only if, for each $\lambda > 0$, we can find $\alpha \in (0,1)$ and a measurable function $c\colon T \to \mathbb{R} \cup \{-\infty\}$ such that $\int_T \varphi_1(c)\,d\mu < \infty$ and
\[ \alpha\varphi_1(u) \le \varphi_2(u - \lambda u_0(t)), \quad \text{for all } u \ge c(t), \tag{3} \]
for $\mu$-a.e. $t \in T$.

The proof of Proposition 2 requires a preliminary result.

Lemma 1. Suppose that, for each $\lambda > 0$, we cannot find $\alpha \in (0,1)$ and a measurable function $c\colon T \to \mathbb{R} \cup \{-\infty\}$ such that $\int_T \varphi_1(c)\,d\mu < \infty$ and
\[ \alpha\varphi_1(u) \le \varphi_2(u - \lambda u_0(t)), \quad \text{for all } u \ge c(t). \tag{4} \]
Then there exist sequences $\{\lambda_n\}$, $\{c_n\}$ and $\{A_n\}$ of positive numbers $\lambda_n \downarrow 0$, measurable functions, and pairwise disjoint, measurable sets, respectively, such that
\[ \int_{A_n} \varphi_1(c_n)\,d\mu = 1 \quad \text{and} \quad \int_{A_n} \varphi_2(c_n - \lambda_n u_0)\,d\mu \le 2^{-n}, \quad \text{for all } n \ge 1. \tag{5} \]

Proof. Let $\{\lambda'_m\}$ be a sequence of positive numbers $\lambda'_m \downarrow 0$. For each $m \ge 1$, we define the function
\[ f_m(t) = \sup\{u \in \mathbb{R} : 2^{-m}\varphi_1(u) > \varphi_2(u - \lambda'_m u_0(t))\}, \]
where we use the convention $\sup\emptyset = -\infty$. We will verify that $f_m$ is measurable. For each rational number $r$, define the measurable sets
\[ E_{m,r} = \{t \in T : 2^{-m}\varphi_1(r) > \varphi_2(r - \lambda'_m u_0(t))\} \]
and the simple functions $u_{m,r} = r\chi_{E_{m,r}}$. Let $\{r_i\}$ be an enumeration of the rational numbers. For each $m, k \ge 1$, consider the non-negative, simple functions $v_{m,k} = \max_{1 \le i \le k} u_{m,r_i}$. Moreover, denote $B_{m,k} = \bigcup_{i=1}^{k} E_{m,r_i}$. By the continuity of $\varphi_1(\cdot)$ and $\varphi_2(\cdot)$, it follows that $v_{m,k}\chi_{B_{m,k}} \uparrow f_m$ as $k \to \infty$, which shows that $f_m$ is measurable. Since (4) is not satisfied, we have that $\int_T \varphi_1(f_m)\,d\mu = \infty$ for all $m \ge 1$.
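By virtue of the Monotone Convergence Theorem, for each $m \ge 1$, we can find some $k_m \ge 1$ such that the function $v_m = v_{m,k_m}$ and the set $B_m = B_{m,k_m}$ satisfy $\int_{B_m} \varphi_1(v_m)\,d\mu \ge 2^m$. Clearly, we have that $\varphi_1(v_m)\chi_{B_m} < \infty$ and $2^{-m}\varphi_1(v_m)\chi_{B_m} \ge \varphi_2(v_m - \lambda'_m u_0)\chi_{B_m}$. By Lemma 8.3 in [4], there exist an increasing sequence $\{m_n\}$ of indices and a sequence $\{A_n\}$ of pairwise disjoint, measurable sets such that $\int_{A_n} \varphi_1(v_{m_n})\,d\mu = 1$. Clearly, $\int_{A_n} \varphi_2(v_{m_n} - \lambda'_{m_n} u_0)\,d\mu \le 2^{-m_n}$. Denoting $\lambda_n = \lambda'_{m_n}$ and $c_n = v_{m_n}$, we obtain (5).

Proof (Proposition 2). Assume that $\varphi_1(\cdot)$, $\varphi_2(\cdot)$ and $u_0$ satisfy condition (2). Suppose that expression (3) does not hold. Let $\{\lambda_n\}$, $\{c_n\}$ and $\{A_n\}$ be as stated in Lemma 1. Then we define
\[ c = c_0\chi_{T \setminus A} + \sum_{n=1}^{\infty} (c_n - \lambda_n u_0)\chi_{A_n}, \]
where $A = \bigcup_{n=1}^{\infty} A_n$ and $c_0\colon T \to \mathbb{R}$ is any measurable function such that $\int_{T \setminus A} \varphi_2(c_0)\,d\mu < \infty$. In view of (5), we have
\[ \int_T \varphi_2(c)\,d\mu = \int_{T \setminus A} \varphi_2(c_0)\,d\mu + \sum_{n=1}^{\infty} \int_{A_n} \varphi_2(c_n - \lambda_n u_0)\,d\mu \le \int_{T \setminus A} \varphi_2(c_0)\,d\mu + \sum_{n=1}^{\infty} 2^{-n} < \infty. \]
Given any $\lambda > 0$, we take $n_0 \ge 1$ such that $\lambda \ge \lambda_n$ for all $n \ge n_0$. Then we can write
\[ \int_T \varphi_1(c + \lambda u_0)\,d\mu \ge \sum_{n=n_0}^{\infty} \int_{A_n} \varphi_1(c_n + (\lambda - \lambda_n)u_0)\,d\mu \ge \sum_{n=n_0}^{\infty} \int_{A_n} \varphi_1(c_n)\,d\mu = \sum_{n=n_0}^{\infty} 1 = \infty, \tag{6} \]
which contradicts condition (2).

Conversely, suppose that expression (3) holds for a given $\lambda > 0$. Let $\tilde{c}\colon T \to \mathbb{R}$ be any measurable function satisfying $\int_T \varphi_2(\tilde{c})\,d\mu < \infty$. Denote $A = \{t : \tilde{c}(t) + \lambda u_0(t) \ge c(t)\}$. We use inequality (3) to write
\[ \alpha \int_T \varphi_1(\tilde{c} + \lambda u_0)\,d\mu \le \alpha \int_A \varphi_1(\tilde{c} + \lambda u_0)\,d\mu + \alpha \int_{T \setminus A} \varphi_1(c)\,d\mu \le \int_A \varphi_2(\tilde{c})\,d\mu + \int_{T \setminus A} \varphi_2(c - \lambda u_0)\,d\mu < \infty. \]
Thus, condition (2) follows.

Before we give a proof of Proposition 1, we show the following technical result.

Lemma 2. Let $\varphi_1, \varphi_2\colon \mathbb{R} \to (0,\infty)$ be positive, deformed exponential functions, and $\tilde{c}\colon T \to \mathbb{R}$ a measurable function such that $\int_A \varphi_i(\tilde{c})\,d\mu < 1$, for $i = 1, 2$, where $A$ and $B = T \setminus A$ are measurable sets such that $\mu(A) > 0$ and $\mu(B) > 0$. Fix any $\alpha \in (0,1)$.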
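Then we can find measurable functions $b_1, b_2\colon T \to \mathbb{R}$ for which $p = \varphi_2(c_1)$ and $q = \varphi_2(c_2)$ are in $\mathcal{P}_\mu$, where $c_1 = \tilde{c}\chi_A + b_1\chi_B$ and $c_2 = \tilde{c}\chi_A + b_2\chi_B$, and
\[ \int_T \varphi_1(\alpha\varphi_2^{-1}(p) + (1-\alpha)\varphi_2^{-1}(q))\,d\mu < 1. \tag{7} \]
In addition, we assume $b_1\chi_B \ne b_2\chi_B$.

Proof. Let $\{B_n\}$ be a sequence of pairwise disjoint, measurable sets such that $B = \bigcup_{n=1}^{\infty} B_n$ and $0 < \mu(B_n) < \infty$. For each $n \ge 1$, we select disjoint, measurable sets $C_n$ and $D_n$ such that $B_n = C_n \cup D_n$ and $\mu(C_n) = \mu(D_n) = \frac{1}{2}\mu(B_n)$. Let $\{\gamma_n^{(1)}\}$ and $\{\gamma_n^{(2)}\}$ be sequences of positive numbers satisfying
\[ \sum_{n=1}^{\infty} \gamma_n^{(1)} < 1 - \int_A \varphi_1(\tilde{c})\,d\mu \quad \text{and} \quad \sum_{n=1}^{\infty} \gamma_n^{(2)} = 1 - \int_A \varphi_2(\tilde{c})\,d\mu. \]
Then we take $\beta_n \in \mathbb{R}$ and $\theta_n > 0$ such that
\[ \varphi_2(\beta_n) + \varphi_2(-\theta_n) = \frac{2\gamma_n^{(2)}}{\mu(B_n)} \tag{8} \]
and
\[ \varphi_1(\alpha\beta_n - (1-\alpha)\theta_n) + \varphi_1(-\alpha\theta_n + (1-\alpha)\beta_n) \le \frac{2\gamma_n^{(1)}}{\mu(B_n)}. \tag{9} \]
Numbers $\beta_n$ and $\theta_n$ satisfying (8) and (9) exist because $\varphi_1(\cdot)$ and $\varphi_2(\cdot)$ are positive, and $\beta_n < \varphi_2^{-1}(2\gamma_n^{(2)}/\mu(B_n))$. Let us define
\[ b_1 = \sum_{n=1}^{\infty} \bigl(\beta_n\chi_{C_n} - \theta_n\chi_{D_n}\bigr) \quad \text{and} \quad b_2 = \sum_{n=1}^{\infty} \bigl(-\theta_n\chi_{C_n} + \beta_n\chi_{D_n}\bigr). \]
From these choices, it follows that
\[ \int_B \varphi_2(b_1)\,d\mu = \sum_{n=1}^{\infty} \bigl[\varphi_2(\beta_n)\mu(C_n) + \varphi_2(-\theta_n)\mu(D_n)\bigr] = \sum_{n=1}^{\infty} [\varphi_2(\beta_n) + \varphi_2(-\theta_n)]\frac{\mu(B_n)}{2} = \sum_{n=1}^{\infty} \gamma_n^{(2)} = 1 - \int_A \varphi_2(\tilde{c})\,d\mu, \]
which implies that $\int_T \varphi_2(c_1)\,d\mu = 1$, where $c_1 = \tilde{c}\chi_A + b_1\chi_B$. Similarly, we have that $\int_T \varphi_2(c_2)\,d\mu = 1$, where $c_2 = \tilde{c}\chi_A + b_2\chi_B$. On the other hand, we can write
\[ \int_B \varphi_1(\alpha b_1 + (1-\alpha)b_2)\,d\mu = \sum_{n=1}^{\infty} \bigl[\varphi_1(\alpha\beta_n - (1-\alpha)\theta_n)\mu(C_n) + \varphi_1(-\alpha\theta_n + (1-\alpha)\beta_n)\mu(D_n)\bigr] \]
\[ = \sum_{n=1}^{\infty} [\varphi_1(\alpha\beta_n - (1-\alpha)\theta_n) + \varphi_1(-\alpha\theta_n + (1-\alpha)\beta_n)]\frac{\mu(B_n)}{2} \le \sum_{n=1}^{\infty} \gamma_n^{(1)} < 1 - \int_A \varphi_1(\tilde{c})\,d\mu, \]
from which expression (7) follows.

Finally we can present a proof of Proposition 1.

Proof (Proposition 1). Because $\varphi_2(\cdot)$ is convex, it follows that $\int_T \varphi_2(c)\,d\mu < \infty$, where $c = \alpha\varphi_2^{-1}(p) + (1-\alpha)\varphi_2^{-1}(q)$. Condition (2) along with the Monotone Convergence Theorem and the continuity of $\varphi_1(\cdot)$ implies the existence and uniqueness of $\kappa(\alpha)$.

Conversely, assume the existence of $\kappa(\alpha)$ in (1) for every $p, q \in \mathcal{P}_\mu$. We begin by showing that
\[ \int_T \varphi_1(c - \lambda u_0)\,d\mu < \infty, \quad \text{for all } \lambda \ge 0, \tag{10} \]
for every measurable function $c\colon T \to \mathbb{R}$ such that $\int_T \varphi_2(c)\,d\mu < \infty$.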
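If expression (10) does not hold, then for some measurable function $c\colon T \to \mathbb{R}$ with $\int_T \varphi_2(c)\,d\mu < \infty$, and some $\lambda_0 \ge 0$, we have
\[ \begin{cases} \int_T \varphi_1(c - \lambda u_0)\,d\mu < \infty, & \text{for } \lambda_0 \le \lambda, \\ \int_T \varphi_1(c - \lambda u_0)\,d\mu = \infty, & \text{for } 0 \le \lambda < \lambda_0, \end{cases} \tag{11} \]
or
\[ \begin{cases} \int_T \varphi_1(c - \lambda u_0)\,d\mu < \infty, & \text{for } \lambda_0 < \lambda, \\ \int_T \varphi_1(c - \lambda u_0)\,d\mu = \infty, & \text{for } 0 \le \lambda \le \lambda_0. \end{cases} \tag{12} \]
Notice that expression (11) with $\lambda_0 = 0$ corresponds to (10). So in (11) we assume that $\lambda_0 > 0$. Let $\{T_n\}$ be a non-decreasing sequence of measurable sets with $0 < \mu(T_n) < \mu(T)$ and $\mu(T \setminus \bigcup_{n=1}^{\infty} T_n) = 0$. Define $E_n = T_n \cap \{c - \lambda_0 u_0 \le n\}$, for each $n \ge 1$. Clearly, the sequence $\{E_n\}$ is non-decreasing and satisfies $\mu(E_n) < \infty$ and $\mu(T \setminus \bigcup_{n=1}^{\infty} E_n) = 0$.

If expression (11) is satisfied for $\lambda_0 > 0$, we select a sufficiently large $n_0 \ge 1$ such that $\int_{T \setminus E_{n_0}} \varphi_i(c - \lambda_0 u_0)\,d\mu < 1$, for $i = 1, 2$. Denote $A := T \setminus E_{n_0}$ and $B := E_{n_0}$. According to Lemma 2, we can find measurable functions $b_1, b_2\colon T \to \mathbb{R}$ for which $p = \varphi_2(c_1)$ and $q = \varphi_2(c_2)$ are in $\mathcal{P}_\mu$, where $c_1 = (c - \lambda_0 u_0)\chi_A + b_1\chi_B$ and $c_2 = (c - \lambda_0 u_0)\chi_A + b_2\chi_B$, and inequality (7) is satisfied. For any $\lambda > 0$, we can write
\[ \int_T \varphi_1(\alpha\varphi_2^{-1}(p) + (1-\alpha)\varphi_2^{-1}(q) + \lambda u_0)\,d\mu \ge \int_B \varphi_1(c - (\lambda_0 - \lambda)u_0)\,d\mu = \int_T \varphi_1(c - (\lambda_0 - \lambda)u_0)\,d\mu - \int_{A_{n_0}} \varphi_1(c - (\lambda_0 - \lambda)u_0)\,d\mu = \infty. \]
By this expression and inequality (7), we conclude that the constant $\kappa(\alpha)$ as defined by (1) cannot be found.

Now suppose that (12) is satisfied. Let $\{\lambda_n\}$ be a sequence in $(\lambda_0, \infty)$ such that $\lambda_n \downarrow \lambda_0$. We define inductively an increasing sequence $\{k_n\} \subseteq \mathbb{N}$ as follows. Choose $k_0 \ge 1$ such that $\int_{T \setminus E_{k_0}} \varphi_1(c - \lambda_1 u_0)\,d\mu \le 2^{-2}$. Given $k_{n-1}$ we select some $k_n > k_{n-1}$ such that
\[ \int_{E_{k_n} \setminus E_{k_{n-1}}} \varphi_1(c - \lambda_0 u_0)\,d\mu \ge 1 \quad \text{and} \quad \int_{T \setminus E_{k_n}} \varphi_1(c - \lambda_{n+1} u_0)\,d\mu \le 2^{-(n+2)}. \]
Let us denote $A_n = E_{k_n} \setminus E_{k_{n-1}}$ for $n \ge 1$. Notice that the sets $A_n$ are pairwise disjoint. Take $n_0 > 1$ such that $\int_A \varphi_2(c)\,d\mu < 1$, where $A = \bigcup_{n=n_0}^{\infty} A_n$. Now we define $\tilde{c} = \sum_{n=n_0}^{\infty} (c - \lambda_n u_0)\chi_{A_n}$. As a result of these choices, and since $A_n \subseteq T \setminus E_{k_{n-1}}$, it follows that
\[ \int_A \varphi_1(\tilde{c})\,d\mu = \sum_{n=n_0}^{\infty} \int_{A_n} \varphi_1(c - \lambda_n u_0)\,d\mu \le \sum_{n=n_0}^{\infty} 2^{-(n+1)} < 1 \]
and
\[ \int_A \varphi_2(\tilde{c})\,d\mu < \int_A \varphi_2(c)\,d\mu < 1. \]
Denote $B = T \setminus A$.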
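In view of Lemma 2, there exist measurable functions $b_1, b_2\colon T \to \mathbb{R}$ such that $p = \varphi_2(c_1)$ and $q = \varphi_2(c_2)$ are in $\mathcal{P}_\mu$, where $c_1 = \tilde{c}\chi_A + b_1\chi_B$ and $c_2 = \tilde{c}\chi_A + b_2\chi_B$, and inequality (7) is satisfied. Consequently, if the constant $\kappa(\alpha)$ as defined in (1) exists, then $\kappa(\alpha) > 0$. For an arbitrary fixed $\lambda > 0$, we take $n_1 \ge n_0$ such that $\lambda_n - \lambda \le \lambda_0$ for all $n \ge n_1$. Observing that $\int_{A_n} \varphi_1(c - \lambda_0 u_0)\,d\mu \ge 1$, we can write
\[ \int_T \varphi_1(\alpha\varphi_2^{-1}(p) + (1-\alpha)\varphi_2^{-1}(q) + \lambda u_0)\,d\mu \ge \int_A \varphi_1(\tilde{c} + \lambda u_0)\,d\mu \ge \sum_{n=n_1}^{\infty} \int_{A_n} \varphi_1(c - (\lambda_n - \lambda)u_0)\,d\mu \ge \sum_{n=n_1}^{\infty} 1 = \infty, \]
which shows that $\kappa(\alpha)$ cannot be found.

Suppose that condition (2) is not satisfied. By Proposition 2 and Lemma 1, we can find sequences $\{\lambda_n\}$, $\{c_n\}$ and $\{A_n\}$ of positive numbers $\lambda_n \downarrow 0$, measurable functions, and pairwise disjoint, measurable sets, respectively, such that
\[ \int_{A_n} \varphi_1(c_n)\,d\mu = 1 \quad \text{and} \quad \int_{A_n} \varphi_2(c_n - \lambda_n u_0)\,d\mu \le 2^{-n}, \quad \text{for all } n \ge 1. \]
By expression (10), we can conclude that $\sum_{n=1}^{\infty} \int_{A_n} \varphi_1(c_n - \lambda_n u_0)\,d\mu < \infty$. Then we can take some $n_0 > 1$ for which the function $\tilde{c} = \sum_{n=n_0}^{\infty} (c_n - \lambda_n u_0)\chi_{A_n}$ satisfies $\int_A \varphi_i(\tilde{c})\,d\mu < 1$, for $i = 1, 2$, where $A = \bigcup_{n=n_0}^{\infty} A_n$. Let us denote $B = T \setminus A$. By Lemma 2, there exist measurable functions $b_1, b_2\colon T \to \mathbb{R}$ such that $p = \varphi_2(c_1)$ and $q = \varphi_2(c_2)$ belong to $\mathcal{P}_\mu$, where $c_1 = \tilde{c}\chi_A + b_1\chi_B$ and $c_2 = \tilde{c}\chi_A + b_2\chi_B$, and inequality (7) holds. Given any $\lambda > 0$, take $n_1 \ge n_0$ such that $\lambda \ge \lambda_n$ for all $n \ge n_1$. Then we can write
\[ \int_T \varphi_1(\alpha\varphi_2^{-1}(p) + (1-\alpha)\varphi_2^{-1}(q) + \lambda u_0)\,d\mu \ge \int_A \varphi_1(\tilde{c} + \lambda u_0)\,d\mu \ge \sum_{n=n_1}^{\infty} \int_{A_n} \varphi_1(c_n + (\lambda - \lambda_n)u_0)\,d\mu \ge \sum_{n=n_1}^{\infty} \int_{A_n} \varphi_1(c_n)\,d\mu = \infty. \]
This expression and inequality (7) imply that the constant $\kappa(\alpha)$ as defined by (1) cannot be found. Therefore, condition (2) has to be satisfied.

The next result is a consequence of Proposition 2.

Proposition 3. Let $\varphi\colon \mathbb{R} \to [0,\infty)$ be a deformed exponential. Then we can find a measurable function $u_0\colon T \to (0,\infty)$ for which condition (2) holds for $\varphi_1 = \varphi_2 = \varphi$ if, and only if,
\[ \limsup_{u \to \infty} \frac{\varphi(u)}{\varphi(u - \lambda_0)} < \infty, \quad \text{for some } \lambda_0 > 0. \tag{13} \]

Proof. By Proposition 2 we can conclude that the existence of $u_0$ implies (13).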
Conversely, assume that expression (13) holds for some $\lambda_0 > 0$. In this case, there exist $M \in (1,\infty)$ and $c \in \mathbb{R}$ such that $\frac{\varphi(u)}{\varphi(u - \lambda_0)} \le M$ for all $u \ge c$. Let $\{\lambda_n\}$ be any sequence in $(0, \lambda_0]$ such that $\lambda_n \downarrow 0$. For each $n \ge 1$, define
\[ c_n = \sup\{u \in \mathbb{R} : \alpha\varphi(u) > \varphi(u - \lambda_n)\}, \tag{14} \]
where $\alpha = 1/M$ and we adopt the convention $\sup\emptyset = -\infty$. From the choice of $\{\lambda_n\}$ and $\alpha$, it follows that $-\infty \le c_n \le c$. We claim that $\varphi(c_n) \downarrow 0$. If the sequence $\{c_n\}$ converges to some $c > -\infty$, the equality $\alpha\varphi(c_n) = \varphi(c_n - \lambda_n)$ implies $\alpha\varphi(c) = \varphi(c)$ and then $\varphi(c) = 0$. In the case $c_n \downarrow -\infty$, it is clear that $\varphi(c_n) \downarrow 0$. Let $\{T_k\}$ be a sequence of pairwise disjoint, measurable sets with $\mu(T_k) < \infty$ and $\mu(T \setminus \bigcup_{k=1}^{\infty} T_k) = 0$. Thus we can select a subsequence $\{c_{n_k}\}$ such that $\sum_{k=1}^{\infty} \varphi(c_{n_k})\mu(T_k) < \infty$. Let us define $c = \sum_{k=1}^{\infty} c_{n_k}\chi_{T_k}$ and $u_0 = \sum_{k=1}^{\infty} \lambda_{n_k}\chi_{T_k}$. From (14) it follows that $\alpha\varphi(u) \le \varphi(u - u_0(t))$, for all $u \ge c(t)$. Proposition 2 implies that $\varphi(\cdot)$ and $u_0$ satisfy condition (2).

References

1. Alberto Cena and Giovanni Pistone. Exponential statistical manifold. Ann. Inst. Statist. Math., 59(1):27–56, 2007.
2. David C. de Souza, Rui F. Vigelis, and Charles C. Cavalcante. Geometry induced by a generalization of Rényi divergence. Entropy, 18(11):Paper No. 407, 16, 2016.
3. Shinto Eguchi and Osamu Komori. Path connectedness on a space of probability density functions. In Geometric science of information, volume 9389 of Lecture Notes in Comput. Sci., pages 615–624. Springer, Cham, 2015.
4. Julian Musielak. Orlicz spaces and modular spaces, volume 1034 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1983.
5. Giovanni Pistone and Maria Piera Rogantin. The exponential statistical manifold: mean parameters, orthogonality and space transformations. Bernoulli, 5(4):721–760, 1999.
6. Marina Santacroce, Paola Siri, and Barbara Trivellato. New results on mixture and exponential models by Orlicz spaces. Bernoulli, 22(3):1431–1447, 2016.