On Mixture and Exponential Connection by Open Arcs

Publication GSI2017
OAI : oai:www.see.asso.fr:17410:22637


Results on mixture and exponential connections by open arcs are revised and used to prove additional duality properties of statistical models.


Authors: Marina Santacroce, Paola Siri, Barbara Trivellato

Creative Commons: None (All rights reserved)







<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/17410/22637</identifier><creators><creator><creatorName>Marina Santacroce</creatorName></creator><creator><creatorName>Paola Siri</creatorName></creator><creator><creatorName>Barbara Trivellato</creatorName></creator></creators><titles>
            <title>On Mixture and Exponential Connection by Open Arcs</title></titles>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><subjects><subject>exponential models</subject><subject>mixture models</subject><subject>Orlicz spaces</subject><subject>Kullback-Leibler divergence</subject><subject>dual systems</subject></subjects><dates>
	    <date dateType="Created">Fri 9 Mar 2018</date>
	    <date dateType="Updated">Fri 9 Mar 2018</date>
            <date dateType="Submitted">Mon 15 Oct 2018</date>
	    <alternateIdentifier alternateIdentifierType="bitstream">c811a2ec0ba3afce4dad54dc85f76aad424b31a3</alternateIdentifier>
            <description descriptionType="Abstract">Results on mixture and exponential connections by open arcs are revised and used to prove additional duality properties of statistical models.

On Mixture and Exponential Connection by Open Arcs

Marina Santacroce, Paola Siri, and Barbara Trivellato

Dipartimento di Scienze Matematiche “G.L. Lagrange”, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
marina.santacroce@polito.it, paola.siri@polito.it, barbara.trivellato@polito.it

Abstract. Results on mixture and exponential connections by open arcs are revised and used to prove additional duality properties of statistical models.

Keywords: exponential models, mixture models, Orlicz spaces, Kullback-Leibler divergence, dual systems.

1 Introduction

In this paper we review some results on mixture and exponential connections by arcs and their relation to Orlicz spaces. These results are essentially contained in our previous works, as well as in papers by Pistone and different coauthors. We use some of them in order to prove a new theorem concerning the duality between statistical exponential models and Lebesgue spaces. Moreover, the notions of connection by mixture and exponential arcs, as well as divergence finiteness between two densities, are presented here in a unified framework.

The geometry of statistical models started with the paper of Rao [12] and has been described in its modern formulation by Amari [1, 2] and Amari and Nagaoka [3]. Until the nineties, the theory was developed only in the parametric case. The first rigorous infinite dimensional extension was formulated by Pistone and Sempi [11]. In that paper, using the Orlicz space associated to an exponentially growing Young function, the set of positive densities is endowed with a structure of exponential Banach manifold. More recently, different authors have generalized this structure by replacing the exponential function with a new class of functions, called deformed exponentials (see, e.g., Vigelis and Cavalcante [16]). However, the connection to open arcs has not been investigated yet.

The geometry of nonparametric exponential models and its analytical properties in the topology of Orlicz spaces have also been studied in subsequent works, such as Cena and Pistone [6] and Santacroce, Siri and Trivellato [14, 15], among others. In the exponential framework, the starting point is the notion of maximal exponential model centered at a given positive density p, introduced by Pistone and Sempi [11]. One of the main results in Cena and Pistone [6] states that any density belonging to the maximal exponential model centered at p is connected by an open exponential arc to p and vice versa (by open, we essentially mean that the two densities are not the extremal points of the arc).

Further upgrades of these statements have been proved in Santacroce, Siri and Trivellato [14, 15]. In [14], the equivalence between the equality of the maximal exponential models centered at two (connected) densities p and q and the equality of the Orlicz spaces referred to the same densities is proved. In [15], a further equivalent condition, involving transport mappings, is given. Moreover, in the latter paper it is also shown that exponential connection by arc is stable with respect to projections and that projected densities belong to suitable sub-models.

The manifold setting of exponential models, introduced in Pistone and Sempi [11], turns out to be well suited for applications in physics, as some recent papers show (see, e.g., Lods and Pistone [9]). On the other hand, statistical exponential models built on Orlicz spaces have been exploited in several fields, such as differential geometry, algebraic statistics, information theory and, very recently, mathematical finance (see Santacroce, Siri and Trivellato [15]). In a large branch of mathematical finance, convex duality is heavily used to tackle portfolio optimization problems. In particular, the duality between Orlicz spaces has been receiving growing attention (see [4], among others).

In the last section of this paper we prove a general duality result involving the vector space generated by the maximal exponential model, which could be well suited for a financial framework.

2 Mixture and Exponential Arcs

Let $(\mathcal{X}, \mathcal{F}, \mu)$ be a fixed probability space. Denote by $\mathcal{P}$ the set of all densities which are positive $\mu$-a.s. and by $E_p$ the expectation with respect to $p\,d\mu$, for each fixed $p \in \mathcal{P}$.

Let us consider the Young function $\Phi_1(x) = \cosh(x) - 1$, equivalent to the more commonly used $\Phi_2(x) = e^{|x|} - |x| - 1$. Its conjugate function is $\Psi_1(y) = \int_0^y \sinh^{-1}(t)\,dt$, which, in its turn, is equivalent to $\Psi_2(y) = (1 + |y|)\log(1 + |y|) - |y|$.

Given $p \in \mathcal{P}$, we consider the Orlicz space associated to $\Phi_1$, defined by

$$L^{\Phi_1}(p) = \{u \text{ measurable} : \exists\, \alpha > 0 \text{ s.t. } E_p(\Phi_1(\alpha u)) < +\infty\}. \tag{1}$$

Recall that $L^{\Phi_1}(p)$ is a Banach space when endowed with the Luxemburg norm

$$\|u\|_{\Phi_1,p} = \inf\left\{k > 0 : E_p\left(\Phi_1\left(\frac{u}{k}\right)\right) \le 1\right\}. \tag{2}$$

Finally, it is worth noting the following chain of inclusions:

$$L^{\infty}(p) \subseteq L^{\Phi_1}(p) \subseteq L^{a}(p) \subseteq L^{\Psi_1}(p) \subseteq L^{1}(p), \quad a > 1.$$

Definition 1. $p, q \in \mathcal{P}$ are connected by an open exponential arc if there exists an open interval $I \supset [0, 1]$ such that one of the following equivalent relations is satisfied:
1. $p(\theta) \propto p^{(1-\theta)} q^{\theta} \in \mathcal{P}$, $\forall \theta \in I$;
2. $p(\theta) \propto e^{\theta u} p \in \mathcal{P}$, $\forall \theta \in I$, where $u \in L^{\Phi_1}(p)$ and $p(0) = p$, $p(1) = q$.

Observe that connection by open exponential arcs is an equivalence relation.

Let us consider the cumulant generating functional, defined on $L_0^{\Phi_1}(p) = \{u \in L^{\Phi_1}(p) : E_p(u) = 0\}$ by the relation $K_p(u) = \log E_p(e^{u})$. We recall from Pistone and Sempi [11] that $K_p$ is a positive, convex and lower semicontinuous function, vanishing at zero, and that the interior of its proper domain, denoted here by $\overset{\circ}{\mathrm{dom}}\,K_p$, is a nonempty convex set.

For every density $p \in \mathcal{P}$, we define the maximal exponential model at $p$ as

$$\mathcal{E}(p) = \left\{ q = e^{u - K_p(u)} p : u \in \overset{\circ}{\mathrm{dom}}\,K_p \right\} \subseteq \mathcal{P}. \tag{3}$$

We now state one of the central results of [6, 14, 15], which gives equivalent conditions to open exponential connection by arcs, in a complete version containing all the recent improvements.

Theorem 1. (Portmanteau Theorem) Let $p, q \in \mathcal{P}$. The following statements are equivalent.
i) $q \in \mathcal{E}(p)$;
ii) $q$ is connected to $p$ by an open exponential arc;
iii) $\mathcal{E}(p) = \mathcal{E}(q)$;
iv) $\log \frac{q}{p} \in L^{\Phi_1}(p) \cap L^{\Phi_1}(q)$;
v) $L^{\Phi_1}(p) = L^{\Phi_1}(q)$;
vi) $\frac{q}{p} \in L^{1+\varepsilon}(p)$ and $\frac{p}{q} \in L^{1+\varepsilon}(q)$, for some $\varepsilon > 0$;
vii) the mixture transport mapping

$${}^{m}U_{p}^{q} : L^{\Psi_1}(p) \longrightarrow L^{\Psi_1}(q), \qquad v \mapsto \frac{p}{q}\, v \tag{4}$$

is an isomorphism of Banach spaces.

The equivalence of conditions i) to iv) is proved in Cena and Pistone [6]. Statements v) and vi) have been added by Santacroce, Siri and Trivellato [14], while statement vii) has been added by Santacroce, Siri and Trivellato [15]. It is worth noting that, among all conditions of the Portmanteau Theorem, v) and vi) are the most useful from a practical point of view: the first one allows one to switch from one Orlicz space to the other at one's convenience, while the second one permits working with Lebesgue spaces. On the other hand, condition vii), involving the mixture transport mapping, could be a useful tool in physics applications of exponential models, as the recent research on the subject demonstrates (see, e.g., Pistone [10], Lods and Pistone [9], Brigo and Pistone [5]). In these applications, finiteness of the Kullback-Leibler divergence, implied by the Portmanteau Theorem, is a desirable property.

Corollary 1. If $q \in \mathcal{E}(p)$, then the Kullback-Leibler divergences $D(q\|p)$ and $D(p\|q)$ are both finite.

The converse of this corollary does not hold, as the counterexamples in Santacroce, Siri and Trivellato [14, 15] show.

In the following we introduce mixture connection between densities and study its relation with exponential arcs.

Definition 2.
We say that two densities $p, q \in \mathcal{P}$ are connected by an open mixture arc if there exists an open interval $I \supset [0, 1]$ such that $p(\theta) = (1-\theta)p + \theta q$ belongs to $\mathcal{P}$, for every $\theta \in I$.

Connection by open mixture arcs is an equivalence relation, as in the exponential case. Given $p \in \mathcal{P}$, we denote by $\mathcal{M}(p)$ the set of all densities $q \in \mathcal{P}$ which are connected to $p$ by an open mixture arc.

Theorem 2. Let $p, q \in \mathcal{P}$. The following statements are equivalent.
i) $q \in \mathcal{M}(p)$;
ii) $\mathcal{M}(p) = \mathcal{M}(q)$;
iii) $\frac{q}{p}, \frac{p}{q} \in L^{\infty}$.

The previous theorem is the counterpart of the Portmanteau Theorem for open mixture arcs (Santacroce, Siri and Trivellato [14]).

From Theorems 1 and 2, it immediately follows that $\mathcal{M}(p) \subseteq \mathcal{E}(p)$, while, in general, the other inclusion does not hold. A counterexample is given in Santacroce, Siri and Trivellato [14]. Moreover, in the same paper $\mathcal{E}(p)$ and $\mathcal{M}(p)$ are proved to be convex.

The following proposition restates some of the previous results concerning densities either connected by open mixture or exponential arcs or with finite relative divergence. Assuming a different perspective, a new condition is expressed in terms of the ratios $\frac{q}{p}$ and $\frac{p}{q}$, which have to belong to $L^{\infty}$ or to its closure with respect to a suitable topology.

Proposition 1. Let $p, q \in \mathcal{P}$. The following statements are true.
i) $q \in \mathcal{M}(p)$ if and only if $\frac{q}{p}, \frac{p}{q} \in L^{\infty}$;
ii) $q \in \mathcal{E}(p)$ if and only if $\frac{q}{p} \in L^{1+\varepsilon}(p) = \overline{L^{\infty}}^{\Phi_{1+\varepsilon},p}$ and $\frac{p}{q} \in L^{1+\varepsilon}(q) = \overline{L^{\infty}}^{\Phi_{1+\varepsilon},q}$, for some $\varepsilon > 0$, where $\Phi_{1+\varepsilon}(x) = x^{1+\varepsilon}$;
iii) $D(q\|p) < +\infty$ and $D(p\|q) < +\infty$ if and only if $\frac{q}{p} \in L^{\Psi_1}(p) = \overline{L^{\infty}}^{\Psi_1,p}$ and $\frac{p}{q} \in L^{\Psi_1}(q) = \overline{L^{\infty}}^{\Psi_1,q}$.

Proof. Since i) and ii) have already been discussed, we consider only condition iii). To prove it, we just need to observe that $D(q\|p) < +\infty$ if and only if $\frac{q}{p} \in L^{\Psi_1}(p)$ (Cena and Pistone [6]) and that simple functions are dense in $L^{\Psi_1}(p)$ (Rao and Ren [13]).

The next theorem states a closure result concerning densities belonging to the open mixture model.

Theorem 3.
For any $p \in \mathcal{P}$ the open mixture model $\mathcal{M}(p)$ is $L^1(\mu)$-dense in the nonnegative densities $\mathcal{P}_{\geq}$, that is $\overline{\mathcal{M}(p)} = \mathcal{P}_{\geq}$, where the overline denotes the closure in the $L^1(\mu)$-topology. (See Santacroce, Siri and Trivellato [14] for the proof.)

Remark 1. Since $\mathcal{M}(p) \subseteq \mathcal{E}(p)$, we immediately deduce that $\mathcal{E}(p)$ is also $L^1(\mu)$-dense in $\mathcal{P}_{\geq}$. The last result was already proved, by different arguments, in Imparato and Trivellato [8]. As a consequence, the set of positive densities with finite Kullback-Leibler divergence with respect to any $p \in \mathcal{P}$ is $L^1(\mu)$-dense in the set of all densities $\mathcal{P}_{\geq}$. This also corresponds to the choice $\varphi(x) = x(\log(x))^{+}$ in the following result, proved in [14].

Proposition 2. Assume $\varphi : (0, +\infty) \to (0, +\infty)$ is a continuous function. Then the set

$$\mathcal{P}_{\varphi} = \left\{ q \in \mathcal{P} : E_p\left(\varphi\left(\frac{q}{p}\right)\right) < +\infty \right\}$$

is $L^1(\mu)$-dense in $\mathcal{P}_{\geq}$.

3 Dual systems

In this section we show a new duality result concerning the linear space generated by the maximal exponential model. Let $p \in \mathcal{P}$ and define

$$U = \bigcap_{q \in \mathcal{E}(p)} L^1(q), \qquad V = \mathrm{Lin}\{\mathcal{E}(p)\}.$$

Proposition 3. It holds
i) $L^{\Phi_1}(p) \subseteq U$;
ii) $V \subseteq p\,L^{\Psi_1}(p)$.

Proof. In order to prove i) it is sufficient to observe that if $u \in L^{\Phi_1}(p)$ then, by v) of the Portmanteau Theorem, $u \in L^{\Phi_1}(q) \subseteq L^1(q)$ for any $q \in \mathcal{E}(p)$. With regard to ii), we consider $v \in V$. Since $v = \sum_{i \in F} \alpha_i q_i$, with $F$ a finite set, $\alpha_i \in \mathbb{R}$ and $q_i \in \mathcal{E}(p)$, we have that

$$\frac{v}{p} = \sum_{i \in F} \alpha_i \frac{q_i}{p}.$$

From Corollary 1 and Proposition 1 iii) we get $\frac{q_i}{p} \in L^{\Psi_1}(p)$ and the conclusion follows.

The next result is our main contribution. It shows that $U$ and $V$ are dual spaces with the duality given by the bilinear map $(u, v) \mapsto \langle u, v \rangle = E_{\mu}(uv)$ and that the dual system is separated in both $U$ and $V$ (see Grothendieck [7] for a standard reference on general dual systems). Therefore, if we endow $U$ and $V$ with the weak topologies $\sigma(U, V)$ and $\sigma(V, U)$, respectively, they become locally convex Hausdorff topological vector spaces.

Theorem 4. The map

$$\langle \cdot, \cdot \rangle : U \times V \longrightarrow \mathbb{R}, \qquad (u, v) \longmapsto \langle u, v \rangle = E_{\mu}(uv)$$

is a well-defined bilinear form.
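Moreover, the two separation axioms are satisfied:
(a.1) $\langle u, v \rangle = 0 \ \forall u \in U \implies v = 0$ $\mu$-a.s.;
(a.2) $\langle u, v \rangle = 0 \ \forall v \in V \implies u = 0$ $\mu$-a.s.

Proof. The map $\langle \cdot, \cdot \rangle$ is clearly well-defined by the definitions of $U$ and $V$, and its bilinearity trivially follows from the linearity of the expectation.

We first show that statement (a.1) holds. We consider $v \in V$ such that $\langle u, v \rangle = 0$ $\forall u \in U$. Since, $\forall A \in \mathcal{F}$, $\mathbf{1}_A \in U$, we immediately get $\langle \mathbf{1}_A, v \rangle = E_{\mu}(\mathbf{1}_A v) = 0$ and, therefore, $v = 0$ $\mu$-a.s.

With regard to statement (a.2), let us suppose $u \in U$ is such that $\langle u, v \rangle = 0$ $\forall v \in V$. By definition (3) of the maximal exponential model, if $v \in \mathcal{E}(p)$, then $v = e^{w - K_p(w)} p$, with $w \in \overset{\circ}{\mathrm{dom}}\,K_p$, from which $\langle u, v \rangle = e^{-K_p(w)} E_p(u e^{w})$. Then the hypothesis, restricted to $\mathcal{E}(p)$, becomes

$$E_p(u e^{w}) = 0, \quad \forall w \in \overset{\circ}{\mathrm{dom}}\,K_p, \tag{5}$$

from which we will deduce $u = 0$ $\mu$-a.s. In order to do this, we define $A = \{u > 0\}$, which we suppose not negligible, without loss of generality. Let $\bar{w} = c\,\mathbf{1}_A + d\,\mathbf{1}_{A^c}$, where the two constants $c \neq d$ are chosen in order to have $E_p(\bar{w}) = 0$. We now check that $\pm\bar{w}$ belong to $\overset{\circ}{\mathrm{dom}}\,K_p$, which is equivalent to requiring that $\pm\bar{w}$ belong to $L_0^{\Phi_1}(p)$ and $E_p(e^{(1+\varepsilon)(\pm\bar{w})}) < +\infty$ for some $\varepsilon > 0$. In fact, the stronger result

$$E_p(e^{\alpha \bar{w}}) = e^{\alpha c} \int_A p\,d\mu + e^{\alpha d} \int_{A^c} p\,d\mu < +\infty$$

holds for any $\alpha \in \mathbb{R}$.

Condition (5) applied to $\bar{w}$ reads $E_p(u e^{\bar{w}}) = e^{c} E_p(u \mathbf{1}_A) + e^{d} E_p(u \mathbf{1}_{A^c}) = 0$, from which we deduce that

$$E_p(u \mathbf{1}_{A^c}) = -e^{c-d} E_p(u \mathbf{1}_A).$$

Similarly, from $E_p(u e^{-\bar{w}}) = e^{-c} E_p(u \mathbf{1}_A) + e^{-d} E_p(u \mathbf{1}_{A^c}) = 0$, we have that

$$E_p(u \mathbf{1}_{A^c}) = -e^{d-c} E_p(u \mathbf{1}_A).$$

Since $c \neq d$, it follows that $E_p(u \mathbf{1}_A) = E_p(u \mathbf{1}_{A^c}) = 0$ and, therefore, $u = 0$ $\mu$-a.s.

Remark 2. Note that $\mathcal{E}(p)$ is obviously contained in $V \cap \mathcal{P}$. The next example shows that the inclusion is strict.

Example 1. Let $\mathcal{X} = (2, \infty)$, endowed with the probability measure $\mu$ whose Radon-Nikodym derivative with respect to the Lebesgue measure is $\frac{1}{k x (\log x)^2}$ ($k > 0$ normalizing constant). Define the densities $p$, $q_1$ and $q_2 \in \mathcal{P}$, where $q_1(x) = p(x) = 1$ and $q_2(x) = \frac{x-1}{cx}$ ($c > 0$ normalizing constant).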
Since

$$E_p(q_2^{1+\varepsilon}) = E_{\mu}(q_2^{1+\varepsilon}) = \int_2^{\infty} \left(\frac{x-1}{cx}\right)^{1+\varepsilon} \frac{1}{k x (\log x)^2}\,dx < \infty$$

and

$$E_p(q_2^{-\varepsilon}) = E_{\mu}(q_2^{-\varepsilon}) = \int_2^{\infty} \left(\frac{cx}{x-1}\right)^{\varepsilon} \frac{1}{k x (\log x)^2}\,dx < \infty,$$

we deduce that $q_2 \in \mathcal{E}(p)$. Define now $q \in V$ by

$$q(x) := \frac{1}{1-c}\, q_1(x) + \frac{c}{c-1}\, q_2(x) = \frac{1}{1-c} + \frac{c}{c-1}\,\frac{x-1}{cx} = \frac{1}{(1-c)x}, \quad \forall x \in \mathcal{X}.$$

Let us observe that

$$c = \int_2^{\infty} \frac{x-1}{x}\,d\mu(x) = \int_2^{\infty} \frac{x-1}{k x^2 (\log x)^2}\,dx = 1 - \int_2^{\infty} \frac{1}{k x^2 (\log x)^2}\,dx,$$

which implies $c \in (0, 1)$ and thus $q > 0$. Since $\frac{1}{1-c} + \frac{c}{c-1} = 1$, we immediately get $q \in \mathcal{P}$. On the other hand, since for every $\varepsilon > 0$ we get

$$E_p(q^{-\varepsilon}) = E_{\mu}(q^{-\varepsilon}) = (1-c)^{\varepsilon} \int_2^{\infty} \frac{x^{\varepsilon}}{k x (\log x)^2}\,dx = \infty,$$

we infer that $q \notin \mathcal{E}(p)$.

4 Conclusions

In the paper, we review several results contained in our previous works, sometimes presenting them under different perspectives. We prove an original result in Theorem 4, where a duality is stated between the intersection of the spaces $L^1(q)$, when $q$ ranges in a maximal exponential model, and the linear space generated by the exponential model itself. The search for dual systems represents the preliminary step to formulate minimax results, which are fundamental instruments for solving utility maximization problems through convex analysis. Thus, the duality of Theorem 4 could be used in portfolio optimization when allowing for more general sets of strategies than the ones considered in the literature.

References

1. Amari, S.: Differential geometry of curved exponential families-curvatures and information loss. Ann. Stat. 10, 357-385 (1982)
2. Amari, S.: Differential-geometrical methods in statistics. In: Lecture Notes in Statist. vol. 28. Springer-Verlag, New York (1985)
3. Amari, S., Nagaoka, H.: Methods of information geometry. In: Transl. Math. Monogr. vol. 191. American Mathematical Society, Providence, RI; Oxford University Press, Oxford (2000)
4. Biagini, S., Frittelli, M.: A unified framework for utility maximization problems: an Orlicz space approach. Ann. Appl. Probab. 18 (3), 929-966 (2008)
5. Brigo, D., Pistone, G.: Projection based dimensionality reduction for measure valued evolution equations in statistical manifolds. arXiv:1601.04189v3 (2016)
6. Cena, A., Pistone, G.: Exponential Statistical Manifold. AISM 59, 27-56 (2007)
7. Grothendieck, A.: Topological Vector Spaces. Gordon & Breach Science Publishers, London (1973)
8. Imparato, D., Trivellato, B.: Geometry of Extended Exponential Models. In: Algebraic and Geometric Methods in Statistics, pp. 307-326 (2009)
9. Lods, B., Pistone, G.: Information geometry formalism for the spatially homogeneous Boltzmann equation. Entropy 17, 4323-4363 (2015)
10. Pistone, G.: Examples of the application of nonparametric information geometry to statistical physics. Entropy 15, 4042-4065 (2013)
11. Pistone, G., Sempi, C.: An infinite-dimensional geometric structure on the space of all the probability measures equivalent to a given one. Ann. Stat. 23 (5), 1543-1561 (1995)
12. Rao, C.R.: Information and accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc. 37, 81-91 (1945)
13. Rao, M.M., Ren, Z.D.: Theory of Orlicz Spaces. Marcel Dekker Inc., New York (1991)
14. Santacroce, M., Siri, P., Trivellato, B.: New results on mixture and exponential models by Orlicz spaces. Bernoulli 22 (3), 1431-1447 (2016)
15. Santacroce, M., Siri, P., Trivellato, B.: Exponential models by Orlicz spaces and Applications. Submitted (2017)
16. Vigelis, R.F., Cavalcante, C.C.: On ϕ-families of probability distributions. J. Theoret. Probab. 26 (3), 870-884 (2013)
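
Numerical companion to Example 1. The following Python sketch is not part of the original paper; it is a rough illustration, under stated assumptions, of the two facts used in Example 1: that the normalizing constant c of q2 lies in (0, 1), and that the integral defining Ep(q^(-ε)) has no finite value. The function names (`simpson`, `normalizing_c`, `truncated_moment`), the substitution t = log x, the truncation points and the choice ε = 0.5 are all assumptions made for illustration. A finite truncation can only suggest divergence, it does not prove it.

```python
import math

LOG2 = math.log(2.0)
# Normalizing constant k of mu: int_2^inf dx / (x (log x)^2) = 1 / log 2.
K = 1.0 / LOG2


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def normalizing_c(t_max=60.0):
    """c = 1 - int_2^inf dx / (k x^2 (log x)^2), after substituting t = log x.

    The upper limit is truncated at t_max (an assumption); the neglected
    tail is bounded by exp(-t_max).
    """
    tail = simpson(lambda t: math.exp(-t) / (t * t), LOG2, t_max)
    return 1.0 - tail / K


def truncated_moment(eps, t_max, c):
    """(1 - c)^eps * int_2^{exp(t_max)} x^eps / (k x (log x)^2) dx, with t = log x."""
    integral = simpson(lambda t: math.exp(eps * t) / (t * t), LOG2, t_max)
    return (1.0 - c) ** eps * integral / K


if __name__ == "__main__":
    c = normalizing_c()
    print("c =", c)
    # Truncated versions of E_p(q^(-eps)) for growing upper limits exp(t_max).
    for t_max in (20.0, 40.0, 80.0):
        print("t_max =", t_max, "truncated moment =", truncated_moment(0.5, t_max, c))
```

Running the script prints a value of c strictly between 0 and 1 and truncated moments that grow rapidly with the truncation point, in line with the conclusion that q belongs to V ∩ P but not to E(p).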