Infinite-Dimensional Manifolds of Finite-Entropy Probability Measures

28/08/2013
Authors: Nigel Newton
OAI: oai:www.see.asso.fr:2552:4863
DOI: 10.23723/2552/4863

License

Creative Commons: None (All rights reserved)

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/2552/4863</identifier><creators><creator><creatorName>Nigel Newton</creatorName></creator></creators><titles>
            <title>Infinite-Dimensional Manifolds of Finite-Entropy Probability Measures</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2013</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Mon 16 Sep 2013</date>
	    <date dateType="Updated">Mon 25 Jul 2016</date>
            <date dateType="Submitted">Wed 19 Sep 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">c283c53b21c4518073a399e06849fedcca318d4a</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>9616</version>
        <descriptions>
            <description descriptionType="Abstract"></description>
        </descriptions>
    </resource>

NJN GSI 2013

Infinite-Dimensional Manifolds of Finite-Entropy Probability Measures
Nigel Newton
School of Computer Science and Electronic Engineering, University of Essex, UK

Information Geometry
• The differentiable structure of sets of probability measures induced by statistical divergences.
• For example, the α-divergences:
$$\mathcal{D}_\alpha(P\,|\,Q) = \begin{cases} \int \log\frac{dP}{dQ}\,dP & \text{if } \alpha = -1, \\ \frac{4}{1-\alpha^2}\left(1 - \int \left(\frac{dQ}{dP}\right)^{(1+\alpha)/2} dP\right) & \text{if } \alpha \in (-1,+1), \\ \int \log\frac{dQ}{dP}\,dQ & \text{if } \alpha = +1. \end{cases}$$
• Kullback-Leibler (KL) divergence (α = −1) related to Shannon information and Boltzmann-Gibbs entropy.

Information Geometry
• We seek manifolds of probability measures $(N, \theta)$ with charts $\theta : N \to B$ such that $\mathcal{D}_\alpha(\theta^{-1}\,|\,\theta^{-1}) : B \times B \to [0, \infty)$ is appropriately smooth.
• Probability measures of interest will be represented by their densities: $p = dP/d\mu$ ($\mu$ is a reference measure).
• KL divergence is bilinear in the density and its log (regarded as elements of dual spaces of functions):
$$\mathcal{D}_{-1}(P\,|\,Q) = \langle p, \log p \rangle - \langle p, \log q \rangle,$$
and so we would like $p$ and $\log p$ to be smooth.

An Exponential Model
• Underlying probability space $(\mathbb{X}, \mathcal{X}, \mu)$.
• Real-valued random variables $(\eta_1, \ldots, \eta_d)$:
– linearly independent elements of $L^0(\mu)$
– $E_\mu \exp\left(\sum_i y_i \eta_i\right) < \infty$ for all $y \in B \subseteq \mathbb{R}^d$ ($B$ open)
• Manifold $(N, \theta)$, where $\theta^{-1} : B \to \mathcal{P}(\mathcal{X})$ is defined by
$$p = \frac{d\theta^{-1}(y)}{d\mu} = \frac{\exp\left(\sum_i y_i \eta_i\right)}{E_\mu \exp\left(\sum_i y_i \eta_i\right)} \quad \text{and} \quad N := \theta^{-1}(B).$$
• Model space is $\mathbb{R}^d$, $\mathcal{D}_\alpha \in C^\infty(N)$.

Extension to Infinite Dimensions?
• The map $\mathbb{R}^d \ni y \mapsto \gamma(y) \in L^0(\mu)$ defined by
$$\gamma(y) = \sum_i y_i \eta_i$$
is a bijection onto a d-dimensional subspace of $L^0(\mu)$.
• We want to move to infinite-dimensional subspaces.
• To control the density (exponential of γ), we need a strong norm/topology: exponential Orlicz space $L^\Psi(\mu)$. (Pistone et al.)
• Finite-dimensional α-models (α ≠ 1) are even harder to extend since no norm on $p^{(1-\alpha)/2}$ will control $\log p$.

Aim
To construct infinite-dimensional statistical manifolds with charts that directly control both $p$ and $\log p$.
• This enables the use of a weaker topology on the model space; one that can be chosen according to the desired smoothness of $\mathcal{D}_\alpha$.
• It emphasises the duality between mixture and exponential representations.

The Manifold $M_\lambda$ ($\lambda \in [2, \infty)$)
• Underlying probability space $(\mathbb{X}, \mathcal{X}, \mu)$.
• $M_\lambda$ is the set of probability measures $P$ on $\mathcal{X}$ with the following properties:
$$P \sim \mu, \qquad E_\mu p^\lambda < \infty \quad \text{and} \quad E_\mu |\log p|^\lambda < \infty.$$
• Model space:
$$G_\lambda := L_0^\lambda(\mu) = \left\{ a : \mathbb{X} \to \mathbb{R} \,:\, E_\mu a = 0,\ E_\mu |a|^\lambda < \infty \right\}.$$
• Chart: $\phi : M_\lambda \to G_\lambda$,
$$\phi(P) := p - 1 + \log p - E_\mu \log p.$$

The Function ψ
• Inverse of the function $(0, \infty) \ni y \mapsto y + \log y \in \mathbb{R}$.
• Convex, linear growth, bounded derivatives of all orders.
• First derivative:
$$\psi^{(1)} = \frac{\psi}{1 + \psi}.$$
[Figure: graph of ψ(z) against z.]

Proposition
1) The map φ is a bijection. Its inverse is given by
$$p(x) = \psi\big(\rho(a)(x)\big) \qquad (a \in G),$$
where $\rho(a)(x) = a(x) + Z(a)$, and $Z(a)$ is the unique number for which the right-hand side is a probability density.
2) The map $\rho : G \to L^\lambda(\mu)$ is of class $C^{\lceil\lambda\rceil - 1}$; its first derivative is
$$D\rho_a u = u - \frac{E_\mu\, \psi^{(1)}(\rho(a))\, u}{E_\mu\, \psi^{(1)}(\rho(a))}.$$

The Tangent Space
• A tangent vector $U$ at $P \in M$ is an equivalence class of differentiable curves at $P$:
$$(P_t,\ t \in (-\delta, \delta)) \equiv (Q_t,\ t \in (-\varepsilon, \varepsilon)) \quad \text{if } P_0 = Q_0 \text{ and } \dot{P}_0 = \dot{Q}_0.$$
• $T_PM$ = "linear space of all tangent vectors at $P$".
• Tangent bundle: $TM = \bigcup_{P \in M} T_PM$.
• Global chart: $\Phi : TM \to G \times G$,
$$\Phi(P, U) := \left( \phi(P),\ \frac{d}{dt}\phi(P_t)\Big|_{t=0} \right) = (a, u), \quad \text{where } (P_t) \in U \in T_PM.$$

Differentiable Maps
• Let $f : M \to Y$ (a Banach space).
• If, for every $P \in M$, there exists a continuous linear map $df_P : T_PM \to Y$ such that
$$\frac{d}{dt} f(P_t)\Big|_{t=0} = df_P\, U \quad \text{for all } (P_t) \in U \in T_PM,$$
then we say that $f$ is d-differentiable, with derivative $df_P$.
• Weaker than Fréchet differentiability, but stronger than Gâteaux differentiability.
• For d-differentiable maps, we use the notation
$$Uf := df_P\, U \quad \left( = D(f \circ \phi^{-1})_a u \ \text{ if } f \text{ is Fréchet differentiable} \right).$$

The Amari Embedding Maps
• Let $F_\alpha : M \to L^2(\mu)$ be defined by
$$F_\alpha(P) = \begin{cases} \frac{2}{1-\alpha}\, p^{(1-\alpha)/2} & \text{if } \alpha \in [-1, 1), \\ \log p & \text{if } \alpha = 1. \end{cases}$$
• Range is chosen to be $L^2(\mu)$ since
$$\mathcal{D}_\alpha(P\,|\,Q) = \begin{cases} \langle F_{-1}(P),\ F_1(P) - F_1(Q) \rangle_{L^2(\mu)} & \text{if } \alpha = -1, \\ \frac{4}{1-\alpha^2} - \langle F_\alpha(P),\ F_{-\alpha}(Q) \rangle_{L^2(\mu)} & \text{if } \alpha \in (-1, +1), \\ \langle F_{-1}(Q),\ F_1(Q) - F_1(P) \rangle_{L^2(\mu)} & \text{if } \alpha = +1. \end{cases}$$
• In the exponential Orlicz manifold $F_\alpha : V_q \to L^{2/(1-\alpha)}(Q)$.

Differentiability of $F_\alpha$
Proposition
1) The map $F_\alpha$ is of class $C^{\lceil\lambda/2\rceil - 1}$.
2) If λ = 2, the map $F_\alpha$ is d-differentiable.
The first derivative is
$$U F_\alpha = \frac{p^{(1-\alpha)/2}}{1+p}\, D\rho_a u, \quad \text{where } (a, u) = \Phi(P, U).$$

Differentiability of the Divergences
Proposition
1) For any $0 \le i, j \le \lfloor\lambda\rfloor - 1$ with $i + j \le \lceil\lambda\rceil - 1$, the map $\mathcal{D}_\alpha : M \times M \to [0, \infty)$ is of class $C^{i,j}$.
2) If λ = 2, the d-derivatives $d_1 D_2 \mathcal{D}_\alpha = d_2 D_1 \mathcal{D}_\alpha$ exist.
The first few derivatives are
$$U\mathcal{D}_\alpha(\cdot\,|\,Q) = \langle F_{-\alpha}(P) - F_{-\alpha}(Q),\ U F_\alpha \rangle_{L^2(\mu)},$$
$$V\mathcal{D}_\alpha(P\,|\,\cdot) = \langle F_\alpha(Q) - F_\alpha(P),\ V F_{-\alpha} \rangle_{L^2(\mu)}, \qquad (P, U), (Q, V) \in TM,$$
$$UV\mathcal{D}_\alpha(\cdot\,|\,\cdot) = -\langle U F_\alpha,\ V F_{-\alpha} \rangle_{L^2(\mu)}.$$

The Fisher Metric
• From the Eguchi relations, setting $Q = P$,
$$\langle U, V \rangle_P := -UV\mathcal{D}_\alpha = \langle U F_\alpha,\ V F_{-\alpha} \rangle_{L^2(\mu)} = E_\mu \frac{p}{(1+p)^2}\, D\rho_a u\, D\rho_a v.$$
• This defines an inner product on $T_PM$ with $\|U\|_P \le \|u\|_G$.
• But $\|\cdot\|_P$ is not equivalent to $\|\Phi_2\|_G$, even if λ = 2.
• $(M, \langle \cdot, \cdot \rangle)$ is a pseudo-Riemannian manifold.
• Seems to be unavoidable in infinite dimensions.

Third Derivatives
• If λ > 3, $\mathcal{D}_\alpha$ admits the third derivative
$$D_1^2 D_2 \mathcal{D}_\alpha(\phi^{-1}\,|\,\phi^{-1})_{a,a}(u, v; w) = -E_\mu \frac{p}{(1+p)^2}\, D\hat{\rho}_a y\, D\rho_a w,$$
where
$$y = \Gamma_\alpha(a, u, v) := \frac{1-\alpha}{2}\left( U F_1\, V F_1 - E_\mu\, U F_1\, V F_1 \right) - \frac{1+\alpha}{2}\, p \left( U F_1\, V F_1 - \langle U, V \rangle_P \right)$$
and $\hat{\rho} : G_{\lambda/2} \to L^{\lambda/2}(\mu)$.
• If $u$ and $v$ are such that $y \in G_\lambda$,
$$D_1^2 D_2 \mathcal{D}_\alpha(\phi^{-1}\,|\,\phi^{-1})_{a,a}(u, v; w) = -\langle Y, W \rangle_P.$$

α-Derivatives
• Let $\mathcal{V}_s^l$ be the set of vector fields $\mathbf{U} : M \to TM$ for which
$$\mathbf{u} \in C^l(M; L^{\lambda s}(\mu)) \qquad (s \in [1, \infty],\ l \in \mathbb{N}_0).$$
• For any $\mathbf{U} \in \mathcal{V}_s^0$ we can use the Eguchi relations to define an α-derivative as follows:
$$\nabla^\alpha_{\mathbf{U}} : \mathcal{V}^1_{s/(s-1)} \to \mathcal{V}^0_1, \qquad \nabla^\alpha_{\mathbf{U}} \mathbf{V} = \Phi^{-1}\big( a,\ \mathbf{U}\mathbf{v} + \Gamma_\alpha(a, \mathbf{u}, \mathbf{v}) \big).$$
• This does not define an operator $\nabla^\alpha$ with domain $\mathcal{V}_1^0 \times \mathcal{V}_1^1$.
• This is a consequence of the non-equivalence of norms on $T_PM$. However, putting s = ∞, it does define a limited notion of α-parallel transport on the tangent bundle.

Finite-Dimensional Submanifolds
Let $(N, \theta)$ be a finite-dimensional $C^n$-embedded submanifold of $M$, for some 1 ≤ n ≤ ∞.
• The Fisher metric on $N$ is a Riemannian metric:
$$g(P)_{i,j} = \langle \partial_i, \partial_j \rangle_P = E_\mu \frac{p}{(1+p)^2}\, D\rho_a \partial_i\phi\, D\rho_a \partial_j\phi.$$
• If λ > 3 and n ≥ 2, $N$ admits the full geometry of α-covariant derivatives:
$$\Gamma^N_\alpha(P)^k_{i,j} = g(P)^{k,l}\, E_\mu \frac{p}{(1+p)^2}\, D\rho_a \Gamma_\alpha(a, \partial_i\phi, \partial_j\phi)\, D\rho_a \partial_l\phi.$$
• Finite-dimensional α-models fit this framework.

A Few References
• A. Cena, G. Pistone, Exponential statistical manifold, AISM 59, 27-56 (2007).
• P. Gibilisco, G. Pistone, Connections on non-parametric statistical manifolds by Orlicz space geometry, Infinite Dimensional Analysis: Quantum Probability and Related Topics 1, 325-347 (1998).
• M.R. Grasselli, Dual connections in non-parametric classical information geometry, AISM 62, 873-896 (2010).
• N.J. Newton, An infinite-dimensional statistical manifold modelled on Hilbert space, J. Functional Anal. 263, 1661-1681 (2012).
• N.J. Newton, Infinite-dimensional statistical manifolds based on a balanced chart, arXiv:1308.3602v1 [math.PR] (2013).
• G. Pistone, M.P. Rogantin, The exponential statistical manifold: mean parameters, orthogonality and space transformations, Bernoulli 5, 721-760 (1999).
• G. Pistone, C. Sempi, An infinite-dimensional geometric structure on the space of all probability measures equivalent to a given one, Ann. Statist. 23, 1543-1561 (1995).
• R.F. Vigelis, C.C. Cavalcante, On ϕ-families of probability distributions, J. Theoretical Probability (2011).
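
Numerical illustration (not part of the slides). The slides "The Function ψ" and "Proposition" describe the inverse chart as p = ψ(a + Z(a)), with Z(a) fixed by normalisation. The following is a minimal sketch of that recipe on a finite sample space, assuming a three-point reference measure `mu` and an arbitrary centred function `a`. The function names `psi`, `inverse_chart` and `chart`, the Newton and bisection solvers, and all numerical values are illustrative assumptions, not taken from the talk.

```python
import math

def psi(z, tol=1e-14, max_iter=200):
    """psi(z): the unique y > 0 with y + log(y) = z."""
    # Solve exp(t) + t = z for t = log(y); the left side is convex and
    # increasing in t, so Newton's method started at or to the right of
    # the root decreases monotonically towards it.
    t = math.log(z) if z >= 1.0 else min(z, 0.0)
    for _ in range(max_iter):
        e = math.exp(t)
        step = (e + t - z) / (e + 1.0)
        t -= step
        if abs(step) <= tol * (1.0 + abs(t)):
            break
    return math.exp(t)

def inverse_chart(a, mu, iters=200):
    """Return (p, Z) with p_i = psi(a_i + Z) and sum_i mu_i * p_i = 1."""
    # psi(1) = 1, so these two shifts bracket the normalising constant Z(a).
    lo, hi = 1.0 - max(a), 1.0 - min(a)
    for _ in range(iters):  # plain bisection on the total mass
        mid = 0.5 * (lo + hi)
        mass = sum(m * psi(x + mid) for x, m in zip(a, mu))
        if mass < 1.0:
            lo = mid
        else:
            hi = mid
    Z = 0.5 * (lo + hi)
    return [psi(x + Z) for x in a], Z

def chart(p, mu):
    """phi(P) = p - 1 + log p - E_mu[log p], evaluated pointwise."""
    mean_log = sum(m * math.log(q) for q, m in zip(p, mu))
    return [q - 1.0 + math.log(q) - mean_log for q in p]

# Hypothetical example data: a three-point space with reference measure mu.
mu = [0.2, 0.3, 0.5]
raw = [1.0, -0.5, 0.3]
mean = sum(m * x for x, m in zip(raw, mu))
a = [x - mean for x in raw]          # centre so that E_mu[a] = 0

p, Z = inverse_chart(a, mu)          # density of phi^{-1}(a) w.r.t. mu
a_back = chart(p, mu)                # applying phi should recover a
```

Working in t = log y keeps the Newton iterates away from the boundary y = 0, and the bracket for Z(a) relies only on ψ(1) = 1 and the monotonicity of ψ. On this example `a_back` agrees with `a` up to solver tolerance, which is the bijection statement of the Proposition in miniature.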