Invariant geometric structures on statistical models

28/10/2015
Publication GSI2015
OAI : oai:www.see.asso.fr:11784:14331

Abstract

We review the notion of parametrized measure models and tensor fields on them, which encompasses all statistical models considered by Chentsov [6], Amari [3] and Pistone-Sempi [10]. We give a complete description of n-tensor fields that are invariant under sufficient statistics. In the cases n = 2 and n = 3, the only such tensors are the Fisher metric and the Amari-Chentsov tensor. While this has been shown by Chentsov [7] and Campbell [5] in the case of finite measure spaces, our approach allows us to generalize these results to infinite sample spaces and arbitrary n. Furthermore, we give a generalization of the monotonicity theorem and discuss its consequences.

Collection

Invariant geometric structures on statistical models (application/pdf), by Lorenz Schwachhöfer, Nihat Ay, Jürgen Jost, Hong Van Le

Licence

Creative Commons Attribution-ShareAlike 4.0 International

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/11784/14331</identifier><creators><creator><creatorName>Nihat Ay</creatorName></creator><creator><creatorName>Lorenz Schwachhöfer</creatorName></creator><creator><creatorName>Jürgen Jost</creatorName></creator><creator><creatorName>Hong Van Le</creatorName></creator></creators><titles>
            <title>Invariant geometric structures on statistical models</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2015</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Sun 8 Nov 2015</date>
	    <date dateType="Updated">Wed 31 Aug 2016</date>
            <date dateType="Submitted">Wed 19 Sep 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">60fc4902ebacfe49f7f0d49cb42631c35817ae74</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>24726</version>
        <descriptions>
            <description descriptionType="Abstract">
We review the notion of parametrized measure models and tensor fields on them, which encompasses all statistical models considered by Chentsov [6], Amari [3] and Pistone-Sempi [10]. We give a complete description of n-tensor fields that are invariant under sufficient statistics. In the cases n = 2 and n = 3, the only such tensors are the Fisher metric and the Amari-Chentsov tensor. While this has been shown by Chentsov [7] and Campbell [5] in the case of finite measure spaces, our approach allows us to generalize these results to infinite sample spaces and arbitrary n. Furthermore, we give a generalization of the monotonicity theorem and discuss its consequences.

</description>
        </descriptions>
    </resource>

Invariant Geometric Structures on Statistical Models

Lorenz Schwachhöfer (1), joint work with Nihat Ay (2), Jürgen Jost (2) and Hông Vân Lê (3)

GSI, October 30, 2015

(1) TU Dortmund University, Germany; (2) Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; (3) Academy of Sciences of the Czech Republic, Prague

Outline of the talk

1. Statistical models: definition of statistical models / parametrized measure models; sufficient statistics and invariant tensors
2. General parametrized measure models
3. Monotonicity

Statistical models

What is a statistical model or a parametrized measure model? Heuristically speaking, a statistical model is a family $(p(\xi))_{\xi \in M}$ of probability measures on a fixed sample space $\Omega$ which vary "differentiably" with $\xi \in M$, where the parameter space $M$ is a (finite-dimensional) manifold. More precisely, we make the following definition.

Definition (Amari (1980), Ay, Jost, Lê, S. (2013)). Let $\Omega$ be a measure space. A parametrized measure model with a regular density function and reference measure $\mu_0$ is a family of measures given by $p(\xi) = \bar p(\xi; \omega)\,\mu_0$, where $\bar p > 0$ is differentiable in the $\xi$-variable and $\partial_V \log \bar p(\xi; \cdot) \in L^1(\Omega, p(\xi))$ for all tangent vectors $V \in T_\xi M$. We call such a model statistical if all $p(\xi)$ are probability measures, i.e., if $\|p(\xi)\| := p(\xi)(\Omega) = 1$.
Furthermore, we call the model $k$-integrable if $\partial_V \log \bar p(\xi; \cdot) \in L^k(\Omega, p(\xi))$.

Remark: We can always get a statistical model from a parametrized measure model by normalization. If $(p(\xi))_{\xi \in M}$ is a parametrized measure model, then we obtain a statistical model by setting $p_0(\xi) := p(\xi) / \|p(\xi)\|$, where $\|p(\xi)\| := p(\xi)(\Omega)$ (i.e., we use the projectivization of a finite measure). But sometimes it is more convenient to work without this normalization.

A statistic is a measurable function $\kappa : \Omega \to \Omega'$ between two measure spaces. If $(p(\xi))_{\xi \in M}$ is a parametrized measure model (statistical model), then so is $(p'(\xi))_{\xi \in M}$, where $p'(\xi) := \kappa_*(p(\xi))$ and $\kappa_*(\mu)(A') := \mu(\kappa^{-1}(A'))$.

We say that a statistic $\kappa : \Omega \to \Omega'$ is sufficient for the model $(p(\xi))_{\xi \in M}$ if $p(\xi) = \bar p'(\xi; \kappa(\cdot))\,\mu$ for some measure $\mu$ and a function $\bar p'$ on $M \times \Omega'$. This means that the models $(p(\xi))_{\xi \in M}$ and $(p'(\xi))_{\xi \in M}$ carry the same information.

Invariant tensors

Let $(p(\xi))_{\xi \in M}$ be a parametrized measure model on $\Omega$. An invariant tensor field of the model is a tensor field on $M$ such that for any sufficient statistic $\kappa : \Omega \to \Omega'$ it is the pull-back of a tensor field via $\kappa$. Examples of invariant tensor fields:

$$\tau^1_\xi(V) := \int_\Omega \partial_V \log \bar p(\xi; \cdot)\, dp(\xi) = \int_\Omega \partial_V \bar p(\xi; \cdot)\, d\mu_0 = \partial_V \int_\Omega \bar p(\xi; \cdot)\, d\mu_0 = \partial_V \|p(\xi)\|.$$
Thus, $\tau^1 \equiv 0$ for statistical models (i.e., if $\|p(\xi)\| \equiv 1$).

The Fisher metric is a symmetric 2-tensor:

$$\tau^2_\xi(V, W) := \int_\Omega \partial_V \log \bar p(\xi; \cdot)\, \partial_W \log \bar p(\xi; \cdot)\, dp(\xi).$$

The Amari-Chentsov tensor is a symmetric 3-tensor:

$$\tau^3_\xi(V, W, U) := \int_\Omega \partial_V \log \bar p(\xi; \cdot)\, \partial_W \log \bar p(\xi; \cdot)\, \partial_U \log \bar p(\xi; \cdot)\, dp(\xi).$$

We can generalize this to arbitrary degrees. The canonical $n$-tensor is

$$\tau^n_\xi(V_1, \dots, V_n) := \int_\Omega \partial_{V_1} \log \bar p(\xi; \cdot) \cdots \partial_{V_n} \log \bar p(\xi; \cdot)\, dp(\xi).$$

Observe that $\tau^n$ is only defined if the model is $k$-integrable for some $k \ge n$. All of these $n$-tensors $\tau^n$ are invariant under sufficient statistics. Question: Are there any others?

For $n = 2$ and $n = 3$ the answer was given by Chentsov and Campbell if $\Omega$ is finite.

Theorem (Chentsov (1976), Campbell (1986)). Let $\Omega$ be a finite measure space. The only invariant 2-tensors on a parametrized measure model are of the form

$$\sigma(V, W) = f\, \tau^2(V, W) + g\, \tau^1(V)\, \tau^1(W) = f\, \tau^2(V, W) + g\, \partial_V \|p(\xi)\|\, \partial_W \|p(\xi)\|,$$

where $f, g$ are continuous functions depending on $\|p(\xi)\|$. In particular, for a statistical model (i.e., $\|p(\xi)\| \equiv 1$), the only such tensor is (up to a constant) the Fisher metric.

Theorem (Chentsov (1976), Campbell (1986)). Let $\Omega$ be a finite measure space.
The only invariant 3-tensors on a parametrized measure model are of the form

$$\sigma(V, W, U) = f\, \tau^3(V, W, U) + g_1\, \tau^2(V, W)\, \tau^1(U) + g_2\, \tau^2(W, U)\, \tau^1(V) + g_3\, \tau^2(U, V)\, \tau^1(W) + h\, \tau^1(V)\, \tau^1(W)\, \tau^1(U)$$

for functions $f, g_1, g_2, g_3, h$ depending on $\|p(\xi)\|$. In particular, for a statistical model (i.e., $\|p(\xi)\| \equiv 1$), the only such tensor is (up to a constant) the Amari-Chentsov tensor $\tau^3$.

What about invariant tensors for $n \ge 4$? We can define other invariant tensors in two ways.

First, by forming arbitrary tensor products of the $\tau^n$. For instance, the following are invariant 4-tensor fields:

$\sigma_0(V_1, V_2, V_3, V_4) = \tau^4(V_1, V_2, V_3, V_4)$,
$\sigma_1(V_1, V_2, V_3, V_4) = \tau^2(V_1, V_3)\, \tau^2(V_2, V_4)$,
$\sigma_2(V_1, V_2, V_3, V_4) = \tau^1(V_3)\, \tau^3(V_1, V_2, V_4)$,
$\sigma_3(V_1, V_2, V_3, V_4) = \tau^1(V_1)\, \tau^1(V_4)\, \tau^2(V_2, V_3)$,

and so on.

Second, by taking linear combinations of such tensors whose coefficients are functions of $\|p(\xi)\|$, for instance $\|p(\xi)\|^2\, \sigma_1(V_1, V_2, V_3, V_4) + \sigma_3(V_1, V_2, V_3, V_4)$.

We say that such tensors and their linear combinations are algebraically generated by $(\tau^n)_{n \in \mathbb{N}}$. We can extend the result of Chentsov and Campbell to arbitrary measure spaces and tensors of arbitrary degree:

Theorem (Ay, Jost, Lê, S. (2014)). Let $\Omega$ be an arbitrary measure space. Then any invariant tensor on a model on $\Omega$ is algebraically generated by $(\tau^n)_{n \in \mathbb{N}}$ in the sense specified above.
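To make the invariance of the canonical tensors concrete, the following Python sketch evaluates $\tau^1$, $\tau^2$ and $\tau^3$ for a one-parameter model on a four-point sample space and for its image under a sufficient statistic. Everything specific here is an illustrative assumption and not taken from the talk: the weights `WEIGHTS`, the statistic `kappa`, the Bernoulli-type factor `q`, and the finite-difference step `h`.

```python
import math

# Hypothetical example data: Omega = {0, 1, 2, 3}, Omega' = {0, 1}.
WEIGHTS = [0.3, 0.7, 0.4, 0.6]  # independent of xi; sums to 1 on each fibre of kappa


def kappa(omega):
    """Statistic Omega -> Omega' (assumed for illustration): 0,1 -> 0 and 2,3 -> 1."""
    return omega // 2


def model(xi):
    """Point masses of p(xi) on Omega; the density factors through kappa."""
    q = [xi, 1.0 - xi]
    return [q[kappa(w)] * WEIGHTS[w] for w in range(4)]


def pushforward(xi):
    """Point masses of p'(xi) = kappa_* p(xi) on Omega'."""
    out = [0.0, 0.0]
    for w, mass in enumerate(model(xi)):
        out[kappa(w)] += mass
    return out


def canonical_tensor(family, xi, n, h=1e-6):
    """tau^n(d_xi, ..., d_xi) on a finite sample space.

    The logarithmic derivative is approximated by a central difference.
    """
    p = family(xi)
    p_plus, p_minus = family(xi + h), family(xi - h)
    total = 0.0
    for mass, hi, lo in zip(p, p_plus, p_minus):
        dlog = (math.log(hi) - math.log(lo)) / (2.0 * h)
        total += dlog ** n * mass
    return total


if __name__ == "__main__":
    xi = 0.3
    for n in (1, 2, 3):
        print(n, canonical_tensor(model, xi, n), canonical_tensor(pushforward, xi, n))
```

Because the density of `model` is a function of `kappa(omega)` times a weight that does not depend on `xi`, the two columns printed for each `n` should agree up to finite-difference error; for this Bernoulli-type family the value for `n = 2` should be close to `1 / (xi * (1 - xi))`.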
Remark: If $\Omega$ is itself a manifold and $p(\xi)$ consists of smooth densities, then Bauer, Bruveris and Michor showed that any 2-tensor field which is invariant under diffeomorphisms of $\Omega$ must be a multiple of the Fisher metric. Their proof can be generalized to show that in this case any tensor of any degree which is invariant under diffeomorphisms of $\Omega$ is algebraically generated by $(\tau^n)_{n \in \mathbb{N}}$.

General parametrized measure models

When defining statistical models / parametrized measure models with the help of a regular density function as $p(\xi) = \bar p(\xi; \omega)\,\mu_0$ with $\bar p(\xi; \omega) > 0$, some issues are problematic.

For fixed $\xi$, the density function $\bar p(\xi; \cdot)$ is defined only up to changes on a $\mu_0$-null set. How does such a change affect the differentiability of $\bar p$?

Since $\bar p > 0$, all measures $p(\xi)$ have exactly the same null sets. Is this really a necessary restriction?

We propose a new definition. For this, observe that the space $S(\Omega)$ of signed finite measures on $\Omega$ is a vector space, in fact a Banach space. Let $P(\Omega)$ be the set of probability measures on $\Omega$ and $M(\Omega)$ the set of finite measures on $\Omega$. Evidently, $P(\Omega) \subset M(\Omega) \subset S(\Omega)$.

Definition (Ay, Jost, Lê, S. (2015)). Let $\Omega$ be a measure space. A parametrized measure model (statistical model) is a map $p : M \to M(\Omega)$ ($p : M \to P(\Omega)$) which is differentiable as a map between Banach manifolds when regarded as a map into $S(\Omega)$. Thus, the differential is a linear map $dp_\xi : T_\xi M \to S(\Omega)$.

Advantages: There is no need for a reference measure $\mu_0$.
This definition includes all the models previously defined, but also some "strange" examples such as the following. For $\Omega = (0, \pi)$, let

$$p(\xi) := \begin{cases} \left(1 + \xi\, \bigl(\sin^2(t - 1/\xi)\bigr)^{1/\xi^2}\right) dt & \text{for } \xi \ne 0, \\ dt & \text{for } \xi = 0. \end{cases}$$

This example is differentiable in the sense of this definition, but the density function $\bar p(t, \xi)$ w.r.t. $\mu_0 = dt$ is not differentiable. ("Differentiability on the average" suffices.)

The function $\bar p$, and hence $\log \bar p$, is no longer defined. However, we show the following: if $p : M \to M(\Omega)$ is a parametrized measure model, then $dp_\xi(V) \in S(\Omega)$ is dominated by $p(\xi)$ for all $\xi \in M$, $V \in T_\xi M$. Thus, we can define

$$\partial_V \log p := \frac{d\{dp_\xi(V)\}}{dp(\xi)} \in L^1(\Omega, p(\xi))$$

as the Radon-Nikodym derivative. Then $\partial_V p(\xi) = \partial_V \log p \cdot p(\xi)$ holds just as in the presence of a regular density function. That is: the expression $\partial_V \log p$ makes sense, even though $\log p$ does not.

How do we interpret $k$-integrability, i.e., the condition $\partial_V \log p \in L^k(\Omega, p(\xi)) \subset L^1(\Omega, p(\xi))$?

Consider the following rewriting of the canonical tensor $\tau^n$ in the presence of a regular density function, $p(\xi) = \bar p(\xi; \omega)\,\mu_0$:

$$\begin{aligned}
\tau^n(V_1, \dots, V_n) &= \int_\Omega \partial_{V_1} \log \bar p \cdots \partial_{V_n} \log \bar p \; dp(\xi) \\
&= \int_\Omega \frac{\partial_{V_1} \bar p}{\bar p} \cdots \frac{\partial_{V_n} \bar p}{\bar p}\, \bar p \; d\mu_0 \\
&= \int_\Omega \frac{\partial_{V_1} \bar p}{\bar p^{\,1 - 1/n}} \cdots \frac{\partial_{V_n} \bar p}{\bar p^{\,1 - 1/n}} \; d\mu_0 \\
&= n^n \int_\Omega \partial_{V_1} \sqrt[n]{\bar p} \cdots \partial_{V_n} \sqrt[n]{\bar p} \; d\mu_0 \\
&= n^n \int_\Omega d\bigl(\partial_{V_1} \sqrt[n]{\bar p\, \mu_0} \cdots \partial_{V_n} \sqrt[n]{\bar p\, \mu_0}\bigr) \\
&= n^n \int_\Omega d\bigl(\partial_{V_1} p(\xi)^{1/n} \cdots \partial_{V_n} p(\xi)^{1/n}\bigr).
\end{aligned}$$

The factor $n^n$ arises because $\partial_V \bar p / \bar p^{\,1 - 1/n} = n\, \partial_V \sqrt[n]{\bar p}$ for each of the $n$ factors. Can we make sense out of $n$-th roots of measures?

Answer: YES, WE CAN! For $0 < r \le 1$, we can define Banach spaces $(S^r(\Omega), \|\cdot\|_r)$, whose elements may be interpreted as "$r$-th powers of a measure".
They have the subsets $P^r(\Omega) \subset M^r(\Omega) \subset S^r(\Omega)$ of $r$-th powers of (probability) measures. We can work with these quite intuitively:

There is a multiplication map $\cdot : S^r(\Omega) \times S^s(\Omega) \to S^{r+s}(\Omega)$, which is bilinear and bounded if $r, s, r + s \in (0, 1]$.

There is a power raising map $\pi^k : S^r(\Omega) \to S^{kr}(\Omega)$ for all $r, kr \in (0, 1]$. This map is continuous for all $k > 0$ and differentiable for $k \ge 1$.

We now say that a (general) parametrized measure model $p : M \to M(\Omega)$ is $k$-integrable if $p^{1/k} : M \to M^{1/k}(\Omega) \subset S^{1/k}(\Omega)$ is (weakly) differentiable.

Then the equation

$$\tau^n(V_1, \dots, V_n) = n^n \int_\Omega d\bigl(\partial_{V_1} p(\xi)^{1/n} \cdots \partial_{V_n} p(\xi)^{1/n}\bigr)$$

means the following: if we define the canonical form on $S^{1/n}(\Omega)$ as

$$L^n_\Omega(\nu_1, \dots, \nu_n) := n^n \int_\Omega d(\nu_1 \cdots \nu_n),$$

then $\tau^n = (p^{1/n})^*(L^n_\Omega)$.

Many results that a priori need a parametrized measure model with a regular density function can be shown without this assumption. For example:

The result that any invariant tensor is algebraically generated by $(\tau^n)_{n \in \mathbb{N}}$ also holds for this kind of parametrized measure models / statistical models.

If $\kappa : \Omega \to \Omega'$ is a statistic, $p : M \to M(\Omega)$ is a parametrized measure model ($p : M \to P(\Omega)$ a statistical model) and $p' := \kappa_* p$, and if $p$ is $k$-integrable, then so is $p'$. (The proof of this is not trivial since we do not assume that $\kappa$ admits transverse measures!)

For any statistic, the monotonicity formula holds.

Monotonicity

Theorem (Ay, Jost, Lê, S. (2015)). Let $p : M \to M(\Omega)$ be a $k$-integrable parametrized measure model (statistical model) for $k \ge 2$, let $\kappa : \Omega \to \Omega'$ be a statistic, and let $p' := \kappa_* p$ as before, so that $p'$ is also $k$-integrable. Then the Fisher metrics $g^F$, $g'^F$ of $p$, $p'$ satisfy the monotonicity condition

$$g^F(V, V) - g'^F(V, V) \ge 0 \quad \text{for all } V \in T_\xi M.$$

The quantity $g^F(V, V) - g'^F(V, V)$ is the information loss of the model under $\kappa$. That is, even for these general parametrized measure models the information loss is still measured by the Fisher metric.

There is, however, one difference between general parametrized measure models and those with a regular density function. Namely, in the regular case we have an interpretation of equality in the monotonicity formula: let $p : M \to M(\Omega)$, $\kappa : \Omega \to \Omega'$ and $p' := \kappa_* p$ be as in the Monotonicity Theorem, and assume that $p$ is given by a regular density function. Then $g(V, V) - g'(V, V) = 0$ for all $V \in TM$ iff $\kappa$ is a sufficient statistic of the model.

This fails to hold if the existence of a regular density function is not assumed, as the following example shows.

Example: Let $\Omega := (-1, 1) \times (0, 1)$, $\Omega' := (-1, 1)$, and let $\kappa : \Omega \to \Omega'$ be the projection onto the first component. For $\xi \in \mathbb{R}$ define $p(\xi) := \bar p(\xi; s, t)\, ds\, dt$, where

$$\bar p(\xi; s, t) = \begin{cases} \xi^2 & \text{for } \xi \ge 0 \text{ and } s \ge 0, \\ 3 \xi^2 t^2 & \text{for } \xi < 0 \text{ and } s \ge 0, \\ 1 & \text{for } s < 0. \end{cases}$$

This fails to be regular since, for $\xi = 0$, we do not have $\bar p > 0$ a.e. One can easily calculate that $g(\partial_\xi, \partial_\xi) = g'(\partial_\xi, \partial_\xi) = 4$, so there is no information loss. However, $\kappa$ is not sufficient for the model. $\kappa$ is a sufficient statistic when restricting the model to $\xi \in [0, \infty)$ and when restricting to $\xi \in (-\infty, 0]$.
But the measures $\mu_\pm$ for which $p(\xi) = \bar p'(\xi; s)\, \mu_\pm$ are different ($\mu_+ = d(s, t)$, $\mu_- = 3 t^2\, d(s, t)$), and hence $\kappa$ is not a sufficient statistic for the whole model.

References

N. Ay, J. Jost, H. V. Lê, L. Schwachhöfer, Information geometry and sufficient statistics, Probability Theory and Related Fields 162, no. 1-2 (2015), 327-364.

N. Ay, J. Jost, H. V. Lê, L. Schwachhöfer, Parametrized measure models, arXiv:1510.07305.

N. Ay, J. Jost, H. V. Lê, L. Schwachhöfer, Information Geometry (textbook, Springer, 2016?).

Thank you for your attention!
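As a numerical companion to the example in the Monotonicity section (sample space $(-1, 1) \times (0, 1)$, statistic the projection onto $s$), the following Python sketch approximates the Fisher metric of the model and of its push-forward with a midpoint rule, and compares the conditional density of $t$ given $s$ for positive and negative $\xi$. The helper names (`midpoints`, `fisher_full`, `fisher_pushforward`, `conditional_t_given_s`), the grid size and the test values $\xi = \pm 0.5$ are assumptions made for illustration; only the density itself comes from the talk.

```python
def density(xi, s, t):
    """Density p-bar(xi; s, t) of the example with respect to ds dt."""
    if s < 0:
        return 1.0
    return xi * xi if xi >= 0 else 3.0 * xi * xi * t * t


def d_density(xi, s, t):
    """Derivative of the density with respect to xi (computed by hand)."""
    if s < 0:
        return 0.0
    return 2.0 * xi if xi >= 0 else 6.0 * xi * t * t


def midpoints(a, b, n):
    """Midpoints and cell width of a uniform grid with n cells on (a, b)."""
    h = (b - a) / n
    return [a + (i + 0.5) * h for i in range(n)], h


def fisher_full(xi, n=200):
    """g(d_xi, d_xi) on Omega: integral of (d_xi p-bar)^2 / p-bar over ds dt."""
    ss, hs = midpoints(-1.0, 1.0, n)
    ts, ht = midpoints(0.0, 1.0, n)
    return sum(d_density(xi, s, t) ** 2 / density(xi, s, t) * hs * ht
               for s in ss for t in ts)


def fisher_pushforward(xi, n=200):
    """g'(d_xi, d_xi) on Omega': integrate out t first, then apply the same formula."""
    ss, hs = midpoints(-1.0, 1.0, n)
    ts, ht = midpoints(0.0, 1.0, n)
    total = 0.0
    for s in ss:
        p = sum(density(xi, s, t) for t in ts) * ht
        dp = sum(d_density(xi, s, t) for t in ts) * ht
        total += dp * dp / p * hs
    return total


def conditional_t_given_s(xi, s, t, n=200):
    """Conditional density of t given s under p(xi) (requires xi != 0 when s >= 0)."""
    ts, ht = midpoints(0.0, 1.0, n)
    return density(xi, s, t) / (sum(density(xi, s, u) for u in ts) * ht)


if __name__ == "__main__":
    for xi in (0.5, -0.5):
        print(xi, fisher_full(xi), fisher_pushforward(xi),
              conditional_t_given_s(xi, 0.5, 0.5))
```

Both Fisher quantities should come out close to 4 for either sign of $\xi$, in line with the claim that there is no information loss. The conditional density of $t$ given $s \ge 0$, by contrast, is uniform for $\xi > 0$ and proportional to $t^2$ for $\xi < 0$, which is one way to see that the projection cannot be sufficient for the whole model.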