Abstract

Computational Information Geometry (CIG) in Statistics: foundations


Licence

Creative Commons: None (All rights reserved)

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/2552/4885</identifier><creators><creator><creatorName>Karim Anaya-Izquierdo</creatorName></creator><creator><creatorName>Paul Marriott</creatorName></creator><creator><creatorName>Paul Vos</creatorName></creator><creator><creatorName>Frank Critchley</creatorName></creator></creators><titles>
            <title>Computational Information Geometry (CIG) in Statistics: foundations</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2013</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Mon 16 Sep 2013</date>
	    <date dateType="Updated">Wed 31 Aug 2016</date>
            <date dateType="Submitted">Fri 20 Jul 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">6be80dd06a3fdc9716907a56c311bfd0fd4cb25d</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>25094</version>
        <descriptions>
            <description descriptionType="Abstract"></description>
        </descriptions>
    </resource>

Computational Information Geometry (CIG) in Statistics: foundations
Karim Anaya-Izquierdo, Frank Critchley (FC), Paul Marriott and Paul Vos
(Bath, OU, Waterloo and East Carolina)
GSI'13: Paris, August 2013

Anaya-Izquierdo, FC, Marriott & Vos: CIG in Statistics: foundations

Sections: Introduction; Discretisation; Extended Multinomial IG; Example 1 (continued); Conclusion

INTRODUCTION
Contents: Overall aim of CIG in Statistics; Key Idea; Statistical Science (... needs Sensitivity Analysis); Agenda

OVERALL AIM:
The power and elegance of IG have yet to be fully exploited in statistical practice, to which end ...
the overall aim of CIG here is to provide tools to help resolve outstanding problems in statistical science, via ...
an operational `universal space of all possible models', ...
such problems including:
- (local-to-global) sensitivity analysis
- handling both data and model uncertainty
- inference in graphical & related models
- transdimensional & other issues in MCMC
- mixture estimation (see Paul Marriott's (PM's) talk)

KEY IDEA:
NB: Statistical model ↔ (sample space $\Omega$, {probability distributions on $\Omega$}).
- Represent inference problems arising in such models inside adequately large but finite dimensional spaces.
- In these embedding spaces, the building blocks of IG in statistics are explicit, computable & algorithmically usable.
- Modulo a possible initial discretisation, for a r.v. of interest, an operational universal model space ↔ the simplex:
  $\Delta^k := \{\pi = (\pi_0, \pi_1, \ldots, \pi_k) : \pi_i \ge 0,\ \sum_{i=0}^{k} \pi_i = 1\}$, (1)
  having a unique label for each vertex, representing the r.v.
- Multinomials on $k + 1$ categories ↔ $\mathrm{int}(\Delta^k)$, the relative interior (r.i.) of $\Delta^k$
- (1) allows distributions with different support sets.

(One Iteration of) STATISTICAL SCIENCE:
Working Problem Formulation: WPF = (Q, p.c., model, data, inference) ⇒
- A Q takes the form: `what is $\theta_Q \equiv \theta_Q[F]$?', so that $\theta_Q$ has the same (= population) meaning in all models
- perturbations of problem formulation are pertinent ⇒ sensitivity analyses are sensible
- perturb (weight) data via CSF: see CALB, (2001), JRSS, B
- Focus: (perturb) the working model, $M$ say, ... a set of (often, explicitly parameterised) distributions on $\Omega$

AGENDA:
Represent working model $M$ by a subset of $\Delta^k$ (cf. coarse-graining).
Use IG of $\Delta^k$ to: numerically compute statistically important features of $M$ ... including:
- properties of likelihood (can be nontrivial here)
- adequacy of first order asymptotic methods ... notably, via higher order asymptotic expansions
- curvature based dimension reduction
- mixture model structure/inference (see PM's talk).
Focus: ideas, not proofs (given in arXiv paper [2]).

DISCRETISATION
Contents: Key question for a computational theory; Example 1; Information loss under discretisation; Theorem 1: likelihood ratios; Theorem 2: Amari structure; Extensions

KEY QUESTION:
CIG approach is inherently discrete and finite.
- Sometimes: this is without loss.
- In general: ∃ appropriate theory for suitably fine partitions:
  - cost: some loss of generality (obvious ~ relation induced).
  - benefit: excellent foundation for a computational theory.
- ... while: FMP ⇒ models can (arguably, should) be seen as fundamentally categorical.
Poses the key question: What is the effect on the inferential objects of interest of a particular selection of such categories?
Addressed in Theorems 1 & 2 but, first, ...

EXAMPLE 1: leukaemia patient data
- 43 survival times $Z$ from diagnosis, measured in days
- Q: what is the mean survival time $\mu \equiv \mu[F]$?
- for (later) expository purposes: suppose $Z \sim$ Exponential, but only observe censored value $Y = \min\{Z, t\}$ ⇒ ... $Y$ a 1-D curved exponential family (EF), inside a 2-D regular EF [PM & West (2002)]
- $t$ chosen to give reasonable, but not perfect, fit
- directly illustrates 2 points:
  1. whereas model is continuous, data are discrete ⇒ ZERO loss in treating them as sparse categorical
  2. a further level of coarseness using bin size = 4 days produces effectively NO inferential loss ...
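The second point of Example 1 can be explored with a small simulation. The sketch below is only illustrative: it uses simulated survival times rather than the leukaemia data (which are not reproduced here), and the censoring time `T_CENSOR`, the mean `MU_TRUE` and the seed are hypothetical choices. It bins the censored variable Y = min{Z, t} at widths of 1 and 4 days, treats the counts as multinomial, and compares the resulting log-likelihood ratios for the mean.

```python
import math
import random

T_CENSOR = 1200    # hypothetical censoring time t (days), a multiple of 4
MU_TRUE = 1000.0   # hypothetical true mean survival time (days)
N = 43             # sample size, as in Example 1

def bin_probs(mu, t, width):
    """Bin probabilities of Y = min(Z, t) with Z ~ Exponential(mean mu).

    Bins are [0, w), [w, 2w), ..., [t - w, t), plus one final bin
    carrying the point mass P(Y = t) of the censored observations.
    """
    edges = [j * width for j in range(t // width + 1)]
    probs = [math.exp(-a / mu) - math.exp(-b / mu)
             for a, b in zip(edges[:-1], edges[1:])]
    probs.append(math.exp(-t / mu))
    return probs

def bin_counts(ys, t, width):
    """Counts per bin; censored values go to the last bin."""
    counts = [0] * (t // width + 1)
    for y in ys:
        counts[t // width if y >= t else y // width] += 1
    return counts

def loglik(mu, counts, t, width):
    """Multinomial log-likelihood, up to a constant not involving mu."""
    return sum(n * math.log(p)
               for n, p in zip(counts, bin_probs(mu, t, width)) if n > 0)

random.seed(0)
# Survival times recorded in whole days, then censored at T_CENSOR.
ys = [min(int(random.expovariate(1.0 / MU_TRUE)), T_CENSOR) for _ in range(N)]

def log_lr(mu, mu0, width):
    counts = bin_counts(ys, T_CENSOR, width)
    return loglik(mu, counts, T_CENSOR, width) - loglik(mu0, counts, T_CENSOR, width)

# Largest gap between the 1-day and 4-day log-likelihood ratios.
max_gap = max(abs(log_lr(mu, MU_TRUE, 1) - log_lr(mu, MU_TRUE, 4))
              for mu in (600.0, 800.0, 1400.0, 1800.0))
print(max_gap)
```

Because the simulated times are already recorded in whole days, the 1-day binning loses nothing, and `max_gap` measures only the extra cost of the coarser 4-day bins.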
EXAMPLE 1: log-likelihood for interest parameter $\mu$
Panel (a): bin size: circles = 1 day; solid line = 4 days
[Figure: panel (a) "Log-likelihoods", axes mu and log-likelihood; panel (b) "Full exponential family", axes theta1 and theta2]

Information loss under discretisation
- for continuous r.v.'s, need to: truncate & discretise $\Omega$ into a finite number of bins.
- Theorems 1 & 2 show: the associated information loss can be made arbitrarily small.
- Key: control bin-conditional moments of r.v.'s of interest, uniformly in the parameters of the model.

THEOREM 1: likelihood ratios
Let:
- $f(x; \theta)$, $\theta \in \Theta$, be a family of density functions with common support $\mathcal{X} \subset \mathbb{R}^d$, each continuously differentiable on $\mathrm{r.i.}(\mathcal{X}) \neq \emptyset$
- $\mathcal{X}$ be compact
- $\{\|\partial f(x; \theta)/\partial x\| : x \in \mathcal{X}\}$ be uniformly bounded in $\theta \in \Theta$.
Then, $\forall \epsilon > 0$ and $\forall$ sample sizes $N > 0$, $\exists$ a finite measurable partition $\{B_i\}_{i=0}^{k(\epsilon, N)}$ of $\mathcal{X}$ such that: for all $(x_1, \ldots, x_N) \in \mathcal{X}^N$, and for all $(\theta_0, \theta) \in \Theta^2$,
  $\left| \log \frac{\mathrm{Lik}_c(\theta)}{\mathrm{Lik}_c(\theta_0)} - \log \frac{\mathrm{Lik}_d(\theta)}{\mathrm{Lik}_d(\theta_0)} \right| \le \epsilon$
where $\mathrm{Lik}_c$ and $\mathrm{Lik}_d$ are the likelihood functions for the continuous and discretised distributions respectively.

Theorem 2 considers discretisation of an EF ⇒ the tools of classical IG can be applied
- In general: a discretised full EF ≠ a full EF, and ∃ information loss
- However, Theorem 2 shows:
  - this loss can be made inferentially unimportant
  - all IG results on the 2 families can be made arbitrarily close

THEOREM 2: Amari structure
Let:
- $f(x; \theta) = \nu(x) \exp\{\theta^T s(x) - \psi(\theta)\}$, $x \in \mathcal{X}$, $\theta \in \Theta$, be an EF satisfying the regularity conditions of Amari (1990), p.16
- $s(x)$ be uniformly continuous
- $s(\mathcal{X})$ be compact.
Then, $\forall \epsilon > 0$, $\exists$ a finite measurable partition $\{B_i\}_{i=0}^{k(\epsilon)}$ of $\mathcal{X}$ such that, for all choices of bin labels $s_i \in s(B_i)$: all terms of Amari's IG for $f(x; \theta)$ ... can be approximated to the relevant order of $\epsilon$ ... by the corresponding terms for the discretised family:
  $\{(\pi_i(\theta), s_i) : \pi_i(\theta) := \int_{B_i} f(x; \theta)\,dx,\ s_i \in s(B_i)\}$.
In particular, ...
(a) for all $\theta \in \Theta$, and any norm, $\|\mu_c(\theta) - \mu_d(\theta)\| = O(\epsilon)$, where $\mu_c(\theta) = \int_{\mathcal{X}} x f(x; \theta)\,dx$ and $\mu_d(\theta) = \sum_i s_i \pi_i(\theta)$.
(b) the expected Fisher information matrices for $\theta$ of $f(x; \theta)$ and of $\{\pi_i(\theta)\}$, denoted $I_c(\theta)$ and $I_d(\theta)$ resp., satisfy: $\|I_c(\theta) - I_d(\theta)\|_\infty = O(\epsilon^2)$
(c) the skewness tensors [Amari, (1990), p. 105] $T_c(\theta)$ and $T_d(\theta)$, for $f(x; \theta)$ and $\{\pi_i(\theta)\}$ resp., satisfy: $\|T_c(\theta) - T_d(\theta)\|_\infty = O(\epsilon^3)$.

EXTENSIONS:
- Above: compactness condition keeps the geometry finite.
- Later paper: case where compactness not needed.
- There: `space of all distributions' = (closure of) ∞-D simplex
  - extending classical IG ⇒ convergence issues
  - use appropriate Hilbert space structures ... esp., to bound loss of inferential information when move to finite (⇒ computable) simplex.

EXTENDED MULTINOMIAL IG
Contents: Affine geometries; Extended trinomial IG; The shape of the log-likelihood; Spectrum of Fisher information; Closure; Total positivity and the convex hull

BACKGROUND:
- IG ↔ ±1 affine geometries, non-linearly related via duality & Fisher information.
- In a full EF context:
  - +1 geometry ↔ natural parameterisation
  - −1 geometry ↔ mixture parameterisation.
- Closures of EFs have been studied by, e.g.: B-N ('78), Brown ('86), Lauritzen ('96) & Rinaldo (2006) and, in ∞-D case: Csiszar & Matus (2005).
- Here, rather than pointwise limits, focus = limits of families of distributions.

IG theory follows Amari (1990) via Murray & Rice's (1993) affine space construction, extended by Marriott (2002).
Recall: r.v.'s take values in a finite set of categories (bins) $B = \{B_i\}_{i \in I}$ ⇒ distribution = set of corresponding probabilities $\{\pi_i\}_{i \in I}$
NB: identify bin $B_i$ with its label $i \in I = \{0, \ldots, k\}$

−1 affine space structure over distributions on $B$: $(A_{mix}, V_{mix}, +)$ where:
  $A_{mix} = \{\{a_i\}_{i \in I} : \sum_{i \in I} a_i = 1\}$, $V_{mix} = \{\{v_i\}_{i \in I} : \sum_{i \in I} v_i = 0\}$
and `+' is the usual addition of sequences.
$\Delta^k$ is a −1-convex subset of $(A_{mix}, V_{mix}, +)$.

+1 affine space structure over distributions on $B$:
- {sets of distributions with same support} form a simplicial complex
- support ↔ $\emptyset \neq F \subseteq I$ ... where `F' connotes `face'
- each $F$ has a separate +1 structure: $(A_{exp,F}, V_{exp,F}, \oplus_F)$
- defining $\sim_F$ on $A_F := \{\{a_i\}_{i \in F} : a_i > 0\}$ by
  $\{a_i\} \sim_F \{b_i\} \Leftrightarrow \exists \lambda > 0$ s.t. $\forall i \in F$, $a_i = \lambda b_i$,
  we put: $A_{exp,F} := A_F / \sim_F$ and $V_{exp,F} := \{\{v_i\}_{i \in F} : v_i \in \mathbb{R}\}$,
  defining $\oplus_F$ by $\langle \{a_i\} \rangle \oplus_F \{v_i\} := \langle \{a_i \exp(v_i)\} \rangle$.
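The two translation operations just defined can be written out directly. This is a minimal sketch, not the authors' implementation: the function names `mix_translate` and `exp_translate` are hypothetical, and each equivalence class under the scaling relation is represented here by its normalised member (the one summing to 1).

```python
import math

def mix_translate(a, v):
    """-1 (mixture) translation: a + v, where sum(a) = 1 and sum(v) = 0."""
    assert abs(sum(a) - 1.0) < 1e-12 and abs(sum(v)) < 1e-12
    return [ai + vi for ai, vi in zip(a, v)]

def exp_translate(a, v):
    """+1 (exponential) translation on a face: <a> (+)_F v = <a_i exp(v_i)>.

    Requires every a_i > 0, so the support (the face F) is unchanged.
    Returns the normalised representative of the resulting class.
    """
    assert all(ai > 0 for ai in a)
    raw = [ai * math.exp(vi) for ai, vi in zip(a, v)]
    total = sum(raw)
    return [r / total for r in raw]

pi = [0.5, 0.3, 0.2]
mixed = mix_translate(pi, [0.1, -0.05, -0.05])
tilted = exp_translate(pi, [0.0, 1.0, 2.0])
# Adding a constant to every v_i rescales all a_i exp(v_i) equally,
# so the equivalence class (and its normalised member) is unchanged.
shifted = exp_translate(pi, [5.0, 6.0, 7.0])
# Translating twice matches translating once by the summed vector.
twice = exp_translate(exp_translate(pi, [0.0, 1.0, 2.0]), [1.0, -1.0, 0.5])
once = exp_translate(pi, [1.0, 0.0, 2.5])
print(mixed, tilted)
```

Note that `exp_translate` can never leave the face it starts on, whereas `mix_translate` can reach the boundary or leave the simplex altogether. This is one way to see why each face carries its own +1 structure.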
Extended TRInomial IG: (obvious extensions ⇒ general case in [2])
- $\Delta^2$: bin probabilities $\pi = (\pi_0, \pi_1, \pi_2)$, $\pi_i \ge 0$
- panels (a) to (d) show ±1 geodesics in ±1 parameters
  - panels (a), (c) ↔ $\Delta^2$ in −1 (mixture) parameters
  - panels (b), (d) ↔ +1 (natural) parameters (each $\pi_i > 0$)
- $c^T = (1, 2, 3)$, $X \sim \mathrm{Trinomial}(1; \pi)$
  - (a), (b): blue lines = level sets of $E(c^T X)$ = −1 geodesics
  - (d), (c): black lines = 1-D full EFs* = +1 geodesics
    *with probabilities of form: $\pi_i \exp(\theta c_i) / \sum_{j=0}^{2} \pi_j \exp(\theta c_j)$
- these −1-parallel blue lines & +1-parallel black lines ... are everywhere orthogonal w.r.t. the Fisher information metric

[Figure: four panels: (a) −1-geodesics in −1-simplex; (b) −1-geodesics in +1-simplex; (c) +1-geodesics in −1-simplex; (d) +1-geodesics in +1-simplex]

- (b): −1 geodesics nonlinear in +1 parameters ... & vice versa: see (c).
- (a): −1 geodesics extend naturally to the boundary in −1-parameters
- (c): limits of +1 geodesics lie in boundary of $\Delta^2$; define +1 closure s.t. these continuous limits are defined `at ∞' in +1-parameters:
  - shown schematically as dotted triangle in (b);
  - key to understanding the simplicial nature of +1-geometry.
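The orthogonality claim for the trinomial example can be checked numerically. The sketch below assumes the standard form of the Fisher metric on the simplex in mixture coordinates, g(u, v) = Σ u_i v_i / π_i for tangent vectors summing to zero. The helper names and the three test points are hypothetical. The direction (1, −2, 1) is chosen because it sums to zero and is orthogonal to c = (1, 2, 3), so it lies along a level set of E(cᵀX).

```python
def fisher_inner(pi, u, v):
    """Fisher inner product at pi of two mixture-coordinate tangent vectors."""
    return sum(ui * vi / p for ui, vi, p in zip(u, v, pi))

def exp_geodesic_tangent(pi, c):
    """d/dtheta of pi_i exp(theta c_i) / sum_j pi_j exp(theta c_j) at theta = 0."""
    mean_c = sum(p * ci for p, ci in zip(pi, c))
    return [p * (ci - mean_c) for p, ci in zip(pi, c)]

c = [1.0, 2.0, 3.0]
# Direction of the -1 geodesics (level sets of E(c^T X)):
# components sum to 0 and satisfy sum_i c_i v_i = 0.
level_set_direction = [1.0, -2.0, 1.0]

points = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1], [1 / 3, 1 / 3, 1 / 3]]
inner_products = [
    fisher_inner(pi, level_set_direction, exp_geodesic_tangent(pi, c))
    for pi in points
]
print(inner_products)
```

The inner product reduces to Σ v_i c_i minus E(c) Σ v_i, and both terms vanish for this direction, so the printed values are zero up to rounding at every interior point.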
[Figure: four panels: (a) −1 in −1; (b) −1 in +1; (c) +1 in −1; (d) +1 in +1]

SHAPE OF LOG-LIKELIHOOD:
natural spaces for CIG = high-D simplicial structures ⇒ primary question: behaviour of log-likelihood $l(\cdot)$ in them?
2 important issues ...
- typically, sample size $N \ll k$ = dimension of the simplex
- $\Delta^k$ contains sub-simplices of varying support ...
⇒ standard intuition about shape of $l(\cdot)$ will not hold, ... esp.: standard $\chi^2$-approximation to the distribution of the deviance fails.
discretising: data $\{x_i\}_{i=1}^{N} \sim f(x; \theta)$ → counts $\{n_i\}_{i \in I} \sim \mathrm{Multinomial}(N; \pi(\theta))$, $(I = \{0, \ldots, k\})$.
$I = P \cup Z$ where $P := \{i : n_i > 0\}$ & $Z := \{i : n_i = 0\}$.

SHAPE OF LOG-LIKELIHOOD: (continued)
observed face := face spanned by vertices (bins) in $P$
unobserved face := face spanned by vertices (bins) in $Z$
The log-likelihood $l(\cdot)$ is:
- strictly concave on the observed face
- strictly decreasing in the normal direction from it to the unobserved face
- and, otherwise, constant.
For more on the geometry of the observed face: see PM's talk.
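The three properties of the log-likelihood listed above can be seen in a small example. The counts and probability vectors below are hypothetical, and the log-likelihood omits the multinomial coefficient, which does not depend on π.

```python
import math

def loglik(counts, pi):
    """Multinomial log-likelihood sum_i n_i log(pi_i), up to a constant.

    Only observed bins (n_i > 0) contribute; the value is -inf if an
    observed bin is given zero probability.
    """
    total = 0.0
    for n, p in zip(counts, pi):
        if n > 0:
            if p <= 0.0:
                return -math.inf
            total += n * math.log(p)
    return total

counts = [3, 0, 2, 0, 0]            # P = {0, 2}, Z = {1, 3, 4}
pi = [0.4, 0.1, 0.3, 0.1, 0.1]

# Mass moved among unobserved bins only: log-likelihood unchanged.
pi_within_unobserved = [0.4, 0.15, 0.3, 0.05, 0.1]
# Mass moved from an observed bin to an unobserved bin: it decreases.
pi_towards_unobserved = [0.35, 0.15, 0.3, 0.1, 0.1]

# Strict concavity along a segment inside the observed face.
p = [0.5, 0.0, 0.5, 0.0, 0.0]
q = [0.8, 0.0, 0.2, 0.0, 0.0]
mid = [(a + b) / 2 for a, b in zip(p, q)]

print(loglik(counts, pi), loglik(counts, pi_within_unobserved),
      loglik(counts, pi_towards_unobserved))
```

With N much smaller than k, most bins are unobserved. The flat directions then dominate, which is one reason the usual intuition about the shape of the log-likelihood breaks down.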
SHAPE OF LOG-LIKELIHOOD: (continued)
Theorem 3: Let:
  $V_{mix} := \{\{v_i\}_{i \in I} : \sum_{i \in I} v_i = 0\}$,
  $V^0 := \{\{v_i\}_{i \in I} \in V_{mix} : v_i = 0,\ i \in P\}$
  for any $i_Z \in Z$: $V^{i_Z} := \{\{v_i\}_{i \in I} \in V_{mix} : v_i = 0,\ i \in Z \setminus \{i_Z\}\}$.
Then:
- $V^0$ is a linear subspace of $V_{mix}$
- $l(\cdot)$ is constant on −1-affine subspaces of the form $\pi + V^0$
- $V_{mix}$ has direct sum $V^0 \oplus V^{i_Z}$.

Spectrum of Fisher information:
- denote: all bin probabilities except $\pi_0$ by $\pi_{(0)} := (\pi_1, \ldots, \pi_k)^T$.
- viewed as the covariance matrix of the score, $N^{-1} \times$ (Fisher information matrix for +1-parameters) is:
  $I(\pi) := \mathrm{diag}(\pi_{(0)}) - \pi_{(0)} \pi_{(0)}^T$
- ... whose explicit spectral decomposition is, in all cases, an example of interlacing eigenvalue results. Accordingly, ...

Spectrum of Fisher information: continued
... the Fisher spectrum mimics key features of the bin probabilities. Of central importance:
- eigenvalues are exponentially small ⇔ the same is true of the $\{\pi_i\}_{i=0}^{k}$
- the Fisher information matrix is singular ⇔ one of the $\{\pi_i\}_{i=0}^{k}$ vanishes.
[Again, typically, 2 eigenvalues are close whenever 2 corresponding $\pi_i$ are.]
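The interlacing behaviour can be illustrated without a linear algebra library. For distinct positive π_1, ..., π_k, the matrix determinant lemma shows that the eigenvalues of I(π) are the roots of the secular function f(λ) = 1 − Σ_i π_i² / (π_i − λ), with exactly one root between each pair of consecutive sorted π_i and one in [0, min π_i). The sketch below finds them by bisection. The function name and the example vector are hypothetical, and the code assumes π_0 > 0 and distinct π_1, ..., π_k.

```python
def fisher_spectrum(pi):
    """Eigenvalues of diag(pi_(0)) - pi_(0) pi_(0)^T, largest first.

    Assumes pi_1..pi_k are distinct and positive, and pi_0 > 0.
    """
    rest = sorted(pi[1:], reverse=True)
    assert rest[-1] > 0 and all(a > b for a, b in zip(rest, rest[1:]))

    def secular(lam):
        return 1.0 - sum(p * p / (p - lam) for p in rest)

    # One bracket between each pair of neighbouring pi_i, plus [0, min pi_i).
    brackets = list(zip(rest[1:], rest[:-1])) + [(0.0, rest[-1])]
    roots = []
    for lo, hi in brackets:
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if mid <= lo or mid >= hi:   # interval exhausted in floating point
                break
            # secular() falls from +inf to -inf across each bracket.
            if secular(mid) > 0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return sorted(roots, reverse=True)

pi = [0.1, 0.2, 0.3, 0.4]
eigs = fisher_spectrum(pi)
print(eigs)
```

Two checks are available: the eigenvalues should sum to the trace Σ_{i≥1} (π_i − π_i²), and their product should equal π_0 π_1 ⋯ π_k. The second is consistent with the matrix being singular exactly when some π_i vanishes.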
Spectrum of Fisher information: continued
In particular, if $\{\pi_i\}_{i=1}^{k}$ comprise $g > 1$ distinct values $\lambda_1 > \ldots > \lambda_g > 0$, $\lambda_i$ occurring $m_i$ times, ...
then, the spectrum of $I(\pi)$ comprises $g$ simple eigenvalues $\{\tilde{\lambda}_i\}_{i=1}^{g}$, the roots of an explicit polynomial, satisfying
  $\lambda_1 > \tilde{\lambda}_1 > \ldots > \lambda_g > \tilde{\lambda}_g \ge 0$
together, if $g < k$, with $\{\lambda_i : m_i > 1\}$, each such $\lambda_i$ having multiplicity $m_i - 1$, while $\tilde{\lambda}_g > 0 \Leftrightarrow \pi_0 > 0$.
[Further, each $\tilde{\lambda}_i$ $(i < g)$ is typically (much) closer to $\lambda_i$ than to $\lambda_{i+1}$, making it a near replicate of $\lambda_i$.]

CLOSURE:
Given a full EF embedded in a high-D sparse simplex, an important question is to identify its limit points: how it is connected to the boundary.
Panel (c) in the trinomial example above illustrates that:
- 1-D EF limits lie at vertices
- which vertex is determined by the rank order of the components of the tangent vector of the +1-geodesic.
In general (see [2]):
- finding the limit points ↔ finding redundant linear constraints
- this can be converted, via duality, into: finding extremal points in a finite-D affine space.
cf.: Geyer (2009): directions of recession.
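The vertex limits of a 1-D EF are easy to observe numerically. The sketch below follows the family π_i exp(θ c_i) / Σ_j π_j exp(θ c_j) out to large |θ|, using a log-sum-exp shift to avoid overflow. The starting point, the tangent vector and the value θ = ±200 are hypothetical choices, and the function name `tilt` is not from the source.

```python
import math

def tilt(pi, c, theta):
    """Point on the 1-D full EF through pi with +1 tangent vector c."""
    logs = [math.log(p) + theta * ci for p, ci in zip(pi, c)]
    top = max(logs)                       # log-sum-exp shift for stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]

pi = [0.2, 0.5, 0.3]
c = [1.0, 2.0, 3.0]

upper = tilt(pi, c, 200.0)    # theta -> +infinity
lower = tilt(pi, c, -200.0)   # theta -> -infinity
print(upper, lower)
```

As θ grows, the mass concentrates on the bin with the largest c_i, and as θ decreases, on the bin with the smallest. This is the rank-order rule stated above, in the simplest case where all components of c are distinct.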
Total positivity and the convex hull:
The −1-convex hull of an EF is of great interest, mixture models being widely used in statistical science.
Explored further in PM's talk, we simply state the main result here.
It follows easily from the total positivity of EFs that, generically, convex hulls are of maximal dimension $k$.
Here, `generically' means that the +1 tangent vector which defines the EF has components which are all distinct.
Theorem 4: The −1-convex hull of an open subset of a generic 1-D EF is of full dimension.

EXAMPLE 1 (continued): leukaemia patient data
Return now to Example 1 to illustrate above results. In particular, to show: an application of dimension reduction based on IG.
Recall, we have ...
- 43 survival times $Z$ from diagnosis, measured in days
- Q: what is the mean survival time $\mu \equiv \mu[F]$?
- for expository purposes: suppose $Z \sim$ Exponential, but only observe censored value $Y = \min\{Z, t\}$ ⇒ ... $Y$ a 1-D curved EF, inside a 2-D regular EF [PM & West (2002)]
- $t$ chosen to give reasonable, but not perfect, fit

DIMENSION REDUCTION:
[Figure: panel (a) "Log-likelihoods", axes mu and log-likelihood; panel (b) "Full exponential family", axes theta1 and theta2]

EXAMPLE 1 (continued): DIMENSION REDUCTION
(a): plot of $l(\mu)$:
- shows appreciable skewness ...
- suggests standard first order asymptotics can be improved by the higher order asymptotic methods of classical IG.
(b): in +1-parameters:
- solid curve = $Y$'s 1-D curved EF embedded in 2-D full EF
- dashed lines = contours of $l(\cdot)$ for full EF
- clear, even visually: $Y$ has low +1 curvature on this inferential scale
  ⇒ its curved EF behaves inferentially like a 1-D full EF
  ⇒ can use Marriott & Vos (2004) dimension reduction (DR) techniques

[Figure: panel (c) "Distribution of MLE", axes mu and Density]
Panel (c) shows how well a saddlepoint-based approximation does at approximating the distribution of $\hat{\mu}_{MLE}$.

CONCLUSION:
The power and elegance of IG have yet to be fully exploited in statistical practice, to which end ...
the overall aim of CIG here is to provide tools to help resolve outstanding problems in statistical science, via ...
an operational `universal space of all possible models', ...
such problems including:
- (local-to-global) sensitivity analysis
- handling both data and model uncertainty
- inference in graphical & related models
- transdimensional & other issues in MCMC
- mixture estimation (see PM's talk)