(2015) GSI2015

From Geometry and Physics to Computational Linguistics
Matilde Marcolli (GSI2015)
I will show how techniques from geometry (algebraic geometry and topology) and physics (statistical physics) can be applied to Linguistics, in order to provide a computational approach to questions of syntactic 
A two-color interacting random balls model for co-localization analysis of proteins
Charles Kervrann, Frederic Lavancier (GSI2015)
A model of two-type (or two-color) interacting random balls is introduced. Each colored random set is a union of random balls and the interaction relies on the volume of the intersection between the two random sets. This model is motivated by the detection and quantification of co-localization between two proteins. Simulation and inference are discussed. Since not all individual balls can be identified (e.g. a ball may contain another one), standard methods of inference such as likelihood or pseudolikelihood are not available, and we apply the Takacs-Fiksel method with a specific choice of test functions.
New metric and connections in statistical manifolds
Charles Casimiro Cavalcante, David de Souza, Rui F. Vigelis (GSI2015)
We define a metric and a family of α-connections in statistical manifolds, based on ϕ-divergence, which emerges in the framework of ϕ-families of probability distributions. This metric and α-connections generalize the Fisher information metric and Amari’s α-connections. We also investigate the parallel transport associated with the α-connection for α = 1.
Computing Boundaries in Local Mixture Models
Paul Marriott, Vahed Maroufy (GSI2015)
Local mixture models give an inferentially tractable but still flexible alternative to general mixture models. Their parameter space naturally includes boundaries; near these the behaviour of the likelihood is not standard. This paper shows how convex and differential geometries help in characterising these boundaries. In particular the geometry of polytopes, ruled and developable surfaces is exploited to develop efficient inferential algorithms.
Variational Bayesian Approximation method for Classification and Clustering with a mixture of Studen
Ali Mohammad-Djafari (GSI2015)
Clustering, classification and pattern recognition in a set of data are among the most important tasks in statistical research and in many applications. In this paper, we propose to model the data with a mixture of Student-t distributions, via a hierarchical graphical model and the Bayesian framework, to carry out these tasks. The main advantages of this model are that it accounts for the uncertainties of variances and covariances, and that Variational Bayesian Approximation (VBA) methods can be used to obtain fast algorithms able to handle large data sets.
Transformations and Coupling Relations for Affine Connections
James Tao, Jun Zhang (GSI2015)
The statistical structure on a manifold M is predicated upon a special kind of coupling between the Riemannian metric g and a torsion-free affine connection ∇ on the TM, such that ∇ g is totally symmetric, forming, by definition, a “Codazzi pair” { ∇ , g}. In this paper, we first investigate various transformations of affine connections, including additive translation (by an arbitrary (1,2)-tensor K), multiplicative perturbation (through an arbitrary invertible operator L on TM), and conjugation (through a non-degenerate two-form h). We then study the Codazzi coupling of ∇ with h and its coupling with L, and the link between these two couplings. We introduce, as special cases of K-translations, various transformations that generalize traditional projective and dual-projective transformations, and study their commutativity with L-perturbation and h-conjugation transformations. Our derivations allow affine connections to carry torsion, and we investigate conditions under which torsions are preserved by the various transformations mentioned above. Our systematic approach establishes a general setting for the study of Information Geometry based on transformations and coupling relations of affine connections – in particular, we provide a generalization of conformal-projective transformation.
Weakly Correlated Sparse Components with Nearly Orthonormal Loadings
Matthieu Genicot, Nickolay Trendafilov, Wen Huang (GSI2015)
There is already a great number of highly efficient methods producing components with sparse loadings which significantly facilitates the interpretation of principal component analysis (PCA). However, they produce either only orthonormal loadings, or only uncorrelated components, or, most frequently, neither of them. To overcome this weakness, we introduce a new approach to define sparse PCA similar to the Dantzig selector idea already employed for regression problems. In contrast to the existing methods, the new approach makes it possible to achieve simultaneously nearly uncorrelated sparse components with nearly orthonormal loadings. The performance of the new method is illustrated on real data sets. It is demonstrated that the new method outperforms one of the most popular available methods for sparse PCA in terms of preservation of principal components properties.
Barycenter in Wasserstein space existence and consistency
Jean-Michel Loubes, Thibaut Le Gouic (GSI2015)
We study barycenters in the Wasserstein space P_p(E) of a locally compact geodesic space (E, d). In this framework, we define the barycenter of a measure ℙ on P_p(E) as its Fréchet mean. The paper establishes its existence and states consistency with respect to ℙ. We thus extend previous results on ℝ^d, with conditions on ℙ or on the sequence converging to ℙ for consistency.
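As a concrete special case of the barycenter described in this abstract, take E = ℝ and p = 2: the 2-Wasserstein barycenter of measures on the real line has a quantile function equal to the weighted average of the input quantile functions. The sketch below is an illustration under added assumptions, not the paper's general construction. It assumes empirical measures with the same number of equally weighted atoms, so averaging sorted samples is enough. The name `barycenter_1d` is hypothetical.

```python
def barycenter_1d(samples, weights=None):
    """2-Wasserstein barycenter of equal-size empirical measures on the real line.

    Assumption: every sample has the same number n of atoms, each of mass 1/n.
    The barycenter's atoms are then the weighted averages of the order
    statistics, i.e. of the empirical quantile functions.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same size")
    if weights is None:
        weights = [1.0 / len(samples)] * len(samples)
    sorted_samples = [sorted(s) for s in samples]
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_samples))
        for i in range(n)
    ]


# Two three-point measures; the barycenter sits halfway between them.
print(barycenter_1d([[0.0, 2.0, 1.0], [10.0, 12.0, 11.0]]))
```

Sorting each sample corresponds to evaluating its empirical quantile function on a common grid, which is why no optimization is needed in this one-dimensional case.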
An Information Geometry Problem in Mathematical Finance
Imre Csiszár, Michel Broniatowski, Thomas Breuer (GSI2015)
Familiar approaches to risk and preferences involve minimizing the expectation E_ℙ(X) of a payoff function X over a family Γ of plausible risk factor distributions ℙ. We consider Γ determined by a bound on a convex integral functional of the density of ℙ, thus Γ may be an I-divergence (relative entropy) ball or some other f-divergence ball or Bregman distance ball around a default distribution ℙ_0. Using a Pythagorean identity we show that whether or not a worst case distribution exists (minimizing E_ℙ(X) subject to ℙ ∈ Γ), the almost worst case distributions cluster around an explicitly specified, perhaps incomplete distribution. When Γ is an f-divergence ball, a worst case distribution either exists for any radius, or it does/does not exist for radius less/larger than a critical value. It remains open how far the latter result extends beyond f-divergence balls.
Uniqueness of the Fisher-Rao Metric on the Space of Smooth Densities
Martin Bauer, Martins Bruveris, Peter Michor (GSI2015)
We review the manifold projection method for stochastic nonlinear filtering in a more general setting than in our previous paper in Geometric Science of Information 2013. We still use a Hilbert space structure on a space of probability densities to project the infinite dimensional stochastic partial differential equation for the optimal filter onto a finite dimensional exponential or mixture family, respectively, with two different metrics, the Hellinger distance and the L2 direct metric. This reduces the problem to finite dimensional stochastic differential equations. In this paper we summarize a previous equivalence result between Assumed Density Filters (ADF) and Hellinger/Exponential projection filters, and introduce a new equivalence between Galerkin method based filters and Direct metric/Mixture projection filters. This result allows us to give a rigorous geometric interpretation to ADF and Galerkin filters. We also discuss the different finite-dimensional filters obtained when projecting the stochastic partial differential equation for either the normalized (Kushner-Stratonovich) or a specific unnormalized (Zakai) density of the optimal filter.
New model search for nonlinear recursive models, regressions and autoregressions
Anna-Lena Kißlinger, Wolfgang Stummer (GSI2015)
Scaled Bregman distances (SBD) have turned out to be useful tools for simultaneous estimation and goodness-of-fit testing in parametric models of random data (streams, clouds). We show how SBD can additionally be used for model preselection (structure detection), i.e. for finding appropriate candidates of model (sub)classes in order to support a desired decision under uncertainty. As an example, we concentrate on the context of nonlinear recursive models with additional exogenous inputs; as special cases we include nonlinear regressions, linear autoregressive models (e.g. AR, ARIMA, SARIMA time series), and nonlinear autoregressive models with exogenous inputs (NARX). In particular, we outline a corresponding information-geometric 3D computer-graphical selection procedure. Some sample-size asymptotics are given as well.
Entropy minimizing curves with application to automated flight path design
Florence Nicol, Stephane Puechmorel (GSI2015)
Air traffic management (ATM) aims at providing companies with safe and ideally optimal aircraft trajectory planning. Air traffic controllers act on flight paths in such a way that no pair of aircraft comes closer than the regulatory separation norm. With the increase of traffic, it is expected that the system will reach its limits in the near future: a paradigm change in ATM is planned with the introduction of trajectory based operations. This paper investigates a means of producing realistic air routes from the output of an automated trajectory design tool. For that purpose, an entropy associated with a system of curves is defined and a means of iteratively minimizing it is presented. The network produced is suitable for use in a semi-automated ATM system with a human in the loop.
Optimal Transport, Independance versus Indetermination duality, impact on a new Copula Design
Benoit Huyot, Jean-François Marcotorchino, Yves Mabiala (GSI2015)
This article leans on some previous results already presented in [10], based on Fréchet’s works, Wilson’s entropy and Minimal Trade models in connection with the MKP transportation problem (MKP stands for Monge-Kantorovich Problem). Using the duality between “independance” and “indetermination” structures, shown in this former paper, we are in a position to derive a novel approach to design a copula, suitable and efficient for anomaly detection in IT systems analysis.
Deblurring and Recovering Conformational States in 3D Single Particle Electron
Bijan Afsari, Gregory S. Chirikjian (GSI2015)
In this paper we study two forms of blurring effects that may appear in the reconstruction of 3D Electron Microscopy (EM), specifically in single particle reconstruction from random orientations of large multi-unit biomolecular complexes. We model the blurring effects as being due to independent contributions from: (1) variations in the conformation of the biomolecular complex; and (2) errors accumulated in the reconstruction process. Under the assumption that these effects can be separated and treated independently, we show that the overall blurring effect can be expressed as a special form of a convolution operation of the 3D density with a kernel defined on SE(3), the Lie group of rigid body motions in 3D. We call this form of convolution mixed spatial-motional convolution. We discuss the ill-conditioned nature of the deconvolution needed to deblur the reconstructed 3D density in terms of parameters associated with the unknown probability in SE(3). We provide an algorithm for recovering the conformational information of large multi-unit biomolecular complexes (essentially deblurring) under certain biologically plausible prior structural knowledge about the subunits of the complex, in the case where the blurring kernel has a special form.
Heights of toric varieties, integration over polytopes and entropy
José Ignacio Burgos Gil, Martin Sombra, Patrice Philippon (GSI2015)
We present a dictionary between arithmetic geometry of toric varieties and convex analysis. This correspondence allows for effective computations of arithmetic invariants of these varieties. In particular, combined with a closed formula for the integration of a class of functions over polytopes, it gives a number of new values for the height (arithmetic analog of the degree) of toric varieties, with respect to interesting metrics arising from polytopes. In some cases these heights are interpreted as the average entropy of a family of random processes.
Dimension Reduction on Polyspheres with Application to Skeletal Representations
Benjamin Eltzner, Stephan Huckemann, Sungkyu Jung (GSI2015)
We present a novel method that adaptively deforms a polysphere (a product of spheres) into a single high dimensional sphere which then allows for principal nested spheres (PNS) analysis. Applying our method to skeletal representations of simulated bodies as well as of data from real human hippocampi yields promising results in view of dimension reduction. Specifically in comparison to composite PNS (CPNS), our method of principal nested deformed spheres (PNDS) captures essential modes of variation by lower dimensional representations.
Bag-of-components an online algorithm for batch learning of mixture models
Frank Nielsen, Olivier Schwander (GSI2015)
Practical estimation of mixture models may be problematic when a large number of observations are involved: for such cases, online versions of Expectation-Maximization may be preferred, avoiding the need to store all the observations before running the algorithms. We introduce a new online method well-suited when both the number of observations is large and lots of mixture models need to be learned from different sets of points. Inspired by dictionary methods, our algorithm begins with a training step which is used to build a dictionary of components. The next step, which can be done online, amounts to populating the weights of the components given each arriving observation. The usage of the dictionary of components shows all its interest when lots of mixtures need to be learned using the same dictionary in order to maximize the return on investment of the training step. We evaluate the proposed method on an artificial dataset built from random Gaussian mixture models.
Symplectic Structure of Information Geometry: Fisher Metric and Euler-Poincaré Equation of Souriau Lie Group Thermodynamics
Frédéric Barbaresco (GSI2015)
We introduce the Symplectic Structure of Information Geometry based on Souriau’s Lie Group Thermodynamics model, with a covariant definition of Gibbs equilibrium via invariances through co-adjoint action of a group on its momentum space, defining physical observables like energy, heat, and momentum as pure geometrical objects. Using the Geometric (Planck) Temperature of the Souriau model and the notion of Symplectic cocycle, the Fisher metric is identified as a Souriau Geometric Heat Capacity. In the framework of Lie Group Thermodynamics, an Euler-Poincaré equation is elaborated with respect to thermodynamic variables, and a new variational principle for thermodynamics is built through an invariant Poincaré-Cartan-Souriau integral. Finally, we conclude on Balian Gauge theory of Thermodynamics compatible with Souriau’s Model.
Differential geometric properties of textile plot
Tomonari Sei, Ushio Tanaka (GSI2015)
The textile plot proposed by Kumasaka and Shibata (2008) is a method for data visualization. The method transforms a data matrix in order to draw a parallel coordinate plot. In this paper, we investigate a set of matrices induced by the textile plot, which we call the textile set, from a geometrical viewpoint. It is shown that the textile set is written as the union of two differentiable manifolds if data matrices are restricted to be full-rank.
Asymptotics of superposition of point processes
Aurélien Vasseur, Laurent Decreusefond (GSI2015)
The characteristic independence property of Poisson point processes gives an intuitive way to explain why a sequence of point processes becoming less and less repulsive can converge to a Poisson point process. The aim of this paper is to show this convergence for sequences built by superposing, thinning or rescaling determinantal processes. We use Papangelou intensities and Stein’s method to prove this result with a topology based on total variation distance.
Curvatures of Statistical Structures
Barbara Opozda (GSI2015)
Curvature properties for statistical structures are studied. The study deals with the curvature tensor of statistical connections and their duals as well as the Ricci tensor of the connections, Laplacians and the curvature operator. Two concepts of sectional curvature are introduced. The meaning of the notions is illustrated by presenting a few exemplary theorems.
Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry
Frank Nielsen, Gaëtan Hadjeres (GSI2015)
We generalize the O(dn/ε²)-time (1 + ε)-approximation algorithm for the smallest enclosing Euclidean ball [2,10] to point sets in hyperbolic geometry of arbitrary dimension. We guarantee a O(1/ε²) convergence time by using a closed-form formula to compute the geodesic α-midpoint between any two points. Those results allow us to apply the hyperbolic k-center clustering for statistical location-scale families or for multivariate spherical normal distributions by using their Fisher information matrix as the underlying Riemannian hyperbolic metric.
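The Euclidean baseline that this abstract generalizes can be sketched in a few lines. The version below assumes the standard farthest-point iteration: start at any input point, then move the center toward the current farthest point with step 1/(i + 1), for about 1/ε² iterations of cost O(dn) each. It is an illustration of the Euclidean case only. The function name `approx_enclosing_ball` is hypothetical, and the hyperbolic variant, where the straight-line step would become a move along a geodesic, is not shown.

```python
import math


def approx_enclosing_ball(points, eps):
    """Euclidean (1 + eps)-approximate minimum enclosing ball.

    Start at an arbitrary input point and repeatedly step toward the
    current farthest point with a shrinking step 1 / (i + 1), for
    ceil(1 / eps^2) iterations. Each iteration costs O(d * n).
    """
    center = list(points[0])
    for i in range(1, math.ceil(1.0 / eps ** 2) + 1):
        farthest = max(points, key=lambda p: math.dist(center, p))
        center = [c + (f - c) / (i + 1) for c, f in zip(center, farthest)]
    # Radius of the smallest ball around the final center that covers all points.
    radius = max(math.dist(center, p) for p in points)
    return center, radius


corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
center, radius = approx_enclosing_ball(corners, eps=0.1)
print(center, radius)
```

The returned radius is always at least the optimal radius, because the ball is built to cover every point, and the iteration count controls how far above the optimum it can be.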
Asymptotic properties of random polytopes
Pierre Calka (GSI2015)
Random polytopes have constituted some of the central objects of stochastic geometry for more than 150 years. They are in general generated as convex hulls of a random set of points in the Euclidean space. The study of such models requires the use of ingredients coming from both convex geometry and probability theory. In the last decades, the study has been focused on their asymptotic properties and in particular expectation and variance estimates. In several joint works with Tomasz Schreiber and J. E. Yukich, we have investigated the scaling limit of several models (uniform model in the unit-ball, uniform model in a smooth convex body, Gaussian model) and have deduced from it limiting variances for several geometric characteristics including the number of k-dimensional faces and the volume. In this paper, we survey the most recent advances on these questions and we emphasize the particular cases of random polytopes in the unit-ball and Gaussian polytopes.
Online k-MLE for mixture modeling with exponential families
Christophe Saint-Jean, Frank Nielsen (GSI2015)
This paper addresses the problem of online learning of finite statistical mixtures of exponential families. A short review of the Expectation-Maximization (EM) algorithm and its online extensions is given. From these extensions and the description of the k-Maximum Likelihood Estimator (k-MLE), three online extensions are proposed for the latter. To illustrate them, we consider the case of mixtures of Wishart distributions by giving details and providing some experiments.
Fitting Smooth Paths on Riemannian Manifolds - Endometrial Surface
Antoine Arnould, Chafik Samir, Michel Canis, Pierre-Antoine Absil, Pierre-Yves Gousenbourger (GSI2015)
We present a new method to fit smooth paths to a given set of points on Riemannian manifolds using C¹ piecewise-Bézier functions. A property of the method is that, when the manifold reduces to a Euclidean space, the control points minimize the mean square acceleration of the path. As an application, we focus on data observations that evolve on certain nonlinear manifolds of importance in medical imaging: the shape manifold for endometrial surface reconstruction; the special orthogonal group SO(3) and the special Euclidean group SE(3) for preoperative MRI-based navigation. Results on real data show that our method succeeds in meeting the clinical goal: combining different modalities to improve the localization of the endometrial lesions.
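To make the Bézier building block concrete, here is a minimal Euclidean sketch of De Casteljau's algorithm, which evaluates a Bézier curve by repeated linear interpolation between consecutive control points. It covers only the flat case mentioned in the abstract. On a manifold the straight-line interpolation would typically be replaced by interpolation along geodesics, which is an assumption about the general approach and is not implemented here. The name `de_casteljau` is illustrative.

```python
def de_casteljau(control_points, t):
    """Evaluate a Euclidean Bezier curve at parameter t in [0, 1].

    Repeatedly replaces each pair of consecutive points by their linear
    interpolation at t until a single point remains.
    """
    points = [tuple(p) for p in control_points]
    while len(points) > 1:
        points = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


# A cubic segment: the curve starts at the first control point and
# ends at the last one.
cubic = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
print(de_casteljau(cubic, 0.5))
```

A piecewise curve is obtained by chaining such segments; C¹ continuity at a junction then constrains the control points on either side of it.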
Multivariate L-moments based on transports
Alexis Decurninge (GSI2015)
Univariate L-moments are expressed as projections of the quantile function onto an orthogonal basis of univariate polynomials. We present multivariate versions of L-moments expressed as collections of orthogonal projections of a multivariate quantile function on a basis of multivariate polynomials. We propose to consider quantile functions defined as transports from the uniform distribution on [0, 1]^d onto the distribution of interest and present some properties of the subsequent L-moments. The properties of estimated L-moments are illustrated for heavy-tailed distributions.
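For the univariate case described in the first sentence of this abstract, the r-th L-moment is the projection of the quantile function Q onto the shifted Legendre polynomial of degree r − 1 on [0, 1]. The sketch below estimates the first three L-moments from a sample using the usual probability-weighted moments b_0, b_1, b_2. It illustrates the univariate definition only, not the multivariate transport-based construction of the paper, and the function name `sample_l_moments` is hypothetical.

```python
def sample_l_moments(data):
    """First three sample L-moments via probability-weighted moments.

    With x sorted in increasing order and i counted from 0:
      b_0 = mean(x)
      b_1 = sum(i * x[i]) / (n * (n - 1))
      b_2 = sum(i * (i - 1) * x[i]) / (n * (n - 1) * (n - 2))
    and l_1 = b_0, l_2 = 2 b_1 - b_0, l_3 = 6 b_2 - 6 b_1 + b_0.
    """
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0


# Location, scale and (unnormalized) skewness of a small symmetric sample.
print(sample_l_moments([1, 2, 3, 4, 5]))
```

The second L-moment equals half the mean absolute difference between two observations, and the third vanishes for a symmetric sample, which gives a quick sanity check.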
Multivariate divergences with application in multisample density ratio models
Amor Keziou (GSI2015)
We introduce what we will call multivariate divergences between K, K ≥ 1, signed finite measures (Q_1, …, Q_K) and a given reference probability measure P on a σ-field (X,B), extending the well-known divergences between two measures, a signed finite measure Q_1 and a given probability distribution P. We investigate the Fenchel duality theory for the introduced multivariate divergences viewed as convex functionals on well chosen topological vector spaces of signed finite measures. We obtain new dual representations of these criteria, which we will use to define a new family of estimates and test statistics with multiple samples under multiple semiparametric density ratio models. This family contains the estimate and test statistic obtained through empirical likelihood. Moreover, the present approach allows obtaining the asymptotic properties of the estimates and test statistics both under the model and under misspecification. This leads to accurate approximations of the power function for any used criterion, including the empirical likelihood one, which is of independent interest. Moreover, the proposed multivariate divergences can be used, in the context of multiple samples in density ratio models, to define new criteria for model selection and multi-group classification.
Reparameterization invariant metric on the space of curves
Alice Le Brigant, Frédéric Barbaresco, Marc Arnaudon (GSI2015)
This paper focuses on the study of open curves in a manifold M, and its aim is to define a reparameterization invariant distance on the space of such paths. We use the square root velocity function (SRVF) introduced by Srivastava et al. in [11] to define a reparameterization invariant metric on the space of immersions ℳ = Imm([0,1],M) by pullback of a metric on the tangent bundle Tℳ derived from the Sasaki metric. We observe that such a natural choice of Riemannian metric on Tℳ induces a first-order Sobolev metric on ℳ with an extra term involving the origins, and leads to a distance which takes into account the distance between the origins and the distance between the image curves by the SRVF parallel transported to a same vector space, with an added curvature term. This provides a generalized theoretical SRV framework for curves lying in a general manifold M.
Random Pairwise Gossip on CAT(k) Metric Spaces
Anass Bellachehab, Jérémie Jakubowicz (GSI2015)
In the context of sensor networks, gossip algorithms are a popular, well-established technique for achieving consensus when sensor data are encoded in linear spaces. Gossip algorithms also have several extensions to nonlinear data spaces. Most of these extensions deal with Riemannian manifolds and use Riemannian gradient descent. This paper, instead, studies gossip in a broader CAT(k) metric setting, encompassing, but not restricted to, several interesting cases of Riemannian manifolds. As it turns out, convergence can be guaranteed as soon as the data lie in a small enough ball of a mere CAT(k) metric space. We also study convergence speed in this setting and establish linear rates of convergence.
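The linear-space baseline mentioned at the start of this abstract can be sketched as follows: at each round two randomly chosen sensors replace their values by their average, which preserves the global mean and drives all values toward it. The sketch assumes scalar data and a complete communication graph, both simplifying assumptions. In a CAT(k) space the arithmetic average would presumably be replaced by a geodesic midpoint, which is not shown. The name `pairwise_gossip` is hypothetical.

```python
import random


def pairwise_gossip(values, rounds, seed=0):
    """Random pairwise gossip on scalar data over a complete graph.

    At each round, two distinct nodes are drawn uniformly at random and
    both take the midpoint of their current values. The sum of all values
    is unchanged by every update.
    """
    rng = random.Random(seed)
    state = list(values)
    for _ in range(rounds):
        i, j = rng.sample(range(len(state)), 2)
        midpoint = (state[i] + state[j]) / 2.0
        state[i] = state[j] = midpoint
    return state


readings = [0.0, 1.0, 2.0, 3.0, 10.0]
print(pairwise_gossip(readings, rounds=500))
```

Because every update keeps the sum fixed, the consensus value is the average of the initial readings.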
Kernel Density Estimation on Symmetric Spaces
Dena Asta (GSI2015)
We introduce a novel kernel density estimator for a large class of symmetric spaces and prove a minimax rate of convergence as fast as the minimax rate on Euclidean space. This rate is proven without any compactness assumptions on the space or Hölder-class assumptions on the densities. A main tool used in proving the convergence rate is the Helgason-Fourier transform, a generalization of the Fourier transform for semisimple Lie groups modulo maximal compact subgroups. This paper obtains a simplified formula in the special case when the symmetric space is the 2-dimensional hyperboloid.