Kernel Density Estimation on Symmetric Spaces

28/10/2015
Author: Dena Asta
Publication GSI2015
OAI : oai:www.see.asso.fr:11784:14314

Abstract

We introduce a novel kernel density estimator for a large class of symmetric spaces and prove a minimax rate of convergence as fast as the minimax rate on Euclidean space. The rate is proven without any compactness assumptions on the space or Hölder-class assumptions on the densities. A main tool used in proving the convergence rate is the Helgason-Fourier transform, a generalization of the Fourier transform to semisimple Lie groups modulo maximal compact subgroups. This paper obtains a simplified formula in the special case where the symmetric space is the 2-dimensional hyperboloid.


Licence

Creative Commons Attribution-ShareAlike 4.0 International

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/11784/14314</identifier><creators><creator><creatorName>Dena Asta</creatorName></creator></creators><titles>
            <title>Kernel Density Estimation on Symmetric Spaces</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2015</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Sun 8 Nov 2015</date>
	    <date dateType="Updated">Mon 25 Jul 2016</date>
            <date dateType="Submitted">Fri 20 Apr 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">fc0c9959d31ac45d4320403aa2b7bea5d87a71db</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>24706</version>
        <descriptions>
            <description descriptionType="Abstract">
We introduce a novel kernel density estimator for a large class of symmetric spaces and prove a minimax rate of convergence as fast as the minimax rate on Euclidean space. The rate is proven without any compactness assumptions on the space or Hölder-class assumptions on the densities. A main tool used in proving the convergence rate is the Helgason-Fourier transform, a generalization of the Fourier transform to semisimple Lie groups modulo maximal compact subgroups. This paper obtains a simplified formula in the special case where the symmetric space is the 2-dimensional hyperboloid.

</description>
        </descriptions>
    </resource>

Dena Marie Asta, Department of Statistics, Ohio State University. Supported by NSF grant DMS-1418124 and an NSF Graduate Research Fellowship under grant DGE-1252522.

Slide 2 — Geometric Methods for Statistical Analysis
- Classical statistics assumes data is unrestricted on Euclidean space, e.g. the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and the variance $\operatorname{var}[X] = E[X^2] - E[X]^2$.
- Exploiting the geometry of the data leads to faster and more accurate tools: implicit geometry in non-Euclidean data, explicit geometry in networks.

Slide 3 — Motivation: Non-Euclidean Data
- Normal distributions: the hyperboloid.
- Diffusion tensor imaging: $3\times3$ symmetric positive definite matrices.
- Material stress, gravitational lensing: $3\times3$ symmetric positive definite matrices.
- Directional headings: the sphere.

Slide 4 — Nonparametric Methods: Non-Euclidean Data
- Classical nonparametric estimators (the kernel density estimator, kernel regression, the conditional density estimator) assume Euclidean structure.
- Sometimes the given data has other geometric structure to exploit.

Slides 5-7 — Motivation: Non-Euclidean Distances
Euclidean distances are often not the right notion of distance between data points.
- Directional headings (the sphere): the distance between directional headings should be the shortest path length.
- Normal distributions (the hyperboloid, with axes for mean and standard deviation): an isometric representation of the hyperboloid is the Poincaré half-plane. Each point in either model represents a normal distribution. Distance is the Fisher distance, which is similar to the KL-divergence.
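The identification of univariate normals with the Poincaré half-plane can be made concrete: the Fisher metric $(d\mu^2 + 2\,d\sigma^2)/\sigma^2$ on normals $(\mu,\sigma)$ becomes the half-plane metric after rescaling the mean by $1/\sqrt{2}$. A minimal sketch of this standard fact (the function names `halfplane_distance` and `fisher_distance` are illustrative, not from the paper):

```python
import math

def halfplane_distance(p, q):
    """Hyperbolic distance in the Poincare half-plane between p=(x1,y1)
    and q=(x2,y2) with y1, y2 > 0: cosh d = 1 + |p-q|^2 / (2*y1*y2)."""
    (x1, y1), (x2, y2) = p, q
    return math.acosh(1 + ((x1 - x2) ** 2 + (y1 - y2) ** 2) / (2 * y1 * y2))

def fisher_distance(n1, n2):
    """Fisher-information distance between univariate normals (mu, sigma),
    via the isometry (mu, sigma) -> (mu / sqrt(2), sigma) up to a sqrt(2)
    overall scale."""
    (m1, s1), (m2, s2) = n1, n2
    return math.sqrt(2) * halfplane_distance((m1 / math.sqrt(2), s1),
                                             (m2 / math.sqrt(2), s2))
```

As a sanity check, the distance is invariant under shifting both means by the same amount and under scaling both distributions $(\mu,\sigma)\mapsto(c\mu,c\sigma)$, the two basic symmetries of the half-plane.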
Slide 8 — Motivation: Non-Euclidean Distances
If the Euclidean distance is not the right distance, then the Euclidean volume is not the right volume. We want to minimize the risk of density estimation on a (Riemannian) manifold $M$:
$$E_f \int_M (f - \hat{f}_n)^2 \, d\mu,$$
where $\hat{f}_n$ is an estimator based on $n$ samples, $f$ is the true density, and $\mu$ is the manifold volume measure based on the intrinsic distance.

Slide 9 — Existing Estimators
The Euclidean KDE
$$\hat{f}_h(X_1,\dots,X_n)(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)$$
achieves the optimal rate of convergence $O(n^{-2s/(2s+d)})$ ($s$ = smoothness parameter, $d$ = dimension). On a general manifold $M$, however, the subtraction $x - X_i$ and the division by $h$ are undefined.

Slide 10 — Exploiting Geometry: Symmetries
- Symmetries = geometry.
- Symmetries make the smoothing of data (convolution by a kernel) tractable.
- Translations in Euclidean space are specific examples of symmetries.
- Other spaces call for other symmetries.

Slides 11-12 — Exploiting symmetries to convolve
Kernel density estimation is about convolving a kernel with the data:
$$\hat{f}_h(X_1,\dots,X_n) = K_h * \operatorname{empirical}(X_1,\dots,X_n).$$
On $\mathbb{R}^n$, for a density $f$ on $\mathbb{R}^n$ and a density $g$ on the space of translations of $\mathbb{R}^n$,
$$(g * f)(x) = \int_{\mathbb{R}^n} g(t)\, f(x - t)\, dt = \int_{\mathbb{R}^n} g(T_t)\, f(T_t^{-1}(x))\, dt,$$
where $T_v(w) = v + w$; identify $t$ with $T_t$ and interpret $g$ as a density on the space of the $T_t$'s. More general spaces, depending on their geometry, require symmetries other than translations.

Slide 13 — Generalized kernel density estimation convolves a generalized kernel with the data. Let $X$ be a symmetric space, a space having a suitable space $G$ of symmetries; for a density $f$ on $X$ and a density $g$ on $G$,
$$(g * f)(x) = \int_G g(T)\, f(T^{-1}(x))\, dT.$$

Slide 14 — G-Kernel Density Estimator: general form. A kernel, a density on the group of symmetries $G$, is convolved with the "empirical density" on the symmetric space $X$ built from the sample observations, with bandwidth and cutoff parameters $h$ and $C$. We can use harmonic analysis on symmetric spaces to define and analyze this estimator.¹

¹ Asta, D., 2014.
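The Euclidean specialization of this convolve-a-kernel-with-the-empirical-density recipe is the ordinary KDE of slide 9, which can be sketched directly (a minimal illustration with a Gaussian kernel; `kde` and `gaussian_kernel` are illustrative names, not the paper's code):

```python
import math

def gaussian_kernel(u):
    # Standard normal density, the usual smoothing kernel on R.
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(samples, h):
    """Euclidean KDE: f_hat(x) = (1/(n*h)) * sum_i K((x - X_i)/h),
    i.e. the kernel convolved with the empirical measure of the samples."""
    n = len(samples)
    def f_hat(x):
        return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)
    return f_hat
```

Since each rescaled kernel integrates to one, the estimate itself integrates to one, which is easy to confirm numerically on a grid.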
$$\hat{f}_{h,C}(X_1,\dots,X_n) = K_h * \operatorname{empirical}(X_1,\dots,X_n)$$

Slide 15 — Harmonic Analysis on Symmetric Spaces¹
The Fourier transform is an isometry $\mathcal{F} : L^2(\mathbb{R}) \rightleftarrows L^2(\mathbb{R}) : \mathcal{F}^{-1}$. The Helgason-Fourier transform is its analogue for a symmetric space $X$: an isometry $\mathcal{H} : L^2(X) \rightleftarrows L^2(\cdots) : \mathcal{H}^{-1}$, whose frequency space depends on the geometry of $X$. The (Helgason-)Fourier transform sends convolutions to products.

¹ Terras, A., 1985.

Slide 16 — Generalization: G-Kernel Density Estimator¹

¹ Asta, D., 2014.
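The convolution-to-products property that makes these transforms useful can be checked directly in its simplest finite analogue, the discrete Fourier transform on the cyclic group. A self-contained sketch (naive $O(n^2)$ transforms for clarity; `dft` and `circular_convolve` are illustrative helpers, not the paper's code):

```python
import cmath

def dft(x):
    # Naive discrete Fourier transform of a length-n sequence.
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def circular_convolve(a, b):
    # Convolution on the cyclic group Z/nZ: (a*b)[j] = sum_k a[k] b[j-k mod n].
    n = len(a)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]
```

The defining identity is that the DFT of a circular convolution equals the pointwise product of the DFTs, which is exactly the property the Helgason-Fourier transform generalizes to symmetric spaces.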
  q  The true density is sufficiently smooth (in Sobolev ball). q  The kernel transforms nicely with the space of data q  The kernel is sufficiently smooth assumptions on kernel and true density: G-Kernel Density Estimator 17 THEOREM: G-KDE achieves the same minimax rate on symmetric spaces as the ordinary KDE achieves on Rd.1 1Asta, D., 2014.
The optimal rate of convergence is $O(n^{-2s/(2s+d)})$ ($s$ = Sobolev smoothness parameter, $d$ = dimension), attained by
$$\hat{f}_{h,C}(X_1,\dots,X_n) = \mathcal{H}^{-1}\!\big[\mathcal{H}[\operatorname{empirical}(X_1,\dots,X_n)]\,\mathcal{H}[K_h]\, I_C\big].$$

Slides 18-19 — Kernels on Symmetries
- Symmetric positive definite $n\times n$ matrices $SPD_n$: kernels are densities on the space $G = GL_n$ of $n\times n$ invertible matrices. Each $GL_n$-matrix $M$ determines an isometry (distance-preserving function) $\phi_M : SPD_n \cong SPD_n$, $\phi_M(X) = M^T X M$.
- Hyperboloid $H^2$: kernels are densities on the space $G = SL_2$ of $2\times2$ invertible matrices having determinant 1. Each $SL_2$-matrix $M$ determines an isometry $\phi_M : H^2 \cong H^2$, $\phi_M(x) = \dfrac{M_{11}x + M_{12}}{M_{21}x + M_{22}}$.

An example of a kernel $K$ (the hyperbolic version of the Gaussian) is the solution to the heat equation on $SL_2$, with
$$\mathcal{H}[K_h](s, k_\theta) \propto e^{h^2\bar{s}^2 - h\bar{s}},$$
and samples from $K$ (points in $SL_2$) can be represented in $H^2 = SL_2/SO_2$.

Slide 20 — Recap: G-KDE¹

¹ Asta, D., 2014.
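The claim that each $SL_2$-matrix acts on $H^2$ by an isometry can be checked numerically in the upper half-plane model (points with positive imaginary part). A minimal sketch, assuming the standard half-plane distance formula; `hyp_dist` and `mobius` are illustrative names:

```python
import math

def hyp_dist(z, w):
    """Hyperbolic distance in the upper half-plane model of H^2:
    cosh d = 1 + |z - w|^2 / (2 * Im(z) * Im(w))."""
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))

def mobius(M, z):
    """Action of a real 2x2 matrix M = ((a, b), (c, d)) with det 1
    on a point z of the upper half-plane: z -> (a z + b) / (c z + d)."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)
```

For any real matrix with determinant 1 the image stays in the upper half-plane (its imaginary part is $\operatorname{Im}(z)/|cz+d|^2 > 0$) and pairwise distances are preserved.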
Exploiting the geometric structure of the data type:
- Tractable data smoothing = convolving a kernel on a space of symmetries.
- Harmonic analysis on symmetric spaces allows us to prove the minimax rate.
- Symmetric spaces are general enough to include: normal distributions, diffusion tensor imaging, material stress, gravitational lensing, and directional headings.