Maximum likelihood estimators on manifolds

Hatem Hajri (1), Salem Said (2), Yannick Berthoumieu (2)
(1) Institut Vedecom, 77 rue des chantiers, Versailles, hatem.hajri@vedecom.fr
(2) Laboratoire IMS (CNRS - UMR 5218), Université de Bordeaux, {salem.said, yannick.berthoumieu}@ims-bordeaux.fr

GSI2017, 07/11/2017. DOI: 10.23723/17410/22550
Abstract

The maximum likelihood estimator (MLE) is a well-known estimator in statistics. Its popularity stems from its asymptotic and universal properties. While the asymptotic properties of MLEs on Euclidean spaces have attracted a lot of interest, their study on manifolds remains insufficient. The present paper gives a unified study of the subject. Its contributions are twofold. First, it proposes a framework of asymptotic results for MLEs on manifolds: consistency, asymptotic normality and asymptotic efficiency. Second, it extends popular testing problems to manifolds. Some examples are discussed.

Keywords: Maximum likelihood estimator, consistency, asymptotic normality, asymptotic efficiency of MLE, statistical tests on manifolds.


1 Introduction

Density estimation on manifolds has many applications in signal and image processing. To give some examples of situations, one can mention:

Covariance matrices: In recent works [1-5], new distributions on manifolds of covariance matrices (positive definite, Hermitian, Toeplitz, block Toeplitz, ...), called Gaussian and Laplace distributions, were introduced. Estimation of the parameters of these distributions has led to various applications (image classification, EEG data analysis, etc.).

Stiefel and Grassmann manifolds: These manifolds are used in various applications such as pattern recognition [6-8] and shape analysis [9]. Among the most studied density functions on these manifolds one finds the Langevin, Bingham and Gaussian distributions [10]. In [6-8], maximum likelihood estimation of the Langevin and Gaussian distributions is applied to tasks of activity recognition and video-based face recognition.

Lie groups: Lie groups arise in various problems of signal and image processing, such as localization, tracking [11, 12] and medical image processing [13]. In [13], maximum likelihood estimation of new distributions on Lie groups, called Gaussian distributions, is performed, with applications to medical image processing. The recent work [4] proposes new Gaussian distributions on Lie groups and a complete program, based on the MLE, to learn data on Lie groups using these distributions.

The present paper is structured as follows. Section 2 focuses on the consistency of the MLE on general metric spaces. Section 3 discusses asymptotic normality and asymptotic efficiency of the MLE on manifolds. Finally, Section 4 presents some hypothesis tests on manifolds.

2 Consistency

In this section it is shown that, under suitable conditions, MLEs on general metric spaces are consistent estimators. The result given here may not be optimal; however, in addition to its simple form, it is applicable to several examples of distributions on manifolds, as discussed below.

Let $(\Theta, d)$ denote a metric space and let $M$ be a measurable space carrying a positive measure $\mu$. Consider a family of distributions $(P_\theta)_{\theta \in \Theta}$ on $M$ such that $P_\theta(dx) = f(x, \theta)\,\mu(dx)$ with $f > 0$. If $x_1, \dots, x_n$ are independent random samples from $P_{\theta_0}$, a maximum likelihood estimator is any $\hat\theta_n$ which solves $\max_\theta L_n(\theta) = L_n(\hat\theta_n)$, where
$$ L_n(\theta) = \frac{1}{n} \sum_{i=1}^n \log f(x_i, \theta). $$
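To make this definition concrete, here is a minimal numerical sketch (an illustration added to this transcription, not code from the paper). It computes the MLE of the location parameter of the Langevin (von Mises) distribution on the circle $S^1$ by maximizing $L_n$ directly; the concentration `kappa`, the true value `theta0` and the sample size are assumptions of the example.

```python
import numpy as np
from scipy.special import i0
from scipy.optimize import minimize_scalar

# Langevin (von Mises) density on the circle S^1:
# f(x, theta) = exp(kappa * cos(x - theta)) / (2 * pi * I0(kappa)).
kappa = 2.0    # concentration, assumed known in this illustration
theta0 = 1.0   # true location parameter
rng = np.random.default_rng(0)
x = rng.vonmises(theta0, kappa, size=1000)  # i.i.d. samples from P_theta0

def L_n(theta):
    """Empirical average log-likelihood L_n(theta)."""
    return np.mean(kappa * np.cos(x - theta) - np.log(2 * np.pi * i0(kappa)))

# The MLE maximizes L_n over the parameter space, here [-pi, pi].
res = minimize_scalar(lambda t: -L_n(t), bounds=(-np.pi, np.pi), method="bounded")
print("theta_hat =", res.x)  # close to theta0 for large n, in line with the
                             # consistency result stated below
```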
The main result of this section is Theorem 1 below. The notation $E_\theta[g(x)]$ stands for $\int_M g(y) f(y, \theta)\, \mu(dy)$.

Theorem 1. Assume the following assumptions hold for some $\theta_0 \in \Theta$.

(1) For all $x$, $f(x, \theta)$ is continuous with respect to $\theta$.
(2) $E_{\theta_0}[|\log f(x, \theta)|] < \infty$ for all $\theta$, and $L(\theta) = E_{\theta_0}[\log f(x, \theta)]$ is continuous on $\Theta$ and uniquely maximized at $\theta_0$.
(3) For every compact $K \subset \Theta$,
$$ Q(\delta) := E_{\theta_0}\big[\sup\{|\log f(x,\theta) - \log f(x,\theta')| : \theta, \theta' \in K,\ d(\theta, \theta') \le \delta\}\big] $$
satisfies $\lim_{\delta \to 0} Q(\delta) = 0$.

Let $x_1, \dots, x_n, \dots$ be independent random samples of $P_{\theta_0}$. Then, for every compact $K \subset \Theta$, the following convergence holds in probability:
$$ \lim_{n\to\infty}\ \sup_{\theta \in K} |L_n(\theta) - L(\theta)| = 0. $$

Assume moreover:

(4) There exists a compact $K_0 \subset \Theta$ containing $\theta_0$ such that $E_{\theta_0}[|\sup\{\log f(x,\theta) : \theta \in K_0^c\}|] < \infty$ and $E_{\theta_0}[\sup\{\log f(x,\theta) : \theta \in K_0^c\}] < L(\theta_0)$.

Then, whenever $\hat\theta_n$ exists and is unique for all $n$, it converges to $\theta_0$ in probability.

Proof. Since $L$ is a deterministic function, it is enough to prove, for every compact $K$:

(i) Convergence of finite-dimensional distributions: $(L_n(\theta_1), \dots, L_n(\theta_p))$ weakly converges to $(L(\theta_1), \dots, L(\theta_p))$ for any $\theta_1, \dots, \theta_p \in K$.
(ii) Tightness criterion: for all $\varepsilon > 0$,
$$ \lim_{\delta\to 0}\ \limsup_{n\to\infty}\ P\Big(\sup_{\theta,\theta' \in K,\ d(\theta,\theta') < \delta} |L_n(\theta) - L_n(\theta')| > \varepsilon\Big) = 0. $$

Fact (i) is a consequence of the first assumption in (2) and the strong law of large numbers (SLLN). For (ii), set $F = \{(\theta, \theta') \in K^2 : d(\theta, \theta') < \delta\}$ and note
$$ P\Big(\sup_F |L_n(\theta) - L_n(\theta')| > \varepsilon\Big) \le P(Q_n(\delta) > \varepsilon), $$
where $Q_n(\delta) = \frac{1}{n}\sum_{i=1}^n \sup_F |\log f(x_i,\theta) - \log f(x_i,\theta')|$. By assumption (3), there exists $\delta_0 > 0$ such that $Q(\delta) \le Q(\delta_0) < \varepsilon$ for all $\delta \le \delta_0$. An application of the SLLN shows that, for all $\delta \le \delta_0$, $\lim_n Q_n(\delta) = Q(\delta)$, and consequently
$$ \limsup_{n\to\infty} P(Q_n(\delta) > \varepsilon) = \limsup_{n\to\infty} P(Q_n(\delta) - Q(\delta) > \varepsilon - Q(\delta)) = 0. $$
This proves fact (ii).

Assume (4) holds. The bound
$$ P(\hat\theta_n \notin K_0) \le P\big(\sup_{K_0^c} L_n(\theta) > \sup_{K_0} L_n(\theta)\big) \le P\big(\sup_{K_0^c} L_n(\theta) > L_n(\theta_0)\big) $$
and the inequality $\sup_{\theta \in K_0^c} L_n(\theta) \le \frac{1}{n}\sum_{i=1}^n \sup_{\theta \in K_0^c} \log f(x_i,\theta)$ give
$$ P(\hat\theta_n \notin K_0) \le P\Big(\frac{1}{n}\sum_{i=1}^n \sup_{\theta \in K_0^c} \log f(x_i,\theta) > L_n(\theta_0)\Big). $$
By the SLLN, $\limsup_n P(\hat\theta_n \notin K_0) \le 1_{\{E_{\theta_0}[\sup_{\theta \in K_0^c} \log f(x,\theta)] \ge L(\theta_0)\}} = 0$. With $K_0(\varepsilon) := \{\theta \in K_0 : d(\theta,\theta_0) \ge \varepsilon\}$, one has
$$ P(d(\hat\theta_n, \theta_0) \ge \varepsilon) \le P(\hat\theta_n \in K_0(\varepsilon)) + P(\hat\theta_n \notin K_0), $$
where $P(\hat\theta_n \in K_0(\varepsilon)) \le P(\sup_{K_0(\varepsilon)} L_n > L_n(\theta_0))$. Since $L_n$ converges to $L$ uniformly in probability on $K_0(\varepsilon)$, $\sup_{K_0(\varepsilon)} L_n$ converges in probability to $\sup_{K_0(\varepsilon)} L$, which is strictly smaller than $L(\theta_0)$ by assumption (2); hence $\limsup_n P(d(\hat\theta_n, \theta_0) \ge \varepsilon) = 0$. □

2.1 Some examples

In the following, some distributions which satisfy the assumptions of Theorem 1 are given. More examples will be discussed in a forthcoming paper.

(i) Gaussian and Laplace distributions on $P_m$. Let $\Theta = M = P_m$ be the Riemannian manifold of symmetric positive definite matrices of size $m \times m$, equipped with the Rao-Fisher metric and its Riemannian distance $d$, called Rao's distance. The Gaussian distribution on $P_m$, as introduced in [1], has density with respect to the Riemannian volume given by
$$ f(x, \theta) = \frac{1}{Z_m(\sigma)} \exp\Big(-\frac{d^2(x,\theta)}{2\sigma^2}\Big), $$
where $\sigma > 0$ and $Z_m(\sigma) > 0$ is a normalizing factor depending only on $\sigma$. Points (1) and (3) of Theorem 1 are easy to verify. Point (2) is proved in Proposition 9 of [1]. To check (4), note that $\log f(x,\theta) = -d^2(x,\theta)/(2\sigma^2) - \log Z_m(\sigma)$, so it suffices to control $-d^2(x,\theta)$. Define $O = \{\theta : d(\theta, \theta_0) > \varepsilon\}$ and note
$$ E_{\theta_0}\big[\sup_O(-d^2(x,\theta))\big] \le E_{\theta_0}\big[\sup_O(-d^2(x,\theta))\, 1_{2d(x,\theta_0) \le \varepsilon - 1}\big]. \tag{1} $$
By the triangle inequality, $-d^2(x,\theta) \le -d^2(x,\theta_0) + 2 d(\theta,\theta_0) d(x,\theta_0) - d^2(\theta,\theta_0)$, and consequently the right-hand side of (1) is smaller than
$$ E_{\theta_0}\big[\sup_O\big(2 d(\theta,\theta_0) d(x,\theta_0) - d^2(\theta,\theta_0)\big)\, 1_{2d(x,\theta_0) \le \varepsilon - 1}\big]. $$
But if $2d(x,\theta_0) \le \varepsilon - 1$ and $d(\theta,\theta_0) > \varepsilon$, then
$$ 2 d(\theta,\theta_0) d(x,\theta_0) - d^2(\theta,\theta_0) < d(\theta,\theta_0)(\varepsilon - 1 - \varepsilon) < -\varepsilon. $$
Finally $(1) \le -\varepsilon$, and this gives (4) since $K_0 = O^c$ is compact.

Let $x_1, \dots, x_n, \dots$ be independent samples of $f(\cdot, \theta_0)$. The MLE based on these samples is the Riemannian mean $\hat\theta_n = \operatorname{argmin}_\theta \sum_{i=1}^n d^2(x_i, \theta)$; a numerical sketch is given after these examples. Existence and uniqueness of $\hat\theta_n$ follow from [14]. Theorem 1 shows the convergence of $\hat\theta_n$ to $\theta_0$. This convergence was proved in [1] using results of [15] on the convergence of empirical barycenters.

(ii) Gaussian and Laplace distributions on symmetric spaces. Gaussian distributions can be defined more generally on Riemannian symmetric spaces [4]. MLEs of these distributions are consistent estimators [4]. This can be recovered by applying Theorem 1 as for $P_m$. In the same way, it can be checked that Laplace distributions on $P_m$ [2] and on symmetric spaces satisfy the assumptions of Theorem 1, and consequently their estimators are also consistent. Notice that, for Laplace distributions, the MLE coincides with the Riemannian median $\hat\theta_n = \operatorname{argmin}_\theta \sum_{i=1}^n d(x_i, \theta)$.
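Here is the numerical sketch announced in example (i) (an illustration added here, not code from the paper). It assumes the common convention that Rao's distance on $P_m$ is, possibly up to a constant factor depending on the normalization of the Rao-Fisher metric, the affine-invariant distance $d(x,\theta) = \|\log(\theta^{-1/2} x\, \theta^{-1/2})\|_F$, and computes the Riemannian mean by the standard fixed-point iteration; initialization, iteration count and tolerance are choices of the illustration.

```python
import numpy as np
from scipy.linalg import sqrtm, logm, expm

def rao_distance(x, theta):
    """Affine-invariant distance on P_m (one convention for Rao's distance):
    d(x, theta) = ||logm(theta^{-1/2} x theta^{-1/2})||_F."""
    s = np.real(np.linalg.inv(sqrtm(theta)))   # theta^{-1/2}; real for SPD input
    return np.linalg.norm(np.real(logm(s @ x @ s)), "fro")

def riemannian_mean(samples, iters=100, tol=1e-10):
    """MLE theta_hat_n = argmin_theta sum_i d^2(x_i, theta), computed by the
    standard fixed-point (Riemannian gradient descent) iteration."""
    theta = np.mean(samples, axis=0)           # arithmetic mean as initial guess
    for _ in range(iters):
        r = np.real(sqrtm(theta))              # theta^{1/2}
        s = np.linalg.inv(r)                   # theta^{-1/2}
        step = np.mean([np.real(logm(s @ x @ s)) for x in samples], axis=0)
        theta = r @ expm(step) @ r             # exponential map at theta
        if np.linalg.norm(step, "fro") < tol:  # Riemannian gradient ~ 0
            break
    return theta
```

For the Gaussian model of example (i) with $\sigma$ fixed, maximizing $L_n$ in $\theta$ is exactly this minimization, since $\log Z_m(\sigma)$ does not depend on $\theta$.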
3 Asymptotic normality and asymptotic efficiency of the MLE

Let $\Theta$ be a smooth manifold of dimension $p$, equipped with an affine connection $\nabla$ and an arbitrary distance $d$. Consider a measurable space $M$ equipped with a positive measure $\mu$, and a family of distributions $(P_\theta)_{\theta\in\Theta}$ on $M$ such that $P_\theta(dx) = f(x,\theta)\,\mu(dx)$ with $f > 0$. Consider the following generalization of estimating functions [16].

Definition 1. An estimating form is a function $\omega : M \times \Theta \to T^*\Theta$ such that, for all $(x,\theta) \in M \times \Theta$, $\omega(x,\theta) \in T_\theta^*\Theta$ and $E_\theta[\omega(x,\theta)] = 0$, or equivalently $E_\theta[\omega(x,\theta)X_\theta] = 0$ for all $X_\theta \in T_\theta\Theta$.

Assume $l(x,\theta) = \log f(x,\theta)$ is smooth in $\theta$ and satisfies appropriate integrability conditions. Then, differentiating the identity $\int_M f(x,\theta)\,\mu(dx) = 1$ with respect to $\theta$, one finds that $\omega(x,\theta) = dl(x,\theta)$ is an estimating form.

The main result of this section is the following.

Theorem 2. Let $\omega : M \times \Theta \to T^*\Theta$ be an estimating form. Fix $\theta_0 \in \Theta$ and let $(x_n)_{n\ge1}$ be independent samples of $P_{\theta_0}$. Assume:

(i) There exist $(\hat\theta_N)_{N\ge1}$ such that $\sum_{n=1}^N \omega(x_n, \hat\theta_N) = 0$ for all $N$, and $\hat\theta_N$ converges in probability to $\theta_0$.
(ii) For all $u, v \in T_{\theta_0}\Theta$, $E_{\theta_0}[|\nabla\omega(x,\theta_0)(u,v)|] < \infty$, and there exists a basis $(e_a)_{a=1,\dots,p}$ of $T_{\theta_0}\Theta$ such that the matrix $A$ with entries $A_{a,b} = E_{\theta_0}[\nabla\omega(x,\theta_0)(e_a,e_b)]$ is invertible.
(iii) The function
$$ R(\delta) = E_{\theta_0}\Big[\sup_{t\in[0,1],\ \theta\in B(\theta_0,\delta)} |\nabla\omega(x,\gamma(t))(e_a(t),e_b(t)) - \nabla\omega(x,\theta_0)(e_a,e_b)|\Big] $$
satisfies $\lim_{\delta\to 0} R(\delta) = 0$, where $(e_a)_{a=1,\dots,p}$ is a basis of $T_{\theta_0}\Theta$ as in (ii) and $e_a(t)$, $t\in[0,1]$, is the parallel transport of $e_a$ along $\gamma$, the unique geodesic joining $\theta_0$ and $\theta$.

Let $\operatorname{Log}_{\theta_0}(\hat\theta_N) = \sum_{a=1}^p \Delta_a e_a$ be the decomposition of $\operatorname{Log}_{\theta_0}(\hat\theta_N)$ in the basis $(e_a)_{a=1,\dots,p}$. The following convergence holds in distribution as $N \to \infty$:
$$ \sqrt{N}\,(\Delta_1, \dots, \Delta_p)^T \Rightarrow N\big(0,\ (A^\dagger)^{-1}\Gamma A^{-1}\big), $$
where $\Gamma$ is the matrix with entries $\Gamma_{a,b} = E_{\theta_0}[\omega(x,\theta_0)e_a \cdot \omega(x,\theta_0)e_b]$.

Proof. Take $V$ a small neighborhood of $\theta_0$ and let $\gamma : [0,1] \to V$ be the unique geodesic contained in $V$ such that $\gamma(0) = \theta_0$ and $\gamma(1) = \hat\theta_N$. Let $(e_a)_{a=1,\dots,p}$ be a basis of $T_{\theta_0}\Theta$ as in (ii) and define $e_a(t)$, $t\in[0,1]$, as the parallel transport of $e_a$ along $\gamma$:
$$ \frac{D e_a(t)}{dt} = 0, \quad t\in[0,1], \qquad e_a(0) = e_a, $$
where $D$ is the covariant derivative along $\gamma$. Introduce
$$ \omega_N(\theta) = \sum_{n=1}^N \omega(x_n, \theta) \quad \text{and} \quad F_a(t) = \omega_N(\gamma(t))(e_a(t)). $$
By the Taylor formula, there exists $c_a \in [0,1]$ such that
$$ F_a(1) = F_a(0) + F'_a(c_a). \tag{2} $$
Note that $F_a(1) = 0$, $F_a(0) = \omega_N(\theta_0)(e_a)$ and $F'_a(t) = (\nabla\omega_N)(\gamma'(t), e_a(t)) = \sum_b \Delta_b (\nabla\omega_N)(e_b(t), e_a(t))$. In particular, $F'_a(0) = \sum_b \Delta_b (\nabla\omega_N)(e_b, e_a)$. Dividing (2) by $\sqrt{N}$ gives
$$ -\frac{1}{\sqrt{N}}\,\omega_N(\theta_0)(e_a) = \frac{1}{\sqrt{N}} \sum_b \Delta_b\, (\nabla\omega_N)(e_b(c_a), e_a(c_a)). \tag{3} $$
Define $Y^N = \big(-\frac{1}{\sqrt{N}}\omega_N(\theta_0)(e_1), \dots, -\frac{1}{\sqrt{N}}\omega_N(\theta_0)(e_p)\big)^\dagger$ and let $A^N$ be the matrix with entries $A^N(a,b) = \frac{1}{N}(\nabla\omega_N)(e_a(c_a), e_b(c_a))$. Then (3) writes as $Y^N = (A^N)^\dagger(\sqrt{N}\Delta_1, \dots, \sqrt{N}\Delta_p)^\dagger$. Since $E_{\theta_0}[\omega(x,\theta_0)] = 0$, by the central limit theorem $Y^N$ converges in distribution to a multivariate normal distribution with mean $0$ and covariance $\Gamma$. Note that
$$ A^N_{a,b} = \frac{1}{N}(\nabla\omega_N)(e_a, e_b) + R^N_{a,b}, $$
where $R^N_{a,b} = \frac{1}{N}(\nabla\omega_N)(e_a(c_a), e_b(c_a)) - \frac{1}{N}(\nabla\omega_N)(e_a, e_b)$. By the SLLN and assumption (ii), the matrix $B^N$ with entries $B^N(a,b) = \frac{1}{N}(\nabla\omega_N)(e_a, e_b)$ converges almost surely to the matrix $A$. Moreover, $|R^N_{a,b}|$ is bounded by
$$ \frac{1}{N}\sum_{n=1}^N\ \sup_{t\in[0,1]}\ \sup_{\theta\in B(\theta_0,\delta)} |\nabla\omega(x_n,\gamma(t))(e_a(t),e_b(t)) - \nabla\omega(x_n,\theta_0)(e_a,e_b)|. $$
By the SLLN, for $\delta$ small enough, the right-hand side converges to $R(\delta)$ defined in (iii). The convergence in probability of $\hat\theta_N$ to $\theta_0$ and assumption (iii) show that $R^N_{a,b} \to 0$ in probability, and so $A^N$ converges in probability to $A$. By the Slutsky lemma, $((A_N^\dagger)^{-1}, Y^N)$ converges in distribution to $((A^\dagger)^{-1}, N(0,\Gamma))$, and so $(A_N^\dagger)^{-1}Y^N$ converges in distribution to $(A^\dagger)^{-1}N(0,\Gamma) = N(0, (A^\dagger)^{-1}\Gamma A^{-1})$. □
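As a numerical companion to Theorem 2 (continuing the hypothetical von Mises illustration from Section 2, with $\omega = dl$), the objects of the theorem can be estimated empirically. On the one-dimensional manifold $S^1$, $\omega(x,\theta) = \kappa\sin(x-\theta)$ and $\nabla\omega(x,\theta) = -\kappa\cos(x-\theta)$, so $A$ and $\Gamma$ are scalars and the limit covariance $(A^\dagger)^{-1}\Gamma A^{-1}$ reduces to $\Gamma/A^2$. All numerical values are assumptions of the sketch.

```python
import numpy as np
from scipy.special import i0, i1

# Empirical estimates of A and Gamma for omega = dl in the von Mises model.
kappa, theta0, N = 2.0, 1.0, 5000
rng = np.random.default_rng(1)
x = rng.vonmises(theta0, kappa, size=N)

A_hat = np.mean(-kappa * np.cos(x - theta0))          # A = E[grad omega]
Gamma_hat = np.mean((kappa * np.sin(x - theta0))**2)  # Gamma = E[omega^2]
sandwich = Gamma_hat / A_hat**2                       # (A^T)^{-1} Gamma A^{-1}, p = 1

# Remark 1 below predicts Gamma = -A = kappa * I1(kappa) / I0(kappa), so the
# sandwich should match the inverse Fisher information:
print(sandwich, i0(kappa) / (kappa * i1(kappa)))
```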
Remark 1 (on $\omega = dl$). For $\omega$ an estimating form, one has $E_\theta[\omega(x,\theta)] = 0$. Taking the covariant derivative of this identity, one gets $E_\theta[dl(U)\,\omega(V)] = -E_\theta[\nabla\omega(U,V)]$ for all vector fields $U, V$. When $\omega = dl$, this writes $E_\theta[\omega(U)\omega(V)] = -E_\theta[\nabla\omega(U,V)]$. In particular, $\Gamma_{a,b} = E_{\theta_0}[dl \otimes dl\,(e_a,e_b)] = -A_{a,b}$ and $A^\dagger = A$, with $A_{a,b} = E_{\theta_0}[\nabla(dl)(e_a,e_b)] = E_{\theta_0}[\nabla^2 l\,(e_a,e_b)]$, where $\nabla^2$ is the Hessian of $l$. The limit covariance is therefore the inverse of the Fisher information matrix: $(A^\dagger)^{-1}\Gamma A^{-1} = \Gamma^{-1} = -A^{-1}$. This yields the following corollary.

Corollary 1. Assume $\Theta = (M, g)$ is a Riemannian manifold and let $d$ be the Riemannian distance on $\Theta$. Assume $\omega = dl$ satisfies the assumptions of Theorem 2, where $\nabla$ is the Levi-Civita connection on $\Theta$ and $(e_a)$ is an orthonormal basis of $T_{\theta_0}\Theta$. The following convergence holds in distribution as $N \to \infty$:
$$ N\,d^2(\hat\theta_N, \theta_0) \Rightarrow \sum_{i=1}^p X_i^2, $$
where $X = (X_1, \dots, X_p)^T$ is a random variable with law $N(0, I^{-1})$ and $I(a,b) = -E_{\theta_0}[\nabla^2 l\,(e_a, e_b)]$.

The next proposition is concerned with the asymptotic efficiency of the MLE. It states that the lowest asymptotic variance among estimating forms satisfying Theorem 2 is attained for $\omega_0 = dl$. Take $\omega$ an estimating form and consider the matrices $E, F, G, H$ with entries
$$ E_{a,b} = E_{\theta_0}[dl(\theta_0,x)e_a\; dl(\theta_0,x)e_b], \qquad F_{a,b} = E_{\theta_0}[dl(\theta_0,x)e_a\; \omega(\theta_0,x)e_b] = -A_{a,b}, $$
$$ G_{a,b} = F_{b,a}, \qquad H_{a,b} = E_{\theta_0}[\omega(\theta_0,x)e_a\; \omega(\theta_0,x)e_b] = \Gamma_{a,b}. $$
Recall that $E^{-1}$ is the limit covariance when $\omega_0 = dl$. Note that $M = \begin{pmatrix} E & F \\ G & H \end{pmatrix}$ is symmetric. When $\omega = dl$, it is furthermore positive semidefinite but not positive definite.

Proposition 1. If $M$ is positive definite, then $E^{-1} < (A^\dagger)^{-1}\Gamma A^{-1}$.

Proof. Since $M$ is symmetric positive definite, the same also holds for its inverse. By the Schur inversion lemma, $E - FH^{-1}G$ is symmetric positive definite. That is, $E > FH^{-1}G$, or equivalently $E^{-1} < (A^\dagger)^{-1}\Gamma A^{-1}$. □

Remark 2. As an example, it can be checked that Theorem 2 is satisfied by $\omega = dl$ for the Gaussian and Laplace distributions discussed in paragraph 2.1. For the Gaussian distribution on $P_m$, this result is proved in [1]. More examples will be given in a future paper.
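As a quick Monte Carlo check of Corollary 1 (again on the hypothetical von Mises example, where $p = 1$ and the Fisher information is $I = \kappa I_1(\kappa)/I_0(\kappa)$), the statistic $N d^2(\hat\theta_N, \theta_0)$ should behave like $X^2$ with $X \sim N(0, I^{-1})$, so its mean should approach $1/I$; sample sizes and seeds are choices of the sketch.

```python
import numpy as np
from scipy.special import i0, i1

kappa, theta0, N, trials = 2.0, 1.0, 2000, 2000
I_fisher = kappa * i1(kappa) / i0(kappa)  # Fisher information of the location
rng = np.random.default_rng(2)

stats = []
for _ in range(trials):
    x = rng.vonmises(theta0, kappa, size=N)
    # On S^1 the MLE of the location is the circular mean direction.
    theta_hat = np.arctan2(np.sin(x).mean(), np.cos(x).mean())
    d = np.angle(np.exp(1j * (theta_hat - theta0)))  # geodesic distance on S^1
    stats.append(N * d * d)

print(np.mean(stats), 1.0 / I_fisher)  # the two values should be close
```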
Remark 3 (on the Cramér-Rao lower bound). Assume $\Theta$ is a Riemannian manifold and $\hat\theta_n$ defined in Theorem 2 (i) is unbiased: $E[\operatorname{Log}_{\theta_0}(\hat\theta_n)] = 0$. Consider $(e_1, \dots, e_p)$ an orthonormal basis of $T_{\theta_0}\Theta$ and denote by $a = (a_1, \dots, a_p)$ the coordinates of $\operatorname{Log}_{\theta_0}(\hat\theta_n)$ in this basis. Smith [17] gave an intrinsic Cramér-Rao lower bound for the covariance $C(\theta_0) = E[aa^T]$ as follows:
$$ C \ge F^{-1} + \text{curvature terms}, \tag{4} $$
where $F$, with entries $F_{i,j} = E[dL(\theta_0)e_i\; dL(\theta_0)e_j]$ for $i, j \in [1,p]$, is the Fisher information matrix and $L(\theta) = \sum_{i=1}^n \log f(x_i, \theta)$. Define $\mathcal{L}$ the matrix with entries $\mathcal{L}_{i,j} = E[dl(\theta_0)e_i\; dl(\theta_0)e_j]$, where $l(\theta) = \log f(x_1, \theta)$. Multiplying (4) by $n$ and setting $y = \sqrt{n}\,a$, one gets, since $F = n\mathcal{L}$ for $n$ independent samples,
$$ E[yy^T] \ge \mathcal{L}^{-1} + n \times \text{curvature terms}. $$
It can be checked that, as $n \to \infty$, $n \times \text{curvature terms} \to 0$. Recall that $y$ converges in distribution to $N(0, (A^\dagger)^{-1}\Gamma A^{-1})$. Assuming it is possible to interchange limit and integral, one deduces from Theorem 2 that $(A^\dagger)^{-1}\Gamma A^{-1} \ge \mathcal{L}^{-1}$, which is similar to Proposition 1.

4 Statistical tests

The asymptotic properties of the MLE have led to another fundamental subject in statistics, namely testing. In the following, some popular tests on Euclidean spaces are generalized to manifolds. Let $\Theta$, $M$ and $f$ be as at the beginning of the previous section.

Wald and score tests. Given $x_1, \dots, x_n$ independent samples of $f(\cdot, \theta)$ where $\theta$ is unknown, consider the test $H_0 : \theta = \theta_0$. Define the Wald test statistic for $H_0$ by
$$ Q_W = n\,(\Delta_1, \dots, \Delta_p)\; I(\theta_0)\; (\Delta_1, \dots, \Delta_p)^T, $$
where $I(\theta_0)$ is the Fisher matrix with entries $I(\theta_0)(a,b) = -E_{\theta_0}[\nabla^2 l\,(e_a, e_b)]$, and $\Delta_1, \dots, \Delta_p$ and $(e_a)_{a=1:p}$ are defined as in Theorem 2. Continuing with the same notation, the score test is based on the statistic
$$ Q_S = n^{-1}\, U(\theta_0)^T\, I(\theta_0)^{-1}\, U(\theta_0), $$
where $U(\theta_0) = (U_1(\theta_0), \dots, U_p(\theta_0))$ and the $(U_a(\theta_0))_{a=1:p}$ are the coordinates, in the basis $(e_a)_{a=1:p}$, of the gradient at $\theta_0$ of $l(\theta, X) = \sum_{i=1}^n \log f(x_i, \theta)$.

Theorem 3. Assume $\omega = dl$ satisfies the conditions of Theorem 2. Then, under $H_0 : \theta = \theta_0$, $Q_W$ (respectively $Q_S$) converges in distribution to a $\chi^2$ distribution with $p = \dim(\Theta)$ degrees of freedom. In particular, the Wald test (resp. the score test) rejects $H_0$ when $Q_W$ (resp. $Q_S$) is larger than a chi-square percentile.

For lack of space, the proof of this theorem will be published in a future paper. One can also consider a generalization of the Wilks test to manifolds. An extension of this test to the manifold $P_m$ appeared in [1].
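A minimal sketch of the two statistics on the same hypothetical von Mises model ($p = 1$, $I(\theta_0) = \kappa I_1(\kappa)/I_0(\kappa)$); the 95% level, sample size and seed are choices of the illustration, not prescriptions of the paper.

```python
import numpy as np
from scipy.special import i0, i1
from scipy.stats import chi2

kappa, theta0, n = 2.0, 1.0, 1000
I_fisher = kappa * i1(kappa) / i0(kappa)
rng = np.random.default_rng(3)
x = rng.vonmises(theta0, kappa, size=n)  # data generated under H0

# Wald statistic: delta is the coordinate of Log_theta0(theta_hat) on S^1.
theta_hat = np.arctan2(np.sin(x).mean(), np.cos(x).mean())
delta = np.angle(np.exp(1j * (theta_hat - theta0)))
Q_W = n * delta * I_fisher * delta

# Score statistic: U is the total score at theta0.
U = np.sum(kappa * np.sin(x - theta0))
Q_S = U**2 / (n * I_fisher)

crit = chi2.ppf(0.95, df=1)    # chi-square percentile, p = 1
print(Q_W > crit, Q_S > crit)  # under H0, each rejects ~5% of the time
```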
References

1. S. Said et al. Riemannian Gaussian distributions on the space of symmetric positive definite matrices. IEEE Trans. Inf. Theory, 2017.
2. H. Hajri et al. Riemannian Laplace distribution on the space of symmetric positive definite matrices. Entropy 18, 2016.
3. H. Hajri et al. A geometric learning approach on the space of complex covariance matrices. ICASSP 2017.
4. S. Said et al. Gaussian distributions on Riemannian symmetric spaces: statistical learning with structured covariance matrices. IEEE Trans. Inf. Theory, 2017.
5. P. Zanini et al. Parameters estimate of Riemannian Gaussian distribution in the manifold of covariance matrices. IEEE Sensor Array and Multichannel Signal Processing Workshop, Rio de Janeiro, 2016.
6. P. K. Turaga et al. Statistical analysis on Stiefel and Grassmann manifolds with applications in computer vision. IEEE Computer Society, 2008.
7. G. Aggarwal et al. A system identification approach for video-based face recognition. ICPR (4), pages 175-178. IEEE Computer Society, 2004.
8. P. K. Turaga et al. Statistical computations on Grassmann and Stiefel manifolds for image and video-based recognition. IEEE Trans. Pattern Anal. Mach. Intell. 33(11), 2011.
9. D. G. Kendall. Shape manifolds, Procrustean metrics, and complex projective spaces. Bulletin of the London Mathematical Society, 1984.
10. Y. Chikuse. Statistics on Special Manifolds. Lecture Notes in Statistics, Vol. 174, Springer-Verlag, New York, 2003.
11. J. Kwon et al. A geometric particle filter for template-based visual tracking. IEEE Trans. Pattern Anal. Mach. Intell. 36(4), pages 625-643, 2014.
12. J. Trumpf et al. Analysis of non-linear attitude observers for time-varying reference measurements. IEEE Trans. Automat. Contr. 57(11), 2012.
13. P. T. Fletcher et al. Gaussian distributions on Lie groups and their application to statistical shape analysis. Information Processing in Medical Imaging, 18th International Conference, UK, 2003.
14. B. Afsari. Riemannian L^p center of mass: existence, uniqueness and convexity. Proc. Amer. Math. Soc. 139(2), pages 655-673, 2011.
15. R. Bhattacharya and V. Patrangenaru. Large sample theory of intrinsic and extrinsic sample means on manifolds I. Ann. Statist. 31(1), 2003.
16. C. C. Heyde. Quasi-likelihood and its application: a general approach to optimal parameter estimation. Springer-Verlag, Berlin; New York, 1997.
17. S. T. Smith. Covariance, subspace, and intrinsic Cramér-Rao bounds. IEEE Trans. Signal Process. 53(5), pages 1610-1630, 2005.