Stochastic Development Regression using Method of Moments

07/11/2017
Publication GSI2017
OAI : oai:www.see.asso.fr:17410:22568

Abstract

This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold-valued non-linear data. Stochastic development regression captures the relation between a manifold-valued response and Euclidean covariate variables using the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects.
The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry.
We propose to infer parameters using the Method of Moments procedure, which matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite-dimensional landmark manifolds.

application/pdf Stochastic Development Regression using Method of Moments Line Kühnel, Stefan Sommer
License

Creative Commons: none (all rights reserved)

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/17410/22568</identifier><creators><creator><creatorName>Stefan Sommer</creatorName></creator><creator><creatorName>Line Kühnel</creatorName></creator></creators><titles>
            <title>Stochastic Development Regression using Method of Moments</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2018</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><subjects><subject>Frame Bundle</subject><subject>Non-linear Statistics</subject><subject>Regression</subject><subject>Statistics on Manifolds</subject><subject>Stochastic Development</subject></subjects><dates>
	    <date dateType="Created">Thu 8 Mar 2018</date>
	    <date dateType="Updated">Thu 8 Mar 2018</date>
            <date dateType="Submitted">Fri 20 Jul 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">2ab8862d1c8a567ab38bf69fb6735f7104818f31</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>37289</version>
        <descriptions>
            <description descriptionType="Abstract">This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold valued non-linear data. Stochastic development regression captures the relation between manifold-valued response and Euclidean covariate variables using the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects.<br />
The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry.<br />
We propose to infer parameters using the Method of Moments procedure that matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite dimensional landmark manifolds.
</description>
        </descriptions>
    </resource>

Stochastic Development Regression using Method of Moments

Line Kühnel and Stefan Sommer
Department of Computer Science, University of Copenhagen
kuhnel@di.ku.dk, sommer@di.ku.dk

Abstract. This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold-valued non-linear data. Stochastic development regression captures the relation between a manifold-valued response and Euclidean covariate variables using the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects. The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry. We propose to infer parameters using the Method of Moments procedure, which matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite-dimensional landmark manifolds.

Keywords: Frame Bundle, Non-linear Statistics, Regression, Statistics on Manifolds, Stochastic Development.

1 Introduction

There is a growing interest in the statistical analysis of non-linear data such as shape data arising in medical imaging and computational anatomy. Non-linear data spaces lack vector space structure, and traditional Euclidean statistical theory is therefore not sufficient to analyze non-linear data. This paper considers parameter inference for the stochastic development regression (SDR) model introduced in [10], which generalizes Euclidean regression models to non-linear spaces. The focus of this paper is to introduce an alternative estimation procedure that is simple and computationally tractable.

Stochastic development regression is used to model the relation between a manifold-valued response and Euclidean covariate variables.
Similar to Brownian motions on a manifold $M$, defined as the transport of a Euclidean Brownian motion from $\mathbb{R}^n$ to $M$, the SDR model is defined as the transport of a Euclidean regression model. A Euclidean regression model can be regarded as a time-dependent model in which, potentially, several observations have been made over time. Given a response variable $y_t \in \mathbb{R}^d$ and covariate vector $x_t = (x_t^1, \dots, x_t^m) \in \mathbb{R}^m$, the Euclidean regression model can be written as

$$ y_t = \alpha_t + \beta_t x_t + \varepsilon_t, \quad t \in [0, 1], \tag{1} $$

where $\alpha_t \in \mathbb{R}^d$ and $\beta_t \in \mathbb{R}^{d \times m}$. A regression model can hence be defined as a stochastic process with drift $\alpha_t$, covariate dependency through $\beta_t x_t$, and Brownian noise $\varepsilon_t$. The SDR model is then defined as the transport of a regression model of the form (1) from $\mathbb{R}^d$ to the manifold $M$. The transport is performed by stochastic development, described in Section 2. Fig. 1 visualizes the idea behind the model.

Fig. 1. The idea behind the model. The normal linear regression process $z_t^i$ defined in (1) is transported to the manifold through stochastic development, $\varphi$. Here $FM$ is the frame bundle, $\pi$ a projection map, and $D_{y_{i1}}$ the transition distribution of $y_{it} = \pi(\varphi(z_t^i))$. The tangent bundle of $FM$ can be split into a horizontal and a vertical subspace. Changes on $FM$ in the vertical direction correspond to fixing a point $y \in M$ while changing the frame, $\nu$, of the tangent space $T_yM$. Changes in the horizontal direction correspond to fixing the frame for the tangent space and changing the point on the manifold. The frame is in this case parallel transported to the new tangent space.

In [10], Laplace approximation was applied for estimation of the parameter vector. However, this method was computationally expensive, and it was difficult to obtain results for detailed shapes. Alternatively, a Monte Carlo Expectation Maximization (MCEM) method has been considered, but with this method high-probability samples were hard to obtain, which led to an unstable objective function.
As a consequence, this paper examines the Method of Moments (MM) procedure for parameter estimation. The MM procedure is easy to apply and not as computationally expensive as the Laplace approximation. It is a well-known method for estimation in Euclidean statistics (see for example [14, 6, 3]), where it has been proven in general to provide consistent parameter estimates.

Several versions of the generalized regression model have been proposed in the case of manifold-valued response and Euclidean covariate variables. Local regression is considered in [19, 11]. The former defines an intrinsic local regression model, while [11] constructs an extrinsic model. For global regression models, [5, 12, 16] consider geodesic regression, which is a generalization of the Euclidean linear regression model. There have been several approaches for defining non-geodesic regression models on manifolds. An example is kernel-based regression models, in which the model function is estimated by a kernel representation [1, 4, 13]. In [8, 7, 17], the non-geodesic relation is modelled by a polynomial or piecewise cubic spline function. Moreover, [15, 2] propose estimation of a parametric link function by minimization of the total residual sum of squares and by the generalized method of moments procedure, respectively.

The paper is structured as follows. Section 2 gives a brief description of stochastic development and the frame bundle $FM$. Section 3 introduces the SDR model, and Section 4 describes the estimation procedure, the Method of Moments. Finally, a simulation example is presented in Section 5.

2 Stochastic Development

This section gives a brief introduction to the frame bundle and stochastic development. For a more detailed description and a reference for the following, see [9]. Consider a $d$-dimensional Riemannian manifold $(M, g)$ and a probability space $(\Omega, \mathcal{F}, P)$. Stochastic development is a method for transporting stochastic processes in $\mathbb{R}^d$ to stochastic processes on $M$.
Let $z_t : \Omega \to \mathbb{R}^d$ denote a stochastic process for $t \in [0, 1]$. In order to define the stochastic development of $z_t$, it is necessary to consider a connection on $M$. A connection, $\nabla$, defines transportation of vectors along curves on the manifold, such that tangent vectors in different tangent spaces can be compared. A frequently used connection, which will also be used in this paper, is the Levi-Civita connection of a Riemannian metric. Consider a point $q \in M$ and let $\partial_i$ for $i = 1, \dots, d$ denote a coordinate frame at $q$, i.e. an ordered basis for $T_qM$, with dual frame $dx^i$. A connection $\nabla$ is locally determined by the Christoffel symbols defined by $\nabla_{\partial_i} \partial_j = \Gamma^k_{ij} \partial_k$. The Christoffel symbols for the Levi-Civita connection are given by

$$ \Gamma^k_{ij} = \tfrac{1}{2} g^{kl} \left( \partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij} \right), $$

where $g_{ij}$ denotes the coefficients of the metric $g$ in the dual frame $dx^i$, i.e. $g = g_{ij}\, dx^i dx^j$, and $g^{ij}$ are the inverse coefficients.

Stochastic development uses the frame bundle, $FM$, defined as the fiber bundle of tuples $(y, \nu)$, $y \in M$, with $\nu : \mathbb{R}^d \to T_yM$ being the frame for the tangent space $T_yM$. Given a connection on $FM$, the tangent bundle of the frame bundle, $TFM$, can be split into a horizontal, $HFM$, and a vertical, $VFM$, subspace, i.e. $TFM = HFM \oplus VFM$. Fig. 1 shows a visualization of the frame bundle and the horizontal and vertical tangent spaces. The horizontal subspace determines changes in $y \in M$ while fixing the frame $\nu$, while $VFM$ fixes $y \in M$ and describes the change in the frame for $T_yM$. Given the split of the tangent bundle $TFM$, an isomorphism $\pi_{*,(y,\nu)} : H_{(y,\nu)}FM \to T_yM$ can be defined. The inverse map $\pi^*_{(y,\nu)}$ is called the horizontal lift and pulls a tangent vector in $T_yM$ to $H_{(y,\nu)}FM$. The horizontal lift of $v \in T_yM$ is here denoted $v^* \in H_{(y,\nu)}FM$. Let $e_1, \dots, e_d$ be the canonical basis of $\mathbb{R}^d$ and consider a point $(y, \nu) \in FM$. Define the horizontal vector fields, $H_1, \dots, H_d$, by $H_i(\nu) = (\nu e_i)^*$. The vector fields $H_1, \dots, H_d$ then form a basis for the subspace $HFM$.
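
The Christoffel-symbol formula for the Levi-Civita connection can be checked numerically on a small example. The sketch below is not taken from the paper: it assumes the unit 2-sphere in spherical coordinates (theta, phi) as an illustrative manifold, approximates the metric derivatives by central differences, and uses the hypothetical helper names sphere_metric, inverse_2x2 and christoffel.

```python
import math


def sphere_metric(q):
    """Round metric of the unit 2-sphere in coordinates q = (theta, phi).

    Illustrative choice only; the paper works on landmark manifolds.
    """
    theta, _ = q
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix (enough for this two-dimensional example)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, q, h=1e-5):
    """Return gamma with gamma[k][i][j] approximating Gamma^k_ij at q.

    Implements Gamma^k_ij = 1/2 g^{kl} (d_i g_jl + d_j g_il - d_l g_ij),
    with the partial derivatives replaced by central differences of step h.
    """
    d = len(q)
    # dg[a][i][j] approximates the derivative of g_ij along coordinate a.
    dg = []
    for a in range(d):
        q_plus, q_minus = list(q), list(q)
        q_plus[a] += h
        q_minus[a] -= h
        g_plus, g_minus = metric(q_plus), metric(q_minus)
        dg.append([[(g_plus[i][j] - g_minus[i][j]) / (2.0 * h)
                    for j in range(d)] for i in range(d)])
    g_inv = inverse_2x2(metric(q))
    gamma = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for k in range(d):
        for i in range(d):
            for j in range(d):
                gamma[k][i][j] = 0.5 * sum(
                    g_inv[k][l] * (dg[i][j][l] + dg[j][i][l] - dg[l][i][j])
                    for l in range(d))
    return gamma
```

For the sphere the closed forms are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$, so christoffel(sphere_metric, [1.0, 0.3]) can be compared entry by entry against these values.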
Given this basis for $HFM$, the stochastic development of a Euclidean stochastic process, $z_t$, to the frame bundle $FM$ can be found as the solution to the Stratonovich differential equation

$$ dU_t = H_i(U_t) \circ dz_t^i, $$

where Einstein's summation convention is used and $\circ$ specifies that it is a Stratonovich differential equation. The stochastic development of a process $z_t \in \mathbb{R}^d$ with reference point $(y, \nu)$ will be denoted $\varphi_{(y,\nu)}(z_t)$. A stochastic process on $M$ can then be obtained by projecting $U_t$ to $M$ with the projection map $\pi : FM \to M$.

3 Model

Consider a $d$-dimensional manifold $M$ equipped with a connection $\nabla$ and let $y_1, \dots, y_n$ be $n$ realizations of the response $y \in M$. Notice that the realizations are assumed to be measured with additive noise, which might pull the observations into an ambient space of $M$. An example of such additive noise for landmark data is given in Section 5. Denote for each observation $i = 1, \dots, n$ by $x_i = (x_{i1}, \dots, x_{im}) \in \mathbb{R}^m$ the covariate vector of $m \le d$ covariate variables. The SDR model is defined as a stochastic process on $M$ based on the definition of Euclidean regression models regarded as stochastic processes (see (1)). Assume therefore that the response $y \in M$ is the endpoint of a stochastic process $y_t$ in $M$ and that the covariates, $x_i$, are the endpoint of a stochastic process $X_t = (X_{1t}, \dots, X_{mt})$ in $\mathbb{R}^m$. For random covariate variables the process $X_{jt}$ is assumed to be a Brownian motion in $\mathbb{R}$, while for fixed covariate effects it is modelled as a fixed drift. The process $y_{it}$ for each observation $i = 1, \dots, n$ is defined as the stochastic development of a Euclidean model on $\mathbb{R}^m$. Consider the stochastic process, $z_{it}$, in $\mathbb{R}^m$ defined by the stochastic differential equation equivalent to the Euclidean regression model defined in (1),

$$ dz_{it} = \alpha\, dt + W\, dX_{it} + d\varepsilon_{it}, \quad t \in [0, 1]. \tag{2} $$

Here $\alpha\, dt$ is a fixed drift, $W$ the $m \times m$ coefficient matrix, and $\varepsilon_{it}$ the random error modelled as a Brownian motion in $\mathbb{R}^m$.
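
To make the construction concrete, the following sketch discretizes the driving process (2) with an Euler scheme and develops the increments onto a manifold. It is an illustration under stated assumptions, not the authors' implementation: the manifold is the unit sphere $S^2$ embedded in $\mathbb{R}^3$ (chosen because geodesics and parallel transport have closed forms), the covariate is treated as a fixed effect with $X_t = t\,x$, and the Stratonovich equation is approximated by developing a piecewise-linear path, so each step follows a geodesic while the frame is parallel transported. The names euclidean_increments and develop_on_sphere are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def euclidean_increments(alpha, W, x, n_steps, noise_sd=0.0, rng=None):
    """Euler increments of dz = alpha dt + W dX + d(eps) on [0, 1].

    Assumption: fixed covariate effect, so X_t = t * x and dX = x dt.
    eps is a Brownian motion scaled by noise_sd (0 disables the noise).
    """
    rng = rng or random.Random(0)
    dt = 1.0 / n_steps
    m = len(alpha)
    drift = [alpha[r] + dot(W[r], x) for r in range(m)]
    return [[drift[r] * dt + noise_sd * math.sqrt(dt) * rng.gauss(0.0, 1.0)
             for r in range(m)] for _ in range(n_steps)]


def develop_on_sphere(y0, frame0, increments):
    """Piecewise-geodesic approximation of the development onto S^2.

    y0: unit vector in R^3 (reference point).
    frame0: two orthonormal tangent vectors at y0 (reference frame).
    Returns the projected path on the sphere, i.e. the points only.
    """
    y = list(y0)
    frame = [list(e) for e in frame0]
    path = [list(y)]
    for dz in increments:
        # Tangent step expressed through the current frame: v = nu(dz).
        v = [sum(frame[i][c] * dz[i] for i in range(2)) for c in range(3)]
        t = math.sqrt(dot(v, v))
        if t > 0.0:
            u = [c / t for c in v]
            # Geodesic step from y in direction u with length t.
            y_new = [math.cos(t) * y[c] + math.sin(t) * u[c] for c in range(3)]
            # Parallel transport of each frame vector along that geodesic.
            new_frame = []
            for e in frame:
                a = dot(e, u)
                new_frame.append([
                    e[c] + a * ((math.cos(t) - 1.0) * u[c] - math.sin(t) * y[c])
                    for c in range(3)])
            y, frame = y_new, new_frame
        path.append(list(y))
    return path
```

With zero noise and a pure drift, the developed path is a geodesic: developing the drift $(\pi/2, 0)$ from the north pole with frame $(e_1, e_2)$ ends at the point $(1, 0, 0)$ on the equator.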
The response process $y_{it}$ is then given as the stochastic development of $z_{it}$, i.e. $y_{it} = \varphi_{(y_0,\nu_0)}(z_{it})$ for a reference point $y_0$ and frame $\nu_0 \in T_{y_0}M$ (see Fig. 1). The realizations are modelled as noisy observations of the endpoints of $y_{it}$, $y_i = y_{i1} + \tilde\varepsilon_i$, in which $\tilde\varepsilon_i \sim N(0, \tau^2 I)$ denotes i.i.d. additive noise.

There is a natural relation between $W$ and the frame $\nu_0$. If $\nu_0$ is assumed to be an orthonormal basis and $U$ is the $d \times m$ matrix whose columns are the basis vectors of $\nu_0$, then the matrix $\tilde W = UW$ expresses the combined effect of $W$ and $\nu_0$ through $U$. However, this decomposition is not unique, and hence the matrix $\tilde W$ is estimated instead of $U$ and $W$ individually.

4 Method of Moments

In this section the MM procedure is introduced for the estimation of the parameters in the regression model. The MM procedure uses known moment conditions to define a set of equations which can be optimized to find the true parameter vector $\theta = (\tau, \alpha, \tilde W, y_0)$, see [14, 6, 3]. Here $\tau^2$ is the additive noise variance, $\alpha$ the drift, $\tilde W$ the combined effect of covariates and $\nu_0$, and $y_0$ the initial point on $M$. In the SDR model the known moment conditions are based on the moments of the additive noise $\tilde\varepsilon_i$ and the fact that $\tilde\varepsilon_i$ is independent of the covariate variables $x_{ik}$ for each $k = 1, \dots, m$. Hence, the moment conditions are

$$ E[\tilde\varepsilon_{ij}] = 0, \quad E[\tilde\varepsilon_{ij} x_{ik}] = 0, \quad E[\tilde\varepsilon_{ij}^2] = \tau^2, \quad \forall j = 1, \dots, d \text{ and } k = 1, \dots, m. $$

Known consistent estimators for these moments are the sample means. Consider the residuals, $\hat\varepsilon_{ij} = y_{ij} - \hat y_{ij}$, which depend on the parameter vector $\theta$ through the predictions $\hat y_{ij}$ for $i = 1, \dots, n$, $j = 1, \dots, d$. For a proper choice of parameter vector $\theta$, the sample means will approach the true moments. Therefore, the set of equations used to optimize the parameter vector $\theta$ is

$$ \frac{1}{n} \sum_{i=1}^{n} \hat\varepsilon_{ij} = 0, \quad \frac{1}{n} \sum_{i=1}^{n} x_{ik} \hat\varepsilon_{ij} = 0, \quad \text{and} \quad \frac{1}{n-2} \sum_{i=1}^{n} \hat\varepsilon_{ij}^2 = \hat\tau^2, $$

for all $j = 1, \dots, d$ and $k = 1, \dots, m$, where $\hat\tau^2$ is the estimated variance.
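
The three sample-moment equations can be turned into a single scalar objective by averaging their squared violations, which is the form of the cost function (3). A minimal sketch follows, assuming the residuals and covariates are given as plain nested lists; the function name moment_cost and its argument layout are illustrative choices, not part of the paper.

```python
def moment_cost(residuals, covariates, tau_hat):
    """Averaged squared violation of the three sample-moment equations.

    residuals[i][j]  : eps_hat_ij = y_ij - y_hat_ij, i = 1..n, j = 1..d
    covariates[i][k] : x_ik, k = 1..m
    tau_hat          : current estimate of the noise standard deviation
    """
    n = len(residuals)
    d = len(residuals[0])
    m = len(covariates[0])

    # First moment: (1/n) sum_i eps_hat_ij should vanish for every j.
    mean_term = sum(
        (sum(residuals[i][j] for i in range(n)) / n) ** 2
        for j in range(d)) / d

    # Cross moment: (1/n) sum_i x_ik eps_hat_ij should vanish for every j, k.
    cross_term = sum(
        (sum(covariates[i][k] * residuals[i][j] for i in range(n)) / n) ** 2
        for j in range(d) for k in range(m)) / (d * m)

    # Second moment: (1/(n-2)) sum_i eps_hat_ij^2 should equal tau_hat^2.
    var_term = sum(
        (sum(residuals[i][j] ** 2 for i in range(n)) / (n - 2) - tau_hat ** 2) ** 2
        for j in range(d)) / d

    return mean_term + cross_term + var_term
```

In an estimation loop the residuals would be recomputed from model predictions at each candidate $\theta$; since the predictions are themselves random, the value returned here would typically be averaged over several predictions before taking a gradient step.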
In Euclidean statistics, the method of moments is known to provide consistent estimators, but these estimators might be biased. The cost function considered for optimization with respect to $\theta$ is

$$ f(\theta) = \frac{1}{d} \sum_{j} \left( \frac{1}{n} \sum_{i=1}^{n} \hat\varepsilon_{ij} \right)^2 + \frac{1}{dm} \sum_{j,k} \left( \frac{1}{n} \sum_{i=1}^{n} x_{ik} \hat\varepsilon_{ij} \right)^2 + \frac{1}{d} \sum_{j} \left( \frac{1}{n-2} \sum_{i=1}^{n} \hat\varepsilon_{ij}^2 - \hat\tau^2 \right)^2. \tag{3} $$

This cost function depends on predictions from the model based on the given parameter vector in each iteration. Because the predictions are random, a single evaluation of the objective is noisy. The function is therefore averaged over several predictions to obtain a more stable gradient descent optimization procedure. The initial value of $\theta$ can in practice be chosen as parameters estimated from a Euclidean multivariate linear regression model. Here, the estimated covariance matrix would resemble the $\tilde W$ effect and the intercept the initial point $y_0$.

5 Simulation Example

The performance of the estimation procedure will be evaluated using simulated data. We will generate landmark data on Riemannian landmark manifolds as defined in the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework [18], and use the Levi-Civita connection. Shapes in the landmark manifold $M$ are defined by a finite landmark representation, i.e. $q \in M$, $q = (x_1^1, x_1^2, \dots, x_{n_l}^1, x_{n_l}^2)$, where $n_l$ denotes the number of landmarks. The dimension of $M$ is hence $d = 2 n_l$. Using a kernel $K$, the Riemannian metric on $M$ is defined as $g(v, w) = \sum_{i,j}^{n_l} v K^{-1}(x_i, x_j) w$, with $K^{-1}$ denoting the inverse of the kernel matrix. In the following, we use a Gaussian kernel for $K$ with standard deviation $\sigma = 0.1$.

We will consider a single covariate variable $x \in \mathbb{R}$ drawn from $N(0, 36)$ and model its relation to two response variables with either 1 or 3 landmarks. The response variables are simulated from a model with parameters given in Table 1 and, for $n_l = 3$, in Fig. 2. Examples of simulated data for $n_l = 1$ and 3 are shown in Fig. 2.
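
The kernel-based landmark metric can be evaluated directly for small configurations. The sketch below is an illustration under assumptions that the text does not spell out: the Gaussian kernel is taken as $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / (2\sigma^2))$, the sum is read as $\sum_{i,j} \langle v_i, w_j \rangle K^{-1}(x_i, x_j)$ with $v_i, w_j \in \mathbb{R}^2$ the components attached to landmarks $i$ and $j$, and the inverse is applied by solving a linear system. All function names are hypothetical.

```python
import math


def gaussian_kernel_matrix(landmarks, sigma=0.1):
    """K[i][j] = exp(-|x_i - x_j|^2 / (2 sigma^2)) for 2-D landmarks.

    The exact kernel parameterization is an assumption of this sketch.
    """
    return [[math.exp(-((xi[0] - xj[0]) ** 2 + (xi[1] - xj[1]) ** 2)
                      / (2.0 * sigma ** 2))
             for xj in landmarks] for xi in landmarks]


def solve(A, b):
    """Solve A a = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for cc in range(c, n + 1):
                M[r][cc] -= f * M[c][cc]
    a = [0.0] * n
    for r in range(n - 1, -1, -1):
        a[r] = (M[r][n] - sum(M[r][cc] * a[cc] for cc in range(r + 1, n))) / M[r][r]
    return a


def landmark_metric(landmarks, v, w, sigma=0.1):
    """g(v, w) = sum_{i,j} <v_i, w_j> K^{-1}(x_i, x_j).

    landmarks, v, w: lists of 2-D points/vectors, one entry per landmark.
    """
    K = gaussian_kernel_matrix(landmarks, sigma)
    total = 0.0
    for c in range(2):  # the scalar kernel acts on each coordinate separately
        a = solve(K, [w_j[c] for w_j in w])  # a = K^{-1} w^(c)
        total += sum(v_i[c] * a_i for v_i, a_i in zip(v, a))
    return total
```

For a single landmark the kernel matrix is the 1x1 identity, so the metric reduces to the Euclidean inner product. With $\sigma = 0.1$, landmarks further apart than a few tenths of a unit are almost decoupled, while nearby landmarks interact strongly through $K^{-1}$.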
The additive noise is in this case normally distributed i.i.d. random noise added to each coordinate of the landmarks. In this example we consider a simplification of the model, as the random error in $z_{it}$, given in (2), is disregarded. Estimation of parameters is examined for three different models: one without additive noise and drift, one without drift, and finally the full model. For $n_l = 3$ only estimation of the first two models is studied, and estimation in the model with no drift has been considered for $n = 70$ and $n = 150$.

[Figure 2: six plot panels, described in the caption below.]

Fig. 2. (upper left) Sample drawn from the model without additive noise and drift. (upper center) Sample drawn with additive noise, but no drift. (upper right) Sample drawn from the full model. The vertical lines are the stochastic development of $z_{it}$, the horizontal lines correspond to the additive noise, and the blue point is the reference point. (lower left) Model without drift and variance for $n_l = 3$, $n = 70$. (lower center) Model without drift and $n = 70$. (lower right) Model without drift and $n = 150$. These plots show the estimated results: (red) initial, (green) true, and (black) estimated reference point and frame. The gray samples are predicted from the estimated model, while the green samples are a subset of the simulated data. The lower right plot also shows the difference in the estimated parameters between $n = 70$ and $n = 150$ for the model with no drift. The magenta parameters in that plot are the estimated parameters for the model without drift and $n = 70$, corresponding to the black parameters in the lower center plot.

By the results shown in Table 1 and Fig. 2, the procedure makes a good estimate of the frame matrix $\tilde W$ in every situation. For the model with no additive noise and no drift, the procedure finds a reasonable estimate of $y_0$. When noise is added, a larger sample size is needed in order to get a good estimate of $y_0$. In contrast, the variance estimate seems biased in each case. For $n_l = 3$ the estimated variance parameters were $\hat\tau = 0.306$ for $n = 70$ and $\hat\tau = 0.231$ for $n = 150$. However, when drift is added to the model, the estimation procedure has difficulty recovering the true values of $y_0$ and $\alpha$. This difficulty can be explained by the relation between the variables. In normal linear regression, only one intercept variable is present in the model, but in the SDR model this intercept variable is split between $\alpha$ and $y_0$.

Table 1. Parameter estimates found with the MM procedure for 1 landmark. The first column shows the true values; the remaining columns show the parameters estimated in each model.

Parameter   True     excl. τ, α (n = 70)   excl. α (n = 70)   excl. α (n = 150)   full model (n = 150)
τ           0.1      - (τ = 0)             0.256              0.226               0.207
α           40       - (α = 0)             - (α = 0)          - (α = 0)           37.19
W̃           (0, 2)   (0, 2.013)            (0.004, 1.996)     (0, 2.003)          (0, 2.004)
y0          (1, 0)   (1.064, 0.0438)       (1.158, 0.162)     (1.026, 0.0227)     (1.076, 2.708)

6 Conclusion

The Method of Moments procedure has been examined for parameter estimation in the stochastic development regression (SDR) model. The SDR model is a generalization of regression models on Euclidean space to manifold-valued data. This model analyzes the relation between a manifold-valued response and Euclidean covariate variables. The performance of the estimation procedure was studied based on a simulation example. The Method of Moments procedure was easier to apply and less computationally expensive than the Laplace approximation considered in [10]. The estimates found for the frame parameters were reasonable, but the procedure had difficulty retrieving the reference point and drift parameter.
This is due to a mis-specification of the model: the reference point and drift parameter jointly correspond to the intercept in normal Euclidean regression models, and hence there is no unique split of these parameters. For further investigation, it could be interesting to test the relation between the reference point and drift parameter in order to retrieve good estimates of these parameters. In the Euclidean case, the Method of Moments procedure has been shown to provide consistent, but sometimes biased, estimates. An interesting question for future work is therefore whether the parameter estimates in this model are consistent and whether they are biased.

Acknowledgements. This work was supported by the CSGB Centre for Stochastic Geometry and Advanced Bioimaging funded by a grant from the Villum foundation.

References

1. M. Banerjee, R. Chakraborty, E. Ofori, M. S. Okun, D. E. Vaillancourt, and B. C. Vemuri. A Nonlinear Regression Technique for Manifold Valued Data with Applications to Medical Image Analysis. In 2016 IEEE Conference on CVPR, pages 4424–4432, June 2016.
2. E. Cornea, H. Zhu, P. Kim, J. G. Ibrahim, and the Alzheimer's Disease Neuroimaging Initiative. Regression models on Riemannian symmetric spaces. Journal of the Royal Statistical Society: Series B, 79:463–482, March 2017.
3. J. G. Cragg. Using Higher Moments to Estimate the Simple Errors-in-Variables Model. The RAND Journal of Economics, 28:S71–S91, 1997.
4. B. C. Davis, P. T. Fletcher, E. Bullitt, and S. Joshi. Population Shape Regression From Random Design Data. In 2007 IEEE 11th International Conference on Computer Vision, pages 1–7, October 2007.
5. P. T. Fletcher. Geodesic Regression and the Theory of Least Squares on Riemannian Manifolds. Int. Journal of Computer Vision, 105:171–185, November 2012.
6. M. L. Hazelton. Methods of Moments Estimation. In International Encyclopedia of Statistical Science, pages 816–817. Springer Berlin Heidelberg, 2011.
7. J. Hinkle, P. Muralidharan, P. T. Fletcher, and S. Joshi. Polynomial Regression on Riemannian Manifolds. arXiv: 1201.2395, January 2012.
8. Y. Hong, R. Kwitt, N. Singh, N. Vasconcelos, and M. Niethammer. Parametric Regression on the Grassmannian. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(11):2284–2297, November 2016.
9. E. P. Hsu. Stochastic Analysis on Manifolds. American Mathematical Soc., 2002.
10. L. Kühnel and S. Sommer. Stochastic Development Regression on Non-Linear Manifolds. IPMI Conference, 2017. arXiv: 1703.00291.
11. L. Lin, B. St Thomas, H. Zhu, and D. B. Dunson. Extrinsic local regression on manifold-valued data. arXiv: 1508.02201, August 2015.
12. M. Niethammer, Y. Huang, and F.-X. Vialard. Geodesic Regression for Image Time-Series. MICCAI, 14(Pt 2):655–662, 2011.
13. J. Nilsson, F. Sha, and M. I. Jordan. Regression on Manifolds Using Kernel Dimension Reduction. In Proceedings of the 24th ICML, pages 697–704. ACM, 2007.
14. M. Pal. Consistent moment estimators of regression coefficients in the presence of errors in variables. Journal of Econometrics, 14(3):349–364, December 1980.
15. X. Shi, M. Styner, J. Lieberman, J. G. Ibrahim, W. Lin, and H. Zhu. Intrinsic regression models for manifold-valued data. MICCAI, 12(Pt 2):192–199, 2009.
16. N. Singh, J. Hinkle, S. Joshi, and P. T. Fletcher. Hierarchical Geodesic Models in Diffeomorphisms. Int. Journal of Computer Vision, 117:70–92, March 2016.
17. N. Singh, F.-X. Vialard, and M. Niethammer. Splines for diffeomorphisms. Medical Image Analysis, 25(1):56–71, October 2015.
18. L. Younes. Shapes and Diffeomorphisms. Springer, 2010.
19. Y. Yuan, H. Zhu, W. Lin, and J. S. Marron. Local Polynomial Regression for Symmetric Positive Definite Matrices. Journal of the Royal Statistical Society. Series B, Statistical methodology, 74(4):697–719, September 2012.