An extrinsic look at the Riemannian Hessian

License: Creative Commons None (All rights reserved)
Pierre-Antoine Absil, Robert Mahony, Jochen Trumpf. An extrinsic look at the Riemannian Hessian. SEE, 2013. DOI: 10.23723/2552/5061.

An extrinsic look at the Riemannian Hessian

Pierre-Antoine Absil (UCLouvain), Robert Mahony (Australian National University), Jochen Trumpf (Australian National University)
GSI 2013, Paris, 29 August 2013
Easy-to-implement Newton-type optimization methods on manifolds: an extrinsic look at the Riemannian Hessian.

Broader topic: easy-to-implement Newton-type methods on manifolds

Given:
◮ A manifold $\mathcal{M}$, i.e., a set endowed (often implicitly) with a manifold structure (i.e., a collection of compatible charts).
◮ A function $f : \mathcal{M} \to \mathbb{R}$, smooth in the sense of the manifold structure.
Task: compute a local minimizer of $f$.
Approach: Newton-type methods on manifolds.

Some specific manifolds and related applications I

◮ Stiefel manifold $\mathrm{St}(p, n)$ and orthogonal group $\mathrm{O}_n = \mathrm{St}(n, n)$:
$$\mathrm{St}(p, n) = \{X \in \mathbb{R}^{n \times p} : X^T X = I_p\}.$$
Applications: computer vision; principal component analysis; independent component analysis...
◮ Grassmann manifold $\mathrm{Gr}(p, n)$: the set of all $p$-dimensional subspaces of $\mathbb{R}^n$.
Applications: various dimensionality reduction problems...

Some specific manifolds and related applications II

◮ Low-rank symmetric PSD manifold: $\mathbb{R}^{n \times p}_* / \mathrm{O}_p \simeq \{YY^T : Y \in \mathbb{R}^{n \times p}_*\}$, where $\mathbb{R}^{n \times p}_*$ is the set of all full-rank $n \times p$ matrices.
Applications: low-rank approximation of positive-definite matrices, e.g., for metric learning.
◮ Fixed-rank manifold: $\mathcal{M}_p(m \times n) = \{X \in \mathbb{R}^{m \times n} : \operatorname{rank}(X) = p\}$.
Applications: low-rank approximation of matrices, e.g., for recommender systems.
◮ Shape manifolds.
Applications: shape analysis, e.g., for medical applications.

Some specific manifolds and related applications III

◮ Oblique manifold $\mathbb{R}^{n \times p}_* / \mathcal{S}_{\mathrm{diag}+} \simeq \{Y \in \mathbb{R}^{n \times p}_* : \operatorname{diag}(Y^T Y) = I_p\}$, the quotient by positive diagonal scalings.
Applications: independent component analysis; factor analysis (oblique Procrustes problem)...
◮ Flag manifold $\mathbb{R}^{n \times p}_* / \mathcal{S}_{\mathrm{upp}*}$, the quotient by invertible upper triangular matrices. Elements of the flag manifold can be viewed as $p$-tuples of linear subspaces $(V_1, \ldots, V_p)$ such that $\dim(V_i) = i$ and $V_i \subset V_{i+1}$.
Applications: analysis of the QR algorithm...

Topic of talk: easy-to-implement Newton-type methods on manifolds. As stated above, given a manifold $\mathcal{M}$ and a function $f : \mathcal{M} \to \mathbb{R}$ that is smooth in the sense of the manifold structure, compute a local minimizer of $f$ by Newton-type methods.

Reminder: Newton's method in $\mathbb{R}^n$

Required: a (smooth) real-valued function $f$ on $\mathbb{R}^n$.
Iteration $x_k \in \mathbb{R}^n \to x_{k+1} \in \mathbb{R}^n$ defined by:
1. Solve the Newton equation $\operatorname{Hess} f(x_k) \cdot \eta_k = -\partial f(x_k)$ for the unknown $\eta_k \in T_{x_k}\mathbb{R}^n \simeq \mathbb{R}^n$, where $\partial f(x) := (\partial_1 f(x), \ldots, \partial_n f(x))^T$ and, for all $z_x \in T_x \mathbb{R}^n$, $\operatorname{Hess} f(x) \cdot z_x := \mathrm{D}_{z_x}(\partial f)$, the directional derivative of the gradient field $\partial f$ along $z_x$.
2. Set $x_{k+1} := x_k + \eta_k$.

Newton on Riemannian submanifolds (with Levi-Civita connection)

Required: a Riemannian submanifold $\mathcal{M}$ of a Euclidean space $\mathcal{E}$; a retraction $R$ on $\mathcal{M}$; a real-valued function $f$ on $\mathcal{M}$; an extension $\bar{f}$ of $f$ to $\mathcal{E}$.
Iteration $x_k \in \mathcal{M} \to x_{k+1} \in \mathcal{M}$ defined by:
1. Solve the Newton equation $\operatorname{Hess} f(x_k) \cdot \eta_k = -\operatorname{grad} f(x_k)$ for the unknown $\eta_k \in T_{x_k}\mathcal{M}$, where $\operatorname{grad} f(x) = P_x(\partial\bar{f}(x))$, with $P_x$ the orthogonal projector onto $T_x\mathcal{M}$, and, for all $z_x \in T_x\mathcal{M}$, $\operatorname{Hess} f(x) \cdot z_x := P_x \mathrm{D}_{z_x}(\operatorname{grad} f)$.
2. Set $x_{k+1} := R(x_k, \eta_k)$.
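To make the iteration concrete, here is a minimal sketch (added for illustration, not from the slides) of Riemannian Newton on the unit sphere $S^{n-1} \subset \mathbb{R}^n$ for the Rayleigh quotient $\bar{f}(x) = x^T A x$, where $P_x = I - xx^T$. The symmetric matrix $A$, the random seed, and the projection retraction are all assumptions of the sketch. The Newton equation is solved in an orthonormal basis of $T_x\mathcal{M}$, which turns it into an $(n-1) \times (n-1)$ linear system.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                      # symmetric test matrix (assumption)

def proj(x, u):                        # P_x u = (I - x x^T) u, projector onto T_x S^{n-1}
    return u - x * (x @ u)

def grad(x):                           # grad f(x) = P_x dbarf(x), with dbarf(x) = 2 A x
    return proj(x, 2 * A @ x)

def hess(x, z):                        # Hess f(x) z = P_x D_z(grad f), as on the slide
    rho = x @ A @ x
    dgrad = 2 * (A @ z - (z @ A @ x + x @ A @ z) * x - rho * z)
    return proj(x, dgrad)

def retract(x, eta):                   # metric projection back onto the sphere
    y = x + eta
    return y / np.linalg.norm(y)

x = rng.standard_normal(n)
x /= np.linalg.norm(x)
for _ in range(8):
    # An orthonormal basis Q of T_x M turns the Newton equation into an
    # (n-1) x (n-1) linear system in tangent coordinates.
    Q = np.linalg.qr(x.reshape(-1, 1), mode='complete')[0][:, 1:]
    H = np.column_stack([Q.T @ hess(x, Q[:, i]) for i in range(n - 1)])
    eta, *_ = np.linalg.lstsq(H, -(Q.T @ grad(x)), rcond=None)
    x = retract(x, Q @ eta)

print("f(x) =", x @ A @ x, "  |grad f| =", np.linalg.norm(grad(x)))
```

With this $f$, the stationary points are the eigenvectors of $A$, so the iteration drives $\|\operatorname{grad} f\|$ to zero, typically at the quadratic rate expected of Newton's method near a nondegenerate critical point.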
An extrinsic look at the Riemannian Hessian

For all $z_x \in T_x\mathcal{M}$ (and dropping the subscript $x$ to lighten the notation), we have
$$\operatorname{Hess} f \cdot z = P_x \mathrm{D}_z(\operatorname{grad} f) = P_x \mathrm{D}_z(P\,\partial\bar{f}) = P_x\big(P_x\,\partial^2\bar{f}(x)\,z + (\mathrm{D}_z P)\,\partial\bar{f}(x)\big) = P_x\,\partial^2\bar{f}(x)\,z + P_x (\mathrm{D}_z P)\,\partial\bar{f}(x),$$
by the product rule, using $\mathrm{D}_z(\partial\bar{f}) = \partial^2\bar{f}(x)\,z$ and $P_x P_x = P_x$.

An extrinsic look at the Riemannian Hessian

Recall: $\operatorname{Hess} f \cdot z = P_x\,\partial^2\bar{f}(x)\,z + P_x (\mathrm{D}_z P)\,\partial\bar{f}(x)$.
◮ Theorem:
$$P_x (\mathrm{D}_z P)\,u = P_x (\mathrm{D}_z P)\,P_x^\perp u = -P_x \mathrm{D}_z(P^\perp U) =: \mathfrak{A}_x(z, P_x^\perp u),$$
for all $x \in \mathcal{M}$, $z \in T_x\mathcal{M}$, $u \in T_x\mathcal{E} \simeq \mathcal{E}$, and all extensions $U$ of $u$. The symbol $\mathfrak{A}_x$ stands for the Weingarten map of the submanifold $\mathcal{M}$ of the Euclidean space $\mathcal{E}$.
◮ Proof: For the first equality, observe that $0 = P P^\perp$, hence
$$0 = \mathrm{D}_z(P P^\perp) = (\mathrm{D}_z P)\,P_x^\perp + P_x\,(\mathrm{D}_z P^\perp) = (\mathrm{D}_z P)\,P_x^\perp - P_x\,(\mathrm{D}_z P),$$
since $\mathrm{D}_z P^\perp = -\mathrm{D}_z P$. Multiplying by $P_x$ on the left and using the identity $P_x P_x = P_x$ yields $P_x (\mathrm{D}_z P) = P_x (\mathrm{D}_z P)\,P_x^\perp$. For the second equality, observe that, for all extensions $U$ of $u$,
$$-P_x \mathrm{D}_z(P^\perp U) = -P_x (\mathrm{D}_z P^\perp)\,u - P_x P_x^\perp\,\mathrm{D}_z U = -P_x (\mathrm{D}_z P^\perp)\,u = P_x (\mathrm{D}_z P)\,u.$$

An extrinsic look at the Riemannian Hessian: the Stiefel manifold

Recall: $\operatorname{Hess} f \cdot z = P_x\,\partial^2\bar{f}(x)\,z + P_x (\mathrm{D}_z P)\,\partial\bar{f}(x)$.
◮ The Stiefel manifold $\mathrm{St}(p, n)$ is the set of orthonormal $p$-frames in $\mathbb{R}^n$: $\mathrm{St}(p, n) = \{X \in \mathbb{R}^{n \times p} : X^T X = I_p\}$.
◮ The orthogonal projector $P_X$ onto $T_X \mathrm{St}(p, n)$ is given by
$$P_X U = (I - XX^T)\,U + X\,\tfrac{1}{2}(X^T U - U^T X) = U - X\,\tfrac{1}{2}(X^T U + U^T X).$$
◮ Let $Z \in T_X\mathcal{M}$ and $W \in T_X^\perp\mathcal{M}$. We have
$$P_X (\mathrm{D}_Z P)\,W = -Z X^T W - X\,\tfrac{1}{2}(Z^T W + W^T Z).$$

An extrinsic look at the Riemannian Hessian: the Grassmann manifold

Recall: $\operatorname{Hess} f \cdot z = P_x\,\partial^2\bar{f}(x)\,z + P_x (\mathrm{D}_z P)\,\partial\bar{f}(x)$.
◮ The Grassmann manifold $\mathrm{Gr}_{m,n}$ is the set of $m$-dimensional subspaces of $\mathbb{R}^n$. Equivalently, it can be viewed as the set of rank-$m$ orthogonal projectors in $\mathbb{R}^n$, i.e.,
$$\mathrm{Gr}_{m,n} = \{X \in \mathbb{R}^{n \times n} : X^T = X,\ X^2 = X,\ \operatorname{tr} X = m\}.$$
◮ It is known that $P_X = \operatorname{ad}_X^2$, with $\operatorname{ad}_X A := [X, A] := XA - AX$ and $\operatorname{ad}_X^2 := \operatorname{ad}_X \circ \operatorname{ad}_X$.
◮ We obtain that, for all $Z \in T_X \mathrm{Gr}_{m,n}$ and all $W \in T_X^\perp \mathrm{Gr}_{m,n}$,
$$P_X (\mathrm{D}_Z P)\,W = -\operatorname{ad}_X \operatorname{ad}_W Z.$$
One thereby recovers the Hessian formula given by Helmke et al. (arXiv:0709.2205).

An extrinsic look at the Riemannian Hessian: the fixed-rank manifold

Recall: $\operatorname{Hess} f \cdot z = P_x\,\partial^2\bar{f}(x)\,z + P_x (\mathrm{D}_z P)\,\partial\bar{f}(x)$.
◮ The fixed-rank manifold $\mathcal{M}_p(m \times n)$ is the set of all $m \times n$ matrices of rank $p$.
◮ Writing a thin SVD $X = U \Sigma V^T$, the projector $P_X$ onto $T_X \mathcal{M}_p(m \times n)$ is given by
$$P_X W = P_U W P_V + P_U^\perp W P_V + P_U W P_V^\perp = W P_V + P_U W - P_U W P_V,$$
where $P_U := UU^T$, $P_U^\perp := I - P_U$, and similarly for $P_V := VV^T$.
◮ Let $Z \in T_X \mathcal{M}_p(m \times n)$ and $W \in T_X^\perp \mathcal{M}_p(m \times n)$. We obtain
$$P_X (\mathrm{D}_Z P)\,W = W Z^T (X^+)^T + (X^+)^T Z^T W,$$
where $X^+$ denotes the Moore-Penrose pseudoinverse of $X$. Numerical sanity checks of these projector and Weingarten formulas are sketched below.
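As a quick numerical sanity check of the Stiefel formulas (an added sketch, not from the slides; the dimensions, the seed, and the smooth extension $P_Y U := U - Y\,\tfrac{1}{2}(Y^T U + U^T Y)$ used off the manifold are assumptions), one can compare the closed form for $P_X (\mathrm{D}_Z P)\,W$ against a finite difference of the projector field:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 7, 3
sym = lambda M: (M + M.T) / 2

# Smooth extension of the projector field: for X^T X = I_p this is exactly
# P_X U = U - X (X^T U + U^T X)/2 from the slide.
P = lambda Y, U: U - Y @ sym(Y.T @ U)

X, _ = np.linalg.qr(rng.standard_normal((n, p)))   # point on St(p, n)
Z = P(X, rng.standard_normal((n, p)))              # tangent vector at X
W = X @ sym(rng.standard_normal((p, p)))           # normal vector at X

# Closed form from the slide: -Z X^T W - X (Z^T W + W^T Z)/2
lhs = -Z @ (X.T @ W) - X @ sym(Z.T @ W)

# Finite difference of P along Z, projected back onto T_X St(p, n)
t = 1e-7
rhs = P(X, (P(X + t * Z, W) - P(X, W)) / t)

print(np.linalg.norm(lhs - rhs))   # ~1e-7: the two expressions agree
```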
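The same kind of check works for the Grassmann manifold in its projector representation (again an added sketch with assumed test data). Here $P_Y A := [Y, [Y, A]]$ extends $P_X = \operatorname{ad}_X^2$ smoothly off the manifold, and the derivative $\mathrm{D}_Z P\,W = [Z, [X, W]] + [X, [Z, W]]$ follows from the product rule, so no finite differencing is needed:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 2
ad = lambda X, A: X @ A - A @ X
sym = lambda M: (M + M.T) / 2

# Point of Gr_{m,n} represented as a rank-m orthogonal projector X = Q Q^T
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
X = Q @ Q.T

P = lambda Y, A: ad(Y, ad(Y, A))            # P_X = ad_X^2, extended off the manifold
Z = P(X, sym(rng.standard_normal((n, n))))  # tangent vector at X
G = sym(rng.standard_normal((n, n)))
W = G - P(X, G)                             # normal vector at X

# Product rule: D_Z P W = [Z, [X, W]] + [X, [Z, W]], then project with P_X
lhs = P(X, ad(Z, ad(X, W)) + ad(X, ad(Z, W)))
rhs = -ad(X, ad(W, Z))                      # slide's formula: -ad_X ad_W Z

print(np.linalg.norm(lhs - rhs))            # ~1e-15: the formulas agree
```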
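For the fixed-rank manifold, the projector can be extended near $X$ through the leading-$p$ singular subspaces of $Y$ (an assumption of this added sketch, valid while $\sigma_p(Y) > \sigma_{p+1}(Y)$), and the closed form involving the pseudoinverse $X^+$ can again be checked by finite differences:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 6, 5, 2

def proj(Y, W):
    # Extension of P_X via the leading-p singular subspaces of Y
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    Pu, Pv = U[:, :p] @ U[:, :p].T, Vt[:p].T @ Vt[:p]
    return W @ Pv + Pu @ W - Pu @ W @ Pv

X = rng.standard_normal((m, p)) @ rng.standard_normal((p, n))   # rank-p point
Z = proj(X, rng.standard_normal((m, n)))                        # tangent at X
G = rng.standard_normal((m, n))
W = G - proj(X, G)                                              # normal at X

# Closed form from the slide, with X^+ the Moore-Penrose pseudoinverse
Xp = np.linalg.pinv(X)
lhs = W @ Z.T @ Xp.T + Xp.T @ Z.T @ W

# Finite difference of the projector field along Z, projected back to T_X M
t = 1e-6
rhs = proj(X, (proj(X + t * Z, W) - proj(X, W)) / t)

print(np.linalg.norm(lhs - rhs))   # small (~t): the formulas agree
```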
Newton on Riemannian submanifolds (with Levi-Civita connection, repeated)

As on the earlier slide: solve the Newton equation $\operatorname{Hess} f(x_k) \cdot \eta_k = -\operatorname{grad} f(x_k)$ for the unknown $\eta_k \in T_{x_k}\mathcal{M}$, then set $x_{k+1} := R(x_k, \eta_k)$; the formulas above make $\operatorname{grad} f$ and $\operatorname{Hess} f$ explicit on each of these manifolds.

Take-home message

◮ There are several reasons to optimize a real-valued function on a manifold (Stiefel manifold, Grassmann manifold, fixed-rank manifold...).
◮ Newton's method is the archetypal second-order method.
◮ Newton's method on a submanifold: solve $\operatorname{Hess} f(x_k) \cdot \eta_k = -\operatorname{grad} f(x_k)$, then set $x_{k+1} := R(x_k, \eta_k)$.
◮ Recent results for the Hessian: $\operatorname{Hess} f \cdot z = P_x\,\partial^2\bar{f}(x)\,z + P_x (\mathrm{D}_z P)\,\partial\bar{f}(x)$, with formulas for $P_x$ and $P_x \mathrm{D}_z P$ on several specific manifolds. Ref: PAA, Mahony & Trumpf, http://sites.uclouvain.be/absil/2013.01
◮ Recent results for retractions: projection-like techniques to construct $R$. Ref: PAA, Malick, http://sites.uclouvain.be/absil/2010.038

A freely available toolbox for optimization on manifolds: www.manopt.org

Projection-like retractions on submanifolds

Ref: PAA, Malick, http://sites.uclouvain.be/absil/2010.038
◮ A retraction on $\mathcal{M}$ is a smooth mapping $R : T\mathcal{M} \to \mathcal{M}$ such that $R(x, 0_x) = x$ and $\left.\frac{\mathrm{d}}{\mathrm{d}t} R(x, tu)\right|_{t=0} = u$, for all $(x, u) \in T\mathcal{M}$.
◮ A retractor on a $d$-dimensional submanifold $\mathcal{M}$ of an $n$-dimensional Euclidean space $\mathcal{E}$ is a smooth mapping $D : T\mathcal{M} \to \mathrm{Gr}(n - d, \mathcal{E})$ such that, for all $x \in \mathcal{M}$, $D(x, 0)$ is transverse to $T_x\mathcal{M}$.
◮ Define the affine space $\mathcal{D}(x, u) = x + u + D(x, u)$, and let $R(x, u)$ be the point of $\mathcal{M} \cap \mathcal{D}(x, u)$ nearest to $x + u$.
[Figure: the affine space $\mathcal{D}(x, u)$ through $x + u$ meets $\mathcal{M}$ at $R(x, u)$.]
◮ Theorem: $R$ is a retraction on $\mathcal{M}$.

Projection-like retractions: orthographic retraction

[Figure: the orthographic retraction, the special case where $D(x, u)$ is the normal space $T_x^\perp\mathcal{M}$, so $R(x, u)$ is obtained from $x + u$ by moving along the normal directions at $x$.]

Projection-like retractions: orthographic retraction on the fixed-rank manifold

◮ Now $\mathcal{M} := \mathcal{M}_p(m \times n)$, the set of all $m \times n$ matrices of rank $p$.
◮ Let $X = U \begin{pmatrix} \Sigma_0 & 0 \\ 0 & 0 \end{pmatrix} V^T$, with $\Sigma_0 \in \mathbb{R}^{p \times p}$ the diagonal matrix of nonzero singular values, and let $Z = U \begin{pmatrix} A & C \\ B & 0 \end{pmatrix} V^T$ be in $T_X \mathcal{M}_p(m \times n)$.
◮ The orthographic retraction $R$ on $\mathcal{M}$ is given by
$$R(X, Z) = U \begin{pmatrix} \Sigma_0 + A & C \\ B & B(\Sigma_0 + A)^{-1} C \end{pmatrix} V^T = U \begin{pmatrix} \Sigma_0 + A \\ B \end{pmatrix} \begin{pmatrix} I & (\Sigma_0 + A)^{-1} C \end{pmatrix} V^T.$$
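The factored form is cheap to evaluate: it produces the retracted point directly as a product of rank-$p$ factors. The sketch below (added; random frames, dimensions, and seed are assumptions) builds $X$ and a tangent vector $Z$ in the $U$, $V$ coordinates of the slide, evaluates $R(X, tZ)$, and checks the two defining properties of a retraction numerically.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p = 6, 5, 2

# Full SVD frame: X = U [[Sigma0, 0], [0, 0]] V^T
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
S0 = np.diag(rng.uniform(1.0, 2.0, p))
X = U[:, :p] @ S0 @ V[:, :p].T

# Tangent vector Z = U [[A, C], [B, 0]] V^T
A = rng.standard_normal((p, p))
B = rng.standard_normal((m - p, p))
C = rng.standard_normal((p, n - p))
Z = (U[:, :p] @ A + U[:, p:] @ B) @ V[:, :p].T + U[:, :p] @ C @ V[:, p:].T

def retract(t):
    # Factored form of the orthographic retraction R(X, tZ) from the slide
    SA = S0 + t * A
    left = U @ np.vstack([SA, t * B])                                  # m x p
    right = np.hstack([np.eye(p), np.linalg.solve(SA, t * C)]) @ V.T   # p x n
    return left @ right

t = 1e-5
print(np.linalg.matrix_rank(retract(0.7)))          # p: the result stays on M
print(np.linalg.norm(retract(0.0) - X))             # 0: R(X, 0) = X
print(np.linalg.norm((retract(t) - X) / t - Z))     # ~t: d/dt R(X, tZ)|_{t=0} = Z
```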