Weakly Correlated Sparse Components with Nearly Orthonormal Loadings

Date: 28/10/2015
Publication: GSI2015
OAI: oai:www.see.asso.fr:11784:14322

Abstract

There are already a great number of highly efficient methods producing components with sparse loadings, which significantly facilitate the interpretation of principal component analysis (PCA). However, they produce either only orthonormal loadings, or only uncorrelated components, or, most frequently, neither of them. To overcome this weakness, we introduce a new approach to define sparse PCA, similar to the Dantzig selector idea already employed for regression problems. In contrast to the existing methods, the new approach makes it possible to achieve simultaneously nearly uncorrelated sparse components with nearly orthonormal loadings. The performance of the new method is illustrated on real data sets. It is demonstrated that the new method outperforms one of the most popular available methods for sparse PCA in terms of preservation of principal component properties.


Authors: Matthieu Genicot, Wen Huang, Nickolay Trendafilov



Licence

Creative Commons Attribution-ShareAlike 4.0 International

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/11784/14322</identifier><creators><creator><creatorName>Matthieu Genicot</creatorName></creator><creator><creatorName>Wen Huang</creatorName></creator><creator><creatorName>Nickolay Trendafilov</creatorName></creator></creators><titles>
            <title>Weakly Correlated Sparse Components with Nearly Orthonormal Loadings</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2015</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><subjects><subject>Dantzig selector</subject><subject>LASSO</subject><subject>Orthonormal and oblique component loadings matrices</subject><subject>Optimization on matrix manifolds</subject></subjects><dates>
	    <date dateType="Created">Sun 8 Nov 2015</date>
	    <date dateType="Updated">Wed 31 Aug 2016</date>
            <date dateType="Submitted">Fri 17 Aug 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">cf942035c65862b4fb6b19cae9fc819e145efe2c</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>24716</version>
        <descriptions>
            <description descriptionType="Abstract">
There are already a great number of highly efficient methods producing components with sparse loadings, which significantly facilitate the interpretation of principal component analysis (PCA). However, they produce either only orthonormal loadings, or only uncorrelated components, or, most frequently, neither of them. To overcome this weakness, we introduce a new approach to define sparse PCA, similar to the Dantzig selector idea already employed for regression problems. In contrast to the existing methods, the new approach makes it possible to achieve simultaneously nearly uncorrelated sparse components with nearly orthonormal loadings. The performance of the new method is illustrated on real data sets. It is demonstrated that the new method outperforms one of the most popular available methods for sparse PCA in terms of preservation of principal component properties.

</description>
        </descriptions>
    </resource>

Slides: Weakly Correlated Sparse Components with Nearly Orthonormal Loadings (GSI 2015)
Matthieu Genicot (Université Catholique de Louvain), Wen Huang (Université Catholique de Louvain), Nickolay T. Trendafilov (Open University)
October 30, 2015

Principal Components Analysis (PCA)

Goal of PCA: reducing high-dimensional data to a lower dimension, for visualization purposes or to reveal hidden patterns.

Express the data X ∈ ℝ^(n×p) in a new space: Y = XA
– linear combinations of all the p variables of X
– orthogonal loadings (A)
– uncorrelated components (Y)
⇒ difficulty to interpret the results

Motivations for sparse PCA

Gene expression analysis
– 20000 genes, ~200 samples
– the components can have a biological interpretation

Financial applications
– to manage the stocks efficiently
– every non-zero loading has a cost (e.g., a transaction cost)

⇒ Trade-off between statistical fidelity (i.e., variance explained) and interpretability/utility (i.e., number of variables used).
How to reduce the number of variables used for each component? ⇒ How to achieve sparseness?
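The two PCA properties listed above, orthonormal loadings A and uncorrelated components Y = XA, can be checked with a small numpy sketch (illustrative only, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
X -= X.mean(axis=0)                      # centre the data

# PCA loadings: eigenvectors of X^T X (proportional to the covariance matrix)
_, A = np.linalg.eigh(X.T @ X)
Y = X @ A                                # principal components

# A has orthonormal columns, and Y^T Y is diagonal (uncorrelated components)
print(np.allclose(A.T @ A, np.eye(6)))                    # True
print(np.allclose(Y.T @ Y, np.diag(np.diag(Y.T @ Y))))    # True
```

Sparse PCA relaxes exactly these two properties, which is why the slides measure how "nearly" each one is preserved.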
Other applications
– image processing, multiscale data processing, etc.

Problem formulation

Classic PCA problem:
maximize over aᵢ: f(aᵢ) = aᵢᵀRaᵢ, subject to aᵢᵀaᵢ = 1 and aᵢᵀaⱼ = 0 for i ≠ j.

Sparse optimizers a of f(a) with the ℓ1 norm for sparse PCA:
– Weighted form: max aᵀRa − τ‖a‖₁, for some τ > 0
– ℓ1-constrained form: max aᵀRa subject to ‖a‖₁ ≤ τ
– Function-constrained form: min ‖a‖₁ subject to aᵀRa ≥ f_max − ε, where f_max is the unconstrained maximum of aᵀRa (Trendafilov, 2014)

Problem formulation: optimization on the oblique manifold OB(p, r)

min over OB(p, r) of ‖A‖₁ + μ‖AᵀRA − D²‖²_F

Make the problem smooth: ‖A‖₁ ≈ Σᵢⱼ ( √(Aᵢⱼ² + ε²) − ε )

At the minimum, AᵀRA ≈ D², and then AᵀA ≈ I.

Final cost function:
min over OB(p, r) of Σᵢⱼ ( √(Aᵢⱼ² + ε²) − ε ) + μ‖AᵀRA − D²‖²_F

Tests

Dataset
– real DNA methylation dataset, available online on the NCBI website
– 2000 genes randomly selected, ~150 samples
– tests with 10 components

Measures of interest
– variance explained
– correlation of the components
– orthogonality of the loadings
Comparison to the method of Journée et al. (2010), with both ℓ0 and ℓ1 norms.

Sparseness
(figure) Drawback: the loadings are small but not exactly zero.

Naive variance explained: tr(AᵀXᵀXA)

Correlation of the components: ‖AᵀXᵀXA − diag(AᵀXᵀXA)‖_F

Adjusted variance explained (Zou et al., 2006)
– orthogonal loadings: tr(AᵀXᵀXA)
– non-orthogonal loadings: for component i, remove the variance already explained by components 1, …, i−1, by projecting component i onto the space spanned by components 1, …, i−1
→ QR decomposition Y = QR, with Y = XA; residual variance after adjustment: tr(R²)
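A minimal numpy sketch of evaluating the smoothed cost above (the manifold optimization itself is not shown, and the names `smoothed_l1` and `sparse_pca_cost` are my own, not from the paper):

```python
import numpy as np

def smoothed_l1(A, eps=1e-4):
    # Smooth surrogate for ||A||_1: sum_ij (sqrt(A_ij^2 + eps^2) - eps)
    return np.sum(np.sqrt(A**2 + eps**2) - eps)

def sparse_pca_cost(A, R, D2, mu=10.0, eps=1e-4):
    # Smoothed ||A||_1 plus the penalty mu * ||A^T R A - D^2||_F^2,
    # which drives A^T R A towards the diagonal target D^2
    penalty = np.linalg.norm(A.T @ R @ A - D2, 'fro') ** 2
    return smoothed_l1(A, eps) + mu * penalty

# For exact PCA loadings the penalty vanishes: A^T R A = D^2 exactly
rng = np.random.default_rng(1)
X = rng.standard_normal((150, 20))
R = X.T @ X
w, V = np.linalg.eigh(R)
A = V[:, -5:]                 # top-5 eigenvectors (r = 5 components)
D2 = np.diag(w[-5:])          # diagonal target D^2
```

With this A and D2 the cost reduces to the smoothed ℓ1 term alone; sparsity is then traded against the penalty by moving A over the oblique manifold.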
Orthogonality of the loadings: ‖AᵀA − diag(AᵀA)‖_F

Take-home message & further work

Motivation
– with larger and larger datasets being collected, sparseness in PCA is more and more needed

Results: our method
– can explain a large part of the variance in the data
– outperforms Journée's method on the uncorrelatedness of the components
– outperforms Journée's method on the orthogonality of the loadings

Further work
– larger-scale tests and comparisons with more methods are needed
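The three evaluation measures used in the tests (correlation of the components, orthogonality of the loadings, adjusted variance explained) can be sketched in numpy as follows; the function names are my own, not the authors':

```python
import numpy as np

def off_diag_norm(M):
    # || M - diag(M) ||_F : Frobenius norm of the off-diagonal part
    return np.linalg.norm(M - np.diag(np.diag(M)), 'fro')

def component_correlation(X, A):
    # || A^T X^T X A - diag(A^T X^T X A) ||_F, zero for uncorrelated components
    return off_diag_norm(A.T @ X.T @ X @ A)

def loading_orthogonality(A):
    # || A^T A - diag(A^T A) ||_F, zero for orthogonal loadings
    return off_diag_norm(A.T @ A)

def adjusted_variance(X, A):
    # Zou et al. (2006): QR-decompose Y = XA; adjusted variance is tr(R^2),
    # i.e. the sum of squared diagonal entries of the triangular factor
    _, Rq = np.linalg.qr(X @ A)
    return np.sum(np.diag(Rq) ** 2)

# Sanity check on exact PCA loadings: both off-diagonal measures are ~0 and
# the adjusted variance coincides with the naive one, tr(A^T X^T X A)
rng = np.random.default_rng(2)
X = rng.standard_normal((150, 30))
_, V = np.linalg.eigh(X.T @ X)
A = V[:, -10:]                # 10 components, as in the tests above
```

For sparse loadings these measures are strictly positive, and the slides use them to compare the proposed method against Journée et al. (2010).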