Generalized Mutual-Information based independence tests

28/10/2015
Publication GSI2015
OAI : oai:www.see.asso.fr:11784:14348

Abstract

We derive independence tests by thresholding dependence measures in a semiparametric context. Precisely, estimates of the mutual informations associated with ϕ-divergences are derived through the dual representations of ϕ-divergences. The asymptotic properties of the estimates are established, including consistency, asymptotic distribution and a large deviations principle. The related tests of independence are compared through their relative asymptotic Bahadur efficiency and numerical simulations.

Philippe Regnault, Amor Keziou

Licence

Creative Commons Attribution-ShareAlike 4.0 International

<resource  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns="http://datacite.org/schema/kernel-4"
                xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
        <identifier identifierType="DOI">10.23723/11784/14348</identifier><creators><creator><creatorName>Amor Keziou</creatorName></creator><creator><creatorName>Philippe Regnault</creatorName></creator></creators><titles>
            <title>Generalized Mutual-Information based independence tests</title></titles>
        <publisher>SEE</publisher>
        <publicationYear>2015</publicationYear>
        <resourceType resourceTypeGeneral="Text">Text</resourceType><dates>
	    <date dateType="Created">Sun 8 Nov 2015</date>
	    <date dateType="Updated">Wed 31 Aug 2016</date>
            <date dateType="Submitted">Fri 20 Jul 2018</date>
	</dates>
        <alternateIdentifiers>
	    <alternateIdentifier alternateIdentifierType="bitstream">742af7a71567b539cabc02f3bfdac2b41995970e</alternateIdentifier>
	</alternateIdentifiers>
        <formats>
	    <format>application/pdf</format>
	</formats>
	<version>24742</version>
        <descriptions>
            <description descriptionType="Abstract">
We derive independence tests by means of dependence measures thresholding in a semiparametric context. Precisely, estimates of mutual information associated to ϕ-divergences are derived through the dual representations of ϕ-divergences. The asymptotic properties of the estimates are established, including consistency, asymptotic distribution and large deviations principle. The related tests of independence are compared through their relative asymptotic Bahadur efficiency and numerical simulations.

</description>
        </descriptions>
    </resource>

Semi-parametric estimation of mutual information; optimal test of independence

Philippe Regnault (joint work with Amor Keziou), Laboratoire de Mathématiques de Reims, EA 4535, Université de Reims Champagne-Ardenne, France.
2nd conference on Geometric Science of Information, 30/10/2015.
Ph Regnault (LMR-URCA), MI-based independence test, GSI 2015.

Outline:
- Testing independence by means of mutual informations (MIs) thresholding: classical independence measures and related tests; ϕ-mutual informations; mutual information test for real-valued random variables.
- Semi-parametric modeling of the joint distribution: the semi-parametric model; some examples.
- Dual estimates of mutual informations: dual representation of ϕ-MI; estimating ϕ-MI via the duality technique; asymptotic properties of the dual estimates.
- Bahadur asymptotic efficiency of ϕ-MI based tests: the KL-MI test maximizes the Bahadur slope.
- Simulations: for finite random variables; for Gaussian couples.

Classical χ² and MI independence tests

Notation:
- $(X, Y)$: a couple of random variables taking values in a finite space $\mathcal{X} \times \mathcal{Y}$;
- $P$: the joint distribution of $(X, Y)$;
- $P^\perp := P_1 \otimes P_2$: the product distribution of the margins $P_1$ and $P_2$ of $(X, Y)$.

Dependence measures:
- The χ²-dependence measure ($\mathcal{X}$ and $\mathcal{Y}$ finite sets, $P = (p_{x,y})_{(x,y)}$):
\[
\chi^2(P, P^\perp) := \frac{1}{2} \int_{\mathcal{X} \times \mathcal{Y}} \left( \frac{dP}{dP^\perp}(x, y) - 1 \right)^2 dP^\perp(x, y) = \frac{1}{2} \sum_{(x,y)} \frac{(p_{x,y} - p_x p_y)^2}{p_x p_y},
\]
where $dP/dP^\perp$ denotes the density of $P$ with respect to $P^\perp$.
- The mutual information (MI):
\[
I_{KL}(P) := K(P, P^\perp) = \int \frac{dP}{dP^\perp} \log \frac{dP}{dP^\perp} \, dP^\perp = \sum_{(x,y)} p_{x,y} \log \frac{p_{x,y}}{p_x p_y},
\]
where $K(Q, P) := \int \frac{dQ}{dP} \log \frac{dQ}{dP} \, dP$ for distributions $P$, $Q$ with $Q$ absolutely continuous with respect to $P$.
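These two dependence measures can be computed directly from a joint probability table. A minimal sketch, assuming only NumPy (the function name and example table are illustrative, not from the slides):

```python
import numpy as np

def dependence_measures(p):
    """Chi-square dependence measure and KL mutual information of a joint
    probability table p over a finite space X x Y (rows: x, columns: y)."""
    p = np.asarray(p, dtype=float)
    px = p.sum(axis=1, keepdims=True)        # marginal P_1
    py = p.sum(axis=0, keepdims=True)        # marginal P_2
    pp = px * py                             # product distribution P_perp
    chi2 = 0.5 * ((p - pp) ** 2 / pp).sum()  # (1/2) sum (p_xy - p_x p_y)^2 / (p_x p_y)
    mask = p > 0                             # convention 0 log 0 = 0
    mi = (p[mask] * np.log(p[mask] / pp[mask])).sum()
    return chi2, mi

# At independence, p is exactly the product of its margins and both measures vanish.
p_indep = np.outer([0.3, 0.7], [0.4, 0.6])
```

Both quantities are non-negative and equal zero exactly when the table factorizes into its margins, which is what makes them usable as test statistics after rescaling by the sample size.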
χ² and mutual information independence tests

Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be an $n$-sample of $(X, Y) \sim P = (p_{x,y})_{(x,y)}$. For testing $H_0$: $X$ and $Y$ are independent, against $H_1 := \neg H_0$, the test statistics of the χ² and MI independence tests are
\[
2n\,\chi^2\big(\widehat{P}, \widehat{P}^\perp\big) = n \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \frac{(\widehat{p}_{x,y} - \widehat{p}_x \widehat{p}_y)^2}{\widehat{p}_x \widehat{p}_y},
\qquad
2n\,I_{KL}\big(\widehat{P}\big) = 2n \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \widehat{p}_{x,y} \log \frac{\widehat{p}_{x,y}}{\widehat{p}_x \widehat{p}_y},
\]
where $\widehat{P} := (\widehat{p}_{x,y})_{(x,y)}$ and $\widehat{P}^\perp := (\widehat{p}_x \widehat{p}_y)_{(x,y)}$ are, respectively, the empirical versions of $P$ and $P^\perp$. Both $2n\,\chi^2(\widehat{P}, \widehat{P}^\perp)$ and $2n\,I_{KL}(\widehat{P})$ are asymptotically chi-squared distributed with $(|\mathcal{X}| - 1)(|\mathcal{Y}| - 1)$ degrees of freedom under $H_0$.

ϕ-mutual informations

Let $(X, Y)$ take its values in a general measurable space $(\mathcal{X} \times \mathcal{Y}, \mathcal{A}_\mathcal{X} \otimes \mathcal{A}_\mathcal{Y})$, with joint distribution $P \in M^1(\mathcal{X} \times \mathcal{Y})$. Let $\varphi : \mathbb{R} \to [0, +\infty]$ be a non-negative closed proper convex function such that $\varphi(1) = 0$, $\varphi'(1) = 0$ and $\varphi''(1) = 1$. The ϕ-mutual information (ϕ-MI) of $(X, Y)$ is
\[
I_\varphi(X, Y) := I_\varphi(P) := D_\varphi(P, P^\perp) := \int_{\mathcal{X} \times \mathcal{Y}} \varphi\!\left( \frac{dP}{dP^\perp}(x, y) \right) dP^\perp(x, y).
\]
- For $\varphi_1(x) = x \log x - x + 1$, $I_{\varphi_1}(P) = I_{KL}(P)$ (the classic MI);
- for $\varphi_2(x) = \frac{1}{2}(x - 1)^2$, $I_{\varphi_2}(P) = \chi^2(P, P^\perp)$ (the χ² measure of dependence).

Fundamental property: $I_\varphi(P) \ge 0$, and $I_\varphi(P) = 0$ iff $P = P^\perp$, iff $X$ and $Y$ are independent. Hence, the independence test problem can be reformulated as $H_0 : I_\varphi(P) = 0$ against $H_1 : I_\varphi(P) > 0$.

Dealing with continuous random variables: classical approaches

The main estimation methods in the literature are:
- Plug-in: replace $P$ by its empirical counterpart in the definition of $I_\varphi$. Widely used and studied when $\mathcal{X}$, $\mathcal{Y}$ are finite sets; see e.g. Pardo (2006). It yields poor efficiency (power) for the related tests when dealing with continuous random variables, since it requires gathering the data into classes (choice of the number of classes, etc.).
- Kernel estimation: plug kernel estimates of the joint density and the marginal densities into $I_\varphi$. This raises the difficulty of choosing the kernel and the bandwidth (see e.g. Khan et al., 2007), and the (asymptotic) distributions of the estimates are unknown.
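The chi-square limit of $2n\,I_{KL}(\widehat{P})$ under $H_0$ can be checked numerically. A sketch, assuming NumPy/SciPy; the table size, sample size and number of replications are arbitrary choices for illustration:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)

def kl_mi_statistic(counts):
    """2n * I_KL(P_hat) computed from a contingency table of counts."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = counts / n
    pp = p.sum(axis=1, keepdims=True) * p.sum(axis=0, keepdims=True)
    mask = p > 0
    return 2 * n * (p[mask] * np.log(p[mask] / pp[mask])).sum()

# Under H0 (a uniform 2x3 table equals the product of its margins), the statistic
# should be approximately chi-squared with (2-1)*(3-1) = 2 degrees of freedom.
stats = np.array([
    kl_mi_statistic(rng.multinomial(500, np.full(6, 1 / 6)).reshape(2, 3))
    for _ in range(500)
])
rejection_rate = (stats > chi2.ppf(0.95, df=2)).mean()
```

The empirical mean of the statistic should sit near the chi-squared mean (2 here), and the rejection rate at the 0.95 quantile should be near the nominal 5% level.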
Modeling the ratio dP/dP⊥

Assume that the joint distribution $P$ of the random vector $(X, Y)$ belongs to the semi-parametric model
\[
\mathcal{M}_\Theta := \left\{ P \in M^1(\mathcal{X} \times \mathcal{Y}) \,;\; \frac{dP}{dP^\perp}(\cdot, \cdot) =: h_\theta(\cdot, \cdot) \,;\; \theta = (\alpha, \beta^\top)^\top \in \Theta \right\},
\]
where $\Theta \subseteq \mathbb{R}^{1+d}$ is the parameter space and $h_\theta : (x, y) \in \mathcal{X} \times \mathcal{Y} \mapsto h_\theta(x, y) \in \mathbb{R}$ is a specified real-valued function indexed by the parameter $\theta$.

Main assumptions on the model:
- A.1 (identifiability): $\big(h_\theta(x, y) = h_{\theta'}(x, y)$ for all $(x, y) \in \mathcal{X} \times \mathcal{Y}\big) \Rightarrow (\theta = \theta')$;
- A.2: there exists a unique $\theta_0 \in \mathrm{int}(\Theta)$ such that $h_{\theta_0}(x, y) = 1$ for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$.

The Gaussian model

If $(X, Y)$ is Gaussian with unknown mean $\mu := (\mu_1, \mu_2)^\top$ and unknown covariance matrix $\Gamma$, then
\[
\frac{dP}{dP^\perp}(x, y) =: h_\theta(x, y) = \exp\left( \alpha + \beta_1 x + \beta_2 y + \beta_3 x^2 + \beta_4 y^2 + \beta_5 xy \right),
\]
with $\theta := (\alpha, \beta_1, \ldots, \beta_5)^\top$. The number of free parameters in $\theta_T$ is $d$, since $\alpha_T$ is a normalizing parameter due to the constraint
\[
\int_{\mathcal{X} \times \mathcal{Y}} h_{\theta_T}(x, y) \, dP^\perp(x, y) = \int_{\mathcal{X} \times \mathcal{Y}} dP(x, y) = 1.
\]
Moreover, the independence parameter is $\theta_0 := (0, \ldots, 0)^\top \in \mathbb{R}^6$.

The finite support model

Assume that the support of $P$, $\mathrm{supp}(P) = \mathcal{X} \times \mathcal{Y}$, is a finite discrete set of size $K_1 \times K_2$; denote by $(P(x, y))_{(x,y)} =: (p_{x,y})_{(x,y)}$ the density of $P$ with respect to the counting measure on $\mathcal{X} \times \mathcal{Y}$. We have
\[
\frac{dP}{dP^\perp}(x, y) = \exp\left( \sum_{(a,b) \in \mathcal{X} \times \mathcal{Y}} \theta_{a,b} \, \delta_{(a,b)}(x, y) \right), \quad \text{where } \theta_{a,b} = \log \frac{p_{a,b}}{p_a p_b}, \; (a, b) \in \mathcal{X} \times \mathcal{Y}.
\]
Here $\theta = (\theta_{a,b})_{(a,b)} \in \Theta \subset \mathbb{R}^{K_1 K_2}$. The number of free parameters in $\theta_T$ is then equal to $(K_1 - 1)(K_2 - 1)$, and the independence parameter is $\theta_0 := (0, \ldots, 0)^\top \in \mathbb{R}^{K_1 K_2}$.
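For the finite support model, the parameters $\theta_{a,b}$ are explicit functions of the joint table. A short sketch, assuming NumPy (the function name is mine):

```python
import numpy as np

def theta_finite_support(p):
    """Parameters theta_{a,b} = log(p_{a,b} / (p_a p_b)) of the finite
    support model, computed from a joint probability table p."""
    p = np.asarray(p, dtype=float)
    pa = p.sum(axis=1, keepdims=True)   # margin of X
    pb = p.sum(axis=0, keepdims=True)   # margin of Y
    return np.log(p / (pa * pb))

# Under independence the table factorizes, so theta = theta_0 = (0, ..., 0).
```

This makes assumption A.2 concrete: the parameter vanishes identically exactly when $P = P^\perp$.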
Parametric copula models

Assume that $P$ has a continuous c.d.f. $F$ on $\mathbb{R}^2$, with marginal c.d.f.s $F_1$ and $F_2$. The copula $C(\cdot, \cdot)$ of the vector $(X, Y)$ (see e.g. Nelsen, 2006) is defined, for all $(u, v) \in \,]0, 1[^2$, by
\[
C(u, v) := F\big(F_1^{-1}(u), F_2^{-1}(v)\big).
\]
If $F$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^2$, then we have the relation
\[
\frac{dP}{dP^\perp}(x, y) = \frac{f(x, y)}{f_1(x) f_2(y)} = c\big(F_1(x), F_2(y)\big),
\]
where $f$ is the joint density of $(X, Y)$, $f_1$ and $f_2$ are the marginal densities of $X$ and $Y$, and $c(\cdot, \cdot)$ is the copula density. Numerous examples of semi-parametric models can be obtained by considering $h_\theta(x, y) = c_\theta(F_1(x), F_2(y))$, $\theta \in \mathbb{R}^d$, where $(c_\theta)_{\theta \in \mathbb{R}^d}$ is any parametric copula model.

Dual representation of ϕ-divergences

Notation:
- $\mathcal{F}$: a class of measurable real functions defined on $(\Omega, \mathcal{A})$;
- $\mathcal{F} \cup B$: the linear subspace spanned by $\mathcal{F}$ and the bounded functions;
- $M_\mathcal{F}$: the set of signed measures $Q$ such that $|f|$ is $Q$-integrable for all $f \in \mathcal{F}$;
- $\tau_\mathcal{F}$: the weakest topology on $M_\mathcal{F}$ for which the mappings $Q \in M_\mathcal{F} \mapsto \int f \, dQ$, $f \in \mathcal{F} \cup B$, are continuous;
- $\tau_M$: the weakest topology on $\mathcal{F} \cup B$ for which the mappings $f \mapsto \int f \, dQ$, $Q \in M_\mathcal{F}$, are continuous.

Theorem (Broniatowski and Keziou, 2003, 2006). For any signed measure $P$ on $(\Omega, \mathcal{A})$, the mapping $Q \mapsto D_\varphi(Q, P)$ is convex and lower semi-continuous on $M_\mathcal{F}$. In addition, if $Q$ is absolutely continuous with respect to $P$ and $\varphi'\!\left(\frac{dQ}{dP}\right) \in \mathcal{F}$, the following dual representation holds:
\[
D_\varphi(Q, P) = \sup_{f \in \mathcal{F}} \left\{ \int f \, dQ - \int \varphi^*(f) \, dP \right\},
\]
where $\varphi^*$ is the convex conjugate of $\varphi$. Moreover, the supremum is achieved uniquely at $f = \varphi'\!\left(\frac{dQ}{dP}\right)$.
Dual representation of ϕ-MI in the semi-parametric context

Additional assumptions on the model:
- A.3: $I_\varphi(P) < \infty$;
- A.4: for all $\theta \in \Theta$, $\int |\varphi'(h_\theta)| \, dP < \infty$.

Specifying the dual representation of $D_\varphi$ to $Q = P$, $P = P^\perp$ and $\mathcal{F} = \{ \varphi'(h_\theta) \,;\; \theta \in \Theta \}$, we obtain the dual representation of the ϕ-MI:
\[
I_\varphi(P) = \sup_{\theta \in \Theta} \left\{ \int \varphi'(h_\theta) \, dP - \int \varphi^*\big(\varphi'(h_\theta)\big) \, dP^\perp \right\}.
\]
The supremum is achieved uniquely at $\theta = \theta_T$, the true parameter.

Dual estimates

Given $(X_1, Y_1), \ldots, (X_n, Y_n)$ an $n$-sample of $(X, Y) \sim P$, with $P$ such that $dP/dP^\perp = h_{\theta_T}$, we define the dual estimators of $I_\varphi(P)$ and $\theta_T$ respectively as
\[
\widehat{I}_\varphi := \sup_{\theta \in \Theta} \left\{ \frac{1}{n} \sum_{i=1}^n \varphi'\big(h_\theta(X_i, Y_i)\big) - \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \varphi^*\Big(\varphi'\big(h_\theta(X_i, Y_j)\big)\Big) \right\},
\]
\[
\widehat{\theta}_\varphi := \arg\sup_{\theta \in \Theta} \left\{ \frac{1}{n} \sum_{i=1}^n \varphi'\big(h_\theta(X_i, Y_i)\big) - \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \varphi^*\Big(\varphi'\big(h_\theta(X_i, Y_j)\big)\Big) \right\},
\]
i.e., the supremum and argsup of $\int \varphi'(h_\theta) \, d\widehat{P} - \int \varphi^*(\varphi'(h_\theta)) \, d\widehat{P}_1 \otimes \widehat{P}_2$, where $\widehat{P}$ denotes the empirical distribution of the sample and $\widehat{P}^\perp = \widehat{P}_1 \otimes \widehat{P}_2$ the product of its marginals.

Example: finite distributions

In the context of finite distributions, the dual and empirical (plug-in) estimates are equal:
\[
\widehat{I}_\varphi = \sum_{x,y} \varphi\!\left( \frac{\widehat{p}_{x,y}}{\widehat{p}_x \widehat{p}_y} \right) \widehat{p}_x \widehat{p}_y.
\]
In particular, for $\varphi = \varphi_2$, the duality technique recovers the classic χ² estimate. In addition, the estimates $\widehat{\theta}_{x,y} = \log \frac{\widehat{p}_{x,y}}{\widehat{p}_x \widehat{p}_y}$, $(x, y) \in \mathcal{X} \times \mathcal{Y}$, do not depend on the choice of $\varphi$.
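To make the duality technique concrete, here is a hedged sketch assuming SciPy, for $\varphi = \varphi_1$ (so $\varphi_1'(t) = \log t$ and $\varphi_1^*(s) = e^s - 1$) under the simplified one-parameter interaction model $h_\theta(x, y) = \exp(\alpha + \beta x y)$; this model is a stand-in of my own, not the full Gaussian model of the slides (which also has linear and quadratic terms):

```python
import numpy as np
from scipy.optimize import minimize

def dual_kl_estimate(x, y):
    """Dual estimate of I_KL(P) under the simplified model
    h_theta(x, y) = exp(alpha + beta * x * y), theta = (alpha, beta).
    Uses phi_1'(t) = log t and the conjugate phi_1*(s) = exp(s) - 1."""
    xy_diag = x * y            # pairs (X_i, Y_i): empirical joint law
    xy_all = np.outer(x, y)    # pairs (X_i, Y_j): empirical product law

    def neg_objective(theta):
        alpha, beta = theta
        term_joint = np.mean(alpha + beta * xy_diag)              # int phi'(h) dP_n
        term_prod = np.mean(np.exp(alpha + beta * xy_all) - 1.0)  # int phi*(phi'(h)) dP_n_perp
        return -(term_joint - term_prod)

    res = minimize(neg_objective, x0=np.zeros(2), method="BFGS")
    return -res.fun, res.x     # (I_KL estimate, (alpha_hat, beta_hat))
```

The objective is concave in $\theta$ (a linear term minus a sum of exponentials), so a local optimizer started at $\theta = 0$ suffices; and since $\theta = 0$ gives objective value 0, the estimate is always non-negative, consistent with $I_\varphi \ge 0$.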
Dual estimates for semi-parametric copula models

When dealing with semi-parametric copula models $h_\theta(x, y) = c_\theta(F_1(x), F_2(y))$, with unknown non-parametric cumulative distribution functions $F_1$ and $F_2$, these must be estimated, for example empirically. Hence, the dual estimate of $I_\varphi(P)$ becomes
\[
\widehat{I}_\varphi := \sup_{\theta \in \Theta} \left\{ \frac{1}{n} \sum_i \varphi' \circ c_\theta\!\left( \frac{R_i}{n+1}, \frac{S_i}{n+1} \right) - \frac{1}{n^2} \sum_{i,j} \varphi^* \circ \varphi' \circ c_\theta\!\left( \frac{R_i}{n+1}, \frac{S_j}{n+1} \right) \right\},
\]
with $\widehat{\theta}_\varphi$ the corresponding argsup, where $R_i$ is the rank of $X_i$ in the sample $X_1, \ldots, X_n$ and $S_j$ is the rank of $Y_j$ in the sample $Y_1, \ldots, Y_n$.

Consistency of the dual estimates

Further assumptions on the model (writing $f_\theta := \varphi'(h_\theta)$ and $g_\theta := \varphi^*(\varphi'(h_\theta))$):
- A.5: the parameter space $\Theta$ is compact;
- A.6: $\theta \mapsto \varphi'(h_\theta(x, y))$ is continuous for all $(x, y)$, and $\int_{\mathcal{X} \times \mathcal{Y}} \sup_{\theta \in \Theta} |f_\theta(x, y)| \, dP(x, y) < \infty$;
- A.7: $\theta \mapsto \varphi^*(\varphi'(h_\theta(x, y)))$ is continuous for all $(x, y)$, and $\int_{\mathcal{X} \times \mathcal{Y}} \sup_{\theta \in \Theta} g_\theta(x, y)^2 \, dP^\perp(x, y) < \infty$.

Proposition. Assume A.1-2 and A.5-7. The dual estimates $\widehat{I}_\varphi$ of $I_\varphi(P)$ and $\widehat{\theta}_\varphi$ of $\theta_T$ are consistent. Precisely, as $n \to \infty$, the convergences in probability $\widehat{I}_\varphi \to I_\varphi(P)$ and $\widehat{\theta}_\varphi \to \theta_T$ hold.

Asymptotic distribution of $\widehat{I}_{KL}$

We assume the following specific form for the density ratio $h_\theta = dP/dP^\perp$:
\[
h_\theta(x, y) = \exp\big(\alpha + m_\beta(x, y)\big), \quad \text{with } m_\beta(x, y) := \sum_{k=1}^d \beta_k \, \xi_k(x) \zeta_k(y),
\]
for some specified measurable real-valued functions $\xi_k$ and $\zeta_k$, $k = 1, \ldots, d$, defined respectively on $\mathcal{X}$ and $\mathcal{Y}$.

Further assumptions on the model:
- A.8: there exists a neighborhood $N(\theta_T)$ of $\theta_T$ such that the third-order derivatives $(x, y) \mapsto (\partial^3/\partial\theta^3)\, \varphi'(h_\theta(x, y))$, $\theta \in N(\theta_T)$ (resp. $(x, y) \mapsto (\partial^3/\partial\theta^3)\, \varphi^*(\varphi'(h_\theta(x, y)))$, $\theta \in N(\theta_T)$), are dominated by some $P$-integrable functions (resp. some $P^\perp$-square-integrable functions);
- A.9: the integrals $P \| f'_{\theta_T} \|^2$, $P^\perp \| g'_{\theta_T} \|^2$, $P f''_{\theta_T}$ and $P^\perp g''_{\theta_T}$ exist, and the matrix $\Sigma_1 := -P f''_{\theta_T} - P^\perp g''_{\theta_T}$ is nonsingular.

Proposition. Assume A.1-2 and A.5-9. Under the null hypothesis ($P = P^\perp$), we have:
- $\sqrt{n}\, \widehat{\theta}_{\varphi_1}$ converges in distribution to a centered multivariate normal random variable with covariance matrix $\Sigma = \Sigma_1^{-1} \Sigma_2 \Sigma_1^{-1}$, where $\Sigma_2$ is explicit;
- $2n\, \widehat{I}_{\varphi_1}$ converges in distribution to the random variable $Z^\top Z$, where $Z$ is a centered multivariate normal random variable with covariance matrix $C = \Sigma_1^{-1/2} \Sigma_2 \Sigma_1^{-1/2}$.

The most efficient test among ϕ-MI tests

Additional assumption:
- A.10: for all $f \in \varphi'(\mathcal{M}_\Theta)$ and all $a > 0$, $\int \exp(a |f|) \, dP^\perp < \infty$ and $\int \exp(a |\varphi^* \circ f|) \, dP^\perp < \infty$.

Theorem (the KL-MI test is the most efficient). Let $(X, Y)$ be a couple of random variables with joint distribution $P \in \mathcal{M}_\Theta$, and suppose that conditions A.1-2, A.5 and A.10 are fulfilled. For the test problem $H_0 : P = P^\perp$ against $H_1 : P \neq P^\perp$, the test based on the estimate $\widehat{I}_{KL}$ of the Kullback-Leibler mutual information is uniformly (i.e., whatever the alternative $P \in \mathcal{M}_\Theta$ satisfying conditions A.1-2, A.5 and A.10) the most efficient test, in Bahadur's sense, among all $\widehat{I}_\varphi$-based tests, including the classical χ² independence test.

Testing independence of finite random variables

$\mathcal{X} = \mathcal{Y} = \{0, 1\}$. Alternatives: $P_\theta = (p_{x,y;\theta})$ with $p_{x,y;\theta} := \frac{1-\theta}{4} + \frac{\theta}{2}\, \mathbb{1}\{y = x\}$. Significance level: 0.01.
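A Monte-Carlo power computation of this kind can be sketched as follows, assuming NumPy/SciPy; the settings are illustrative (fewer replications than the slides' 10,000, and only the χ² test):

```python
import numpy as np
from scipy.stats import chi2

def chi2_power(theta, n=30, reps=2000, level=0.01, seed=0):
    """Monte-Carlo power of the chi-square independence test on {0,1}^2
    under the alternative p_{x,y;theta} = (1-theta)/4 + (theta/2) 1{y=x}."""
    rng = np.random.default_rng(seed)
    probs = np.array([(1 - theta) / 4 + (theta / 2) * (x == y)
                      for x in (0, 1) for y in (0, 1)])
    crit = chi2.ppf(1 - level, df=1)   # (|X|-1)(|Y|-1) = 1 degree of freedom
    rejections = 0
    for _ in range(reps):
        p = rng.multinomial(n, probs).reshape(2, 2) / n
        pp = p.sum(axis=1, keepdims=True) * p.sum(axis=0, keepdims=True)
        with np.errstate(divide="ignore", invalid="ignore"):
            stat = n * np.nansum((p - pp) ** 2 / pp)   # 2n * chi^2(P_hat, P_hat_perp)
        rejections += stat > crit
    return rejections / reps
```

At $\theta = 0$ the rejection rate should be close to the nominal level, and it should increase towards 1 as $\theta$ grows.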
[Figure: power of the KL-MI and χ² tests, estimated by Monte Carlo from 10,000 samples of size 30.]

Testing independence of the components of a Gaussian couple

$(X, Y)$ is a centered Gaussian couple with covariance matrix $\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$, with $\rho \in [0, 1]$. Significance level: 0.05.

[Figure: power of the KL-MI, χ²-MI, Pearson, Spearman and Kendall tests as a function of the correlation $\rho$ between $X$ and $Y$, estimated by Monte Carlo from 10,000 samples of size 50.]

Bibliography

- Keziou, A., Regnault, P. (2015). Semiparametric estimation of mutual information and related criteria: optimal test of independence. arXiv:1508.04671.
- Pardo, L. (2006). Statistical Inference Based on Divergence Measures, volume 185 of Statistics: Textbooks and Monographs. Chapman & Hall/CRC, Boca Raton, FL.
- Khan, S., Bandyopadhyay, S., Ganguly, A.R., Saigal, S., Erickson, D.J., Protopopescu, V., Ostrouchov, G. (2007). Relative performance of mutual information estimation methods for quantifying the dependence among short and noisy data. Physical Review E, 76, 026209.
- Nelsen, R.B. (2006). An Introduction to Copulas. Springer Series in Statistics. Springer, New York, second edition.
- Keziou, A. (2003). Dual representation of ϕ-divergences and applications. C. R. Math. Acad. Sci. Paris, 336(10), 857-862.
- Broniatowski, M., Keziou, A. (2006). Minimization of ϕ-divergences on sets of signed measures. Studia Sci. Math. Hungar., 43(4), 403-442.
- Eichelsbacher, P., Schmock, U. (2002). Large deviations of U-empirical measures in strong topologies and applications. Ann. Inst. H. Poincaré Probab. Statist., 38(5), 779-797.