Optimal Transport to Rényi Entropies

07/11/2017
Authors: Olivier Rioul
Publication: GSI2017
DOI: 10.23723/17410/22609
OAI: oai:www.see.asso.fr:17410:22609
Abstract

Recently, an optimal transportation argument was proposed by the author to provide a simple proof of Shannon's entropy-power inequality. Interestingly, such a proof could have been given by Shannon himself in his 1948 seminal paper. In fact, by 1948 Shannon established all the ingredients necessary for the proof, and the transport argument takes the form of a simple change of variables. In this paper, the optimal transportation argument is extended to Rényi entropies in relation to Shannon's entropy-power inequality and to a reverse version involving a certain conditional entropy. The transportation argument turns out to coincide with Barthe's proof of the sharp direct and reverse Young's convolutional inequalities and can be applied to derive recent Rényi entropy-power inequalities.


License

Creative Commons: None (all rights reserved)


Optimal Transport to Rényi Entropies

Olivier Rioul
LTCI, Télécom ParisTech, Université Paris-Saclay, 75013, Paris, France
olivier.rioul@telecom-paristech.fr
http://perso.telecom-paristech.fr/rioul/

Abstract. Recently, an optimal transportation argument was proposed by the author to provide a simple proof of Shannon's entropy-power inequality. Interestingly, such a proof could have been given by Shannon himself in his 1948 seminal paper. In fact, by 1948 Shannon established all the ingredients necessary for the proof, and the transport argument takes the form of a simple change of variables. In this paper, the optimal transportation argument is extended to Rényi entropies in relation to Shannon's entropy-power inequality and to a reverse version involving a certain conditional entropy. The transportation argument turns out to coincide with Barthe's proof of the sharp direct and reverse Young's convolutional inequalities and can be applied to derive recent Rényi entropy-power inequalities.

Keywords: Rényi entropy, entropy-power inequality, optimal transport

1 Introduction: A Proof that Shannon Missed

2016 was the Shannon Centenary, which marked the life and influence of Claude E. Shannon on the 100th anniversary of his birth. On this occasion many scientific events were organized throughout the world in honor of his achievements, above all his 1948 seminal paper [1], which developed the mathematical foundations of communication. The French edition of the book re-edition of Shannon's paper [2] has recently been published.

Remarkably, Shannon's revolutionary work, in a single publication [1], established the fully formed field of information theory, with all insights and mathematical proofs, albeit in sketched form. There seems to be only one exception in which Shannon's proof turned out to be flawed: the celebrated entropy-power inequality (EPI).

The EPI can be described as follows. Letting $P(X) = \frac{1}{n}\,\mathbb{E}\{\|X\|^2\}$ be the average power of a random vector $X$ taking values in $\mathbb{R}^n$, Shannon defined the entropy-power $N(X)$ as the power of a zero-mean white Gaussian random vector $X^*$ having the same entropy as $X$. He argued [1, § 21] that for continuous random vectors it is more convenient to work with the entropy-power $N(X)$ than with the differential entropy $h(X)$. By Shannon's formula [1, § 20.6] $h(X^*) = \frac{n}{2}\log\bigl(2\pi e P(X^*)\bigr)$ for the entropy of the white Gaussian $X^*$, the closed-form expression of $N(X) = P(X^*)$ when $h(X^*) = h(X)$ is

    N(X) = \frac{\exp\bigl(\frac{2}{n} h(X)\bigr)}{2\pi e}    (1)

which is essentially $e$ raised to a multiple of the entropy of $X$, also recognized as the "entropy power" of $X$ in this sense. Since the Gaussian maximizes entropy for a given power [1, § 20.5], $h(X) \le \frac{n}{2}\log\bigl(2\pi e P(X)\bigr)$, the entropy-power does not exceed the actual power: $N(X) \le P(X)$, with equality if and only if $X$ is white Gaussian.

The power of a scaled random vector is given by $P(aX) = a^2 P(X)$, and the same property holds for the entropy-power:

    N(aX) = a^2 N(X)    (2)

thanks to the well-known scaling property of the entropy [1, § 20.9]:

    h(aX) = h(X) + n \log|a|.    (3)

Now for any two independent continuous random vectors $X$ and $Y$, the power of the sum equals the sum of the individual powers, $P(X+Y) = P(X) + P(Y)$, and clearly the same relation holds for the entropy-power in the case of white Gaussian vectors (or Gaussian vectors with proportional covariances).
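As a quick numerical illustration of (1)-(3) (not part of the paper itself), the following minimal Python sketch, assuming NumPy is available, evaluates the entropy power of a few zero-mean densities with closed-form differential entropy and checks that $N(X) \le P(X)$, with equality only in the Gaussian case, together with the scaling property (2); the chosen distributions are arbitrary examples.

```python
import numpy as np

def entropy_power(h, n=1):
    """Entropy power N(X) = exp(2h/n) / (2*pi*e), cf. (1)."""
    return np.exp(2.0 * h / n) / (2.0 * np.pi * np.e)

# Zero-mean densities with closed-form differential entropy h(X) (in nats)
# and average power P(X) = E[X^2]; arbitrary illustrative choices.
cases = {
    "Gaussian(0,1)": (0.5 * np.log(2 * np.pi * np.e), 1.0),
    "Uniform(-1,1)": (np.log(2.0), 1.0 / 3.0),
    "Laplace(b=1)":  (1.0 + np.log(2.0), 2.0),
}

for name, (h, P) in cases.items():
    N = entropy_power(h)
    print(f"{name:14s}  N(X) = {N:.4f} <= P(X) = {P:.4f} : {N <= P + 1e-12}")

# Scaling property (2): N(aX) = a^2 N(X), via h(aX) = h(X) + log|a| (3) with n = 1.
a, h = 3.0, np.log(2.0)
assert np.isclose(entropy_power(h + np.log(abs(a))), a**2 * entropy_power(h))
```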
In general, however, the entropy-power of the sum exceeds the sum of the individual entropy-powers:

    N(X+Y) \ge N(X) + N(Y)    (4)

where equality holds only if $X$ and $Y$ are Gaussian with proportional covariances. This is the celebrated entropy-power inequality (EPI) as stated by Shannon. It is remarkable that Shannon had the intuition of this inequality, since it turns out to be quite difficult to prove. Shannon's proof [1, Appendix 6] is an incomplete variational argument which shows that Gaussian densities yield a stationary point for $N(X+Y)$ with fixed $N(X)$ and $N(Y)$, but this does not exclude the possibility that the stationary point is not a global minimum. The first actual proof of the EPI occurred more than ten years later and was quite involved; subsequent proofs used either integration over a path of a continuous Gaussian perturbation or the sharp version of Young's inequality, where the EPI is obtained as a limit (which precludes settling the equality condition in this case). We refer to [3] for a comprehensive list of references and a detailed history.

Recently, an optimal transportation argument was proposed by the author [4, 5] to provide a simple proof of the entropy-power inequality, including the equality condition. Interestingly, as we shall now demonstrate, such a proof, appropriately rephrased, could have been given by Shannon himself in his 1948 seminal paper. In fact, by 1948 Shannon established all the ingredients necessary for the proof.

As in Shannon's paper [1], to simplify the presentation we assume, without loss of generality, that all considered random vectors have zero mean, and we here restrict ourselves to real-valued random variables in one dimension ($n = 1$). The optimal transport argument takes the form of a simple change of variables: if, e.g., $X^*$ is Gaussian, then there exists a (possibly nonlinear) nondecreasing transformation $T$ such that $T(X^*)$ is identically distributed as $X$, so that one would take $X = T(X^*)$ in what follows. Similarly, if $Y^*$ is Gaussian one can take $Y = U(Y^*)$. A detailed proof of this change of variables is given in [4, 5], but it is easily seen as a generalization of the inverse c.d.f. method used, e.g., for sampling random variables.
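For a continuous target distribution, this monotone transport map can be sketched numerically as $T = F_X^{-1} \circ \Phi$, where $\Phi$ is the standard Gaussian c.d.f. and $F_X^{-1}$ the target quantile function. Below is a minimal Python sketch (assuming SciPy is available; the Laplace target is an arbitrary illustrative choice, not from the paper) checking that $T(X^*)$ reproduces the target law.

```python
import numpy as np
from scipy import stats

# Nondecreasing transport map T = F_X^{-1} o Phi pushing X* ~ N(0,1) onto the
# target law of X; the Laplace target is an arbitrary illustrative choice.
target = stats.laplace(loc=0.0, scale=1.0)

def T(x_star):
    """Monotone map such that T(X*) is distributed as the target when X* ~ N(0,1)."""
    return target.ppf(stats.norm.cdf(x_star))

rng = np.random.default_rng(0)
x_star = rng.standard_normal(100_000)
x = T(x_star)

# T(X*) should reproduce the target's moments (mean 0, variance 2 for Laplace(0,1))
# and pass a goodness-of-fit check against the target c.d.f.
print(np.mean(x), np.var(x))
print(stats.kstest(x, target.cdf).pvalue > 0.01)
```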
Theorem 1 (Shannon's Entropy-Power Inequality). Let $X, Y$ be independent zero-mean random variables with continuous densities. Then $N(X+Y) \ge N(X) + N(Y)$.

Proof. The proof is in several steps, each being a direct consequence of Shannon's basic results established in [1].

1. We first proceed to prove the apparently more general inequality

    N(aX + bY) \ge a^2 N(X) + b^2 N(Y)    (5)

for any real-valued coefficients $a, b$. By the scaling property of the entropy-power (2), this is in fact equivalent to the original EPI (4).

2. We can always assume that $X$ and $Y$ have the same entropy-power $N(X) = N(Y)$, or equivalently, the same entropy $h(X) = h(Y)$. Otherwise, one could find constants $c, d$ such that $cX$ and $dY$ have equal entropy-power (e.g., $c = \exp(-h(X))$ and $d = \exp(-h(Y))$), and applying (5) to $cX$ and $dY$ yields the general case, again thanks to the scaling property of the entropy-power.

3. Let $X^*, Y^*$ be independent zero-mean Gaussian variables with the same entropy as $X, Y$. Since the entropies of $X^*$ and $Y^*$ are equal, they have the same variance and are, therefore, identically distributed. Since equality holds in (5) for $X^*, Y^*$, we have $a^2 N(X) + b^2 N(Y) = a^2 N(X^*) + b^2 N(Y^*) = N(aX^* + bY^*)$, so that (5) is equivalent to $N(aX + bY) \ge N(aX^* + bY^*)$ or (taking the logarithm)

    h(aX + bY) \ge h(aX^* + bY^*).    (6)

4. To prove (6) we may always assume the change of variables $X = T(X^*)$, $Y = U(Y^*)$ as explained above. One is led to prove that

    h(aT(X^*) + bU(Y^*)) \ge h(aX^* + bY^*)    (7)

which is written only in terms of the Gaussian variables.

5. Since $X^*$ and $Y^*$ are i.i.d. Gaussian, the Gaussian variables $\tilde{X} = aX^* + bY^*$ and $\tilde{Y} = -bX^* + aY^*$ are uncorrelated and, therefore, independent. Letting $\Delta = a^2 + b^2$ we can write $X^* = (a\tilde{X} - b\tilde{Y})/\Delta$ and $Y^* = (b\tilde{X} + a\tilde{Y})/\Delta$. Since conditioning reduces entropy [1, § 20.4],

    h(aT(X^*) + bU(Y^*)) = h\Bigl(aT\bigl(\tfrac{a\tilde{X}-b\tilde{Y}}{\Delta}\bigr) + bU\bigl(\tfrac{b\tilde{X}+a\tilde{Y}}{\Delta}\bigr)\Bigr) \ge h\Bigl(aT\bigl(\tfrac{a\tilde{X}-b\tilde{Y}}{\Delta}\bigr) + bU\bigl(\tfrac{b\tilde{X}+a\tilde{Y}}{\Delta}\bigr) \,\Big|\, \tilde{Y}\Bigr).    (8)

6. By the change of variables in the entropy [1, § 20.8], for any transformation $T$, $h(T(X)) = h(X) + \mathbb{E}\log T'(X)$, where $T'(X) > 0$ is the Jacobian of the transformation. Applying the transformation in $\tilde{X}$ for fixed $\tilde{Y}$ in the right-hand side of (8), we obtain

    h\Bigl(aT\bigl(\tfrac{a\tilde{X}-b\tilde{Y}}{\Delta}\bigr) + bU\bigl(\tfrac{b\tilde{X}+a\tilde{Y}}{\Delta}\bigr) \,\Big|\, \tilde{Y}\Bigr) = h(\tilde{X}\mid\tilde{Y}) + \mathbb{E}\log\Bigl(\tfrac{a^2}{\Delta}T'\bigl(\tfrac{a\tilde{X}-b\tilde{Y}}{\Delta}\bigr) + \tfrac{b^2}{\Delta}U'\bigl(\tfrac{b\tilde{X}+a\tilde{Y}}{\Delta}\bigr)\Bigr).    (9)

7. By the concavity of the logarithm,

    \log\Bigl(\tfrac{a^2}{\Delta}T'\bigl(\tfrac{a\tilde{X}-b\tilde{Y}}{\Delta}\bigr) + \tfrac{b^2}{\Delta}U'\bigl(\tfrac{b\tilde{X}+a\tilde{Y}}{\Delta}\bigr)\Bigr) = \log\Bigl(\tfrac{a^2}{\Delta}T'(X^*) + \tfrac{b^2}{\Delta}U'(Y^*)\Bigr) \ge \tfrac{a^2}{\Delta}\log T'(X^*) + \tfrac{b^2}{\Delta}\log U'(Y^*)    (10)

but again from the change of variables in the entropy [1, § 20.8], $\mathbb{E}\log T'(X^*) = h(T(X^*)) - h(X^*) = h(X) - h(X^*) = 0$ and similarly $\mathbb{E}\log U'(Y^*) = 0$. Thus the second term in the right-hand side of (9) is $\ge 0$.

8. Since $\tilde{X}, \tilde{Y}$ are independent, one has [1, § 20.2] $h(\tilde{X}\mid\tilde{Y}) = h(\tilde{X}) = h(aX^* + bY^*)$, which is the right-hand side of (7). Combining the established inequalities proves the EPI. □

Remark 1. The case of equality can easily be settled by noting that equality holds in (10) only if $T'(X) = U'(Y)$ a.e., which, since $X$ and $Y$ are independent, implies that $T' = U'$ is constant; hence the transformations $T, U$ are linear and $X, Y$ are Gaussian (see [4] for details).

Going back to the proof, it is interesting to note that the only place where the Gaussianity of $X^*, Y^*$ is used is for the simplification $h(\tilde{X}\mid\tilde{Y}) = h(\tilde{X})$. If we drop this assumption we obtain the more general statement:

Corollary 1. Let $X, Y$ be independent zero-mean random variables with continuous densities, and similarly let $X^*, Y^*$ be independent zero-mean random variables with continuous densities, all of equal entropies. Then for any real $a, b$,

    h(aX + bY) \ge h(aX^* + bY^* \mid -bX^* + aY^*).    (11)

If in addition we drop the assumption of equal entropies, then letting $\lambda = a^2/\Delta$ and $1-\lambda = b^2/\Delta$ we obtain:

Corollary 2. Let $X, Y$ be independent zero-mean random variables with continuous densities, and similarly let $X^*, Y^*$ be independent zero-mean random variables with continuous densities. Then for any $0 < \lambda < 1$,

    h(\sqrt{\lambda}X + \sqrt{1-\lambda}\,Y) - \lambda h(X) - (1-\lambda)h(Y) \ge h(\sqrt{\lambda}X^* + \sqrt{1-\lambda}\,Y^* \mid -\sqrt{1-\lambda}\,X^* + \sqrt{\lambda}Y^*) - \lambda h(X^*) - (1-\lambda)h(Y^*).    (12)

In fact, since the choice of $(X, Y)$ and $(X^*, Y^*)$ is arbitrary, the latter inequality can be split into two inequalities [5], the EPI and a reverse EPI:

    h(\sqrt{\lambda}X + \sqrt{1-\lambda}\,Y) \ge \lambda h(X) + (1-\lambda)h(Y)
    h(\sqrt{\lambda}X^* + \sqrt{1-\lambda}\,Y^* \mid -\sqrt{1-\lambda}\,X^* + \sqrt{\lambda}Y^*) \le \lambda h(X^*) + (1-\lambda)h(Y^*).    (13)
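As a purely illustrative sanity check of Corollary 2 and of the split (13) (not part of the paper), the following Python sketch evaluates both sides of (12) in closed form when all four variables are Gaussian with unequal variances, using the fact that the differential entropy of a Gaussian, and the conditional entropy of a jointly Gaussian pair, are explicit; the variances are arbitrary choices.

```python
import numpy as np

def h(var):
    """Differential entropy (in nats) of a N(0, var) variable."""
    return 0.5 * np.log(2 * np.pi * np.e * var)

lam = 0.3                 # lambda in (0,1)
s2, t2 = 1.0, 4.0         # var(X), var(Y)
u2, v2 = 2.0, 0.5         # var(X*), var(Y*)

# Left-hand side of (12) for Gaussian X, Y.
lhs = h(lam * s2 + (1 - lam) * t2) - lam * h(s2) - (1 - lam) * h(t2)

# Right-hand side of (12): conditional entropy of the jointly Gaussian rotated pair.
varA = lam * u2 + (1 - lam) * v2              # var of sqrt(lam) X* + sqrt(1-lam) Y*
varB = (1 - lam) * u2 + lam * v2              # var of -sqrt(1-lam) X* + sqrt(lam) Y*
cov = np.sqrt(lam * (1 - lam)) * (v2 - u2)    # covariance of the two combinations
h_cond = h(varA - cov**2 / varB)              # h(X~ | Y~) for a jointly Gaussian pair
rhs = h_cond - lam * h(u2) - (1 - lam) * h(v2)

print(lhs >= rhs)   # (12)
print(lhs >= 0.0)   # EPI half of (13)
print(rhs <= 0.0)   # reverse-EPI half of (13)
```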
2 Generalization to Rényi Entropies

We now extend the same argument to Rényi entropies.

Definition 1 (Hölder Conjugate). Let $p > 0$; its Hölder conjugate is $p'$ such that $\frac{1}{p} + \frac{1}{p'} = 1$. We write $p' = \infty$ if $p = 1$; note that $p'$ can be negative if $p < 1$.

Definition 2 (Rényi Entropy). The Rényi entropy of order $p$ of a random vector $X$ with density $f \in L^p(\mathbb{R}^n)$ is defined by

    h_p(X) = -p' \log\|f\|_p = \frac{1}{1-p}\log \int_{\mathbb{R}^n} f^p.    (14)

As is well known, $h_p(X)$ is non-increasing in $p$ and we recover Shannon's entropy by letting $p \to 1$ from above or below: $h(X) = \lim_{p\to 1} h_p(X)$. We also make the following definitions.

Definition 3 (Power Transformation). Given a random vector $X$ with density $f \in L^\alpha$, we define $X_\alpha$ as the random vector with density

    f_\alpha = \frac{f^\alpha}{\int f^\alpha}.    (15)

Definition 4 (Young's Triple). A Young triple $(p, q, r)$ consists of three positive real numbers such that $p', q', r'$ are of the same sign and

    \frac{1}{p'} + \frac{1}{q'} = \frac{1}{r'}.    (16)

The triple rate $\lambda$ associated to $(p, q, r)$ is the ratio of $1/p'$ to $1/r'$:

    \lambda = \frac{1/p'}{1/r'} = \frac{r'}{p'}, \qquad 1 - \lambda = \frac{1/q'}{1/r'} = \frac{r'}{q'}.    (17)

In other words, $1/p + 1/q = 1 + 1/r$ as in the classical Young's inequality. If $p', q', r'$ are all $> 0$ then $p, q, r > 1$; otherwise $p', q', r' < 0$ and $p, q, r < 1$. Thus we always have $0 < \lambda < 1$.

Definition 5 (Dual Young's Triple). A Young triple $(p^*, q^*, r^*)$ (with rate $\lambda^*$) is dual to $(p, q, r)$ if it satisfies $r^* = \frac{1}{r}$ and $\lambda^* = 1 - \lambda$.

From the definition we have $p, q, r > 1 \iff p^*, q^*, r^* < 1$ and vice versa. Since

    \frac{1}{p^{*\prime}} = \lambda^* \frac{1}{r^{*\prime}} = \frac{1/r' - 1/p'}{1/r'}\,(1 - r) = \frac{1/r - 1/p}{1/r}

and similarly for $q^{*\prime}$, the definition fully determines $(p^*, q^*, r^*)$ as

    \Bigl(p^* = \frac{p}{r},\ q^* = \frac{q}{r},\ r^* = \frac{1}{r}\Bigr).    (18)

We observe from the definition that the dual of $(p^*, q^*, r^*)$ is the original triple $(p, q, r)$. We can now state the following.

Theorem 2. Let $X, Y$ be independent zero-mean random variables with continuous densities, and similarly let $X^*, Y^*$ be independent zero-mean random variables with continuous densities. Then for any Young's triple $(p, q, r)$ with dual $(p^*, q^*, r^*)$,

    h_r(\sqrt{\lambda}X_{1/p} + \sqrt{1-\lambda}\,Y_{1/q}) - \lambda h_p(X_{1/p}) - (1-\lambda)h_q(Y_{1/q}) \ge \lambda^* h_{p^*}(X^*_{1/p^*}) + (1-\lambda^*)h_{q^*}(Y^*_{1/q^*}) - h_{r^*}(-\sqrt{\lambda^*}X^*_{1/p^*} + \sqrt{1-\lambda^*}\,Y^*_{1/q^*}).    (19)

Proof. The proof uses the same transportation argument $X = T(X^*)$, $Y = U(Y^*)$ as above, combined with an application of Hölder's inequality. It is omitted due to lack of space (but see § 3.2 below). □

Remark 2. In (19), terms like $h_p(X_{1/p})$ may be simplified since

    h_p(X_{1/p}) = \frac{1}{1-p}\log\frac{\int f}{\bigl(\int f^{1/p}\bigr)^p} = \frac{1}{1 - 1/p}\log\int f^{1/p} = h_{1/p}(X).    (20)

The above form was chosen to stress the similarity with (12).

Remark 3. The inequality (19) is invariant by duality, in the sense that if we permute the roles of all variables $(p, q, r, \lambda, X, Y)$ and starred variables $(p^*, q^*, r^*, \lambda^*, X^*, Y^*)$ we obtain the exact same inequality.

Remark 4. The case of equality can be determined as in the proof of Theorem 1: this is the case where $T' = U'$ is constant, hence the transformations $T, U$ are linear. Hence equality holds in (19) if and only if there exists a constant $c > 0$ such that $X$ has the same distribution as $cX^*$ and $Y$ has the same distribution as $cY^*$.
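A small sketch in Python's exact rational arithmetic (the particular triple is an arbitrary example, not taken from the paper) can be used to check Definitions 4 and 5: Young's relation, its equivalent form (16), the triple rate (17), the closed form (18) of the dual triple, and the fact that duality is an involution.

```python
from fractions import Fraction

def conj(p):
    """Hoelder conjugate p' with 1/p + 1/p' = 1 (negative when p < 1)."""
    return p / (p - 1)

def dual_triple(p, q, r):
    """Dual Young triple (p*, q*, r*) = (p/r, q/r, 1/r), cf. (18)."""
    return p / r, q / r, 1 / r

# An arbitrary illustrative Young triple with p, q, r > 1: 1/p + 1/q = 1 + 1/r.
p, q, r = Fraction(6, 5), Fraction(3, 2), Fraction(2)
assert 1/p + 1/q == 1 + 1/r                    # classical Young relation
assert 1/conj(p) + 1/conj(q) == 1/conj(r)      # equivalent form (16)

lam = conj(r) / conj(p)                        # triple rate, cf. (17)
assert 1 - lam == conj(r) / conj(q)

ps, qs, rs = dual_triple(p, q, r)              # here (3/5, 3/4, 1/2), all < 1
assert 1/conj(ps) + 1/conj(qs) == 1/conj(rs)   # the dual is again a Young triple
assert conj(rs) / conj(ps) == 1 - lam          # lambda* = 1 - lambda
assert dual_triple(ps, qs, rs) == (p, q, r)    # duality is an involution
```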
3 Some Applications

3.1 Back to Shannon's Entropy-Power Inequality

There is a striking similarity between Theorem 2 and Corollary 2. In fact, for fixed $\lambda = 1 - \lambda^*$, we can let $p, q, r \to 1$ from above (or below), so that $p^*, q^*, r^* \to 1$ from below (or above), to obtain

    h(\sqrt{\lambda}X + \sqrt{1-\lambda}\,Y) - \lambda h(X) - (1-\lambda)h(Y) \ge (1-\lambda)h(X^*) + \lambda h(Y^*) - h(-\sqrt{1-\lambda}\,X^* + \sqrt{\lambda}Y^*).    (21)

This is exactly (12) in Corollary 2, because the right-hand side can be rewritten as

    (1-\lambda)h(X^*) + \lambda h(Y^*) - h(-\sqrt{1-\lambda}\,X^* + \sqrt{\lambda}Y^*)
    = h(X^*) + h(Y^*) - h(-\sqrt{1-\lambda}\,X^* + \sqrt{\lambda}Y^*) - \lambda h(X^*) - (1-\lambda)h(Y^*)
    = h(\sqrt{\lambda}X^* + \sqrt{1-\lambda}\,Y^*,\ -\sqrt{1-\lambda}\,X^* + \sqrt{\lambda}Y^*) - h(-\sqrt{1-\lambda}\,X^* + \sqrt{\lambda}Y^*) - \lambda h(X^*) - (1-\lambda)h(Y^*)    (22)
    = h(\sqrt{\lambda}X^* + \sqrt{1-\lambda}\,Y^* \mid -\sqrt{1-\lambda}\,X^* + \sqrt{\lambda}Y^*) - \lambda h(X^*) - (1-\lambda)h(Y^*)    (23)

where (22) holds because the entropy is invariant by rotation. Thus, Theorem 2 implies the classical Shannon entropy-power inequality. It is the natural generalization to Rényi entropies using optimal transport arguments.

Remark 5. The above calculation (22)-(23) also shows that the EPI and the "reverse EPI" (13) are in fact equivalent, as already noted in [5]. This is due to the fact that Theorem 2 is invariant by duality (Remark 3).

3.2 Relation to Sharp Young Direct and Reverse Inequalities

To simplify the presentation we stay with one-dimensional random variables. As in Corollary 2, since the choice of $(X, Y)$ and $(X^*, Y^*)$ is arbitrary, (19) can be simplified. If we let $X_{1/p}, Y_{1/q}$ be i.i.d. centered Gaussian, then $\sqrt{\lambda}X_{1/p} + \sqrt{1-\lambda}\,Y_{1/q}$ also has the same Gaussian distribution, and since the Rényi entropy of a Gaussian variable $X \sim \mathcal{N}(m, \sigma^2)$ is easily found to be

    h_p(X) = \frac{p'\log p}{2p} + \log\sqrt{2\pi\sigma^2},    (24)

the l.h.s. of (19) is equal to $\frac{r'}{2}\bigl(\frac{\log r}{r} - \frac{\log p}{p} - \frac{\log q}{q}\bigr)$. By the equality case (Remark 4) this expression is also the value taken by the r.h.s. of (19) when $X^*_{1/p^*}, Y^*_{1/q^*}$ are i.i.d. Gaussian (this can also be checked directly from the above definition of the dual Young's triple). Therefore, this expression can be inserted between the two sides of (19) in Theorem 2. In other words, (19) is split into two equivalent inequalities which can be rewritten as

    h_r(\sqrt{\lambda}X + \sqrt{1-\lambda}\,Y) - \lambda h_p(X) - (1-\lambda)h_q(Y) \ge \frac{r'}{2}\Bigl(\frac{\log r}{r} - \frac{\log p}{p} - \frac{\log q}{q}\Bigr)    (25)

with equality if and only if $X$ and $Y$ are i.i.d. Gaussian. Plugging in the definition (14) of Rényi entropies and dividing by $r'$ (which can be positive or negative), it is easily found [5] that (25) yields the optimal Young's direct and reverse inequalities:

    \sqrt{\frac{r^{1/r}}{|r'|^{1/r'}}}\;\|f * g\|_r \le \sqrt{\frac{p^{1/p}}{|p'|^{1/p'}}}\;\|f\|_p \cdot \sqrt{\frac{q^{1/q}}{|q'|^{1/q'}}}\;\|g\|_q    (26)

for $p, q, r > 1$ ($r' > 0$), and the reverse inequality for $0 < p, q, r < 1$ ($r' < 0$), where $f$ and $g$ denote the densities of $\sqrt{\lambda}X$ and $\sqrt{1-\lambda}\,Y$. Equality holds if and only if $X/\sqrt{p'}$ and $Y/\sqrt{q'}$ are i.i.d. Gaussian. In fact, a closer look at (19) shows that it coincides with Barthe's transportation proof of the sharp Young's inequalities [6, Lemma 1], which uses the same change of variables $X = T(X^*)$, $Y = U(Y^*)$ as above.
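The equality case of (25) can be checked in closed form with the Gaussian Rényi entropy (24): for $X, Y$ i.i.d. standard Gaussian, the left-hand side of (25) equals the constant on the right. A minimal Python sketch (assuming NumPy; the Young triple with $r > 1$ is an arbitrary example, not from the paper):

```python
import numpy as np

def conj(p):
    """Hoelder conjugate."""
    return p / (p - 1.0)

def h_renyi_gauss(p, var=1.0):
    """Rényi entropy of order p of N(0, var), cf. (24)."""
    return 0.5 * np.log(2 * np.pi * var) + np.log(p) / (2 * (p - 1.0))

# Arbitrary illustrative Young triple with r > 1 (direct case): 1/p + 1/q = 1 + 1/r.
p, q, r = 1.2, 1.5, 2.0
lam = conj(r) / conj(p)

# For X, Y i.i.d. N(0,1), sqrt(lam) X + sqrt(1-lam) Y is again N(0,1),
# so the left-hand side of (25) is explicit.
lhs = h_renyi_gauss(r) - lam * h_renyi_gauss(p) - (1 - lam) * h_renyi_gauss(q)

# Sharp constant on the right-hand side of (25).
const = 0.5 * conj(r) * (np.log(r) / r - np.log(p) / p - np.log(q) / q)

print(np.isclose(lhs, const))   # equality case of (25) attained by i.i.d. Gaussians
```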
3.3 Rényi Entropy-Power Inequalities

Again, to simplify the presentation we stay with two one-dimensional independent random variables $X, Y$. By analogy with the entropy-power (1), the Rényi entropy-power of order $p$ is defined by

    N_p(X) = \frac{\exp\bigl(\frac{2}{n} h_p(X)\bigr)}{2\pi e}.    (27)

We have the following characterization, which is an immediate generalization of the classical case $r = c = 1$.

Lemma 1. Let $r > 0$, $c > 0$. The Rényi entropy-power inequality

    N_r(X + Y) \ge c\,\bigl(N_r(X) + N_r(Y)\bigr)    (28)

is equivalent to

    h_r(\sqrt{\lambda}X + \sqrt{1-\lambda}\,Y) - \lambda h_r(X) - (1-\lambda)h_r(Y) \ge \tfrac{1}{2}\log c \qquad \forall\,\lambda \in (0,1).    (29)

Now suppose $p^*, q^*, r^* < 1$, so that $r > 1$ is greater than $p$ and $q$. Since $h_p(X)$ is non-increasing in $p$, one has $h_p(X) \ge h_r(X)$ and $h_q(Y) \ge h_r(Y)$; hence Theorem 2 in the form (25) implies (29) for any $\lambda \in (0,1)$, provided that $\frac{1}{2}\log c$ is taken as the minimum of the r.h.s. of (25) over all $p, q$ such that $1/p + 1/q = 1 + 1/r$. The method can easily be generalized to more than two independent random variables. In this way we obtain the recent Rényi entropy-power inequalities obtained by Bobkov and Chistyakov [7] and by Ram and Sason [8].

References

1. C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, pp. 379–423, 623–656, 1948.
2. C. E. Shannon and W. Weaver, La théorie mathématique de la communication. Paris, France: Cassini, 2017.
3. O. Rioul, "Information theoretic proofs of entropy power inequalities," IEEE Trans. Inf. Theory, vol. 57, no. 1, pp. 33–55, Jan. 2011.
4. O. Rioul, "Yet another proof of the entropy power inequality," IEEE Trans. Inf. Theory, to appear; draft available at https://arxiv.org/abs/1606.05969.
5. O. Rioul, "Optimal transportation to the entropy-power inequality," in IEEE Information Theory and Applications Workshop (ITA 2017), San Diego, USA, Feb. 2017.
6. F. Barthe, "Optimal Young's inequality and its converse: A simple proof," GAFA, Geom. Funct. Anal., vol. 8, no. 2, pp. 234–242, 1998.
7. S. G. Bobkov and G. P. Chistyakov, "Entropy power inequality for the Rényi entropy," IEEE Trans. Inf. Theory, vol. 61, no. 2, pp. 708–714, Feb. 2015.
8. E. Ram and I. Sason, "On Rényi entropy power inequalities," IEEE Trans. Inf. Theory, vol. 62, no. 12, pp. 6800–6815, Dec. 2016.