I think the following two formulas are true:
$$\mathrm{Var}(aX)=a^2\,\mathrm{Var}(X)$$ where $a$ is a constant, and $$\mathrm{Var}(X+Y)=\mathrm{Var}(X)+\mathrm{Var}(Y)$$ if $X$ and $Y$ are independent.
However, I am not sure what is wrong with the following:
$$\mathrm{Var}(2X) = \mathrm{Var}(X+X) = \mathrm{Var}(X) + \mathrm{Var}(X),$$ which does not equal $2^2\,\mathrm{Var}(X)$, i.e. $4\,\mathrm{Var}(X)$.
If $X$ is a sample taken from a population, I think we can always assume $X$ to be independent of the other $X$s.
So where does my reasoning go wrong?
$\DeclareMathOperator{\Cov}{Cov}$ $\DeclareMathOperator{\Corr}{Corr}$ $\DeclareMathOperator{\Var}{Var}$
The problem with your line of reasoning is
“I think we can always assume $X$ to be independent from the other $X$s.”
$X$ is not independent of $X$. The symbol $X$ is being used to refer to the same random variable here. Once you know the value of the first $X$ in your formula, the value of the second $X$ is fixed as well.
If two variables $X$ and $Y$ are independent then $\Pr(X=a \mid Y=b)$ is the same as $\Pr(X=a)$: knowing the value of $Y$ does not give us any additional information about the value of $X$. But $\Pr(X=a \mid X=b)$ is $1$ if $a=b$ and $0$ otherwise: knowing the value of $X$ gives you complete information about the value of $X$.
Another way of seeing things is that if two variables are independent then they have zero correlation (though zero correlation does not imply independence!), but $X$ is perfectly correlated with itself, $\Corr(X,X)=1$, so $X$ can't be independent of itself. Note that since the covariance is given by $\Cov(X,Y)=\Corr(X,Y)\sqrt{\Var(X)\Var(Y)}$, then
$$\Cov(X,X) = 1 \cdot \sqrt{\Var(X)^2} = \Var(X)$$
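If it helps to see this numerically, here is a quick sanity check (a sketch using NumPy; the distribution and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=1_000_000)  # draws of X, Var(X) = 4

# X is perfectly correlated with itself
print(np.corrcoef(x, x)[0, 1])                 # exactly 1.0

# Cov(X, X) recovers Var(X)
print(np.cov(x, x)[0, 1], np.var(x, ddof=1))   # both ~4.0
```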
The more general formula for the variance of a sum of two random variables is
$$\Var(X+Y) = \Var(X) + \Var(Y) + 2\Cov(X,Y)$$
In particular, $\Cov(X,X) = \Var(X)$, so
$$\Var(X+X) = \Var(X) + \Var(X) + 2\Var(X) = 4\Var(X),$$
which is the same as you would have deduced from applying the rule
$$\Var(aX) = a^2 \Var(X) \implies \Var(2X) = 4\Var(X)$$
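If a simulation makes the contrast more concrete, here is a sketch (assuming NumPy; standard normal draws are just a convenient choice): adding the *same* draws to themselves quadruples the variance, while adding an *independent* copy only doubles it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x       = rng.normal(size=n)   # draws of X, Var(X) = 1
x_indep = rng.normal(size=n)   # an independent copy of X

# X + X: the same draws added to themselves, Var = 4 * Var(X)
print(np.var(x + x))           # ~4.0

# X + X': an independent copy, Var = 2 * Var(X)
print(np.var(x + x_indep))     # ~2.0
```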
If you are interested in linearity, then you might be interested in the bilinearity of covariance. For random variables $W$, $X$, $Y$ and $Z$ (whether dependent or independent) and constants $a$, $b$, $c$ and $d$ we have
$$\Cov(aW + bX, Y) = a\Cov(W,Y) + b\Cov(X,Y)$$
$$\Cov(X, cY + dZ) = c\Cov(X,Y) + d\Cov(X,Z)$$
and overall,
$$\Cov(aW + bX, cY + dZ) = ac\Cov(W,Y) + ad\Cov(W,Z) + bc\Cov(X,Y) + bd\Cov(X,Z)$$
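Here is a numerical check of that identity (again a sketch with NumPy; the shared `base` component and the constants are arbitrary choices, made so that the covariances are nonzero):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
base = rng.normal(size=n)              # shared component, makes the variables dependent
w = base + rng.normal(size=n)
x = 0.5 * base + rng.normal(size=n)
y = -base + rng.normal(size=n)
z = rng.normal(size=n)                 # independent of the others
a, b, c, d = 2.0, -1.0, 0.5, 3.0

def cov(u, v):
    # sample covariance of two equal-length samples
    return np.cov(u, v)[0, 1]

lhs = cov(a * w + b * x, c * y + d * z)
rhs = a * c * cov(w, y) + a * d * cov(w, z) + b * c * cov(x, y) + b * d * cov(x, z)
print(lhs, rhs)   # agree up to sampling noise
```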
You can then use this to prove the (non-linear) results for variance that you wrote in your post:
$$\Var(aX) = \Cov(aX, aX) = a^2 \Cov(X,X) = a^2 \Var(X)$$
$$\begin{align} \Var(aX + bY) &= \Cov(aX + bY, aX + bY) \\ &= a^2 \Cov(X,X) + ab \Cov(X,Y) + ba \Cov(X,Y) + b^2 \Cov(Y,Y) \\ &= a^2 \Var(X) + b^2 \Var(Y) + 2ab \Cov(X,Y) \end{align}$$
The latter gives, as a special case when $a=b=1$,
$$\Var(X+Y) = \Var(X) + \Var(Y) + 2\Cov(X,Y)$$
When $X$ and $Y$ are uncorrelated (which includes the case where they are independent), this reduces to $\Var(X+Y) = \Var(X) + \Var(Y)$. So if you want to manipulate variances in a “linear” way (which is often a nice way to work algebraically), then work with the covariances instead, and exploit their bilinearity.
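As a last sanity check, the full formula holds even when $X$ and $Y$ are correlated, while the naive sum rule is off by exactly $2\Cov(X,Y)$ (a sketch assuming NumPy; taking $Y = X + \text{noise}$ is just one convenient way to induce correlation):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)     # Y is correlated with X

lhs   = np.var(x + y)
full  = np.var(x) + np.var(y) + 2 * np.cov(x, y)[0, 1]
naive = np.var(x) + np.var(y)
print(lhs, full, naive)        # lhs matches full; naive is off by 2 * Cov(X, Y)
```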
Variance isn't linear; your first statement shows this (if it were, you'd have $\Var(aX) = a\Var(X)$). Covariance, on the other hand, is bilinear. – Batman
