Covariance

Covariance and Correlation

When two random variables, X and Y, are not independent.

The covariance between the two is defined by:

Cov(X,Y) = E[(X-E(X))(Y-E(Y))]

Descrite: sum(x)sum(y) (x - mue sub x)(y - mue sub y)P(X = x, Y = y)
Continuous: ∫(-inf, inf)∫(-inf, inf)(x- mue sub x)(y- mue sub y)f(x,y)dxdy

mue sub x = sum(x)P(X = x) // the mean

The last bit on those two are the joint density function.

If both variables tend to deviate in the same direction, the covariance will be positive. If the opisite is true, then the covariance will be negative. If they are not strongly linearly related, the covariance will be near 0.

Computational formula for covariance

Cov(X,Y) = E(XY) - E(X)E(Y)
Other formulas for real numbers a and b:

E(aX + bY) = aE(X) + bE(Y)
V(zX + bY) = a^2V(X) + b^2V(Y)+2abCov(X,Y)

Correlation Coefficient

Scaled covariance - the correlation is always between -1 and 1.

Notation: Cor(X,Y) or 𝓅X,Y

Cor(X,Y) = Cov(X,Y)/standard_div(x)standard_div(y)

If Y = aX + b:

Cor(X,Y) = {
    1 if a > 0
    -1 if a < 0
}

Independence

Recall, X and Y are independent if P(X=x, Y=y)=P(X=x)P(Y=y)

If X and Y are independent:

Cov(X,Y) = 0
Cor(X,Y) = 0

However if the covariance of two vars is 0, you cannot conclude they are independent.