Bias and Variance of an Estimator

Consider the following estimators of the mean of a distribution, where $X = \{x^1, \dots, x^N\}$ is an i.i.d. sample drawn from that distribution.

  1. $m_1 = \frac{\sum_t x^t}{N}$ (this is the MLE)

  2. $m_2 = \frac{x^1 + x^N}{2}$

  3. $m_3 = 5$

Now, draw a sample of size $N$ (say $N = 3$): $X = \{6, 1, 5\}$

$m_1 = 4, \; m_2 = 11/2, \; m_3 = 5$
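As a quick sanity check, here is a minimal Python sketch (variable names are illustrative) that evaluates the three estimators on this sample:

```python
# Evaluate the three estimators on the sample X = {6, 1, 5}.
X = [6, 1, 5]
N = len(X)

m1 = sum(X) / N          # sample mean (the MLE): (6 + 1 + 5) / 3 = 4
m2 = (X[0] + X[-1]) / 2  # mean of first and last points: (6 + 5) / 2 = 11/2
m3 = 5                   # a constant; it ignores the data entirely

print(m1, m2, m3)        # 4.0 5.5 5
```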

Each estimate depends on the random sample, so the estimators themselves are random variables, and each of them has an expected value and a variance.

Say we want to estimate $\Theta$ (here, $\Theta = \mu$, the mean of the distribution from which we are drawing $X$).

A desirable property of an estimator $d$ of $\Theta$ is that its expected value equals the quantity we want to estimate, i.e. $E[d] = \Theta$. Such a $d$ is called an unbiased estimator.

The bias of an estimator $d$ is given by:

$b_\Theta(d) = E[d] - \Theta$

If $b_\Theta(d) = 0$, $d$ is an unbiased estimator.

Is $m_2$ an unbiased estimator of $\mu$?

$m_2 = \frac{x^1 + x^N}{2}$

$E[m_2] = E\left[\frac{x^1 + x^N}{2}\right] = \frac{1}{2}E[x^1 + x^N] = \frac{1}{2}E[x^1] + \frac{1}{2}E[x^N]$

Since $E[x^t] = \mu$ (by definition), $E[m_2] = \frac{1}{2}\mu + \frac{1}{2}\mu = \mu$

Therefore, $m_2$ is an unbiased estimator of the mean $\mu$.
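This can also be checked empirically. In the sketch below, the Gaussian with $\mu = 5$, $\sigma = 2$ is an arbitrary choice made for illustration; any distribution with a finite mean would do:

```python
import random

# Monte Carlo check that E[m2] = mu: average m2 over many i.i.d. samples.
random.seed(0)
mu, sigma, N, trials = 5.0, 2.0, 3, 100_000

m2_sum = 0.0
for _ in range(trials):
    X = [random.gauss(mu, sigma) for _ in range(N)]
    m2_sum += (X[0] + X[-1]) / 2

print(m2_sum / trials)  # close to 5.0, consistent with E[m2] = mu
```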

Is $m_3$ an unbiased estimator of $\mu$?

$E[m_3] = E[5] = 5$

Clearly, $E[m_3] - \mu = 0$ iff $\mu = 5$. Therefore, $m_3$ is not, in general, an unbiased estimator of the mean $\mu$.

The variance of an estimator $d$ is given by:

$E[(d - E[d])^2]$

More data leads to lower variance: for the sample mean, $\operatorname{Var}(m_1) = \sigma^2/N$, which shrinks as the sample size $N$ grows.

$m_3$ has the least variance, namely zero (it is always 5!). Since $m_2$ averages only two points, $\operatorname{Var}(m_2) = \sigma^2/2$ regardless of $N$, so for $N > 2$, $m_1$ has a lower variance than $m_2$.
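A simulation makes this ordering concrete (again with an arbitrary Gaussian, and $N = 10$ so that $N > 2$):

```python
import random, statistics

# Empirical variances of m1 and m2 across many samples.
random.seed(0)
mu, sigma, N, trials = 5.0, 2.0, 10, 100_000

m1s, m2s = [], []
for _ in range(trials):
    X = [random.gauss(mu, sigma) for _ in range(N)]
    m1s.append(sum(X) / N)
    m2s.append((X[0] + X[-1]) / 2)

print(statistics.variance(m1s))  # ~ sigma**2 / N = 0.4
print(statistics.variance(m2s))  # ~ sigma**2 / 2 = 2.0
# m3 is always 5, so its variance is exactly 0.
```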

The mean squared error of an estimator is given by:

$E[(d - \Theta)^2] = (E[d] - \Theta)^2 + E[(d - E[d])^2] = \text{Bias}^2 + \text{Variance}$
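This decomposition can be verified numerically. The sketch below estimates all three quantities for $m_2$ (the Gaussian parameters are again illustrative) and checks that they agree:

```python
import random, statistics

# Empirical check that MSE = Bias^2 + Variance, using m2 as the estimator.
random.seed(0)
mu, sigma, N, trials = 5.0, 2.0, 10, 100_000

est = []
for _ in range(trials):
    X = [random.gauss(mu, sigma) for _ in range(N)]
    est.append((X[0] + X[-1]) / 2)

mean_est = sum(est) / trials
bias_sq = (mean_est - mu) ** 2                  # squared bias
variance = statistics.pvariance(est, mean_est)  # spread around E[d]
mse = sum((e - mu) ** 2 for e in est) / trials  # spread around Theta

print(mse, bias_sq + variance)  # the two numbers match
```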
