This section discusses how to estimate the parameters of a distribution from a sample, e.g. the mean μ and variance σ² of a Gaussian, or the coefficients w0 and w1 of a line f(x) = w0 + w1x.
We denote the parameters by Θ = (μ, σ²).
The likelihood of Θ given a sample X is the product of the probabilities of the individual observations:
l(Θ|X) = ∏_t p(x^t | Θ)
Therefore, the log likelihood of Θ given a sample X is given by:
L(Θ|X) := log l(Θ|X) = ∑_t log p(x^t | Θ)
This assumes that the observations in X are independent.
The Maximum Likelihood Estimator (MLE) is given by:
Θ* := argmax_Θ L(Θ|X)
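As a minimal illustrative sketch (not from the original notes; the sample values and candidate grid below are made up), the following Python snippet evaluates the Gaussian log likelihood ∑_t log p(x^t | Θ) for a few candidate values of Θ = (μ, σ²) and picks the maximizer by brute force:

```python
import math

def gaussian_log_likelihood(sample, mu, var):
    """Log likelihood L(Theta | X) = sum_t log p(x^t | mu, var) for a Gaussian."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        for x in sample
    )

# Hypothetical sample and candidate parameter values (illustration only).
X = [2.1, 1.9, 2.4, 2.0, 1.6]
candidates = [(mu, var) for mu in (1.5, 2.0, 2.5) for var in (0.05, 0.1, 0.5)]

# The MLE over this small grid is the Theta with the largest log likelihood.
best = max(candidates, key=lambda theta: gaussian_log_likelihood(X, *theta))
print("best (mu, var) on the grid:", best)
```

In practice the maximum is usually found analytically (as in the closed-form estimates below) or by numerical optimization rather than by a grid search.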
Estimating the Parameter p_h of a Bernoulli Distribution
X is a Bernoulli random variable with p_h = P[X = 1].
For example, consider the following:
Let 1 denote Heads, and 0 denote Tails.
Say X = {1,1,0}
We need to determine Θ, i.e. p_h.
We have l(p_h|X) = P(X|p_h) = p_h · p_h · (1 − p_h).
More generally, for X = {x^t}_{t=1}^N, we have:
p(X | p_h) = ∏_{t=1}^N p_h^{x^t} (1 − p_h)^{1 − x^t}
It can be proved that the MLE is the sample mean: p_h = (1/N) ∑_t x^t. For the example above, X = {1, 1, 0} gives p_h = 2/3.
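A minimal sketch of this closed-form estimate in Python (the function name and sample are assumptions for illustration):

```python
def bernoulli_mle(sample):
    """MLE of p_h for Bernoulli data: the fraction of 1s in the sample."""
    return sum(sample) / len(sample)

X = [1, 1, 0]            # 1 = heads, 0 = tails, as in the example above
print(bernoulli_mle(X))  # 0.666..., i.e. p_h = 2/3
```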
Estimating the Parameters of a Multinomial Distribution
Consider a die with 6 faces numbered from 1 to 6.
If X is a multinomial random variable, it takes one of k > 2 possible values (here, k = 6).
Say X = {5, 4, 6}. We can represent each observation x^t by an indicator vector (x_1^t, …, x_6^t), where x_i^t = 1 if the t-th roll shows side i: [0 0 0 0 1 0], [0 0 0 1 0 0], and [0 0 0 0 0 1].
Say X = {4, 6, 4, 2, 3, 3}. The MLE of p_i, the probability that side i shows up, is the fraction of observations equal to i:
p_i = (1/N) ∑_{t=1}^N x_i^t
For example, side 4 appears twice among N = 6 rolls, so p_4 = 2/6 = 1/3.
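A small Python sketch (assumed for illustration, not part of the original notes) that computes these relative-frequency estimates for the sample above:

```python
from collections import Counter

def multinomial_mle(sample, k):
    """MLE of p_i for each side i = 1..k: relative frequency in the sample."""
    counts = Counter(sample)
    n = len(sample)
    return [counts[i] / n for i in range(1, k + 1)]

X = [4, 6, 4, 2, 3, 3]
print(multinomial_mle(X, k=6))  # approximately [0.0, 0.167, 0.333, 0.333, 0.0, 0.167]
```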
Estimating the Parameters of a Gaussian Distribution
The MLE for the mean is m = (1/N) ∑_t x^t, and the MLE for the variance is s² = (1/N) ∑_t (x^t − m)².
However, if we divide by N − 1 instead of N when estimating the variance, we obtain the unbiased estimate.
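A short Python sketch (with a hypothetical sample) computing the Gaussian MLE for the mean and variance, plus the unbiased N − 1 variance estimate:

```python
def gaussian_mle(sample):
    """MLE of (m, s^2) for a Gaussian: sample mean and biased sample variance."""
    n = len(sample)
    m = sum(sample) / n
    var_mle = sum((x - m) ** 2 for x in sample) / n             # divide by N
    var_unbiased = sum((x - m) ** 2 for x in sample) / (n - 1)  # divide by N - 1
    return m, var_mle, var_unbiased

X = [2.1, 1.9, 2.4, 2.0, 1.6]  # hypothetical sample
print(gaussian_mle(X))
```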