Machine Learning - Stanford - Coursera
1.0.0
  • Acknowledgements
  • Introduction
  • Linear Algebra Review
  • Types of Machine Learning
  • Supervised Learning
    • Linear Regression
      • Linear Regression in One Variable
        • Cost Function
        • Gradient Descent
      • Multivariate Linear Regression
        • Cost Function
        • Gradient Descent
        • Feature Scaling
        • Mean Normalization
        • Choosing the Learning Rate α
    • Polynomial Regression
      • Normal Equation
      • Gradient Descent vs. Normal Equation

Mean Normalization

In addition to scaling the features, we can also apply mean normalization.

Here, we replace $x_i$ with $x_i - \mu_i$ so that the features have approximately zero mean.

(Note: this is not applied to $x_0$, which has the fixed value 1.)

In general, we can scale the features using mean normalization with the following formula:

$$x_i = \frac{x_i - \mu_i}{S_i}$$

where $x_i$ is the $i^{th}$ feature, $\mu_i$ is its mean, and $S_i$ is its range (i.e. max − min).

This puts $x_i$ approximately in the range $[-0.5, 0.5]$, which helps gradient descent converge faster.
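The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the course; the feature values are made up, and the intercept column $x_0$ of ones is assumed to be excluded from the matrix:

```python
import numpy as np

# Hypothetical feature matrix: rows are training examples,
# columns are features (e.g. house size and number of bedrooms).
# The intercept feature x_0 = 1 is NOT included here.
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 4.0],
              [1416.0, 2.0]])

mu = X.mean(axis=0)                 # per-feature mean mu_i
S = X.max(axis=0) - X.min(axis=0)   # per-feature range S_i (max - min)

# Mean normalization: subtract the mean, divide by the range.
X_norm = (X - mu) / S
```

By construction each normalized column has mean 0, and every value lies within $[-1, 1]$, typically close to $[-0.5, 0.5]$.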
