CS-GY 6923: Machine Learning
Parametric Estimation for Simple Polynomial Regression

In cases where the data cannot be fit well by a linear function, we may want to use polynomial regression.

Say we want to fit a degree-2 polynomial. The model is given by:

$$g(x \mid w_2, w_1, w_0) = w_2 x^2 + w_1 x + w_0$$

Our aim is to find values for $w_0, w_1, w_2$ that minimize the squared error $\sum_t (r^t - g(x^t))^2$.
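
As a concrete check of this objective, here is a minimal numpy sketch (the function name `squared_error` and the weight ordering `[w0, w1, w2]` are our own conventions, not from the course):

```python
import numpy as np

def squared_error(w, x, r):
    """Sum of squared residuals for g(x) = w2*x^2 + w1*x + w0.

    w: array [w0, w1, w2]; x, r: 1-D arrays of inputs and targets.
    """
    g = w[2] * x**2 + w[1] * x + w[0]
    return np.sum((r - g) ** 2)
```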

Note: Given a dataset $\{x^t, r^t\}_{t=1}^N$ where $x^t \in R$, i.e. where $x^t = [x_1^t]$ (1 dimension), we can find the degree-2 polynomial $g(x \mid w_2, w_1, w_0) = w_2 x^2 + w_1 x + w_0$ that minimizes the squared error as follows. Construct a related dataset with inputs in $R^2$ (2 dimensions), where the second dimension is $x_2^t = (x_1^t)^2$. Then run multivariate linear regression on this new dataset to obtain the $w_2, w_1, w_0$ that minimize the squared error. Finally, output $g(x \mid w_2, w_1, w_0) = w_2 x^2 + w_1 x + w_0$ with these values as the best degree-2 polynomial fit to the original dataset, as in the sketch below.
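
Below is a minimal numpy sketch of this reduction on a synthetic dataset (the data, variable names, and the use of `np.linalg.lstsq` as the least-squares solver are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D dataset {x^t, r^t}: noisy samples of a quadratic.
x = rng.uniform(-3, 3, size=50)
r = 2.0 * x**2 - 1.0 * x + 0.5 + rng.normal(0, 0.5, size=50)

# Construct the related 2-D dataset: first dimension x, second x^2,
# plus a column of ones so the intercept w0 is learned as well.
X = np.column_stack([np.ones_like(x), x, x**2])

# Multivariate linear regression (least squares) on the new dataset
# recovers the w0, w1, w2 that minimize the squared error.
w, *_ = np.linalg.lstsq(X, r, rcond=None)
w0, w1, w2 = w
print(f"g(x) = {w2:.3f} x^2 + {w1:.3f} x + {w0:.3f}")
```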

This can be extended to higher-degree polynomials as well: to fit a degree-$k$ polynomial, construct a dataset with $k$ input dimensions $x_j^t = (x_1^t)^j$ for $j = 1, \dots, k$, and run multivariate linear regression on it.
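
A sketch of the general degree-$k$ version, again assuming numpy; `fit_polynomial` is a hypothetical helper name, and `np.vander` is used here just to build the columns $x^0, x^1, \dots, x^k$:

```python
import numpy as np

def fit_polynomial(x, r, k):
    """Fit a degree-k polynomial to (x, r) by least squares.

    Columns of X are x^0, x^1, ..., x^k, so the returned weights
    are ordered [w0, w1, ..., wk].
    """
    X = np.vander(x, N=k + 1, increasing=True)
    w, *_ = np.linalg.lstsq(X, r, rcond=None)
    return w
```

For example, `w = fit_polynomial(x, r, 3)` returns `[w0, w1, w2, w3]` for a cubic fit. Note that fitting high degrees on raw powers of $x$ can be numerically ill-conditioned, which is one practical reason to keep the degree small.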