CS-GY 6923: Machine Learning
Ensemble Learning

Ensemble learning is the idea of combining multiple learners to obtain a better predictor than any single learner alone.

Different learners could use:

  • different algorithms

  • different parameters

  • different training sets

There is, however, little theoretical justification for just generating hypotheses using different learning algorithms. Instead, run the same algorithm on different training sets.
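
As an illustration of the first idea (combining learners that use different algorithms by a simple vote), here is a minimal sketch using scikit-learn's VotingClassifier. The library, the synthetic dataset, and the specific learners are assumptions for the sketch, not part of the notes.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Toy data, just to make the sketch runnable.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Three learners using different algorithms, combined by a majority vote.
ensemble = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
        ("tree", DecisionTreeClassifier(max_depth=3)),
    ],
    voting="hard",  # each learner gets one unweighted vote
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
```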

Two common ensemble learning techniques:

  • bagging

  • boosting

Boosting

Boosting uses multiple weak learners, i.e. learners whose accuracy is only slightly better than random guessing (say, 51% on a binary classification problem).

For example, use several Decision Stumps (i.e. Decision Trees with just a root node and leaves).

  • First, assign equal weights to all training examples.

  • Run a weak learner on the training set, and get a hypothesis h.

  • Now, increase the weights (using a certain formula) of the misclassified examples. (Intuitively, if an example has weight 3, it is treated as if it were repeated 3 times, so there are 3 copies of it in the updated dataset.)

  • Now, train a new weak learner on the updated dataset.

  • Keep repeating this, until a good accuracy is achieved.

  • In effect, each new weak learner learns from the mistakes of the previous weak learner.

  • In the end (we decide when to stop), we have one hypothesis per round. The prediction for an example is a weighted vote (using a certain formula) of all these hypotheses; a concrete sketch of one standard choice for these formulas follows below.

Theoretically, overfitting may be a concern, but it isn't usually a problem if the learners are weak.
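
The notes leave the weight-update and voting formulas unspecified ("a certain formula"). One standard choice is AdaBoost; the sketch below shows those formulas in code, assuming binary labels in {-1, +1} and scikit-learn decision stumps as the weak learners (both assumptions, not prescribed by the notes).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, n_rounds=50):
    """AdaBoost-style boosting with decision stumps; y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                 # start with equal weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)  # a decision stump
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w[pred != y])           # weighted training error
        if err >= 0.5:                       # no better than random: stop
            break
        alpha = 0.5 * np.log((1 - err) / (err + 1e-12))
        w *= np.exp(-alpha * y * pred)       # up-weight misclassified examples
        w /= w.sum()                         # renormalize weights
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def boost_predict(stumps, alphas, X):
    # Weighted vote of all the weak hypotheses.
    votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(votes)
```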

Bagging

In bagging (bootstrap aggregating), each learner is trained on a bootstrap sample of the training set, i.e. a sample of the same size drawn uniformly at random with replacement. The prediction is then computed by a simple (typically unweighted) vote of the learners, or by averaging their outputs for regression.
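
A minimal sketch of this procedure, again assuming scikit-learn trees and binary labels in {-1, +1} (assumptions for the sketch, not from the notes):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bag(X, y, n_learners=25, seed=0):
    """Train each learner on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    learners = []
    for _ in range(n_learners):
        idx = rng.integers(0, n, size=n)     # sample n examples with replacement
        tree = DecisionTreeClassifier()
        tree.fit(X[idx], y[idx])
        learners.append(tree)
    return learners

def bag_predict(learners, X):
    preds = np.stack([t.predict(X) for t in learners])  # shape: (n_learners, n_samples)
    # Unweighted majority vote over the learners (labels assumed in {-1, +1}).
    return np.sign(preds.sum(axis=0))
```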
