CS-GY 6923: Machine Learning

Advice/Tricks and Issues to Train a Neural Network

Adaptive Learning Rate

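The idea is to start with a relatively high learning rate and decrease it as training progresses: large steps early on make fast progress, and smaller steps later let the weights settle near a minimum instead of oscillating around it.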
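
As a rough illustration, the sketch below applies a decaying learning rate to gradient descent on a toy error function. The inverse-time schedule, the function names, and the values of `eta0` and `decay` are assumptions made for this example; the notes do not prescribe a particular schedule.

```python
def decayed_learning_rate(eta0, decay, step):
    """Inverse-time decay (an assumed schedule): eta_t = eta0 / (1 + decay * step)."""
    return eta0 / (1.0 + decay * step)


def minimize_quadratic(w0=5.0, eta0=0.4, decay=0.1, steps=50):
    """Gradient descent on the toy error E(w) = w**2 with a decaying learning rate."""
    w = w0
    for step in range(steps):
        grad = 2.0 * w  # dE/dw for E(w) = w**2
        eta = decayed_learning_rate(eta0, decay, step)
        w -= eta * grad  # larger steps early, smaller steps later
    return w


if __name__ == "__main__":
    for step in (0, 10, 40):
        print(step, decayed_learning_rate(0.4, 0.1, step))
    print("final w:", minimize_quadratic())
```

Other schedules (step decay, exponential decay) follow the same pattern; only the body of `decayed_learning_rate` changes.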

Momentum

$$\Delta w_i^t = -\eta \frac{\partial E^t}{\partial w_i} + \alpha \Delta w_i^{t-1}$$

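Here $t$ is the time step (the index of the update), $\eta$ is the learning rate, and $\alpha$ is the momentum coefficient, which weights the previous update $\Delta w_i^{t-1}$. $\alpha$ can be set to a default value or computed.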
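
The momentum update can be sketched for a single weight as follows. The function name `momentum_step`, the toy error $E(w) = w^2$, and the values `eta=0.1` and `alpha=0.9` are assumptions chosen for illustration only.

```python
def momentum_step(w, grad, prev_delta, eta=0.1, alpha=0.9):
    """One momentum update: delta_t = -eta * grad + alpha * delta_{t-1}."""
    delta = -eta * grad + alpha * prev_delta
    return w + delta, delta


def run_momentum(w0=1.0, steps=200, eta=0.1, alpha=0.9):
    """Minimize the toy error E(w) = w**2 using the momentum update."""
    w, delta = w0, 0.0  # no previous update at t = 0
    for _ in range(steps):
        grad = 2.0 * w  # dE/dw for E(w) = w**2
        w, delta = momentum_step(w, grad, delta, eta, alpha)
    return w


if __name__ == "__main__":
    w, delta = momentum_step(1.0, 2.0, 0.0)
    print(w, delta)  # first step has no momentum term yet
    print("final w:", run_momentum())
```

Because each update carries a fraction `alpha` of the previous one, consecutive gradients that point the same way reinforce each other, while gradients that alternate in sign partly cancel.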

Early Stopping

Too much training can cause overfitting: the training error keeps decreasing while the error on held-out validation data begins to rise. Stop training when the validation error starts to increase.
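
A minimal sketch of this stopping rule, applied to a list of per-epoch validation errors, is shown below. The function name and the `patience` parameter (how many non-improving epochs to tolerate) are assumptions added for the example; `patience=1` corresponds to stopping as soon as the validation error fails to improve.

```python
def early_stopping_epoch(val_errors, patience=1):
    """Return the index of the epoch with the best validation error seen
    before training would be stopped."""
    best_epoch = 0
    best_error = float("inf")
    bad_epochs = 0  # consecutive epochs without improvement
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_error, best_epoch, bad_epochs = err, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # validation error stopped improving
    return best_epoch


if __name__ == "__main__":
    errors = [0.9, 0.7, 0.5, 0.6, 0.4]  # made-up validation errors
    print(early_stopping_epoch(errors, patience=1))
    print(early_stopping_epoch(errors, patience=2))
```

In a real training loop the weights from the returned epoch would be the ones kept, so a copy of the best weights is usually saved whenever the validation error improves.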

Last updated 5 years ago