CS-GY 6923: Machine Learning
1.0.0
Neural Networks



These models aim to mimic the human brain, or are at least inspired by it.

The basic unit of a neural network is the neuron. A neuron computes a weighted sum of its inputs and then applies an activation function to that sum.

$X=\begin{bmatrix}x_0\\x_1\\\vdots\\x_d\end{bmatrix}$, $W=\begin{bmatrix}w_0\\w_1\\\vdots\\w_d\end{bmatrix}$, and $x_0 = 1$.
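A single neuron as described above can be sketched in a few lines of NumPy. The function names (`neuron_output`, `sigmoid`) and the example values are illustrative, not from the notes:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(W, X, activation=sigmoid):
    """Weighted sum W^T X followed by an activation function."""
    return activation(W @ X)

# x_0 is fixed to 1 so that w_0 acts as a bias term.
X = np.array([1.0, 0.5, -1.2])
W = np.array([0.1, 0.8, 0.3])
y = neuron_output(W, X)  # a value in (0, 1)
```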

Some common activation functions:

A neural network (also known as a Multi-Layer Perceptron, or MLP) has multiple layers of neurons. The most common problem faced when training neural networks is the credit assignment problem: it is difficult to determine which neurons deserve credit or blame for an increase or decrease in accuracy.
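A forward pass through a small two-layer MLP can be sketched as below. The layer sizes and weight values are made up for illustration; each layer is just a stack of neurons like the one defined earlier:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(X, W1, W2):
    """Forward pass of a two-layer MLP with sigmoid activations."""
    h = sigmoid(W1 @ X)              # hidden-layer activations
    h = np.concatenate(([1.0], h))   # prepend a bias unit for the next layer
    return sigmoid(W2 @ h)           # output-layer activation

X = np.array([1.0, 0.5, -1.2])       # x_0 = 1 (bias input)
W1 = np.array([[0.1, 0.8, 0.3],
               [-0.5, 0.2, 0.7]])    # weights for 2 hidden neurons
W2 = np.array([[0.4, -0.6, 0.9]])    # weights for 1 output neuron
y = mlp_forward(X, W1, W2)           # array of one value in (0, 1)
```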

The sigmoid activation function $\frac{1}{1+e^{-W^TX}}$ produces an output in $(0, 1)$ that can be interpreted as a probability. The threshold/step activation function outputs 1 if $W^TX > 0$ and 0 otherwise. The linear activation function simply outputs $W^TX$.
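The three activation functions above, written as functions of the weighted sum $z = W^TX$ (a minimal sketch; the function names are ours):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid: maps z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def step(z):
    """Threshold/step: 1 if z > 0, else 0."""
    return np.where(z > 0, 1.0, 0.0)

def linear(z):
    """Linear: outputs z unchanged."""
    return z
```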