CS-GY 6923: Machine Learning

Notations


Last updated 5 years ago


The training set, denoted by X, contains N training examples.

Each example is a pair $(x^t, r^t)$, where $x^t$ is the feature vector of the $t^{th}$ training example and $r^t$ is the corresponding label.

$$x^t=\begin{bmatrix}x_1^t\\x_2^t\\\vdots\\x_d^t\end{bmatrix}$$

The training set is denoted as $X = \{x^t, r^t\}_{t=1}^{N}$.
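As a concrete sketch of this notation (the numbers below are a made-up toy set, and NumPy is assumed), the training set can be stored as an $N \times d$ array whose row $t$ is the feature vector $x^t$, alongside a length-$N$ label array $r$:

```python
import numpy as np

# Toy training set: N = 4 examples, each with d = 2 features.
# Row t of X is the feature vector x^t; r[t] is the label r^t.
X = np.array([[2.0, 1.5],
              [3.5, 2.0],
              [1.0, 4.0],
              [4.5, 0.5]])
r = np.array([1, 1, 0, 0])

N, d = X.shape  # N = 4 training examples, d = 2 features each
```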

h(x) is a hypothesis that assigns a label r to x. For example, given the task of classifying cars as family cars or not family cars based on two features $X_1, X_2$, we could, from the feature space below, hypothesize that:

$$h(x) = \begin{cases}1, & P_1\leq X_1\leq P_2 \ \text{and} \ e_1\leq X_2\leq e_2\\0, & \text{otherwise}\end{cases}$$

This hypothesis, however, may or may not be correct.
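This axis-aligned rectangle hypothesis can be sketched in Python as follows (the bounds passed in and the example points are hypothetical, chosen only for illustration):

```python
def h(x, p1, p2, e1, e2):
    """Rectangle hypothesis: returns 1 iff
    p1 <= X1 <= p2 and e1 <= X2 <= e2, else 0."""
    x1, x2 = x
    return 1 if (p1 <= x1 <= p2 and e1 <= x2 <= e2) else 0

# Hypothetical bounds: a point inside the rectangle is labeled 1,
# a point outside it is labeled 0.
inside = h((3.0, 1.0), p1=2.0, p2=4.0, e1=0.0, e2=2.0)
outside = h((5.0, 1.0), p1=2.0, p2=4.0, e1=0.0, e2=2.0)
```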

The error of the hypothesis h on X is given by:

$$E(h|X) = Err(h|X) = |\{x^t \in X \mid h(x^t) \neq r^t \}|$$

i.e., the number of misclassified examples.
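Counting misclassified examples translates directly into code. In this sketch the helper name `err` and the always-positive hypothesis used to exercise it are hypothetical:

```python
def err(h, X, r):
    """Err(h|X): number of examples x^t in X with h(x^t) != r^t."""
    return sum(1 for x_t, r_t in zip(X, r) if h(x_t) != r_t)

# Hypothetical check: a hypothesis that always predicts 1,
# evaluated on three labeled examples; it misses only the label-0 one.
always_positive = lambda x: 1
n_errors = err(always_positive, [(1, 2), (3, 4), (5, 6)], [1, 0, 1])
```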

Say we know the correct hypothesis and we compare our current hypothesis with the correct hypothesis:

The current hypothesis labels everything inside the orange box as + and everything outside as -.

False positives are examples that our current hypothesis mistakenly labels as positive (their true label is negative). False negatives are examples that our current hypothesis mistakenly labels as negative (their true label is positive).

Depending on the task at hand, we may care more about reducing false positives or about reducing false negatives.
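The two error types can be counted separately, which is useful when one matters more than the other. A minimal sketch (the helper name, the threshold hypothesis, and the toy data are all hypothetical):

```python
def confusion_counts(h, X, r):
    """Return (false positives, false negatives) of hypothesis h on X:
    FP: h predicts 1 but the true label is 0.
    FN: h predicts 0 but the true label is 1."""
    fp = sum(1 for x_t, r_t in zip(X, r) if h(x_t) == 1 and r_t == 0)
    fn = sum(1 for x_t, r_t in zip(X, r) if h(x_t) == 0 and r_t == 1)
    return fp, fn

# Hypothetical hypothesis: predict 1 iff the first feature exceeds 2.
h = lambda x: 1 if x[0] > 2 else 0
fp, fn = confusion_counts(h, [(1,), (3,), (4,), (2,)], [1, 0, 1, 0])
```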