Evaluating a Classifier
Train-Test Split
Given a dataset, we first split it into a training set (say 70%) and a test set (say 30%). We must randomly shuffle the examples before splitting, to avoid cases such as all the examples of one class ending up in the same split.
We train using the training set and compute the error (and accuracy) on the test set.
Note: The main aim is to generalize to unseen examples. If we train and evaluate on the same data, an overfit classifier looks deceptively good: it classifies the training examples fairly well but cannot generalize to unseen examples, and without a held-out test set we cannot detect this.
A good practice is to compute both the training error and the test error.
If training error << test error, the model is probably overfitting!
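As a concrete illustration, here is a minimal sketch of a 70/30 split evaluation that reports both errors. It assumes scikit-learn, its bundled iris dataset, and a kNN classifier, none of which are prescribed by these notes.

```python
# Minimal sketch of a train/test split evaluation (scikit-learn, the iris
# dataset and kNN are assumptions made for illustration only).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# shuffle=True (the default) randomizes the order before splitting,
# so no class ends up entirely in one split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=True, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

train_acc = accuracy_score(y_train, clf.predict(X_train))
test_acc = accuracy_score(y_test, clf.predict(X_test))
print(f"train error: {1 - train_acc:.3f}, test error: {1 - test_acc:.3f}")
# A training error much smaller than the test error suggests overfitting.
```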
k-fold Cross-Validation
This is mainly useful when we have a small dataset. Splitting a small dataset would reduce the amount of data available for both training and testing.
Instead of splitting the dataset into a single train and test set, do the following:
1. Split the dataset into k equal-sized folds. (k=5 is common, though k=10, 12 are also used.)
2. For each fold, train on the remaining k-1 folds and test on that fold.
3. Average the k test errors to get the cross-validation error.
This way, every example in the dataset is eventually used for both training and testing.
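A minimal sketch of 5-fold cross-validation, again assuming scikit-learn, the iris dataset, and kNN purely for illustration:

```python
# Sketch of k-fold cross-validation with k=5. Each example is used for
# testing exactly once and for training k-1 times.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

errors = []
for train_idx, test_idx in kf.split(X):
    clf = KNeighborsClassifier(n_neighbors=5).fit(X[train_idx], y[train_idx])
    acc = accuracy_score(y[test_idx], clf.predict(X[test_idx]))
    errors.append(1 - acc)

# The cross-validation error is the average of the k fold errors.
print(f"5-fold CV error: {np.mean(errors):.3f}")
```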
Note: A version of cross-validation can also be used to choose the values of the parameters of a learning algorithm. For example, to choose k for kNN, compute the cross-validation error for several candidate values of k and pick the value with the lowest error.
Leave-one-out cross-validation uses k=N, where N is the number of examples, so each test fold contains a single example. It is also called N-fold CV. It is effective, but time consuming.
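Below is a hedged sketch of both ideas: picking k for kNN by cross-validation, and leave-one-out CV. The candidate values of k, the dataset, and scikit-learn itself are assumptions made for the example.

```python
# Sketch of using cross-validation to pick the number of neighbours k for
# kNN (scikit-learn assumed; candidate k values are illustrative).
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Pick the k with the lowest 5-fold cross-validation error.
candidate_ks = [1, 3, 5, 7, 9, 11]
cv_errors = {k: 1 - cross_val_score(KNeighborsClassifier(n_neighbors=k),
                                    X, y, cv=5).mean()
             for k in candidate_ks}
best_k = min(cv_errors, key=cv_errors.get)
print("CV errors:", cv_errors, "-> best k:", best_k)

# Leave-one-out CV: one fold per example (k = N), accurate but slow.
loo_error = 1 - cross_val_score(KNeighborsClassifier(n_neighbors=best_k),
                                X, y, cv=LeaveOneOut()).mean()
print(f"LOOCV error for k={best_k}: {loo_error:.3f}")
```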
ROC Curve
A confusion matrix gives the counts of TP (true positives), FP (false positives), TN (true negatives), and FN (false negatives). From these we can compute the TPR (true positive rate) and the FPR (false positive rate):
TPR = TP / (TP + FN), FPR = FP / (FP + TN)
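For instance, a small sketch that derives TPR and FPR from a confusion matrix (scikit-learn assumed; the labels below are made-up binary predictions):

```python
# Compute TPR and FPR from a confusion matrix on toy binary labels.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1]

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)   # true positive rate (recall / sensitivity)
fpr = fp / (fp + tn)   # false positive rate
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```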
An ROC curve plots the TPR vs. the FPR at different thresholds.
(For different thresholds applied to the classifier's output score, the FP, FN, TP, TN counts change, causing the TPR and FPR to change.)
The diagonal dotted line (TPR = FPR) corresponds to a classifier that is no better than random guessing for binary classification.
The AUC (Area Under the Curve) summarizes performance across all thresholds: it is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. So the higher the curve lies above the diagonal, the better the classifier.
At TPR=FPR=0, all examples were labeled negative. At FPR=TPR=1, all examples were labeled positive.
At FPR=0, TPR=1, all examples were classified correctly! (ROC Heaven!)
At FPR=1, TPR=0, all examples were misclassified! (ROC Hell!)
At ROC Heaven, AUC=1 (max. possible AUC). At ROC Hell, AUC=0 (min. possible AUC).
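Putting it together, here is a sketch that sweeps the threshold over classifier scores and plots the ROC curve with its AUC. It assumes scikit-learn, matplotlib, the breast-cancer dataset, and kNN class probabilities as the score; none of these are fixed by the notes.

```python
# Plot an ROC curve and report its AUC from classifier scores.
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_curve, roc_auc_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=True, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]  # score for the positive class

# roc_curve sweeps the threshold over the scores and returns FPR/TPR pairs.
fpr, tpr, thresholds = roc_curve(y_test, scores)
auc = roc_auc_score(y_test, scores)

plt.plot(fpr, tpr, label=f"kNN (AUC = {auc:.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="random guessing")
plt.xlabel("FPR")
plt.ylabel("TPR")
plt.legend()
plt.show()
```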