by Said Bleik, Shaheen Gauher, Data Scientists at Microsoft
Evaluation metrics are the key to understanding how your classification model performs when applied to a test dataset. In what follows, we present a tutorial on how to compute commonly used evaluation metrics, as well as metrics generated from random classifiers, which help justify the value added by your predictive model, especially in cases where the common metrics suggest otherwise.
- Creating the Confusion Matrix
- Accuracy
- Per-class Precision, Recall, and F-1
- Macro-averaged Metrics
- One-vs-all Matrices
- Average Accuracy
- Micro-averaged Metrics
- Evaluation on Highly Imbalanced Datasets
- Majority-class Metrics
- Random-guess Metrics
- Kappa Statistic
- Custom R Evaluation Module in Azure Machine Learning
Creating the Confusion Matrix
We will start by creating a confusion matrix from simulated classification results. The confusion matrix provides a tabular summary of the actual class labels vs. the predicted ones. The test set we are evaluating on contains 100 instances, each assigned to one of three classes: a, b, or c.
set.seed(0)
actual = c('a','b','c')[runif(100, 1,4)] # actual labels (fractional indices are truncated to 1, 2, or 3)
predicted = actual # start from perfect predictions
predicted[runif(30,1,100)] = actual[runif(30,1,100)] # overwrite ~30 random positions to introduce incorrect predictions
cm = as.matrix(table(Actual = actual, Predicted = predicted)) # create the confusion matrix
cm
##       Predicted
## Actual  a  b  c
##      a 24  2  1
##      b  3 30  4
##      c  0  5 31
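In this layout, the rows correspond to the actual classes and the columns to the predicted ones, so the diagonal holds the correctly classified instances. A quick check confirms that 85 of the 100 test instances were classified correctly:
sum(diag(cm)) # correct predictions: 24 + 30 + 31 = 85
sum(cm) # total number of test instances: 100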
Next we will define some basic variables that will be needed to compute the evaluation metrics.
n = sum(cm) # number of instances
nc = nrow(cm) # number of classes
diag = diag(cm) # number of correctly classified instances per class
rowsums = apply(cm, 1, sum) # number of instances per class
colsums = apply(cm, 2, sum) # number of predictions per class
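These variables are all we need for the metrics that follow. As a preview, here is a short sketch using the standard definitions (the variable names below are illustrative): overall accuracy is the fraction of correctly classified instances, and per-class precision, recall, and F-1 follow from the diagonal, column sums, and row sums of the confusion matrix.
accuracy = sum(diag) / n # fraction of correctly classified instances
precision = diag / colsums # per-class precision: correct predictions / total predictions per class
recall = diag / rowsums # per-class recall: correct predictions / total actual instances per class
f1 = 2 * precision * recall / (precision + recall) # per-class F-1: harmonic mean of precision and recall
data.frame(precision, recall, f1) # summary table of per-class metrics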
Source: http://revolutionanalytics.com