Mushroom Classification Project part 4 — Data Modelling and Model Evaluation
This article picks up from my previous article on data pre-processing.
Data Modelling
Everything we have done till now has just been a preparation for this step — Data modelling!
In this section, we will experiment with four models and choose the best one as our final model. The models we are going to try are:
- RandomForestClassifier
- LogisticRegression
- KNeighborsClassifier
- LinearSVC
In this article, we will not get into the theory of these models, but if you want to understand more about how these models work, check out LogisticRegression, RandomForestClassifier, KNeighborsClassifier and LinearSVC.
The really good thing about Scikit-Learn is that all of these models are fitted in the same way. As programmers, we should always try to reduce the amount of code we write and maximise efficiency. So, we are going to create a function which fits all of the models for us and uses the default evaluation metric to show us the accuracy of each model.
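A minimal sketch of what such a function might look like is shown below. It assumes `X_train`, `X_test`, `y_train` and `y_test` are the splits created in the previous pre-processing article; the function name `fit_and_score` and the variable names are my own, for illustration only.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

# Dictionary of name/model-object pairs
models = {
    "RandomForestClassifier": RandomForestClassifier(),
    "LogisticRegression": LogisticRegression(max_iter=1000),  # raised to avoid convergence warnings
    "KNeighborsClassifier": KNeighborsClassifier(),
    "LinearSVC": LinearSVC(),
}

def fit_and_score(models, X_train, X_test, y_train, y_test):
    """Fit each model and return a dictionary of name/accuracy-score pairs."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)                 # train the model
        scores[name] = model.score(X_test, y_test)  # default metric: accuracy
        print(f"{name}: {scores[name]:.4f}")
    return scores

model_scores = fit_and_score(models, X_train, X_test, y_train, y_test)
```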
The above function takes a dictionary of name/model-object pairs, trains each model, scores it, prints the scores and returns a dictionary of name/score pairs.
As we can see, RandomForestClassifier and KNeighborsClassifier perform the best (though we should be sceptical of any model that gets 100% accuracy).
This ends the Modelling section of this article. In the next section, we will evaluate the best two models using scikit-learn’s evaluation metrics.
Model Evaluation
In this section, we will score our models to see how good they are. Now, you might be thinking that we have already done this in the above piece of code, but accuracy is just one of the many evaluation metrics.
Creating predictions
In order to evaluate our models, we compare the test labels (the ground truth) with the predictions (preds).
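A quick sketch of how the predictions might be created, reusing the `models` dictionary and splits from the earlier sketch (the variable names are illustrative):

```python
# Predictions of the fitted RandomForestClassifier on the test set
preds = models["RandomForestClassifier"].predict(X_test)

# Compare the first few ground-truth labels with the predictions
print(y_test[:5])
print(preds[:5])
```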
We will be using the following metrics to evaluate our models:
- Accuracy
- Precision
- Recall
- F1 score
- Confusion Matrix
- Plot ROC curve
- Classification report
- AUC score
What are all of these things?
This is the official documentation of all of the evaluation metrics in scikit-learn. I have not covered all of them, but feel free to try out new metrics.
We could use these metrics on our trained models, but to get a more general overview of our model's performance, we will use cross-validation (cross_val_score).
How does cross-validation work?
While training and evaluating models, we split the data into a train and a test dataset. In some rare cases, we get a lucky split and our model gets very good results. Cross-validation prevents this: with 5-fold cross-validation, the data is split into 5 folds and the model is trained and scored 5 times, each time using a different fold as the test set, so every sample is tested exactly once and the scores are averaged.
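For example, a single call returns one score per fold (assuming `X` and `y` are the full preprocessed features and labels from the previous article):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# 5-fold cross-validation: the model is trained and scored 5 times,
# each time holding out a different fifth of the data as the test set.
scores = cross_val_score(RandomForestClassifier(), X, y, cv=5)
print(scores)         # five accuracy scores, one per fold
print(scores.mean())  # their average is a more robust estimate
```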
Though not all of the above metrics can be used with cross-validation, we can use accuracy, precision, recall and F1 score with it.
1. Cross-validation with Evaluation Metrics
We will functionalize the process of cross-validating our model, as it is repetitive work.
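Here is a rough sketch of such a function. It assumes the target has been encoded to binary 0/1 labels (as in the pre-processing article), and the function and variable names are again my own:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def cv_evaluate(model, X, y, cv=5):
    """Return the mean cross-validated accuracy, precision, recall and F1 score."""
    metrics = {}
    for scoring in ("accuracy", "precision", "recall", "f1"):
        metrics[scoring] = np.mean(cross_val_score(model, X, y, cv=cv, scoring=scoring))
        print(f"{scoring}: {metrics[scoring]:.4f}")
    return metrics

# Evaluate both candidate models
log_reg_metrics = cv_evaluate(LogisticRegression(max_iter=1000), X, y)
rf_metrics = cv_evaluate(RandomForestClassifier(), X, y)
```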
Running the function gives us the evaluation scores for LogisticRegression() and then for RandomForestClassifier().
Remember when I said that we should be sceptical of any model with 100% accuracy? Well, now we can see that our RandomForestClassifier() doesn't have 100% accuracy. We can see how cross_val_score reveals the weaknesses of our model far better than the usual metric functions.
As our RandomForestClassifier() is generally better than our LogisticRegression() model, we will drop LogisticRegression and perform the other metrics only on RandomForestClassifier().
2. Confusion Matrix
note: Confusion matrix explained
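One way to compute and plot the confusion matrix, using the `y_test` and `preds` from earlier (ConfusionMatrixDisplay is available in recent scikit-learn versions; the exact plotting approach here is my own choice):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Rows are the true classes, columns are the predicted classes
print(confusion_matrix(y_test, preds))

# A nicer, visual version of the same matrix
ConfusionMatrixDisplay.from_predictions(y_test, preds)
plt.show()
```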
3. Plot ROC Curve
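A sketch of how the ROC curve can be plotted for the fitted RandomForestClassifier from the earlier sketch (RocCurveDisplay.from_estimator exists in scikit-learn 1.0 and later; older releases used plot_roc_curve instead):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import RocCurveDisplay

# The fitted RandomForestClassifier from the earlier sketch
rf_model = models["RandomForestClassifier"]

# Plot the ROC curve of the fitted model on the test set
RocCurveDisplay.from_estimator(rf_model, X_test, y_test)
plt.show()
```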
As you can see, most evaluation metrics are one-liners. That is the beauty of Sklearn: no matter how complicated the process is, Sklearn makes it look like a cakewalk.
4. Classification Report
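The classification report bundles several per-class metrics into a single call. A minimal sketch, again using `y_test` and `preds` from earlier:

```python
from sklearn.metrics import classification_report

# Precision, recall, F1 score and support for each class in one table
print(classification_report(y_test, preds))
```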
What are all of these things?
5. AUC Score
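A sketch of computing the AUC score for the fitted model (`rf_model` refers to the RandomForestClassifier from the ROC curve sketch above):

```python
from sklearn.metrics import roc_auc_score

# roc_auc_score expects the predicted probability of the positive class,
# not the hard class predictions. 0.5 is no better than chance, 1.0 is perfect.
probs = rf_model.predict_proba(X_test)[:, 1]
print(roc_auc_score(y_test, probs))
```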
Conclusion
All of the above metrics signify a perfect model, but we know that is not the case. We will create a not-so-perfect model in the next article and see how to improve it and save the improved model.
Alright, this is all for this article.
See you in part 5!