ResNet10 Run Report
Summarized Findings
Overfitting. The model demonstrates a strong capacity to fit the training data, reaching 84% accuracy and a macro F1-score of 0.84. Performance degrades substantially on the validation set, however, dropping to 69% accuracy and a macro F1-score of 0.70. A 15-percentage-point gap indicates the model is memorizing features specific to the training set rather than learning generalized representations.
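As an illustration of how this gap is measured, here is a minimal, dependency-free sketch of the accuracy and macro-F1 computations used to compare the two splits (the label arrays in the usage below are hypothetical; in practice one would typically use scikit-learn's `f1_score(..., average="macro")`):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_f1(y_true, y_pred, n_classes):
    """F1 for each class, computed one-vs-rest from TP/FP/FN counts."""
    f1s = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return f1s

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    return sum(per_class_f1(y_true, y_pred, n_classes)) / n_classes
```

The generalization gap is then simply `accuracy(train) - accuracy(val)` (and likewise for macro F1), evaluated on the respective splits.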
Good performance on Class 12 (Cabinet) (Validation F1: 0.94), with 1.00 precision and 0.88 recall. Class 7 (Monarch Butterfly) is highly stable (Validation F1: 0.91), Class 18 (Tractor) is strong (Validation F1: 0.89), and Class 4 (Godwit) retains performance well (Validation F1: 0.82).
These classes (Cabinet, Tractor, Monarch Butterfly) are visually distinct, which likely explains why they generalize well.
Class 15 (Nail) acts as a sink class. Its validation precision is 0.31: when we predict Class 15, we are wrong roughly 70% of the time. The validation heatmap shows significant misclassifications from Water Snake (0), Night Heron (2), Limpkin (3), Acoustic Guitar (10), Sink (11), and others, all bleeding into the Nail (15) prediction.
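A small helper like the following can enumerate which classes leak into a given sink prediction. This is a sketch assuming a NumPy confusion matrix indexed `cm[true, pred]`; the class names and the 5% threshold are illustrative:

```python
import numpy as np

def sources_into_class(cm, sink, class_names, threshold=0.05):
    """List the true classes whose rows send more than `threshold`
    of their samples into the `sink` prediction column."""
    row_norm = cm / cm.sum(axis=1, keepdims=True)  # per-true-class rates
    leaks = []
    for i, rate in enumerate(row_norm[:, sink]):
        if i != sink and rate > threshold:
            leaks.append((class_names[i], float(rate)))
    return sorted(leaks, key=lambda x: -x[1])  # worst offenders first
```

Running this on the validation confusion matrix with `sink=15` would reproduce the list of classes bleeding into the Nail prediction.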
Impala (Class 9) has a low validation recall (0.52). The heatmap shows 12% of Impalas predicted as Hyenas (Class 6) and another 12% as Rams (Class 8), i.e., confusion among visually similar animals.
Water Snake (Class 0) has a low validation F1 (0.56) and is confused with the Godwit (Class 4) 10% of the time.
Sink (Class 11) struggles in validation with a recall of 0.56; it is misclassified as a Piggy Bank (Class 16) 12% of the time.
Possible Improvements
- Introduce mixing-based augmentations such as MixUp, CutMix, FMix, ResizeMix, and HMix to regularize the model and reduce overfitting
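Of these, MixUp is the simplest to sketch. A minimal, framework-agnostic NumPy version is below (in practice this would run on training batches inside the data pipeline, with the loss computed against the soft labels):

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """MixUp: convex-combine random pairs of examples and their one-hot
    labels. lam ~ Beta(alpha, alpha); a permutation pairs each example
    with another from the same batch."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam
```

Because the labels are mixed as well, each soft-label row still sums to 1, so the usual cross-entropy loss applies unchanged.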
Notebook
Please see the companion notebook. It contains:
- Training Confusion Matrix
- Validation Confusion Matrix
- Training Classification Report
- Validation Classification Report
- Training Grad-CAM visualizations
- Validation Grad-CAM visualizations