Comparison of confusion matrices across all evaluated models. - Appendix

Baseline DistilBERT confusion matrices on the TweetEval emotion prediction task. Notice how the model appears heavily biased toward predicting optimism by default.

Baseline DistilRoBERTa confusion matrices on the TweetEval emotion prediction task. Notice how its predictions are slightly more evenly distributed than those of baseline DistilBERT, but the model still tends to predict sadness by default.

Confusion matrices for bert-tweeteval-distilbert. Notice how the predictions now follow the training label distribution more closely than the baseline predictions do.

Confusion matrices for bert-tweeteval-distilroberta. Notice how the predictions now follow the training label distribution more closely than the baseline predictions do.

Confusion matrices for all LLMs and prompting strategies. Structured prompts clearly yielded better results.
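
The default-class bias noted in the baseline captions can be read off a confusion matrix from its column sums. The following is a minimal sketch, not the evaluation code used for these figures: the label list is assumed to match the four TweetEval emotion classes, the helper names `confusion_matrix` and `predicted_share` are hypothetical, and the toy predictions are invented purely for illustration.

```python
# Label set assumed to match the TweetEval emotion task.
LABELS = ["anger", "joy", "optimism", "sadness"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Count matrix with true labels as rows and predicted labels as columns."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def predicted_share(matrix, labels=LABELS):
    """Fraction of all predictions assigned to each label (column sum / total)."""
    total = sum(sum(row) for row in matrix)
    return {
        label: sum(row[j] for row in matrix) / total
        for j, label in enumerate(labels)
    }


# Hypothetical toy data, NOT results from the paper.
y_true = ["anger", "joy", "optimism", "sadness", "anger", "joy", "sadness", "anger"]
y_pred = ["optimism", "optimism", "optimism", "sadness", "optimism", "joy", "optimism", "anger"]

cm = confusion_matrix(y_true, y_pred)
share = predicted_share(cm)

# The label with the largest predicted share is the model's "default" class.
default_label = max(share, key=share.get)
print(default_label, share[default_label])
```

A model whose predicted share for one label far exceeds that label's share among the true labels is defaulting to that class, which is the pattern the baseline captions describe.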