Performance Evaluation of Classical Machine Learning Models for Emotion Classification
Abstract
Emotion detection in text is a central challenge in natural language processing, with applications in mental health monitoring, customer sentiment analysis, and human-computer interaction. This study evaluates three classical machine learning algorithms for multi-class emotion classification across eleven emotional categories, using a balanced dataset of approximately 106,000 annotated sentences. Sentences are represented with Term Frequency-Inverse Document Frequency (TF-IDF) vectorization with trigram support and a 3,000-dimensional feature space. Logistic Regression, Random Forest, and Naive Bayes classifiers were evaluated on accuracy, precision, recall, and F1-score, together with five-fold cross-validation. Logistic Regression performed best, with 79.90% accuracy, 81.18% precision, and an 80.27% F1-score, substantially exceeding Random Forest at 75.32% and Naive Bayes at 69.01%. Cross-validation showed stable results, with standard deviations below 0.5%, indicating consistent generalization across folds. Per-class analysis identified enthusiasm, love, and neutral as the most reliably detected emotions, each exceeding 83% accuracy, while empty and sadness were harder to classify. The findings indicate that classical machine learning approaches with appropriate feature engineering achieve competitive performance for fine-grained emotion detection while offering advantages in computational efficiency, interpretability, and deployment simplicity.
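To make the feature-extraction step concrete, the following is a minimal standard-library sketch of TF-IDF weighting over word n-grams with a capped vocabulary. It is not the study's implementation. The abstract does not specify the tokenizer, whether unigrams and bigrams are included alongside trigrams, the IDF smoothing, or how the 3,000 features are selected, so the whitespace tokenizer, the 1-to-3 n-gram range, the smoothed IDF formula, the frequency-based cap, and the helper names `ngrams`, `fit_tfidf`, and `transform` are all assumptions made for illustration. The three sample sentences are invented and are not drawn from the dataset.

```python
import math
from collections import Counter


def ngrams(tokens, max_n=3):
    """Return all word n-grams of length 1..max_n as space-joined strings."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def fit_tfidf(docs, max_n=3, max_features=3000):
    """Build a capped n-gram vocabulary and smoothed IDF weights (assumed scheme)."""
    term_counts = Counter()  # total occurrences of each n-gram in the corpus
    doc_freq = Counter()     # number of documents containing each n-gram
    for doc in docs:
        grams = ngrams(doc.lower().split(), max_n)
        term_counts.update(grams)
        doc_freq.update(set(grams))
    # Keep the most frequent n-grams; break ties alphabetically for determinism.
    ranked = sorted(term_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    vocab = {term: idx for idx, (term, _) in enumerate(ranked[:max_features])}
    n_docs = len(docs)
    # Smoothed IDF: log((1 + N) / (1 + df)) + 1, so no term gets zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf


def transform(doc, vocab, idf, max_n=3):
    """Map one sentence to a sparse, L2-normalised TF-IDF vector {index: weight}."""
    counts = Counter(g for g in ngrams(doc.lower().split(), max_n) if g in vocab)
    weights = {vocab[g]: c * idf[g] for g, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {i: w / norm for i, w in weights.items()} if norm else {}


# Invented toy sentences, used only to exercise the functions above.
docs = ["i love this song", "i feel so empty today", "i love sunny days"]
vocab, idf = fit_tfidf(docs, max_n=3, max_features=3000)
vec = transform("i love this song", vocab, idf)
```

In this sketch a term that appears in every sentence, such as "i", receives the minimum IDF, while rarer n-grams such as "love this song" are weighted more heavily. The resulting sparse vectors are the kind of input a linear classifier such as Logistic Regression would consume. In practice a library vectorizer would normally replace this hand-rolled version.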
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.