TY - GEN
T1 - Ensemble Learning for Fake News Detection
T2 - 11th Intelligent Systems Conference, IntelliSys 2025
AU - González-Celi, Sebastián
AU - Roa, Henry N.
AU - Cruz-Silva, Jorge
AU - Loza-Aguirre, Edison
AU - Salgado-Reyes, Nelson
AU - Guaña-Moya, Javier
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - The proliferation of fake news on digital platforms presents significant societal challenges, necessitating the development of robust and interpretable detection systems. This study proposes a stacking-based ensemble learning model that integrates XGBoost and Logistic Regression to improve fake news classification accuracy while enhancing model transparency. Unlike traditional Natural Language Processing (NLP) approaches, which rely solely on textual analysis, this model incorporates structural and statistical metadata features, such as publication date and article length, to improve generalizability across misinformation domains. Experimental results on the Spanish Political Fake News dataset demonstrate that the stacking ensemble model outperforms individual classifiers, achieving an F1-score of 95.2% and a ROC-AUC of 0.974. SHAP (Shapley Additive Explanations) analysis enhances interpretability by identifying the most influential features contributing to classification decisions, confirming that metadata plays a critical role in misinformation detection. These findings highlight the effectiveness of hybrid machine-learning approaches that combine textual and structural information for scalable misinformation detection. The study’s contributions include a highly accurate and explainable classification model, positioning ensemble learning as a viable solution for real-world applications in automated fact-checking, journalism, and social media moderation.
KW - Fake news detection
KW - Machine learning
KW - Metadata-driven classification
KW - SHAP analysis
KW - Stacking ensemble learning
KW - XGBoost
UR - https://www.scopus.com/pages/publications/105017241962
U2 - 10.1007/978-3-031-99965-9_30
DO - 10.1007/978-3-031-99965-9_30
M3 - Conference contribution
AN - SCOPUS:105017241962
SN - 9783031999642
T3 - Lecture Notes in Networks and Systems
SP - 485
EP - 502
BT - Intelligent Systems and Applications - Proceedings of the 2025 Intelligent Systems Conference IntelliSys
A2 - Arai, Kohei
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 28 August 2025 through 29 August 2025
ER -