Integrating clustering with evolutionary feature selection using ENORA and SToWVector

Alexander José Mackenzie-Rivero, Rodrigo Martínez-Béjar, Hilarión José Vegas-Meléndez

    Research output: Contribution to journal › Article › peer-review

    Abstract

    The rapid growth of textual data from sources such as social media, blogs, and digital libraries has intensified the demand for scalable and semantically informed classification methods. This study introduces a hybrid framework that integrates unsupervised clustering, evolutionary feature selection, and semantic interpretation to enhance automatic text classification. The approach combines the SToWVector representation with a Multi-Objective Evolutionary Search (MOES) strategy optimized through the ENORA algorithm, while employing the NaiveBayesMultinomial classifier for evaluation. Semantic interpretation is incorporated via ontological reasoning, enabling the model to capture latent conceptual relationships among terms and thereby complement both the clustering and feature selection processes. Experimental evaluations on benchmark and large-scale datasets (SMS Spam and Euronews) demonstrate the robustness of the framework, including a scenario in which 100% accuracy was achieved. The proposed method outperforms traditional models and achieves competitive results against deep learning-based classifiers. These findings underscore the framework's adaptability and effectiveness in managing high-dimensional unstructured text, while preserving interpretability through symbolic reasoning.
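The pipeline described in the abstract (bag-of-words representation, multi-objective feature selection, multinomial naive Bayes evaluation) can be illustrated with a small sketch. This is not the authors' implementation: whitespace tokenization stands in for SToWVector, a plain Pareto-based mutation search stands in for ENORA, and the clustering and ontological steps are omitted. All function names, parameters, and the toy messages below are hypothetical, chosen only for illustration.

```python
import math
import random
from collections import Counter

def bag_of_words(texts):
    """Lowercase whitespace tokenization into per-document term counts."""
    return [Counter(t.lower().split()) for t in texts]

def train_nb(docs, labels, vocab):
    """Multinomial naive Bayes with Laplace smoothing, restricted to vocab."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for doc, y in zip(docs, labels):
        for w, n in doc.items():
            if w in vocab:
                counts[y][w] += n
    loglik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        loglik[c] = {w: math.log((counts[c][w] + 1) / total) for w in vocab}
    return prior, loglik

def predict_nb(model, doc):
    """Return the class with the highest log-posterior for one document."""
    prior, loglik = model
    def score(c):
        return prior[c] + sum(n * loglik[c][w] for w, n in doc.items() if w in loglik[c])
    return max(prior, key=score)

def evaluate(mask, vocab_list, train, valid):
    """Two objectives: validation accuracy (maximize), feature count (minimize)."""
    vocab = {w for w, keep in zip(vocab_list, mask) if keep}
    model = train_nb(train[0], train[1], vocab)
    hits = sum(predict_nb(model, d) == y for d, y in zip(*valid))
    return hits / len(valid[1]), len(vocab)

def dominates(a, b):
    """a dominates b: no worse in both objectives and not identical."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def pareto_search(vocab_list, train, valid, pop_size=12, generations=20, seed=0):
    """Simplified Pareto-based search over feature masks (a stand-in, NOT ENORA)."""
    rng = random.Random(seed)
    n = len(vocab_list)
    pop = [tuple(rng.random() < 0.5 for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        # Bit-flip mutation: each feature toggles with probability 1/n.
        children = [tuple(bit != (rng.random() < 1.0 / n) for bit in p) for p in pop]
        pool = list(dict.fromkeys(pop + children))  # de-duplicate, keep order
        scored = [(evaluate(m, vocab_list, train, valid), m) for m in pool]
        front = [(f, m) for f, m in scored if not any(dominates(g, f) for g, _ in scored)]
        rest = sorted((s for s in scored if s not in front),
                      key=lambda s: (-s[0][0], s[0][1]))
        # Non-dominated masks survive first; the remainder fill by accuracy.
        pop = [m for _, m in (front + rest)[:pop_size]]
    final = [(evaluate(m, vocab_list, train, valid), m) for m in pop]
    return [(f, m) for f, m in final if not any(dominates(g, f) for g, _ in final)]

# Hypothetical toy messages, loosely in the spirit of an SMS spam task.
train_texts = ["win cash prize now", "free prize claim now", "claim free cash",
               "lunch at noon today", "see you at lunch", "meeting today at noon"]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
valid_texts = ["free cash now", "lunch today"]
valid_labels = ["spam", "ham"]

train = (bag_of_words(train_texts), train_labels)
valid = (bag_of_words(valid_texts), valid_labels)
vocab_list = sorted({w for doc in train[0] for w in doc})

front = pareto_search(vocab_list, train, valid)
for (accuracy, size), mask in front:
    kept = [w for w, keep in zip(vocab_list, mask) if keep]
    print(f"accuracy={accuracy:.2f} features={size} kept={kept}")
```

Each entry in the returned front is a trade-off between validation accuracy and the number of retained terms. A real evaluation would use cross-validation on a full dataset instead of a two-message validation split.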

    Original language: English
    Article number: 100508
    Journal: Array
    Volume: 28
    State: Published - Dec 2025

    Keywords

    • Clustering
    • Feature selection
    • Machine learning
    • Text classification
