Predicting MOOC Student Success Using Aggregated Behavioral Features and Machine Learning Under the CRISP-ML(Q) Framework

  • María Belén Toledo
  • Henry N. Roa
  • Edison Loza-Aguirre
  • Eduardo Espinosa-Avila

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    This study examines the application of machine learning to predicting academic performance in Massive Open Online Courses (MOOCs), addressing the persistent challenge of low completion rates despite high enrollment. Identifying at-risk students early in a course is crucial for enabling targeted interventions that improve engagement and retention. The study offers a practical and scalable approach to early detection that uses only behavioral data readily available during the initial stages of learning. It draws on data from a Coursera MOOC comprising over 8,000 anonymized student records and more than 80 event log files. Aggregated behavioral features (such as total sessions, number of quizzes completed, lecture views, supplementary activity access, and average session duration) were extracted from the first 50% of course participation. Following the CRISP-ML(Q) framework, three predictive models were developed and compared: logistic regression, decision trees, and support vector machines. The decision tree classifier achieved the highest performance, with 90.9% accuracy and balanced precision and recall. These findings suggest that early-stage, aggregated behavioral metrics can predict course outcomes without relying on complex sequence modeling. The results are relevant to MOOC providers and educational institutions seeking real-time, interpretable, and scalable ways to support student success. The proposed model could guide early interventions and help improve overall course completion rates in online learning environments.
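
The feature-aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Event` record, its field names, the event-kind labels, and the reading of "first 50% of course participation" as the first half of the course timeline are all assumptions made for illustration, and the sample events are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical event-log record; field names are assumptions."""
    student_id: str
    session_id: str
    kind: str         # assumed labels: "quiz", "lecture", "supplementary"
    timestamp: float  # assumed unit: seconds since course start


def aggregate_features(events, course_length, fraction=0.5):
    """Build one feature row per student from events up to the cutoff.

    Assumption: the observation window is the first `fraction` of the
    course timeline, measured in the same unit as `timestamp`.
    """
    cutoff = course_length * fraction
    # student -> session -> list of timestamps seen in that session
    sessions = defaultdict(lambda: defaultdict(list))
    # student -> count of each tracked event kind
    counts = defaultdict(lambda: {"quiz": 0, "lecture": 0, "supplementary": 0})

    for e in events:
        if e.timestamp > cutoff:
            continue  # ignore activity after the observation window
        sessions[e.student_id][e.session_id].append(e.timestamp)
        if e.kind in counts[e.student_id]:
            counts[e.student_id][e.kind] += 1

    features = {}
    for student, by_session in sessions.items():
        # Session duration is approximated as last minus first event time.
        durations = [max(ts) - min(ts) for ts in by_session.values()]
        features[student] = {
            "total_sessions": len(by_session),
            "quizzes_completed": counts[student]["quiz"],
            "lecture_views": counts[student]["lecture"],
            "supplementary_accesses": counts[student]["supplementary"],
            "avg_session_duration": sum(durations) / len(durations),
        }
    return features


# Invented sample events, for illustration only.
events = [
    Event("s1", "a", "lecture", 0.0),
    Event("s1", "a", "quiz", 600.0),
    Event("s1", "b", "lecture", 1000.0),
    Event("s1", "b", "supplementary", 1200.0),
    Event("s1", "c", "quiz", 9000.0),  # falls after the cutoff
    Event("s2", "d", "lecture", 100.0),
]
features = aggregate_features(events, course_length=10000.0)
```

Each resulting row is a fixed-length vector per student, which is the shape a classifier such as logistic regression, a decision tree, or a support vector machine would expect as input; no event ordering is retained, in line with the abstract's point about avoiding sequence modeling.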

    Original language: English
    Title of host publication: Proceedings of the Future Technologies Conference, FTC 2025, Volume 3
    Editors: Kohei Arai
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 226-242
    Number of pages: 17
    ISBN (Print): 9783032079947
    DOIs
    State: Published - 2026
    Event: Future Technologies Conference, FTC 2025 - Munich, Germany
    Duration: Nov 6 2025 to Nov 7 2025

    Publication series

    Name: Lecture Notes in Networks and Systems
    Volume: 1677 LNNS
    ISSN (Print): 2367-3370
    ISSN (Electronic): 2367-3389

    Conference

    Conference: Future Technologies Conference, FTC 2025
    Country/Territory: Germany
    City: Munich
    Period: 11/6/25 to 11/7/25

    Keywords

    • Academic performance prediction
    • CRISP-ML(Q)
    • Learning analytics
    • Machine learning
    • MOOCs
