Log data from the first 2–5 hours of a student's use of educational technology can predict long-term outcomes, with models performing similarly to those trained on full-period data. Features such as success rate and average attempts per problem are strong predictors, especially for students at the performance extremes, and combining these log features with pre-assessment scores improves prediction accuracy further.
Summary of https://arxiv.org/pdf/2412.15473
The paper investigates whether student log data from educational technology, specifically from the first few hours of use, can predict long-term student outcomes such as end-of-year external assessments.
Using data from a literacy game in Uganda and two math tutoring systems in the US, the researchers explore whether machine learning models trained on this short-term data can predict those outcomes effectively.
They examine the accuracy of different machine learning algorithms and identify some common predictive features across the diverse datasets. Additionally, the study analyzes the prediction quality for different student performance levels and the impact of including pre-assessment scores in the models.
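To make the two log features named above concrete, here is a minimal sketch of how a per-student success rate and average attempts per problem might be computed from only the first hours of use. The event schema (`student_id`, `problem_id`, `seconds_spent`, `correct`), the function name `early_usage_features`, and the choice to measure the window by cumulative time on task are all assumptions made for illustration; the paper's own feature definitions and log formats may differ.

```python
from collections import defaultdict


def early_usage_features(events, max_hours=2.0):
    """Compute per-student features from the first `max_hours` of use.

    `events` is an iterable of (student_id, problem_id, seconds_spent, correct)
    tuples, assumed to be in chronological order for each student.
    This schema is hypothetical, not taken from the paper.
    """
    budget = max_hours * 3600          # early-usage window, in seconds
    used = defaultdict(float)          # cumulative time on task per student
    attempts = defaultdict(int)        # attempts made inside the window
    successes = defaultdict(int)       # correct attempts inside the window
    problems = defaultdict(set)        # distinct problems tried in the window

    for student, problem, seconds, correct in events:
        if used[student] >= budget:
            continue                   # this student's window is already full
        used[student] += seconds
        attempts[student] += 1
        successes[student] += int(correct)
        problems[student].add(problem)

    return {
        s: {
            "success_rate": successes[s] / attempts[s],
            "avg_attempts_per_problem": attempts[s] / len(problems[s]),
        }
        for s in attempts
    }
```

A feature table like this, optionally joined with a pre-assessment score per student, is the kind of input a standard regression or classification model could be trained on to predict the end-of-year assessment.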