Personalized Practice Scheduling for Better Retention

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Online Learning · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

Most study apps guess when to give you the next problem. That guess can make the difference between remembering an idea next week and losing it in a day. You can test whether a smarter schedule keeps students learning longer. This project turns practice timing into a real experiment.

What Is It?

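Bayesian Knowledge Tracing, or BKT, estimates whether a student has mastered a skill from their sequence of right and wrong answers. Think of it like a scoreboard with hidden stats: mastery is never observed directly, so each new answer updates the model's probability that the student knows the skill. The standard model uses four parameters per skill: the chance the skill was already known, the chance of learning it at each practice opportunity, the chance of guessing correctly, and the chance of slipping on a skill the student does know.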
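To make the update concrete, here is a minimal sketch of one standard BKT step in Python. The parameter values are illustrative placeholders, not fitted numbers, and the names `BKTParams`, `bkt_update`, and `trace` are made up for this example. In a real project you would fit the parameters to your training logs.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    # All four defaults are placeholder assumptions for illustration only.
    p_init: float = 0.2    # chance the skill is known before any practice
    p_learn: float = 0.15  # chance of learning the skill at each opportunity
    p_guess: float = 0.2   # chance of a correct answer without knowing the skill
    p_slip: float = 0.1    # chance of a wrong answer despite knowing the skill


def bkt_update(p_known: float, correct: bool, params: BKTParams) -> float:
    """Return the mastery estimate after observing one answer."""
    if correct:
        evidence_known = p_known * (1 - params.p_slip)
        evidence_unknown = (1 - p_known) * params.p_guess
    else:
        evidence_known = p_known * params.p_slip
        evidence_unknown = (1 - p_known) * (1 - params.p_guess)
    # Bayes' rule: how likely is mastery given this answer?
    posterior = evidence_known / (evidence_known + evidence_unknown)
    # The student may also learn the skill from this practice opportunity.
    return posterior + (1 - posterior) * params.p_learn


def trace(responses, params: BKTParams):
    """Run the update over a whole answer sequence for one skill."""
    p_known = params.p_init
    history = []
    for correct in responses:
        p_known = bkt_update(p_known, correct, params)
        history.append(p_known)
    return history
```

Calling `trace([True, True, False], BKTParams())` returns one mastery estimate per answer, which you can plot to watch the estimate rise after correct answers and fall after mistakes.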

This project adds a forgetting curve to that model. A forgetting curve describes how memory fades over time if you do not review. Standard BKT assumes that a skill, once learned, is never forgotten, so it has no way to represent that fading. Instead of only asking, “Did the student learn it?”, your system also asks, “How fast do they forget it?” That lets the app schedule review at the right moment, before the skill drops too far.
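One way to picture the scheduling idea is sketched below. It assumes an exponential forgetting curve, which is a common choice but only one of several you could test. The `stability_hours` parameter (how slowly a given student forgets) and the 0.7 review threshold are assumptions made for this example, as are the function names.

```python
import math


def decayed_mastery(p_known: float, hours_elapsed: float, stability_hours: float) -> float:
    """Estimated mastery after a gap with no practice (exponential decay assumption)."""
    return p_known * math.exp(-hours_elapsed / stability_hours)


def hours_until_review(p_known: float, stability_hours: float, threshold: float = 0.7) -> float:
    """Hours until the decayed estimate reaches the threshold; 0 means review now."""
    if p_known <= threshold:
        return 0.0
    # Solve p_known * exp(-t / stability_hours) = threshold for t.
    return stability_hours * math.log(p_known / threshold)
```

A student with a larger `stability_hours` forgets more slowly, so the same mastery estimate produces a longer wait before the next review. Estimating that value per student is the personalization step this project tests.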

The data usually comes from large learning logs, such as those from edX or the EdNet data set. Those logs show when students answered, what they got right, and how performance changed over time. Your job is to turn that history into a scheduler that predicts the next best practice item.
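As a rough sketch of the first processing step, the pandas snippet below computes the time gap between a student's attempts on the same skill. The column names (`student_id`, `skill_id`, `timestamp`, `correct`) and the five rows are made-up placeholders. Real logs use their own schemas, so expect to rename columns first.

```python
import pandas as pd

# Toy rows for illustration only; they do not come from any real data set.
logs = pd.DataFrame({
    "student_id": ["s1", "s1", "s1", "s2", "s2"],
    "skill_id": ["fractions"] * 5,
    "timestamp": pd.to_datetime([
        "2024-01-01 10:00", "2024-01-01 12:00", "2024-01-03 12:00",
        "2024-01-01 09:00", "2024-01-02 09:00",
    ]),
    "correct": [0, 1, 1, 1, 0],
})


def add_time_gaps(df: pd.DataFrame) -> pd.DataFrame:
    """Add hours since the previous attempt and an attempt counter per student and skill."""
    keys = ["student_id", "skill_id"]
    df = df.sort_values(keys + ["timestamp"]).reset_index(drop=True)
    # The first attempt in each group has no previous attempt, so its gap is NaN.
    df["hours_since_last"] = df.groupby(keys)["timestamp"].diff().dt.total_seconds() / 3600
    df["attempt_number"] = df.groupby(keys).cumcount() + 1
    return df


features = add_time_gaps(logs)
```

The `hours_since_last` column is the raw material for any forgetting estimate, because it records how long each skill went without practice before the next answer.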

Why This Is a Good Topic

This topic works well because it has a clear input, a clear output, and a measurable result. You can compare a standard BKT scheduler with your memory-aware version and ask which one improves later recall. That makes the project testable, data-driven, and straightforward to evaluate with statistics. It also connects to real problems in online learning, where good timing can help students retain more with less wasted practice.

Research Questions

How does adding a per-student forgetting curve change retention compared with standard BKT?
What is the effect of personalized review timing on delayed quiz scores?
Does a memory-aware scheduler improve long-term retention for low-performing students more than for high-performing students?
To what extent do different decay-rate assumptions change the model's prediction accuracy?
Which feature set (response history, time since last practice, or item difficulty) best predicts forgetting in the log data?
How does the adaptive scheduler affect the number of practice items needed to reach the same retention level?

Basic Materials

Laptop or desktop computer with internet access.
Spreadsheet software or Google Sheets.
Python installed with Jupyter Notebook.
Free data set from edX, EdNet, or another open learning log source.
Web browser for testing a simple practice app.
Digital notebook for research notes and hypothesis tracking.

Advanced Materials

Laptop or desktop computer with internet access.
Python environment with Jupyter Notebook, pandas, NumPy, scikit-learn, and PyTorch or TensorFlow.
Access to open education logs from edX, EdNet, or similar research data.
Database or cloud storage for storing user interaction logs.
Web framework such as Flask or Django for building a test app.
Statistical analysis tools for survival analysis, mixed models, or Bayesian inference.
Version control with Git and GitHub for experiment tracking.

Software & Tools

Python: Lets you clean learning logs, fit models, and run prediction tests.
Jupyter Notebook: Helps you explore patterns in the data and compare model versions.
pandas: Organizes student response logs into usable tables.
scikit-learn: Supports train-test splits, metrics, and baseline models.
GitHub: Tracks code changes and records experiment versions.

Experiment Steps

Define the learning outcome you will measure, such as delayed recall, mastery gain, or review efficiency.
Choose the baseline scheduler you will compare against, then write down exactly how your new model changes it.
Build a clean data pipeline so every student session, answer, and time gap gets recorded the same way.
Decide how you will estimate forgetting, then fit that estimate on a training set before you test it.
Plan your control condition, your adaptive condition, and the metric that decides which one works better.
Set up your evaluation so you can compare short-term accuracy, long-term retention, and practice efficiency without mixing them together.
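To keep short-term and long-term results from blending together, one option is to score predictions separately by time gap. The sketch below uses the Brier score (mean squared error between the predicted probability and the 0/1 outcome). The bucket boundaries and names are assumptions you should adjust to your own data.

```python
from collections import defaultdict

# Assumed bucket edges in hours: (low, high, name), with low inclusive and high exclusive.
GAP_BUCKETS = [
    (0, 24, "under_1_day"),
    (24, 168, "1_to_7_days"),
    (168, float("inf"), "over_7_days"),
]


def bucket_for(hours_since_last: float) -> str:
    """Return the name of the gap bucket that contains this time gap."""
    for low, high, name in GAP_BUCKETS:
        if low <= hours_since_last < high:
            return name
    raise ValueError(f"gap must be non-negative, got {hours_since_last}")


def brier_by_gap(records):
    """records: iterable of (hours_since_last, predicted_probability, correct as 0 or 1)."""
    totals = defaultdict(lambda: [0.0, 0])  # bucket name -> [sum of squared errors, count]
    for hours, predicted_p, correct in records:
        entry = totals[bucket_for(hours)]
        entry[0] += (predicted_p - correct) ** 2
        entry[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}
```

Run the same function on the baseline model's predictions and on your memory-aware model's predictions. A lower Brier score in the longest-gap bucket is the kind of evidence that speaks to long-term retention rather than short-term accuracy.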

Common Pitfalls

Training and testing on the same student logs, which makes the model look better than it really is.
Using raw answer correctness as the only signal, which misses time gaps and forgetting.
Comparing schedulers on different student groups, which confuses the effect of the model with the effect of the learners.
Defining retention with only one quick quiz, which does not show whether memory holds over time.
Ignoring item difficulty, which can make the scheduler seem smart when the questions were just easier.
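The first pitfall has a direct fix: split by student rather than by row, so no student's answers appear in both the training and test sets. Here is a minimal sketch using scikit-learn's `GroupShuffleSplit`. The helper name `split_by_student` and the default test size are assumptions for this example.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit


def split_by_student(student_ids, test_size=0.25, seed=0):
    """Return (train_idx, test_idx) row indices with no student shared between them."""
    student_ids = np.asarray(student_ids)
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    # Only the groups decide the split, so a placeholder feature matrix is enough here.
    placeholder_features = np.zeros((len(student_ids), 1))
    train_idx, test_idx = next(splitter.split(placeholder_features, groups=student_ids))
    return train_idx, test_idx
```

Pass in the student ID column from your log table, then use the returned indices to select training and test rows. Fixing the seed makes the split repeatable, which helps when you compare model versions.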

What Makes This Competitive

A stronger project goes beyond a simple accuracy score. You can test whether the scheduler helps different learner groups, different skill types, or different time gaps. You can also compare several forgetting models instead of just one. If your analysis shows when the model works, when it fails, and why, the project looks much closer to real research.

Project Variations

Compare the scheduler on math facts versus vocabulary to see whether forgetting behaves differently across content types.
Test whether item difficulty or time since last practice matters more in predicting the next review time.
Replace the forgetting curve with a simpler spaced-repetition rule and measure whether the adaptive model still wins.
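For the last variation, one candidate for the simpler rule is a Leitner-style box system, sketched below. It is a stand-in chosen for illustration, not the only possible baseline, and the doubling intervals are an assumption.

```python
# Assumed review intervals in days for boxes 0 through 4.
LEITNER_INTERVALS_DAYS = [1, 2, 4, 8, 16]


def leitner_next(box: int, correct: bool):
    """Return (new_box, days_until_next_review) after one answer."""
    if correct:
        # Move up one box, but stay in the top box once it is reached.
        new_box = min(box + 1, len(LEITNER_INTERVALS_DAYS) - 1)
    else:
        # A wrong answer sends the item back to the first box.
        new_box = 0
    return new_box, LEITNER_INTERVALS_DAYS[new_box]
```

Because this rule ignores the individual student entirely, it gives you a clean comparison: if your adaptive model cannot beat it on delayed recall, the personalization is not adding much.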

Learn More

MIT OpenCourseWare: Search for machine learning, data analysis, and algorithms courses that explain model evaluation and prediction basics.
PubMed: Search review articles on spaced repetition, memory retention, and educational interventions.
NIH: Use National Library of Medicine resources to find background on learning science and experimental design.
arXiv: Search for recent preprints on Bayesian Knowledge Tracing, knowledge tracing, and educational recommendation systems.
Google Scholar: Search for papers on forgetting curves, spaced repetition, and online learning analytics.
