RNN Sleep Stage Modeling for Shift Work
ISEF Category: Computational Biology and Bioinformatics
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Computational Neuroscience · Difficulty: Advanced · Setup: University Lab · Time: Full Year
The Hook
Your brain does not sleep in a straight line. It cycles, skips, and repeats like a playlist that keeps remixing itself. That makes sleep stage prediction a great test for machine learning. It also connects to shift work, where timing can throw the whole pattern off.
What Is It?
This project uses a recurrent neural network, or RNN, to predict how sleep stages change over time. Sleep stages include wake, light sleep (N1 and N2), deep sleep (N3), and REM sleep. An RNN is a model that keeps an internal state summarizing what came before, so it can learn patterns in sequences, not just one snapshot at a time.
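To make "remembers what came before" concrete, here is a minimal sketch of a plain (Elman) RNN forward pass in NumPy. The weights are random and untrained, and the sizes and names (`rnn_forward`, `W_in`, `W_rec`) are illustrative assumptions, not part of any library. In a real project TensorFlow or PyTorch would handle this step and the training.

```python
import numpy as np

def rnn_forward(inputs, W_in, W_rec, b):
    """Run a plain (Elman) RNN over a sequence and return every hidden state."""
    hidden = np.zeros(W_rec.shape[0])
    states = []
    for x_t in inputs:
        # The new state depends on the current input AND the previous state.
        hidden = np.tanh(W_in @ x_t + W_rec @ hidden + b)
        states.append(hidden)
    return np.stack(states)

rng = np.random.default_rng(0)
n_stages, n_hidden = 5, 8  # 5 stage classes, 8 hidden units (arbitrary choices)
W_in = rng.normal(scale=0.5, size=(n_hidden, n_stages))
W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)

# Two one-hot sequences that end in the same stage (index 2) but start differently.
seq_a = np.eye(n_stages)[[0, 1, 2]]
seq_b = np.eye(n_stages)[[4, 1, 2]]
final_a = rnn_forward(seq_a, W_in, W_rec, b)[-1]
final_b = rnn_forward(seq_b, W_in, W_rec, b)[-1]
```

Both sequences end on the same stage, yet `final_a` and `final_b` differ because the hidden state still carries information about the earlier epochs. That carried-over state is what lets the model use history when it predicts the next stage.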
Think of it like predicting the next song in a playlist after hearing the last few tracks. Sleep behaves similarly: the current stage makes some next stages more likely and others less likely. Your model can learn those transition patterns from the public Sleep-EDF data. You can then test how adding circadian timing signals, often called Process C, changes the model’s predictions.
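The idea that one stage makes the next more or less likely can be summarized as a transition matrix. The sketch below estimates one from a single night of stage labels. The five-class coding in `STAGES` and the toy sequence are assumptions for illustration, not Sleep-EDF data.

```python
import numpy as np

STAGES = ["W", "N1", "N2", "N3", "REM"]  # assumed class order, indices 0..4

def transition_matrix(hypnogram, n_stages=len(STAGES)):
    """Estimate P[i, j] = probability that stage j follows stage i."""
    counts = np.zeros((n_stages, n_stages))
    for current, following in zip(hypnogram[:-1], hypnogram[1:]):
        counts[current, following] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for stages that never occur stay all zero instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Made-up toy night, one integer per epoch.
toy_night = [0, 1, 2, 2, 3, 2, 4, 4, 0]
P = transition_matrix(toy_night)
```

Each row of `P` is the empirical distribution of the next stage given the current one. The same function can be applied to a model's predicted sequence, which gives a direct way to compare predicted and observed transition rates.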
Process C is the circadian component of the two-process model of sleep regulation, in other words the body clock. It affects when you feel sleepy or alert across the day. By comparing a model with and without that input, you can ask whether the clock helps explain why REM and NREM cycling looks different under shift-work schedules.
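One simple way to hand a body-clock signal to a model is to encode clock time as a point on a circle. The sketch below is only a proxy for Process C: it assumes you can recover each epoch's clock time from the recording start time, and it treats clock time as if it were circadian phase, which is a simplification. The function names are hypothetical.

```python
import math

def circadian_features(clock_hour, period_hours=24.0):
    """Encode a clock time as (sin, cos) so that 23:59 and 00:01 sit close together."""
    angle = 2.0 * math.pi * (clock_hour % period_hours) / period_hours
    return math.sin(angle), math.cos(angle)

def feature_distance(hour_a, hour_b):
    """Euclidean distance between two clock times in the encoded space."""
    sin_a, cos_a = circadian_features(hour_a)
    sin_b, cos_b = circadian_features(hour_b)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)

near_midnight = feature_distance(23.5, 0.5)   # one hour apart across midnight
half_day_apart = feature_distance(23.5, 11.5)  # twelve hours apart
```

With this encoding, times on either side of midnight end up close together, while times twelve hours apart end up far apart. A raw hour value from 0 to 24 would get the first case wrong.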
Why This Is a Good Topic
This is a strong science fair topic because you can test a clear question with public data and measurable outputs. You are not guessing whether the model works; you can score it with accuracy, transition matrices, and sequence metrics. The project also connects to sleep health, shift work, and circadian biology, which gives it real-world meaning. You can learn data cleaning, model building, and model comparison without needing a wet lab.
Research Questions
- How does adding circadian Process C inputs change RNN accuracy for predicting the next sleep stage?
- What is the effect of shift-work-like time shifts on predicted REM to NREM transition rates?
- Does a model trained on one subset of Sleep-EDF patients generalize to new patients better than a model trained on mixed nights?
- To what extent does removing circadian timing information reduce sequence prediction performance?
- Which sleep-stage transitions, such as NREM to REM or REM to wake, change most when Process C inputs are ablated?
- How does the length of the input sequence affect the model’s ability to predict unstable sleep transitions?
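Several of the questions above, especially the one about input sequence length, come down to how you cut a night into training examples. Here is a minimal sketch, assuming stages are already integer-coded and that `make_windows` (a hypothetical helper) is called on one night at a time so no window spans two nights.

```python
def make_windows(hypnogram, window_len):
    """Return (history, next_stage) pairs from one night's stage sequence."""
    pairs = []
    for start in range(len(hypnogram) - window_len):
        history = hypnogram[start:start + window_len]
        target = hypnogram[start + window_len]
        pairs.append((history, target))
    return pairs

# Made-up toy night, one integer stage code per epoch.
toy_night = [0, 1, 2, 2, 3, 2, 4, 4, 0]
short_context = make_windows(toy_night, window_len=3)
long_context = make_windows(toy_night, window_len=6)
```

Rerunning the same experiment with different `window_len` values, while keeping everything else fixed, is one way to turn the sequence-length question into a controlled comparison.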
Basic Materials
- Computer with enough memory to handle time-series data.
- Python installed with a Jupyter notebook environment.
- Public Sleep-EDF dataset from PhysioNet.
- CSV or EDF file reader package for sleep signals, such as MNE-Python or pyEDFlib.
- Pandas for data cleaning and table work.
- NumPy for numerical operations.
- Scikit-learn for baseline models and metrics.
- TensorFlow or PyTorch for building the RNN.
- Matplotlib or Seaborn for plots.
- GitHub or local version control for tracking code changes.
Advanced Materials
- High-memory workstation or access to a university computing cluster.
- Python with TensorFlow or PyTorch and GPU support.
- Sleep staging annotations from Sleep-EDF on PhysioNet.
- Signal processing packages for feature extraction from EEG, EOG, or EMG channels.
- Custom scripts for circadian phase encoding.
- Statistical testing tools in SciPy or statsmodels.
- ImageJ for any figure inspection or publication-quality image checks.
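Before any modeling, the staging annotations need to become integer class labels. The sketch below assumes the annotation strings follow the pattern "Sleep stage W", "Sleep stage 1", and so on. Verify this against the files you actually download. Merging stages 3 and 4 into one deep-sleep class is a common convention but still a choice you should state in your report.

```python
# Assumed annotation strings; check them against your downloaded hypnogram files.
SLEEP_EDF_TO_CLASS = {
    "Sleep stage W": 0,
    "Sleep stage 1": 1,
    "Sleep stage 2": 2,
    "Sleep stage 3": 3,
    "Sleep stage 4": 3,  # merged with stage 3 into one deep-sleep class (assumption)
    "Sleep stage R": 4,
}

def encode_annotations(labels):
    """Map label strings to class indices, dropping anything not in the table."""
    return [SLEEP_EDF_TO_CLASS[label] for label in labels if label in SLEEP_EDF_TO_CLASS]

example_labels = ["Sleep stage W", "Sleep stage 1", "Sleep stage ?", "Sleep stage 4", "Sleep stage R"]
encoded = encode_annotations(example_labels)
```

Dropping unscored epochs leaves a gap in the sequence, so it is worth counting how many are removed per night and deciding whether to split the night at those points instead.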
Software & Tools
- Python: Runs data cleaning, feature extraction, model training, and evaluation.
- Jupyter Notebook: Lets you test ideas, plot results, and document each step.
- TensorFlow: Builds sequence models for sleep-stage prediction.
- PyTorch: Offers flexible RNN experimentation and easier custom model design.
- PhysioNet: Provides the public Sleep-EDF dataset and metadata.
- scikit-learn: Computes baseline metrics, splits data, and supports comparison models.
Experiment Steps
- Define the prediction task, then decide whether you will predict the next sleep stage, the next transition, or the full sequence.
- Choose one circadian input format, then decide how you will encode Process C across the night.
- Build a clean train, validation, and test split that keeps every night from the same patient in the same split.
- Plan a baseline model first, then compare it against an RNN with circadian inputs and an ablated version without them.
- Select evaluation metrics that reward both stage accuracy and transition quality, not just raw percent correct.
- Design a comparison table that tests shift-work-like time shifts, patient groups, or sequence length effects.
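The split step above is where many projects leak data, so here is a minimal sketch of a patient-level split using only the standard library. The patient IDs are invented placeholders and `split_by_patient` is a hypothetical helper. scikit-learn's `GroupShuffleSplit` offers a similar grouped split if you prefer a library routine.

```python
import random

def split_by_patient(patient_ids, val_frac=0.15, test_frac=0.15, seed=0):
    """Assign whole patients (not individual nights) to train, validation, or test."""
    unique_ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique_ids)
    n_test = max(1, round(len(unique_ids) * test_frac))
    n_val = max(1, round(len(unique_ids) * val_frac))
    test_ids = set(unique_ids[:n_test])
    val_ids = set(unique_ids[n_test:n_test + n_val])
    train_ids = set(unique_ids[n_test + n_val:])
    return train_ids, val_ids, test_ids

# Placeholder data: 20 invented patients with two nights each.
night_patient_ids = [f"P{n:02d}" for n in range(20) for _ in range(2)]
train_ids, val_ids, test_ids = split_by_patient(night_patient_ids)

# Each night goes wherever its patient went.
train_nights = [i for i, pid in enumerate(night_patient_ids) if pid in train_ids]
```

Fixing the `seed` and reusing the same three ID sets for the baseline, the RNN with circadian inputs, and the ablated RNN keeps the comparison fair.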
Common Pitfalls
- Mixing data from the same patient across train and test sets, which inflates model performance.
- Using raw sleep labels without checking class imbalance, which can make the model guess the most common stage.
- Treating circadian time as a simple clock hour, which can miss the wraparound nature of the body clock.
- Comparing models with different preprocessing steps, which makes the ablation result unfair.
- Reporting only accuracy, which can hide weak REM prediction and poor transition timing.
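The last pitfall is easy to demonstrate. The sketch below computes per-class recall and Cohen's kappa by hand in NumPy on a made-up two-class example, so you can see what the metrics measure. In practice scikit-learn provides equivalent functions. The labels are toy values, not results.

```python
import numpy as np

def per_class_recall(y_true, y_pred, n_classes):
    """Fraction of each true class that the model labeled correctly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    recalls = []
    for c in range(n_classes):
        mask = y_true == c
        recalls.append(float((y_pred[mask] == c).mean()) if mask.any() else float("nan"))
    return recalls

def cohen_kappa(y_true, y_pred, n_classes):
    """Agreement corrected for the agreement expected by chance."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    observed = float((y_true == y_pred).mean())
    true_freq = np.bincount(y_true, minlength=n_classes) / len(y_true)
    pred_freq = np.bincount(y_pred, minlength=n_classes) / len(y_pred)
    expected = float(true_freq @ pred_freq)
    if expected == 1.0:
        return float("nan")  # undefined when only one class appears everywhere
    return (observed - expected) / (1.0 - expected)

# Toy labels: class 0 is common (think N2), class 1 is rare (think REM).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10  # a model that always guesses the common class
accuracy = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
recalls = per_class_recall(y_true, y_pred, n_classes=2)
kappa = cohen_kappa(y_true, y_pred, n_classes=2)
```

Here accuracy is 0.8, yet recall for the rare class is 0 and kappa is 0, meaning the model does no better than chance agreement. Reporting per-class scores next to accuracy exposes that failure.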
What Makes This Competitive
A stronger project goes beyond a basic accuracy score. You can test whether circadian inputs improve rare transition prediction, patient generalization, or shift-work simulation. You can also compare more than one model family, then use careful statistics to show where the clock signal helps and where it does not. That kind of analysis looks much more like real computational neuroscience.
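One option for the "careful statistics" part is a paired sign-flip permutation test on per-patient scores, sketched below. It asks whether the difference between two models, measured on the same held-out patients, could plausibly be due to chance. The function name and the scores are placeholders for illustration. SciPy and statsmodels offer other paired tests you may prefer.

```python
import numpy as np

def paired_sign_flip_test(scores_a, scores_b, n_permutations=10000, seed=0):
    """Two-sided p-value for the mean paired difference between two models."""
    diffs = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    observed = abs(diffs.mean())
    rng = np.random.default_rng(seed)
    # Under the null hypothesis, each patient's difference is equally likely to flip sign.
    signs = rng.choice([-1.0, 1.0], size=(n_permutations, diffs.size))
    permuted = np.abs((signs * diffs).mean(axis=1))
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (np.sum(permuted >= observed) + 1) / (n_permutations + 1)

# Placeholder per-patient scores, invented purely to show the call.
with_clock = [0.75, 0.80, 0.70, 0.85, 0.75, 0.80, 0.70, 0.85, 0.75, 0.80]
without_clock = [0.70, 0.75, 0.65, 0.80, 0.70, 0.75, 0.65, 0.80, 0.70, 0.75]
p_value = paired_sign_flip_test(with_clock, without_clock)
```

Because the test pairs scores by patient, it only makes sense when both models were evaluated on exactly the same test patients.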
Project Variations
- Use only EEG-derived features instead of full Sleep-EDF channel sets, then test whether simpler inputs still predict stage transitions well.
- Replace a plain RNN with an LSTM or GRU, two gated RNN variants, then compare which sequence model handles circadian timing better.
- Simulate shift-work by shifting circadian phase inputs across nights, then measure how REM timing and transition stability change.
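The shift-work variation can be sketched as a change to the clock-time input only, leaving the sleep-stage labels untouched. The helper below assumes 30-second scoring epochs and a known recording start hour. `shifted_clock_hours` and the 8-hour shift are hypothetical choices for illustration.

```python
def shifted_clock_hours(start_hour, n_epochs, shift_hours=0.0, epoch_seconds=30):
    """Clock time (0 to 24) of each epoch, optionally shifted to mimic a moved schedule."""
    hours = []
    for epoch in range(n_epochs):
        elapsed_hours = epoch * epoch_seconds / 3600.0
        hours.append((start_hour + shift_hours + elapsed_hours) % 24.0)
    return hours

# A night that starts at 23:30, and the same night presented as if it began 8 hours later.
normal_night = shifted_clock_hours(23.5, n_epochs=121)
shifted_night = shifted_clock_hours(23.5, n_epochs=121, shift_hours=8.0)
```

Feeding `shifted_night` through the same circadian encoding the model was trained with, then comparing predicted REM timing against the unshifted run, gives a simulated shift-work condition. Keep in mind that this tests what the model has learned about timing. It does not show how real shift workers sleep.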
Learn More
- PhysioNet: Search for Sleep-EDF and related sleep staging datasets, plus documentation on how the public records are organized.
- NIH PubMed: Search for review articles on sleep stage scoring, circadian rhythms, and shift work sleep disorder.
- NASA Open Science Repository: Look for freely available materials on circadian timing and human performance under schedule shifts.
- MIT OpenCourseWare: Search for machine learning and neural networks courses that cover RNNs and sequence modeling.
- SciPy Lecture Notes: Use the free online notes for practical Python data analysis, plotting, and statistics.
- National Library of Medicine Bookshelf: Search for open textbooks on sleep physiology, circadian biology, and data science methods.
Computational Biology and Bioinformatics Category Guide
How to Do Real Computational Biology Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Hub →
