Sleep Apnea Severity from Pulse Ox

ISEF Category: Biomedical and Health Sciences

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Pathophysiology · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months

The Hook

A tiny drop in blood oxygen can leave a big clue in an overnight trace. Think of it like a heartbeat monitor for breathing, except the pattern stretches across the whole night. If a model can read that pattern well, it may estimate sleep-apnea severity without a full sleep lab.

What Is It?

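Sleep apnea is a condition where breathing repeatedly stops or gets shallow during sleep. A pulse oximeter tracks blood oxygen saturation (SpO2) over time, so repeated dips can act like footprints left by those breathing events. AHI, or apnea-hypopnea index, is the clinical severity score: the number of apneas (pauses) and hypopneas (shallow-breathing events) per hour of sleep. For adults, commonly cited bands are under 5 normal, 5 to under 15 mild, 15 to under 30 moderate, and 30 or more severe.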
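To make the AHI arithmetic concrete, here is a minimal sketch that turns event counts into an hourly index and a severity band. The function names and the example counts are made up for illustration, and the cutoffs are the commonly cited adult thresholds (5, 15, and 30 events per hour); check your dataset's documentation for the scoring rules it actually used.

```python
def ahi_from_counts(n_apneas: int, n_hypopneas: int, sleep_hours: float) -> float:
    """Apneas plus hypopneas, divided by hours of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (n_apneas + n_hypopneas) / sleep_hours


def severity_band(ahi: float) -> str:
    """Map AHI to a band using commonly cited adult cutoffs (assumption)."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


# Hypothetical night: 60 apneas and 84 hypopneas over 7.2 hours of sleep.
ahi = ahi_from_counts(n_apneas=60, n_hypopneas=84, sleep_hours=7.2)
print(round(ahi, 1), severity_band(ahi))  # 20.0 moderate
```

The same `severity_band` helper can produce class labels if you choose severity bands as your target.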

A CNN-LSTM model mixes two ideas. The CNN looks for short local patterns, like quick drops and recoveries in oxygen. The LSTM follows the order of those patterns across the night, so the model can learn the bigger story, not just one spike at a time.
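As a rough picture of how the two parts stack, here is a small Keras sketch. Everything specific in it is an assumption chosen for illustration: a 300-sample input window, four output classes, and the layer sizes. It is a starting point to adapt, not a tuned or validated architecture.

```python
import numpy as np
import tensorflow as tf


def build_cnn_lstm(window_len: int = 300, n_classes: int = 4) -> tf.keras.Model:
    """Conv1D layers find short local shapes; the LSTM reads them in order."""
    inputs = tf.keras.Input(shape=(window_len, 1))  # one channel: SpO2
    x = tf.keras.layers.Conv1D(16, kernel_size=9, padding="same", activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling1D(pool_size=4)(x)
    x = tf.keras.layers.Conv1D(32, kernel_size=9, padding="same", activation="relu")(x)
    x = tf.keras.layers.MaxPooling1D(pool_size=4)(x)
    x = tf.keras.layers.LSTM(32)(x)  # keeps only the final sequence summary
    outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_cnn_lstm()
# Untrained forward pass on two all-zero windows, just to confirm shapes.
probs = model(np.zeros((2, 300, 1), dtype="float32")).numpy()
print(probs.shape)
```

Dropping the LSTM line and replacing it with a pooling layer gives you a plain 1D CNN baseline with the same front end, which keeps the comparison fair.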

Why This Is a Good Topic

This is a strong science fair topic because you can test a real medical question with public data and clear metrics. Sleep apnea matters because full sleep studies are expensive and hard to access, so a simpler home signal could help screening. You can learn signal preprocessing, model comparison, and validation without needing a hospital lab.

Research Questions

How does window length affect sleep-apnea severity classification from overnight pulse-ox traces?
What is the effect of using raw oxygen saturation versus engineered desaturation features on model performance?
Does a CNN-LSTM outperform a simple 1D CNN on public sleep-apnea data?
To what extent does patient-wise splitting change reported accuracy compared with random splitting?
Which label setup, binary risk or severity bands, gives the most stable recall for severe cases?
How does class imbalance correction affect the model's ability to detect the rarer severe cases?
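Two of these questions depend on how you cut up the signal and what an "engineered desaturation feature" means in practice. The sketch below shows one possible version of each, using NumPy on a synthetic trace. The 1 Hz sampling rate, the 3-point drop, and the 120-sample trailing-maximum baseline are illustrative assumptions, and the event counter is a simplified stand-in, not a clinical scoring rule.

```python
import numpy as np


def make_windows(signal: np.ndarray, window_len: int, step: int) -> np.ndarray:
    """Slice a 1-D trace into fixed-length windows (overlapping if step < window_len)."""
    n = (len(signal) - window_len) // step + 1
    if n <= 0:
        return np.empty((0, window_len), dtype=signal.dtype)
    idx = np.arange(window_len)[None, :] + step * np.arange(n)[:, None]
    return signal[idx]


def count_desaturations(spo2: np.ndarray, drop: float = 3.0, baseline_len: int = 120) -> int:
    """Count runs where SpO2 sits at least `drop` points below a trailing-max baseline."""
    below = np.zeros(len(spo2), dtype=bool)
    for i in range(len(spo2)):
        start = max(0, i - baseline_len)
        baseline = spo2[start:i + 1].max()
        below[i] = (baseline - spo2[i]) >= drop
    # An event starts wherever `below` flips from False to True.
    previous = np.concatenate(([False], below[:-1]))
    return int((below & ~previous).sum())


# Synthetic 10-minute trace at an assumed 1 Hz with two dips from 96% to 91%.
fs = 1.0
spo2 = np.full(600, 96.0)
spo2[100:120] = 91.0
spo2[300:330] = 91.0

windows = make_windows(spo2, window_len=300, step=150)
events = count_desaturations(spo2)
events_per_hour = events * 3600.0 / (len(spo2) / fs)
print(windows.shape, events, events_per_hour)
```

Changing `window_len` and `step` is the knob for the window-length question, and a per-window event count is one example of a hand-built feature to compare against raw input.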

Basic Materials

Laptop or desktop computer with at least 8 GB RAM and Python support.
Google Colab free account for notebook runs and optional GPU access.
Python 3.11 with pandas, NumPy, Matplotlib, TensorFlow, and scikit-learn.
Jupyter Notebook or VS Code for cleaning data and running experiments.
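Access to the Apnea-ECG database on PhysioNet (only a small subset of its recordings include an SpO2 channel) and an approved data request for MESA, which is distributed through the National Sleep Research Resource (NSRR) rather than PhysioNet.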
Stable internet connection for downloading data and documentation.

Advanced Materials

Research workstation with GPU access for faster training and tuning.
Access to raw overnight pulse-oximeter files and scored AHI labels.
Secure storage for clinical data, if your school or lab requires it.
WFDB-compatible signal software for reading PhysioNet records.
Signal review tools for checking bad segments, missing samples, and label quality.
Optional polysomnography reference data for stronger validation against clinical scoring.

Software & Tools

Python: Handles data cleaning, feature extraction, model training, and evaluation.
Google Colab: Runs notebooks and offers free but limited GPU access for early model tests.
TensorFlow/Keras: Builds the CNN-LSTM and baseline neural networks.
scikit-learn: Handles patient-wise data splits, cross-validation, confusion matrices, and per-class metrics.
WFDB Python package: Reads PhysioNet signal files and annotations.

Experiment Steps

Define your target label, such as binary apnea risk, severity bands, or continuous AHI.
Lock a patient-wise split so traces from the same person never appear in both train and test sets.
Decide how you will segment and normalize each overnight signal before it reaches the model.
Build a simple baseline first, then compare it with the CNN-LSTM to see whether sequence memory helps.
Choose metrics that expose clinical mistakes, especially recall for severe cases and calibration of predicted risk.
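The patient-wise split and the normalization step can be sketched with scikit-learn's GroupShuffleSplit. The data here is random placeholder data with made-up sizes (10 patients, 20 windows each), so only the splitting logic carries over to a real project.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n_patients, windows_per_patient = 10, 20

# One patient ID per window, so the splitter knows which windows belong together.
patient_ids = np.repeat(np.arange(n_patients), windows_per_patient)
X = rng.normal(size=(len(patient_ids), 300))   # placeholder windows
y = rng.integers(0, 4, size=len(patient_ids))  # placeholder severity bands

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=patient_ids))

train_patients = set(patient_ids[train_idx].tolist())
test_patients = set(patient_ids[test_idx].tolist())
assert train_patients.isdisjoint(test_patients)  # no patient on both sides

# Fit normalization statistics on the training windows only, then reuse them.
mean, std = X[train_idx].mean(), X[train_idx].std()
X_train = (X[train_idx] - mean) / std
X_test = (X[test_idx] - mean) / std
print(len(train_patients), len(test_patients))
```

Computing the mean and standard deviation from the training set alone avoids a quieter form of leakage, where test-set statistics shape the inputs the model trains on.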

Common Pitfalls

Randomly splitting windows from the same patient into train and test, which leaks personal signal patterns and inflates results.
Using accuracy alone, which can hide weak detection of severe sleep apnea.
Feeding the model traces with mixed sampling rates or missing segments, which teaches noise instead of physiology.
Ignoring class imbalance, which pushes the model toward the most common severity group.
Comparing the CNN-LSTM only against a weak baseline, which makes the gain look bigger than it really is.
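The accuracy and class-imbalance pitfalls are easy to see with a toy example. Below, a "model" that always predicts the most common class reaches 50% accuracy while catching no severe cases at all. The class counts are invented for illustration; the class-weight dictionary is in the form Keras accepts through the `class_weight` argument of `fit`.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.utils.class_weight import compute_class_weight

labels = np.array([0, 1, 2, 3])  # normal, mild, moderate, severe
# Hypothetical imbalanced label set: severe is the rarest class.
y_true = np.array([0] * 50 + [1] * 30 + [2] * 15 + [3] * 5)

# A lazy baseline that always predicts the majority class.
y_pred = np.zeros_like(y_true)

accuracy = float((y_pred == y_true).mean())
per_class_recall = recall_score(y_true, y_pred, labels=labels, average=None)
print(accuracy, per_class_recall)  # severe recall is 0 despite 50% accuracy

# "Balanced" weights: n_samples / (n_classes * class_count).
weights = compute_class_weight(class_weight="balanced", classes=labels, y=y_true)
class_weight = {int(c): float(w) for c, w in zip(labels, weights)}
print(class_weight)
```

Reporting the per-class recall row next to accuracy makes this failure visible in your results table.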

What Makes This Competitive

A classroom-level project reports one accuracy score. A stronger project uses patient-wise splits, compares several severity thresholds, and reports recall for the hard cases. You can push it further by testing the model on both Apnea-ECG and MESA, or by checking calibration (whether a predicted 80% risk turns out to be right about 80% of the time), not just accuracy. That shows you understand the clinical cost of false alarms and misses.
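Checking calibration can be as simple as binning predicted risks and comparing each bin's average prediction with the observed rate. This NumPy sketch computes a reliability table, an expected calibration error, and a Brier score for a binary risk label. The function names, the five equal-width bins, and the toy predictions are assumptions for illustration.

```python
import numpy as np


def reliability_table(y_true, y_prob, n_bins=5):
    """Per bin: (mean predicted risk, observed positive rate, sample count)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(y_prob, edges[1:-1]), 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            rows.append((float(y_prob[mask].mean()), float(y_true[mask].mean()), int(mask.sum())))
    return rows


def expected_calibration_error(y_true, y_prob, n_bins=5):
    """Sample-weighted average gap between predicted and observed rates."""
    n = len(y_true)
    table = reliability_table(y_true, y_prob, n_bins)
    return sum(count / n * abs(pred - obs) for pred, obs, count in table)


def brier_score(y_true, y_prob):
    """Mean squared error between predicted risk and the 0/1 outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))


# Toy case: the model says 90% risk for everyone, but only half are positive.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_prob = [0.9] * 8
ece = expected_calibration_error(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(round(ece, 2), round(brier, 2))  # 0.4 0.41
```

A well-calibrated model would show small gaps in every row of the table, so the table itself is worth including as a figure or appendix.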

Project Variations

Use only the Apnea-ECG recordings that include an SpO2 channel and compare raw pulse-ox input with hand-built desaturation features.
Replace the CNN-LSTM with a 1D CNN or GRU and test whether sequence memory really helps.
Predict AHI as severity bands first, then try continuous regression and compare which setup is more stable.

Learn More

PhysioNet: Find the Apnea-ECG database, signal file documentation, and example records. MESA sleep data is requested separately through the National Sleep Research Resource (NSRR).
PubMed: Search review articles on sleep apnea screening, pulse oximetry, and AHI prediction.
NIH NHLBI: Read the sleep apnea overview and diagnosis pages for plain-language clinical context.
MedlinePlus: Find a simple explainer on obstructive sleep apnea and sleep studies.
MIT OpenCourseWare: Search machine learning and deep learning materials for CNN and LSTM basics.

Biomedical and Health Sciences Category Guide

How to Do Real Biomedical and Health Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →