PPG Sepsis Detection With Wearables

ISEF Category: Translational Medical Science

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Disease Detection and Diagnosis · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

A pulse sensor's signal can hold more than your heart rate. Hidden in that squiggly line may be early clues that your body is slipping into sepsis. If you can find those clues before doctors see them in routine vitals, you could point toward a cheaper, faster warning system. That makes this project feel less like a class exercise and more like a real medical signal hunt.

What Is It?

Photoplethysmography, or PPG, is the light-based signal used by many wearables to track your pulse. A sensor shines light into skin and measures tiny changes in blood volume. Think of it like watching ripples in a stream instead of measuring the water level with a ruler. The shape of that ripple can change when the body is under stress.

Sepsis is a dangerous whole-body response to infection. Doctors often rely on vital signs, lab tests, and clinical judgment to spot it. Your project asks a sharper question: can raw PPG, by itself, carry early warning patterns before clinical onset? Instead of hand-picked pulse features, a transformer model learns patterns directly from the waveform, which is the full signal shape over time.
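To make "learning from the raw waveform" concrete, here is a minimal sketch of how a long PPG trace could be cut into fixed-length windows before it reaches a model. The signal is synthetic, and the 125 Hz sampling rate, 10-second window, 5-second step, and the function name `segment_ppg` are all assumptions for illustration, not values taken from the dataset.

```python
import numpy as np

def segment_ppg(signal, fs, window_s, step_s):
    """Cut a 1-D PPG trace into overlapping, z-normalized windows.

    Hypothetical helper for illustration; window and step sizes are assumptions.
    """
    win = int(round(window_s * fs))
    step = int(round(step_s * fs))
    if win <= 0 or step <= 0:
        raise ValueError("window_s and step_s must be positive")
    starts = list(range(0, len(signal) - win + 1, step))
    if not starts:
        return np.empty((0, win))
    segments = np.stack([signal[s:s + win] for s in starts])
    # Normalize each window on its own so sensor gain differences do not dominate.
    mean = segments.mean(axis=1, keepdims=True)
    std = segments.std(axis=1, keepdims=True)
    return (segments - mean) / np.where(std == 0, 1.0, std)

# Synthetic stand-in for a PPG trace: a 1.2 Hz pulse, one harmonic, and noise.
fs = 125                          # assumed sampling rate in Hz
t = np.arange(60 * fs) / fs       # 60 seconds of samples
rng = np.random.default_rng(0)
ppg = (np.sin(2 * np.pi * 1.2 * t)
       + 0.3 * np.sin(2 * np.pi * 2.4 * t)
       + 0.05 * rng.standard_normal(t.size))

segments = segment_ppg(ppg, fs, window_s=10, step_s=5)
print(segments.shape)  # (11, 1250): 11 windows of 1250 samples each
```

Each row of `segments` is one candidate model input. Window length is one of the choices the research questions below ask you to vary.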

Why This Is a Good Topic

This is a strong science fair topic because it asks a clear yes-or-no question with real data and measurable outcomes. You can compare PPG-only models against multimodal models, which gives you a built-in way to test whether the wearable-style signal holds predictive value on its own or only alongside other clinical data. The project connects to a real hospital problem, earlier sepsis detection, and to a consumer problem, whether a smartwatch could someday help flag danger. You can learn signal processing, machine learning evaluation, class imbalance handling, and clinical data reasoning.

Research Questions

How does using raw PPG alone affect sepsis prediction performance 6 hours before onset?
What is the effect of adding vital signs to a PPG-only model on AUROC and AUPRC?
Does a transformer model outperform a simpler baseline model on raw PPG segments?
To what extent do different PPG segment lengths change early sepsis detection accuracy?
Which preprocessing choice, such as normalization or artifact filtering, changes model performance the most?
How does model performance change when you train on one patient group and test on a different patient group?
What is the effect of class balancing on the false positive rate for early sepsis detection?

Basic Materials

Laptop or desktop computer with at least 16 GB RAM.
Access to MIMIC-IV waveform data through the required credentialed platform.
Python installed through Anaconda or Miniconda.
Jupyter Notebook for data exploration and modeling.
External hard drive or cloud storage for large waveform files.
Basic spreadsheet software for tracking samples and labels.
Git for version control.

Advanced Materials

University or institutional server with a GPU.
Secure access to MIMIC-IV waveform files and linked clinical data.
Python with PyTorch or TensorFlow.
JupyterLab or VS Code for model development.
Docker for reproducible analysis environments.
GitHub or GitLab for private code management.
Statistical software such as R for performance comparisons and confidence intervals.

Software & Tools

Python: Processes waveform data, builds features, and trains the model.
PyTorch: Trains a transformer on raw PPG segments.
WFDB: Reads physiologic waveform files from the MIMIC database.
pandas: Organizes labels, timestamps, and patient-level metadata.
scikit-learn: Fits baseline models, splits data, and computes AUROC and AUPRC.
Matplotlib: Plots PPG signals, loss curves, and performance comparisons.
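As a rough idea of what "a transformer on raw PPG segments" might look like in PyTorch, the sketch below splits each segment into short patches, embeds them, and passes them through a small encoder. The class name `PPGTransformer`, the patch length, the layer sizes, and the `pos_weight` value are all hypothetical choices for illustration; this is one possible starting architecture, not a recommended or validated design.

```python
import torch
from torch import nn

class PPGTransformer(nn.Module):
    """Small transformer encoder over patches of a raw PPG segment (illustrative)."""

    def __init__(self, patch_len=25, d_model=64, n_heads=4, n_layers=2, max_patches=512):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(patch_len, d_model)            # one token per patch
        self.pos = nn.Parameter(torch.zeros(1, max_patches, d_model))  # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):
        # x has shape (batch, samples); trailing samples that do not fill a patch are dropped.
        batch, n_samples = x.shape
        n_patches = n_samples // self.patch_len
        x = x[:, : n_patches * self.patch_len].reshape(batch, n_patches, self.patch_len)
        h = self.embed(x) + self.pos[:, :n_patches]
        h = self.encoder(h)
        return self.head(h.mean(dim=1)).squeeze(-1)           # one logit per segment

torch.manual_seed(0)
model = PPGTransformer()
batch = torch.randn(8, 1250)                  # 8 fake segments of 1250 samples
labels = torch.randint(0, 2, (8,)).float()    # fake sepsis labels
logits = model(batch)

# pos_weight up-weights the rare positive class; 5.0 is an arbitrary placeholder.
loss = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(5.0))(logits, labels)
print(logits.shape, loss.item())
```

The same input tensor can be fed to a simpler baseline, which keeps the comparison between architectures fair.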

Experiment Steps

Define the exact prediction window, endpoint, and patient inclusion rules before you touch the model.
Map each PPG segment to a label that avoids leakage from later clinical information.
Choose a baseline, such as logistic regression or a simple neural net, so you can compare against the transformer.
Decide how you will split patients, so signals from the same person never appear in both training and testing.
Plan how you will compare PPG-only, multimodal, and ablation models with the same evaluation metrics.
Build a statistical analysis plan for confidence intervals, class imbalance, and subgroup checks before interpreting results.
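Two of the steps above, leakage-free labeling and patient-level splitting, can be sketched in a few lines. The labeling rule here is an assumption made for illustration: a segment is positive if it ends between 6 and 12 hours before onset, and any segment ending inside the final 6 hours or after onset is dropped. The column names, the function `label_segments`, and the toy data are hypothetical; your own protocol should define these rules explicitly.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

def label_segments(segments, onset_hours, horizon_h=6.0, span_h=6.0):
    """Label segments relative to sepsis onset (hypothetical rule, see lead-in).

    segments: DataFrame with 'patient_id' and 'segment_end_h' (hours since admission).
    onset_hours: dict mapping patient_id to onset hour; non-septic patients are absent.
    """
    out = segments.copy()
    onset = out["patient_id"].map(onset_hours)   # NaN for patients without sepsis
    lead = onset - out["segment_end_h"]          # hours from segment end to onset
    out["label"] = ((lead >= horizon_h) & (lead < horizon_h + span_h)).astype(int)
    # Drop segments inside the horizon or after onset so no later information leaks in.
    keep = onset.isna() | (lead >= horizon_h)
    return out[keep].reset_index(drop=True)

# Toy data: patient 1 develops sepsis at hour 20, the others never do.
segments = pd.DataFrame({
    "patient_id":    [1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4],
    "segment_end_h": [2.0, 10.0, 13.0, 16.0, 21.0, 4.0, 9.0, 1.0, 5.0, 3.0, 7.0],
})
labeled = label_segments(segments, onset_hours={1: 20.0})
print(labeled)

# Split by patient, not by segment: each patient lands in exactly one set.
splitter = GroupShuffleSplit(n_splits=1, test_size=1, random_state=0)
train_idx, test_idx = next(
    splitter.split(labeled, labeled["label"], groups=labeled["patient_id"])
)
train_ids = set(labeled.iloc[train_idx]["patient_id"])
test_ids = set(labeled.iloc[test_idx]["patient_id"])
print(train_ids, test_ids)
```

Note that under this rule, early segments from a patient who later becomes septic are labeled negative. Whether to keep, relabel, or exclude them is a design decision worth stating in your methods.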

Common Pitfalls

Mixing waveform segments from the same patient across train and test sets, which inflates performance.
Using labels tied to charted onset times that accidentally include data from after the prediction window.
Comparing PPG-only and multimodal models with different preprocessing, which makes the result unfair.
Ignoring motion artifacts and poor-signal segments, which can make the model learn noise instead of physiology.
Reporting only AUROC, which can hide weak performance when sepsis cases are rare.
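The last pitfall is easy to see with simulated scores. The sketch below uses made-up numbers (2% prevalence, normally distributed scores) rather than any real model output, and compares AUROC with AUPRC on the same predictions.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)
n_neg, n_pos = 9800, 200          # assumed 2% sepsis prevalence, for illustration only

# Simulated model scores: positives score higher on average, with heavy overlap.
scores = np.concatenate([rng.normal(0.0, 1.0, n_neg), rng.normal(1.5, 1.0, n_pos)])
y = np.concatenate([np.zeros(n_neg, dtype=int), np.ones(n_pos, dtype=int)])

auroc = roc_auc_score(y, scores)
auprc = average_precision_score(y, scores)
prevalence = y.mean()             # AUPRC of a random classifier is about this value

print(f"AUROC      = {auroc:.3f}")
print(f"AUPRC      = {auprc:.3f}")
print(f"prevalence = {prevalence:.3f}")
```

The AUROC looks respectable while the AUPRC stays far lower, because most high-scoring segments still belong to the much larger negative class. Reporting AUPRC next to the prevalence baseline makes that visible.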

What Makes This Competitive

A stronger version of this project does more than train one model. It tests whether raw wearable-style data still holds predictive value after careful patient-level splits, hard baselines, and ablation checks. You can raise the quality by using calibration, confidence intervals, and subgroup analysis instead of a single score. You can also ask whether the model fails in a specific way, such as on noisy signals or on certain patient groups, which makes the work feel much closer to real translational research.
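One way to replace a single score with a confidence interval is a percentile bootstrap. The sketch below resamples whole patients rather than individual segments, since segments from one person are not independent. The function name `bootstrap_auroc_ci`, the number of resamples, and the simulated data are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auroc_ci(y_true, y_score, groups, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC, resampling patients with replacement."""
    rng = np.random.default_rng(seed)
    y_true, y_score, groups = map(np.asarray, (y_true, y_score, groups))
    unique = np.unique(groups)
    idx_by_group = {g: np.flatnonzero(groups == g) for g in unique}
    stats = []
    for _ in range(n_boot):
        sampled = rng.choice(unique, size=unique.size, replace=True)
        idx = np.concatenate([idx_by_group[g] for g in sampled])
        if np.unique(y_true[idx]).size < 2:
            continue              # AUROC is undefined without both classes
        stats.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return roc_auc_score(y_true, y_score), lo, hi

# Simulated test set: 100 patients with 10 segments each, 20 of them septic.
rng = np.random.default_rng(1)
groups = np.repeat(np.arange(100), 10)
patient_label = (np.arange(100) < 20).astype(int)
y = patient_label[groups]
scores = rng.normal(loc=y * 1.0, scale=1.0)   # positives score higher on average

point, lo, hi = bootstrap_auroc_ci(y, scores, groups)
print(f"AUROC = {point:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The same function can be run separately on each subgroup to show where the interval widens or the estimate drops.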

Project Variations

Compare PPG-only models against ECG-plus-PPG models to test how much a second waveform adds beyond PPG alone.
Swap the transformer for a temporal convolutional network to see whether a simpler architecture matches performance.
Test whether the model still works when you predict 3 hours, 6 hours, or 12 hours before sepsis onset.

Learn More

MIMIC-IV documentation: Learn how the database is organized and find the waveform and clinical data specs through PhysioNet.
PhysioNet: Read dataset papers and access curated physiologic signals used in critical care research.
NIH PubMed: Search for review articles on sepsis prediction, PPG analysis, and wearable monitoring.
MIMIC Code Repository: Study open-source examples for handling MIMIC data and reproducing analyses from PhysioNet.
MIT OpenCourseWare: Use free courses in machine learning and signal processing to strengthen the modeling side.
NOAA and NASA data literacy resources: Practice careful data cleaning, time alignment, and uncertainty thinking on open datasets before working with clinical data.

Translational Medical Science Category Guide

How to Do Real Translational Medical Science Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →