Cough Sound Classifier for Respiratory Screening App

ISEF Category: Biomedical and Health Sciences

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Pathophysiology · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months

The Hook

A cough can carry more than sound. It can also carry patterns that point to airway irritation, infection, or narrowing. A mel-spectrogram turns that sound into an image, so a CNN can hunt for patterns the way your eyes spot shapes in a heat map. That makes this project a mix of biology, signal processing, and real-world screening.

What Is It?

Your idea is to ask whether a computer can tell apart coughs from people with pertussis, COVID, or asthma and coughs from healthy people by listening to sound clips. The model does not hear a cough the way you do. It turns each clip into a mel-spectrogram, which is a picture of sound energy across frequency and time, with the frequency axis spaced on the mel scale to roughly match how people perceive pitch. A CNN, or convolutional neural network, scans that picture for repeating patterns.

That matters because different conditions can change how coughs start, break, and repeat. Think of each cough like a fingerprint made of timing, roughness, and pitch changes. A strong project does not claim diagnosis. It tests how well the model separates groups under careful controls and how stable the result stays when the data changes.
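To see why the mel scale gives a "stretched" frequency axis, here is a minimal sketch that converts between Hz and mels and lists band centers spaced evenly in mels. It assumes the common HTK-style formula; audio libraries may use a slightly different variant, and the band count and frequency range below are illustrative choices, not values from this guide.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Invert hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels, f_min, f_max):
    """Return n_mels center frequencies in Hz, evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


# Illustrative settings: 8 bands between 0 Hz and 8 kHz.
centers = mel_band_centers(n_mels=8, f_min=0.0, f_max=8000.0)
gaps = [b - a for a, b in zip(centers, centers[1:])]

for c in centers:
    print(f"{c:8.1f} Hz")
```

The gaps between neighboring centers grow as frequency rises, so low frequencies get finer detail than high ones. That is the main difference between a mel-spectrogram and a plain spectrogram.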

Why This Is a Good Topic

This is a strong science fair topic because you can test it with public data, clear metrics, and real comparisons. It connects to a real health problem, since cough-based screening can help flag people who may need more attention. You can also learn a lot from it, including audio feature extraction, model design, class imbalance, and how to judge a classifier without fooling yourself.

Research Questions

How does mel-spectrogram window size affect class-balanced accuracy?
What is the effect of speaker-level splits versus random clip splits on apparent accuracy?
Does adding background-noise augmentation reduce false positives on real-world cough recordings?
Does a CNN outperform a baseline built on MFCCs (mel-frequency cepstral coefficients) on the same cough dataset, and by how much?
Which class (pertussis, COVID, asthma, or healthy) is most often confused with the others?
How does training on one dataset and testing on another change recall for each class?
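For the window-size question, it helps to work out what each setting buys you before training anything. The sketch below is plain arithmetic under stated assumptions: a 16 kHz sample rate, a hop of one quarter of the window, a 1-second clip, and centered framing (one frame per hop plus one). None of these values come from a specific dataset, so swap in your own.

```python
def stft_resolution(sample_rate, n_fft, hop_length, clip_seconds):
    """Summarize the time/frequency trade-off for one window setting."""
    n_samples = int(round(clip_seconds * sample_rate))
    return {
        # Longer windows blur events in time.
        "window_ms": 1000.0 * n_fft / sample_rate,
        # Longer windows give narrower frequency bins.
        "bin_width_hz": sample_rate / n_fft,
        # Frame count under centered, padded framing (an assumption).
        "n_frames": 1 + n_samples // hop_length,
    }


SAMPLE_RATE = 16000  # assumed; check your dataset's actual rate

rows = {
    n_fft: stft_resolution(SAMPLE_RATE, n_fft, n_fft // 4, clip_seconds=1.0)
    for n_fft in (256, 512, 1024, 2048)
}

for n_fft, r in rows.items():
    print(
        f"n_fft={n_fft:5d}  window={r['window_ms']:6.1f} ms  "
        f"bin={r['bin_width_hz']:7.3f} Hz  frames={r['n_frames']}"
    )
```

A short window tracks the sharp onset of a cough better, while a long window resolves pitch detail better. Logging these numbers next to each accuracy result makes the comparison easier to explain.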

Basic Materials

Laptop or desktop computer with at least 8 GB of RAM.
Free Google Colab account or another notebook environment.
Public cough audio dataset access, such as COUGHVID.
Headphones for checking individual clips by ear.
Python 3 and an internet connection.
Spreadsheet software for logging splits, metrics, and errors.
External microphone or smartphone, if you plan a small pilot test with local recordings.

Advanced Materials

GPU workstation or university server for training CNNs.
High-quality microphone and quiet recording space for pilot data collection.
Secure data storage with controlled access for any human recordings.
Annotation software for labeling cough events and background noise.
Python environment with CUDA support for faster training.
Consent forms and review materials if you collect new human audio.

Software & Tools

Python: Cleans audio files, runs experiments, and stores metrics.
Librosa: Converts cough recordings into mel-spectrograms and other audio features.
TensorFlow/Keras: Trains the CNN and tracks validation performance.
TensorFlow.js: Runs the trained model in the browser for the progressive web app (PWA).
Google Colab: Offers notebook-based GPU access on its free tier, subject to availability limits, for model prototyping.

Experiment Steps

Define the exact labels, the unit of analysis (a single cough, a whole clip, or a speaker), and the screening claim you want the model to make.
Choose one split rule, such as speaker-level separation, and keep it fixed for every comparison.
Build a single feature pipeline, then decide which baseline representation you will test against.
Pick metrics that reflect screening quality, including recall for each class, confusion matrix patterns, and balanced accuracy.
Plan one ablation at a time, then check how each design choice changes performance before you add the browser app.
Move the trained model into the PWA only after notebook results are stable, then compare browser predictions with saved test clips.
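The speaker-level split rule from the steps above can be sketched in a few lines. This is a minimal illustration, assuming each clip is a tuple of (clip_id, speaker_id, label); the clip records and the `speaker_level_split` helper are hypothetical, and the function does not balance classes between the two sets.

```python
import random
from collections import defaultdict


def speaker_level_split(clips, test_fraction=0.2, seed=0):
    """Split clips so that no speaker appears in both train and test."""
    by_speaker = defaultdict(list)
    for clip in clips:
        by_speaker[clip[1]].append(clip)

    # Shuffle speakers, not clips, with a fixed seed for repeatability.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    n_test = max(1, round(len(speakers) * test_fraction))
    test = [c for s in speakers[:n_test] for c in by_speaker[s]]
    train = [c for s in speakers[n_test:] for c in by_speaker[s]]
    return train, test


# Hypothetical records for illustration only.
clips = [
    ("c01", "spk_a", "healthy"), ("c02", "spk_a", "healthy"),
    ("c03", "spk_b", "asthma"), ("c04", "spk_b", "asthma"),
    ("c05", "spk_c", "covid"), ("c06", "spk_d", "pertussis"),
    ("c07", "spk_e", "healthy"), ("c08", "spk_e", "healthy"),
]

train, test = speaker_level_split(clips, test_fraction=0.2, seed=0)
train_speakers = {c[1] for c in train}
test_speakers = {c[1] for c in test}
print("train speakers:", sorted(train_speakers))
print("test speakers:", sorted(test_speakers))
```

Keeping the seed fixed means every comparison uses the same split. If you use scikit-learn, its `GroupShuffleSplit` class serves a similar purpose, though you should confirm its behavior against your own data before relying on it.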

Common Pitfalls

Letting clips from the same speaker land in both train and test sets, which inflates the score with speaker memory instead of cough learning.
Reporting only overall accuracy, which hides weak recall for pertussis or asthma.
Training on clips with different loudness and noise levels without control, which makes the model chase recording conditions.
Changing the spectrogram settings between runs, which turns model comparison into an apples-to-oranges test.
Calling the PWA a diagnostic tool, which overstates what a student-built classifier can safely claim.
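The overall-accuracy pitfall is easy to demonstrate with numbers. The sketch below uses made-up predictions on an imbalanced test set, not real results, and compares overall accuracy with per-class recall and balanced accuracy (the mean of the per-class recalls). It assumes every class appears at least once in the test set.

```python
LABELS = ["pertussis", "covid", "asthma", "healthy"]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix, labels):
    """Recall per class; classes with no test clips are skipped."""
    return {
        label: matrix[i][i] / sum(matrix[i])
        for i, label in enumerate(labels)
        if sum(matrix[i]) > 0
    }


def overall_accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


def balanced_accuracy(recalls):
    return sum(recalls.values()) / len(recalls)


# Made-up test set: 90 healthy, 6 covid, 2 asthma, 2 pertussis.
y_true = ["healthy"] * 90 + ["covid"] * 6 + ["asthma"] * 2 + ["pertussis"] * 2
y_pred = (
    ["healthy"] * 88 + ["covid"] * 2      # healthy clips
    + ["covid"] * 4 + ["healthy"] * 2     # covid clips
    + ["healthy"] * 2                     # asthma clips, all missed
    + ["pertussis"] + ["healthy"]         # pertussis clips
)

matrix = confusion_matrix(y_true, y_pred, LABELS)
recalls = per_class_recall(matrix, LABELS)
acc = overall_accuracy(matrix)
bal = balanced_accuracy(recalls)

print(f"overall accuracy:  {acc:.3f}")
print(f"balanced accuracy: {bal:.3f}")
for label, r in recalls.items():
    print(f"  recall[{label}] = {r:.3f}")
```

In this invented example the overall accuracy is 0.93 while the model misses every asthma clip, which is exactly the kind of gap a confusion matrix exposes.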

What Makes This Competitive

This project becomes stronger when you test more than one dataset split, not just a random train-test split. Add a baseline model, an ablation study, and a cross-dataset test so you can show what parts of the design actually matter. If you also examine which cough types get confused and why, your project moves from a demo to a real measurement study.

Project Variations

Train the same model on pediatric coughs only and compare how well it separates asthma coughs from healthy coughs.
Swap the CNN for an MFCC plus classical machine learning baseline, then compare which features carry the most signal.
Test whether browser-side inference keeps the same ranking of errors as notebook inference when audio quality drops.
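For the browser-versus-notebook variation, one simple starting point is to save class probabilities from both environments for the same clips and measure how often they agree. The sketch below is illustrative only: the clip IDs, the probability values, and the `compare_runs` helper are hypothetical, and it assumes both runs list classes in the same order.

```python
def top_class(probs):
    """Index of the highest-probability class."""
    return max(range(len(probs)), key=probs.__getitem__)


def compare_runs(notebook_probs, browser_probs):
    """Return (top-1 agreement rate, largest per-class probability gap)."""
    agree = 0
    worst_gap = 0.0
    for clip_id, nb in notebook_probs.items():
        br = browser_probs[clip_id]
        if top_class(nb) == top_class(br):
            agree += 1
        worst_gap = max(worst_gap, max(abs(a - b) for a, b in zip(nb, br)))
    return agree / len(notebook_probs), worst_gap


# Hypothetical outputs; class order: pertussis, covid, asthma, healthy.
notebook_probs = {
    "clip_a": [0.10, 0.20, 0.10, 0.60],
    "clip_b": [0.40, 0.35, 0.15, 0.10],
    "clip_c": [0.05, 0.05, 0.80, 0.10],
}
browser_probs = {
    "clip_a": [0.10, 0.25, 0.10, 0.55],
    "clip_b": [0.34, 0.40, 0.16, 0.10],
    "clip_c": [0.05, 0.05, 0.80, 0.10],
}

agreement, worst_gap = compare_runs(notebook_probs, browser_probs)
print(f"top-1 agreement: {agreement:.2f}")
print(f"largest probability gap: {worst_gap:.2f}")
```

Small probability gaps with a changed top class, as in the second clip here, usually point to differences in audio preprocessing rather than in the model weights, so check the feature pipeline first.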

Learn More

CDC Pertussis pages: Read about symptoms, diagnosis, and surveillance background at the CDC website.
CDC COVID-19 pages: Find respiratory symptom and testing context at the CDC website.
NHLBI Asthma information: Learn the airway changes behind asthma at the NIH/NHLBI website.
PubMed: Search review articles on cough acoustics, respiratory sound analysis, and machine learning.
NCBI Bookshelf: Look for free background chapters on the respiratory system and airway disease.

Biomedical and Health Sciences Category Guide

How to Do Real Biomedical and Health Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

