Cough Sound Classifier for Respiratory Screening App
ISEF Category: Biomedical and Health Sciences
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Pathophysiology · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months
The Hook
A cough can carry more than sound. It can also carry patterns that point to airway irritation, infection, or narrowing. A mel-spectrogram turns that sound into an image, so a CNN can hunt for patterns the way your eyes spot shapes in a heat map. That makes this project a mix of biology, signal processing, and real-world screening.
What Is It?
The project asks whether a computer can tell apart coughs from people with pertussis, COVID-19, or asthma and coughs from healthy people, using only short sound clips. The model does not hear a cough the way you do. It turns each clip into a mel-spectrogram, a picture of sound energy across frequency and time, with frequency spaced on the mel scale to roughly match how people perceive pitch. A CNN, or convolutional neural network, scans that picture for repeating patterns.
That matters because different conditions can change how coughs start, break, and repeat. Think of each cough like a fingerprint made of timing, roughness, and pitch changes. A strong project does not claim diagnosis. It tests how well the model separates groups under careful controls and how stable the result stays when the data changes.
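To make the "picture of sound" idea concrete, here is a minimal sketch of the mel scale and of the image size a clip produces. It assumes the HTK-style mel formula (one common convention; audio libraries may default to a different one), a 16 kHz sample rate, 64 mel bands, and a hop of 512 samples. All of those values and function names are illustrative choices, not settings from this guide.

```python
import math

def hz_to_mel(freq_hz):
    """Convert Hz to mels with the HTK-style formula (one common convention)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_centers(n_mels, f_min, f_max):
    """Center frequencies in Hz of n_mels bands spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]

def spectrogram_shape(n_samples, hop_length, n_mels):
    """(rows, columns) of the image, assuming centered framing:
    one column per hop, plus one."""
    return n_mels, 1 + n_samples // hop_length

# Assumed example: a 1-second clip at 16 kHz with 64 mel bands.
centers = mel_band_centers(n_mels=64, f_min=0.0, f_max=8000.0)
shape = spectrogram_shape(n_samples=16000, hop_length=512, n_mels=64)
print(shape)  # (64, 32)
print(round(centers[1] - centers[0]), round(centers[-1] - centers[-2]))
```

The two printed gaps show that the bands sit close together at low frequencies and spread out at high frequencies, so the image gives more detail to the lower range.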
Why This Is a Good Topic
This is a strong science fair topic because you can test it with public data, clear metrics, and direct comparisons between models. It connects to a real health problem: cough-based screening could help flag people who may need a clinical test or follow-up. You will also learn audio feature extraction, model design, how to handle class imbalance, and how to judge a classifier without fooling yourself.
Research Questions
- How does mel-spectrogram window size affect class-balanced accuracy?
- What is the effect of speaker-level splits versus random clip splits on apparent accuracy?
- Does adding background-noise augmentation reduce false positives on real-world cough recordings?
- To what extent does a CNN outperform an MFCC-based baseline on the same cough dataset?
- Which class, pertussis, COVID, asthma, or healthy, is most often confused with the others?
- How does training on one dataset and testing on another change recall for each class?
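For the background-noise question above, one possible way to control the augmentation is to mix noise into each clip at a chosen signal-to-noise ratio (SNR). The sketch below uses plain Python lists and a synthetic tone so it runs anywhere. The function names, the 10 dB target, and the synthetic signals are assumptions for illustration; a real run would load cough clips and recorded noise instead.

```python
import math
import random

def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def mix_at_snr(clean, noise, snr_db):
    """Scale the noise so the clean-to-noise RMS ratio equals snr_db, then add it."""
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean clip")
    noise = noise[:len(clean)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return list(clean)  # silent noise: nothing to add
    target_noise_rms = rms(clean) / (10.0 ** (snr_db / 20.0))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(clean, noise)]

# Synthetic stand-ins: a 440 Hz tone for the clip, Gaussian samples for the noise.
rng = random.Random(0)
clean = [math.sin(2.0 * math.pi * 440.0 * t / 16000.0) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(clean, noise, snr_db=10.0)

# Check the achieved SNR by recovering the added noise.
added = [m - c for m, c in zip(noisy, clean)]
achieved_snr_db = 20.0 * math.log10(rms(clean) / rms(added))
print(round(achieved_snr_db, 3))
```

Fixing the SNR lets you report false positives at several named noise levels instead of at one unspecified "noisy" condition.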
Basic Materials
- Laptop or desktop computer with at least 8 GB of RAM.
- Free Google Colab account or another notebook environment.
- Access to a public cough audio dataset, such as COUGHVID or Coswara. Check which condition labels each dataset actually includes before you commit to your classes.
- Headphones for checking individual clips by ear.
- Python 3 and an internet connection.
- Spreadsheet software for logging splits, metrics, and errors.
- External microphone or smartphone, if you plan a small pilot test with local recordings.
Advanced Materials
- GPU workstation or university server for training CNNs.
- High-quality microphone and quiet recording space for pilot data collection.
- Secure data storage with controlled access for any human recordings.
- Annotation software for labeling cough events and background noise.
- Python environment with CUDA support for faster training.
- Consent forms and review materials if you collect new human audio.
Software & Tools
- Python: Cleans audio files, runs experiments, and stores metrics.
- Librosa: Converts cough recordings into mel-spectrograms and other audio features.
- TensorFlow/Keras: Trains the CNN and tracks validation performance.
- TensorFlow.js: Runs the trained model in the browser for the progressive web app (PWA).
- Google Colab: Offers notebook-based GPU access on its free tier, subject to usage limits, for model prototyping.
Experiment Steps
- Define the exact labels, sample unit, and screening claim you want the model to make.
- Choose one split rule, such as speaker-level separation, and keep it fixed for every comparison.
- Build a single feature pipeline, then decide which baseline representation you will test against.
- Pick metrics that reflect screening quality, including recall for each class, confusion matrix patterns, and balanced accuracy.
- Plan one ablation at a time, then check how each design choice changes performance before you add the browser app.
- Move the trained model into the PWA only after notebook results are stable, then compare browser predictions with saved test clips.
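The speaker-level split in the steps above can be sketched as follows. This is a minimal illustration, assuming each clip is stored as a (clip_id, speaker_id, label) tuple. The tuple layout, the function name, and the toy data are hypothetical, and the sketch does not balance labels across the two sets.

```python
import random
from collections import defaultdict

def speaker_level_split(clips, test_fraction=0.2, seed=0):
    """Split clips so that no speaker appears in both train and test.

    clips: list of (clip_id, speaker_id, label) tuples.
    """
    by_speaker = defaultdict(list)
    for clip in clips:
        by_speaker[clip[1]].append(clip)

    speakers = sorted(by_speaker)           # sort first so the seed is reproducible
    random.Random(seed).shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))

    test = [c for s in speakers[:n_test] for c in by_speaker[s]]
    train = [c for s in speakers[n_test:] for c in by_speaker[s]]
    return train, test

# Toy data: 10 speakers with 3 clips each (labels are placeholders).
clips = [(f"clip_{s}_{i}", f"spk_{s}", "healthy") for s in range(10) for i in range(3)]
train, test = speaker_level_split(clips, test_fraction=0.2, seed=0)

train_speakers = {c[1] for c in train}
test_speakers = {c[1] for c in test}
print(len(train), len(test), train_speakers & test_speakers)  # 24 6 set()
```

Keeping the seed fixed gives the same split for every comparison. If you use scikit-learn, its GroupShuffleSplit class offers a comparable group-aware split.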
Common Pitfalls
- Letting clips from the same speaker land in both train and test sets, which inflates the score because the model can recognize a speaker's voice or recording setup instead of learning the cough.
- Reporting only overall accuracy, which hides weak recall for pertussis or asthma.
- Training on clips with different loudness and noise levels without control, which makes the model chase recording conditions.
- Changing the spectrogram settings between runs, which turns model comparison into an apples-to-oranges test.
- Calling the PWA a diagnostic tool, which overstates what a student-built classifier can safely claim.
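The second pitfall above is easy to see with a small example. The sketch below uses made-up labels (90 healthy clips and 10 pertussis clips) and a hypothetical model that predicts "healthy" every time. The numbers are invented purely to show how the metrics behave.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of clips labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its clips the model got right."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

# Made-up imbalanced test set and a model that always answers "healthy".
y_true = ["healthy"] * 90 + ["pertussis"] * 10
y_pred = ["healthy"] * 100

print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
print(per_class_recall(y_true, y_pred))   # {'healthy': 1.0, 'pertussis': 0.0}
```

An overall accuracy of 0.9 looks strong, yet the model never finds a single pertussis cough. That gap is why per-class recall and balanced accuracy belong in your results table.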
What Makes This Competitive
This project becomes stronger when you compare more than one way of splitting the data, such as a random clip-level split against a speaker-level split, instead of reporting a single random train-test split. Add a baseline model, an ablation study, and a cross-dataset test so you can show which parts of the design actually matter. If you also examine which cough types get confused and why, your project moves from a demo to a real measurement study.
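One simple way to examine which cough types get confused is to count the wrong (true label, predicted label) pairs and list the most frequent ones. The sketch below uses invented labels and predictions for illustration only; the function name is hypothetical.

```python
from collections import Counter

def top_confusions(y_true, y_pred, k=3):
    """Return the k most frequent (true, predicted) pairs among the errors."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common(k)

# Invented example: 4 clips per class.
y_true = ["asthma"] * 4 + ["covid"] * 4 + ["pertussis"] * 4 + ["healthy"] * 4
y_pred = (
    ["asthma", "asthma", "covid", "covid"]                # 2 asthma clips called covid
    + ["covid", "covid", "covid", "asthma"]               # 1 covid clip called asthma
    + ["pertussis"] * 4                                   # all correct
    + ["healthy", "healthy", "healthy", "covid"]          # 1 healthy clip called covid
)

print(top_confusions(y_true, y_pred, k=1))  # [(('asthma', 'covid'), 2)]
```

Once you know the top pair, listen to those specific clips and look at their spectrograms. That turns a number into an explanation you can present to judges.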
Project Variations
- Train the same model on pediatric coughs only and compare how well it separates asthma coughs from healthy coughs.
- Swap the CNN for an MFCC plus classical machine learning baseline, then compare which features carry the most signal.
- Test whether browser-side inference keeps the same ranking of errors as notebook inference when audio quality drops.
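For the browser-versus-notebook variation, a first check is whether the two environments give the same label for the same saved clips. The sketch below assumes you have exported both sets of predictions as dictionaries from clip id to label. That export format, the clip ids, and the function name are hypothetical.

```python
def compare_predictions(notebook_preds, browser_preds):
    """Return the agreement rate and the clip ids where the two label sets differ.

    Both arguments map clip_id -> predicted label; only shared clip ids are compared.
    """
    shared = sorted(set(notebook_preds) & set(browser_preds))
    if not shared:
        raise ValueError("no clip ids in common")
    disagreements = [c for c in shared if notebook_preds[c] != browser_preds[c]]
    return 1.0 - len(disagreements) / len(shared), disagreements

# Invented predictions for four saved test clips.
notebook = {"clip_01": "asthma", "clip_02": "healthy", "clip_03": "covid", "clip_04": "pertussis"}
browser = {"clip_01": "asthma", "clip_02": "healthy", "clip_03": "asthma", "clip_04": "pertussis"}

rate, differing = compare_predictions(notebook, browser)
print(rate, differing)  # 0.75 ['clip_03']
```

Clips that disagree are worth inspecting first, since differences in audio decoding or resampling between Python and the browser are a likely cause.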
Learn More
- CDC Pertussis pages: Read about symptoms, diagnosis, and surveillance background at the CDC website.
- CDC COVID-19 pages: Find respiratory symptom and testing context at the CDC website.
- NHLBI Asthma information: Learn the airway changes behind asthma at the NIH/NHLBI website.
- PubMed: Search review articles on cough acoustics, respiratory sound analysis, and machine learning.
- NCBI Bookshelf: Look for free background chapters on the respiratory system and airway disease.
Biomedical and Health Sciences Category Guide
How to Do Real Biomedical and Health Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
