ESP32 Cough Classifier for Sound-Based Health Signals

ISEF Category: Embedded Systems

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Signal Processing  ·  Difficulty: Advanced  ·  Setup: School Lab  ·  Time: 1 to 2 Months

The Hook

A cough can carry clues about health, but a device does not need to send that sound to the cloud to interpret it. You can train a tiny model to sort cough types, run it on an ESP32-S3, and keep the audio on the device. That makes this project both a signal processing challenge and a privacy challenge.

What Is It?

This project asks you to teach a small computer to listen to cough sounds and sort them into groups like dry, productive, whooping, and croup. The trick is not just hearing the sound. You first convert the audio into log-Mel features: a grid of energy values over time and frequency band, which turns the sound into a picture-like pattern the model can read. Then a 1D-CNN, a convolutional neural network that slides small filters along the time axis to learn local patterns, makes the prediction.

Think of it like sorting birds by their calls. You do not need to know the whole song at once. You listen for short sound patterns, then combine them into a guess. Your goal is to make that process small enough and fast enough to run on an ESP32-S3, a low-power microcontroller, without sending the recording to a server.
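To make the log-Mel idea concrete, here is a minimal, dependency-free sketch. It is not the Librosa implementation you would normally use on your laptop; the frame size, hop, number of Mel bands, and the HTK-style Mel formula are all assumptions chosen for illustration, and the naive DFT is far too slow for real recordings.

```python
import cmath
import math

def hz_to_mel(hz):
    # HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge points, expressed as fractional FFT-bin positions.
    edges = [mel_to_hz(top * i / (n_mels + 1)) * n_fft / sample_rate
             for i in range(n_mels + 2)]
    bank = []
    for m in range(1, n_mels + 1):
        left, center, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            rising = (k - left) / (center - left)
            falling = (right - k) / (right - center)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank

def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame using a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def log_mel(signal, sample_rate=16000, n_fft=256, hop=128, n_mels=20):
    """Return a list of frames, each a list of n_mels log energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small constant avoids log(0) on silent frames.
        frames.append([math.log(e + 1e-10) for e in energies])
    return frames

# Synthetic stand-in for a recording: a 1 kHz tone sampled at 16 kHz.
tone = [math.sin(2 * math.pi * 1000 * i / 16000) for i in range(512)]
features = log_mel(tone)
print(len(features), "frames x", len(features[0]), "mel bands")
```

Each row of `features` is one time step, and each column is one Mel band. A 1D-CNN would treat the bands as input channels and convolve across the rows.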

Why This Is a Good Topic

This is a strong science fair topic because you can test real model choices, measure accuracy, and compare privacy-friendly on-device inference against cloud-based approaches. It connects to telehealth, public health, and edge AI, so the problem feels real. You can learn signal processing, feature extraction, model evaluation, and embedded deployment, all without needing a medical lab.

Research Questions

  • How does the choice of cough feature, such as log-Mel spectrograms versus MFCCs, affect classification accuracy on cough types?
  • What is the effect of using a smaller 1D-CNN versus a deeper 1D-CNN on ESP32-S3 inference speed and accuracy?
  • Does training on one public dataset and testing on another reduce cough-classifier performance, and by how much?
  • To what extent does adding background-noise augmentation improve classification on real-world cough recordings?
  • Which cough classes are most often confused, and does class imbalance explain those errors?
  • How does on-device inference compare with cloud inference in latency and privacy tradeoffs for the same model?
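For the background-noise question above, one common approach is to mix a noise clip into each cough clip at a chosen signal-to-noise ratio (SNR). The sketch below assumes both clips are plain lists of float samples at the same sample rate; the function names and the 10 dB example value are illustrative, not taken from any dataset or library.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a clip."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def mix_at_snr(cough, noise, snr_db):
    """Scale `noise` so the cough-to-noise ratio equals snr_db, then add it."""
    if len(noise) < len(cough):
        raise ValueError("noise clip must be at least as long as the cough clip")
    noise = noise[:len(cough)]
    noise_rms = rms(noise)
    if noise_rms == 0:
        return list(cough)  # silent noise clip: nothing to add
    # SNR in dB is 20 * log10(signal_rms / noise_rms), solved for noise_rms.
    target_noise_rms = rms(cough) / (10 ** (snr_db / 20.0))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(cough, noise)]

# Synthetic stand-ins for real clips (assumption: 16 kHz sample rate).
rng = random.Random(0)
cough = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]
augmented = mix_at_snr(cough, noise, snr_db=10.0)
```

Sweeping `snr_db` over several values gives you an independent variable you can plot against accuracy.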

Basic Materials

  • ESP32-S3 development board with microphone support or an external I2S microphone.
  • USB cable and laptop for coding and model deployment.
  • Python installed on a computer.
  • Free audio analysis software or notebook environment, such as Jupyter.
  • Headphones for checking recordings by ear.
  • Public cough datasets from CoughVid and Coswara.
  • Digital notebook or spreadsheet for tracking model runs.
  • Basic graphing tool for accuracy, confusion matrices, and timing.

Advanced Materials

  • ESP32-S3 development board with PSRAM, if available.
  • I2S microphone module with known sample rate.
  • MicroSD storage or serial logging setup for testing audio capture.
  • Reference computer with Python, NumPy, SciPy, Librosa, TensorFlow or PyTorch, and an ESP32 deployment toolchain.
  • Audio inspection tools such as Audacity, plus spectrogram plots if you need visual checks.
  • Access to a stable test microphone and quiet recording space for controlled validation.
  • Optional oscilloscope or logic analyzer for debugging embedded audio timing.

Software & Tools

  • Python: Prepares audio features, trains models, and runs evaluation scripts.
  • Jupyter Notebook: Helps you compare feature sets, plots, and confusion matrices in one place.
  • Librosa: Extracts log-Mel spectrograms and other audio features from cough recordings.
  • TensorFlow or PyTorch: Trains the 1D-CNN before you export or simplify it for embedded use.
  • ESP-IDF: Builds and flashes firmware for the ESP32-S3 and tests on-device inference.

Experiment Steps

  1. Define the exact cough labels you will predict and decide whether you will treat the task as four-class classification or a simpler binary baseline.
  2. Compare public datasets and check which cough-type labels each one actually provides, along with label quality, recording conditions, and class balance, before you train anything.
  3. Choose one feature pipeline first, then build a second pipeline only if you need a direct comparison for your research question.
  4. Train a small baseline model, measure accuracy, and record which classes it confuses most often.
  5. Plan a deployment version that fits the ESP32-S3 memory and speed limits, then test whether accuracy changes after compression or simplification.
  6. Design an evaluation plan that includes cross-dataset testing, timing, and privacy claims, so your results reflect real use, not just a held-out split.
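Step 4 asks which classes the model confuses most often. A minimal sketch of that bookkeeping is shown here, using only the standard library. The label names come from this guide; the true and predicted labels are made-up placeholders, not real results.

```python
from collections import Counter

LABELS = ["dry", "productive", "whooping", "croup"]

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else float("nan"))
    return recalls

def most_confused_pair(matrix, labels):
    """Largest off-diagonal cell as (count, true_label, predicted_label)."""
    best = None
    for i, row in enumerate(matrix):
        for j, n in enumerate(row):
            if i != j and (best is None or n > best[0]):
                best = (n, labels[i], labels[j])
    return best

# Placeholder predictions for illustration only.
y_true = ["dry"] * 4 + ["productive"] * 3 + ["whooping"] * 2 + ["croup"]
y_pred = ["dry", "dry", "dry", "productive",
          "productive", "dry", "dry",
          "whooping", "croup",
          "croup"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
for label, row in zip(LABELS, matrix):
    print(f"{label:>10}", row)
print("recall per class:", per_class_recall(matrix))
print("most confused:", most_confused_pair(matrix, LABELS))
```

In this toy data the overall accuracy is 60 percent, but the per-class recall shows that "productive" coughs are mostly mistaken for "dry" ones, which a single accuracy number would hide.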

Common Pitfalls

  • Mixing recordings from the same speaker across training and test sets, which makes the model look better than it really is.
  • Ignoring class imbalance, which can make the model favor the most common cough type.
  • Treating noisy public datasets as if they came from one recording setup, which can hide dataset shift problems.
  • Using a model that fits on a laptop but fails on the ESP32-S3 because memory use is too high.
  • Reporting only accuracy and skipping confusion matrices, which hides which cough types the classifier actually misses.
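The first two pitfalls can be guarded against with a few lines of code. This sketch assumes each recording is stored as a hypothetical `(speaker_id, label, path)` tuple; your datasets may organize metadata differently. It sends whole speakers to one side of the split and computes inverse-frequency class weights, one simple way to counter imbalance.

```python
import random
from collections import Counter, defaultdict

def split_by_speaker(records, test_fraction=0.25, seed=0):
    """Split so that no speaker appears in both train and test."""
    by_speaker = defaultdict(list)
    for rec in records:
        by_speaker[rec[0]].append(rec)
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test = [r for s in speakers[:n_test] for r in by_speaker[s]]
    train = [r for s in speakers[n_test:] for r in by_speaker[s]]
    return train, test

def class_weights(labels):
    """Inverse-frequency weights: rare classes get larger weights."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}

# Hypothetical metadata: 8 speakers with 3 clips each.
records = [(f"spk{s}", "dry" if s < 6 else "croup", f"clip_{s}_{i}.wav")
           for s in range(8) for i in range(3)]

train, test = split_by_speaker(records)
train_speakers = {r[0] for r in train}
test_speakers = {r[0] for r in test}
print("shared speakers:", train_speakers & test_speakers)
print("weights:", class_weights([r[1] for r in train]))
```

If your dataset has no speaker IDs, note that limitation in your report, because you cannot rule out leakage between splits.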

What Makes This Competitive

A stronger project would compare more than one feature set, more than one model size, and more than one test condition. You could measure cross-dataset performance, then explain why some cough types break the model. You could also report latency, memory use, and energy cost on the ESP32-S3, not just accuracy. That turns your project from a simple classifier demo into a careful embedded-systems study.
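Before measuring memory on the board, you can estimate it on paper. The sketch below counts parameters for a hypothetical two-layer 1D-CNN; the input size, layer widths, and kernel sizes are assumptions for illustration, not a recommended design. The activation estimate is deliberately rough, since real runtimes manage buffers in their own way.

```python
def conv1d_params(in_channels, out_channels, kernel):
    """Weights plus one bias per output channel."""
    return (kernel * in_channels + 1) * out_channels

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return (n_in + 1) * n_out

def valid_out_len(n_in, kernel, stride=1):
    """Output length of a convolution with no padding."""
    return (n_in - kernel) // stride + 1

# Hypothetical model: 49 frames x 40 mel bands in, 4 cough classes out.
frames, mels, classes = 49, 40, 4
len1 = valid_out_len(frames, 3)   # Conv1D 40 -> 16 channels, kernel 3
pooled = len1 // 2                # max pooling by 2
len2 = valid_out_len(pooled, 3)   # Conv1D 16 -> 32 channels, kernel 3
                                  # then global average pooling -> 32 values

params = (conv1d_params(mels, 16, 3)
          + conv1d_params(16, 32, 3)
          + dense_params(32, classes))

# Tensor sizes in element counts, in layer order.
activations = [frames * mels, len1 * 16, pooled * 16, len2 * 32, 32, classes]
# Rough peak: each layer needs its input and output alive at the same time.
peak_activations = max(a + b for a, b in zip(activations, activations[1:]))

print("parameters:", params)
print("weights as float32:", params * 4, "bytes")
print("weights as int8:   ", params * 1, "bytes")
print("rough peak activations:", peak_activations, "values")
```

Comparing these estimates with the memory you actually measure on the ESP32-S3 makes a useful table for your results section.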

Project Variations

  • Use binary labels, such as cough versus no cough, to build a simpler baseline before moving to four-class classification.
  • Swap in MFCCs, chroma, or raw waveform input to compare feature engineering against end-to-end learning.
  • Test whether a quantized model keeps enough accuracy on the ESP32-S3 while reducing memory and inference time.
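To see what quantization does to a model's weights, here is a minimal sketch of symmetric per-tensor int8 quantization. This is one simple scheme chosen as an assumption; real deployment toolchains may use per-channel scales, zero points, or other variants, so treat it as a way to build intuition rather than a description of any specific tool.

```python
import random

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]

# Made-up weights standing in for one trained layer.
rng = random.Random(0)
weights = [rng.uniform(-0.5, 0.5) for _ in range(1000)]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))

print("scale:", scale)
print("worst round-trip error:", worst_error)
print("storage: 4 bytes per weight as float32, 1 byte per weight as int8")
```

The round-trip error is bounded by half the scale, so layers with a few very large weights lose more precision. Whether that error changes your classifier's predictions is exactly what this variation would measure.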

Learn More

  • CoughVid dataset paper: Search PubMed or the journal site for the original CoughVid publication and dataset description.
  • Coswara dataset paper: Search PubMed for the Coswara respiratory sound dataset and read the methods section.
  • MIT OpenCourseWare: Search for audio signal processing lectures to review spectrograms, filtering, and classification basics.
  • NASA open data and tutorials: Search NASA resources for machine learning and signal processing examples if you want an edge-AI mindset.
  • NIH PubMed: Search review articles on cough sound analysis, respiratory sound classification, and digital biomarkers.
  • ESP-IDF documentation: Read the official ESP32-S3 development docs and examples for deployment details.
