Privacy-Preserving Campus Sound Detection

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Internet of Things · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

A microphone can hear a lot more than you think. On a school campus, that creates a privacy problem fast. What if you could detect sounds that signal danger without sending raw audio anywhere? That is the core challenge in this project.

What Is It?

This project studies a network of small microphones and edge devices that listen for specific events, like a gunshot, breaking glass, or a scream. The key idea is simple. Each device processes audio locally, trains on its own data, and shares only model updates, never raw recordings. That approach is called federated learning: training happens across many devices without collecting the original files in one place.

Think of it like a study group where each student keeps their notes private, but everyone still helps build the answer key. Your system tries to recognize sound patterns from short audio clips, then improves by combining what each node learned. You also get to study the privacy tradeoff. Better privacy usually means less shared data, but that can make the model harder to train. That tension is the heart of the project.
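
To make the "share model updates, not recordings" idea concrete, here is a minimal Python sketch of one aggregation step. It assumes a FedAvg-style weighted average, which this guide does not specify. The function name federated_average, the tiny two-number "models," and the clip counts are all made up for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Combine per-node weight vectors into one global vector.

    Each node's weights count in proportion to how many clips it trained on
    (a FedAvg-style rule, assumed here for illustration).
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Hypothetical updates from three nodes; only these numbers leave each device.
node_updates = [[0.2, 1.0], [0.4, 0.0], [0.6, 2.0]]
node_clip_counts = [100, 300, 100]

global_weights = federated_average(node_updates, node_clip_counts)
print(global_weights)
```

Nodes with more clips pull the average toward their own weights. In a real system each list would hold thousands of parameters, but the raw audio would still stay on the node.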

Why This Is a Good Topic

This is a strong science fair topic because you can test a clear engineering question: how well can a private, distributed sound detector work compared with a centralized one? You can measure accuracy, false alarms, communication cost, and privacy risk. The topic connects to school safety, smart buildings, and ethical AI. A student can learn signal processing, machine learning, embedded systems, and experimental design in one project.

Research Questions

How does federated learning affect sound classification accuracy compared with training on pooled audio data?
What is the effect of the number of edge devices on model accuracy and false alarm rate?
Does adding microphone noise reduction improve detection of gunshot, breaking-glass, and scream events?
To what extent does model size change on-device inference speed and battery use?
Which feature representation (raw waveform, spectrogram, or mel spectrogram) gives the best event detection?
How does non-IID training data, where each device hears different environments, change convergence and final accuracy?
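
The non-IID question above needs a repeatable way to give each device a skewed mix of classes. One common approach in federated learning experiments is a Dirichlet split. The sketch below is one way to do it, not the only one. The function name dirichlet_split, the alpha value, and the label list are assumptions for illustration.

```python
import random


def dirichlet_split(labels, n_devices, alpha, seed=0):
    """Assign clip indices to devices with class proportions drawn from a
    Dirichlet(alpha) distribution. Returns one index list per device."""
    rng = random.Random(seed)
    device_indices = [[] for _ in range(n_devices)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # A Dirichlet sample is a set of gamma draws normalized to sum to 1.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_devices)]
        total = sum(draws)
        start, cumulative = 0, 0.0
        for d in range(n_devices):
            cumulative += draws[d] / total
            # The last device takes whatever is left so no clip is dropped.
            end = len(idx) if d == n_devices - 1 else round(cumulative * len(idx))
            device_indices[d].extend(idx[start:end])
            start = end
    return device_indices


# Hypothetical label list: 200 clips across four classes.
labels = ["gunshot"] * 40 + ["glass"] * 40 + ["scream"] * 40 + ["background"] * 80
parts = dirichlet_split(labels, n_devices=4, alpha=0.3)
for device, idx in enumerate(parts):
    counts = {c: sum(labels[i] == c for i in idx) for c in sorted(set(labels))}
    print(device, counts)
```

A smaller alpha gives more lopsided devices, and a large alpha approaches an even split, so alpha can serve as the independent variable for this question.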

Basic Materials

ESP32 development board with Wi-Fi.
INMP441 I2S microphone module.
Laptop for code development and model training.
MicroSD card module or local storage option for audio logging.
USB cables and stable power supplies.
Headphones for checking recorded clips.
Free audio samples of gunshot, glass break, scream, and background noise.
Breadboard and jumper wires.
Digital multimeter for basic circuit checks.

Advanced Materials

Multiple ESP32 boards for distributed testing.
INMP441 microphones matched for sensitivity.
Small speaker or audio playback rig for controlled test sounds.
Acoustically similar test rooms or a quiet lab space.
Edge AI benchmarking tools for timing and memory use.
Server or workstation for federated aggregation experiments.
Optional external ADC or higher-quality audio front end for comparison studies.
Calibrated sound level meter for reference measurements.

Software & Tools

Python: Trains models, processes audio features, and analyzes results.
TensorFlow Lite for Microcontrollers: Runs compact classifiers on ESP32-class hardware.
PyTorch: Helps prototype audio models before porting them to edge devices.
Audacity: Checks audio clips and trims labeled event samples.
ImageJ: Measures spectrogram or heatmap visuals if you export them as images.
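
As an illustration of the feature processing that Python handles in this list, here is a rough sketch of a log-magnitude spectrogram. It uses NumPy, which is an assumption since the guide does not name it, and the frame length, hop size, and 1 kHz test tone are placeholder choices. A mel spectrogram would add a mel filterbank step on top of this.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Slice the signal into overlapping windowed frames and return the
    log magnitude of each frame's FFT (rows = frames, columns = frequency bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-6)  # small offset avoids log(0)


# Synthetic one-second 1 kHz tone standing in for a recorded clip.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 256
print(spec.shape, peak_hz)
```

Checking that a known tone lands in the expected frequency bin is a quick sanity test before you feed real clips into a model.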

Experiment Steps

Define the exact events your system will detect, then decide how you will label background sound and false alarms.
Choose the comparison groups, such as centralized training, federated training, and local-only training.
Plan your audio feature pipeline, then decide whether you will classify raw waveform data or spectrogram-based features.
Design a privacy rule for what each device may transmit, then make sure no raw audio leaves the node.
Set up your evaluation metrics, including accuracy, precision, recall, false positives, communication load, and inference time.
Build a test plan that checks performance across different rooms, distances, and background noise levels.
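
The metrics named in the steps above can be computed from a list of true labels and predicted labels. This is a small sketch for one target class at a time. The function name detection_metrics and the six example labels are invented for illustration and are not results from any experiment.

```python
def detection_metrics(y_true, y_pred, positive):
    """Precision, recall, and false alarm rate for one target class,
    treating every other label as 'not the event'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_alarm_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Made-up labels for six test clips.
y_true = ["glass", "glass", "background", "background", "background", "scream"]
y_pred = ["glass", "background", "glass", "background", "background", "scream"]

metrics = detection_metrics(y_true, y_pred, positive="glass")
print(metrics)
```

Reporting the false alarm rate separately from accuracy matters here, because background clips usually far outnumber real events.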

Common Pitfalls

Training on a tiny set of sound clips, which makes the model memorize examples instead of learning real patterns.
Mixing up background noise and target events, which raises false alarms and hides weak classes.
Testing only in one room, which makes the system fail when the acoustics change.
Letting devices send raw audio during debugging, which breaks the privacy goal and weakens the project story.
Comparing models with different input features or different training splits, which makes the accuracy numbers unfair.
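
One way to avoid the unfair-comparison pitfall above is to fix the train/test split once and reuse it for every model. The sketch below assigns each clip to a split by hashing its ID, so the assignment does not change between runs or machines. The function name assign_split, the clip IDs, and the 20 percent test fraction are hypothetical.

```python
import hashlib


def assign_split(clip_id, test_fraction=0.2):
    """Map a clip ID to 'train' or 'test' deterministically.

    The hash turns the ID into a number in [0, 1); IDs that fall below
    test_fraction go to the test set.
    """
    digest = hashlib.sha256(clip_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "train"


# Hypothetical clip IDs.
clip_ids = [f"room_a/clip_{n:04d}.wav" for n in range(1000)]
splits = {clip_id: assign_split(clip_id) for clip_id in clip_ids}
n_test = sum(s == "test" for s in splits.values())
print(n_test, "of", len(clip_ids), "clips in the test set")
```

If several clips are cut from one recording, hashing the recording ID instead keeps them on the same side of the split.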

What Makes This Competitive

A stronger version of this project does more than build a working detector. It compares privacy, accuracy, and edge-device limits in a careful way. You could test how non-IID data affects learning across multiple microphones, or show how much performance drops when raw audio never leaves the device. A polished project will also include clean controls, strong validation, and a clear argument for why the design matters in real schools.
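
To put a number on the communication side of that comparison, you can estimate how many bytes federated training moves and set it against streaming raw audio. This is a back-of-the-envelope sketch. The function names and every input (model size, device count, rounds, sample rate) are assumed values, not measurements.

```python
def federated_bytes(n_params, bytes_per_param, n_devices, n_rounds):
    """Bytes moved if every device uploads one update and downloads one
    global model per round (assumes no compression)."""
    return 2 * n_params * bytes_per_param * n_devices * n_rounds


def raw_audio_bytes(sample_rate, bytes_per_sample, seconds, n_devices):
    """Bytes moved if every device streamed uncompressed mono audio instead."""
    return sample_rate * bytes_per_sample * seconds * n_devices


# Assumed scenario: 20,000 float32 parameters, 5 devices, 50 training rounds,
# compared with one hour of 16 kHz, 16-bit audio per device.
fl_total = federated_bytes(n_params=20_000, bytes_per_param=4, n_devices=5, n_rounds=50)
raw_total = raw_audio_bytes(sample_rate=16_000, bytes_per_sample=2, seconds=3600, n_devices=5)

print(f"federated: {fl_total / 1e6:.0f} MB, raw audio: {raw_total / 1e6:.0f} MB")
```

Measured traffic on your own network will differ, so treat this estimate as a prediction to test rather than a result.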

Project Variations

Use playground, hallway, and cafeteria noise as separate environments to test how location changes model accuracy.
Swap federated learning for split learning, then compare privacy and communication cost against the same audio task.
Focus on a single event class, like glass break detection, to study how one-sound systems behave under different background noises.

Learn More

MIT OpenCourseWare: Search for classes on embedded systems, machine learning, and signal processing to build the technical base.
PubMed: Search review articles on acoustic event detection, federated learning, and privacy-preserving sensing.
IEEE Xplore: Find peer-reviewed papers on edge audio classification and distributed learning methods.
NASA Open-Source Software and Data: Explore signal processing and edge computing examples, then adapt the analysis style to your project.
TensorFlow Lite for Microcontrollers documentation: Learn how to run small models on ESP32-class devices.
NIH PubMed Central: Read full-text papers on privacy-preserving machine learning and sensor-based monitoring.

Embedded Systems Category Guide

How to Do Real Embedded Systems Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Datasets →
