Change-Point Detection for Sensor Data

ISEF Category: Mathematics

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Probability and Statistics · Difficulty: Advanced · Setup: Home Setup · Time: Full Year

The Hook

Your house already produces data that can tell a story. A CO2 spike can mean a crowded room, and a PM2.5 jump can mean a cooking event or poor ventilation. The trick is finding the moment the story changes, before the trend gets buried in noise.

What Is It?

Change-point detection asks a simple question: when did the data stop acting normal and start acting different? Think of it like hearing a song skip. Most of the track sounds smooth, then one tiny moment breaks the pattern. In sensor data, that break might mark a window opening, a cooking event, a filter failure, or a sensor glitch.
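To make the idea concrete, here is a minimal sketch of one of the simplest detectors: it tries every possible split of a single series and keeps the one with the largest weighted gap between the average before and the average after. The function name best_split, the min_seg parameter, and the CO2 readings are made up for illustration. It assumes exactly one shift in the mean, which real logs will not always satisfy.

```python
from statistics import mean

def best_split(values, min_seg=3):
    """Return (index, score) for the split with the largest weighted gap in means.

    index is the position of the first sample of the second segment.
    min_seg keeps each side long enough for its mean to be meaningful.
    """
    n = len(values)
    best_idx, best_score = None, 0.0
    for k in range(min_seg, n - min_seg + 1):
        left, right = values[:k], values[k:]
        # CUSUM-style weight: splits close to either edge are down-weighted
        weight = (k * (n - k) / n) ** 0.5
        score = weight * abs(mean(right) - mean(left))
        if score > best_score:
            best_idx, best_score = k, score
    return best_idx, best_score

# Hypothetical CO2 readings in ppm: steady near 420, then a jump near 600
co2 = [420, 418, 423, 421, 419, 422, 598, 603, 601, 597, 600, 602]
idx, score = best_split(co2)
print("Most likely change point starts at sample", idx)
```

A large score by itself does not prove a real change. You still need a rule for how large is large enough, which is where hypothesis testing comes in.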

A high-dimensional detector watches several signals at once, not just one. For home air-quality logs, that might include CO2, PM2.5, humidity, temperature, and maybe occupancy data from a smart plug or motion sensor. Your algorithm looks for coordinated shifts across these streams. FDR stands for false discovery rate. In plain language, it is the expected fraction of your detected change points that are false alarms. Controlling FDR at 10 percent, for example, means that on average no more than about one in ten flagged change points should be false.
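One standard way to control FDR is the Benjamini-Hochberg step-up rule, sketched below. This guide does not prescribe a particular FDR method, so treat the choice as an assumption. The p-values are invented, and the function name and the level q are placeholders. The classical guarantee also assumes the tests are independent or positively dependent, which neighboring time windows may violate.

```python
def benjamini_hochberg(p_values, q=0.10):
    """Return sorted indices of the p-values rejected by the BH rule at level q."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Compare the rank-th smallest p-value with its own threshold
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])

# Hypothetical p-values, one per candidate change point
p_vals = [0.001, 0.20, 0.012, 0.45, 0.030, 0.74]
flagged = benjamini_hochberg(p_vals, q=0.10)
print("Candidates kept as change points:", flagged)
```

Here candidates 0, 2, and 4 survive. A flat cutoff of 0.05 would keep the same three in this toy case, but the two rules diverge as the number of candidates grows.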

Why This Is a Good Topic

This makes a strong science fair topic because you can test a real statistical method on real data you collect yourself. You can compare different detection rules, measure false alarms, and ask whether adding more sensor channels improves accuracy. The topic connects to indoor air quality, smart homes, and public health, but the core project stays in math and statistics. You can learn data cleaning, modeling, hypothesis testing, and how to judge whether an algorithm is trustworthy.

Research Questions

How does adding more sensor channels affect change-point detection accuracy in home-air-quality data?
What is the effect of FDR control on the number of false alarms in streaming sensor logs?
Does a high-dimensional detector find ventilation or cooking events earlier than a single-sensor baseline?
To what extent do missing values reduce change-point detection performance in real-world sensor streams?
Which sensor combination gives the clearest change-point signal for indoor air-quality events?
How does the detector perform on weekdays compared with weekends in a month-long dataset?
To what extent does smoothing the data before detection change the tradeoff between sensitivity and false positives?

Basic Materials

Laptop with spreadsheet software or Python installed.
Home air-quality sensor that records CO2 and PM2.5.
Optional humidity and temperature sensor.
Notebook for logging events like cooking, window opening, and occupancy.
Stable internet or local storage for exporting sensor logs.
Digital clock or time-synced phone for matching events to timestamps.
Data storage folder with enough space for a month of logs.

Advanced Materials

Multiple synchronized air-quality sensors with timestamp export.
Raspberry Pi or similar device for automated logging.
Reference calibration source or co-located comparison sensor.
Access to a statistics package that supports custom hypothesis testing.
Version control tool for code and analysis notes.
External hard drive or cloud backup for raw and processed data.
Optional occupancy or ventilation sensors for multi-channel analysis.

Software & Tools

Python: Cleans the time series, runs the detector, and plots detected change points.
R: Fits statistical models and compares detector performance across methods.
Jupyter Notebook: Keeps code, notes, and figures in one place.
ImageJ: Optional. Skip it unless you add visual sensor mapping.
Google Sheets: Tracks event logs and quick summaries before deeper analysis.

Experiment Steps

Define the exact event you want to detect, such as cooking, window opening, or a ventilation shift.
Choose the sensor channels that will enter your model and decide how you will align their timestamps.
Plan a baseline method first, then choose one multi-sensor detector that should outperform it.
Design controls that separate real environmental changes from sensor drift, missing data, and routine noise.
Build a scoring plan for true positives, false alarms, detection delay, and FDR.
Set up a comparison framework so you can test different sensor combinations, smoothing choices, and thresholds on the same dataset.
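The scoring plan in the steps above can be sketched as follows. This is one possible matching rule, not the only reasonable one: a detection counts as a true positive if it lands within a tolerance window after a logged event, and each detection can match at most one event. The function name, the 10-minute tolerance, and the timestamps (minutes since the start of the log) are all hypothetical.

```python
def score_detections(detected, events, tolerance=10):
    """Match detections to logged events and summarize the errors.

    detected, events: lists of times in minutes.
    tolerance: how late a detection may be and still count for an event.
    """
    remaining = sorted(detected)
    delays = []
    for event in sorted(events):
        # Earliest unused detection inside the window [event, event + tolerance]
        match = next((d for d in remaining if event <= d <= event + tolerance), None)
        if match is not None:
            delays.append(match - event)
            remaining.remove(match)
    true_pos = len(delays)
    false_alarms = len(remaining)  # detections that matched no event
    total = len(detected)
    return {
        "true_positives": true_pos,
        "false_alarms": false_alarms,
        "missed_events": len(events) - true_pos,
        "mean_delay": sum(delays) / true_pos if true_pos else None,
        "false_discovery_proportion": false_alarms / total if total else 0.0,
    }

events = [60, 300, 720]          # hypothetical notebook entries
detected = [64, 150, 303, 900]   # hypothetical detector output
summary = score_detections(detected, events)
print(summary)
```

The false discovery proportion from one dataset is a single observed value. FDR is its average over many datasets, so estimating it takes repeated runs, for example on simulated streams.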

Common Pitfalls

Logging sensor data without time sync, which makes event labels miss the actual change point.
Mixing too many unrelated events, which hides whether the detector is finding one clear phenomenon or several different ones.
Treating a sensor spike as a real change point when it came from a power glitch or Bluetooth drop.
Comparing methods on different data windows, which makes one algorithm look better for the wrong reason.
Ignoring class imbalance, which can make a detector look accurate even when it misses rare change events.
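The class-imbalance pitfall is easy to see with a toy calculation. In the sketch below, the labels are invented: 1,000 time windows, only 5 of which contain a real change, scored against a useless detector that never raises an alarm.

```python
def accuracy_and_recall(labels, predictions):
    """labels and predictions are lists of 0 (no change) and 1 (change)."""
    pairs = list(zip(labels, predictions))
    correct = sum(1 for y, p in pairs if y == p)
    true_pos = sum(1 for y, p in pairs if y == 1 and p == 1)
    positives = sum(labels)
    accuracy = correct / len(labels)
    # Recall: the share of real changes that the detector actually caught
    recall = true_pos / positives if positives else None
    return accuracy, recall

labels = [0] * 995 + [1] * 5   # hypothetical: 5 real changes in 1000 windows
never_alarm = [0] * 1000       # a detector that never flags anything
acc, rec = accuracy_and_recall(labels, never_alarm)
print("accuracy:", acc, "recall:", rec)
```

The detector scores 99.5 percent accuracy while catching none of the events, which is why recall, false alarms, and detection delay belong in the report alongside or instead of accuracy.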

What Makes This Competitive

A stronger project goes beyond "my algorithm found change points." You can compare several detectors, prove why your FDR rule should control false alarms, and test it on both real and simulated streams. You can also study whether extra channels help only in some situations, like cooking events but not slow ventilation changes. A competitive version usually has clean evaluation, honest error analysis, and a clear reason your method beats the baseline.

Project Variations

Swap home-air-quality logs for wearable heart-rate or motion-sensor data and test the same detector on personal activity shifts.
Compare a CO2-only detector with a PM2.5-only detector, then ask whether either channel alone can match a multi-sensor model.
Add a simulation study that varies noise, missing data, and event length to test where the detector breaks first.
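For the simulation variation, a minimal generator might look like the sketch below. It assumes a flat baseline, Gaussian noise, one temporary jump, and values that go missing independently at random. Every parameter name and default is a placeholder to tune, and real sensor noise is rarely this tidy.

```python
import random

def simulate_stream(n=300, start=120, length=40, jump=3.0,
                    noise_sd=1.0, missing_rate=0.05, seed=0):
    """Return a list of n readings with one event; missing readings are None."""
    rng = random.Random(seed)  # fixed seed so each run can be reproduced
    stream = []
    for t in range(n):
        # The event raises the level for `length` samples beginning at `start`
        level = jump if start <= t < start + length else 0.0
        value = level + rng.gauss(0.0, noise_sd)
        # Drop the reading with probability missing_rate
        stream.append(None if rng.random() < missing_rate else value)
    return stream

stream = simulate_stream()
observed = [x for x in stream if x is not None]
print(len(stream), "samples,", len(stream) - len(observed), "missing")
```

Because you know exactly where the event starts and ends, you can rerun the same detector while raising noise_sd, raising missing_rate, or shrinking length, and record the setting at which it first fails.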

Learn More

NIH PubMed: Search for review articles on change-point detection, false discovery rate control, and time-series methods.
arXiv: Search for recent preprints on high-dimensional change-point detection and streaming statistics.
MIT OpenCourseWare: Look for probability, statistics, and data analysis courses that cover hypothesis testing and time series.
NOAA Air Resources Laboratory: Use background material on air-quality measurement and atmospheric data interpretation.
US EPA Air Sensor Toolbox: Read about low-cost air sensors, calibration limits, and common data-quality issues.
NIST/SEMATECH e-Handbook of Statistical Methods: Find sections on time-series analysis, control charts, and model evaluation.

Mathematics Category Guide

How to Do Real Mathematics Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →