Forecasting Urban NO2 Hotspots With Machine Learning
ISEF Category: Environmental Engineering
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Pollution Control · Difficulty: Advanced · Setup: University Lab · Time: Full Year
The Hook
Traffic jams do more than waste time. They can trap pollution in one street while the next block stays cleaner. If you can predict those NO2 spikes before they happen, you are working on a real air quality problem cities care about. That makes this a strong science fair topic.
What Is It?
NO2, or nitrogen dioxide, is a gas that comes mostly from burning fuel in cars, trucks, and other engines. In busy corridors, its concentration can rise quickly when traffic slows, wind drops, or the air near the road gets trapped. Your goal is to predict those hotspots before they peak.
Think of the city like a giant bowl of soup. Traffic stirs pollution into the air, while wind and temperature can either mix it out or keep it concentrated in one place. Machine learning is just a pattern-finding tool. You give it past examples, such as traffic counts, weather, and measured NO2, and it learns which combinations usually lead to higher pollution.
This project mixes environmental science and data science. You are not just asking whether traffic affects air quality. You are asking when, where, and under what weather conditions the effect gets strongest.
Why This Is a Good Topic
This topic works well because you can test a clear question with real public data and a real-world outcome. You can build a model, compare it to actual NO2 readings, and see which inputs matter most. The project connects to city planning, health, and pollution control. You can also learn data cleaning, feature selection, model testing, and error analysis, which are core research skills.
Research Questions
- How well does traffic volume predict hourly NO2 levels along urban corridors?
- What is the effect of wind speed on the accuracy of NO2 hotspot forecasts?
- Does adding temperature and humidity improve model performance compared with traffic data alone?
- To what extent do weekday and weekend patterns change NO2 predictions?
- Which machine-learning model predicts NO2 hotspots best: random forest, linear regression, or gradient boosting?
- How does distance from the nearest major road affect measured NO2 and forecast error?
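Several of these questions come down to comparing forecast error between two models. The sketch below shows one common way to do that, using root mean squared error (RMSE) and mean absolute error (MAE). The prediction lists are invented placeholders standing in for the output of two hypothetical models, and the variable names are assumptions made for this example.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: the average size of a miss, in the same units as NO2."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up held-out NO2 readings (ppb) and predictions from two hypothetical models.
actual = [20.0, 30.0, 25.0, 40.0]
traffic_only = [22.0, 27.0, 25.0, 35.0]
traffic_plus_weather = [21.0, 29.0, 26.0, 38.0]

for name, preds in [("traffic only", traffic_only),
                    ("traffic + weather", traffic_plus_weather)]:
    print(f"{name}: RMSE={rmse(actual, preds):.2f} ppb, MAE={mae(actual, preds):.2f} ppb")
```

Lower values mean better forecasts. Reporting both metrics is a reasonable choice because RMSE reacts strongly to a few large misses, which matters when the misses fall on pollution peaks.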
Basic Materials
- Laptop with internet access and enough storage for data files.
- Spreadsheet software or a free statistics program.
- Python installed with common data packages.
- Public OpenAQ air-quality data for NO2 readings.
- Public weather data from NOAA or a local meteorological station.
- Google Traffic or another public traffic source with time-based congestion data.
- Portable NO2 sensor such as an Aeroqual loaner, if available.
- Notebook for tracking data sources, cleaning choices, and model tests.
Advanced Materials
- Portable Aeroqual monitor or another calibrated NO2 sensor for field validation.
- Tripod or fixed mount for repeatable sensor placement.
- GPS-enabled device for recording sampling locations.
- Calibrated reference data from a nearby regulatory monitoring station.
- Low-cost particulate or meteorological sensors for extra covariates.
- Laptop with Python, R, or similar analysis software.
- External hard drive or cloud backup for large datasets.
- GIS software for mapping corridor-level hotspots.
Software & Tools
- Python: Cleans data, builds models, and tests prediction accuracy.
- Jupyter Notebook: Keeps code, charts, and notes in one place.
- Google Earth Engine: Helps if you want to add mapping or land-use context.
- QGIS: Maps sensor locations, roads, and predicted hotspot zones.
- ImageJ: Not usually needed here, but useful if you later analyze camera-based traffic counts.
Experiment Steps
- Define the corridor, the time window, and the exact NO2 outcome you want to predict.
- Choose one target variable first, such as hourly NO2 or hotspot versus non-hotspot classification.
- Assemble traffic, weather, and air-quality datasets that share the same time stamps and location scale.
- Decide how you will clean missing values, outliers, and mismatched timestamps before modeling.
- Build a baseline model, then compare it with models that add more predictors one group at a time.
- Plan a validation check with portable sensor data or a held-out monitoring site so you can test real-world performance.
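The data-assembly and target-definition steps above can be sketched as follows. This is one possible approach using only the Python standard library, not a required method. It converts every timestamp to UTC, floors it to the hour, keeps only the hours present in all three sources, and then labels each hour as hotspot or non-hotspot. The readings and the 40 ppb cutoff are invented for illustration, so choose and justify your own threshold.

```python
from datetime import datetime, timezone


def to_utc_hour(stamp):
    """Parse an ISO 8601 string that carries a UTC offset and floor it to the hour in UTC."""
    moment = datetime.fromisoformat(stamp)
    if moment.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {stamp!r}")
    return moment.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)


def join_hourly(traffic, weather, no2):
    """Inner join: keep only the hours that appear in all three sources.

    Each argument maps a timestamp string to one reading. If a source has
    several readings in the same hour, the last one wins in this sketch;
    averaging them would be a reasonable alternative.
    """
    t, w, n = [{to_utc_hour(k): v for k, v in source.items()}
               for source in (traffic, weather, no2)]
    shared_hours = sorted(set(t) & set(w) & set(n))
    return [(hour, t[hour], w[hour], n[hour]) for hour in shared_hours]


def label_hotspots(rows, threshold_ppb):
    """Append True or False to each row depending on whether NO2 exceeds the threshold."""
    return [row + (row[3] > threshold_ppb,) for row in rows]


# Made-up readings. Traffic is logged in local time (UTC-5); the others are in UTC.
traffic = {"2024-03-01T08:15:00-05:00": 950, "2024-03-01T09:05:00-05:00": 700}
wind_speed = {"2024-03-01T13:00:00+00:00": 1.2, "2024-03-01T14:00:00+00:00": 3.5}
no2_ppb = {"2024-03-01T13:30:00+00:00": 41.0}

rows = join_hourly(traffic, wind_speed, no2_ppb)
labeled = label_hotspots(rows, threshold_ppb=40.0)  # 40 ppb is an arbitrary example cutoff
print(labeled)
```

Only one hour survives the join here because the NO2 source has a single reading. Counting how many hours you lose at this step is worth recording in your notebook, since it shows how much the gaps in each source limit your dataset.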
Common Pitfalls
- Mixing data from different time zones or timestamp formats, which shifts traffic, weather, and NO2 out of alignment.
- Using a traffic measure that does not match the road segment where the air-quality reading was taken, which weakens the signal.
- Training and testing on the same stretch of time, which makes the model look better than it really is.
- Ignoring missing weather values or sensor gaps, which can bias the hotspots you think you found.
- Treating a portable sensor as perfect without checking calibration against a reference station, which can distort validation.
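One way to avoid the train-and-test overlap described above is a chronological split: sort the records by time, train on the earlier portion, and test on the most recent portion. The sketch below assumes each record is a tuple whose first element is a timestamp. The function name, the 20 percent test share, and the sample rows are choices made for this example.

```python
from datetime import datetime


def chronological_split(rows, test_fraction=0.2):
    """Sort rows by timestamp and hold out the most recent share for testing.

    rows: list of tuples whose first element is a sortable timestamp.
    Returns (train_rows, test_rows) with every test row later than every train row.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    if len(rows) < 2:
        raise ValueError("need at least two rows to split")
    ordered = sorted(rows, key=lambda row: row[0])
    n_test = max(1, round(len(ordered) * test_fraction))
    n_test = min(n_test, len(ordered) - 1)  # always leave at least one training row
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Made-up hourly records in scrambled order: (timestamp, vehicles per hour, NO2 in ppb).
sample_rows = [
    (datetime(2024, 3, 1, hour), 500 + 40 * hour, 15.0 + hour)
    for hour in (7, 2, 9, 0, 5, 3, 8, 1, 6, 4)
]

train_rows, test_rows = chronological_split(sample_rows, test_fraction=0.2)
print(len(train_rows), "training hours;", len(test_rows), "test hours")
```

A random shuffle would scatter neighboring hours across both sets, and because pollution in one hour resembles the next, the model would effectively see its test answers during training.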
What Makes This Competitive
A strong version of this project goes beyond a simple prediction plot. You can compare multiple models, test whether traffic adds value after weather is already known, and check performance on separate routes or seasons. Strong entries also explain why the model fails in some places, not just where it works. If you pair prediction with careful validation against a sensor and a reference station, your project looks much more like real environmental research.
Project Variations
- Use bicycle-corridor or school-dropoff traffic data instead of major roads to test whether smaller congestion sources still create NO2 spikes.
- Replace Google Traffic with webcam-based vehicle counts or city open-data traffic sensors to compare how the input source changes model accuracy.
- Add land-use or street-canyon features, such as building density and road width, to test whether urban form improves hotspot forecasting.
Learn More
- OpenAQ: Search the site for NO2 datasets and API documentation to find public air-quality measurements.
- NOAA National Centers for Environmental Information: Use weather station records and climate data for wind, temperature, and humidity inputs.
- NASA Earthdata: Explore land-surface and atmospheric context data if you want extra environmental predictors.
- PubMed: Search for review articles on traffic-related air pollution, NO2 exposure, and urban exposure modeling.
- US EPA Air Quality System: Look up monitored NO2 data and compare your portable sensor against regulatory stations.
- MIT OpenCourseWare: Search for introductory machine learning and data analysis courses if you need a free refresher on model testing.
Environmental Engineering Category Guide
How to Do Real Environmental Engineering Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
