Predicting Rooftop Solar Soiling Loss

ISEF Category: Energy: Sustainable Materials and Design

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Solar Process, Materials, and Design  ·  Difficulty: Intermediate  ·  Setup: Home Setup  ·  Time: 1 to 2 Months

The Hook

Dust does not just make your car dirty. It can also steal energy from solar panels. A thin film on glass changes how much light reaches the cells underneath. If you can predict when that loss will happen, you can help solar owners clean panels at the right time.

What Is It?

Solar soiling is the buildup of dirt, dust, pollen, smoke particles, or other debris on a panel, which blocks sunlight. Think of it like putting a dim filter over a flashlight. The panel still works, but it gets less light, so its output drops.

This project asks whether you can predict that loss using outside data such as local air quality, wind, rainfall, and dust conditions. NASA POWER gives weather and sunlight data. PurpleAir gives nearby particle readings from low-cost air sensors. A regression model is a math tool that looks for patterns between inputs and a measured result, like panel output or estimated soiling loss. If the model works, you can estimate when cleaning matters most.

You do not need a solar farm to study this. You can use rooftop panel data, published utility data, or your own small panel setup. The big idea is simple: connect environmental conditions to a real drop in energy output, then see how well a model predicts it.
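One way to turn raw panel output into an "estimated soiling loss" is sketched below. It is only an illustration under stated assumptions, not a standard method from this guide: it assumes you have daily energy and daily irradiance for the same days, that the first few days follow a cleaning (so they can serve as a clean baseline), and that the column names `energy_kwh` and `irradiance_kwh_m2` are hypothetical placeholders for your own data.

```python
import pandas as pd


def estimate_soiling_loss(daily: pd.DataFrame, clean_days: int = 3) -> pd.Series:
    """Estimate percent soiling loss relative to a clean baseline.

    Assumes `daily` has one row per day, in date order, with the
    hypothetical columns `energy_kwh` and `irradiance_kwh_m2`, and that
    the first `clean_days` rows were recorded with a clean panel.
    """
    # Energy per unit of sunlight: dividing by irradiance removes most of
    # the day-to-day effect of clouds, so a cloudy day is not mistaken
    # for a dirty panel.
    ratio = daily["energy_kwh"] / daily["irradiance_kwh_m2"]

    # Average ratio on the assumed-clean days is the reference point.
    baseline = ratio.iloc[:clean_days].mean()

    # Percent drop of each day's ratio below the clean baseline.
    return (1.0 - ratio / baseline) * 100.0
```

With this definition, a cloudy day where energy and irradiance both fall by half keeps the same ratio, so its estimated loss stays near zero. A day with normal sunlight but lower energy shows up as a positive loss.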

Why This Is a Good Topic

This is a strong science fair topic because you can measure real-world change, compare several environmental inputs, and test a model with real data. It connects to renewable energy, pollution, and maintenance costs. You can start with public datasets and still make original choices about features, model type, and validation. You will also learn data cleaning, regression, and how to judge whether a prediction model actually works.

Research Questions

  • How well do changes in regional PM2.5 or PM10 predict rooftop PV soiling loss?
  • What is the effect of recent rainfall on panel output recovery after a dust event?
  • Does adding wind speed improve a model that predicts soiling loss from air quality alone?
  • To what extent do NASA POWER weather variables improve prediction accuracy for rooftop PV output?
  • Which regression model best predicts soiling loss from PurpleAir and weather data?
  • How does season change the link between dust levels and PV performance?
  • To what extent can a model trained in one location predict soiling loss in a different nearby location?

Basic Materials

  • Laptop or desktop computer with internet access.
  • Spreadsheet software such as Google Sheets or Excel.
  • Free Python environment such as Google Colab.
  • Access to NASA POWER data.
  • Access to PurpleAir public data.
  • Rooftop PV output data from a public source or your own small solar panel setup.
  • Digital notebook for tracking data sources, dates, and cleaning decisions.

Advanced Materials

  • Rooftop PV monitoring system with logged power output.
  • Small reference solar panel for field comparison.
  • Pyranometer or light sensor for irradiance checks.
  • Portable weather station for local wind, humidity, and rainfall.
  • Dust collection materials for surface inspection.
  • Image capture setup for panel surface photography.
  • Database or cloud storage for time-series data.

Software & Tools

  • Google Colab: Runs Python notebooks in the browser and handles regression, plots, and data cleaning.
  • Python: Lets you merge time-series data, fit models, and test prediction accuracy.
  • pandas: Organizes weather, air quality, and power data into analysis-ready tables.
  • scikit-learn: Fits regression models and compares their prediction scores.
  • ImageJ: Measures surface coverage if you add panel photo analysis to your project.

Experiment Steps

  1. Define your target, either measured PV output or estimated soiling loss from panel data.
  2. Choose the environmental variables you will test, then set one as your main predictor and include the others as additional inputs so you can account for their effects.
  3. Build a clean dataset by matching panel performance records with nearby air quality and weather data on the same dates.
  4. Split the data into training and test sets by date, with the test set covering a later period, so you can judge prediction quality on unseen days.
  5. Compare at least two regression models and decide which metrics matter most for your question.
  6. Check whether the model still works across different seasons, rainfall periods, or sites.
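The steps above can be sketched in Python with pandas and scikit-learn. This is a minimal outline under assumptions, not a finished analysis: the data is synthetic (made up by `make_demo_tables` so the code runs anywhere), the column names such as `pm25`, `rain_mm`, `wind_ms`, and `soiling_loss_pct` are hypothetical, and the two models and the error metric are just one reasonable pairing.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error


def make_demo_tables(n_days=200, seed=0):
    """Create synthetic stand-ins for air quality, weather, and panel tables."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2023-01-01", periods=n_days, freq="D")
    pm25 = rng.uniform(5, 60, n_days)
    rain = rng.choice([0.0, 0.0, 0.0, 5.0], n_days)  # rain on about 1 day in 4
    wind = rng.uniform(0, 8, n_days)
    # Invented relationship: more particles raise loss, rain lowers it.
    loss = 0.05 * pm25 - 0.2 * rain + rng.normal(0, 0.3, n_days)
    air = pd.DataFrame({"date": dates, "pm25": pm25})
    weather = pd.DataFrame({"date": dates, "rain_mm": rain, "wind_ms": wind})
    panel = pd.DataFrame({"date": dates, "soiling_loss_pct": loss})
    return air, weather, panel


def build_dataset(air, weather, panel):
    """Step 3: keep only dates present in all three tables, in date order."""
    merged = panel.merge(air, on="date", how="inner")
    merged = merged.merge(weather, on="date", how="inner")
    return merged.sort_values("date").reset_index(drop=True)


def chronological_split(df, test_fraction=0.25):
    """Step 4: train on earlier days, test on later days."""
    cut = int(len(df) * (1 - test_fraction))
    return df.iloc[:cut], df.iloc[cut:]


def compare_models(train, test, features, target):
    """Step 5: fit two regressors and report mean absolute error on the test set."""
    models = {
        "linear": LinearRegression(),
        "forest": RandomForestRegressor(n_estimators=100, random_state=0),
    }
    scores = {}
    for name, model in models.items():
        model.fit(train[features], train[target])
        predictions = model.predict(test[features])
        scores[name] = mean_absolute_error(test[target], predictions)
    return scores


air, weather, panel = make_demo_tables()
data = build_dataset(air, weather, panel)
train, test = chronological_split(data)
scores = compare_models(
    train, test, ["pm25", "rain_mm", "wind_ms"], "soiling_loss_pct"
)
print(scores)
```

Mean absolute error is used here because it reads directly in the units of the target (percentage points of loss). Swapping in your real tables only requires replacing `make_demo_tables` with code that loads your own files.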

Common Pitfalls

  • Misaligning timestamps between air quality data and PV output data (for example, UTC versus local time), which creates fake relationships between dust and power loss.
  • Using a single nearby sensor as if it perfectly represents rooftop conditions, which can hide local variation.
  • Ignoring rainfall and wind, which can make a dust model look weaker or stronger than it really is.
  • Training and testing on the same stretch of dates, which makes the model seem more accurate than it is.
  • Treating cloud cover like soiling, which confuses low sunlight with dirt on the panel.
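The timestamp pitfall is easy to hit when one source logs in UTC and another in local time. The sketch below shows one way to put both on the same local calendar day before merging. It assumes the air quality readings are in UTC and the PV log is in local time. The function name `to_daily_local`, the column names, and the example time zone are all hypothetical choices for illustration, so check how your own sources actually record time.

```python
import pandas as pd


def to_daily_local(df, time_col, source_tz, local_tz, how):
    """Convert naive timestamps from source_tz to local_tz, then aggregate by local day.

    `how` is an aggregation name such as "mean" (for concentrations)
    or "sum" (for energy). Timestamps that fall on a daylight saving
    transition may need extra handling that this sketch does not include.
    """
    stamps = pd.to_datetime(df[time_col]).dt.tz_localize(source_tz)
    stamps = stamps.dt.tz_convert(local_tz)
    indexed = df.drop(columns=[time_col]).set_index(pd.DatetimeIndex(stamps))
    daily = indexed.resample("D").agg(how)
    # Drop the time zone label so both tables share plain local dates.
    daily.index = daily.index.tz_localize(None)
    daily.index.name = "date"
    return daily


# Made-up readings: 03:00 UTC on June 2 is still the evening of June 1
# in Los Angeles, so it belongs with the June 1 panel data.
air = pd.DataFrame(
    {"time_utc": ["2023-06-02 03:00", "2023-06-02 10:00"], "pm25": [10.0, 30.0]}
)
pv = pd.DataFrame(
    {"time_local": ["2023-06-01 12:00", "2023-06-02 12:00"], "energy_kwh": [4.8, 4.6]}
)

air_daily = to_daily_local(air, "time_utc", "UTC", "America/Los_Angeles", "mean")
pv_daily = to_daily_local(
    pv, "time_local", "America/Los_Angeles", "America/Los_Angeles", "sum"
)
merged = pv_daily.join(air_daily, how="inner")
print(merged)
```

Grouping the air readings by their UTC date instead would place both readings on June 2 and leave June 1 with no air quality value, which is exactly the kind of mismatch that produces misleading patterns.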

What Makes This Competitive

A stronger project goes beyond one simple correlation. You can compare several models, test them across seasons, and check whether the predictions still hold at a different site. You can also add a second outcome, such as cleaning threshold days or percent power recovery after rain. Clear validation matters more than a fancy model name.
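One simple way to check whether predictions hold across seasons is to break the test error down by season rather than reporting a single overall number. The sketch below assumes you already have dates, measured values, and model predictions for your test days. The helper name `mae_by_season` is hypothetical, and the month-to-season mapping assumes Northern Hemisphere meteorological seasons, so adjust it for your location.

```python
import numpy as np
import pandas as pd

# Assumed mapping: Northern Hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def mae_by_season(dates, actual, predicted):
    """Return the mean absolute prediction error for each season present."""
    errors = np.abs(
        np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    )
    months = pd.to_datetime(list(dates)).month
    table = pd.DataFrame(
        {"season": months.map(SEASON_BY_MONTH), "abs_error": errors}
    )
    return table.groupby("season")["abs_error"].mean()


# Made-up example: two winter days and one summer day.
result = mae_by_season(
    ["2023-01-10", "2023-01-20", "2023-07-05"],
    actual=[2.0, 4.0, 5.0],
    predicted=[1.0, 2.0, 5.5],
)
print(result)
```

If one season shows a much larger error than the others, that is a result worth reporting: it suggests the link between dust and output changes with conditions the model has not captured.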

Project Variations

  • Use a small rooftop panel and compare predicted soiling loss with your own logged output data from several weather conditions.
  • Swap PurpleAir for another public particulate source and test whether the model stays accurate with a different sensor network.
  • Add satellite or local smoke event data and test whether fire smoke changes prediction accuracy during haze periods.

Learn More

  • NASA POWER: Search the NASA POWER Data Access Viewer and documentation for surface meteorology and solar data by location and date.
  • PurpleAir: Search the PurpleAir map and API documentation for public particle sensor readings near your area.
  • NOAA National Centers for Environmental Information: Find historical weather and climate data for rainfall, wind, and cloud conditions.
  • PubMed: Search review articles on photovoltaic soiling, dust deposition, and cleaning effects.
  • MIT OpenCourseWare: Search introductory materials on regression, statistics, and machine learning for data analysis practice.
  • Solar Energy journal: Search for peer-reviewed studies on PV soiling, cleaning intervals, and performance losses.
