Heat-Visit Prediction With Satellite Data
ISEF Category: Computational Biology and Bioinformatics
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Computational Epidemiology · Difficulty: Advanced · Setup: Home Setup · Time: Full Year
The Hook
Heat does not hit every neighborhood the same way. Some blocks stay cooler because they have more trees, lighter roofs, or less pavement. Your model can help find places where heat may send more people to the ER. That turns satellite data into a public health map.
What Is It?
This project uses machine learning to connect environmental data with health outcomes. You start with satellite land-surface temperature, which estimates how hot the ground looks from space. You also add tree-canopy data, which shows how much of a neighborhood is shaded by trees. Then you compare those features with heat-related emergency department visits.
Think of it like matching a weather map with a neighborhood shade map. The model looks for patterns, such as whether hotter, less shaded areas tend to have more heat illness visits. If you add census or demographic data, you can also look for environmental justice patterns, which means checking whether some communities face more heat risk than others.
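As a rough illustration of that pattern check, the sketch below converts visit counts into a rate per 10,000 residents and computes the Pearson correlation of that rate with land-surface temperature and with tree canopy. The five tract records, the field names, and every value are invented for this example; they are not real data.

```python
from math import sqrt

# Made-up tract records for illustration only (not real data).
tracts = [
    {"tract": "A", "lst_c": 41.0, "canopy_pct": 5.0, "visits": 18, "population": 4000},
    {"tract": "B", "lst_c": 38.5, "canopy_pct": 12.0, "visits": 12, "population": 3500},
    {"tract": "C", "lst_c": 36.0, "canopy_pct": 25.0, "visits": 9, "population": 4500},
    {"tract": "D", "lst_c": 34.0, "canopy_pct": 35.0, "visits": 5, "population": 5000},
    {"tract": "E", "lst_c": 32.5, "canopy_pct": 45.0, "visits": 3, "population": 3000},
]


def visit_rate(row, per=10_000):
    """Heat-related visits per `per` residents."""
    return row["visits"] / row["population"] * per


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


rates = [visit_rate(row) for row in tracts]
r_lst = pearson([row["lst_c"] for row in tracts], rates)
r_canopy = pearson([row["canopy_pct"] for row in tracts], rates)

print(f"Correlation of visit rate with LST:    {r_lst:+.2f}")
print(f"Correlation of visit rate with canopy: {r_canopy:+.2f}")
```

A correlation like this only describes an association in the sample. It does not show that heat or shade causes the difference in visits, which is one reason the later modeling and validation steps matter.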
Why This Is a Good Topic
This is a strong science fair topic because you can test a clear question with public data and real health stakes. You can build a model, check how well it predicts visits, and compare neighborhoods with different heat and tree-cover profiles. The project teaches data cleaning, feature selection, model evaluation, and map-based analysis. It also connects directly to urban planning and public health.
Research Questions
- How does neighborhood tree canopy affect the predicted rate of heat-related emergency department visits?
- What is the effect of land-surface temperature on heat-related emergency department visit counts?
- Does adding tree-canopy data improve model accuracy compared with temperature alone?
- To what extent do census tract demographics change the model’s ability to identify heat-risk hotspots?
- Which machine learning model best predicts heat-related emergency department visits from satellite and neighborhood data?
- How does the relationship between heat and emergency visits vary across neighborhoods with different land cover types?
Basic Materials
- Laptop with at least 8 GB RAM.
- Internet access for downloading public datasets.
- Spreadsheet software such as Google Sheets or Excel.
- Python installed through Anaconda or a similar free distribution.
- Jupyter Notebook for cleaning data and running models.
- Public health and census data from city, county, state, or CDC sources.
- NASA land-surface-temperature data from Earthdata or a related portal.
- NLCD tree-canopy or land-cover data from the USGS or the MRLC data portal (mrlc.gov).
- GIS software such as QGIS for mapping results.
- Digital notebook for tracking variable definitions and file versions.
Advanced Materials
- High-memory laptop or university workstation.
- Python with scikit-learn, pandas, geopandas, rasterio, and shapely.
- GIS software such as QGIS or ArcGIS if available through a university.
- Access to shapefiles or census tract boundaries.
- Publicly available emergency department or syndromic surveillance data.
- NASA Earthdata access for higher resolution thermal products.
- NLCD or similar land-cover rasters.
- Statistics software such as R for sensitivity checks and spatial regression.
- ImageJ or similar software if you need to inspect raster-derived visual outputs.
- Version control with Git and GitHub for reproducible code.
Software & Tools
- Python: Cleans datasets, joins neighborhood features, and trains prediction models.
- Jupyter Notebook: Lets you document code, plots, and analysis in one place.
- QGIS: Maps heat, tree canopy, and visit rates by neighborhood.
- Google Colab: Gives you a free cloud notebook if your computer is slow.
- R: Helps you run regression checks and compare model performance.
Experiment Steps
- Define the neighborhood unit you will analyze, such as census tracts or ZIP codes.
- Decide which outcome you will predict, such as heat-related ED visits, visit rates, or hotspot categories.
- Choose your input features, then separate thermal, land-cover, and demographic variables.
- Build a baseline model first, then compare it with models that add tree canopy and other predictors.
- Plan a validation strategy that tests whether the model works on unseen neighborhoods or time periods.
- Map the predictions and check whether high-risk areas cluster in communities with fewer trees or higher heat exposure.
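The baseline-versus-canopy comparison in the steps above can be sketched as follows. This is a minimal example under stated assumptions, not a recommended pipeline: the 20 tracts are synthetic (generated by a formula in which canopy lowers the visit rate), the model is plain least-squares regression written out by hand, and the helper names `fit_ols`, `predict`, `rmse`, and `design` are made up for this illustration. In a real project you would more likely use scikit-learn and real tract data.

```python
from math import sqrt


def fit_ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y.
    Each row of X must already start with 1.0 for the intercept."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * t for row, t in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coefs, X):
    return [sum(c * v for c, v in zip(coefs, row)) for row in X]


def rmse(y_true, y_pred):
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Synthetic tracts (assumption): rate rises with LST and falls with canopy.
rows = []
for i in range(20):
    lst = 30.0 + (i * 7) % 13           # land-surface temperature, deg C
    canopy = 5.0 + (i * 11) % 40        # tree canopy, percent
    noise = 0.5 * ((i * 5) % 7 - 3)     # small deterministic "noise"
    rate = 30.0 + 2.0 * (lst - 30.0) - 0.5 * canopy + noise
    rows.append((lst, canopy, rate))

# Hold out the last 6 tracts so the models are scored on unseen neighborhoods.
train, test = rows[:14], rows[14:]


def design(data, use_canopy):
    """Build feature rows: intercept, LST, and optionally canopy."""
    return [[1.0, lst] + ([canopy] if use_canopy else []) for lst, canopy, _ in data]


y_train = [r[2] for r in train]
y_test = [r[2] for r in test]

results = {}
for name, use_canopy in [("LST only", False), ("LST + canopy", True)]:
    coefs = fit_ols(design(train, use_canopy), y_train)
    error = rmse(y_test, predict(coefs, design(test, use_canopy)))
    results[name] = (coefs, error)
    print(f"{name}: held-out RMSE = {error:.2f}")
```

Because the synthetic data were built so that canopy matters, the second model should score better here. With real data the gap may be small or absent, and that is a finding worth reporting either way.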
Common Pitfalls
- Mismatching health records and satellite data by date, which weakens the link between heat exposure and visits.
- Using different neighborhood boundaries for different datasets, which creates broken joins and missing rows.
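- Treating land-surface temperature like air temperature, even though sunlit surfaces such as roofs and pavement can be much hotter than the air people actually experience, which can distort the public health interpretation.
- Ignoring small population counts in some neighborhoods, where a handful of visits can swing the rate and make it look unstable or noisy.
- Evaluating the model on the same neighborhoods used for training, which hides overfitting and makes the hotspot map look better than it really is.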
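One way to guard against the last pitfall is to split the data by neighborhood rather than by row, so that no tract appears in both the training and the test set. The sketch below is a hand-written version of that idea; the function name `group_k_fold` and the tract IDs are hypothetical. scikit-learn's `GroupKFold` offers a comparable split if you prefer a library tool.

```python
def group_k_fold(groups, k):
    """Yield (train_idx, test_idx) pairs so that every group lands in
    exactly one test fold and never in that fold's training rows."""
    unique = sorted(set(groups))
    fold_of = {g: i % k for i, g in enumerate(unique)}
    for fold in range(k):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test


# Hypothetical layout: two rows (for example, two summers) per tract.
tract_ids = ["T1", "T1", "T2", "T2", "T3", "T3",
             "T4", "T4", "T5", "T5", "T6", "T6"]

folds = list(group_k_fold(tract_ids, k=3))
for train_idx, test_idx in folds:
    held_out = sorted({tract_ids[i] for i in test_idx})
    print(f"held out {held_out}; {len(train_idx)} training rows")
```

Neighboring tracts often share similar heat and canopy conditions, so grouping by a larger spatial block (such as a county or a grid cell) may give a stricter test than grouping by single tract. Which unit to use is a judgment call for your study area.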
What Makes This Competitive
A competitive version of this project goes beyond a simple correlation plot. You can compare several models, test whether tree canopy adds predictive value, and use spatial validation so your results hold up in new areas. Strong entries also check fairness, for example, whether the model performs differently across income or race-linked neighborhood groups. If you can explain where the model fails, and why, your project becomes much stronger.
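A simple starting point for the fairness check mentioned above is to compute the prediction error separately for each neighborhood group. The sketch below uses mean absolute error; the observed rates, predicted rates, and the "low"/"high" income labels are all made up for illustration, and `mae_by_group` is a hypothetical helper name.

```python
from collections import defaultdict


def mae_by_group(y_true, y_pred, groups):
    """Mean absolute error of the predictions within each group label."""
    errors = defaultdict(list)
    for truth, pred, group in zip(y_true, y_pred, groups):
        errors[group].append(abs(truth - pred))
    return {group: sum(vals) / len(vals) for group, vals in errors.items()}


# Made-up visit rates per 10,000 residents and made-up group labels.
observed = [40.0, 35.0, 12.0, 10.0, 30.0, 8.0]
predicted = [30.0, 28.0, 11.0, 12.0, 24.0, 9.0]
income_group = ["low", "low", "high", "high", "low", "high"]

result = mae_by_group(observed, predicted, income_group)
for group, mae in sorted(result.items()):
    print(f"{group}-income tracts: MAE = {mae:.2f}")
```

In this invented example the error is larger in the low-income group, which is the kind of gap a judge may ask you to explain. Higher-rate groups tend to show larger absolute errors, so it can help to report a relative error alongside this one.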
Project Variations
- Use heat-related 911 or ambulance calls instead of emergency department visits to see whether the same predictors work across health data sources.
- Replace tree-canopy data with impervious surface or albedo to test whether shade or surface reflectivity matters more.
- Add air pollution or humidity data to see whether a multi-factor model predicts heat illness better than land-surface temperature alone.
Learn More
- NASA Earthdata: Search for land-surface temperature datasets and reading guides for satellite temperature products.
- USGS National Land Cover Database: Find tree canopy and land-cover layers for the United States.
- CDC Environmental Public Health Tracking: Explore public health data and heat-related health topic pages.
- PubMed: Search for review articles on urban heat islands, tree canopy, and heat illness.
- QGIS Documentation: Learn how to map tract-level data and make heat-risk layers.
- MIT OpenCourseWare, Introduction to Machine Learning: Use the free course materials to review model training, validation, and bias.
Computational Biology and Bioinformatics Category Guide
How to Do Real Computational Biology Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
