Wearable Steps and Hypertension Causality

ISEF Category: Biomedical Engineering

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Other · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months

The Hook

A simple step count can hide a big health story. Two people can walk the same number of steps, yet have very different blood pressure risk. That makes this topic a great test of causal inference, the math of asking what actually causes what. You can turn public health data into a real research question.

What Is It?

This project asks a direct question: does walking more actually cause a lower risk of hypertension? Hypertension means high blood pressure, and it raises the risk of heart disease and stroke. Step counts from wearables are a convenient, objective way to measure daily movement, but they do not prove cause by themselves. People who walk more may also sleep better, eat differently, or differ in age, income, or health history.

That is where causal inference comes in. Think of it like trying to figure out whether an umbrella caused the rain, or whether both showed up because of a storm. Tools like DoWhy and EconML help you model that storm by adjusting for other factors, testing assumptions, and checking how much hidden bias would be needed to change the result. You can use public datasets or published summary statistics to explore whether higher step counts have a dose-response effect, which means the risk changes in a graded way as steps rise.
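
To see the "storm" at work, here is a minimal simulation sketch. Every number in it is invented for illustration, age stands in for the confounder, and a plain least-squares fit stands in for the adjustment step that DoWhy or EconML would handle in a real analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (all numbers are made up).
age = rng.uniform(20, 80, n)                       # the "storm": affects both variables
steps_k = np.clip(12 - 0.08 * age + rng.normal(0, 2, n), 0, None)  # thousands of steps/day
true_effect = -0.01                                # risk change per extra 1,000 steps
risk = np.clip(0.25 + 0.006 * (age - 20) + true_effect * steps_k, 0, 1)
htn = rng.binomial(1, risk).astype(float)          # 1 = hypertension, 0 = none


def ols_slope(y, *covariates):
    """Least-squares coefficient on the first covariate (intercept included)."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


naive = ols_slope(htn, steps_k)            # ignores age
adjusted = ols_slope(htn, steps_k, age)    # holds age fixed

print(f"true effect per 1,000 steps: {true_effect:+.3f}")
print(f"naive estimate:              {naive:+.3f}")
print(f"age-adjusted estimate:       {adjusted:+.3f}")
```

In this toy setup the naive estimate overstates the benefit of walking, because older people both walk less and have higher risk. Adjusting for age moves the estimate back toward the value built into the simulation. Real data will have many confounders, some of them unmeasured.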

Why This Is a Good Topic

This is a strong science fair topic because it is testable with public data, yet it asks a real health question. You do not need a wet lab, but you still get to use serious research tools like causal graphs, regression, and sensitivity analysis. The project connects to a problem millions of people care about: blood pressure control. You will also learn how to separate correlation from cause, a skill that matters in medicine, policy, and tech.

Research Questions

How does daily step count relate to incident hypertension after adjusting for age, sex, BMI, and smoking status?
What is the effect of moving from low step count to moderate step count on estimated hypertension risk?
Does the estimated step count effect stay similar when you change the confounders included in the causal model?
To what extent does the dose-response curve suggest a threshold, rather than a straight-line pattern, for hypertension risk?
Which sensitivity analysis assumptions would have to fail to erase the estimated protective effect of higher step counts?
How does the estimated effect differ between NHANES-based models and UK Biobank summary-based models?
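
The low-versus-moderate question above can be sketched with stratified standardization, one simple way to adjust for a single categorical confounder. The data below are simulated, and the variable names (older, moderate) and all risks are assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30_000

# Hypothetical population: one binary confounder, one binary exposure.
older = rng.binomial(1, 0.5, n)
# Older people are less likely to reach the "moderate" step band.
moderate = rng.binomial(1, np.where(older == 1, 0.3, 0.7))
# Invented risks: moderate steps lower risk by 0.05 within each age group.
risk = np.where(older == 1, 0.40, 0.15) - 0.05 * moderate
htn = rng.binomial(1, risk)


def standardized_risk_difference(outcome, exposure, stratum):
    """Average the within-stratum risk differences, weighted by stratum size."""
    total = 0.0
    for s in np.unique(stratum):
        in_s = stratum == s
        r1 = outcome[in_s & (exposure == 1)].mean()
        r0 = outcome[in_s & (exposure == 0)].mean()
        total += in_s.mean() * (r1 - r0)
    return total


crude = htn[moderate == 1].mean() - htn[moderate == 0].mean()
standardized = standardized_risk_difference(htn, moderate, older)

print(f"crude risk difference:        {crude:+.3f}")
print(f"standardized risk difference: {standardized:+.3f}")
```

The crude difference mixes the effect of steps with the effect of age, while the standardized difference compares like with like inside each age group. With several confounders, a regression-based or DoWhy-based adjustment is usually more practical than stratifying by hand.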

Basic Materials

Laptop with enough memory to run Python and Jupyter notebooks.
Python 3 with pandas, numpy, scipy, matplotlib, seaborn, DoWhy, and EconML.
Jupyter Notebook or Google Colab for analysis and notes.
Public NHANES data files or documented extract files.
UK Biobank published summary statistics or paper tables.
Spreadsheet software for tracking variables, exclusions, and model runs.
Notebook or lab journal for analysis decisions and version notes.

Advanced Materials

Laptop or workstation with more RAM for larger files and repeated simulations.
Python 3 with pandas, numpy, statsmodels, DoWhy, EconML, scikit-learn, and matplotlib.
Access to the NHANES raw survey files and codebooks.
Access to UK Biobank summary statistics tables or published effect estimates.
A causal diagram tool such as DAGitty for drawing confounder models.
Optional access to a university computing cluster for bootstrap or permutation runs.
Reference manager for tracking papers and methods.

Software & Tools

Python: Runs data cleaning, causal models, and sensitivity tests.
Jupyter Notebook: Keeps code, plots, and written reasoning in one place.
DoWhy: Helps you define causal assumptions and test how stable your result is.
EconML: Estimates heterogeneous treatment effects, meaning how the effect of steps differs across subgroups, and supports dose-response style models.
DAGitty: Lets you draw a causal graph and think through confounders before modeling.
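
Before opening DAGitty, it can help to write the graph down as plain data and sort variables into rough roles. The sketch below uses simplified rules of thumb, not DAGitty's actual algorithm, and the edges are hypothetical assumptions that you would replace with your own justified graph.

```python
# Hypothetical causal graph: each key points to the variables it causes.
graph = {
    "age": ["steps", "hypertension"],
    "smoking": ["steps", "hypertension"],
    "weather": ["steps"],
    "steps": ["bmi", "hypertension", "clinic_visit"],
    "bmi": ["hypertension"],
    "hypertension": ["clinic_visit"],
}


def descendants(graph, start, blocked=None):
    """All nodes reachable from start by following arrows, never entering blocked."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child in graph.get(node, []):
            if child != blocked and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def classify(graph, exposure, outcome):
    """Label each variable with a rough role relative to exposure and outcome."""
    nodes = set(graph) | {c for kids in graph.values() for c in kids}
    roles = {}
    for v in sorted(nodes - {exposure, outcome}):
        from_exposure = v in descendants(graph, exposure)
        from_outcome = v in descendants(graph, outcome)
        causes_exposure = exposure in descendants(graph, v)
        # Path to the outcome that does not pass through the exposure.
        causes_outcome = outcome in descendants(graph, v, blocked=exposure)
        if from_exposure and from_outcome:
            roles[v] = "collider"      # common effect: do not adjust
        elif from_exposure and outcome in descendants(graph, v):
            roles[v] = "mediator"      # on the causal path: adjust only on purpose
        elif causes_exposure and causes_outcome:
            roles[v] = "confounder"    # common cause: adjust
        else:
            roles[v] = "other"
    return roles


roles = classify(graph, "steps", "hypertension")
for name, role in roles.items():
    print(f"{name}: {role}")
```

With these invented edges, age and smoking come out as confounders, bmi as a mediator, clinic_visit as a collider, and weather as other. Whether BMI is really a mediator or a confounder in your project is a judgment call that the graph forces you to state openly.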

Experiment Steps

Define the exact exposure, outcome, and time window you will study.
Draw a causal graph that lists likely confounders, mediators, and colliders.
Choose one primary dataset and one external comparison dataset or summary source.
Build a clean analysis table with one row per person or one row per published estimate.
Fit a baseline causal model, then test how the result changes when you alter confounders, functional form, and exclusions.
Run sensitivity checks and compare whether the dose-response pattern survives hidden-bias stress tests.
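
One common summary for the hidden-bias step is the E-value proposed by VanderWeele and Ding. It asks how strongly an unmeasured confounder would need to be tied to both steps and hypertension, on the risk ratio scale, to fully explain away an observed risk ratio. The sketch below is a minimal version, and the example numbers are hypothetical, not results from any dataset.

```python
import math


def e_value(rr):
    """E-value for a risk ratio; protective ratios (< 1) are inverted first."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


def e_value_for_ci(rr, ci_low, ci_high):
    """E-value for the confidence limit closest to 1 (1.0 if the interval covers 1)."""
    if ci_low <= 1.0 <= ci_high:
        return 1.0
    limit = ci_high if rr < 1 else ci_low
    return e_value(limit)


# Hypothetical result: risk ratio 0.80 for higher steps, 95% CI 0.68 to 0.95.
rr, lo, hi = 0.80, 0.68, 0.95
print(f"E-value for the point estimate: {e_value(rr):.2f}")
print(f"E-value for the CI limit:       {e_value_for_ci(rr, lo, hi):.2f}")
```

A small E-value means a fairly weak hidden confounder could erase the finding, so it should be reported next to the estimate rather than used as a pass or fail test. The formula applies to risk ratios, so odds ratios for a common outcome like hypertension would need converting first.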

Common Pitfalls

Treating step count as pure exercise, which ignores diet, sleep, age, and baseline health differences that can confound the result.
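Mixing incident hypertension (new cases) with prevalent hypertension (existing cases), which blurs the outcome. A cross-sectional survey such as NHANES measures each person once, so on its own it mostly captures existing cases.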
Using a causal model without a clear causal graph, which makes it easy to adjust for the wrong variables.
Comparing NHANES and UK Biobank estimates without checking whether the populations and measurement methods match.
Treating a result that survives a limited sensitivity analysis as proof of causation, when hidden confounding may still explain the pattern.
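
Once you have checked that two sources define exposure and outcome in comparable ways, a simple numerical comparison of their estimates can be sketched as below. It assumes both estimates are ratios with 95% confidence intervals and that the two samples are independent. The function name and the example numbers are hypothetical.

```python
import math


def compare_ratio_estimates(rr_a, ci_a, rr_b, ci_b):
    """z-test for the difference between two independent ratio estimates.

    Works on the log scale and recovers each standard error from the
    width of the reported 95% confidence interval.
    """
    def log_se(ci):
        low, high = ci
        return (math.log(high) - math.log(low)) / (2 * 1.96)

    diff = math.log(rr_a) - math.log(rr_b)
    se = math.sqrt(log_se(ci_a) ** 2 + log_se(ci_b) ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return diff, se, z, p


# Hypothetical estimates from two sources (not real results).
diff, se, z, p = compare_ratio_estimates(0.80, (0.68, 0.95), 0.85, (0.78, 0.93))
print(f"log-ratio difference = {diff:+.3f}, z = {z:+.2f}, p = {p:.2f}")
```

A large p-value here only says the two numbers are statistically compatible. It does not show that the populations, step-measurement devices, or hypertension definitions match, which still has to be argued from the documentation.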

What Makes This Competitive

A stronger version of this project goes beyond a single regression. You would build a clear causal graph, justify every confounder, and test several model forms for dose-response shape. You would also show how sensitive the result is to unmeasured confounding, then compare whether independent datasets point the same way. That kind of careful reasoning makes the project feel like real epidemiology, not just data plotting.

Project Variations

Use self-reported physical activity instead of wearable step counts to compare how measurement quality changes the causal estimate.
Focus on a subgroup, such as teens, adults under forty, or older adults, to see whether the step-hypertension link changes with age.
Compare a linear dose-response model with a threshold model to test whether extra steps help most at low activity levels.
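
The linear-versus-threshold comparison can be sketched as follows. The data are simulated with a built-in threshold at 8,000 steps, the models are simple least-squares fits on a 0/1 outcome, and the knot is chosen by grid search. All of these are simplifying assumptions; a real analysis would more likely use logistic regression or splines.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Hypothetical truth: risk falls until 8,000 steps, then levels off.
steps_k = rng.uniform(1, 15, n)                      # thousands of steps/day
risk = 0.45 - 0.04 * np.minimum(steps_k, 8.0)
htn = rng.binomial(1, risk).astype(float)


def fit_rss(X, y):
    """Least-squares fit; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)


def aic(rss, n_obs, n_params):
    """Gaussian AIC up to an additive constant."""
    return n_obs * np.log(rss / n_obs) + 2 * n_params


ones = np.ones(n)

# Model 1: straight line in steps.
_, rss_linear = fit_rss(np.column_stack([ones, steps_k]), htn)

# Model 2: slope up to a knot, flat afterwards; try each candidate knot.
fits = []
for knot in np.arange(3.0, 13.5, 0.5):
    beta, rss = fit_rss(np.column_stack([ones, np.minimum(steps_k, knot)]), htn)
    fits.append((rss, float(knot), float(beta[1])))
rss_threshold, best_knot, slope_below = min(fits)

aic_linear = aic(rss_linear, n, 2)
aic_threshold = aic(rss_threshold, n, 3)             # extra parameter: the knot

print(f"AIC linear:    {aic_linear:.1f}")
print(f"AIC threshold: {aic_threshold:.1f} (knot near {best_knot:.1f}k steps)")
```

A lower AIC for the threshold model is evidence that the curve flattens, not proof of a biological cutoff. On real data, the apparent knot can shift depending on which confounders are included, so it is worth repeating this comparison inside the adjusted model.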

Learn More

NHANES: Search the CDC NHANES site for questionnaires, examination files, and codebooks that support public health analysis.
UK Biobank: Search published methods papers and summary statistic releases to find accessible data descriptions and outcomes.
Causal Inference: What If: A free online textbook by Miguel Hernán and James Robins, useful for building causal thinking. Search for the title to find the free copy hosted by Harvard.
DoWhy documentation: Open-source guides and examples for causal effect estimation, available through the DoWhy project pages and GitHub.
EconML documentation: Examples for treatment effect and heterogeneity modeling, available through the EconML project pages and GitHub.
DAGitty: A free web tool for drawing causal diagrams and checking adjustment sets.

Biomedical Engineering Category Guide

How to Do Real Biomedical Engineering Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →