Predicting Voter Registration Spikes From Online Signals
ISEF Category: Behavioral and Social Sciences
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Sociology and Anthropology · Difficulty: Intermediate · Setup: Home Setup · Time: 1 to 2 Months
The Hook
A county can get louder online before it gets louder at the ballot box. Search and pageview data can act like a weather alert for civic attention. You can test whether those digital traces show up before voter-registration surges in official state records.
What Is It?
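Google Trends shows how often people search a term relative to all searches in a chosen place and time window, scaled from 0 to 100. Wikipedia pageviews show how many times people load a page. If interest in voting rises in an area, those online traces may rise before official registration numbers do. Keep in mind that Trends reports interest by country, state, metro area, or city rather than by county, and per-article Wikipedia pageview counts are not broken down by local geography, so a county-level comparison needs a plan for mapping each county to the nearest available Trends region.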
Your project tests whether those signals work like a leading indicator, which means a clue that appears before the final outcome. You then compare them with state Secretary of State records, which give you the official registration counts to check against. The goal is not to guess who will vote, but to see whether public online behavior lines up with later civic action.
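To make the idea of a leading indicator concrete, here is a minimal sketch that shifts an online signal forward by a few weeks and checks how well it lines up with later registrations. The weekly numbers are made up for illustration, and the function names (`pearson`, `lagged_correlation`, `scan_lags`) are hypothetical helpers, not part of any library. In a real project you would likely do the same thing with pandas.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def lagged_correlation(signal, outcome, lag):
    """Correlate the signal at week t with the outcome at week t + lag."""
    if lag == 0:
        return pearson(signal, outcome)
    return pearson(signal[:-lag], outcome[lag:])

def scan_lags(signal, outcome, max_lag):
    """Return {lag: correlation} for every lag from 0 to max_lag weeks."""
    return {lag: lagged_correlation(signal, outcome, lag)
            for lag in range(max_lag + 1)}

# Made-up weekly data: registrations echo searches two weeks later.
searches      = [10, 12, 11, 30, 45, 20, 12, 11, 10, 12]
registrations = [100, 100, 100, 120, 110, 300, 450, 200, 120, 110]

results = scan_lags(searches, registrations, max_lag=4)
best = max(results, key=results.get)
for lag, r in results.items():
    print(f"lag {lag} weeks: r = {r:.2f}")
print("strongest lag:", best)
```

Because the sample data were built with a two-week delay, the scan picks lag 2. Real data will be noisier, and a high correlation at one lag is a starting point for testing, not evidence of prediction by itself.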
Why This Is a Good Topic
This is a strong science fair topic because you can measure it with public data, test it with clear statistics, and connect it to a real problem: finding early signs of civic participation. You can learn data cleaning, time-series thinking, and basic model checking without a wet lab. The question is narrow enough for a fair, but rich enough to support a serious analysis.
Research Questions
- How does county-level Google Trends interest in voting terms change before a voter-registration surge?
- How strongly are Wikipedia pageview spikes associated with the size of the next county-level registration jump?
- Does combining Trends and pageviews predict official registration records better than either signal alone?
- To what extent does the best lag between online attention and registration differ across election years?
- Which civic-interest pages or search terms track registration changes most closely at the county level?
- Does normalization by county population improve the match between online attention and registration surges?
Basic Materials
- Laptop with internet access.
- Spreadsheet software such as Google Sheets or Excel.
- Python with pandas, scipy, statsmodels, and matplotlib.
- Access to Google Trends.
- Access to Wikipedia pageview data or the Wikimedia Pageviews API.
- County-level voter-registration records from state Secretary of State websites.
- U.S. Census county population data.
- County FIPS code crosswalk.
Advanced Materials
- Geocoded voter-file extract with registration dates.
- Historical county shapefiles and boundary crosswalks.
- Python or R environment with statsmodels, scikit-learn, or lme4.
- Access to a compute server or university workstation for repeated model runs.
- Archived civic-interest search logs or an API pipeline for multiple keywords.
Software & Tools
- Google Trends: Exports relative search interest so you can compare attention across time and place.
- Wikimedia Pageviews API: Pulls article traffic counts so you can compare reading spikes with registration shifts.
- Python: Cleans county-level data, aligns dates, and runs lag and regression models.
- Google Sheets: Helps with quick checks, charts, and early comparisons before you code.
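As a starting point for the Wikimedia Pageviews API, the sketch below builds a request URL for daily per-article counts and parses the JSON that comes back. The helper names (`pageviews_url`, `parse_items`) and the article title are illustrative choices, and the sample payload is made up to mirror the shape of a response, so check the current API documentation before relying on any detail. The sketch does not make a network request; when you do, Wikimedia asks clients to send a descriptive User-Agent header.

```python
import json
from datetime import datetime
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"

def pageviews_url(article, start, end, project="en.wikipedia"):
    """Build a daily per-article URL; start and end are YYYYMMDD strings."""
    title = quote(article.replace(" ", "_"), safe="")
    # "all-access" covers desktop and mobile; "user" excludes known bots.
    return f"{BASE}/{project}/all-access/user/{title}/daily/{start}/{end}"

def parse_items(payload):
    """Turn the API's JSON text into a list of (date, views) pairs."""
    items = json.loads(payload)["items"]
    return [(datetime.strptime(item["timestamp"], "%Y%m%d%H").date(),
             item["views"])
            for item in items]

# Made-up payload that mimics the response shape (not real traffic counts).
SAMPLE = """{"items": [
  {"article": "Voter_registration", "timestamp": "2020090100", "views": 1200},
  {"article": "Voter_registration", "timestamp": "2020090200", "views": 1500}
]}"""

print(pageviews_url("Voter registration", "20200901", "20200930"))
rows = parse_items(SAMPLE)
for day, views in rows:
    print(day.isoformat(), views)
```

Keeping URL building and parsing in separate functions lets you test the parsing on saved responses, which is useful when you rerun the analysis later without downloading everything again.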
Experiment Steps
- Define the exact registration spike you will measure and the county-level time window around it.
- Pick a small set of search terms and Wikipedia pages that match civic attention, then keep the list fixed.
- Decide how you will line up online signals and official records, including the lag range you will test.
- Build a baseline model with simple predictors so you can compare the online signals fairly.
- Plan holdout tests, sensitivity checks, and a rule for handling counties with thin or missing data.
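The first step above asks for an exact definition of a registration spike. One possible rule, shown here as a sketch rather than a recommended standard, flags a week when new registrations exceed the mean of the preceding weeks by more than a chosen number of standard deviations. The function name `find_spikes`, the four-week window, the threshold of 2, and the weekly counts are all assumptions made for illustration.

```python
from statistics import mean, stdev

def find_spikes(counts, window=4, k=2.0):
    """Return indexes of weeks whose count exceeds the trailing
    mean + k * standard deviation of the previous `window` weeks."""
    spikes = []
    for t in range(window, len(counts)):
        history = counts[t - window:t]
        threshold = mean(history) + k * stdev(history)
        if counts[t] > threshold:
            spikes.append(t)
    return spikes

# Made-up weekly new-registration counts for one county.
weekly_new_registrations = [100, 104, 98, 102, 101, 99, 250, 240, 103, 100]

spike_weeks = find_spikes(weekly_new_registrations)
print("spike weeks:", spike_weeks)
```

With these numbers only week 6 is flagged. Week 7 is still high, but the trailing window now contains the surge itself, so this rule marks the onset of a spike rather than its full duration. Whatever rule you choose, write it down before looking at the online signals and keep it fixed.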
Common Pitfalls
- Comparing county-level online signals with state-level registration totals, which hides the signal you want to test.
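- Using raw Trends values without a normalization plan. Trends scales each request from 0 to 100 on its own, so values pulled in separate requests are not directly comparable across places or time windows.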
- Ignoring county name and FIPS mismatches, which can join the wrong record sets.
- Fitting the same lag to every year, which can miss election-cycle differences.
- Treating one strong correlation as proof, which ignores confounders like registration drives and news events.
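Two of the pitfalls above, FIPS mismatches and missing population normalization, can be guarded against in a few lines. This sketch restores leading zeros that spreadsheets often drop from county FIPS codes, joins registration counts to population, and reports any county that failed to match. The helper names are hypothetical, the counts and populations are made-up round numbers, and code 48999 is a deliberately fake entry to show what an unmatched record looks like.

```python
def normalize_fips(code):
    """Return a 5-character county FIPS string with leading zeros restored."""
    digits = str(code).strip()
    if not digits.isdigit() or len(digits) > 5:
        raise ValueError(f"not a county FIPS code: {code!r}")
    return digits.zfill(5)

def registrations_per_1000(registrations, population):
    """Join on normalized FIPS; return (rates per 1,000 residents, unmatched codes)."""
    pop_by_fips = {normalize_fips(k): v for k, v in population.items()}
    rates, unmatched = {}, []
    for raw_code, count in registrations.items():
        fips = normalize_fips(raw_code)
        if fips in pop_by_fips:
            rates[fips] = 1000 * count / pop_by_fips[fips]
        else:
            unmatched.append(fips)
    return rates, unmatched

# Made-up inputs: codes arrive as integers or strings missing a leading zero.
new_registrations = {1001: 120, "6037": 45000, "48999": 10}
population = {"01001": 60000, "06037": 10000000}

rates, unmatched = registrations_per_1000(new_registrations, population)
print(rates)
print("unmatched:", unmatched)
```

Printing the unmatched list after every join is a cheap habit that catches silent data loss before it reaches your model.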
What Makes This Competitive
A stronger version asks whether online attention predicts a later registration surge better than simple baselines like past registration, population, or election timing. You can raise the level by testing multiple lags, holding some counties or election years out, and reporting forecast error, not just correlation. A sharp comparison between search terms and Wikipedia pages can also show which signal carries more information and which one only mirrors news coverage.
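The comparison described above can be sketched as a small holdout test: fit a one-variable line on early weeks, forecast the held-out weeks, and compare mean absolute error against a simple baseline that predicts next week's registrations from last week's. Everything here is an assumption for illustration, including the data, the two-week lag, the three-week holdout, and the helper names. A real project would use richer baselines and a library such as statsmodels.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def holdout_comparison(signal, outcome, lag, n_test):
    """Compare a lagged-signal model with a last-week baseline on held-out weeks."""
    if lag < 1:
        raise ValueError("lag must be at least 1 week")
    x = signal[:-lag]          # signal at week t
    y = outcome[lag:]          # outcome at week t + lag
    prev = outcome[lag - 1:-1] # outcome one week before each target
    split = len(y) - n_test
    slope, intercept = fit_line(x[:split], y[:split])  # fit on training weeks only
    model_pred = [slope * v + intercept for v in x[split:]]
    return {"model_mae": mae(y[split:], model_pred),
            "baseline_mae": mae(y[split:], prev[split:])}

# Made-up weekly data: registrations roughly follow searches two weeks later.
searches      = [10, 12, 11, 30, 45, 20, 12, 11, 10, 25, 40, 18]
registrations = [100, 100, 105, 118, 112, 305, 445, 205, 118, 112, 97, 255]

result = holdout_comparison(searches, registrations, lag=2, n_test=3)
print(f"model MAE:    {result['model_mae']:.1f}")
print(f"baseline MAE: {result['baseline_mae']:.1f}")
```

The model wins here only because the sample data were built that way. What matters is the structure: the test weeks never touch the fitting step, and the result is reported as forecast error next to a baseline instead of as a single correlation.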
Project Variations
- Use county-level turnout spikes instead of registration surges, then see whether the same online signals still lead.
- Compare voting-related searches with issue-based searches, such as school funding or abortion, to see which topic better predicts civic action.
- Test one state at a time so you can compare election law and local media effects across different settings.
Learn More
- Google Trends Help Center: Explains how Trends data are scaled and exported. Available on the Google Trends help pages.
- Wikimedia REST API Documentation: Shows how to pull Wikipedia pageview counts. Available in the Wikimedia Foundation developer docs.
- U.S. Census Bureau, American Community Survey: Gives county population and demographic controls. Available on the Census website.
- U.S. Election Assistance Commission: Provides election and registration background reports. Available on the EAC website.
- OpenIntro Statistics: A free textbook covering correlation, regression, and sampling. Available on the OpenIntro site.
- MIT OpenCourseWare, Introduction to Probability and Statistics: Offers free lessons on statistical thinking. Available on MIT OpenCourseWare.
Behavioral and Social Sciences Category Guide
How to Do Real Behavioral and Social Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
