State of the Union Moral Language Trends

ISEF Category: Behavioral and Social Sciences

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Sociology and Anthropology · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months

The Hook

A presidential speech can sound calm, but the word choice can still swing with the political mood. If you track moral language across 50 years of State of the Union addresses, you can turn a long speech archive into a trend line. That lets you ask a sharper question: does the way leaders talk about right and wrong move with national polarization?

What Is It?

BERTopic is a topic-modeling method. Think of it like sorting a messy stack of notes into piles that share similar words and ideas. Instead of reading every speech one by one, you let the model group passages with similar meaning and summarize each group by its most distinctive words. You then inspect whether any group reflects a moral foundation such as harm, fairness, loyalty, authority, or purity.
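A minimal sketch of that grouping step, assuming the `bertopic` package is installed and each speech is available as one plain-text string. The helper names `split_into_passages` and `fit_topic_model`, the 20-word cutoff, and the `min_topic_size` value are illustrative choices, not settings this guide prescribes.

```python
import re


def split_into_passages(speech_text, min_words=20):
    """Split one speech into paragraph-sized passages, dropping very short ones."""
    paragraphs = re.split(r"\n\s*\n", speech_text)
    passages = []
    for paragraph in paragraphs:
        cleaned = " ".join(paragraph.split())  # collapse line breaks and extra spaces
        if len(cleaned.split()) >= min_words:
            passages.append(cleaned)
    return passages


def fit_topic_model(passages, min_topic_size=10):
    """Fit BERTopic on a list of passages; return the model and one topic id per passage."""
    # Imported here so split_into_passages still works without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic(min_topic_size=min_topic_size)
    topics, _probabilities = topic_model.fit_transform(passages)
    return topic_model, topics
```

After fitting, `topic_model.get_topic_info()` returns a table of topics with their sizes and top words, which is the place to start when you read the piles by eye. BERTopic needs a few hundred passages or more to form stable topics, so fit it on passages from the whole archive rather than on one speech.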

You then count how much each speech leans on those moral themes and compare that score across decades. Gallup polarization indices give you a second line of data. Put the two lines on the same timeline, and you can test whether moral rhetoric rises when the country feels more divided, or whether the two move in different directions.
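One simple way to turn "how much a speech leans on moral themes" into a number is the share of words that match a moral-foundations word list. The sketch below assumes that approach. The word lists in `TOY_LEXICON` are placeholders made up for illustration and are NOT the published Moral Foundations Dictionary, so swap in the real dictionary before drawing any conclusions.

```python
import re
from collections import Counter

# Placeholder word lists for illustration only; not the published dictionary.
TOY_LEXICON = {
    "harm": {"harm", "protect", "suffer", "safe"},
    "fairness": {"fair", "equal", "justice", "rights"},
    "loyalty": {"nation", "together", "loyal", "united"},
    "authority": {"law", "order", "duty", "respect"},
    "purity": {"sacred", "pure", "faith", "clean"},
}


def moral_shares(text, lexicon=TOY_LEXICON):
    """Return, for each foundation, the share of tokens that match its word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for foundation, words in lexicon.items():
            if token in words:
                counts[foundation] += 1
    # Dividing by total tokens keeps long and short speeches comparable.
    return {name: (counts[name] / total if total else 0.0) for name in lexicon}
```

Calling `moral_shares(speech_text)` once per speech gives one row per year that you can line up against the polarization series.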

Why This Is a Good Topic

This makes a strong science fair topic because the question is clear, the data are public, and the analysis sits at the border of language and social behavior. You can build a real measurement pipeline, test whether your result holds up under different model settings, and connect the finding to a current issue people care about. A good version of this project teaches you text cleaning, topic modeling, and correlation analysis without needing a wet lab.

Research Questions

How does the share of moral-foundation language in State of the Union addresses change across presidential administrations?
What is the effect of party control on the amount of harm, fairness, loyalty, authority, and purity language in each speech?
Does the yearly moral-foundation score correlate with Gallup polarization indices?
To what extent do election years show larger shifts in moral rhetoric than non-election years?
Which moral foundation has the strongest lagged link with changes in polarization?
How does using BERTopic instead of a keyword count change the trend you measure?
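The correlation and lag questions above can be explored with a small helper like the one below. It is a sketch that assumes both series are stored as Python dictionaries keyed by year, and the names `pearson_r` and `lagged_r` are hypothetical. A lag of 0 gives the plain same-year correlation, and a positive lag pairs the speech score in year t with polarization in year t + lag.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_r(scores_by_year, polarization_by_year, lag=0):
    """Correlate the score in year t with polarization in year t + lag."""
    pairs = [
        (scores_by_year[year], polarization_by_year[year + lag])
        for year in sorted(scores_by_year)
        if year + lag in polarization_by_year
    ]
    if len(pairs) < 2:
        raise ValueError("fewer than 2 overlapping years at this lag")
    xs, ys = zip(*pairs)
    return pearson_r(xs, ys)
```

Running `lagged_r` for each foundation at several lags, for example 0 through 4 years, gives a small table to compare. With only a few dozen yearly points, treat any single large value with caution and report how many years went into each estimate.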

Basic Materials

Laptop with 16 GB RAM or more.
Python installed with Jupyter Notebook.
Internet access to download transcripts and survey data.
State of the Union transcript archive from the American Presidency Project.
Spreadsheet software for checking year-by-year outputs.
Git for version control.

Advanced Materials

High-RAM workstation or shared computing server.
Python environment with BERTopic, sentence-transformers, pandas, scikit-learn, and scipy.
R with tidyverse and lmtest for alternate statistical checks.
Hand-coding sheet for validating a sample of passages by eye.
Zotero for managing papers on moral foundations, polarization, and text analysis.

Software & Tools

Python: Cleans transcripts, runs topic models, and computes correlations.
Jupyter Notebook: Keeps code, notes, and figures in one place.
BERTopic: Groups speech passages into topics and helps you track how they change over time.
pandas: Organizes speech text, topic scores, and polarization data into tables.
matplotlib: Plots year-by-year trends and side-by-side comparisons.

Experiment Steps

Define the speech unit you will score, such as full speeches, sections, or paragraphs.
Decide how you will label moral-foundation language: with a dictionary, manual coding, or both.
Build a first topic model, then check whether the top words match a human reading of the passages.
Choose how to turn topic output into one yearly measure that you can compare across decades.
Match the speech measures to Gallup polarization data and pick the statistical test you will run.
Plan a validation check that compares your main result with an alternate model setting or a hand-coded sample.
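As one possible shape for the "yearly measure" and "match to Gallup data" steps, the sketch below assumes you already have one topic id per passage and a set of topic ids you have hand-labeled as moral. The function names and the choice of "share of passages" as the yearly measure are assumptions, not the only reasonable design.

```python
from collections import defaultdict


def yearly_moral_share(passage_years, passage_topics, moral_topic_ids):
    """Share of each year's passages assigned to a topic you labeled as moral."""
    totals = defaultdict(int)
    moral = defaultdict(int)
    for year, topic in zip(passage_years, passage_topics):
        # BERTopic marks outlier passages with topic -1; they still count
        # in the denominator here, which is a design choice you can change.
        totals[year] += 1
        if topic in moral_topic_ids:
            moral[year] += 1
    return {year: moral[year] / totals[year] for year in sorted(totals)}


def align_on_year(scores_by_year, polarization_by_year):
    """Keep only years present in both series, as (year, score, polarization) rows."""
    shared_years = sorted(set(scores_by_year) & set(polarization_by_year))
    return [
        (year, scores_by_year[year], polarization_by_year[year])
        for year in shared_years
    ]
```

Aligning on shared years before any statistics also guards against comparing a speech from one year with survey data from another. Print the number of rows that survive the alignment so you know how much data the final test rests on.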

Common Pitfalls

Letting one president's speeches dominate the model, which can hide era-level change.
Skipping transcript cleanup, which leaves boilerplate phrases and applause lines in the topics.
Treating topic labels as fixed facts, which can make a loose cluster sound more precise than it is.
Comparing speech scores and Gallup data on mismatched years, which can flatten or shift the pattern.
Using too many model settings without a plan, which makes it hard to explain why the trend changed.
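For the transcript cleanup pitfall, a small regular-expression pass is often enough as a first step. This sketch assumes audience reactions appear as bracketed or parenthesized notes such as "(Applause.)" or "[Laughter]". Check a few raw transcripts first, because the markup in your source may differ.

```python
import re

# Bracketed or parenthesized audience reactions, e.g. "(Applause.)" or "[Laughter]".
AUDIENCE_NOTE = re.compile(
    r"[\(\[]\s*(?:applause|laughter)[^\)\]]*[\)\]]", re.IGNORECASE
)


def clean_transcript(text):
    """Remove audience-reaction notes and tidy whitespace, keeping paragraph breaks."""
    text = AUDIENCE_NOTE.sub(" ", text)
    text = re.sub(r"[ \t]+", " ", text)   # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)  # trim spaces around line breaks
    return text.strip()
```

Keep the raw files untouched and write cleaned copies to a separate folder, so you can rerun the cleanup if you later find another boilerplate pattern.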

What Makes This Competitive

A stronger version of this project does more than plot a trend line. You compare at least two topic-model settings, check your labels by hand on a sample, and test whether any link to polarization still appears after you control for party, year, and speech length. If you add a lag test or a comparison corpus, you move from a simple text summary to a real analysis of how political language shifts with social tension.
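For the hand check, one common way to report agreement between your own labels and the model's labels is Cohen's kappa, which corrects raw agreement for chance. The guide does not name a specific statistic, so treat this as one reasonable option. The label values "moral" and "other" in the usage note are placeholders.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labelings of the same passages."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of the same length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Agreement expected if both raters labeled at random with their own base rates.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

For example, `cohens_kappa(["moral", "moral", "other", "other"], ["moral", "other", "other", "other"])` compares four hand-coded passages against four model labels. Values near 1 mean strong agreement, and values near 0 mean the model agrees with you about as often as chance would.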

Project Variations

Compare moral-foundation language in State of the Union speeches from Democratic and Republican presidents to see whether the pattern is mostly partisan or historical.
Swap in a keyword-based score instead of BERTopic, then test whether the simpler method gives the same trend line.
Add inaugural addresses or party convention speeches as a comparison corpus to see whether State of the Union rhetoric is unusual.
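For the party comparison, a permutation test is one way to ask whether a gap in average moral-language score between two groups of speeches is larger than random shuffling of the party labels would produce. This is a sketch with a hypothetical function name. It treats each speech as independent, which is a simplification because speeches by the same president tend to resemble each other.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the p-value above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

Pass in the per-speech scores for each party, for example `permutation_test(democratic_scores, republican_scores)`, and report both the observed difference and the p-value.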

Learn More

American Presidency Project: Search the site for State of the Union transcripts and use it as your speech source.
Gallup: Look up public articles and methodology notes on polarization trends and public opinion change.
BERTopic documentation: Read the open-source guide for building topic models in Python.
Moral Foundations Dictionary paper: Search Google Scholar for the original dictionary paper and its appendix.
Pew Research Center: Find background reports on political polarization and party sorting.

Behavioral and Social Sciences Category Guide

How to Do Real Behavioral and Social Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

