Chatbot Vaping Cessation Message Study
ISEF Category: Translational Medical Science
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Disease Prevention · Difficulty: Advanced · Setup: Home Setup · Time: Full Year
The Hook
A lot of people ask the internet for help quitting vaping before they ever ask a doctor. That means real quit strategies may already be hiding in posts, comments, and stories. You can turn those words into data, then test which message style makes people more ready to try quitting.
What Is It?
This project starts with a simple idea: people who are trying to quit describe the experience differently than official health advice does. You collect public posts from platforms like TikTok or Reddit, then use natural language processing (NLP), a set of computer methods for finding patterns in text. Your goal is to spot the phrases, themes, and coping tips that show up in successful quit stories.
Think of it like sorting a giant pile of recipe cards. Some cards are just general advice, while others repeat the same useful steps. Topic modeling helps you group similar messages, so you can see what kinds of strategies people mention most. Then you can build a chatbot prompt or message set that uses those real user strategies and compare it with a generic public health message.
The second half of the project tests whether message style changes intent. Each participant in a survey cohort reads one of the message versions and answers questions about whether they would try to quit, seek help, or keep reading. You are not measuring actual quitting here. You are measuring a behavior-related outcome: quit-attempt intent.
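One way to decide who reads which message is to shuffle the participant list and deal people out to the message versions in turn. The sketch below assumes 40 hypothetical participant IDs and two made-up condition names; the function name `assign_conditions` is also just an illustration.

```python
import random
from collections import Counter

def assign_conditions(participant_ids, conditions, seed=42):
    """Randomly assign participants to conditions in near-equal numbers."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Deal shuffled participants to conditions in turn, like dealing cards.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(ids)}

participants = [f"P{n:03d}" for n in range(1, 41)]  # 40 hypothetical IDs
conditions = ["narrative_message", "generic_message"]  # hypothetical labels

assignment = assign_conditions(participants, conditions)
group_sizes = Counter(assignment.values())
print(group_sizes)
```

Recording the seed in your notebook lets a judge or mentor reproduce the exact assignment later.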
Why This Is a Good Topic
This is a strong science fair topic because it blends a real public health problem with data you can analyze and test. You can measure text patterns, compare message groups, and run a simple experiment on attitude or intent. It also connects to nicotine addiction, health communication, and social media, so the topic feels current and useful. A student can realistically learn NLP basics, survey design, and statistics without needing a hospital or wet lab.
Research Questions
- How does message framing in vaping-cessation posts affect quit-attempt intent?
- What is the effect of chatbot prompts built from successful quit narratives versus generic CDC-style messaging on willingness to quit?
- Does topic prevalence differ between posts that describe quitting success and posts that describe relapse?
- To what extent do coping-strategy keywords predict higher self-reported readiness to quit?
- Which narrative themes appear most often in public vaping-cessation discussions?
- How does audience age group affect response to narrative-based versus factual quit messages?
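For the topic-prevalence question in the list above, one possible analysis is a chi-square test on a 2x2 table: does a given theme appear in success posts more often than in relapse posts? The sketch below computes the Pearson chi-square statistic by hand, without a continuity correction. The counts are invented placeholders, and the function name `chi_square_2x2` is an assumption for illustration.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Table layout:          theme present   theme absent
        success posts            a               b
        relapse posts            c               d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the p-value equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Placeholder counts, invented for illustration only.
chi2, p = chi_square_2x2(a=30, b=20, c=15, d=35)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

This shortcut only works for a 2x2 table with no empty row or column. For larger tables or small cell counts, a statistics package is the safer choice.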
Basic Materials
- Laptop with internet access.
- Spreadsheet software such as Google Sheets or Excel.
- Python installed with basic text-analysis libraries.
- Free Reddit API access or approved public dataset access.
- Public post archive or manually collected public posts that follow platform rules.
- Survey platform such as Google Forms or Microsoft Forms.
- Digital notebook for coding message themes.
- Basic statistics calculator or spreadsheet functions.
Advanced Materials
- Laptop with internet access.
- Python with pandas, scikit-learn, spaCy, NLTK, and gensim.
- Jupyter Notebook or Google Colab for reproducible analysis.
- API access to public social media data through approved endpoints.
- Qualtrics or REDCap for survey randomization, if available through school access.
- R or Python stats packages for regression, chi-square tests, and effect size estimates.
- Version control with GitHub for analysis tracking.
- IRB-style consent materials, if your school requires them for survey work.
Software & Tools
- Python: Cleans text, counts words, and runs topic models or simple classifiers.
- Google Colab: Lets you code in a browser without installing heavy software.
- spaCy: Helps you clean posts, split text into sentences and tokens, and find named entities.
- scikit-learn: Supports clustering, classification, and basic model comparison.
- Voyant Tools: Gives quick text summaries and word frequency views for early exploration.
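As a rough illustration of the "cleans text, counts words" step listed for Python above, the sketch below uses only the standard library. The stop-word list is a tiny hand-picked assumption, the two sample posts are invented, and `clean_post` is a hypothetical helper name. In a full project, spaCy or NLTK would likely handle this more thoroughly.

```python
import re
from collections import Counter

# A tiny hand-picked stop-word list; a real project would likely use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "i", "my", "to", "of", "it",
              "is", "was", "for", "in", "on", "me"}

def clean_post(text):
    """Lowercase a post, strip URLs and punctuation, and return content words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return [w for w in text.split() if w not in STOP_WORDS]

# Invented sample posts for illustration only.
sample_posts = [
    "Day 10 without my vape! Gum and water got me through the cravings.",
    "Cravings were the worst in week one. https://example.com/tips",
]

word_counts = Counter(w for post in sample_posts for w in clean_post(post))
print(word_counts.most_common(5))
```

Note that this cleaning step also drops digits and emoji. Decide early whether those carry meaning in your dataset, because the choice affects every later count.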
Experiment Steps
- Define the outcome you will measure, such as quit-attempt intent, message credibility, or willingness to seek help.
- Collect a public text sample and decide clear inclusion rules for posts that count as quit narratives.
- Code the main themes in the narratives, then use NLP to group similar strategies into topics.
- Build two or more message versions, one based on real user strategies and one based on generic public health wording.
- Plan a survey design that randomizes readers to one message version and keeps the comparison fair.
- Choose the statistics you will use to compare groups and decide how you will handle missing or messy responses.
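For the group-comparison step, one option among several is a permutation test on the difference in mean intent scores, paired with Cohen's d as an effect size. The sketch below assumes a 1-to-7 intent rating and uses invented placeholder scores. The function names and the 5000-permutation default are illustrative choices, not a required method.

```python
import random
from statistics import mean, stdev

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels were assigned at random
        diff = mean(pooled[:len(group_a)]) - mean(pooled[len(group_a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator so the p-value is never exactly 0.
    return observed, (extreme + 1) / (n_permutations + 1)

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

# Placeholder 1-to-7 intent ratings, invented for illustration only.
narrative = [5, 6, 4, 7, 5, 6, 5, 4, 6, 5]
generic = [4, 3, 5, 4, 4, 5, 3, 4, 5, 4]

diff, p_value = permutation_test(narrative, generic)
effect = cohens_d(narrative, generic)
print(f"mean difference = {diff:.2f}, p = {p_value:.4f}, d = {effect:.2f}")
```

A permutation test makes few assumptions about how the scores are distributed, which suits short rating scales. If your mentor prefers a t-test or a regression, the same two lists of scores can feed those methods instead.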
Common Pitfalls
- Collecting posts without a clear definition of what counts as a quit narrative, which makes the dataset noisy and hard to defend.
- Mixing posts from very different platforms, which can blur the message style and confound your topic model.
- Treating topic clusters as if they prove causation, when they only show patterns in language.
- Writing survey questions that hint at the answer, which can inflate quit-attempt intent scores.
- Skipping a control message group, which makes it hard to tell whether the chatbot prompt helped at all.
What Makes This Competitive
A stronger version of this project does more than count keywords. It compares multiple message styles, uses a clear coding scheme, and checks whether the results hold across different subgroups. You can raise the level again by testing whether the effect stays the same after controlling for age, prior vaping experience, or message trust. Clean methods and careful statistics matter more than flashy graphics here.
Project Variations
- Focus only on Reddit quit stories and compare them with Reddit relapse stories to see which coping themes separate the two groups.
- Test whether short chatbot replies, long chatbot replies, or bullet-style quit tips produce the highest quit-attempt intent.
- Compare vaping-cessation narratives with smoking-cessation narratives to see whether the same message themes work for both.
Learn More
- PubMed: Search for review articles on nicotine cessation, health communication, and message framing.
- NIH National Cancer Institute: Find plain-language resources on tobacco control and behavior change research.
- CDC Tobacco Information and Prevention Source: Review evidence-based nicotine cessation messaging and youth tobacco facts.
- NIH NIMH Data Archive: See how behavioral research datasets are documented, shared, and handled.
- MIT OpenCourseWare, Introduction to Computational Thinking and Data Science: Use the course materials to build basic text-analysis skills in Python.
- NLTK Book: Read the free online textbook for practical natural language processing methods.
Translational Medical Science Category Guide
How to Do Real Translational Medical Science Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
