AI Chronic Pain Coaching Chatbots

ISEF Category: Translational Medical Science

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Disease Treatment and Therapies · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

A chatbot can sound caring and still give bad advice. In chronic pain, that gap can matter. You can build a model that talks like a coach, then test whether it really uses the skills a therapist would expect. The project is part AI, part psychology, and part careful judging.

What Is It?

This project studies a chatbot that gives coaching support to people with chronic pain. The idea is simple: can a small language model respond in a way that feels supportive, while also following the rules of cognitive behavioral therapy, or CBT? CBT is a therapy style that helps people notice thoughts, feelings, and behaviors, then choose more helpful responses. Think of it like a GPS for coping skills, not a magic cure.

You are not trying to diagnose anyone or replace a clinician. You are testing the quality of the chatbot’s language. That means looking at whether the bot listens, reflects emotion, asks useful follow-up questions, and avoids harmful or off-track advice. You can compare its replies against a rubric, which is a scoring guide, and see how well it matches gold-standard examples.
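One way to keep the rubric honest is to write it down as a small data structure that rejects incomplete or out-of-range score sheets. The sketch below is illustrative only: the three dimension names, the 0 to 4 scale, and the `RubricScore` and `dimension_means` names are assumptions, not part of any standard CBT rubric.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical rubric dimensions, each scored 0 (absent) to 4 (excellent).
DIMENSIONS = ("empathy", "cbt_fidelity", "safety")
MIN_SCORE, MAX_SCORE = 0, 4


@dataclass(frozen=True)
class RubricScore:
    """One rater's scores for one chatbot reply."""
    reply_id: str
    rater_id: str
    scores: dict

    def __post_init__(self):
        # Reject incomplete or out-of-range sheets before they reach analysis.
        missing = set(DIMENSIONS) - set(self.scores)
        if missing:
            raise ValueError(f"missing dimensions: {sorted(missing)}")
        for name in DIMENSIONS:
            value = self.scores[name]
            if not MIN_SCORE <= value <= MAX_SCORE:
                raise ValueError(f"{name}={value} is outside {MIN_SCORE}-{MAX_SCORE}")


def dimension_means(sheets):
    """Average each dimension separately instead of collapsing to one number."""
    return {name: mean(s.scores[name] for s in sheets) for name in DIMENSIONS}
```

Keeping the dimensions separate in the output makes it harder to mistake a warm-sounding reply for a safe or CBT-faithful one.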

Why This Is a Good Topic

This is a strong science fair topic because you can measure it with clear scoring rules. You can change the model, the prompt, the training data, or the style of the input message, then compare how the chatbot responds. The project connects to a real problem, since many people already turn to chat tools for health support. You can also learn how to evaluate AI systems, design rubrics, and analyze rater agreement, which are real research skills.

Research Questions

How does fine-tuning on patient-physician dialog corpora affect empathic-response scores in a chronic-pain coaching chatbot?
What is the effect of adding CBT-style prompt instructions on rubric scores for coping skill guidance?
Does the chatbot maintain CBT fidelity better on short pain statements or on longer, more complex patient messages?
To what extent do blinded raters agree when scoring empathy, warmth, and CBT fidelity in chatbot replies?
Which training set produces the better balance of empathy and CBT fidelity: patient-physician dialogs or generic conversational data?
How does response style change when the chatbot is asked to coach, validate feelings, or give problem-solving advice?
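Several of these questions compare two conditions on the same test prompts, so a paired analysis fits. As a rough sketch, assuming each prompt has one mean rubric score per condition, a percentile bootstrap gives an interval for the average score change. The function name, the default of 2000 resamples, and the seed are illustrative choices, and a mentor may suggest a different test.

```python
import random
from statistics import mean


def paired_bootstrap_ci(baseline, treated, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (treated - baseline).

    Both lists must be in the same prompt order, one score per prompt.
    Returns (mean_difference, lower_bound, upper_bound).
    """
    if len(baseline) != len(treated) or not baseline:
        raise ValueError("need two equal-length, non-empty score lists")
    diffs = [t - b for b, t in zip(baseline, treated)]
    rng = random.Random(seed)  # fixed seed so the analysis is reproducible
    boot_means = sorted(
        mean(rng.choices(diffs, k=len(diffs))) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return mean(diffs), boot_means[lo_idx], boot_means[hi_idx]
```

If the interval excludes zero, the data are consistent with a real score shift between the two conditions. With only a handful of prompts the interval will be wide, which is itself worth reporting.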

Basic Materials

Laptop or desktop computer with enough memory to run or access a small LLM.
Curated text corpus of patient-physician dialogs, with privacy-safe or synthetic data only.
Gold-standard CBT response rubric.
Spreadsheet software for scoring and analysis.
Google Docs or similar writing tool for prompt design and annotation.
Basic statistics tool or script environment for summary analysis.
Blinded rater scoring sheets.
Version-controlled folder structure for models, prompts, and outputs.

Advanced Materials

GPU-access workstation or cloud compute for fine-tuning a small LLM.
Python environment with libraries for model training and evaluation.
Annotated psychotherapy or CBT dialogue dataset, if permitted and de-identified.
Inter-rater reliability toolkit for quantifying scoring agreement, for example with Cohen's kappa or an intraclass correlation.
Secure data storage for any sensitive or human-rated text.
Prompt-testing harness for batch generation.
Jupyter notebook environment for analysis and figures.
Human-subjects (IRB) review, approved before you collect any new rater or user data.
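A prompt-testing harness can be quite small. The sketch below assumes the model is wrapped in a function `generate(system_prompt, user_message)` that returns a string. That wrapper, along with the `run_batch` and `to_csv` names and the column names, is hypothetical. The point is only to show every prompt variant being run against every test message and saved in a spreadsheet-friendly form.

```python
import csv
import io
from itertools import product


def run_batch(generate, system_prompts, test_prompts):
    """Call `generate(system_prompt, user_message)` for every combination.

    `system_prompts` maps a variant name to its instruction text, and
    `test_prompts` maps a prompt ID to a patient-style message.
    """
    rows = []
    for (variant, system_prompt), (prompt_id, message) in product(
        system_prompts.items(), test_prompts.items()
    ):
        rows.append({
            "variant": variant,
            "prompt_id": prompt_id,
            "reply": generate(system_prompt, message),
        })
    return rows


def to_csv(rows):
    """Serialize rows to CSV text that opens in spreadsheet software."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["variant", "prompt_id", "reply"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing the model in as a plain function means the same harness can be dry-run with a stub before any GPU time is spent.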

Software & Tools

Python: Runs the model pipeline, analysis scripts, and scoring workflow.
Jupyter Notebook: Helps you inspect outputs, compare models, and graph rubric scores.
Hugging Face Transformers: Supports small language model fine-tuning and inference.
Pandas: Organizes rater scores, prompt versions, and evaluation tables.

Experiment Steps

Define the exact chatbot behavior you want to test, such as empathy, CBT skill use, or both.
Choose one training or prompting change at a time so you can tell what caused the score shift.
Build a blinded scoring rubric that separates empathy, safety, and CBT fidelity.
Prepare a fixed set of pain-related test prompts that cover different tones, severities, and response needs.
Plan how you will compare model versions with the same raters, the same prompts, and the same scoring scale.
Set up your analysis plan before you generate results, including agreement checks and summary statistics.
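The blinding step above can be sketched as a short script that shuffles replies and replaces model names with opaque codes. The dictionary keys (`prompt_id`, `model`, `text`), the `R001`-style codes, and the function name are assumptions made for illustration.

```python
import random


def build_blinded_sheet(replies, seed=2024):
    """Shuffle replies and swap model labels for opaque codes.

    `replies` is a list of dicts with the hypothetical keys
    "prompt_id", "model", and "text". Returns (sheet, key): raters see
    only `sheet`, and `key` stays private until scoring is finished.
    """
    rng = random.Random(seed)  # seeded so the same sheet can be rebuilt later
    order = list(range(len(replies)))
    rng.shuffle(order)

    sheet, key = [], {}
    for position, index in enumerate(order, start=1):
        reply = replies[index]
        code = f"R{position:03d}"
        # The rater-facing row deliberately omits the model name.
        sheet.append({"code": code, "prompt_id": reply["prompt_id"], "text": reply["text"]})
        key[code] = reply["model"]
    return sheet, key
```

After scoring, the key is joined back to the rater sheets by `code` to recover which model produced each reply.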

Common Pitfalls

Scoring empathy without separating it from CBT fidelity, which makes the results hard to interpret.
Using test prompts that are too similar, which hides weak spots in the chatbot.
Letting raters see which model version produced each reply, which biases the scores.
Training on mixed-quality dialog data, which can teach the chatbot to copy bad coaching habits.
Treating high empathy scores as proof of safety, even when the bot gives misleading or overconfident advice.
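The second pitfall, test prompts that are too similar, can be screened for before any replies are generated. The sketch below uses word-overlap (Jaccard) similarity, which is a crude stand-in for real semantic similarity. The 0.6 threshold and the function names are arbitrary assumptions to tune by hand.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased word set; ignores punctuation and word order."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def near_duplicates(prompts, threshold=0.6):
    """Return (i, j, similarity) for prompt pairs at or above the threshold."""
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(prompts), 2):
        score = jaccard(a, b)
        if score >= threshold:
            flagged.append((i, j, round(score, 2)))
    return flagged
```

Flagged pairs are candidates for rewriting or removal so the test set covers more tones, severities, and response needs.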

What Makes This Competitive

A strong version of this project goes past simple chatbot demos. You would compare multiple training strategies, use blinded raters, and report agreement between raters, not just average scores. You could also test whether the bot stays strong on edge cases, like frustration, hopelessness, or vague pain descriptions. The best projects separate style, safety, and therapy fidelity, then show which design choices improve each one.
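Rater agreement can be reported with a chance-corrected statistic. The sketch below computes a quadratic-weighted Cohen's kappa for two raters on an ordinal scale, so a one-point disagreement counts less than a four-point one. It assumes exactly two raters scored the same replies in the same order. With more raters, a different statistic (for example an intraclass correlation) would be needed. The function name is illustrative.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories):
    """Quadratic-weighted Cohen's kappa for two raters on an ordinal scale.

    `categories` lists every allowed score in order, e.g. [0, 1, 2, 3, 4].
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty score lists")
    if len(categories) < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    unknown = (set(rater_a) | set(rater_b)) - set(index)
    if unknown:
        raise ValueError(f"scores not in categories: {sorted(unknown)}")

    n, k = len(rater_a), len(categories)
    observed = Counter(zip(rater_a, rater_b))
    marg_a, marg_b = Counter(rater_a), Counter(rater_b)

    disagree_obs = disagree_exp = 0.0
    for a in categories:
        for b in categories:
            # Penalty grows with the squared distance between the two scores.
            weight = (index[a] - index[b]) ** 2 / (k - 1) ** 2
            disagree_obs += weight * observed[(a, b)] / n
            disagree_exp += weight * marg_a[a] * marg_b[b] / n**2
    if disagree_exp == 0:
        # Both raters used one identical score throughout; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return 1.0 - disagree_obs / disagree_exp
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance. Computing kappa separately for empathy, warmth, and CBT fidelity shows which dimensions the rubric defines clearly and which need tighter wording.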

Project Variations

Test whether the chatbot performs better on fibromyalgia, migraine, or general chronic pain prompts.
Compare prompt engineering, light fine-tuning, and retrieval-augmented responses for empathy and CBT fidelity.
Measure whether the chatbot responds differently to supportive, angry, or withdrawn patient language.

Learn More

PubMed: Search review articles on CBT for chronic pain, empathy in clinical communication, and digital mental health.
NIH National Center for Complementary and Integrative Health: Read free summaries on chronic pain management and mind-body approaches.
NIMH: Find background material on mental health support, therapy concepts, and digital interventions.
Hugging Face Documentation: Learn how small language models are fine-tuned and evaluated.
MIT OpenCourseWare: Search for free machine learning, natural language processing, and data science course materials.
Cochrane Library abstracts: Look for review summaries on psychotherapy, pain coping, and behavioral interventions.

Translational Medical Science Category Guide

How to Do Real Translational Medical Science Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →