Speech Biomarkers for Early Alzheimer’s Screening

ISEF Category: Biomedical Engineering

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Other  ·  Difficulty: Advanced  ·  Setup: Home Setup  ·  Time: 1 to 2 Months

The Hook

Your voice can carry clues about brain health. Tiny changes in pauses, word choice, and rhythm can show up before a diagnosis does. That makes speech a powerful data source for early screening. You can test that idea with public audio and free tools.

What Is It?

This project asks a simple question with a hard answer: can a computer tell the difference between speech from people with early Alzheimer's and speech from age-matched controls? You are not diagnosing anyone. You are building a classifier, which is a model that sorts recordings into groups based on patterns it learns from data.

Think of speech like a fingerprint made of sound. Some parts are obvious, like pitch and pause length. Other parts are hidden inside a model's embeddings, which are numeric summaries that capture patterns in audio. wav2vec2 is a pretrained speech model that turns sound into those summaries. SHAP is a method that helps you see which features pushed the model toward one class or the other.

The goal is not just accuracy. You also want explanation. A strong project asks whether the model relies on meaningful cues, such as longer pauses, slower rate, or simpler acoustic patterns, instead of random noise.
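To make "longer pauses" concrete, here is a minimal sketch of how pause statistics could be measured from a waveform. It is one possible approach, not a clinical standard: the function name `pause_features`, the 25 ms frame size, the 10% energy threshold, and the 0.25 s minimum pause length are all assumptions chosen for illustration, and the demo signal is synthetic.

```python
import numpy as np

def pause_features(signal, sr, frame_ms=25, threshold_ratio=0.1, min_pause_s=0.25):
    """Estimate simple pause statistics from a mono waveform.

    Every threshold here is an illustrative assumption, not a clinical value.
    """
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return {"n_pauses": 0, "mean_pause_s": 0.0, "pause_fraction": 0.0}

    # Cut the signal into equal frames and measure the loudness (RMS) of each.
    frames = np.asarray(signal[: n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))

    # A frame counts as silent if it is much quieter than the loudest frame.
    silent = rms < threshold_ratio * rms.max()

    # Collect the lengths (in frames) of consecutive silent runs.
    runs, current = [], 0
    for is_silent in silent:
        if is_silent:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)

    # Keep only silent runs long enough to count as a pause.
    frame_s = frame_len / sr
    pauses = [r * frame_s for r in runs if r * frame_s >= min_pause_s]
    total_s = n_frames * frame_s
    return {
        "n_pauses": len(pauses),
        "mean_pause_s": float(np.mean(pauses)) if pauses else 0.0,
        "pause_fraction": sum(pauses) / total_s,
    }

# Synthetic demo: 1 s of tone, 0.5 s of silence, 1 s of tone.
sr = 16000
tone = np.sin(2 * np.pi * 220 * np.arange(sr) / sr)
demo = np.concatenate([tone, np.zeros(sr // 2), tone])
print(pause_features(demo, sr))
```

On real recordings, background noise will change which frames look silent, so the threshold is worth tuning and reporting as part of your method.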

Why This Is a Good Topic

This is a strong science fair topic because you can test a real clinical problem with public data and free computing tools. Speech changes are measurable, and you can compare different feature sets, models, and explanation methods without needing a hospital lab. You will learn machine learning, data cleaning, feature engineering, and model interpretation. Those skills transfer to many biomedical engineering projects.

Research Questions

  • How does using wav2vec2 embeddings instead of hand-crafted acoustic features change classification performance?
  • What is the effect of adding pause and speaking-rate features to a baseline speech model?
  • Does a model trained on one type of speech task generalize better than a model trained on another?
  • To what extent do SHAP explanations agree with known speech markers of cognitive decline?
  • Which feature group (prosody, timing, or embedding-based features) contributes most to separating the two groups?
  • How does class balancing change model sensitivity, specificity, and overall accuracy?
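The last question above depends on how sensitivity, specificity, and accuracy are computed, so here is a small sketch of those calculations. The function name `screening_metrics` is hypothetical, and the labels in the demo are made-up toy values (1 for the Alzheimer's group, 0 for controls), not results from any dataset.

```python
def screening_metrics(y_true, y_pred, positive=1):
    """Compute screening metrics from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Avoid dividing by zero when a class never appears.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of positives that were caught
        "specificity": ratio(tn, tn + fp),  # share of controls correctly cleared
        "precision": ratio(tp, tp + fp),    # share of positive calls that were right
        "accuracy": ratio(tp + tn, len(pairs)),
        "confusion": {"tp": tp, "fn": fn, "fp": fp, "tn": tn},
    }

# Toy imbalanced example: 90 controls, 10 positives, and a model that
# finds only 2 of the 10 positives.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 90 + [1] * 2 + [0] * 8
print(screening_metrics(y_true, y_pred))
```

In this toy case accuracy is 0.92 while sensitivity is only 0.2, which is why an imbalanced dataset needs more than one metric.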

Basic Materials

  • Laptop or desktop computer with reliable internet access.
  • Free Google account for Colab access.
  • ADReSS challenge or DementiaBank audio files, with access requested through TalkBank.
  • Python environment in Google Colab.
  • Pandas for data tables.
  • NumPy for numerical work.
  • Librosa for audio feature extraction.
  • Scikit-learn for classification and evaluation.
  • Matplotlib or Seaborn for plots.
  • SHAP for model explanations.

Advanced Materials

  • University or research-grade GPU access.
  • Larger annotated speech dataset for external validation.
  • Hugging Face Transformers for wav2vec2 feature extraction.
  • PyTorch for model training.
  • Praat for detailed acoustic measurements.
  • OpenSMILE for standardized speech feature sets.
  • Statistical software or Python packages for mixed-effects analysis.
  • Version control system such as Git for reproducible workflows.

Software & Tools

  • Google Colab: Runs Python notebooks in the browser and gives you free access to basic compute for audio analysis.
  • Python: Lets you clean audio data, extract features, train models, and plot results.
  • Librosa: Extracts acoustic features such as pitch proxies, energy, and spectral summaries from speech files.
  • Hugging Face Transformers: Loads wav2vec2 models that convert speech into embeddings.
  • SHAP: Shows which features most influenced each prediction, which helps you explain model behavior.
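As a rough sketch of how Librosa and Hugging Face Transformers could work together, the code below turns one recording into a single fixed-length embedding by averaging wav2vec2's frame-level outputs over time. Mean pooling is one common choice, not the only one. The checkpoint name `facebook/wav2vec2-base-960h` and the helper names `mean_pool` and `embed_recording` are assumptions for illustration; the heavy libraries are imported inside the function so the pooling helper can be tried on its own.

```python
import numpy as np

def mean_pool(hidden_states):
    """Average frame-level vectors of shape (time, dim) into one (dim,) vector."""
    hidden_states = np.asarray(hidden_states, dtype=float)
    return hidden_states.mean(axis=0)

def embed_recording(path, checkpoint="facebook/wav2vec2-base-960h"):
    """Return one embedding vector for an audio file.

    The checkpoint is an assumed example; swap in whichever wav2vec2
    model you choose from the Hugging Face model hub.
    """
    # Imported here because these packages are large and need installing first.
    import librosa
    import torch
    from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

    # wav2vec2 models of this family expect 16 kHz mono audio.
    waveform, _ = librosa.load(path, sr=16000)

    extractor = Wav2Vec2FeatureExtractor.from_pretrained(checkpoint)
    model = Wav2Vec2Model.from_pretrained(checkpoint).eval()

    inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
    with torch.no_grad():
        # last_hidden_state has shape (batch, time, dim); take the only item.
        frames = model(**inputs).last_hidden_state[0]
    return mean_pool(frames.numpy())

# The pooling step alone, on a tiny made-up (time=2, dim=2) array:
print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))
```

For a whole dataset you would load the extractor and model once and reuse them across files, since reloading them for every recording is slow.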

Experiment Steps

  1. Define one clear prediction task, then decide whether you will compare embeddings, hand-crafted acoustic features, or both.
  2. Audit the dataset, then plan how you will handle speaker balance, label quality, and train-test separation.
  3. Choose a feature strategy, then map each feature group to a question about timing, prosody, or representation learning.
  4. Design your model comparison, then set a baseline that a stronger method must beat.
  5. Plan your evaluation metrics, then include measures that matter for screening, such as recall, precision, and confusion matrices.
  6. Prepare an explanation step, then decide how you will use SHAP or similar tools to connect predictions back to speech features.
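To build intuition for step 6, here is a hand-rolled sketch of what a SHAP-style attribution looks like for the simplest case, a linear model whose features are treated as independent. In that case each feature's contribution is its weight times how far the feature sits from the background average. This does not call the SHAP library, and the feature names, weights, and values are all made up for illustration.

```python
import numpy as np

def linear_attributions(weights, background, x):
    """Per-feature contributions for a linear model score w.x + b.

    Assumes independent features: contribution_i = w_i * (x_i - mean_i),
    where mean_i is the feature's average over the background data.
    """
    weights = np.asarray(weights, dtype=float)
    background = np.asarray(background, dtype=float)
    x = np.asarray(x, dtype=float)
    return weights * (x - background.mean(axis=0))

# Hypothetical features and a hypothetical fitted linear model.
feature_names = ["mean_pause_s", "speech_rate", "pitch_sd"]
weights = np.array([1.5, -0.8, 0.1])
bias = -0.2
background = np.array([
    [0.4, 3.0, 20.0],
    [0.6, 2.5, 25.0],
    [0.5, 3.5, 30.0],
])
x = np.array([0.9, 2.0, 25.0])  # one recording to explain

contrib = linear_attributions(weights, background, x)
for name, value in zip(feature_names, contrib):
    print(f"{name}: {value:+.2f}")

# The contributions add up to this recording's score minus the average score.
score = weights @ x + bias
average_score = (background @ weights + bias).mean()
print(np.isclose(contrib.sum(), score - average_score))
```

The additivity check at the end is the property that lets you read a SHAP plot as a breakdown of one prediction. It still describes the model's behavior, not what causes the condition.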

Common Pitfalls

  • Mixing the same speaker across training and test sets, which makes the model look better than it really is.
  • Treating short or noisy recordings the same as clean ones, which can turn audio quality into the real signal.
  • Using accuracy alone, which hides poor sensitivity for the Alzheimer's class.
  • Extracting features from clips with different trimming rules, which adds artificial differences that the model can learn.
  • Reading SHAP values as proof of causation, when they only show which features influenced the model's output.
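The first pitfall above can be avoided by splitting on speakers rather than on recordings. The sketch below shows the idea in plain Python; the function name `speaker_split` and the speaker IDs are hypothetical. Scikit-learn's `GroupShuffleSplit` and `GroupKFold` offer ready-made versions of the same idea.

```python
import random

def speaker_split(speaker_ids, test_fraction=0.2, seed=0):
    """Split recording indices so no speaker appears in both sets.

    speaker_ids[i] is the speaker of recording i. This simple version
    does not balance class labels across the two sets.
    """
    speakers = sorted(set(speaker_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(speakers)

    # Hold out whole speakers, at least one.
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])

    train_idx = [i for i, s in enumerate(speaker_ids) if s not in test_speakers]
    test_idx = [i for i, s in enumerate(speaker_ids) if s in test_speakers]
    return train_idx, test_idx

# Hypothetical speaker labels for seven recordings.
speaker_ids = ["s1", "s1", "s2", "s3", "s3", "s4", "s5"]
train_idx, test_idx = speaker_split(speaker_ids)

train_speakers = {speaker_ids[i] for i in train_idx}
test_speakers = {speaker_ids[i] for i in test_idx}
print(train_speakers & test_speakers)  # empty set: no shared speakers
```

If your dataset has one recording per speaker, this reduces to an ordinary split, but checking for repeated speakers first is still worth the few lines.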

What Makes This Competitive

A competitive version of this project does more than train one model and report accuracy. You can compare embeddings against interpretable acoustic features, then test whether the explanation matches known clinical markers like pauses and speaking rate. You can also use a stronger evaluation design, such as speaker-independent splits and external validation. That shows real care about bias, generalization, and clinical usefulness.

Project Variations

  • Try the same pipeline on open-ended conversational speech, then compare it with picture-description speech to see which task gives cleaner signals.
  • Swap in a different feature set, such as OpenSMILE descriptors, and test whether standardized acoustic features beat embeddings.
  • Focus on explanation quality, then compare SHAP patterns across models to see which one gives the most medically sensible signals.

Learn More

  • ADReSS Challenge: Public challenge data and task description for dementia-related speech analysis, found by searching for the ADReSS challenge page.
  • DementiaBank: A widely used speech dataset for dementia research, found through the TalkBank site and its DementiaBank materials.
  • PubMed: Search for review articles on speech biomarkers, Alzheimer’s disease, and automatic dementia detection.
  • NIH National Institute on Aging: Background on Alzheimer's disease, symptoms, and research priorities.
  • Hugging Face model hub: Search for wav2vec2 models and speech processing examples.
  • SHAP documentation: Explains how to interpret model predictions with feature attribution plots.
