Smartphone Voice Models for Early Neuro Markers

ISEF Category: Computational Biology and Bioinformatics

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Computational Neuroscience · Difficulty: Advanced · Setup: Home Setup · Time: Full Year

The Hook

Your voice can carry clues about the brain before a diagnosis does. Tiny changes in rhythm, pitch, and pauses can show up in speech long before they become obvious in daily life. That makes voice data a strong target for machine learning. You can test whether a model can spot those patterns from smartphone recordings.

What Is It?

This project studies speech as a signal. Prosody means the way you speak, not the words you choose. It includes pitch, stress, timing, and pauses. In Parkinson's disease and Alzheimer's disease, those patterns can change because the brain systems that control movement, planning, and language also affect speech.
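To make "timing and pauses" concrete, here is a minimal sketch of one way to measure pauses from a raw waveform. It assumes the audio is already loaded as a list of samples, and every threshold in it (frame length, silence ratio, minimum pause length) is an illustrative assumption, not a clinical standard. The function names are hypothetical.

```python
import math

def frame_rms(samples, frame_len):
    """Split a waveform into non-overlapping frames and return each frame's RMS energy."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [math.sqrt(sum(x * x for x in samples[s:s + frame_len]) / frame_len)
            for s in starts]

def pause_features(samples, sample_rate, frame_ms=25, silence_ratio=0.1, min_pause_ms=200):
    """Count pauses and measure the fraction of frames spent inside pauses.

    A frame counts as silent when its RMS is below silence_ratio times the
    loudest frame's RMS. A pause is a run of silent frames lasting at least
    min_pause_ms. All three thresholds are assumptions chosen for illustration.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    rms = frame_rms(samples, frame_len)
    if not rms or max(rms) == 0:
        return {"pause_count": 0, "pause_fraction": 0.0}
    threshold = silence_ratio * max(rms)
    min_frames = math.ceil(min_pause_ms / frame_ms)

    pause_count, paused_frames, run = 0, 0, 0
    # The infinite sentinel at the end closes a silent run that reaches the last frame.
    for value in rms + [float("inf")]:
        if value < threshold:
            run += 1
        else:
            if run >= min_frames:
                pause_count += 1
                paused_frames += run
            run = 0
    return {"pause_count": pause_count, "pause_fraction": paused_frames / len(rms)}

# Synthetic check: 0.5 s tone, 0.5 s silence, 0.5 s tone, sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 200 * n / sr) for n in range(sr // 2)]
signal = tone + [0.0] * (sr // 2) + tone
features = pause_features(signal, sr)
```

On real recordings you would tune the thresholds on a few files by ear, because room noise raises the energy of "silent" frames.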

Contrastive learning is a machine learning method that learns by comparing examples. Think of it like teaching a student by showing pairs. In the most common setup, the model learns that two altered versions of the same recording should sit close together in embedding space, while versions of different recordings should sit farther apart. Self-supervised learning means the model learns from the data itself, not from a huge set of hand-labeled examples. That matters here because medical voice datasets are often small and messy.
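The "pull pairs together, push others apart" idea can be written as a loss function. The sketch below uses the NT-Xent loss, one common contrastive objective. It is written in plain Python so the arithmetic is visible, and the tiny two-dimensional embeddings are made-up numbers. In a real project you would more likely compute the same quantity on PyTorch tensors.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss for a batch of paired embeddings.

    view_a[i] and view_b[i] are embeddings of two augmented views of
    recording i. Every other embedding in the batch acts as a negative.
    """
    z = [normalize(v) for v in view_a + view_b]
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the matching view
        logits = [dot(z[i], z[j]) / temperature for j in range(2 * n) if j != i]
        pos_logit = dot(z[i], z[positive]) / temperature
        # Cross-entropy with the positive pair as the correct "class".
        total += math.log(sum(math.exp(l) for l in logits)) - pos_logit
    return total / (2 * n)

# Made-up embeddings: in `aligned`, each pair points the same way.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = nt_xent_loss(view_a, [[0.9, 0.1], [0.1, 0.9]])
# In `mismatched`, the pairs are swapped, so positives look unlike each other.
mismatched = nt_xent_loss(view_a, [[0.1, 0.9], [0.9, 0.1]])
```

The loss is lower for the aligned pairs than for the mismatched ones, which is the signal the encoder follows during pretraining.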

Why This Is a Good Topic

This is a strong science fair topic because you can test a real pattern with data science tools instead of expensive lab equipment. You can compare recording conditions, speech tasks, feature types, and model setups. The project connects to early screening for neurological disease, which is a real clinical need. You can also learn data cleaning, signal processing, model evaluation, and how to think about bias in medical AI.

Research Questions

How does self-supervised contrastive pretraining change voice-model accuracy for early neuro marker detection?
What is the effect of speech task type, such as reading, picture description, or free speech, on model performance?
Does combining pitch, pause, and speaking-rate features improve classification more than using one feature group alone?
To what extent does recording with different smartphone models affect the stability of learned voice embeddings?
Which augmentation strategy, such as noise, pitch shift, or time masking, gives the best downstream classification results?
How does subject-level splitting compare with recording-level splitting in estimated model performance?
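For the augmentation question above, here is a minimal sketch of two of the named strategies: added noise and time masking. It works on a waveform stored as a plain list of samples. Two assumptions to note: the noise is white Gaussian noise at a target signal-to-noise ratio, and the time mask is applied to the waveform for simplicity, although many pipelines mask spectrogram frames instead. Pitch shifting is left out because it needs a resampling library.

```python
import math
import random

def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise so the signal-to-noise ratio is roughly snr_db."""
    signal_power = sum(x * x for x in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]

def time_mask(samples, max_mask_len, rng):
    """Zero out one random contiguous span of at most max_mask_len samples."""
    mask_len = rng.randint(1, min(max_mask_len, len(samples)))
    start = rng.randint(0, len(samples) - mask_len)
    return samples[:start] + [0.0] * mask_len + samples[start + mask_len:]

# Synthetic one-second tone at 8 kHz stands in for a real recording.
rng = random.Random(0)
sr = 8000
clean = [math.sin(2 * math.pi * 200 * n / sr) for n in range(sr)]
noisy = add_noise(clean, snr_db=10, rng=rng)
masked = time_mask(clean, max_mask_len=800, rng=rng)

# Measure the SNR that was actually achieved.
noise_power = sum((a - b) ** 2 for a, b in zip(noisy, clean)) / len(clean)
signal_power = sum(x * x for x in clean) / len(clean)
measured_snr_db = 10 * math.log10(signal_power / noise_power)
```

Passing a seeded `random.Random` object keeps each augmentation run reproducible, which helps when you compare strategies.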

Basic Materials

Smartphone with a working microphone and voice recorder app.
Quiet recording space with consistent acoustics and low background noise.
Laptop or desktop computer with at least 8 GB RAM.
Free storage for audio files and metadata.
Spreadsheet software for tracking participant age group, task type, and recording conditions.
Headphones for checking audio quality.
Consent form and participant information sheet approved by your school or mentor.
Notebook or digital log for recording protocol changes.

Advanced Materials

Access to a shared or public speech dataset with health labels or clinical annotations.
External USB microphone for higher consistency across sessions.
Pop filter or small microphone stand to reduce handling noise.
Python environment with GPU access if available.
Audio feature extraction package for prosody analysis.
Version-controlled dataset storage with clear train, validation, and test splits.
Institutional review or mentor-approved workflow for handling human speech data.
Statistical analysis software for confidence intervals, effect sizes, and significance testing.

Software & Tools

Python: Lets you clean audio data, extract features, and train models.
librosa: Extracts speech features such as pitch, spectral shape, and timing cues.
PyTorch: Builds contrastive learning models and fine-tunes classifiers.
scikit-learn: Tests baseline models and evaluates classification metrics.

Experiment Steps

Define the speech task, the participant pool, and the exact outcome you want to predict.
Choose one recording protocol and keep microphone distance, room type, and prompt style consistent.
Decide which audio features or learned embeddings you will compare against a simple baseline.
Build train, validation, and test splits by person, not by recording, so the model cannot memorize speakers.
Plan a contrastive pretraining setup and a downstream classifier that tests whether pretraining helps.
Set up evaluation metrics that matter for imbalanced medical data, such as sensitivity, specificity, and AUC.
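The split-by-person step can be sketched as follows. This is a minimal illustration, assuming each recording is a dictionary with a `subject_id` key. That schema, the function name, and the 15% fractions are all hypothetical choices.

```python
import random

def split_by_subject(recordings, val_frac=0.15, test_frac=0.15, seed=0):
    """Assign whole subjects, not single recordings, to train, val, or test."""
    subjects = sorted({r["subject_id"] for r in recordings})
    rng = random.Random(seed)
    rng.shuffle(subjects)

    n_test = max(1, round(len(subjects) * test_frac))
    n_val = max(1, round(len(subjects) * val_frac))
    test_ids = set(subjects[:n_test])
    val_ids = set(subjects[n_test:n_test + n_val])

    splits = {"train": [], "val": [], "test": []}
    for r in recordings:
        if r["subject_id"] in test_ids:
            splits["test"].append(r)
        elif r["subject_id"] in val_ids:
            splits["val"].append(r)
        else:
            splits["train"].append(r)
    return splits

# Made-up metadata: 20 subjects with 3 recordings each.
recordings = [{"subject_id": f"S{s:02d}", "file": f"S{s:02d}_task{t}.wav"}
              for s in range(20) for t in range(3)]
splits = split_by_subject(recordings)
```

scikit-learn offers the same idea through `GroupShuffleSplit` and `GroupKFold`, where you pass the subject IDs as the `groups` argument. With two diagnostic groups you would also want each split to contain both, which this sketch does not enforce.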

Common Pitfalls

Mixing recordings from the same person across train and test sets, which inflates accuracy.
Letting room noise or phone model differences become the main signal, which hides speech patterns.
Using word content instead of prosody, which turns the project into a language task instead of a voice task.
Comparing models with only one random split, which makes the results too unstable to trust.
Treating a small dataset like a clinical tool, which can lead to overclaiming diagnostic value.

What Makes This Competitive

A stronger project goes beyond basic accuracy. You can test whether the model learns real speech patterns by using subject-level splits, matched controls, and multiple speech tasks. You can also compare simple features against learned embeddings, then report sensitivity, specificity, calibration, and confidence intervals. If you add an ablation study that shows which speech cues matter most, you can back up your claims about what the model actually learned.
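Here is a minimal sketch of reporting sensitivity and specificity with confidence intervals. It assumes one label and one prediction per person, so resampling people keeps the analysis at the subject level. The labels and predictions below are made-up numbers, and the percentile bootstrap is only one of several ways to build an interval.

```python
import math
import random

def sensitivity(y_true, y_pred):
    """Fraction of true positives among all actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else float("nan")

def specificity(y_true, y_pred):
    """Fraction of true negatives among all actual negatives."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp) if tn + fp else float("nan")

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample people with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        result = metric([y_true[i] for i in idx], [y_pred[i] for i in idx])
        if not math.isnan(result):  # skip resamples missing one class
            values.append(result)
    values.sort()
    lower = values[int(alpha / 2 * len(values))]
    upper = values[min(len(values) - 1, int((1 - alpha / 2) * len(values)))]
    return lower, upper

# Made-up results: 10 patients (8 detected) and 10 controls (9 cleared).
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 8 + [0] * 2 + [0] * 9 + [1] * 1

sens = sensitivity(y_true, y_pred)
spec = specificity(y_true, y_pred)
sens_ci = bootstrap_ci(y_true, y_pred, sensitivity)
spec_ci = bootstrap_ci(y_true, y_pred, specificity)
```

With only 20 people the intervals come out wide, which is an honest picture of how much a small test set can tell you.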

Project Variations

Compare reading-aloud recordings with spontaneous speech to see which task better exposes prosodic change.
Test whether a model trained on one smartphone brand transfers to another brand without major loss in performance.
Compare handcrafted prosody features with self-supervised embeddings to see which gives cleaner separation between groups.

Learn More

NIH PubMed: Search for review articles on speech biomarkers, Parkinson's disease, Alzheimer's disease, and machine learning.
NCBI Bookshelf: Read free background chapters on neuroscience, speech, and machine learning concepts.
IEEE Xplore: Search for peer-reviewed papers on speech-based disease detection and contrastive learning, then filter for accessible abstracts.
MIT OpenCourseWare: Look for free courses on machine learning and signal processing to strengthen your model design.
NIH Data Sharing resources: Review guidance on human data ethics, privacy, and reproducible analysis.

Computational Biology and Bioinformatics Category Guide

How to Do Real Computational Biology Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
