ALS Splicing Signature Discovery
ISEF Category: Cellular and Molecular Biology
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
Subcategory: Molecular Biology · Difficulty: Advanced · Setup: University Lab · Time: Full Year
The Hook
ALS can hide in RNA before cells look sick. That makes it a good mystery for data analysis, not just wet lab work. You can use public motor-neuron RNA-seq data to look for splicing changes that point to hidden disease signals.
What Is It?
Alternative splicing is how one gene can make more than one RNA product. Think of it like editing a video by choosing different clips from the same raw footage. In healthy cells, that editing follows patterns. In disease, those patterns can shift.
This project looks at RNA-seq data from motor neurons made from induced pluripotent stem cells, or iPSCs. RNA-seq reads the RNA present in a sample, so you can compare ALS samples with control samples and ask which RNA parts get kept, skipped, or rearranged. One pattern to look for is intron retention, which means a piece of RNA that should usually be removed stays in the final transcript. You would test whether ALS samples show a distinct intron-retention signal that does not depend on TDP-43 changes and whether that signal appears in more than one dataset.
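To make the idea of an intron-retention signal concrete, here is a minimal Python sketch of one simple way to score it: the fraction of reads at an intron that support the retained form. The function name, sample labels, and read counts are all made up for illustration, and tools such as rMATS or DEXSeq use their own, more careful statistical models.

```python
def intron_retention_ratio(intron_reads, spliced_reads):
    """Return the fraction of reads supporting retention of one intron.

    intron_reads: reads that fall inside the intron (retained form).
    spliced_reads: reads that span the exon-exon junction (intron removed).
    Returns None when there is no coverage, so the event can be skipped.
    """
    total = intron_reads + spliced_reads
    if total == 0:
        return None
    return intron_reads / total


# Hypothetical counts for one intron: (intron_reads, spliced_reads).
example_counts = {
    "ALS_sample_1": (30, 70),
    "ALS_sample_2": (24, 56),
    "control_1": (5, 95),
    "control_2": (8, 92),
}

ir_by_sample = {
    sample: intron_retention_ratio(intron, spliced)
    for sample, (intron, spliced) in example_counts.items()
}

for sample, ratio in ir_by_sample.items():
    print(f"{sample}: IR ratio = {ratio:.2f}")
```

A ratio near 0 means the intron is almost always spliced out, and a ratio near 1 means it is almost always kept. Comparing these ratios between ALS and control samples is the core measurement behind the project.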
Why This Is a Good Topic
This is a strong science fair topic because the data already exist, the question is testable, and the analysis can be narrowed to a clear signal. You can learn transcriptomics, differential splicing, validation across cohorts, and basic statistics without needing a wet lab. The topic also connects to a real disease problem, since ALS research needs better biomarkers and clearer molecular subtypes.
Research Questions
- How does intron retention differ between iPSC-derived motor-neuron samples from ALS patients and matched controls?
- What is the effect of cohort choice on the intron-retention signature you detect?
- Does a TDP-43-independent splicing panel separate ALS samples from controls better than random intron sets?
- To what extent do your top splicing events replicate across public ALS RNA-seq datasets?
- Which genes show the strongest intron-retention changes in iPSC-derived ALS motor neurons?
- What is the effect of normalization method on the number of significant alternative-splicing calls?
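The question about random intron sets can be framed as a resampling test. The sketch below is one possible approach, not a prescribed method: it scores a candidate panel by the mean absolute ALS-minus-control difference in intron-retention ratio, then asks how often a random intron set of the same size scores at least as well. The scoring rule, function names, and all data are assumptions made up for illustration.

```python
import random
from statistics import mean


def separation_score(introns, als_ir, control_ir):
    """Mean absolute ALS-minus-control difference in IR ratio over a set of introns."""
    return mean(abs(mean(als_ir[i]) - mean(control_ir[i])) for i in introns)


def random_set_p_value(panel, als_ir, control_ir, n_random=1000, seed=0):
    """Empirical p-value: how often a random intron set of the same size
    separates the groups at least as well as the candidate panel."""
    rng = random.Random(seed)
    all_introns = sorted(als_ir)
    observed = separation_score(panel, als_ir, control_ir)
    at_least_as_good = 0
    for _ in range(n_random):
        random_set = rng.sample(all_introns, len(panel))
        if separation_score(random_set, als_ir, control_ir) >= observed:
            at_least_as_good += 1
    # Add 1 to numerator and denominator so the p-value is never exactly zero.
    return (at_least_as_good + 1) / (n_random + 1)


# Made-up data: 50 introns, with 3 ALS and 3 control IR ratios each.
data_rng = random.Random(42)
als_ir, control_ir = {}, {}
for k in range(50):
    name = f"intron_{k}"
    shift = 0.3 if k < 5 else 0.0  # only the first five introns differ
    control_ir[name] = [0.10 + data_rng.uniform(-0.02, 0.02) for _ in range(3)]
    als_ir[name] = [0.10 + shift + data_rng.uniform(-0.02, 0.02) for _ in range(3)]

panel = [f"intron_{k}" for k in range(5)]
p_value = random_set_p_value(panel, als_ir, control_ir)
print(f"Empirical p-value for the panel: {p_value:.4f}")
```

A small p-value here only says the panel beats random sets in this one dataset. Because the panel was chosen from the same data, the fairer test is to pick the panel in a discovery cohort and run this comparison in a separate replication cohort.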
Basic Materials
- Laptop or desktop computer with enough memory for RNA-seq analysis.
- Stable internet access for downloading public datasets and annotations.
- Spreadsheet software for tracking sample metadata and results.
- Text editor for code and notes.
- Free account access to public data portals such as GEO or SRA.
- External storage or cloud backup for large files.
- Basic statistics reference notes for false discovery rate and effect size.
Advanced Materials
- High-memory workstation or cloud compute instance for alignment and splice analysis.
- Command-line tools for RNA-seq quality control and alignment.
- Reference genome and gene annotation files for the correct human build.
- Splicing analysis software such as rMATS or DEXSeq.
- Visualization software for sashimi plots and coverage tracks.
- Container or environment manager such as Conda for reproducible runs.
- Version control repository for scripts and analysis notes.
Software & Tools
- NCBI GEO: Find public ALS RNA-seq datasets, sample metadata, and study links.
- NCBI SRA: Download raw sequencing reads and check run details.
- rMATS: Detect differential alternative splicing events between groups.
- DEXSeq: Test exon and intron usage changes across samples.
- Python: Clean metadata, merge results, and make plots.
Experiment Steps
- Define the exact ALS comparison you will test, including cell type, genotype groups, and control matching.
- Select public cohorts that share a similar cell type and sequencing platform so batch effects stay manageable.
- Decide how you will quantify splicing, then choose one primary metric for intron retention.
- Build a validation plan that splits discovery and replication cohorts before you look at results.
- Set the statistical threshold and correction method you will use for multiple testing.
- Plan a final figure set that shows effect size, replication, and biological meaning.
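The step about choosing a correction method for multiple testing can be sketched as follows. This is a minimal pure-Python version of the Benjamini-Hochberg procedure, which controls the false discovery rate; the function name and the p-values are made up for illustration, and in a real analysis you would more likely use the adjusted values that your splicing tool already reports.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four splicing events.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)

fdr_threshold = 0.05  # assumed cutoff; fix it before looking at results
for p, q in zip(raw_p, adj_p):
    status = "significant" if q < fdr_threshold else "not significant"
    print(f"raw p = {p:.2f}, adjusted p = {q:.4f}, {status}")
```

Notice that a raw p-value below 0.05 can land above the threshold after adjustment. That is the point of the correction: with thousands of introns tested, many raw p-values fall below 0.05 by chance alone.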
Common Pitfalls
- Mixing cohorts with very different cell-differentiation protocols, which can make protocol noise look like disease signal.
- Ignoring mismatches between the genome build and the transcript annotation version, which can shift exon and intron coordinates so reads get assigned to the wrong features.
- Calling every significant splice change a biomarker, which overstates findings from one dataset.
- Skipping replication analysis, which leaves you with a pattern that may only fit one cohort.
- Treating low-read genes as real splicing hits, which inflates false positives in intron-retention tests.
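The low-read pitfall above can be guarded against with a simple coverage filter applied before any statistics. The sketch below is only an illustration: the cutoff of 20 reads, the field names, and the event names are assumptions, and the right threshold depends on your sequencing depth and the tool you use.

```python
def passes_coverage_filter(read_counts, min_reads=20):
    """Keep an event only if every sample has at least min_reads supporting reads.

    An empty list of counts fails the filter, since there is nothing to trust.
    """
    return bool(read_counts) and all(count >= min_reads for count in read_counts)


# Hypothetical events with total supporting reads per sample.
events = [
    {"event": "GENE_A_intron_3", "reads_per_sample": [120, 98, 143, 110]},
    {"event": "GENE_B_intron_1", "reads_per_sample": [4, 0, 7, 2]},
    {"event": "GENE_C_intron_7", "reads_per_sample": [55, 18, 61, 47]},
]

kept = [e["event"] for e in events if passes_coverage_filter(e["reads_per_sample"])]
dropped = [e["event"] for e in events if e["event"] not in kept]

print("Kept for testing:", kept)
print("Dropped for low coverage:", dropped)
```

Record how many events the filter removes and report the cutoff in your methods, so a judge can see that the filter was fixed in advance and not tuned to favor a result.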
What Makes This Competitive
A competitive version of this project would do more than list differential splicing hits. You would predefine a discovery set, test a replication set, and show that your signal survives across datasets. Strong projects also compare at least two analysis methods and explain why one gives results that are easier to interpret biologically. If you can tie the pattern to a known pathway or RNA-binding mechanism, your story gets much stronger.
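One way to check whether a signal survives across datasets is to count the discovery hits that are also significant in the replication cohort and change in the same direction. The sketch below assumes each cohort has been summarized as an event name mapped to a change in intron-retention ratio and an adjusted p-value; the function name, event names, and numbers are invented for illustration.

```python
def replicated_events(discovery, replication, fdr_cutoff=0.05):
    """Return discovery hits that replicate with the same direction of change.

    Each input maps event name -> (delta_ir, adjusted_p), where delta_ir is
    the ALS-minus-control difference in intron-retention ratio.
    """
    hits = []
    for event, (disc_delta, disc_fdr) in discovery.items():
        if disc_fdr >= fdr_cutoff or event not in replication:
            continue
        rep_delta, rep_fdr = replication[event]
        same_direction = disc_delta * rep_delta > 0
        if same_direction and rep_fdr < fdr_cutoff:
            hits.append(event)
    return hits


# Hypothetical summary results from two cohorts.
discovery = {
    "GENE_A_intron_3": (0.21, 0.001),
    "GENE_B_intron_1": (0.15, 0.020),
    "GENE_C_intron_7": (-0.12, 0.030),
    "GENE_D_intron_2": (0.05, 0.400),
}
replication = {
    "GENE_A_intron_3": (0.18, 0.004),
    "GENE_B_intron_1": (-0.10, 0.010),  # significant, but opposite direction
    "GENE_C_intron_7": (-0.09, 0.200),  # same direction, not significant
}

confirmed = replicated_events(discovery, replication)
n_discovery_hits = sum(1 for _, fdr in discovery.values() if fdr < 0.05)
print(f"{len(confirmed)} of {n_discovery_hits} discovery hits replicated: {confirmed}")
```

Reporting the replication rate alongside the hit list is more persuasive than the hit list alone, because it shows how much of the discovery signal holds up in independent data.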
Project Variations
- Compare iPSC-derived ALS motor neurons with spinal motor neurons in datasets from other neurodegenerative diseases to test whether the splicing pattern is disease-specific.
- Focus on one splicing metric, such as intron retention, exon skipping, or alternative 3' splice sites, and test which gives the cleanest ALS separation.
- Build a small classifier from the top splicing events and test whether it predicts ALS status in an unseen cohort.
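The classifier variation can be prototyped without any machine-learning library. The sketch below uses a nearest-centroid rule, chosen here only because it is easy to read: each group is summarized by its average intron-retention profile, and a new sample is assigned to the closer average. All function names, feature values, and labels are made up, and a real project would need many more samples and a strictly held-out cohort.

```python
from statistics import mean


def centroid(samples):
    """Average feature vector across a list of samples."""
    return [mean(values) for values in zip(*samples)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_nearest_centroid(features, labels):
    """Return one centroid per label, learned from the discovery cohort."""
    groups = {}
    for vector, label in zip(features, labels):
        groups.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in groups.items()}


def predict(centroids, vector):
    """Assign the label whose centroid is closest to the sample."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], vector))


# Hypothetical IR ratios for three top events, one row per sample.
discovery_features = [
    [0.30, 0.25, 0.40],
    [0.28, 0.22, 0.35],
    [0.10, 0.08, 0.12],
    [0.12, 0.05, 0.15],
]
discovery_labels = ["ALS", "ALS", "control", "control"]

# Hypothetical unseen cohort, never used for training or event selection.
unseen_features = [[0.27, 0.20, 0.33], [0.09, 0.10, 0.14]]
unseen_labels = ["ALS", "control"]

model = train_nearest_centroid(discovery_features, discovery_labels)
predictions = [predict(model, vector) for vector in unseen_features]
accuracy = mean(p == t for p, t in zip(predictions, unseen_labels))
print(f"Predictions: {predictions}, accuracy = {accuracy:.2f}")
```

The key design choice is that both the event selection and the centroids come only from the discovery cohort. If the unseen cohort influences either one, the accuracy estimate will look better than it really is.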
Learn More
- NCBI Gene Expression Omnibus: Search for ALS, iPSC, and motor neuron RNA-seq datasets with sample metadata.
- NCBI Sequence Read Archive: Find raw RNA-seq files and read study methods.
- PubMed: Search review articles on ALS, TDP-43, alternative splicing, and intron retention.
- Molecular Cell and Nature Communications: Search for peer-reviewed RNA splicing studies with public datasets and analysis methods.
- MIT OpenCourseWare: Look for free genomics and bioinformatics course materials that cover RNA-seq analysis.
- NIH and NCBI Bookshelf: Read free background chapters on gene expression, RNA processing, and transcriptomics.