Designing Peptides to Block Amyloid-β Clumping in Alzheimer’s
ISEF Category: Biochemistry
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Medicinal Biochemistry · Difficulty: Advanced · Setup: Home Setup · Time: Full Year
The Hook
Amyloid-β clumps are a hallmark of Alzheimer’s disease, one of the most studied diseases in biology. You can search for short peptides that latch onto the sticky hotspots before the protein builds bigger fibers. A laptop lets you screen thousands of designs without touching a pipette. That makes this a strong entry point into computational biochemistry research.
What Is It?
Amyloid-β is a short protein fragment, cut from the larger amyloid precursor protein, that can stack on itself and form clumps. The spots where stacking starts are called hotspots. A well-studied example is the KLVFF stretch (residues 16 to 20). Think of hotspots like the rough tabs on puzzle pieces that help the pieces lock together. If you design a peptide that covers those tabs, you may slow the clumping process.
ESM2 embeddings are numeric fingerprints made by a protein language model. A genetic algorithm is a search method that keeps better sequences, mixes them, and tries again, like breeding a population of candidate peptides. In this project, you use those tools to generate peptide ideas and then score them with molecular dynamics, or MD, which tracks how the molecules move in a simulated water box.
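The "keep, mix, try again" loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a finished design tool: `toy_fitness` is a made-up placeholder (it rewards hydrophobic residues and a net charge near +1) and stands in for whatever real score you build from embeddings or simulation. The function names, population size, and mutation rate are all assumptions chosen for the example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HYDROPHOBIC = set("AILMFVWY")

def toy_fitness(seq):
    # Placeholder score (assumption): NOT a real binding or aggregation score.
    hydrophobic = sum(aa in HYDROPHOBIC for aa in seq)
    charge = sum(aa in "KR" for aa in seq) - sum(aa in "DE" for aa in seq)
    return hydrophobic - abs(charge - 1)

def mutate(seq, rate, rng):
    # Each position is replaced by a random residue with probability `rate`.
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq)

def crossover(a, b, rng):
    # Single-point crossover: prefix from one parent, suffix from the other.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def run_ga(fitness, length=8, pop_size=40, generations=30, keep=10, rate=0.1, seed=0):
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    population = ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:keep]  # keep the better sequences unchanged
        children = []
        while len(children) < pop_size - keep:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        population = parents + children
    return max(population, key=fitness)

best = run_ga(toy_fitness, seed=1)
print(best, toy_fitness(best))
```

Because the top `keep` sequences are carried over unchanged each generation, the best score in the population can never get worse from one generation to the next.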
Why This Is a Good Topic
This topic gives you a real disease question, a clear computer-based workflow, and lots of room for controlled comparison. You can test whether one design rule beats another, measure how often the search finds strong candidates, and compare your top peptides with scrambled controls. A student can learn sequence analysis, basic machine learning, and simulation thinking without needing a wet lab.
Research Questions
- How does peptide length affect predicted binding to amyloid-β hotspot residues?
- What is the effect of adding charged residues on self-aggregation scores for candidate peptides?
- Does using ESM2 embeddings improve the ranking of anti-amyloid peptides over sequence-only features?
- To what extent does the genetic algorithm find better-scoring candidates than random sequence search when both get the same number of evaluations?
- Which hotspot-targeting motifs produce the best balance between amyloid-β binding and peptide solubility?
- How does the top designed peptide compare with scrambled controls in MD-based aggregation metrics?
Basic Materials
- Laptop or desktop computer with at least 16 GB of RAM.
- Python 3.11 and a code editor such as VS Code.
- A free Jupyter Notebook environment, such as Google Colab.
- Internet access for PubMed, RCSB PDB, and UniProt searches.
- Spreadsheet software for tracking sequence scores.
- External storage or cloud backup for simulation outputs.
Advanced Materials
- Access to an NVIDIA GPU workstation or a compute cluster.
- Molecular dynamics software such as GROMACS, AMBER, or NAMD.
- Python packages for sequence analysis, optimization, and plotting.
- ESM2 model weights and a local inference environment.
- PyMOL or ChimeraX for structure inspection.
- Reference set of amyloid-β sequences, mutant variants, and control peptides.
Software & Tools
- Python: Runs the genetic algorithm, scoring scripts, and plots.
- Google Colab: Gives you a free notebook for smaller test runs.
- ESM2: Generates protein embeddings that turn sequences into numeric features.
- GROMACS: Simulates peptide behavior and helps compare candidate stability.
- PyMOL: Lets you inspect peptide and amyloid-β contact regions in 3D.
Experiment Steps
- Define the amyloid-β hotspot you want to target and the success metric you will optimize.
- Choose a peptide design space, including length, allowed amino acids, and any synthesis limits.
- Build a scoring pipeline that combines embedding features, hotspot complementarity, and self-aggregation penalties.
- Set up the genetic algorithm so you can compare multiple random seeds and see whether results repeat.
- Plan a validation set with scrambled sequences, random peptides, and any known binder controls you can find.
- Decide how you will turn simulation output into one final ranking, using the same rule for every candidate.
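One way to apply "the same rule for every candidate" in the last step is to convert each metric to a z-score across all candidates and then add them up with fixed weights. The sketch below assumes two hypothetical metrics, `hotspot_contacts` (higher is better) and `self_aggregation` (higher is worse, so it gets a negative weight). The peptide names and numbers are placeholders for illustration only, not results.

```python
from statistics import mean, pstdev

def zscores(values):
    # Standardize so metrics with different units can be added together.
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def rank_candidates(metrics, weights):
    """metrics: {peptide: {metric: value}}; weights: {metric: weight}.
    Returns (names sorted best-first, composite score per name)."""
    names = list(metrics)
    composite = {name: 0.0 for name in names}
    for metric, weight in weights.items():
        zs = zscores([metrics[name][metric] for name in names])
        for name, z in zip(names, zs):
            composite[name] += weight * z
    order = sorted(names, key=lambda n: composite[n], reverse=True)
    return order, composite

# Placeholder values (assumption): replace with your own simulation output.
metrics = {
    "pep_A": {"hotspot_contacts": 12.0, "self_aggregation": 0.8},
    "pep_B": {"hotspot_contacts": 9.0, "self_aggregation": 0.2},
    "pep_C": {"hotspot_contacts": 5.0, "self_aggregation": 0.5},
}
weights = {"hotspot_contacts": 1.0, "self_aggregation": -1.0}

order, composite = rank_candidates(metrics, weights)
for name in order:
    print(name, round(composite[name], 3))
```

Fixing the weights before you look at the results keeps the ranking rule from drifting toward whichever peptide you happen to like.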
Common Pitfalls
- Optimizing only the amyloid-β binding score, which can select peptides that also clump by themselves.
- Running one genetic algorithm seed, which can make a lucky search look better than it is.
- Leaving out scrambled-sequence controls, which makes it hard to show that your scoring rule beats chance.
- Comparing peptides with different lengths without normalizing the metrics, which can bias the ranking.
- Trusting ESM2 output without checking simulation behavior, which can hide false positives from the model.
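The scrambled-control pitfall can be addressed with a small helper that shuffles a design's residues and asks how often a scramble scores as well as the design. This is a sketch under stated assumptions: the design sequence `KLVFFKKE` is a made-up example, and `toy_order_score` (a count of adjacent hydrophobic pairs) is a placeholder for your real, order-sensitive score.

```python
import random

HYDROPHOBIC = set("AILMFVWY")

def toy_order_score(seq):
    # Placeholder score (assumption): counts adjacent hydrophobic pairs.
    return sum(a in HYDROPHOBIC and b in HYDROPHOBIC for a, b in zip(seq, seq[1:]))

def scrambled_controls(seq, n, seed=0):
    # Same residues, shuffled order; skips the original and any duplicates.
    rng = random.Random(seed)
    controls, attempts = set(), 0
    while len(controls) < n and attempts < 100 * n:
        residues = list(seq)
        rng.shuffle(residues)
        candidate = "".join(residues)
        if candidate != seq:
            controls.add(candidate)
        attempts += 1
    return sorted(controls)

def empirical_p(design_score, control_scores):
    # Share of controls scoring at least as well as the design (higher = better).
    # The +1 terms keep the estimate from ever being exactly zero.
    as_good = sum(s >= design_score for s in control_scores)
    return (as_good + 1) / (len(control_scores) + 1)

design = "KLVFFKKE"  # hypothetical design used only for illustration
controls = scrambled_controls(design, n=20, seed=0)
p_value = empirical_p(toy_order_score(design), [toy_order_score(c) for c in controls])
print(len(controls), "controls, empirical p =", round(p_value, 3))
```

Scrambles keep length and composition fixed, so any score difference comes from residue order alone. That is why the score you test this way has to depend on order.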
What Makes This Competitive
A class-level project stops at one best sequence. A stronger project tests many seeds, compares against random search, and uses scrambled controls. You can also split your targets into tuning and holdout sets so your scoring rule is checked on hotspots it was not tuned on. If you add a clear analysis of why the top peptides win, the project looks much closer to real research.
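A many-seed comparison against random search can be organized as a small harness that gives every method the same evaluation budget and the same list of seeds. In the sketch below, `hill_climb` is a simple stand-in for a guided search, not the genetic algorithm itself, and `toy_score` is a placeholder objective. Swap in your own search and scoring functions with the same call signature.

```python
import random
from statistics import mean, stdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVWY")

def toy_score(seq):
    # Placeholder objective (assumption): NOT a real binding or aggregation score.
    charge = sum(aa in "KR" for aa in seq) - sum(aa in "DE" for aa in seq)
    return sum(aa in HYDROPHOBIC for aa in seq) - abs(charge - 1)

def random_peptide(length, rng):
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))

def random_search(score, length, budget, rng):
    # Baseline: score `budget` independent random peptides, keep the best score.
    return max(score(random_peptide(length, rng)) for _ in range(budget))

def hill_climb(score, length, budget, rng):
    # Guided stand-in: change one position at a time, accept if not worse.
    current = random_peptide(length, rng)
    best = score(current)
    for _ in range(budget - 1):  # the starting peptide used one evaluation
        pos = rng.randrange(length)
        candidate = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        candidate_score = score(candidate)
        if candidate_score >= best:
            current, best = candidate, candidate_score
    return best

def compare_over_seeds(methods, score, seeds, length=8, budget=200):
    # Every method sees the same seeds and the same number of evaluations.
    return {name: [method(score, length, budget, random.Random(seed)) for seed in seeds]
            for name, method in methods.items()}

methods = {"random_search": random_search, "hill_climb": hill_climb}
results = compare_over_seeds(methods, toy_score, seeds=range(10))
for name, best_scores in results.items():
    print(f"{name}: mean best = {mean(best_scores):.2f}, sd = {stdev(best_scores):.2f}")
```

Reporting the mean and spread across seeds, instead of the single best run, is what separates a lucky search from a repeatable one.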
Project Variations
- Test your pipeline on amyloid-β mutants to see whether your design rules still pick strong candidates.
- Swap in a sequence-only scoring method and compare it with ESM2 to measure the model's added value.
- Vary the weight of the predicted self-aggregation penalty and check whether the final peptide rankings change.
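For the self-aggregation variation, one way to put a number on "does the ranking change" is the Spearman rank correlation between scores with and without the penalty. A value near 1 means the order barely moved. The sketch below implements it with the standard library only. The binding and aggregation values are placeholders for illustration, not results, and `penalty_weight` is a hypothetical parameter name.

```python
def ranks(values):
    # Rank 1 = highest value; tied values share the average of their ranks.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    # Plain Pearson correlation; undefined (division by zero) if x or y is constant.
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(ranks(x), ranks(y))

# Placeholder values (assumption): one entry per candidate peptide.
binding = [5.0, 4.0, 3.0, 2.0]
aggregation = [3.0, 0.5, 0.2, 0.1]
penalty_weight = 1.0  # hypothetical weight on the aggregation penalty

penalized = [b - penalty_weight * a for b, a in zip(binding, aggregation)]
print("rank agreement:", round(spearman(binding, penalized), 3))
```

Repeating the calculation over several penalty weights shows how sensitive the final ranking is to that one design choice.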
Learn More
- PubMed: Search review articles on amyloid-β aggregation, peptide inhibitors, and molecular design.
- RCSB Protein Data Bank: Find amyloid-related structures and study known interaction sites.
- NIH Bookshelf: Read free textbook chapters on protein structure, peptides, and molecular recognition.
- UniProt: Check amyloid precursor protein and amyloid-β sequence context.
- GROMACS tutorials: Use the official guides and examples to learn molecular dynamics setup and analysis.
Biochemistry Category Guide
How to Do Real Biochemistry Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
