Protein Binder Design for Disease Targets

ISEF Category: Biomedical and Health Sciences

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Genetics and Molecular Biology of Disease · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months

The Hook

A tiny protein can act like a custom key for a disease target. You do not need a wet lab to test that idea first. With the right workflow, you can design many candidates on a laptop, then rank them by how well they are predicted to fold and bind. That makes this topic a strong mix of biology, code, and real drug-design thinking.

What Is It?

De novo design means you build a new protein sequence from scratch instead of starting with a natural one. Think of it like designing a custom key for a lock. The lock is the disease target, such as IL-6R (the interleukin-6 receptor, part of an inflammation signaling pathway) or PCSK9 (a regulator of LDL cholesterol levels). The key is a miniprotein, a very small protein built to sit on one surface of the target and block a signal or interaction.

ProteinMPNN suggests amino acid sequences that fit a chosen backbone, which is the 3D path of the protein chain. ESMFold then predicts whether each sequence folds into that shape on its own, and AlphaFold-Multimer predicts whether the binder and target form a plausible complex. A Rosetta relax run scores whether the structure looks strained. In this project, you are not proving the binder works in a patient, or even in a test tube. You are building a computational screening pipeline that can sort promising designs from weak ones.
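
One way to turn a fold prediction into a number for your scoring sheet is to average the per-residue confidence (pLDDT). The sketch below assumes the predicted structure is PDB-format text whose B-factor column holds pLDDT, which is the convention AlphaFold outputs use and ESMFold outputs commonly follow. Check your own notebook's output, since some tools report pLDDT on a 0 to 1 scale rather than 0 to 100. The names `mean_plddt` and `make_atom_line` and the demo records are hypothetical, not part of any of the tools above.

```python
def mean_plddt(pdb_text: str) -> float:
    """Average the B-factor column over C-alpha atoms in PDB-format text.

    Assumption: the B-factor column stores per-residue pLDDT.
    """
    values = []
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name; columns 61-66 hold the B-factor.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("no C-alpha ATOM records found")
    return sum(values) / len(values)


def make_atom_line(serial, name, res, chain, resseq, b_factor):
    """Build one fixed-width ATOM record for the demo (dummy coordinates)."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
        f"{0.0:>8.3f}{0.0:>8.3f}{0.0:>8.3f}{1.0:>6.2f}{b_factor:>6.2f}"
    )


# Made-up records: only the two C-alpha atoms count toward the average.
demo = "\n".join([
    make_atom_line(1, " N  ", "MET", "A", 1, 50.0),
    make_atom_line(2, " CA ", "MET", "A", 1, 80.0),
    make_atom_line(3, " CA ", "LYS", "A", 2, 90.0),
])
print(round(mean_plddt(demo), 2))
```

With a real prediction, you would read the file contents into a string and pass that to `mean_plddt` instead of the demo text.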

Why This Is a Good Topic

This makes a strong science fair topic because you can test clear design choices, like target choice, protein length, or scoring rules, and each choice gives you measurable outputs. It connects to real problems in disease therapy, since protein binders can block signaling proteins or help make new drugs. You can learn structure prediction, controls, and data analysis without needing a wet lab. That keeps the project realistic while still feeling like real research.

Research Questions

How does target choice, such as IL-6R versus PCSK9, change the predicted interface confidence of de novo miniprotein binders?
What is the effect of miniprotein length on AlphaFold-Multimer confidence and Rosetta-relax energy?
Does adding more interface-focused residues improve predicted binding confidence without hurting stability?
To what extent do designs with higher interface contact counts also show lower (more favorable) Rosetta-relax energy?
Which backbone topology gives the best combined score for a chosen target, a helix-turn-helix or a helix-bundle?
How does a shuffled-sequence negative control compare with designed sequences on the same scoring rubric?
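
For the shuffled-sequence control in the last question, a minimal sketch might look like the following. Shuffling keeps the amino acid composition and length of the designed sequence but scrambles the order, so any score difference comes from the arrangement of residues. The function name `shuffled_controls` and the example sequence are hypothetical; the sequence is made up for illustration and is not a real binder.

```python
import random


def shuffled_controls(sequence: str, n_controls: int = 5, seed: int = 0) -> list:
    """Return shuffled copies of a sequence with the same composition."""
    rng = random.Random(seed)  # fixed seed so the control set is reproducible
    controls = []
    for _ in range(n_controls):
        residues = list(sequence)
        rng.shuffle(residues)  # in-place shuffle of the residue order
        controls.append("".join(residues))
    return controls


designed = "SPEELLKKALELAKKGNDEELIELARRAIELAKRG"  # made-up example sequence
controls = shuffled_controls(designed, n_controls=3, seed=42)
for control in controls:
    print(control)
```

Record the seed in your lab notebook so you can regenerate exactly the same controls later.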

Basic Materials

Laptop with reliable internet access and a modern browser. Colab notebooks run on Google's servers, so the laptop itself does not need much computing power.
Google account for Colab access.
Target protein structure file from RCSB PDB or AlphaFold DB.
Spreadsheet software such as Google Sheets or Excel for tracking scores.
Structure viewer such as ChimeraX (free for noncommercial use) or open-source PyMOL for checking interfaces.

Advanced Materials

University HPC cluster or GPU server for larger batches of designs.
Local Rosetta install for repeatable relax scoring.
Python environment with pandas, numpy, and matplotlib.
Benchmark set of known binder-target complexes for comparison.
ChimeraX or PyMOL for interface review and figure making.

Software & Tools

Google Colab: Runs ProteinMPNN and ESMFold notebooks without local setup.
ProteinMPNN: Suggests sequences that fit a chosen backbone shape.
ESMFold: Predicts whether your designed sequence folds as expected.
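AlphaFold-Multimer: Predicts the structure of the binder-target complex and reports confidence scores you can use to judge the interface.
Rosetta Relax: Refines a model by energy minimization and reports an energy score, which helps you spot strained designs and compare candidates.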

Experiment Steps

Choose one disease target and define the exact binding site you will design against.
Set design rules for miniprotein length, secondary structure, and interface chemistry before you generate sequences.
Generate a small candidate set and rank it with one shared scoring sheet.
Compare prediction confidence, interface contact quality, and Rosetta-relax energy so one metric does not dominate.
Add negative controls and a simple baseline, then decide which candidates survive your final filter.
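
The shared scoring sheet in the steps above could be sketched as follows. This is one possible scheme, not a standard: each metric is converted to a z-score across all candidates so that metrics with large raw numbers (like Rosetta energy) do not swamp the others, and the sign is flipped for metrics where lower is better. The metric names, the equal weighting, and all the numbers are assumptions for illustration.

```python
from statistics import mean, pstdev

# Sign convention: +1 if higher is better, -1 if lower is better.
METRIC_SIGNS = {"confidence": +1, "contacts": +1, "relax_energy": -1}


def combined_scores(designs):
    """Return {design name: mean of sign-adjusted z-scores across metrics}."""
    totals = {name: 0.0 for name in designs}
    for metric, sign in METRIC_SIGNS.items():
        values = [row[metric] for row in designs.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, row in designs.items():
            # If every candidate has the same value, the metric adds nothing.
            z = 0.0 if sigma == 0 else (row[metric] - mu) / sigma
            totals[name] += sign * z
    return {name: total / len(METRIC_SIGNS) for name, total in totals.items()}


# Made-up numbers for two designs and one shuffled control.
designs = {
    "design_01": {"confidence": 0.82, "contacts": 24, "relax_energy": -210.0},
    "design_02": {"confidence": 0.55, "contacts": 12, "relax_energy": -180.0},
    "shuffled_01": {"confidence": 0.30, "contacts": 5, "relax_energy": -120.0},
}
scores = combined_scores(designs)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because z-scores depend on the whole candidate set, only compare combined scores that were computed together on the same target setup.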

Common Pitfalls

Designing against a target structure with missing loops, which can make the interface look better than it really is.
Mixing scores from different target setups, which breaks direct comparison between candidates.
Treating high confidence as proof of binding, even when the contact map is weak.
Letting Rosetta energy override all other signals, which can reward stable but useless folds.
Skipping negative controls, which hides whether your pipeline beats a random-sequence baseline.
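
To check the contact map rather than trusting confidence alone, you can count how many binder atoms sit close to target atoms in the predicted complex. The sketch below is a brute-force count over coordinate lists; the 5.0 angstrom cutoff is an assumed, commonly used value rather than a fixed rule, and the function name and coordinates are hypothetical. Extracting coordinates from your structure file is left to your viewer or parser of choice.

```python
import math


def count_interface_contacts(binder_atoms, target_atoms, cutoff=5.0):
    """Count binder-target atom pairs closer than `cutoff` angstroms."""
    contacts = 0
    for b in binder_atoms:
        for t in target_atoms:
            if math.dist(b, t) < cutoff:  # Euclidean distance in 3D
                contacts += 1
    return contacts


# Made-up (x, y, z) coordinates in angstroms.
binder_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
target_atoms = [(4.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
n_contacts = count_interface_contacts(binder_atoms, target_atoms)
print(n_contacts)
```

Use the same cutoff and the same atom selection for every candidate and control, or the counts will not be comparable.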

What Makes This Competitive

A strong version of this project does more than generate a few pretty models. You compare several target sites, build a clear scoring rule, and test whether the top designs still look good after a stricter second pass. The best entries usually include negative controls, a baseline against known binders, and a simple statistical summary of why one design strategy wins. That turns a notebook demo into a real screening workflow.
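
As one example of a simple statistical summary, an exact permutation test asks how often a random relabeling of designs and controls would give a score gap at least as large as the one you observed. The sketch below is a minimal version under stated assumptions: it compares group means, tests one direction only (designs scoring higher), and enumerates every split, which is only practical for small groups. The function name and scores are made up.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(designed, controls):
    """One-sided exact permutation test on the difference in group means."""
    pooled = list(designed) + list(controls)
    n = len(designed)
    observed = mean(designed) - mean(controls)
    total = 0
    at_least_as_large = 0
    # Try every way of labeling n of the pooled scores as "designed".
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / total


# Made-up combined scores for five designs and five shuffled controls.
designed_scores = [0.80, 0.85, 0.90, 0.78, 0.88]
control_scores = [0.40, 0.45, 0.50, 0.42, 0.48]
p_value = permutation_p_value(designed_scores, control_scores)
print(round(p_value, 4))
```

A small p-value here only says the designs outscore the controls on your rubric more than chance relabeling would; it does not show that any design binds its target.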

Project Variations

Compare miniprotein binders against two disease targets, IL-6R and PCSK9, to see whether target shape changes design success.
Test whether shorter or longer miniproteins earn better combined scores for the same target.
Compare ProteinMPNN-designed sequences with shuffled-sequence controls or a simpler baseline design method.

Learn More

RCSB Protein Data Bank: Search target structures and real protein-protein interfaces.
PubMed: Find review articles on miniprotein binders, protein design, and target selection.
NCBI Bookshelf: Read free background chapters on protein structure and molecular interactions.
RosettaCommons documentation: Learn how Rosetta relax and scoring work.
Google Colab documentation: Set up notebooks and manage files for your design workflow.

Biomedical and Health Sciences Category Guide

How to Do Real Biomedical and Health Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →