De Novo Viral Protein Binder Design

ISEF Category: Biochemistry

Subcategory: Structural Biochemistry  ·  Difficulty: Advanced  ·  Setup: Home Setup  ·  Time: 1 to 2 Months

The Hook

A small custom protein can act like a lockpick for a virus. In this project, you design a miniprotein from scratch that fits one spot on a viral surface protein. The computer gives you many candidates, but only a few should survive structural scoring. That makes this a direct test of how well current design tools separate plausible binders from unconvincing ones.

What Is It?

A miniprotein binder is a short protein built to stick to another protein at a specific spot. Here, the target is a viral surface protein, which is the part a virus uses to meet and enter host cells. If you can design a binder that fits that surface well, you may block the virus or create a useful detection tool.

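Think of the target as a puzzle piece and the binder as a custom clip. RFdiffusion proposes the binder backbone, which is the overall 3D path of the protein chain before any specific amino acids are chosen. ProteinMPNN then picks the amino acids, the letters of the protein sequence, that are expected to fold into that backbone. AlphaFold-Multimer acts like a reality check: it predicts the structure of the binder and target together and reports confidence scores that show whether the designed interface holds up in the prediction. Those scores are computational estimates of fit, not measurements of binding.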

Why This Is a Good Topic

This topic works well because you can test many design choices with the same scoring rules and get clear numbers back. It connects to viral entry, antiviral design, and protein engineering, so the real-world stakes are easy to explain. You can learn target selection, sequence design, and structure-based filtering without needing a full wet lab at the start.

Research Questions

  • How does the choice of viral surface protein change AlphaFold-Multimer confidence for top binder designs?
  • What is the effect of binder length on interface quality, predicted clash count, and model confidence?
  • Does starting from an exposed epitope produce more high-scoring candidates than starting from a partially buried epitope?
  • To what extent do RFdiffusion backbone scores agree with AlphaFold-Multimer rankings?
  • Which sequence filters remove the most false positives before structure validation?
  • How does target flexibility affect the number of designs that keep the same interface across repeated predictions?
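
The question about whether an upstream design score agrees with AlphaFold-Multimer rankings can be answered with a rank correlation. The sketch below computes Spearman's rho in plain Python. The score values and the variable names `upstream_scores` and `multimer_iptm` are made up for illustration; in a real run you would paste in one number per design from your own tracking sheet.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one score never varies")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length lists with at least 3 designs")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five designs, listed in the same order in both lists.
upstream_scores = [0.91, 0.85, 0.78, 0.60, 0.55]
multimer_iptm = [0.80, 0.72, 0.75, 0.40, 0.35]
print(round(spearman(upstream_scores, multimer_iptm), 2))
```

A rho near 1 means the two scores rank designs in nearly the same order, a value near 0 means they are unrelated, and a negative value means they disagree. With only a handful of designs the number is noisy, so report how many designs went into it.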

Basic Materials

  • Free Google Colab account with GPU access when available.
  • Laptop with a reliable internet connection and at least 8 GB of RAM.
  • FASTA and PDB files for one viral surface protein target.
  • Spreadsheet or digital lab notebook for tracking design IDs, scores, and filters.
  • Free UCSF ChimeraX or a similar structure viewer for checking interfaces.

Advanced Materials

  • Synthetic gene order for top binder sequences.
  • Expression vector with a suitable fusion tag or purification tag.
  • Competent E. coli or a yeast display system for expression screening.
  • Purified viral target protein or target domain.
  • Access to size-exclusion chromatography, BLI, or SPR for binding tests.
  • SDS-PAGE, Western blot, and purification reagents for expression checks.

Software & Tools

  • Google Colab: Runs the design notebooks without local GPU hardware.
  • RFdiffusion: Generates new protein backbones around the chosen viral target.
  • ProteinMPNN: Chooses amino acid sequences that fit each backbone.
  • ColabFold: Runs AlphaFold-Multimer predictions of each binder-target complex and reports confidence scores such as pLDDT and PAE.
  • UCSF ChimeraX: Lets you inspect interfaces, clashes, and shape fit.
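
Confidence scores are easier to compare once they are reduced to a few numbers per design. The sketch below assumes a ColabFold-style scores file that has already been loaded as a dictionary with the keys `plddt` (one value per residue), `pae` (a square matrix), and `iptm`. Those key names, and the assumption that the target chain comes first and the binder second, should be checked against your own output files before you rely on this. The toy values are invented.

```python
import json


def summarize_scores(scores, target_len, binder_len):
    """Reduce per-residue confidence data to three numbers for one complex.

    Assumes residues are ordered target first, then binder.
    """
    plddt = scores["plddt"]
    pae = scores["pae"]
    total = target_len + binder_len
    if len(plddt) != total or len(pae) != total:
        raise ValueError("chain lengths do not match the score arrays")

    target_idx = range(target_len)
    binder_idx = range(target_len, total)

    # Cross-chain PAE in both directions: target->binder and binder->target.
    cross = [pae[i][j] for i in target_idx for j in binder_idx]
    cross += [pae[j][i] for i in target_idx for j in binder_idx]

    return {
        "binder_plddt": sum(plddt[i] for i in binder_idx) / binder_len,
        "interface_pae": sum(cross) / len(cross),
        "iptm": scores.get("iptm"),  # None if the file has no ipTM value
    }


# Toy example: a 2-residue "target" and a 1-residue "binder".
toy_json = """
{"plddt": [90, 80, 70],
 "pae": [[0, 1, 5], [1, 0, 7], [6, 8, 0]],
 "iptm": 0.8}
"""
toy_summary = summarize_scores(json.loads(toy_json), target_len=2, binder_len=1)
print(toy_summary)
```

Lower cross-chain PAE and higher binder pLDDT both point toward a more confident predicted interface. For real files, replace `json.loads(toy_json)` with `json.load` on the opened scores file.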

Experiment Steps

  1. Define one viral target and one binding hotspot you will try to hit.
  2. Set a scoring rubric that balances interface confidence, shape fit, and sequence quality.
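  3. Generate a broad set of binder backbones with RFdiffusion, design sequences for each backbone with ProteinMPNN, then narrow the set with your sequence filters.
  4. Predict every remaining design and a set of decoys with AlphaFold-Multimer, and record the scores for all of them, not only the best ones.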
  5. Plan a short validation path for the strongest candidates, such as expression, solubility, and binding readouts.
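
The scoring rubric from step 2 and the decoy comparison from step 4 can be written down as one pass/fail function, so every design is judged by the same rules. This is a minimal sketch: the `Design` fields, the hydrophobic residue set, and every threshold (`min_margin`, `max_pae`, `max_clashes`, `max_hydrophobic`, `max_run`) are illustrative assumptions that you would set and justify yourself.

```python
from dataclasses import dataclass

HYDROPHOBIC = set("AVILMFWY")  # assumed residue set for a rough stickiness check


@dataclass
class Design:
    design_id: str
    sequence: str
    iptm: float           # interface confidence from the complex prediction
    interface_pae: float  # mean cross-chain PAE in angstroms
    clashes: int          # clash count from your structure check


def longest_run(seq):
    """Length of the longest stretch of one repeated amino acid."""
    best = run = 0
    prev = None
    for aa in seq:
        run = run + 1 if aa == prev else 1
        best = max(best, run)
        prev = aa
    return best


def sequence_ok(seq, max_hydrophobic=0.5, max_run=4):
    """Reject empty, very hydrophobic, or low-complexity sequences."""
    if not seq:
        return False
    hydrophobic_fraction = sum(aa in HYDROPHOBIC for aa in seq) / len(seq)
    return hydrophobic_fraction <= max_hydrophobic and longest_run(seq) <= max_run


def passes(design, decoy_iptms, min_margin=0.1, max_pae=10.0, max_clashes=0):
    """A design must beat the best decoy by a margin and clear every other gate."""
    baseline = max(decoy_iptms)
    return (
        design.iptm >= baseline + min_margin
        and design.interface_pae <= max_pae
        and design.clashes <= max_clashes
        and sequence_ok(design.sequence)
    )


# Hypothetical designs and decoy scores.
decoys = [0.30, 0.45]
designs = [
    Design("d001", "MKEELLKKAEELAKRG", iptm=0.82, interface_pae=6.0, clashes=0),
    Design("d002", "MLLLLLVVVAIIFFWW", iptm=0.85, interface_pae=5.0, clashes=0),
    Design("d003", "MKEELLKKAEELAKRG", iptm=0.50, interface_pae=6.0, clashes=0),
]
survivors = [d.design_id for d in designs if passes(d, decoys)]
print(survivors)
```

Fixing the thresholds before you look at the results keeps the rubric honest. If you change a threshold later, rerun every design through the new rule and report both outcomes.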

Common Pitfalls

  • Picking a target region with poor structure data or missing loops, which can make design scores look better than the real protein would allow because the missing parts never get in the binder's way.
  • Ranking candidates by one model score alone, which keeps false positives that fold well on their own but are not predicted to bind the target.
  • Mixing target conformations from different sources, which makes interface comparisons noisy and hard to trust.
  • Ignoring sequence diversity, which leaves you with many nearly identical designs instead of a real search space.
  • Skipping negative controls, which makes it hard to tell whether a high score means binding or just sticky behavior.
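
The sequence diversity pitfall can be checked with a simple greedy clustering pass before you spend GPU time on structure prediction. The sketch below uses the similarity ratio from Python's standard `difflib` module as a rough stand-in for sequence identity; it is not a real alignment tool, and the 0.8 threshold and the example sequences are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Rough similarity between two sequences, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences, threshold=0.8):
    """Group sequences around representatives.

    `sequences` maps design ID to sequence and should be ordered best first,
    so that the best design in each group becomes its representative.
    """
    clusters = {}
    for design_id, seq in sequences.items():
        for rep_id in clusters:
            if similarity(seq, sequences[rep_id]) >= threshold:
                clusters[rep_id].append(design_id)
                break
        else:
            # No existing representative is similar enough: start a new group.
            clusters[design_id] = [design_id]
    return clusters


# Hypothetical designs: "a" and "b" differ by one residue, "c" is unrelated.
example_sequences = {
    "a": "MKEELLKKAEELAKRG",
    "b": "MKEELLKKAEELAKRA",
    "c": "GSPTWYHNQDCVIFMR",
}
example_clusters = greedy_cluster(example_sequences)
print(example_clusters)
```

Reporting the number of clusters alongside the number of designs shows how much of the search space you actually sampled. Carrying one representative per cluster forward is one way to keep the validation set varied.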

What Makes This Competitive

A strong project does more than generate pretty protein models. It tests a clear design rule, compares many candidates against decoys, and shows how much the ranking changes when you change the target site or scoring filter. The best version pairs structural prediction with a second analysis layer, like interface size, clash counts, or repeated runs with different seeds. That makes your result easier to trust and easier to explain.
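
Interface size and clash counts can be estimated directly from the atom coordinates in a predicted complex. The sketch below reads fixed-column `ATOM` records from PDB-format text, skips hydrogens, and counts atom pairs between two chains that fall within a distance cutoff. The 4.0 angstrom contact cutoff and 2.5 angstrom clash cutoff are assumed values for illustration, and the `atom_line` helper exists only to build a tiny test structure.

```python
import math


def parse_atoms(pdb_text):
    """Return {chain_id: [(x, y, z), ...]} for non-hydrogen ATOM records."""
    atoms = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        # Element is in columns 77-78; fall back to the atom name if absent.
        element = line[76:78].strip() or line[12:16].strip()[:1]
        if element == "H":
            continue
        chain = line[21]
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.setdefault(chain, []).append(xyz)
    return atoms


def count_pairs(atoms_a, atoms_b, cutoff):
    """Count atom pairs, one from each chain, no farther apart than cutoff."""
    return sum(1 for p in atoms_a for q in atoms_b if math.dist(p, q) <= cutoff)


def atom_line(serial, name, chain, x, y, z, element):
    """Build one PDB-format ATOM line (demo helper, alanine residue 1)."""
    return (
        "ATOM  " + f"{serial:5d}" + " " + f"{name:<4s}" + " " + "ALA" + " "
        + chain + f"{1:4d}" + " " * 4
        + f"{x:8.3f}{y:8.3f}{z:8.3f}" + f"{1.0:6.2f}{0.0:6.2f}"
        + " " * 10 + f"{element:>2s}"
    )


# Toy complex: chain A is the "target", chain B is the "binder".
toy_pdb = "\n".join([
    atom_line(1, "CA", "A", 0.0, 0.0, 0.0, "C"),
    atom_line(2, "CB", "A", 20.0, 0.0, 0.0, "C"),
    atom_line(3, "CA", "B", 3.5, 0.0, 0.0, "C"),
    atom_line(4, "CB", "B", 2.0, 0.0, 0.0, "C"),
    atom_line(5, "N", "B", 50.0, 0.0, 0.0, "N"),
    atom_line(6, "H", "B", 1.0, 0.0, 0.0, "H"),  # hydrogen, ignored
])
toy_atoms = parse_atoms(toy_pdb)
toy_contacts = count_pairs(toy_atoms["A"], toy_atoms["B"], cutoff=4.0)
toy_clashes = count_pairs(toy_atoms["A"], toy_atoms["B"], cutoff=2.5)
print(toy_contacts, toy_clashes)
```

The double loop is fine for a miniprotein and a single target domain, but it slows down on large complexes. Whatever cutoffs you choose, apply the same ones to designs and decoys so the counts stay comparable.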

Project Variations

  • Design binders against a different viral surface protein family, then compare which target gives the best prediction scores.
  • Test how short, medium, and longer miniproteins change interface confidence for the same epitope.
  • Compare designs aimed at a conserved site versus a strain-specific site to see which route gives cleaner predictions.
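
For the binder length variation, it helps to group designs into length bins and summarize each bin rather than comparing individual designs. The sketch below does that with the standard library. The bin edges (50 and 80 residues) and the example records are placeholders, not recommended values.

```python
from collections import defaultdict
from statistics import mean, stdev


def length_bin(length, edges=(50, 80)):
    """Label a binder length as short, medium, or long using assumed edges."""
    if length < edges[0]:
        return "short"
    if length < edges[1]:
        return "medium"
    return "long"


def summarize_by_length(records):
    """records: iterable of (binder_length, score) pairs for one epitope."""
    groups = defaultdict(list)
    for length, score in records:
        groups[length_bin(length)].append(score)
    return {
        name: {
            "n": len(scores),
            "mean": mean(scores),
            # Standard deviation needs at least two values.
            "sd": stdev(scores) if len(scores) > 1 else 0.0,
        }
        for name, scores in groups.items()
    }


# Hypothetical (length, interface confidence) pairs.
example_records = [(40, 0.6), (45, 0.8), (60, 0.5), (90, 0.7)]
length_summary = summarize_by_length(example_records)
for name, stats in length_summary.items():
    print(name, stats)
```

Keeping the epitope fixed while only the length changes makes differences between bins easier to attribute to length. Bins with very few designs should be reported with their counts so readers can judge how much weight to give them.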

Learn More

  • PubMed: Search review articles on de novo protein design, miniproteins, and viral entry proteins.
  • RCSB PDB: Find solved protein structures and inspect viral surface proteins and interfaces in the 3D viewer.
  • AlphaFold Protein Structure Database: Search predicted structures for your target family and compare related proteins.
  • MIT OpenCourseWare: Search structural biology and protein biochemistry lectures for free background material.
  • ColabFold GitHub repository: Read the notebook notes and input guidance for AlphaFold-Multimer workflows.
  • NCBI Protein: Look up sequence records, domains, and variants for your target protein.