In Silico Multi-Epitope Vaccine Design for Pathogens
ISEF Category: Cellular and Molecular Biology
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Cellular Immunology · Difficulty: Advanced · Setup: University Lab · Time: Full Year
The Hook
A single protein can hold dozens of tiny immune targets. If you pick the right ones, you can design a vaccine on a laptop before anyone makes a real molecule in the lab. That makes this project a powerful mix of biology, coding, and public health. It also connects directly to diseases that still hit millions of people in low-resource regions.
What Is It?
This project asks you to design a vaccine candidate on a computer instead of in a wet lab. You start with a pathogen protein, such as a surface protein from Leishmania donovani. Then you search for short fragments called epitopes, which are pieces the immune system can recognize. Think of the protein like a long sentence, and the epitopes like the words your immune system actually reads.
Next, you test which epitopes are likely to bind HLA molecules. HLA proteins sit on human cells and present pieces of pathogens to T cells. NetMHCpan predicts that binding for HLA class I molecules. After that, you can build a multi-epitope vaccine sequence and check whether its predicted 3D structure looks well folded, using modeling tools such as AlphaFold-Multimer or related structure prediction workflows. In plain terms, you are asking, “Which peptide combo looks most likely to get noticed by the immune system and still fold in a sensible way?”
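To make the binding step concrete, here is a minimal Python sketch of how you might sort predictions into strong binders, weak binders, and non-binders. It assumes you have already copied peptide, allele, and percentile rank values out of a prediction tool by hand. The cutoffs (below 0.5 for strong, below 2 for weak) follow the thresholds commonly used with NetMHCpan class I output, but treat them as assumptions to check against the version you run. The peptides and numbers are made up for illustration, and `Prediction`, `classify`, and `binders_by_peptide` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed cutoffs on percentile rank (lower rank = stronger predicted binding).
# Check these against the documentation of the tool version you actually use.
STRONG_RANK = 0.5
WEAK_RANK = 2.0


@dataclass(frozen=True)
class Prediction:
    peptide: str
    allele: str
    pct_rank: float


def classify(pred: Prediction) -> str:
    """Label one prediction as 'strong', 'weak', or 'non-binder'."""
    if pred.pct_rank < STRONG_RANK:
        return "strong"
    if pred.pct_rank < WEAK_RANK:
        return "weak"
    return "non-binder"


def binders_by_peptide(preds):
    """Map each peptide to the set of alleles it is predicted to bind."""
    result = {}
    for pred in preds:
        if classify(pred) != "non-binder":
            result.setdefault(pred.peptide, set()).add(pred.allele)
    return result


# Made-up rows for illustration only; these are not real predictions.
example = [
    Prediction("KLMNPQRST", "HLA-A*02:01", 0.3),
    Prediction("KLMNPQRST", "HLA-A*24:02", 1.4),
    Prediction("GHIKLMNPQ", "HLA-A*02:01", 7.5),
]

for peptide, alleles in binders_by_peptide(example).items():
    print(peptide, sorted(alleles))
```

A peptide that binds several alleles in your set is usually more interesting than one that binds a single allele very strongly, so counting alleles per peptide is a useful first summary.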
Why This Is a Good Topic
This is a strong science fair topic because you can compare many designs with clear scoring rules. You are not guessing. You are ranking candidates by binding strength, coverage of HLA alleles, conservation across strains, and structural fit. The project also connects to a real problem: neglected tropical diseases, where vaccines are still needed. Along the way, you learn how immunology, bioinformatics, and basic statistics work together.
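One way to turn "clear scoring rules" into something you can defend is a weighted composite score. The sketch below is only an illustration: the four feature names mirror the criteria listed above, but the weights are hypothetical and you would need to justify your own. It assumes every feature has already been scaled to the range 0 to 1, with higher meaning better.

```python
# Hypothetical weights; they must be chosen and justified by you.
WEIGHTS = {
    "binding": 0.4,
    "coverage": 0.3,
    "conservation": 0.2,
    "structure": 0.1,
}


def composite_score(features, weights=WEIGHTS):
    """Weighted sum of features, each assumed to be scaled to 0..1."""
    missing = set(weights) - set(features)
    if missing:
        raise ValueError(f"missing features: {sorted(missing)}")
    return sum(weights[name] * features[name] for name in weights)


def rank_candidates(candidates, weights=WEIGHTS):
    """Return candidate names sorted from best to worst composite score."""
    return sorted(
        candidates,
        key=lambda name: composite_score(candidates[name], weights),
        reverse=True,
    )


# Made-up feature values for two hypothetical designs.
designs = {
    "design_A": {"binding": 0.9, "coverage": 0.5, "conservation": 0.8, "structure": 0.7},
    "design_B": {"binding": 0.6, "coverage": 0.9, "conservation": 0.9, "structure": 0.6},
}

for name in rank_candidates(designs):
    print(name, round(composite_score(designs[name]), 3))
```

Rerunning the ranking with a few different weight sets is a simple sensitivity check: if the same design wins every time, your conclusion is easier to defend.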
Research Questions
- How does HLA allele selection change the number of predicted high-affinity epitopes from the target protein?
- What is the effect of using only conserved epitopes on predicted population coverage in endemic regions?
- Does adding linker sequences change the predicted stability of a multi-epitope vaccine model?
- To what extent do different adjuvant fusion choices alter predicted antigenicity and allergenicity scores?
- Which epitope ranking method produces the most consistent vaccine candidates across multiple prediction tools?
- How does sequence variation across pathogen strains affect the overlap of top-ranked epitopes?
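The questions about conservation and strain variation can be measured with two small calculations. The sketch below assumes a deliberately simple definition: an epitope counts as conserved in a strain only if it appears there as an exact substring. The Jaccard index then measures how much two lists of top-ranked epitopes overlap. The sequences are made up, and `conservation` and `jaccard` are hypothetical helper names. A real analysis might start from a multiple sequence alignment and allow near matches.

```python
def conservation(epitope, strain_sequences):
    """Fraction of strain sequences that contain the epitope exactly."""
    if not strain_sequences:
        raise ValueError("need at least one strain sequence")
    hits = sum(1 for seq in strain_sequences if epitope in seq)
    return hits / len(strain_sequences)


def jaccard(first, second):
    """Overlap of two epitope collections: |intersection| / |union|."""
    a, b = set(first), set(second)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


# Made-up sequences standing in for the same protein from three isolates.
strains = [
    "MKTAYIAKQRQISFVK",
    "MKTAYIAKQRQLSFVK",
    "MKTAYLAKQRQISFVK",
]

print(conservation("TAYIAKQRQ", strains))

# Made-up top-ranked epitope lists for two strains.
top_strain_1 = ["TAYIAKQRQ", "AKQRQISFV", "KTAYIAKQR"]
top_strain_2 = ["TAYIAKQRQ", "AKQRQLSFV", "KTAYIAKQR"]
print(jaccard(top_strain_1, top_strain_2))
```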
Basic Materials
- Computer with internet access and enough memory for structure files.
- FASTA sequence of the target pathogen protein.
- Access to NetMHCpan through a public web server or institutional access.
- Protein sequence analysis tool such as ExPASy ProtParam.
- Peptide property prediction tools for antigenicity, allergenicity, and solubility.
- Spreadsheet software such as Google Sheets or Excel.
- Reference set of HLA allele frequencies for endemic populations from published studies or public databases.
- Literature management tool such as Zotero.
Advanced Materials
- Workstation or server with a modern GPU if local structure prediction is used.
- Command-line environment with Python and Conda.
- Local installation or institutional access to peptide and HLA prediction packages.
- Access to AlphaFold-Multimer or a comparable structure prediction pipeline.
- Molecular visualization software such as PyMOL or UCSF ChimeraX.
- Docking or refinement software for protein-peptide complexes.
- Access to a peptide synthesis or immunology lab for future validation planning.
- Sequence alignment software for conservation analysis across pathogen isolates.
Software & Tools
- NetMHCpan: Predicts peptide binding to HLA molecules and helps you rank candidate epitopes.
- IEDB Analysis Resource: Provides public tools for epitope prediction, population coverage, and immune property checks.
- AlphaFold-Multimer: Predicts protein and protein complex structures so you can inspect whether a designed construct is predicted to fold reasonably.
- UCSF ChimeraX: Lets you visualize models, compare interfaces, and inspect binding regions.
- R: Helps you summarize prediction scores, compare candidates, and make clean figures.
Experiment Steps
- Define the target pathogen protein, the HLA allele set, and the population you want your design to serve.
- Screen the protein for candidate T-cell and B-cell epitopes, then rank them by binding, conservation, and immune property filters.
- Assemble a short list of non-overlapping epitopes and decide how you will connect them in a single construct.
- Model the full vaccine sequence and compare predicted structure quality across several design versions.
- Evaluate each design with a scoring rubric that includes antigenicity, allergenicity, solubility, and HLA population coverage.
- Compare your top designs against published vaccine constructs or random epitope sets to show why your choice is better.
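The assembly step above can be sketched in a few lines of Python. This version uses a greedy rule: take epitopes from highest score to lowest, and skip any that overlap one already chosen. That is one simple strategy, not the only valid one. The linker `AAY` is a spacer often seen in published multi-epitope designs, but here it is only a default you should treat as an assumption. The epitope sequences, positions, and scores are made up, and `Epitope`, `select_non_overlapping`, and `assemble` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epitope:
    sequence: str
    start: int    # 0-based start position in the source protein
    score: float  # higher = better under your own rubric

    @property
    def end(self) -> int:
        """Position just past the last residue (half-open interval)."""
        return self.start + len(self.sequence)


def overlaps(a: Epitope, b: Epitope) -> bool:
    """True if the two epitopes share at least one source position."""
    return a.start < b.end and b.start < a.end


def select_non_overlapping(epitopes):
    """Greedy pick by descending score, returned in source order."""
    chosen = []
    for ep in sorted(epitopes, key=lambda e: e.score, reverse=True):
        if not any(overlaps(ep, kept) for kept in chosen):
            chosen.append(ep)
    return sorted(chosen, key=lambda e: e.start)


def assemble(epitopes, linker="AAY"):
    """Join epitope sequences with a linker between each pair."""
    return linker.join(ep.sequence for ep in epitopes)


# Made-up candidates for illustration only.
candidates = [
    Epitope("KLMNPQRST", start=0, score=0.9),
    Epitope("PQRSTVWYA", start=5, score=0.8),   # overlaps the first one
    Epitope("GHIKLMNPQ", start=20, score=0.7),
]

construct = assemble(select_non_overlapping(candidates))
print(construct)
```

Building the same epitope set with two or three different linkers gives you the design versions to compare in the modeling step.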
Common Pitfalls
- Using only one HLA allele, which makes the design look good for people who carry that allele but weak for the rest of the target population.
- Picking epitopes with strong binding scores but ignoring whether the sequences are conserved across pathogen strains.
- Treating antigenicity predictions as proof of immune response, which overstates what the software can actually tell you.
- Building a long construct without checking linker effects, which can distort the predicted structure.
- Comparing scores from different tools without normalizing them, which makes the final ranking hard to defend.
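To avoid the last pitfall, you can rescale every tool's output to a common 0 to 1 range before combining scores. The sketch below uses min-max scaling, which is one common choice; z-scores or rank-based scaling are reasonable alternatives. It also flips scores where lower is better, such as a percentile rank. The function name `min_max` and the numbers are illustrative assumptions.

```python
def min_max(values, higher_is_better=True):
    """Rescale values to 0..1 so that 1 always means 'best'."""
    if not values:
        raise ValueError("need at least one value")
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All candidates tie on this feature; give them a neutral score.
        return [0.5] * len(values)
    scaled = [(v - lowest) / (highest - lowest) for v in values]
    if higher_is_better:
        return scaled
    return [1.0 - s for s in scaled]


# Made-up scores for three candidates from two hypothetical tools.
antigenicity = [0.42, 0.61, 0.80]  # higher is better
percentile_rank = [0.5, 2.0, 3.5]  # lower is better

print(min_max(antigenicity))
print(min_max(percentile_rank, higher_is_better=False))
```

Min-max scaling depends on the best and worst candidates in your own set, so the scaled values are only comparable within one batch. Report the raw scores alongside them.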
What Makes This Competitive
A class-level version usually stops at listing predicted epitopes. A stronger project explains why one design wins over another. You can make it more competitive by using multiple HLA populations, conservation across strains, and a clear scoring system that weights each feature. A careful comparison of several construct designs, not just one, also makes your analysis more rigorous.
Project Variations
- Use a different tropical pathogen, such as Plasmodium, Trypanosoma, or dengue virus, to compare how epitope patterns change across diseases.
- Focus on HLA allele coverage in one endemic region, then compare the design to global allele frequencies.
- Add a sequence conservation analysis across multiple pathogen isolates before epitope ranking to see whether your best targets stay stable.
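For the regional versus global comparison, a rough coverage estimate helps you see what the numbers mean before you rely on a dedicated tool. The sketch below is a simplified model, not the method used by the IEDB population coverage tool. It assumes Hardy-Weinberg proportions at one HLA locus, so a person carries at least one covered allele with probability 1 - (1 - F)^2, where F is the summed frequency of covered alleles. It also assumes loci are independent when combining them. All frequencies are made up, and `locus_coverage` and `combined_coverage` are hypothetical names.

```python
def locus_coverage(allele_freqs, covered_alleles):
    """Estimated fraction of people carrying >= 1 covered allele at one locus.

    Simplifying assumption: Hardy-Weinberg proportions, two allele copies.
    """
    total = sum(
        freq for allele, freq in allele_freqs.items() if allele in covered_alleles
    )
    if total < 0 or total > 1 + 1e-9:
        raise ValueError("covered allele frequencies must sum to 0..1")
    total = min(total, 1.0)
    return 1.0 - (1.0 - total) ** 2


def combined_coverage(per_locus):
    """Combine loci, assuming they are independent (a simplification)."""
    uncovered = 1.0
    for coverage in per_locus:
        uncovered *= 1.0 - coverage
    return 1.0 - uncovered


# Made-up allele frequencies for one locus in two populations.
regional = {"HLA-A*02:01": 0.20, "HLA-A*24:02": 0.10, "HLA-A*01:01": 0.15}
global_ = {"HLA-A*02:01": 0.25, "HLA-A*24:02": 0.05, "HLA-A*01:01": 0.10}
covered = {"HLA-A*02:01", "HLA-A*24:02"}

print(round(locus_coverage(regional, covered), 3))
print(round(locus_coverage(global_, covered), 3))
```

Use real allele frequencies from published studies or public databases for your project, and cite the source for each population.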
Learn More
- IEDB Analysis Resource: Search the site for epitope prediction, population coverage, and immunogenicity tools.
- PubMed: Search review articles on reverse vaccinology, epitope-based vaccines, and neglected tropical diseases.
- NIH NCBI Bookshelf: Read free background chapters on immunology, HLA biology, and vaccine design concepts.
- PubChem: Look up adjuvant molecules, peptide properties, and related chemical information when you need background context.
- UCSC or NCBI Genome resources: Use public sequence databases to find pathogen protein sequences and strain variants.
- Nature Reviews Immunology: Search for review articles on antigen presentation, HLA diversity, and vaccine design.
Cellular and Molecular Biology Category Guide
How to Do Real Cellular and Molecular Biology Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
