Vaccine Epitope Design for Tropical Disease Targets

ISEF Category: Biomedical and Health Sciences

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Immunology  ·  Difficulty: Advanced  ·  Setup: Home Setup  ·  Time: 1 to 2 Months

The Hook

A peptide that fits one immune system can miss another by a wide margin. HLA genes act like different display racks, and each rack shows different pathogen pieces to T cells. You can test which pathogen fragments fit the widest set of racks using public databases and prediction tools. That turns vaccine design into a data project you can run from home.

What Is It?

In-silico vaccine-epitope design means using computer tools to find small pieces of a pathogen, called epitopes, that may trigger an immune response. A T-cell epitope is a short peptide that can bind to an HLA (human leukocyte antigen) molecule, which presents peptide fragments to T cells. NetMHCpan predicts which peptides are likely to bind a given HLA class I molecule.

BepiPred looks for B-cell epitopes, which are parts of a protein that antibodies may recognize. Population-coverage analysis asks a practical question: if you choose these epitopes, how many people in a target population have HLA types that could present them? Think of it like choosing a set of keys that opens the most locks in a city.
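The coverage question can be approximated by hand before you use a dedicated tool. The sketch below is a simplified estimate, not the method used by the IEDB Population Coverage Tool. It assumes Hardy-Weinberg proportions within each HLA locus and treats loci as independent, which ignores linkage between HLA genes. The function names and the allele frequencies in the example are made up for illustration.

```python
from math import prod


def locus_coverage(allele_freqs, covered_alleles):
    """Fraction of people carrying at least one covered allele at one locus.

    Assumption: Hardy-Weinberg proportions, so each person carries two
    alleles drawn independently with the given frequencies.
    """
    covered = set(covered_alleles)
    f = sum(freq for allele, freq in allele_freqs.items() if allele in covered)
    f = min(f, 1.0)  # guard against frequency tables that sum above 1
    return 1.0 - (1.0 - f) ** 2


def population_coverage(freqs_by_locus, covered_alleles):
    """Combine loci, assuming they are independent (a simplification)."""
    miss = prod(
        1.0 - locus_coverage(freqs, covered_alleles)
        for freqs in freqs_by_locus.values()
    )
    return 1.0 - miss


if __name__ == "__main__":
    # Illustrative frequencies only, not taken from any real population table.
    freqs = {
        "HLA-A": {"HLA-A*02:01": 0.25, "HLA-A*24:02": 0.10},
        "HLA-B": {"HLA-B*07:02": 0.10},
    }
    covered = {"HLA-A*02:01", "HLA-B*07:02"}
    print(f"Estimated coverage: {population_coverage(freqs, covered):.3f}")
```

Running the same function with frequency tables from two regions gives a quick first comparison, which you can then check against the IEDB tool.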

Why This Is a Good Topic

This is a strong science fair topic because you can test a real biological question with public data and clear scoring rules. You are not guessing; you are comparing candidate epitopes, HLA coverage, and sequence conservation. That gives you a project with measurable outputs, useful graphs, and a direct link to vaccine design for diseases that still lack good options.

Research Questions

  • How does predicted population coverage change when you choose epitopes from different pathogen proteins?
  • What is the effect of adding a BepiPred filter on the number of final epitope candidates?
  • Does sequence conservation across pathogen strains change which epitopes stay in the top tier?
  • To what extent do the top-ranked epitopes cover HLA types in one region compared with another?
  • Which protein region gives the best balance of binding score, conservation, and coverage?
  • How does the epitope shortlist change when you compare Chagas with leishmaniasis data?

Basic Materials

  • Laptop or desktop computer with internet access.
  • Spreadsheet software for tracking protein IDs, HLA alleles, and scores.
  • A notebook for recording search terms, filters, and decisions.
  • Public protein sequence records from NCBI Protein or UniProt.
  • HLA frequency tables from the Allele Frequency Net Database.
  • Access to the NetMHCpan and BepiPred web servers.

Advanced Materials

  • High-memory laptop or university workstation for batch runs.
  • Python environment with Biopython, pandas, NumPy, matplotlib, and seaborn.
  • JupyterLab or RStudio for reproducible analysis notebooks.
  • MAFFT or another multiple-sequence-alignment tool.
  • Local proteome files for multiple strains or related species.
  • Git repository for version control and file tracking.

Software & Tools

  • NetMHCpan: Predicts which peptides are likely to bind specific HLA molecules.
  • BepiPred 2.0: Scores protein regions that may act as linear B-cell epitopes.
  • IEDB Population Coverage Tool: Estimates how much of a target population your epitope set may reach.
  • Allele Frequency Net Database: Provides HLA frequency data for coverage comparisons across populations.
  • Python: Cleans sequences, compares scores, and makes plots for your ranking rules.

Experiment Steps

  1. Choose one pathogen protein family and define the target population you want to study first.
  2. Gather strain sequences and clean them so your analysis starts from comparable records.
  3. Run epitope prediction and rank candidates by binding, exposure, and conservation.
  4. Match the shortlist to HLA frequency data and estimate population coverage for each candidate set.
  5. Compare alternative scoring rules, then lock your final selection rule before you write up results.
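Steps 2 and 3 can be prototyped with a few lines of Python before you run any prediction server. The sketch below cleans raw sequence strings, slices a reference protein into 9-residue peptides (a common length for HLA class I binders), and scores each peptide by the fraction of strains that contain it exactly. The function names, the exact-match definition of conservation, and the toy sequences are assumptions for illustration. A real project might use an alignment from MAFFT instead of exact matching.

```python
def clean_sequence(raw):
    """Remove whitespace and line breaks, then uppercase the sequence."""
    return "".join(raw.split()).upper()


def kmers(sequence, k=9):
    """Return every overlapping peptide of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def conservation(peptide, strain_sequences):
    """Fraction of strains that contain the peptide as an exact substring."""
    if not strain_sequences:
        raise ValueError("need at least one strain sequence")
    hits = sum(1 for seq in strain_sequences if peptide in seq)
    return hits / len(strain_sequences)


def conserved_peptides(reference, strains, k=9, min_conservation=1.0):
    """List (peptide, conservation) pairs that meet the threshold."""
    results = []
    seen = set()
    for pep in kmers(reference, k):
        if pep in seen:  # skip repeated peptides within the reference
            continue
        seen.add(pep)
        score = conservation(pep, strains)
        if score >= min_conservation:
            results.append((pep, score))
    return results


if __name__ == "__main__":
    # Toy sequences, not real pathogen proteins.
    strains = [clean_sequence(s) for s in ["acdefghiklm", "ACDEF GHIKLW"]]
    for pep, score in conserved_peptides(strains[0], strains):
        print(pep, score)
```

Only the peptides that pass this filter need to go to the binding predictor, which keeps the shortlist focused on regions that survive strain variation.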

Common Pitfalls

  • Treating a high binding score as proof of immunity, which confuses prediction with biology.
  • Using only one pathogen strain, which hides whether your epitope survives sequence variation.
  • Mixing allele lists from different sources without checking population names, which breaks coverage estimates.
  • Comparing B-cell and T-cell outputs as if they measure the same thing, which leads to a messy shortlist.
  • Ranking epitopes by one score only, which ignores conservation, antigenicity, and HLA breadth.
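One way to avoid the single-score pitfall is a weighted composite score. The sketch below rescales each metric to the 0 to 1 range and combines them with weights you choose. The field names (`rank_pct`, `conservation`, `n_alleles`), the weights, and the candidate values are hypothetical. It assumes the binding metric is a percentile rank where lower means stronger predicted binding, so that value is negated before scaling.

```python
def min_max(values):
    """Rescale values to the 0-1 range; constant inputs map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rank_candidates(candidates, weights):
    """Return (score, peptide_id) pairs sorted from best to worst."""
    # Lower percentile rank is better, so negate it before scaling.
    binding = min_max([-c["rank_pct"] for c in candidates])
    conserved = min_max([c["conservation"] for c in candidates])
    breadth = min_max([c["n_alleles"] for c in candidates])

    scored = []
    for c, b, s, h in zip(candidates, binding, conserved, breadth):
        score = (
            weights["binding"] * b
            + weights["conservation"] * s
            + weights["breadth"] * h
        )
        scored.append((score, c["peptide"]))
    # Sort by descending score, breaking ties by peptide ID.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    # Made-up candidates for illustration only.
    candidates = [
        {"peptide": "pep_A", "rank_pct": 0.1, "conservation": 0.2, "n_alleles": 1},
        {"peptide": "pep_B", "rank_pct": 0.5, "conservation": 1.0, "n_alleles": 5},
        {"peptide": "pep_C", "rank_pct": 2.0, "conservation": 0.4, "n_alleles": 2},
    ]
    weights = {"binding": 1 / 3, "conservation": 1 / 3, "breadth": 1 / 3}
    for score, peptide in rank_candidates(candidates, weights):
        print(f"{peptide}: {score:.3f}")
```

Record the weights in your notebook before you look at the results, so the ranking rule is not tuned to favor a particular peptide.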

What Makes This Competitive

This project gets stronger when you move beyond one top peptide. Compare several proteins, test how your ranking changes across HLA regions, and show that your shortlist survives changes in scoring rules. A competitive version explains why the same epitope panel works for multiple populations, not just one database snapshot. Clear figures and reproducible code help a lot.
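A simple way to show that a shortlist survives changes in scoring rules is to compare the top-ranked sets produced by two rules. The sketch below uses Jaccard overlap, which is the shared fraction of the combined set. This metric is one possible choice, not a required one, and the peptide IDs and scores are invented for illustration.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring peptide IDs."""
    ordered = sorted(scores, key=lambda pep: (-scores[pep], pep))
    return set(ordered[:k])


def jaccard(a, b):
    """Overlap between two sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical scores from two different weighting rules.
    rule_1 = {"pep_A": 0.9, "pep_B": 0.8, "pep_C": 0.1, "pep_D": 0.5}
    rule_2 = {"pep_A": 0.7, "pep_B": 0.2, "pep_C": 0.3, "pep_D": 0.9}
    overlap = jaccard(top_k(rule_1, 2), top_k(rule_2, 2))
    print(f"Top-2 overlap between rules: {overlap:.2f}")
```

An overlap that stays high across several weightings suggests the shortlist does not depend on one arbitrary choice of weights.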

Project Variations

  • Compare Chagas and leishmaniasis proteins with the same scoring pipeline to see which pathogen gives broader HLA coverage.
  • Focus on conserved epitopes across multiple strains instead of single reference proteins.
  • Build a multi-epitope vaccine panel and test whether the combined coverage beats the best single epitope.
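The multi-epitope variation can be explored with a greedy search: keep adding whichever epitope raises estimated coverage the most. The sketch below is one possible approach under stated assumptions. Coverage uses the same simplification of Hardy-Weinberg proportions within a locus and independence between loci, the greedy result is not guaranteed to be the best possible panel, and all names, alleles, and frequencies are illustrative.

```python
def coverage(freqs_by_locus, alleles):
    """Estimated fraction of people carrying at least one listed allele."""
    miss = 1.0
    for freqs in freqs_by_locus.values():
        f = min(sum(fr for a, fr in freqs.items() if a in alleles), 1.0)
        miss *= (1.0 - f) ** 2  # both alleles at this locus are uncovered
    return 1.0 - miss


def greedy_panel(epitope_alleles, freqs_by_locus, max_size):
    """Add epitopes one at a time, choosing the largest coverage gain."""
    panel, covered, current = [], set(), 0.0
    while len(panel) < max_size:
        best, best_cov = None, current
        for pep in sorted(epitope_alleles):  # sorted for repeatable ties
            if pep in panel:
                continue
            cov = coverage(freqs_by_locus, covered | epitope_alleles[pep])
            if cov > best_cov + 1e-12:
                best, best_cov = pep, cov
        if best is None:  # no remaining epitope improves coverage
            break
        panel.append(best)
        covered |= epitope_alleles[best]
        current = best_cov
    return panel, current


if __name__ == "__main__":
    # Made-up frequencies and predicted binders for illustration only.
    freqs = {"A": {"A1": 0.2, "A2": 0.1}, "B": {"B1": 0.3}}
    binders = {"e1": {"A1"}, "e2": {"A2", "B1"}, "e3": {"A1", "A2"}}
    panel, cov = greedy_panel(binders, freqs, max_size=2)
    print(panel, f"{cov:.4f}")
```

Comparing the panel's coverage with the best single epitope gives a direct answer to whether combining epitopes helps in your target population.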

Learn More

  • PubMed: Search review articles on immunoinformatics, HLA binding, and epitope-based vaccines.
  • IEDB: Use the Immune Epitope Database for background data and analysis tools.
  • Allele Frequency Net Database: Find HLA frequency tables for population coverage work.
  • NCBI Bookshelf: Read free textbook chapters on immunology and pathogen biology.
  • CDC Neglected Tropical Diseases pages: Review disease background, transmission, and global burden.
  • MIT OpenCourseWare: Find free genetics and immunology lectures for a deeper baseline.
