Ancestral Antifreeze Protein Evolution

ISEF Category: Computational Biology and Bioinformatics

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Computational Evolutionary Biology  ·  Difficulty: Advanced  ·  Setup: University Lab  ·  Time: Full Year

The Hook

Some polar fish live in seawater that is colder than the temperature at which their blood would normally freeze. They do it with antifreeze proteins, small proteins that bind to ice crystals and stop them from growing. Your project can ask a bigger question: did these proteins evolve the same trick more than once? That turns a cool adaptation into a real evolutionary test.

What Is It?

Ancestral sequence reconstruction (ASR) means you use DNA or protein sequences from living species to estimate what the sequences of their shared ancestors likely looked like. Think of it like building a family tree, then inferring the missing pages from the copies that survive. In this project, you would compare antifreeze proteins from Arctic and Antarctic fish lineages to estimate older protein sequences and see how those ancestors may have changed over time.
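
To make the idea concrete, here is a minimal sketch of one of the simplest reconstruction approaches, Fitch parsimony, applied to a toy dataset. The tree shape, the species labels, and the four-residue sequences are all invented for illustration. Tools such as PAML or FastML use likelihood-based models rather than parsimony, so treat this as a way to build intuition, not as the method you would report.

```python
# Toy Fitch parsimony on a fixed rooted tree.
# The tree, labels, and sequences below are invented for illustration only.

def fitch_site(tree, leaf_states):
    """Return (candidate root states, minimum changes) for one alignment column."""
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # a leaf: use its observed residue
            return {leaf_states[node]}
        left, right = visit(node[0]), visit(node[1])
        shared = left & right
        if shared:                         # children share at least one state
            return shared
        changes += 1                       # no overlap implies one substitution
        return left | right

    root_states = visit(tree)
    return root_states, changes


def reconstruct_root(tree, aligned):
    """Apply fitch_site to every column of an alignment (dict of name -> sequence)."""
    length = len(next(iter(aligned.values())))
    root_sets, total_changes = [], 0
    for col in range(length):
        column = {name: seq[col] for name, seq in aligned.items()}
        states, changes = fitch_site(tree, column)
        root_sets.append(states)
        total_changes += changes
    return root_sets, total_changes


# Nested tuples describe the tree: ((A, B), (C, D)).
tree = (("arctic_1", "arctic_2"), ("antarctic_1", "antarctic_2"))
aligned = {
    "arctic_1": "AATA",
    "arctic_2": "AATP",
    "antarctic_1": "AAGA",
    "antarctic_2": "APGA",
}

root_sets, total_changes = reconstruct_root(tree, aligned)
for position, states in enumerate(root_sets, start=1):
    print(position, "".join(sorted(states)))
print("minimum substitutions:", total_changes)
```

Positions where the root set holds more than one residue are ambiguous under parsimony. Likelihood-based programs replace that yes-or-no ambiguity with a probability for each residue, which is what lets you report confidence.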

ESMFold is a structure-prediction tool. It tries to estimate how a protein folds into a 3D shape from its sequence. That matters here because two proteins can do similar jobs while looking very different. If fish from separate regions evolved ice-binding proteins with different folds, that supports convergent evolution, which means different lineages arrived at a similar solution on their own.

Why This Is a Good Topic

This topic works well because you can turn a big evolution question into measurable comparisons. You can test sequence similarity, ancestral reconstruction confidence, predicted structure, and structural similarity across lineages. The real-world link is cold adaptation, which matters for physiology, climate biology, and protein evolution. You can learn how bioinformatics, phylogenetics, and structure prediction fit together in one project.

Research Questions

  • How does the predicted ancestral sequence of antifreeze proteins differ between Arctic and Antarctic fish lineages?
  • What is the effect of lineage on predicted protein fold among ice-binding proteins from different fish groups?
  • Does ancestral sequence reconstruction suggest that ice-binding activity appeared more than once in fish evolution?
  • To what extent do sequence similarity and structural similarity agree for antifreeze proteins across cold-adapted fish species?
  • Which amino acid changes most strongly separate predicted ice-binding proteins from non-ice-binding relatives?
  • How does confidence in ancestral reconstruction change when you compare different alignment methods or tree topologies?
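
The question about whether sequence similarity and structural similarity agree can be framed as a rank correlation across protein pairs. The sketch below computes Spearman's rho in plain Python. The five pairs of values are made up, and the structural score is assumed to be something like a TM-score from a structure comparison tool. With SciPy installed you could use its Spearman function instead of this hand-written version.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                          # extend over a run of ties
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five protein pairs (not real measurements).
percent_identity = [92, 75, 40, 22, 18]
structural_score = [0.95, 0.90, 0.70, 0.30, 0.35]

rho = spearman(percent_identity, structural_score)
print(f"Spearman rho = {rho:.2f}")
```

A rho near 1 would mean the two measures rank protein pairs the same way. Pairs that break the trend, such as low identity with a high structural score, are the ones worth a closer look. Keep in mind that pairs sharing a protein are not independent, so a p-value from a standard test would need caution.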

Basic Materials

  • Laptop or desktop computer with internet access.
  • Free account access to sequence databases such as NCBI Protein or UniProt.
  • Spreadsheet software for tracking sequences, species, and outputs.
  • Sequence alignment tool such as MAFFT or Clustal Omega.
  • Phylogenetic tree viewer such as FigTree or iTOL.
  • ESMFold access through a web interface or approved research access.
  • Notebook for recording species names, accession numbers, and analysis decisions.

Advanced Materials

  • Workstation or cloud compute access for batch structure prediction.
  • Curated protein sequence set from NCBI Protein, UniProt, or Ensembl.
  • Multiple sequence alignment software for manual refinement.
  • Phylogenetic inference software such as IQ-TREE or RAxML.
  • Ancestral reconstruction software such as PAML or FastML.
  • Structural comparison tools such as PyMOL or UCSF ChimeraX.
  • Statistical software such as R or Python with Biopython and SciPy.
  • Version control system such as Git for tracking scripts and results.

Software & Tools

  • NCBI Protein: Finds protein sequences and accession records for antifreeze proteins and related homologs.
  • UniProt: Helps you confirm protein annotations, species names, and functional notes.
  • MAFFT: Aligns protein sequences so you can compare conserved and changing regions.
  • IQ-TREE: Builds phylogenetic trees that support ancestral sequence reconstruction.
  • ESMFold: Predicts protein 3D structure from sequence for fold comparison.
  • R: Analyzes similarity, confidence values, and statistical comparisons across lineages.

Experiment Steps

  1. Define the exact protein family you will study, then decide which fish lineages and outgroups belong in your dataset.
  2. Collect sequences from reliable databases and clean the list so each entry has a species name, accession number, and functional label.
  3. Build and refine a multiple sequence alignment, because your ancestral inference depends on how well homologous positions line up.
  4. Infer a phylogenetic tree and choose the ancestor nodes you want to reconstruct for comparison.
  5. Reconstruct ancestral sequences, then run structure prediction on both ancestral and modern proteins so you can compare fold changes over time.
  6. Plan a comparison framework that measures sequence distance, structural similarity, and evidence for convergent ice-binding across lineages.
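
As a starting point for the sequence-distance part of the comparison framework, the sketch below reads an aligned FASTA text and computes pairwise percent identity. The record names and sequences are invented. The choice to skip any column where either sequence has a gap is one common convention among several, so state whichever convention you use in your notebook.

```python
from itertools import combinations

def parse_fasta(text):
    """Parse FASTA text into a dict of record name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]      # keep the first word of the header
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def percent_identity(a, b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += (x == y)
    return 100.0 * matches / compared if compared else 0.0


# Invented aligned sequences; real input would be your alignment output.
alignment_text = """
>modern_A
MKT-AAL
>modern_B
MKTPAAL
>ancestor_1
MRT-ASL
"""

records = parse_fasta(alignment_text)
for first, second in combinations(records, 2):
    identity = percent_identity(records[first], records[second])
    print(f"{first} vs {second}: {identity:.1f}%")
```

The same table of pairs can later hold a structural similarity score for each pair, which gives you one row per comparison for plotting and statistics.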

Common Pitfalls

  • Mixing true antifreeze proteins with unrelated cold-response proteins, which blurs the evolutionary signal.
  • Using poor sequence alignments, which can move key residues into the wrong positions and distort ancestral reconstruction.
  • Comparing proteins without a clear outgroup, which makes it harder to tell what changed in each lineage.
  • Treating one predicted structure as final truth, which ignores uncertainty in ESMFold output and ancestral sequence calls.
  • Forgetting to separate sequence similarity from structural similarity, which can hide cases where different folds do the same job.
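
One way to avoid treating a predicted structure as final truth is to summarize its per-residue confidence before comparing folds. The sketch below assumes your prediction is saved as a PDB-format file in which the B-factor column holds the confidence score (pLDDT) on a 0 to 100 scale. Check this for your own output, because some tools write it on a 0 to 1 scale. The helper that builds toy ATOM lines and the cutoff of 70 are illustrative assumptions.

```python
def make_atom_line(serial, atom_name, res_name, res_seq, b_factor):
    """Build a fixed-width PDB ATOM line (toy fixture with dummy coordinates)."""
    return (
        f"ATOM  {serial:5d} {atom_name:<4s} {res_name:>3s} A{res_seq:4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


def ca_confidence(pdb_text):
    """Return [(residue number, B-factor value)] for alpha-carbon atoms."""
    scores = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":      # atom name: columns 13-16
            continue
        res_seq = int(line[22:26])           # residue number: columns 23-26
        b_factor = float(line[60:66])        # B-factor: columns 61-66
        scores.append((res_seq, b_factor))
    return scores


def summarize(scores, cutoff=70.0):
    """Mean confidence and the residues that fall below an assumed cutoff."""
    values = [score for _, score in scores]
    mean_score = sum(values) / len(values)
    low_residues = [res for res, score in scores if score < cutoff]
    return mean_score, low_residues


# Toy structure: one backbone nitrogen plus four alpha carbons.
pdb_text = "\n".join([
    make_atom_line(1, " N  ", "MET", 1, 90.0),
    make_atom_line(2, " CA ", "MET", 1, 91.5),
    make_atom_line(3, " CA ", "LYS", 2, 88.0),
    make_atom_line(4, " CA ", "THR", 3, 62.0),
    make_atom_line(5, " CA ", "ALA", 4, 45.5),
])

mean_score, low_residues = summarize(ca_confidence(pdb_text))
print(f"mean confidence: {mean_score:.2f}")
print("low-confidence residues:", low_residues)
```

If the low-confidence residues overlap the region you think binds ice, a fold comparison in that region deserves extra caution in your write-up.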

What Makes This Competitive

A stronger project goes beyond one tree and one structure prediction. You can compare multiple reconstruction methods, test alternate alignments, and measure whether the same function appears in different folds. Strong entries also quantify uncertainty, not just the best-guess sequence. If you pair that with a clear evolutionary hypothesis and careful controls, your project looks much more like real research.
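
Quantifying uncertainty can start with the per-site posterior probabilities that likelihood-based reconstruction programs report for each ancestral node. The sketch below assumes you have already extracted those probabilities into a list of dictionaries, one per site. The numbers are invented, and the 0.2 cutoff for calling a site ambiguous is an assumption you should justify or vary. It builds the best-guess sequence plus an alternative sequence that swaps in the second-best residue at every ambiguous site, so you can check whether your structural conclusions survive the swap.

```python
def ranked_states(site):
    """Sort a site's residues from most to least probable."""
    return sorted(site.items(), key=lambda item: item[1], reverse=True)


def summarize_ancestor(site_posteriors, alt_cutoff=0.2):
    """Return best-guess sequence, alternative sequence, mean best
    posterior, and 1-based positions of ambiguous sites."""
    best_seq, alt_seq, best_probs, ambiguous = [], [], [], []
    for position, site in enumerate(site_posteriors, start=1):
        ranked = ranked_states(site)
        best_residue, best_prob = ranked[0]
        best_seq.append(best_residue)
        best_probs.append(best_prob)
        # Use the runner-up only when it is plausible under the assumed cutoff.
        if len(ranked) > 1 and ranked[1][1] >= alt_cutoff:
            alt_seq.append(ranked[1][0])
            ambiguous.append(position)
        else:
            alt_seq.append(best_residue)
    mean_best = sum(best_probs) / len(best_probs)
    return "".join(best_seq), "".join(alt_seq), mean_best, ambiguous


# Invented posterior probabilities for a four-residue ancestral node.
site_posteriors = [
    {"A": 0.98, "S": 0.02},
    {"T": 0.55, "S": 0.40, "A": 0.05},
    {"G": 0.99, "A": 0.01},
    {"P": 0.70, "A": 0.25, "S": 0.05},
]

best_seq, alt_seq, mean_best, ambiguous = summarize_ancestor(site_posteriors)
print("best guess: ", best_seq)
print("alternative:", alt_seq)
print(f"mean posterior of best states: {mean_best:.3f}")
print("ambiguous sites:", ambiguous)
```

Running structure prediction on both sequences gives you a simple robustness check: if the predicted fold is the same for the best guess and the alternative, your conclusion depends less on any single uncertain residue.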

Project Variations

  • Compare antifreeze proteins from fish only, then add antifreeze proteins from non-fish cold-adapted organisms as a comparison group to test how unique the fish solutions are.
  • Focus on one lineage pair, such as Arctic cod versus Antarctic notothenioid fish, and ask whether similar function came from similar or different ancestral folds.
  • Swap structure prediction for residue-level analysis, then test whether key ice-binding amino acids evolved independently in separate clades.
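
For the residue-level variation, a simple first pass is to scan an alignment for columns where each group is internally uniform but the two groups differ. The sketch below does that for two invented groups of aligned sequences. The group labels are assumptions that you would replace with your own functional annotations.

```python
def separating_columns(group_a, group_b):
    """Return (1-based column, residue in A, residue in B) for columns where
    each group is fixed for a single residue and the two residues differ."""
    length = len(group_a[0])
    hits = []
    for col in range(length):
        residues_a = {seq[col] for seq in group_a}
        residues_b = {seq[col] for seq in group_b}
        if len(residues_a) == 1 and len(residues_b) == 1:
            residue_a = next(iter(residues_a))
            residue_b = next(iter(residues_b))
            if residue_a != residue_b and "-" not in (residue_a, residue_b):
                hits.append((col + 1, residue_a, residue_b))
    return hits


# Invented aligned sequences for two functional groups.
ice_binding = ["MKTAT", "MKTAT", "MRTAT"]
non_ice_binding = ["MKSAG", "MKSAG", "MRSAG"]

for column, res_ice, res_other in separating_columns(ice_binding, non_ice_binding):
    print(f"column {column}: {res_ice} in ice-binding, {res_other} in relatives")
```

Columns flagged this way are candidates only. Sequences in the same group may share a residue simply because they share an ancestor, so map each candidate onto your tree to see whether the change arose once or independently in separate clades.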

Learn More

  • NCBI Protein: Search fish antifreeze proteins, review accession pages, and collect sequence records.
  • UniProt: Check protein function, taxonomy, and annotation quality for candidate sequences.
  • NCBI Bookshelf: Read free background chapters on molecular evolution, phylogenetics, and protein structure.
  • MIT OpenCourseWare: Search for free course materials on evolutionary biology, bioinformatics, and genetics.
  • PAML documentation and papers: Find method notes and examples for ancestral sequence reconstruction.
  • PubMed: Search review articles on antifreeze proteins, convergent evolution, and protein structure prediction.
