Plant-Pollinator Coevolution Gene Analysis

ISEF Category: Computational Biology and Bioinformatics

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Computational Evolutionary Biology  ·  Difficulty: Advanced  ·  Setup: University Lab  ·  Time: Full Year

The Hook

Flowers do not just look pretty; they also send chemical messages. Some plants make scents that match the sense organs and feeding habits of their pollinators. You can test whether those signals and responses evolved together by analyzing public gene datasets.

What Is It?

This project asks whether plant genes and pollinator genes changed in step over evolution. The idea comes from coevolution, which means two species push each other to adapt over time. Think of it like a lock and key, except both the lock and the key keep changing. If a flower evolves a new scent pathway, a pollinator may evolve a sharper sense of smell or a new feeding behavior to match it.

You would use public transcriptomes from the 1KP project, which is a large collection of RNA sequence data from many plants. 1KP covers plants only, so any pollinator gene data would have to come from a separate public source. RNA tells you which genes are active in a tissue. By comparing gene families across species, you can look for two kinds of signal: mutual information, which measures how much knowing the state of one gene family tells you about another, and direct coupling, which tries to separate direct links from indirect ones. In plain language, you are asking which genes seem to move together across evolution, not just by accident.
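To make the idea of mutual information concrete, here is a minimal sketch that scores two gene families by their presence (1) or absence (0) across the same set of species. The profiles and family names are made up for illustration, and a real analysis would also need the phylogenetic and sampling controls discussed later in this guide.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete profiles."""
    if len(xs) != len(ys):
        raise ValueError("profiles must cover the same species, in the same order")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # counts of each x state
    py = Counter(ys)              # counts of each y state
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


# Hypothetical presence/absence calls across eight species.
scent_family = [1, 1, 1, 1, 0, 0, 0, 0]
receptor_family = [1, 1, 1, 1, 0, 0, 0, 0]
unrelated_family = [1, 0, 1, 0, 1, 0, 1, 0]

mi_matched = mutual_information(scent_family, receptor_family)
mi_unrelated = mutual_information(scent_family, unrelated_family)
print(mi_matched, mi_unrelated)
```

Identical profiles share one full bit of information, while the alternating profile shares none with the scent family. Real profiles will fall somewhere in between.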

Why This Is a Good Topic

This is a strong science fair topic because it starts with public data, so you do not need to collect plants or insects in the field. You can build a clear computational pipeline, test a real biological question, and compare multiple gene families tied to scent production, smell reception, and pollination traits. You also get to learn sequence analysis, gene family comparison, and statistics, which are useful in many bioinformatics projects.

Research Questions

  • How does gene family coevolution differ between floral-scent biosynthesis genes and background housekeeping genes?
  • What is the effect of pollination syndrome on mutual information scores between plant scent gene families and pollinator receptor families?
  • Does direct coupling identify fewer, stronger plant-pollinator gene links than simple correlation across transcriptomes?
  • To what extent do convergent floral-scent modules show shared evolutionary signals across unrelated plant lineages?
  • Which plant scent pathway genes show the strongest association with documented pollinator categories in public datasets?
  • How does transcriptome completeness affect the stability of coevolution signals in 1KP data?
  • To what extent do different clustering choices change the set of gene families labeled as coevolving?
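Questions about stability, such as the last two above, need a way to compare the sets of gene families flagged under two different settings. One simple option is the Jaccard index (shared items divided by all items). This is a minimal sketch; the family labels and the two runs are placeholders, not real results.

```python
def jaccard(a, b):
    """Jaccard index between two collections of gene family IDs (1.0 = identical sets)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical families flagged as coevolving under two clustering settings.
run_strict = {"TPS", "BAHD", "SABATH", "CYP"}
run_loose = {"TPS", "BAHD", "OR", "CYP", "GST"}

overlap = jaccard(run_strict, run_loose)
print(overlap)  # 3 shared families out of 6 total
```

A value near 1 suggests your conclusions do not depend much on the setting you changed. A value near 0 suggests the flagged families are mostly an artifact of that choice.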

Basic Materials

  • A computer with enough storage for large sequence files.
  • Stable internet access for downloading public transcriptomes and annotation files.
  • External hard drive or cloud storage for backups.
  • Spreadsheet software for tracking species, gene families, and metadata.
  • Command line access on Windows, macOS, or Linux.
  • Python installed with bioinformatics and data analysis packages.
  • R installed for statistical testing and plotting.
  • Gene family annotation tables from public databases.
  • Public transcriptome data from the 1KP project.

Advanced Materials

  • Access to a university or research server with high memory and multi-core processing.
  • Local copy of the full 1KP transcriptome dataset or a curated subset.
  • Transcript assembly and orthology inference software.
  • Multiple sequence alignment tools for protein or coding sequences.
  • Phylogenetic analysis software for tree building and trait mapping.
  • Mutual information and direct coupling analysis code or libraries.
  • Gene ontology and pathway annotation databases.
  • High-performance storage for intermediate alignment and matrix files.
  • Version control repository for reproducible analysis.

Software & Tools

  • NCBI BLAST: Helps you identify candidate gene families and confirm sequence similarity against known genes.
  • MAFFT: Aligns gene or protein sequences before you compare evolutionary patterns.
  • Python: Runs data cleaning, matrix building, and custom coevolution scripts.
  • R: Makes statistical tests, plots, and phylogenetic summaries easier to manage.
  • ImageJ: Not needed for the core analysis, but useful if you include figure scoring from exported plots or annotated images.
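As one example of how Python ties these tools together, the sketch below reads BLAST tabular output (the 12 default columns of `-outfmt 6`) and keeps the best-scoring hit for each query transcript. The sequence IDs and the E-value cutoff of 1e-5 are assumptions chosen for illustration; pick a cutoff that suits your own data.

```python
import csv
import io

# Column order of BLAST tabular output (-outfmt 6) with default fields.
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: subject_id} for the highest-bitscore hit under the cutoff."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST6_COLUMNS, row))
        if float(hit["evalue"]) > max_evalue:
            continue
        score = float(hit["bitscore"])
        query = hit["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (hit["sseqid"], score)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical hits: tx1 matches two references, tx2 only has a weak hit.
example_rows = [
    ["tx1", "TPS_ref", "85.0", "200", "30", "0", "1", "200", "1", "200", "1e-50", "250"],
    ["tx1", "CYP_ref", "40.0", "100", "60", "0", "1", "100", "1", "100", "1e-06", "60"],
    ["tx2", "OR_ref", "30.0", "80", "56", "0", "1", "80", "1", "80", "0.5", "25"],
]
example_text = "\n".join("\t".join(row) for row in example_rows)

hits = best_hits(io.StringIO(example_text))
print(hits)  # {'tx1': 'TPS_ref'}
```

With a real file you would pass an open file handle instead of the `io.StringIO` wrapper. A best hit is only a candidate assignment, so orthology still needs to be checked separately.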

Experiment Steps

  1. Define the exact gene families you want to compare, then choose a narrow biological question that your data can answer.
  2. Build a species list with matching metadata, so each transcriptome links to a pollination or scent category.
  3. Decide how you will turn raw sequence data into comparable gene family presence, absence, or expression features.
  4. Plan one baseline method and one stronger method, such as correlation versus direct coupling, so you can compare them.
  5. Set up controls that test whether your signal survives missing data, uneven sampling, and phylogenetic relatedness.
  6. Predefine the summary metrics, figures, and statistical tests you will use to judge whether any pattern is real.
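Steps 3 and 4 can be sketched in a few lines. The example below turns a list of (species, gene family) records into presence/absence profiles, then computes a plain Pearson correlation as the baseline method. The species names, family names, and records are all hypothetical, and presence/absence is only one of the feature choices the steps mention.

```python
import math


def presence_matrix(records, species, families):
    """Map each family to a 0/1 list, ordered like `species`."""
    seen = set(records)
    return {
        fam: [1 if (sp, fam) in seen else 0 for sp in species]
        for fam in families
    }


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either profile never varies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical input: which family was detected in which species.
species = ["sp1", "sp2", "sp3", "sp4", "sp5", "sp6"]
records = [
    ("sp1", "famA"), ("sp2", "famA"), ("sp3", "famA"),
    ("sp1", "famB"), ("sp2", "famB"), ("sp3", "famB"),
    ("sp1", "famC"), ("sp3", "famC"), ("sp5", "famC"),
]

profiles = presence_matrix(records, species, ["famA", "famB", "famC"])
r_ab = pearson(profiles["famA"], profiles["famB"])
r_ac = pearson(profiles["famA"], profiles["famC"])
print(r_ab, r_ac)
```

Keeping the feature-building step and the scoring step as separate functions makes it easier to swap in a stronger method later and compare it against this baseline on the same profiles.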

Common Pitfalls

  • Mixing transcriptomes from species with very different assembly quality, which can create fake coevolution signals.
  • Comparing gene families without checking orthology, which can lump unrelated genes into the same group.
  • Treating correlation as proof of interaction, which can mistake indirect links or shared ancestry for a direct relationship.
  • Using too many species with weak metadata, which makes pollinator categories and scent pathways hard to interpret.
  • Ignoring phylogenetic non-independence, which can make closely related species look like repeated evidence when they are not.
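The last pitfall can be shown with a small permutation test. The sketch below shuffles one profile either across all species or only within each clade, then asks how often the shuffled data match as well as the real data. Shuffling within clades is a crude control chosen here for illustration, not a full phylogenetic comparative method, and the clade labels and profiles are invented.

```python
import random


def agreement(xs, ys):
    """Number of species in which both families share the same state."""
    return sum(1 for x, y in zip(xs, ys) if x == y)


def permutation_p_value(xs, ys, clades=None, n_perm=999, seed=0):
    """One-sided p-value for agreement; shuffles ys within clades if given."""
    rng = random.Random(seed)
    observed = agreement(xs, ys)
    if clades is None:
        clades = ["all"] * len(ys)  # a single group means an unrestricted shuffle
    groups = {}
    for index, clade in enumerate(clades):
        groups.setdefault(clade, []).append(index)
    hits = 0
    for _ in range(n_perm):
        shuffled = list(ys)
        for indices in groups.values():
            values = [ys[i] for i in indices]
            rng.shuffle(values)
            for i, value in zip(indices, values):
                shuffled[i] = value
        if agreement(xs, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical case: both families are present in clade A and absent in clade B.
scent = [1, 1, 1, 1, 0, 0, 0, 0]
receptor = [1, 1, 1, 1, 0, 0, 0, 0]
clades = ["A", "A", "A", "A", "B", "B", "B", "B"]

p_naive = permutation_p_value(scent, receptor)
p_within_clade = permutation_p_value(scent, receptor, clades=clades)
print(p_naive, p_within_clade)
```

The unrestricted shuffle makes the match look very unlikely by chance. The within-clade shuffle gives a p-value of 1.0, because all of the signal comes from one split between two clades, which is a single evolutionary event rather than eight independent observations.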

What Makes This Competitive

A strong project would not just run one network method and stop. You would test whether the pattern holds across multiple gene family sets, multiple sampling filters, and at least one phylogeny-aware control. You would also test whether direct coupling actually gives a cleaner answer than simpler similarity measures on your data, and explain why or why not. If you can connect the signal to a clear biological story about scent evolution and pollinator matching, the project gets much stronger.
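To explain the difference between direct and indirect links, a small worked example helps. The sketch below uses first-order partial correlation as a simplified stand-in for direct coupling: it asks how much of the correlation between families A and C remains after accounting for a third family B. Full direct coupling analysis fits a global model over all families at once, so treat this only as an illustration of the idea. The correlation values are made up.

```python
import math


def partial_correlation(r_ac, r_ab, r_bc):
    """Correlation between A and C after controlling for B."""
    denom = math.sqrt((1 - r_ab ** 2) * (1 - r_bc ** 2))
    if denom == 0:
        raise ValueError("B is perfectly correlated with A or C; result undefined")
    return (r_ac - r_ab * r_bc) / denom


# Hypothetical chain: A is linked to B, and B is linked to C.
r_ab = 0.8
r_bc = 0.8
r_ac_chain = r_ab * r_bc  # the A-C correlation a pure A-B-C chain would produce

indirect = partial_correlation(r_ac_chain, r_ab, r_bc)

# Hypothetical case where A and C correlate more than B alone can explain.
direct = partial_correlation(0.9, 0.5, 0.5)
print(indirect, direct)
```

In the chain case the A-C correlation of about 0.64 drops to zero once B is accounted for, so it was an indirect link. In the second case most of the A-C signal survives, which is the kind of pair a direct-coupling method is designed to keep.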

Project Variations

  • Use only angiosperm lineages with known specialized pollinators, then compare the strength of coevolution signals across pollination types.
  • Focus on floral scent biosynthesis genes alone, and test whether convergent modules appear more often in unrelated plant families.
  • Swap pollinator gene families for smell receptor or detoxification families, then ask whether plant scent evolution tracks sensory or metabolic adaptation.

Learn More

  • NCBI Gene and PubMed: Search gene family reviews, orthology papers, and coevolution methods papers for background and examples.
  • 1KP resources: Look for project summaries and linked datasets from the One Thousand Plants initiative.
  • NCBI Taxonomy: Check species names and higher-level classification for your transcriptome metadata.
  • UniProt: Read protein function annotations for candidate scent pathway genes and receptor families.
  • MIT OpenCourseWare: Use free bioinformatics and computational biology lectures to build your analysis workflow.
  • Molecular Biology of the Cell: Use a library copy or preview chapters for clear background on gene expression and protein function.
