Brassica Polyploidy Origins in Genomes

ISEF Category: Plant Sciences

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Systematics and Evolution  ·  Difficulty: Intermediate  ·  Setup: Home Setup  ·  Time: 1 to 2 Months

The Hook

Some plants carry extra genome copies like backup books on the same shelf. That can change how they evolve, adapt, and split into new species. Brassica crops, like cabbage, broccoli, and mustard, are a great place to study this because their genomes record old duplication events. You can test those signals with public data and Python, no lab coat required.

What Is It?

Polyploidy means a plant has more than two full sets of chromosomes. Think of it like a recipe book that got copied one or more extra times. Those extra copies can stay intact, change slowly, or get deleted over time. In Brassica, whole-genome duplication events, including a whole-genome triplication in the ancestor of the Brassica lineage, helped shape the lineages that later gave rise to familiar crops.

One way to study this is with Ks values. Ks estimates the number of synonymous (silent) substitutions per synonymous site between two gene copies, so it acts as a rough clock for how long ago the copies diverged. If many duplicated gene pairs have similar Ks values, they form a peak in the Ks distribution that points to an old genome duplication event. You can use public genome files, compare gene pairs, and look for those peaks in Python. That turns evolutionary history into a measurable pattern.
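To make the idea of a Ks peak concrete, here is a minimal sketch that bins Ks values and reports the bins that stand out as local maxima. It uses simulated toy numbers, not real Brassica data, and the function names, bin width, and minimum count are illustrative assumptions rather than a standard method.

```python
import random

def ks_histogram(ks_values, bin_width=0.05, max_ks=2.0):
    """Count Ks values in fixed-width bins covering 0 up to max_ks."""
    n_bins = int(round(max_ks / bin_width))
    counts = [0] * n_bins
    for ks in ks_values:
        if 0 <= ks < max_ks:
            index = min(int(ks / bin_width), n_bins - 1)
            counts[index] += 1
    return counts

def find_peaks(counts, bin_width=0.05, min_count=50):
    """Return centers of bins that are local maxima holding at least min_count pairs."""
    peaks = []
    for i in range(1, len(counts) - 1):
        is_local_max = counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]
        if is_local_max and counts[i] >= min_count:
            peaks.append(round((i + 0.5) * bin_width, 3))
    return peaks

# Simulated toy data (NOT real Brassica values): two clusters of Ks values,
# standing in for a younger and an older duplication signal.
rng = random.Random(0)
toy_ks = [rng.gauss(0.3, 0.05) for _ in range(3000)]
toy_ks += [rng.gauss(0.9, 0.10) for _ in range(3000)]

counts = ks_histogram(toy_ks)
peaks = find_peaks(counts)
print(peaks)  # two bin centers, one near 0.3 and one near 0.9
```

The min_count threshold is what keeps small bumps in the tails from being reported as peaks, so it is worth reporting whatever value you choose.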

Why This Is a Good Topic

This makes a strong science fair topic because you can test a clear hypothesis with public data and repeatable analysis. You are not guessing from one plant. You are comparing genomes and asking whether duplication events line up with known Brassica evolution. The topic connects to crop breeding, plant diversity, and genome evolution. You can learn data cleanup, sequence comparison, plotting, and basic evolutionary reasoning without needing a wet lab.

Research Questions

  • How does the Ks distribution differ among Brassica species with different duplication histories?
  • What is the effect of using different gene-pair filtering rules on the number and shape of Ks peaks?
  • Does the inferred timing of duplication peaks match the accepted Brassica lineage split pattern?
  • To what extent do duplicated gene pairs cluster differently across Brassica A, B, and C genomes?
  • Which Brassica genome shows the clearest signal of an ancient polyploidy event?
  • How does the inclusion of synteny-based gene pairs change the detected Ks peaks?
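The timing question above depends on turning a Ks peak into an approximate age. A common back-of-the-envelope approach assumes a constant synonymous substitution rate r and uses T = Ks / (2r), where the factor of 2 reflects that both gene copies accumulate changes after the duplication. The sketch below is only that conversion. The function name is hypothetical, and the rate shown is a placeholder assumption that you should replace with a value you can cite from the literature.

```python
def ks_to_age_mya(ks, rate_per_site_per_year):
    """Convert a Ks value to an approximate age in millions of years.

    Uses T = Ks / (2 * r). Assumes a constant substitution rate,
    which is a simplification.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    years = ks / (2 * rate_per_site_per_year)
    return years / 1e6

# ASSUMPTION: placeholder rate for illustration only; look up and cite a
# rate estimated for your plant group before using it in a project.
ASSUMED_RATE = 1.5e-8  # synonymous substitutions per site per year

peak_ks = 0.3  # hypothetical peak position, not a measured Brassica value
age_mya = ks_to_age_mya(peak_ks, ASSUMED_RATE)
print(f"Ks = {peak_ks} -> roughly {age_mya:.1f} million years")
```

Because the answer scales directly with the rate you pick, it is worth reporting the age as a range computed from more than one published rate.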

Basic Materials

  • Computer with internet access.
  • Python installed or access to a browser-based Python notebook.
  • Spreadsheet software for tracking sample IDs and results.
  • Public Brassica genome FASTA and gene annotation files from NCBI or Ensembl Plants.
  • A text editor for cleaning file names and checking formats.
  • Basic understanding of plots, axes, and averages.

Advanced Materials

  • Access to larger genome assemblies from NCBI Genome or Ensembl Plants.
  • Synteny or orthology output from tools such as MCScanX or a similar open-source package.
  • Command-line Python environment with pandas, NumPy, matplotlib, and scipy.
  • FASTA and GFF/GTF parsing tools.
  • A local machine or university server with enough memory to process multiple genome comparisons.
  • Version control with Git for tracking analysis changes.
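Before comparing genomes, it helps to confirm that an annotation file really belongs to the assembly you downloaded. The sketch below, using only the Python standard library, collects sequence IDs from FASTA headers and from the first column of a GFF file, then lists any annotation IDs missing from the assembly. The toy file contents and IDs such as chrA01 are made up for illustration, and real files may need extra handling (for example, compressed input).

```python
def fasta_ids(lines):
    """Collect sequence IDs: the first word after '>' on each header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids

def gff_seqids(lines):
    """Collect column 1 (the sequence ID) from non-comment GFF/GTF lines."""
    ids = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        ids.add(line.split("\t")[0])
    return ids

# Hypothetical toy inputs standing in for real downloaded files
toy_fasta = [
    ">chrA01 example assembly sequence\n",
    "ATGGCGTTTAAC\n",
    ">chrA02\n",
    "ATGCCCGGGTAA\n",
]
toy_gff = [
    "##gff-version 3\n",
    "chrA01\texample\tgene\t1\t12\t.\t+\t.\tID=gene1\n",
    "chrA03\texample\tgene\t1\t12\t.\t-\t.\tID=gene2\n",
]

missing = gff_seqids(toy_gff) - fasta_ids(toy_fasta)
print("Annotation IDs not found in assembly:", sorted(missing))
```

With real files you would pass open file handles instead of lists, since both functions only need an iterable of lines. An empty result suggests the two files use the same naming; a long list usually means mismatched versions or sources.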

Software & Tools

  • Python: Cleans genome tables, calculates Ks summaries, and makes plots.
  • Jupyter Notebook: Lets you document each analysis step and keep code with notes.
  • pandas: Organizes gene pair data and helps you filter, group, and sort results.
  • matplotlib: Creates Ks distribution plots and comparison figures.
  • NCBI Genome: Provides public Brassica assemblies and annotation files for your analysis.

Experiment Steps

  1. Define the Brassica species or genome set you will compare and the evolutionary question you want to answer.
  2. Choose a public data source and plan how you will match genome assemblies with annotation files.
  3. Decide how you will identify duplicated gene pairs, including whether you will use synteny, orthology, or both.
  4. Set a filtering rule for low-quality gene pairs so your Ks signal is not cluttered by bad matches.
  5. Plan how you will summarize Ks values into distributions and compare peaks across species.
  6. Decide which external evidence, such as known Brassica lineage relationships, you will use to check whether your pattern makes sense.

Common Pitfalls

  • Mixing genomes from different annotation versions, which creates fake differences in Ks peaks.
  • Using gene pairs without checking whether they are true duplicates, which adds noise to the distribution.
  • Plotting raw Ks values with no filtering, which can hide real peaks behind outliers.
  • Comparing species with very different assembly quality, which can make one genome look more duplicated than it really is.
  • Interpreting every bump in the histogram as a duplication event, which can turn random variation into a false story.
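One way to guard against the last pitfall is to check whether a peak stays in the same place when the data are resampled. The sketch below bootstraps the position of the tallest histogram bin and reports a rough 95% interval. It is one simple stability check among several possible ones, the function names are hypothetical, and the data are simulated rather than real.

```python
import random
from collections import Counter

def modal_bin_center(values, bin_width=0.05):
    """Return the center of the histogram bin holding the most values."""
    counts = Counter(int(v / bin_width) for v in values)
    # Ties are broken in favor of the lower bin so the result is deterministic
    top_bin = max(counts, key=lambda b: (counts[b], -b))
    return (top_bin + 0.5) * bin_width

def bootstrap_peak(values, n_boot=200, bin_width=0.05, seed=0):
    """Resample with replacement and return a rough 95% interval for the peak position."""
    rng = random.Random(seed)
    centers = []
    for _ in range(n_boot):
        sample = rng.choices(values, k=len(values))
        centers.append(modal_bin_center(sample, bin_width))
    centers.sort()
    low = centers[int(0.025 * n_boot)]
    high = centers[int(0.975 * n_boot) - 1]
    return low, high

# Simulated toy data (NOT real): one clear cluster, and one flat "noise" set
rng = random.Random(1)
clear_signal = [rng.gauss(0.325, 0.05) for _ in range(2000)]
flat_noise = [rng.uniform(0.0, 2.0) for _ in range(2000)]

signal_low, signal_high = bootstrap_peak(clear_signal)
noise_low, noise_high = bootstrap_peak(flat_noise)
print("clear signal peak interval:", signal_low, signal_high)
print("flat noise peak interval:", noise_low, noise_high)
```

A narrow interval suggests the peak is a stable feature of the data. A wide interval, where the tallest bin jumps around between resamples, is a warning that the bump may be random variation.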

What Makes This Competitive

A strong project goes beyond making one Ks histogram. You can compare several Brassica species, test more than one filtering strategy, and explain why the peaks change. You can also tie the pattern to a known evolutionary tree and use statistics instead of eyeballing the plot. That kind of careful analysis shows real command of comparative genomics, not just graph making.
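As one example of using statistics instead of judging by eye, the sketch below runs a permutation test on the difference in median Ks between two sets of gene pairs. This is a generic technique chosen for illustration, not a method prescribed for Ks analysis, and the two samples are simulated stand-ins for two species.

```python
import random
import statistics

def permutation_test_median(a, b, n_perm=1000, seed=0):
    """Test whether two samples differ in median by shuffling the group labels.

    Returns the observed absolute difference and a permutation p-value.
    """
    rng = random.Random(seed)
    observed = abs(statistics.median(a) - statistics.median(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(
            statistics.median(pooled[:len(a)])
            - statistics.median(pooled[len(a):])
        )
        if diff >= observed:
            hits += 1
    # Adding 1 to both counts avoids reporting a p-value of exactly zero
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value

# Simulated toy data (NOT real): two species with slightly shifted Ks clusters
rng = random.Random(2)
species_x = [rng.gauss(0.30, 0.05) for _ in range(300)]
species_y = [rng.gauss(0.36, 0.05) for _ in range(300)]

observed, p_value = permutation_test_median(species_x, species_y)
print(f"median difference = {observed:.3f}, permutation p = {p_value:.4f}")
```

A small p-value says the shift between the two distributions is unlikely to come from random labeling alone. It does not by itself say the shift reflects a separate duplication event, so pair it with the biological evidence you chose in the experiment steps.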

Project Variations

  • Compare cultivated Brassica crops with wild relatives to see whether domestication changed the Ks pattern.
  • Use synteny blocks instead of all duplicate genes to test whether the duplication signal gets cleaner.
  • Compare Brassica with another plant genus that does not share the same polyploid history, which gives you a stronger control.

Learn More

  • NCBI Genome: Search for Brassica genome assemblies, annotations, and related sequence records.
  • Ensembl Plants: Find plant genomes, gene models, and comparative genomics resources.
  • MIT OpenCourseWare Biology Courses: Look for genetics and evolution lectures that explain duplication and inheritance.
  • PubMed: Search for review articles on Brassica polyploidy, Ks analysis, and genome evolution.
  • USDA National Agricultural Library: Search plant genetics and crop evolution resources for background reading.
