Livestock *E. coli* Resistance Gene Spread Analysis
ISEF Category: Animal Sciences
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Genetics · Difficulty: Advanced · Setup: University Lab · Time: Full Year
The Hook
Some bacteria collect resistance genes like stickers on a laptop. In livestock-associated E. coli, those stickers can spread through farms, meat supply chains, and waste streams. Public genome databases let you track that spread without starting in a wet lab. That makes this a strong project if you want real-world genetics and serious data analysis.
What Is It?
A pan-genome is the full set of genes found across many strains of the same species. Think of it like a master library, where each strain carries a different stack of books. Some genes show up in almost every strain (the core genome), while others appear only in a few (the accessory genome). In this project, you look for antimicrobial-resistance genes inside livestock-associated E. coli genomes and compare how those genes cluster across strains.
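To make the library picture concrete, here is a minimal sketch that splits a pan-genome into genes present in nearly every strain (often called core) and genes found in only some strains (accessory). The strain names, gene sets, and the 99% cutoff are illustrative assumptions, not values from a real dataset, and pan-genome tools may define the cutoff differently.

```python
# Hypothetical toy data: strain name -> set of gene names found in that genome.
strain_genes = {
    "strain_A": {"gyrA", "rpoB", "blaTEM-1", "tet(A)"},
    "strain_B": {"gyrA", "rpoB", "tet(A)"},
    "strain_C": {"gyrA", "rpoB", "sul2"},
    "strain_D": {"gyrA", "rpoB", "blaTEM-1"},
}


def split_pan_genome(strain_genes, core_threshold=0.99):
    """Return (pan, core, accessory) gene sets.

    core_threshold is an assumed cutoff: a gene counts as core when it is
    present in at least this fraction of strains.
    """
    n_strains = len(strain_genes)
    # The pan-genome is the union of every gene seen in any strain.
    pan = set().union(*strain_genes.values())
    # Count how many strains carry each gene.
    counts = {
        gene: sum(gene in genes for genes in strain_genes.values())
        for gene in pan
    }
    core = {g for g, c in counts.items() if c / n_strains >= core_threshold}
    accessory = pan - core
    return pan, core, accessory


pan, core, accessory = split_pan_genome(strain_genes)
print("pan-genome size:", len(pan))
print("core genes:", sorted(core))
print("accessory genes:", sorted(accessory))
```

In this toy set the resistance genes all land in the accessory group, which is the part of the pan-genome this project focuses on.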
The key idea is gene spread. A resistance gene can move between bacteria through plasmids, which are small DNA circles that act like trading cards. If closely related livestock strains share the same resistance genes, common ancestry is the simplest explanation. If distantly related strains share them, gene transfer becomes more likely. Public NCBI Pathogen Detection data gives you genomes, metadata, and a place to look for those patterns.
Why This Is a Good Topic
This topic is a strong science fair choice because it is testable with public data, but still feels like real research. You can ask clear questions about which host sources, lineages, or time periods carry more resistance genes, then back your claims with statistics and genome comparisons. It connects to food safety, antibiotic stewardship, and animal health. A student can learn bioinformatics, data cleaning, visualization, and basic genomic reasoning without needing to grow bacteria.
Research Questions
- How does the number of antimicrobial-resistance genes differ between livestock-associated E. coli strains from cattle, pigs, and poultry?
- What is the effect of collection year on the diversity of resistance genes found in livestock-associated E. coli strains?
- Does phylogenetic lineage predict which resistance genes appear together in the same E. coli genomes?
- Do plasmid-linked resistance genes co-occur more often in livestock-associated strains than in non-livestock strains?
- Which resistance gene families are most strongly associated with specific livestock hosts in public NCBI Pathogen Detection data?
- How does genome quality filtering change the observed pan-genome size and resistance gene counts?
- To what extent do multidrug-resistance profiles differ across geographic regions in livestock-associated E. coli?
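For the question about collection year and resistance gene diversity, one common summary is the Shannon index. The sketch below is a minimal, assumption-laden example: the years and gene hits are made up, and pooling hits by year is only one of several reasonable ways to define diversity.

```python
import math
from collections import Counter


def shannon_diversity(gene_hits):
    """Shannon index H = -sum(p * ln p) over the observed gene frequencies."""
    counts = Counter(gene_hits)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no hits means no measurable diversity
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical resistance gene hits pooled across genomes from each year.
hits_by_year = {
    2015: ["tet(A)", "tet(A)", "tet(A)", "sul2"],
    2020: ["tet(A)", "sul2", "blaTEM-1", "aadA1"],
}

diversity_by_year = {
    year: shannon_diversity(hits) for year, hits in hits_by_year.items()
}
for year, h in sorted(diversity_by_year.items()):
    print(year, round(h, 3))
```

The index depends on how many genomes were sampled, so years with very different genome counts should be compared with care, for example by subsampling to equal sizes.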
Basic Materials
- Computer with internet access.
- Spreadsheet software or Google Sheets.
- NCBI Pathogen Detection access.
- NCBI assembly and metadata downloads.
- Free text editor for notes and code.
- External storage for downloaded files.
- Basic stats calculator or spreadsheet functions.
Advanced Materials
- Computer with enough RAM for genome tables.
- Python environment with pandas, matplotlib, seaborn, and scipy.
- R with tidyverse and ggplot2.
- AMRFinderPlus or a similar resistance gene caller.
- Pan-genome analysis software such as Roary, Panaroo, or PIRATE.
- Phylogeny viewer such as iTOL or FigTree.
- Genome metadata table from NCBI Pathogen Detection.
- Version control with Git.
Software & Tools
- NCBI Pathogen Detection: Lets you browse isolate groups, download metadata, and compare public E. coli genomes.
- AMRFinderPlus: Identifies antimicrobial-resistance genes from genome sequences.
- Python: Cleans metadata, builds gene presence matrices, and runs summary statistics.
- R: Makes comparison plots, diversity charts, and grouped tests for your dataset.
- iTOL: Annotates trees with host source, year, and resistance profiles.
Experiment Steps
- Define the comparison groups you will test, such as host species, region, or collection year.
- Choose a genome set with consistent metadata and quality filters so your samples are comparable.
- Build a gene presence table that separates core genes from accessory genes and resistance genes.
- Plan a phylogenetic or clustering framework that lets you compare related strains, not just raw counts.
- Select statistics that match your question, such as group comparisons, enrichment tests, or permutation tests.
- Decide how you will present the result, such as an annotated tree, heat map, or pan-genome summary plot.
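The gene presence table from the steps above can be sketched as follows. This is a minimal illustration that assumes the resistance hits have already been exported to a two-column, tab-separated table; the column names `isolate` and `gene` and all values are hypothetical and do not match the output format of any specific tool.

```python
import csv
import io
from collections import defaultdict

# Hypothetical long-format table: one row per (isolate, resistance gene) hit.
raw_tsv = """isolate\tgene
iso1\ttet(A)
iso1\tsul2
iso2\ttet(A)
iso3\tblaTEM-1
"""


def build_presence_matrix(tsv_text, all_isolates):
    """Return (genes, matrix) where matrix[isolate] is a 0/1 list per gene.

    all_isolates is passed explicitly so genomes with zero hits still get
    a row of zeros instead of silently dropping out of the table.
    """
    hits = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        hits[row["isolate"]].add(row["gene"])
    genes = sorted({g for gene_set in hits.values() for g in gene_set})
    matrix = {
        iso: [1 if g in hits.get(iso, set()) else 0 for g in genes]
        for iso in all_isolates
    }
    return genes, matrix


genes, matrix = build_presence_matrix(raw_tsv, ["iso1", "iso2", "iso3", "iso4"])
print("genes:", genes)
for iso, row in matrix.items():
    print(iso, row, "total:", sum(row))
```

Keeping the zero-hit genome (`iso4` here) matters for the statistics, because dropping it would overstate how common resistance genes are.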
Common Pitfalls
- Mixing genomes with very different assembly quality, which can make missing genes look like real biological differences.
- Comparing strains with incomplete host metadata, which can blur livestock, food, and environmental categories.
- Counting duplicate isolates as separate discoveries, which inflates gene frequency and makes one outbreak look bigger than it is.
- Treating every resistance hit as equal, which ignores whether a gene is on a plasmid, a chromosome, or a fragmented contig.
- Skipping lineage control, which can make shared ancestry look like host-specific resistance spread.
What Makes This Competitive
A class-level version of this project stops at simple counts. A stronger entry asks whether resistance patterns still hold after you control for lineage, genome quality, and sampling bias. You can raise the level by comparing multiple livestock hosts, testing enrichment with permutation methods, and tying gene patterns to a phylogeny. Clear visuals and careful filtering matter as much as the final answer.
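One way to test a host difference while controlling for lineage is a stratified permutation test, which shuffles host labels only among isolates of the same lineage. The sketch below is a minimal version under stated assumptions: the lineage labels, hosts, and carriage values are toy data built so that the gene tracks lineage completely, and the function names are hypothetical.

```python
import random

# Hypothetical rows: (lineage, host, carries_gene as 0 or 1).
isolates = (
    [("ST10", "poultry", 1)] * 4 + [("ST10", "cattle", 1)]
    + [("ST58", "cattle", 0)] * 4 + [("ST58", "poultry", 0)]
)


def carriage_gap(rows, host_a, host_b):
    """Difference in gene carriage fraction between two hosts."""
    def fraction(host):
        values = [carries for _, h, carries in rows if h == host]
        return sum(values) / len(values)
    return fraction(host_a) - fraction(host_b)


def stratified_permutation_p(rows, host_a, host_b, n_perm=1000, seed=0):
    """Two-sided p-value from shuffling host labels within each lineage."""
    rng = random.Random(seed)
    observed = carriage_gap(rows, host_a, host_b)
    # Group row indices by lineage so labels never cross lineages.
    by_lineage = {}
    for i, (lineage, _, _) in enumerate(rows):
        by_lineage.setdefault(lineage, []).append(i)
    extreme = 0
    for _ in range(n_perm):
        hosts = [h for _, h, _ in rows]
        for idx in by_lineage.values():
            shuffled = [hosts[i] for i in idx]
            rng.shuffle(shuffled)
            for i, h in zip(idx, shuffled):
                hosts[i] = h
        permuted = [(lin, h, c) for (lin, _, c), h in zip(rows, hosts)]
        gap = carriage_gap(permuted, host_a, host_b)
        if abs(gap) >= abs(observed) - 1e-12:
            extreme += 1
    # Add one to numerator and denominator so p is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


observed, p_value = stratified_permutation_p(isolates, "poultry", "cattle")
print("observed gap:", round(observed, 2), "stratified p:", p_value)
```

In this toy data the raw gap between poultry and cattle is 0.6, but every isolate within a lineage has the same carriage status, so shuffling hosts within lineages never changes the gap and the p-value is 1.0. That is the pattern to expect when shared ancestry, not host, explains a resistance difference.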
Project Variations
- Compare resistance gene patterns in livestock-associated E. coli versus retail meat isolates.
- Test whether one livestock host carries a more diverse accessory genome than the others.
- Add a phylogenetic layer to see whether resistance genes track strain relatedness or host source more closely.
Learn More
- NCBI Pathogen Detection: Search for E. coli isolate clusters, genome metadata, and public download options on the NCBI site.
- AMRFinderPlus documentation: Learn how NCBI names resistance genes and how to interpret genome hits on the NCBI GitHub and NCBI help pages.
- PubMed: Search for review articles on antimicrobial resistance in livestock-associated E. coli and pan-genome analysis.
- NCBI Bookshelf: Find free background chapters on bacterial genetics, plasmids, and horizontal gene transfer.
- USDA ARS: Look for reports and research summaries on antimicrobial resistance in food animals and agricultural microbiology.
Animal Sciences Category Guide
How to Do Real Animal Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
