Wastewater Virome Seasonality in Cities

ISEF Category: Microbiology

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Virology  ·  Difficulty: Advanced  ·  Setup: University Lab  ·  Time: Full Year

The Hook

A city sewer can act like a giant health diary. Every flush carries clues about the viruses moving through a community. You can mine public data and ask whether one harmless virus family rises before norovirus does. That turns wastewater into an early warning system.

What Is It?

This project studies the viral community, or virome, in wastewater. A virome is the full set of viruses found in a sample. In this case, you are not growing viruses in a lab. You are using public sequencing data from wastewater samples and asking what viruses are present, how common they are, and how those patterns change over time.

Think of it like listening to a crowd through a wall. You cannot hear every voice clearly, but you can still tell when the crowd gets louder, quieter, or changes tone. Two tools do most of the listening. Cenote-Taker helps find and label viral sequences in metagenomes, which are sequencing datasets containing genetic material from many organisms at once. CheckV helps judge how complete each viral sequence is. With those calls in hand, you can compare viral abundance across cities and dates, and test whether crAssphage, a common bacteriophage that infects human gut bacteria rather than people, changes before norovirus season begins.

The big idea is not just finding viruses. It is linking environmental sequence data to public health timing. That makes the project part microbiology, part data science, and part epidemiology.

Why This Is a Good Topic

This is a strong science fair topic because it asks a clear question with public data you can actually access. You do not need to culture viruses or collect wastewater yourself. You can build a real analysis pipeline, compare cities, and test whether viral trends line up with reported illness patterns. That gives you room to learn bioinformatics, statistics, and scientific storytelling in one project.

Research Questions

  • How does crAssphage abundance change across cities and seasons in public wastewater metagenomes?
  • What is the effect of city size on the timing of wastewater virome shifts?
  • Does a rise in crAssphage abundance precede the seasonal rise in reported clinical norovirus cases?
  • To what extent do viral community profiles differ between regions with different climate patterns?
  • Which quality control filters change the number of viral contigs recovered from wastewater metagenomes?
  • How does CheckV completeness scoring affect which viral sequences stay in the final analysis?
  • What is the effect of sampling frequency on the ability to detect a pre-season viral signal?

Basic Materials

  • Computer with enough storage for large sequencing files.
  • Stable internet connection for downloading public datasets.
  • Free NCBI SRA Toolkit for retrieving metagenome reads.
  • Free command line access, such as a terminal on macOS or Linux, or Windows Subsystem for Linux.
  • Spreadsheet software for tracking sample metadata.
  • Public wastewater sample metadata from SRA, BioProject, or published supplement files.
  • Public norovirus surveillance data from CDC, state health departments, or peer-reviewed papers.
  • External hard drive or cloud storage for backups.
  • Notebook for recording sample IDs, dates, and filtering decisions.

Advanced Materials

  • Access to a university or school compute server for large read sets.
  • Conda or another package manager for reproducing the analysis environment.
  • CheckV for viral sequence quality assessment.
  • Cenote-Taker for viral discovery and annotation.
  • FastQC or MultiQC for read quality summaries.
  • Bowtie2 or a similar mapper for abundance calculations.
  • R or Python for statistics and plotting.
  • BLAST or HMM-based databases for confirming taxonomic calls.
  • Snakemake or Nextflow for building a reproducible pipeline.

Software & Tools

  • NCBI SRA Toolkit: Downloads public wastewater sequencing reads from the Sequence Read Archive.
  • CheckV: Estimates viral contig completeness and contamination.
  • Cenote-Taker: Detects and annotates viral sequences in metagenomic data.
  • R: Cleans metadata, runs statistics, and makes time series plots.
  • Python: Automates file handling and helps compare viral abundance across samples.
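
As a rough illustration of how Python can automate the download step, the sketch below builds the two SRA Toolkit command lines for one sequencing run without executing them. The helper name `sra_download_commands`, the output folder, and the placeholder accession `SRR0000000` are assumptions made for this example; check the flags against the SRA Toolkit version you install.

```python
import shlex


def sra_download_commands(run_accession, out_dir="reads", threads=4):
    """Return the prefetch and fasterq-dump command lines for one SRA run.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    if not run_accession.startswith(("SRR", "ERR", "DRR")):
        raise ValueError(f"not an SRA run accession: {run_accession!r}")
    # prefetch downloads the archived run to local storage.
    prefetch = ["prefetch", run_accession]
    # fasterq-dump converts the archived run into FASTQ files.
    dump = [
        "fasterq-dump", run_accession,
        "--split-files",
        "--threads", str(threads),
        "--outdir", out_dir,
    ]
    return [prefetch, dump]


# Placeholder accession, not a real wastewater sample.
for command in sra_download_commands("SRR0000000"):
    print(shlex.join(command))
```

Printing the commands first lets you record them in your notebook before anything large is downloaded. To actually run them, each list could be passed to `subprocess.run`.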

Experiment Steps

  1. Define one public-health question that your data can answer, then narrow it to a specific virus, a date range, and a city set.
  2. Choose a sample inclusion rule so your results come from comparable wastewater metagenomes instead of a mixed bag of sequencing projects.
  3. Build a viral discovery pipeline that separates raw reads, assembled contigs, and quality-controlled viral calls.
  4. Decide how you will turn sequence counts into abundance values that you can compare across samples and cities.
  5. Plan a time-lag analysis that tests whether wastewater viral changes happen before clinical norovirus reports.
  6. Design negative and sensitivity checks so you can tell whether your signal survives different filtering choices.
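
One simple way to approach the time-lag analysis in step 5 is a lagged correlation: shift the clinical series by 0, 1, 2, ... weeks and see which shift lines up best with the wastewater series. The sketch below is a minimal version under stated assumptions. The weekly values are made up, the function names are hypothetical, and both series are assumed to cover the same weeks with no gaps. Seasonal series are strongly autocorrelated, so a high correlation here is a starting point for further checks, not proof of a lead.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else float("nan")


def lagged_correlations(wastewater, clinical, max_lag):
    """Correlate wastewater[t] with clinical[t + lag] for lag = 0..max_lag."""
    if len(wastewater) != len(clinical):
        raise ValueError("series must cover the same weeks")
    results = {}
    for lag in range(max_lag + 1):
        x = wastewater[: len(wastewater) - lag]  # drop the last `lag` weeks
        y = clinical[lag:]                       # drop the first `lag` weeks
        results[lag] = pearson(x, y)
    return results


# Made-up weekly values, for illustration only.
wastewater = [1, 2, 4, 7, 9, 7, 4, 2, 1, 1, 2, 4]
# Synthetic clinical series built as the wastewater series delayed by 2 weeks.
clinical = [1, 1] + wastewater[:-2]

correlations = lagged_correlations(wastewater, clinical, max_lag=4)
best_lag = max(correlations, key=correlations.get)
print(best_lag, round(correlations[best_lag], 3))
```

Because the clinical series here was constructed as a two-week delay of the wastewater series, the best lag comes out as 2. With real data, the interesting question is whether the best lag stays stable across cities, seasons, and filtering choices.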

Common Pitfalls

  • Mixing wastewater samples from different sequencing protocols, which can make viral abundance comparisons look biological when they are technical.
  • Treating every viral contig as real without checking completeness and contamination, which inflates false positives.
  • Comparing raw read counts instead of normalized abundance, which rewards deeper sequencing runs rather than stronger signals.
  • Using clinical norovirus reports with mismatched dates or regions, which breaks the lag analysis.
  • Ignoring batch effects from assembly and annotation settings, which can change the apparent virome composition more than the city itself.
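
The raw-count pitfall above is easy to see with a small calculation. The sketch below uses one common normalization, reads per kilobase of contig per million sequenced reads, as an assumed choice; other schemes exist, and the function name and all numbers are hypothetical.

```python
def reads_per_kb_per_million(mapped_reads, contig_length_bp, total_reads):
    """Normalize a mapped-read count by contig length and sequencing depth."""
    if contig_length_bp <= 0 or total_reads <= 0:
        raise ValueError("contig length and total reads must be positive")
    per_kb = mapped_reads / (contig_length_bp / 1_000)
    return per_kb / (total_reads / 1_000_000)


# Made-up numbers: the same 100,000 bp viral contig in two samples.
sample_a = reads_per_kb_per_million(
    mapped_reads=500, contig_length_bp=100_000, total_reads=10_000_000
)
sample_b = reads_per_kb_per_million(
    mapped_reads=1_000, contig_length_bp=100_000, total_reads=40_000_000
)

print(sample_a, sample_b)
```

Sample B has twice as many raw reads mapped to the contig, but it was sequenced four times as deeply, so its normalized abundance is half that of sample A. Comparing raw counts would have pointed in the wrong direction.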

What Makes This Competitive

A stronger project goes beyond plotting a few city trends. You would build a clean, reproducible pipeline, justify every filter, and test whether your result holds under different normalization methods. You would also compare multiple cities, multiple seasons, and more than one statistical approach for lead-lag timing. If you can show that a signal survives those checks, your project looks much more like real research.
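
One way to show that a filter is justified is to sweep its threshold and report how many contigs survive at each setting. The sketch below does this for a CheckV-style quality table. The rows are invented, the helper `contigs_passing` is hypothetical, and the column names (`contig_id`, `completeness`, `contamination`) are assumed to match CheckV's quality summary output, so confirm them against the files your version produces.

```python
import csv
import io

# Invented rows in the shape of a CheckV quality summary table.
HEADER = ["contig_id", "contig_length", "checkv_quality",
          "completeness", "contamination"]
ROWS = [
    ["c1", "45000", "High-quality", "95.2", "0.0"],
    ["c2", "12000", "Medium-quality", "62.0", "3.1"],
    ["c3", "3000", "Low-quality", "18.5", "0.0"],
    ["c4", "2500", "Not-determined", "NA", "0.0"],
    ["c5", "30000", "Medium-quality", "71.4", "25.0"],
]
EXAMPLE_TSV = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS)


def contigs_passing(rows, min_completeness, max_contamination=10.0):
    """Return IDs of contigs that meet both thresholds (values in percent)."""
    kept = []
    for row in rows:
        if row["completeness"] == "NA":
            continue  # no completeness estimate, so the contig cannot pass
        complete_enough = float(row["completeness"]) >= min_completeness
        clean_enough = float(row["contamination"]) <= max_contamination
        if complete_enough and clean_enough:
            kept.append(row["contig_id"])
    return kept


rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))

# Sweep the completeness threshold and record how many contigs remain.
survivors = {t: len(contigs_passing(rows, t)) for t in (0, 50, 90)}
print(survivors)
```

Reporting a table like this alongside your main result lets a judge see how sensitive the final contig set is to the cutoff you chose.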

Project Variations

  • Swap norovirus for another clinical target, such as influenza or enterovirus, and test whether wastewater virome shifts still lead case reports.
  • Focus on one region or climate zone, then compare how viral seasonality changes between coastal and inland cities.
  • Analyze the effect of sequencing depth and assembly method on how many viral contigs CheckV and Cenote-Taker recover.

Learn More

  • NCBI Sequence Read Archive: Find public metagenomic datasets and sample metadata by searching wastewater and virome studies.
  • CheckV paper in Nature Biotechnology: Read the method paper to understand how viral completeness scoring works. You can find it through PubMed.
  • Cenote-Taker 2 paper in Virus Evolution: Learn how the tool finds and annotates viral sequences. You can find the article through PubMed or the journal site.
  • CDC Wastewater Surveillance: Review how public health agencies report wastewater data, and look for state and national dashboards.
  • NIH PubMed: Search for review articles on wastewater virology, norovirus seasonality, and metagenomic analysis.
  • NCBI BioProject: Find linked project pages for wastewater sequencing studies and their associated sample records.
