Yeast Fitness Screens and Buffering Networks

ISEF Category: Cellular and Molecular Biology

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Genetics · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

Some genes only matter when another gene fails. That backup system is called buffering, and it helps cells survive stress. You can map that hidden support network with public yeast data, then ask whether the same logic predicts weak spots in human cells. This turns a huge genetics problem into a data science project you can actually attack.

What Is It?

Cells do not run on single genes acting alone. They work more like a team, where one gene can cover for another. When one player leaves, the team may still function. When two backup paths fail together, the cell can crash. That double failure is called synthetic lethality, and the two genes involved form a synthetic lethal pair.

This project asks you to find those backup patterns in yeast deletion data. A deletion collection is a large set of yeast strains, each missing a single gene. Researchers measure how well each strain grows under stresses such as heat, salt, or drugs. If two deletion strains show similar growth patterns across many stresses, the missing genes may belong to the same buffering network. You can turn those patterns into a graph, where genes are nodes and shared stress sensitivity is the link between them.
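To make the graph idea concrete, here is a minimal sketch in plain Python. It assumes each gene has one fitness score per stress condition and that a Pearson correlation at or above a cutoff counts as a link. The gene names, the scores, and the 0.8 cutoff are all invented for illustration; your real data and scoring rule may look different.

```python
from itertools import combinations
from math import sqrt

# Toy fitness scores (made-up numbers): one row per deletion strain,
# one value per stress condition (for example heat, salt, drug 1, drug 2).
fitness = {
    "geneA": [-1.2, -0.1, -0.9, 0.0],
    "geneB": [-1.0, 0.0, -1.1, 0.1],
    "geneC": [0.1, -1.3, 0.0, -0.8],
    "geneD": [0.0, -1.1, 0.2, -1.0],
}

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sx * sy)

def build_edges(profiles, threshold):
    """Return {(gene1, gene2): r} for every pair with r >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if r >= threshold:
            edges[(g1, g2)] = r
    return edges

edges = build_edges(fitness, threshold=0.8)
for pair, r in edges.items():
    print(pair, round(r, 3))
```

In this toy set, geneA and geneB are both hit by the same two stresses, and so are geneC and geneD, so only those two links survive the cutoff. With thousands of genes you would likely switch to the pandas and numpy tools listed under Advanced Materials.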

The next step is cross-species testing. Human DepMap CRISPR screens measure how much each of many cancer cell lines depends on each gene for growth or survival. If your yeast-based network predicts which human gene pairs might also act as backups, that gives your model real biological value. In plain terms, you are asking whether yeast can help point to weak spots in human cells.
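Moving from yeast pairs to human pairs needs an ortholog table. The sketch below shows one cautious way to do that step, assuming you already have a yeast-to-human mapping from a source you trust. The table, the gene names, and the choice to skip one-to-many matches are all hypothetical here.

```python
# Hypothetical ortholog table: yeast gene -> list of human genes.
orthologs = {
    "yA": ["HA"],
    "yB": ["HB"],
    "yC": ["HC1", "HC2"],  # one yeast gene, two human candidates
    # yD is absent on purpose: no human ortholog listed
}

# Hypothetical yeast pairs predicted by a buffering network.
yeast_pairs = [("yA", "yB"), ("yA", "yC"), ("yB", "yD")]

def map_pairs(pairs, table, one_to_one_only=True):
    """Translate yeast gene pairs into human gene pairs.

    Pairs are skipped, with a reason, when a gene has no ortholog or
    (optionally) when the ortholog match is not one-to-one.
    """
    mapped, skipped = [], []
    for g1, g2 in pairs:
        h1, h2 = table.get(g1, []), table.get(g2, [])
        if not h1 or not h2:
            skipped.append((g1, g2, "no ortholog"))
        elif one_to_one_only and (len(h1) > 1 or len(h2) > 1):
            skipped.append((g1, g2, "ambiguous ortholog"))
        else:
            mapped.extend(tuple(sorted((a, b))) for a in h1 for b in h2)
    return mapped, skipped

human_pairs, skipped = map_pairs(yeast_pairs, orthologs)
print("mapped:", human_pairs)
print("skipped:", skipped)
```

Keeping the skipped list, rather than dropping those pairs silently, lets you report how much of the yeast network could actually be tested in human data.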

Why This Is a Good Topic

This is a strong science fair topic because the question is clear, measurable, and built from public data. You can test whether gene buffering patterns across many stress conditions predict synthetic-lethal interactions better than chance. The topic connects to cancer biology, drug target discovery, and basic genetics. You can learn data cleaning, network analysis, and validation, which are all skills judges like to see.

Research Questions

How does combining yeast fitness screens across multiple stressors change the accuracy of synthetic-lethal predictions?
What is the effect of different similarity measures, such as correlation versus mutual information, on buffering network quality?
Does a buffering network built from one stressor predict gene interactions under a different stressor?
To what extent do yeast buffering links overlap with essential gene patterns seen in DepMap CRISPR screens?
Which yeast gene groups form the most conserved backup modules when compared with human cell data?
How does removing low-quality screens change the stability of the predicted network?
What is the effect of network threshold choice on the number of predicted synthetic-lethal pairs?
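The question about similarity measures can be explored with a small side-by-side test. The sketch below compares Pearson correlation with a simple binned mutual information score on two invented profiles. The three-state binning and the 0.5 cutoff are assumptions made for this example, and mutual information estimated from only a few conditions is noisy, so treat the output as an illustration and not a benchmark.

```python
from collections import Counter
from math import log2, sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def discretize(profile, cutoff=0.5):
    """Bin scores into -1 (sensitive), 0 (neutral), 1 (resistant)."""
    return [-1 if v <= -cutoff else (1 if v >= cutoff else 0) for v in profile]

def mutual_information(x, y):
    """Mutual information in bits between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )

# Invented profiles: gene_y is sensitive whenever gene_x is extreme
# in either direction, a pattern a straight line cannot capture.
gene_x = [-1.0, -0.9, 0.0, 0.1, 1.0, 0.9]
gene_y = [-1.0, -1.0, 0.0, 0.0, -1.0, -1.0]

r = pearson(gene_x, gene_y)
mi = mutual_information(discretize(gene_x), discretize(gene_y))
print(f"pearson r = {r:.3f}, mutual information = {mi:.3f} bits")
```

Here the correlation is close to zero while the mutual information is high, which is the kind of disagreement worth tracking when you compare the two measures on real screens.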

Basic Materials

Computer with internet access.
Spreadsheet software or Google Sheets.
Python installed through Anaconda or another free distribution.
Public yeast fitness screen datasets from a journal supplement or GEO.
DepMap public CRISPR screen data.
Basic statistics reference notes or a free online statistics text.
External drive or cloud storage for versioned data files.

Advanced Materials

Computer with internet access and enough memory for large matrices.
Python with pandas, numpy, scipy, scikit-learn, and networkx.
R with tidyverse and igraph.
Jupyter Notebook or RStudio.
Access to GO term enrichment tools through the Gene Ontology website.
Human cell line dependency data from DepMap.
Yeast interaction data from BioGRID or SGD for external comparison.
Optional access to a university Linux server for larger graph runs.

Software & Tools

Python: Cleans fitness matrices, calculates similarity scores, and builds the buffering network.
R: Runs statistics, plots network summaries, and compares models.
Cytoscape: Visualizes gene interaction graphs and highlights modules.
ImageJ: Optional. This project does not need it unless you add microscopy data.
Jupyter Notebook: Keeps code, notes, and figures in one place for reproducible analysis.
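One way to hand a network from Python to Cytoscape is a tab-separated edge table with a source column, a target column, and a score column, since Cytoscape can import networks from delimited text tables. The sketch below builds that table as text. The column names and the example edges are assumptions for illustration, and in a real project you would write the text to a file.

```python
import csv
import io

def edges_to_tsv(edges):
    """Turn {(gene1, gene2): score} into tab-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "target", "similarity"])
    for (g1, g2), score in sorted(edges.items()):
        writer.writerow([g1, g2, f"{score:.3f}"])
    return buffer.getvalue()

# Invented edges, as a network-building step might produce them.
edges = {("geneC", "geneD"): 0.953, ("geneA", "geneB"): 0.963}
table = edges_to_tsv(edges)
print(table)
```

Sorting the edges before writing keeps the file identical from run to run, which makes versioned data files easier to compare.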

Experiment Steps

Define the gene set you will analyze and decide which public screens count as usable evidence.
Assemble a clean matrix of gene fitness values across stress conditions and remove screens with obvious quality problems.
Choose a rule for turning shared stress response into a buffering link, then test more than one scoring method.
Build the network and decide how you will identify modules, hubs, and candidate synthetic-lethal pairs.
Compare your yeast predictions with independent datasets, including known interaction databases and DepMap CRISPR results.
Stress-test your result by changing thresholds, dropping one dataset at a time, and checking whether the network still holds.
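The last step, dropping one dataset at a time, can be sketched as a leave-one-condition-out check. The code below rebuilds a correlation network with each stress condition removed and scores how much of the full network survives, using the Jaccard index of the two edge sets. The data, the 0.8 cutoff, and the choice of Jaccard as the stability score are all assumptions for this example.

```python
from itertools import combinations
from math import sqrt

# Toy fitness scores (made-up numbers), one value per stress condition.
fitness = {
    "geneA": [-1.2, -0.1, -0.9, 0.0],
    "geneB": [-1.0, 0.0, -1.1, 0.1],
    "geneC": [0.1, -1.3, 0.0, -0.8],
    "geneD": [0.0, -1.1, 0.2, -1.0],
}

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def edge_set(profiles, threshold, skip=None):
    """Edges with r >= threshold, optionally ignoring one condition index."""
    kept = {
        gene: [v for i, v in enumerate(values) if i != skip]
        for gene, values in profiles.items()
    }
    return {
        (a, b)
        for a, b in combinations(sorted(kept), 2)
        if pearson(kept[a], kept[b]) >= threshold
    }

def jaccard(s1, s2):
    """Shared edges divided by all edges seen in either network."""
    if not s1 and not s2:
        return 1.0
    return len(s1 & s2) / len(s1 | s2)

n_conditions = len(next(iter(fitness.values())))
full = edge_set(fitness, threshold=0.8)
stability = {
    i: jaccard(full, edge_set(fitness, threshold=0.8, skip=i))
    for i in range(n_conditions)
}
print(stability)
```

A score near 1 means the network barely changes when that condition is removed. A low score flags a single screen that is driving many of your links.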

Common Pitfalls

Mixing datasets that use different gene IDs, which causes silent mismatches between yeast and human comparisons.
Treating every weak correlation as a real buffering link, which fills the graph with false positives.
Ignoring missing screens or poor-quality measurements, which can make one stress condition dominate the whole model.
Comparing yeast and human genes without mapping orthologs carefully, which creates fake cross-species matches.
Changing the network threshold after looking at the answer, which makes the final result look stronger than it really is.
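The gene ID pitfall is easy to check before any analysis starts. The sketch below normalizes IDs from two datasets and reports which genes match and which do not. The alias table and all IDs are hypothetical placeholders, and real yeast data would need a proper name-to-systematic-ID table from a source such as SGD.

```python
# Hypothetical alias table: alternate name -> preferred ID.
aliases = {"ABC1": "Y0001", "ABC2": "Y0002"}

def normalize(gene_id, alias_table):
    """Trim whitespace, uppercase, then swap known aliases for preferred IDs."""
    key = gene_id.strip().upper()
    return alias_table.get(key, key)

def overlap_report(ids_a, ids_b, alias_table):
    """Compare two ID lists after normalization."""
    a = {normalize(g, alias_table) for g in ids_a}
    b = {normalize(g, alias_table) for g in ids_b}
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }

# Invented ID lists with mixed naming, stray spaces, and mixed case.
screen_ids = ["Y0001", "abc2 ", "Y0003"]
interaction_ids = ["ABC1", "Y0002", "Y0004"]

report = overlap_report(screen_ids, interaction_ids, aliases)
print(report)
```

If the two unmatched lists are large, that is a sign to fix the ID mapping first, before any correlation or network step.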

What Makes This Competitive

A strong version of this project does more than make a pretty network. It compares multiple ways to define buffering, then tests which one predicts known interactions best. It also uses outside validation, like DepMap and curated interaction databases, instead of relying on one dataset. The best projects show careful statistics, clear biological reasoning, and a result that could guide real target discovery.
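Testing whether a method predicts known interactions better than chance can start with an overlap test. The sketch below uses a hypergeometric tail probability: if your predicted pairs were drawn at random from all candidate pairs, how likely is an overlap with known interactions at least this large? The counts are invented, and this simple test assumes every candidate pair is equally likely to be picked, which real networks may violate.

```python
from math import comb

def hypergeom_pvalue(universe, known, predicted, overlap):
    """P(X >= overlap) for random predictions.

    universe:  number of candidate gene pairs considered
    known:     how many of those are known interactions
    predicted: how many pairs your method predicted
    overlap:   how many predictions are known interactions
    """
    total = comb(universe, predicted)
    upper = min(known, predicted)
    tail = sum(
        comb(known, k) * comb(universe - known, predicted - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Invented counts for a tiny example.
p_small = hypergeom_pvalue(universe=10, known=3, predicted=3, overlap=2)
print(f"p = {p_small:.4f}")
```

In this tiny example the p-value is 22/120, about 0.18, so two hits out of three predictions would not be convincing on their own. Running the same test for each buffering definition gives you a common scale for comparing them.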

Project Variations

Use stress-response data from a different model organism, such as fission yeast, to see whether the same buffering logic still works.
Focus on one stress class, such as DNA damage or osmotic stress, and ask whether narrow networks beat broad ones.
Compare graph-based prediction with a simpler machine learning model to see which method better finds synthetic-lethal pairs.

Learn More

The Saccharomyces Genome Database: Search SGD for gene deletion phenotypes, interaction data, and functional annotations.
Gene Ontology Consortium: Use GO terms to group genes by process and test whether buffering modules share biology.
PubMed: Search review articles on synthetic lethality, genetic buffering, and yeast fitness screens.
DepMap Portal: Find public CRISPR dependency data and gene effect scores for human cell lines.
BioGRID: Look up curated genetic and physical interaction data for comparison with your predicted network.
MIT OpenCourseWare: Search for free genetics, systems biology, and computational biology lecture materials.


