CRISPR Off-Target Binding With Gillespie Models

ISEF Category: Computational Biology and Bioinformatics

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Computational Biomodeling · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months

The Hook

CRISPR can act like a GPS that accepts a typo. Cas9 can tolerate a few mismatched letters between its guide RNA and the DNA, so it sometimes lands on a look-alike site, and that matters if you want clean edits. Your model can help predict which guide designs stay on target and which ones wander.

What Is It?

CRISPR-Cas9 is a gene editing system. Cas9 is the cutting protein, and the guide RNA is the address label that tells it where to go. On-target binding means Cas9 finds the DNA sequence you want. Off-target binding means it grabs a similar sequence by mistake.

A Gillespie model is a way to simulate random events step by step. At each step it draws a random waiting time until the next event, then picks which event happens, such as binding, unbinding, or cutting, with odds weighted by each event's rate. Think of it like rolling loaded dice for each possible next move. That matters here because molecular search is not smooth or perfectly predictable. In a crowded nucleus, molecules bump into each other, slow down, and compete for attention. Your model can treat those random choices as biology, not noise.
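To make the idea concrete, here is a minimal sketch of a Gillespie loop for one Cas9 molecule at one DNA site. The three states, the function name gillespie_single_site, and every rate value are assumptions chosen for illustration; they are not measured CRISPR kinetics.

```python
import random

def gillespie_single_site(k_on, k_off, k_cut, t_max, rng):
    """Simulate one Cas9 at one site until cleavage or until t_max seconds.

    States: 'free' -> 'bound' (k_on), 'bound' -> 'free' (k_off),
    'bound' -> 'cut' (k_cut). Rates are illustrative, in 1/s.
    Returns (final_state, elapsed_time, number_of_binding_events).
    """
    t, state, n_bindings = 0.0, "free", 0
    while state != "cut":
        # List the events that are possible from the current state.
        if state == "free":
            events = [("bound", k_on)]
        else:
            events = [("free", k_off), ("cut", k_cut)]
        total = sum(rate for _, rate in events)
        # The waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_max:
            t = t_max
            break
        # Pick which event fires, weighted by its rate.
        threshold = rng.random() * total
        cumulative = 0.0
        for next_state, rate in events:
            cumulative += rate
            if threshold < cumulative:
                break
        state = next_state
        if state == "bound":
            n_bindings += 1
    return state, t, n_bindings

rng = random.Random(0)  # fixed seed so the run can be repeated exactly
runs = [gillespie_single_site(1.0, 0.5, 0.1, 1000.0, rng) for _ in range(200)]
mean_bindings = sum(n for _, _, n in runs) / len(runs)
print(f"mean binding events before cleavage: {mean_bindings:.1f}")
```

Each run gives a different answer, which is the point: you report the distribution over many runs, not a single trajectory.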

This project asks how guide sequence, mismatch position, and crowding change the odds of off-target hits. You are not building a wet-lab CRISPR system. You are building a computational testbed that turns published binding ideas into predictions you can compare across designs.

Why This Is a Good Topic

This is a strong science fair topic because you can test real biology with code, math, and published parameters. You can change one guide feature at a time, then measure how the predicted binding outcomes shift. The project connects to gene editing safety, which is a real problem in medicine and biotech. You can also learn stochastic simulation, parameter sweeps, and model validation without needing a lab bench.

Research Questions

How does guide-RNA mismatch position change the predicted ratio of on-target to off-target binding events?
What is the effect of nuclear crowding on the mean time for Cas9 to find its target site?
Does increasing guide-RNA length reduce or increase off-target binding in a stochastic model?
To what extent do mismatch counts versus mismatch positions explain off-target binding probability?
Which guide designs produce the largest separation between on-target and off-target dwell times?
How does the assumed rate of unbinding change the ranking of candidate guide RNAs?
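As one example of turning a question into a simulation, the crowding question could be sketched as a search in which Cas9 gets sidetracked by decoy sites before it finds the target. Treating crowding as competition from decoys is only one possible assumption (slowing the target-binding rate itself is another), and the names search_time and mean_search_time and all rate values below are hypothetical.

```python
import random
import statistics

def search_time(k_target, k_decoy, n_decoys, k_off_decoy, rng):
    """Time (s) for one Cas9 to first bind its target, with decoy detours."""
    t = 0.0
    total = k_target + k_decoy * n_decoys  # total binding rate while free
    while True:
        t += rng.expovariate(total)         # wait for the next binding event
        if rng.random() * total < k_target:
            return t                        # the event was target binding
        t += rng.expovariate(k_off_decoy)   # dwell on a decoy, then release

def mean_search_time(n_decoys, n_runs=4000, seed=1):
    """Average search time over many runs, using illustrative rates."""
    rng = random.Random(seed)
    times = [search_time(0.01, 0.001, n_decoys, 0.1, rng) for _ in range(n_runs)]
    return statistics.mean(times)

for n_decoys in (0, 50, 200):
    print(f"{n_decoys:>3} decoys: mean search time = {mean_search_time(n_decoys):.0f} s")
```

For this simple model the exact mean is (1 + n_decoys * k_decoy / k_off_decoy) / k_target, so you can check the simulation against a known answer before adding more biology.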

Basic Materials

Laptop or desktop computer with enough memory to run simulations.
Python installed with NumPy, SciPy, pandas, and Matplotlib.
Jupyter Notebook or another Python notebook editor.
Spreadsheet software for tracking simulation outputs.
Public CRISPR guide design and off-target reference tables from journal articles or NIH-linked resources.
Notebook or lab journal for assumptions, parameters, and model choices.

Advanced Materials

Access to a faster workstation or cloud compute for large parameter sweeps.
Python with pandas, NumPy, SciPy, Matplotlib, Seaborn, and statsmodels.
Git for version control and reproducibility tracking.
Public datasets of validated CRISPR off-target sites from peer-reviewed papers or databases.
Optional molecular context data from ENCODE, if you compare chromatin state or accessibility.
Benchmark test cases with known analytic answers, such as a simple two-state binding model, for checking your stochastic simulation and sensitivity analysis.

Software & Tools

Python: Runs the Gillespie simulation, parameter sweeps, and result plots.
Jupyter Notebook: Keeps code, notes, and figures in one place for easy revision.
pandas: Organizes simulation outputs into tables for comparison across guide designs.
Matplotlib: Plots binding probabilities, dwell times, and off-target risk curves.
SciPy: Supports statistical testing, fitting, and distribution analysis.

Experiment Steps

Define the exact binding events your model will track, such as target search, mismatch recognition, unbinding, and cleavage.
Choose the guide-RNA features you will vary first, such as mismatch count, mismatch position, or guide length.
Assign reaction rates from published CRISPR kinetics studies and record where each number came from.
Build a baseline Gillespie simulation for one guide and one target, then check that the output behaves sensibly.
Add crowding or competition terms and compare whether the model still predicts realistic search behavior.
Run a parameter sweep, rank guide designs by specificity, and test which assumptions change the ranking most.
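The steps above can be sketched in a few lines once you decide how a mismatch changes a rate. The sketch below assumes that a mismatch only speeds up unbinding, and that mismatches in the PAM-proximal seed region are penalized more than distal ones. The seed length, both penalty factors, the rate values, and the function names are placeholders, not published parameters; in your project each number should come from a cited source.

```python
import random

SEED_LENGTH = 10  # assumed size of the PAM-proximal seed region (placeholder)

def k_off_with_mismatch(position, k_off_match=0.01,
                        seed_penalty=100.0, distal_penalty=5.0):
    """Return an illustrative unbinding rate (1/s) for a single mismatch.

    position counts from the PAM-proximal end (1) toward the distal end;
    None means a perfect match.
    """
    if position is None:
        return k_off_match
    penalty = seed_penalty if position <= SEED_LENGTH else distal_penalty
    return k_off_match * penalty

def cleavage_fraction(k_off, k_cut, n_runs, rng):
    """Fraction of binding events that end in cleavage rather than release."""
    cuts = 0
    for _ in range(n_runs):
        # First-reaction variant of Gillespie: draw a waiting time for each
        # competing event and let the earlier one win.
        if rng.expovariate(k_cut) < rng.expovariate(k_off):
            cuts += 1
    return cuts / n_runs

rng = random.Random(42)
K_CUT = 0.1  # illustrative cleavage rate (1/s)
sweep = {}
for position in (None, 2, 8, 14, 19):
    k_off = k_off_with_mismatch(position)
    sweep[position] = cleavage_fraction(k_off, K_CUT, n_runs=5000, rng=rng)

for position, fraction in sweep.items():
    label = "perfect match" if position is None else f"mismatch at {position}"
    print(f"{label:>15}: cleavage fraction = {fraction:.3f}")
```

For two competing events the expected fraction is k_cut / (k_cut + k_off), which gives you a sanity check on the baseline before you add crowding or competition terms.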

Common Pitfalls

Using one fixed rate for every binding step, which hides the effect of mismatch position on specificity.
Mixing up off-target binding probability with off-target cutting probability. Cas9 can bind a mismatched site and release it without cutting, so the two are not the same thing.
Treating all mismatches as equal, which can flatten the sequence effects that matter most.
Choosing parameters from unrelated systems, which makes the simulation look precise while the biology is wrong.
Running too few stochastic repeats, which makes random noise look like a real design trend.
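The last pitfall is easy to see with numbers. This sketch estimates a single event probability from different numbers of repeats and reports an approximate 95% interval for each. The true probability of 0.10 is a made-up value, and the normal-approximation interval used here is a rough choice that becomes unreliable for very small run counts or probabilities near 0 or 1.

```python
import math
import random

def estimate_with_ci(true_p, n_runs, rng):
    """Estimate an event probability from n_runs simulated yes/no outcomes.

    Returns the estimate and the half-width of an approximate 95%
    confidence interval (normal approximation).
    """
    hits = sum(rng.random() < true_p for _ in range(n_runs))
    p_hat = hits / n_runs
    half_width = 1.96 * math.sqrt(p_hat * (1.0 - p_hat) / n_runs)
    return p_hat, half_width

rng = random.Random(7)
TRUE_P = 0.10  # hypothetical off-target binding probability
results = {n: estimate_with_ci(TRUE_P, n, rng) for n in (20, 200, 2000, 20000)}
for n, (p_hat, half_width) in results.items():
    print(f"{n:>6} runs: estimate = {p_hat:.3f} +/- {half_width:.3f}")
```

The interval shrinks roughly with the square root of the number of runs, so ten times the precision costs about a hundred times the repeats. If two guide designs differ by less than the interval width, the simulation has not yet separated them.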

What Makes This Competitive

A class-level project usually runs one simulation and stops there. A stronger project compares several guide designs, tests sensitivity to key rate constants, and checks whether the ranking stays stable under repeated runs. You can also make the work sharper by validating your model against published off-target datasets instead of only reporting raw simulation output. That extra layer of comparison is what makes the project feel like real biomodeling.
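One way to check whether a ranking is stable is to repeat the whole comparison under different random seeds and count how often the same order comes out. The three guides and their probabilities below are invented placeholders, and the specificity score (the share of cleavage events that are on-target, given equal numbers of on-target and off-target encounters) is just one possible definition.

```python
import random
from collections import Counter

# Hypothetical guides: (on-target, off-target) cleavage probability per encounter.
GUIDES = {
    "guide_A": (0.90, 0.05),
    "guide_B": (0.85, 0.15),
    "guide_C": (0.80, 0.40),
}

def specificity(p_on, p_off, n_runs, rng):
    """Share of simulated cleavage events that happened at the on-target site."""
    on = sum(rng.random() < p_on for _ in range(n_runs))
    off = sum(rng.random() < p_off for _ in range(n_runs))
    return on / (on + off) if on + off else 0.0

def ranking(n_runs, seed):
    """Rank the guides from most to least specific for one random seed."""
    rng = random.Random(seed)
    scores = {name: specificity(p_on, p_off, n_runs, rng)
              for name, (p_on, p_off) in GUIDES.items()}
    return tuple(sorted(scores, key=scores.get, reverse=True))

def ranking_stability(n_runs, n_repeats=50):
    """Return the most common ranking and the fraction of repeats that gave it."""
    counts = Counter(ranking(n_runs, seed) for seed in range(n_repeats))
    top_ranking, top_count = counts.most_common(1)[0]
    return top_ranking, top_count / n_repeats

for n_runs in (5, 50, 2000):
    top, share = ranking_stability(n_runs)
    print(f"{n_runs:>4} runs per guide: {top} in {share:.0%} of repeats")
```

A ranking that appears in nearly every repeat is one you can defend to a judge. A ranking that keeps flipping tells you to run more repeats or to report the guides as tied.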

Project Variations

Compare guide RNAs with different GC content to see whether predicted binding stability changes off-target risk.
Add a chromatin accessibility factor and test whether closed or open chromatin regions change the search time ranking.
Build a deterministic rate-equation baseline alongside the Gillespie model and compare the conditions under which each approach breaks down.
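For the last variation, a deterministic baseline can be as small as one rate equation. The sketch below assumes a handful of bound Cas9 complexes that each cut independently at rate k_cut, then compares the rate-equation answer with repeated Gillespie runs of the same process. The scenario, function names, and numbers are illustrative assumptions only.

```python
import math
import random
import statistics

def deterministic_uncut(n0, k_cut, t):
    """Rate-equation solution of dn/dt = -k_cut * n at time t."""
    return n0 * math.exp(-k_cut * t)

def stochastic_uncut(n0, k_cut, t, rng):
    """One Gillespie run of the same decay; returns uncut sites left at time t."""
    n, clock = n0, 0.0
    while n > 0:
        # With n uncut sites, the total cleavage rate is k_cut * n.
        clock += rng.expovariate(k_cut * n)
        if clock > t:
            break
        n -= 1
    return n

N0, K_CUT, T_END = 5, 0.1, 10.0  # illustrative values
rng = random.Random(3)
samples = [stochastic_uncut(N0, K_CUT, T_END, rng) for _ in range(2000)]

det = deterministic_uncut(N0, K_CUT, T_END)
print(f"deterministic: {det:.2f} uncut sites")
print(f"stochastic mean: {statistics.mean(samples):.2f}, "
      f"standard deviation: {statistics.stdev(samples):.2f}")
```

The deterministic model reproduces the average but says nothing about the run-to-run spread, and it predicts fractional sites that cannot exist. With only a few molecules, that spread is a large share of the answer, which is the case where the stochastic model earns its cost.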

Learn More

PubMed: Search for review articles on CRISPR-Cas9 off-target binding kinetics and stochastic modeling.
NCBI Bookshelf: Look for free background chapters on CRISPR biology and gene editing methods.
NIH CRISPR resources: Find plain-language explanations and links to current gene editing research.
MIT OpenCourseWare: Use free systems biology or computational biology lectures to review stochastic simulation ideas.
Nature Reviews Genetics: Search for review articles on CRISPR specificity and off-target effects through a school library or public abstract access.

