Predicting G-Quadruplex Stability in Oncogene Promoters

ISEF Category: Biochemistry

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Structural Biochemistry  ·  Difficulty: Advanced  ·  Setup: University Lab  ·  Time: Full Year

The Hook

Some cancer-linked genes hide folded DNA structures right in their promoters. Those folds, called G-quadruplexes, can act like tiny roadblocks or switches for gene activity. You can scan the c-MYC, KRAS, and BCL-2 promoters for those patterns, then test which ones look most stable in molecular dynamics simulations.

What Is It?

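A promoter is the stretch of DNA near a gene that helps control when the gene turns on. A G-quadruplex is a four-stranded DNA fold that forms in guanine-rich (G-rich) sequences: sets of four guanines pair into flat squares called G-tetrads, and the tetrads stack on top of each other. In a promoter, a single G-rich strand usually folds back on itself to build this structure. Think of normal DNA as a zipper and a G-quadruplex as a knot the zipper can make when one side has the right pattern.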

In oncogene promoters, these knots can matter because they may change how easily proteins read the DNA. QGRS Mapper gives you a first pass at where the folds could form. Molecular dynamics, or MD, then simulates how the structure moves over time, so you can see whether a ligand, a small molecule that binds DNA, seems to steady the fold or let it wobble.
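As a rough illustration of what a motif scan looks for, the sketch below searches a sequence for a commonly used folding-rule pattern: four runs of three or more guanines separated by loops of one to seven bases. It is not QGRS Mapper's scoring algorithm. The pattern, the function names, and the toy sequence are assumptions made for this example.

```python
import re

# Assumed folding-rule pattern: four G-runs (3+ guanines each) separated by
# loops of 1-7 bases. This is an illustration, not QGRS Mapper's G-score.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_g4_motifs(seq):
    """List (strand, start, motif) for non-overlapping matches on both strands.

    start is the 0-based position of the leftmost base on the given strand.
    For "-" hits, motif is the G-rich sequence read 5'->3' on the other strand.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in G4_PATTERN.finditer(s):
            start = m.start() if strand == "+" else len(seq) - m.end()
            hits.append((strand, start, m.group()))
    return hits


toy = "AATGGGAGGGTTGGGAGGGCA"  # made-up sequence, not a real promoter
for strand, start, motif in find_g4_motifs(toy):
    print(strand, start, motif)
```

Scanning both strands matters because a C-rich run on the strand you downloaded means the G-rich run sits on the opposite strand.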

Why This Is a Good Topic

This topic is testable because you can compare real DNA sequences, score them with one rule, and then ask whether the same rank order holds in simulation. It connects to cancer biology because c-MYC, KRAS, and BCL-2 are all well-known genes tied to cell growth and survival. You can learn sequence analysis, structure reading, and basic simulation design without needing a wet lab.

Research Questions

  • Which promoter, c-MYC, KRAS, or BCL-2, has the highest G-quadruplex score in the same upstream window?
  • How does window length around each transcription start site change the number of predicted G-quadruplex motifs?
  • What is the effect of loop length on the QGRS Mapper score for each promoter?
  • Does ligand binding in MD simulations reduce structural fluctuation more in one promoter than in the others?
  • To what extent do predicted G-quadruplex sites overlap with known regulatory regions in each promoter?
  • What is the effect of using different ligand classes on the persistence of the simulated quadruplex fold?
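The window-length question can be explored with a small sweep like the one sketched here. It counts matches to an assumed folding-rule pattern in windows of increasing size around a transcription start site and also reports motifs per kilobase, so a longer window does not look stronger just because it contains more DNA. The pattern, the function name, the TSS index, and the sequence are all hypothetical, and only the given strand is scanned to keep the example short.

```python
import re

# Assumed pattern (four G-runs, loops of 1-7 bases); not QGRS Mapper's score.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def motif_counts_by_window(seq, tss_index, half_widths):
    """Return {half_width: (count, motifs_per_kb)} for windows centred on the TSS."""
    seq = seq.upper()
    results = {}
    for w in half_widths:
        lo = max(0, tss_index - w)          # clip the window at the sequence ends
        hi = min(len(seq), tss_index + w)
        window = seq[lo:hi]
        count = sum(1 for _ in G4_PATTERN.finditer(window))
        density = 1000 * count / len(window) if window else 0.0
        results[w] = (count, density)
    return results


# Made-up sequence with one motif placed near a hypothetical TSS at index 57.
toy = "A" * 50 + "GGGAGGGTGGGAGGG" + "A" * 50
for half_width, (count, per_kb) in motif_counts_by_window(toy, 57, [5, 20, 50]).items():
    print(half_width, count, round(per_kb, 1))
```

Reporting both the raw count and the density makes it easier to see whether a change comes from the sequence or only from the window size.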

Basic Materials

  • Laptop or desktop computer with internet access.
  • Free web browser.
  • QGRS Mapper access.
  • NCBI Gene or GenBank access.
  • PubMed access for background papers.
  • Google Sheets or Excel for scoring and charts.
  • Text editor for sequence notes.
  • Optional Python install for plots.

Advanced Materials

  • GPU-capable workstation or university cluster access.
  • GROMACS, AMBER, or NAMD for MD runs.
  • UCSF ChimeraX or PyMOL for structure viewing.
  • Open Babel for file conversion and ligand preparation.
  • MDAnalysis or VMD for trajectory analysis.
  • PubChem structure files for candidate ligands.
  • Reference PDB structures for G-quadruplex templates.

Software & Tools

  • QGRS Mapper: Finds candidate G-quadruplex motifs in promoter sequences.
  • NCBI Gene: Provides gene context, genomic coordinates, and links to the sequence records that contain the upstream region of c-MYC, KRAS, and BCL-2.
  • PubChem: Supplies ligand structures and basic property data.
  • GROMACS: Runs molecular dynamics simulations of quadruplex-ligand systems.
  • Python: Cleans output files and plots stability metrics.
  • UCSF ChimeraX: Lets you inspect structures and compare folded states.

Experiment Steps

  1. Define the promoter window and scoring rule you will use for every gene.
  2. Gather the same sequence span for c-MYC, KRAS, and BCL-2 so the comparison stays fair.
  3. Rank candidate G-quadruplex motifs with QGRS Mapper and choose the structures you will carry into simulation.
  4. Choose one ligand set and one unbound control, then decide how you will standardize the starting structures.
  5. Pick the stability outputs you will compare, such as root-mean-square deviation, hydrogen-bond persistence, and stacking contacts.
  6. Plan a statistic that compares runs across genes and ligands, not just one trajectory at a time.
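To make the stability outputs in step 5 concrete, here is a minimal sketch of root-mean-square deviation (RMSD) computed per frame and then summarized per trajectory. It assumes every frame has already been superimposed on the reference structure, which trajectory tools normally handle during fitting, and it uses hypothetical coordinates. In practice a package such as MDAnalysis or the GROMACS analysis tools would do this step; the function names here are illustrative only.

```python
import math
from statistics import mean, stdev


def rmsd(frame, reference):
    """RMSD between two equally long lists of (x, y, z) atom coordinates.

    Assumes the frame is already aligned to the reference (no fitting here).
    """
    if len(frame) != len(reference):
        raise ValueError("frame and reference must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def summarize_trajectory(frames, reference):
    """Mean, sample standard deviation, and maximum RMSD over at least two frames."""
    values = [rmsd(frame, reference) for frame in frames]
    return {"mean": mean(values), "sd": stdev(values), "max": max(values)}


# Hypothetical two-atom system; units follow the input coordinates.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = [
    [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)],
    [(0.0, 0.2, 0.0), (1.0, 0.2, 0.0)],
]
print(summarize_trajectory(frames, reference))
```

One summary value per trajectory, such as the mean RMSD over the last part of each run, is what feeds the cross-run comparison in step 6.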

Common Pitfalls

  • Mixing promoter window lengths, which makes one gene look stronger just because you scanned more DNA.
  • Treating a high QGRS score as proof of folding, which skips the check that simulation or literature validation should provide.
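  • Starting MD from template structures with different topologies (for example, a parallel fold for one gene and a hybrid fold for another), which makes stability differences reflect the starting model rather than the sequence.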
  • Comparing runs with different force fields or solvent choices, which confuses setup differences with real biology.
  • Reading a single trajectory as a final answer, which hides run-to-run variation and chance unfolding events.
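One way to avoid the single-trajectory pitfall is to run several replicates per condition and compare them with a test that makes few assumptions. The sketch below is an exact two-sided permutation test on the difference in group means. The replicate values are hypothetical mean RMSD numbers, not results, and the function name is an assumption for this example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical per-replicate mean RMSD values (same units in both groups).
unbound = [2.0, 2.1, 2.2]
with_ligand = [1.0, 1.1, 1.2]
print(permutation_p_value(unbound, with_ligand))
```

With three replicates per group there are only 20 possible relabellings, so the smallest two-sided p-value this test can return is 2/20 = 0.1. Four replicates per group give 70 relabellings and a floor of about 0.03, which is one practical argument for planning more than three runs per condition.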

What Makes This Competitive

A strong version of this project does more than rank sequences. It compares multiple promoters under one scoring rule, then checks whether MD keeps the same order or changes it. You can raise the level again by testing more than one ligand class and by using statistics instead of eyeballing a single trajectory. That gives your project a clear analytical story, not just a screenshot of a folded DNA model.

Project Variations

  • Compare the wild-type promoter sequences with known point mutants that weaken G-rich runs.
  • Test a natural product ligand and a drug-like ligand on the same quadruplex to see whether they stabilize the fold in different ways.
  • Add a non-oncogene G-rich promoter as a control to see whether the same scoring rule still picks out strong candidates.

Learn More

  • PubMed: Search for review articles on promoter G-quadruplexes, oncogene regulation, and ligand binding.
  • NCBI Gene and GenBank: Look up gene coordinates and download the sequence records that contain the upstream region of c-MYC, KRAS, and BCL-2.
  • Protein Data Bank: Look up solved G-quadruplex structures and ligand-bound complexes.
  • PubChem: Find ligand structures, identifiers, and basic chemistry data for simulation inputs.
  • QGRS Mapper: Use the free web tool to scan G-rich DNA for candidate quadruplex motifs.
