Long COVID Gene Network Drug Repurposing

ISEF Category: Biomedical and Health Sciences

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Genetics and Molecular Biology of Disease · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months

The Hook

Long COVID can linger for months, and doctors still do not have a single proven drug that treats it. Your project can search public gene data the way a detective matches clues across cases. Instead of guessing at a treatment, you can test which existing drugs act on the same disturbed gene network. That turns a hard health problem into a data question you can actually attack.

What Is It?

This project asks a simple question with a big data twist: which existing drugs seem most likely to reverse the gene activity seen in long COVID? You start with gene-expression data, which is a snapshot of which genes are turned up or down in a cell or tissue sample. Think of it like a music mixer, where some tracks are louder and some are quieter than normal.
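
To make "turned up or down" concrete, here is a minimal Python sketch of a log2 fold change, one common way to summarize how a gene's activity differs between patient and control samples. The gene names, the expression values, and the pseudocount of 1 are made-up assumptions for illustration, not real long-COVID data.

```python
import math


def log2_fold_change(case_mean, control_mean, pseudocount=1.0):
    """Return log2 of the case/control ratio, with a pseudocount to avoid dividing by zero."""
    return math.log2((case_mean + pseudocount) / (control_mean + pseudocount))


def direction(lfc):
    """Label a gene as up, down, or unchanged based on the sign of its log2 fold change."""
    if lfc > 0:
        return "up"
    if lfc < 0:
        return "down"
    return "unchanged"


# Hypothetical mean expression values (placeholders, not measured data).
expression = {
    "GENE_A": {"case": 31.0, "control": 7.0},   # louder track in patients
    "GENE_B": {"case": 3.0, "control": 15.0},   # quieter track in patients
    "GENE_C": {"case": 9.0, "control": 9.0},    # same volume in both groups
}

for gene, means in expression.items():
    lfc = log2_fold_change(means["case"], means["control"])
    print(f"{gene}: log2FC = {lfc:+.2f} ({direction(lfc)})")
```

A log2 fold change of +2 means the gene is about four times more active in patients, and -2 means about four times less active. Real pipelines such as DESeq2 or limma add normalization and statistics on top of this basic idea.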

OpenTargets helps you connect diseases, genes, and drug targets. STRING shows how proteins work together in a network. Your idea is to overlap the long-COVID gene pattern with chronic fatigue syndrome transcriptomes, which are full gene-expression profiles from people with that condition, and then see which drug targets sit near the strongest shared signals. If a drug points at the same disturbed network, it may be a better repurposing candidate than a drug picked by guesswork alone.
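
One way to picture "drug targets near the shared signals" is to grow the shared gene list by one network step and see which drugs now touch it. The sketch below assumes you have already exported a list of protein-protein edges and a drug-to-target table to plain Python structures. Every gene, edge, and drug name is a placeholder, and the code does not call the STRING or OpenTargets services.

```python
# Hypothetical inputs: every gene, edge, and drug name below is a placeholder.
shared_genes = {"G1", "G3"}  # genes changed in both conditions
edges = [("G1", "G2"), ("G3", "G4"), ("G4", "G5"), ("G6", "G7")]  # protein-protein links
drug_targets = {
    "drug_X": {"G2"},
    "drug_Y": {"G5"},
    "drug_Z": {"G1", "G7"},
}


def expand_with_neighbors(seed_genes, edges):
    """Return the seed genes plus every gene one edge away from a seed."""
    expanded = set(seed_genes)
    for a, b in edges:
        if a in seed_genes:
            expanded.add(b)
        if b in seed_genes:
            expanded.add(a)
    return expanded


def drugs_hitting(gene_set, drug_targets):
    """Return the sorted names of drugs with at least one target in gene_set."""
    return sorted(d for d, targets in drug_targets.items() if targets & gene_set)


neighborhood = expand_with_neighbors(shared_genes, edges)
direct_hits = drugs_hitting(shared_genes, drug_targets)
nearby_hits = drugs_hitting(neighborhood, drug_targets)
print("Direct hits:", direct_hits)
print("Hits within one network step:", nearby_hits)
```

In this toy example, drug_X only shows up after the one-step expansion, while drug_Y stays out because its target sits two steps away. Whether one step is the right distance is a choice you should state and test.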

Why This Is a Good Topic

This is a strong science fair topic because you can test it with public data, clear rules, and repeatable analysis. You are not just listing genes. You are comparing datasets, scoring overlap, and checking whether the same drug targets keep showing up across sources. That gives you a real biomedical question tied to a real patient problem, while teaching you transcriptomics, network analysis, and basic statistical thinking.

Research Questions

How does the choice of long-COVID transcriptome dataset change the ranked overlap with chronic fatigue syndrome signatures?
What is the effect of using different differential-expression cutoffs on the drug candidates that appear at the top of the list?
Does adding STRING network neighbors improve agreement between independent long-COVID datasets?
To what extent do repurposing candidates overlap between blood-based and tissue-based long-COVID datasets?
Which chronic fatigue syndrome transcriptome subtype shares the strongest signature with long COVID?
How does pathway-level scoring compare with single-gene overlap for ranking repurposing candidates?
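
Several of these questions come down to how a cutoff changes the overlap between two gene lists. The sketch below shows one possible way to measure that with a Jaccard index and a same-direction check. The two tables of (log2 fold change, adjusted p-value) pairs are invented placeholders, and the cutoff values are assumptions you would tune for your own data.

```python
def select_genes(table, lfc_cutoff, padj_cutoff=0.05):
    """Return {gene: direction} for genes that pass both cutoffs."""
    selected = {}
    for gene, (lfc, padj) in table.items():
        if abs(lfc) >= lfc_cutoff and padj <= padj_cutoff:
            selected[gene] = "up" if lfc > 0 else "down"
    return selected


def jaccard(set_a, set_b):
    """Return shared items divided by all items, or 0.0 if both sets are empty."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical results: gene -> (log2 fold change, adjusted p-value).
long_covid = {"G1": (2.1, 0.001), "G2": (1.2, 0.01), "G3": (-1.6, 0.02),
              "G4": (0.6, 0.04), "G5": (-0.4, 0.30)}
me_cfs = {"G1": (1.8, 0.002), "G2": (0.7, 0.03), "G3": (-1.1, 0.01),
          "G4": (-0.9, 0.04), "G6": (1.4, 0.01)}

results = {}
for cutoff in (0.5, 1.0, 1.5):
    lc = select_genes(long_covid, cutoff)
    cfs = select_genes(me_cfs, cutoff)
    # Keep only genes that pass in both datasets and move the same way.
    same_direction = sorted(g for g in lc.keys() & cfs.keys() if lc[g] == cfs[g])
    results[cutoff] = (same_direction, jaccard(set(lc), set(cfs)))
    print(f"cutoff {cutoff}: shared same-direction genes = {same_direction}, "
          f"Jaccard = {results[cutoff][1]:.2f}")
```

Notice that the placeholder gene G4 passes the loosest cutoff in both tables but moves in opposite directions, so it is excluded from the same-direction list. Tracking that kind of detail is what separates a careful overlap from a raw one.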

Basic Materials

Laptop or desktop computer with internet access.
Spreadsheet software such as Google Sheets or Excel.
Python with pandas, scipy, matplotlib, and networkx.
R with Bioconductor packages for gene-expression checks.
Web access to OpenTargets, STRING, PubMed, and NCBI GEO.
A note-taking system for tracking dataset IDs, gene lists, and scoring rules.

Advanced Materials

Access to a university workstation or cloud compute account.
RStudio or Jupyter notebooks for reproducible analysis.
DESeq2, edgeR, or limma for transcriptome analysis.
Cytoscape for network visualization and hub analysis.
g:Profiler or Enrichr for pathway enrichment.
Git for version control and analysis tracking.

Software & Tools

Python: Cleans gene lists, runs overlap tests, and plots ranked results.
R: Handles differential-expression checks and enrichment analysis.
OpenTargets Platform: Links disease evidence to drug targets and known gene associations.
STRING: Maps protein interactions and highlights network neighbors.
Cytoscape: Visualizes the gene network and helps you compare target clusters.

Experiment Steps

Define the exact long-COVID and chronic fatigue syndrome datasets you will compare, and decide whether you are testing blood, tissue, or mixed samples.
Build a clean gene list from each dataset, then choose one ranking rule, such as log2 fold change with an adjusted p-value cutoff, for genes that move up or down.
Map the shared genes onto STRING and OpenTargets to see whether network neighbors and known target evidence change the hit list.
Create a scoring scheme for repurposing candidates, then compare it with a simple overlap baseline.
Validate the top candidates against an independent dataset or a pathway-level readout, and record where the signal weakens.
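
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a finished pipeline: the gene lists, the universe size of 100, the drug names, and the "inhibitor" and "activator" labels are all hypothetical placeholders. The overlap test is a standard hypergeometric tail probability, and the reversal score is one simple scoring idea among many.

```python
from math import comb


def hypergeom_pvalue(overlap, size_a, size_b, universe):
    """Chance of seeing at least `overlap` shared genes between two random lists."""
    total = comb(universe, size_b)
    tail = sum(comb(size_a, k) * comb(universe - size_a, size_b - k)
               for k in range(overlap, min(size_a, size_b) + 1))
    return tail / total


# Hypothetical gene lists with directions (placeholders, not real results).
long_covid_genes = {"G1": "up", "G2": "up", "G3": "down", "G4": "up"}
me_cfs_genes = {"G1": "up", "G3": "down", "G4": "down", "G7": "up"}
universe_size = 100  # assumed number of genes measured in both datasets

# Overlap test ignores direction; the shared signature keeps only same-direction genes.
overlap = long_covid_genes.keys() & me_cfs_genes.keys()
p_overlap = hypergeom_pvalue(len(overlap), len(long_covid_genes),
                             len(me_cfs_genes), universe_size)
shared = {g: d for g, d in long_covid_genes.items() if me_cfs_genes.get(g) == d}

# Hypothetical drug -> {target gene: action on that target}.
drug_targets = {
    "drug_X": {"G1": "inhibitor", "G3": "activator"},
    "drug_Y": {"G1": "activator", "G3": "inhibitor"},
    "drug_Z": {"G3": "activator", "G9": "inhibitor"},
}


def baseline_score(targets, signature):
    """Simple overlap baseline: count targets that appear in the shared signature."""
    return sum(1 for gene in targets if gene in signature)


def reversal_score(targets, signature):
    """Add 1 when the drug action opposes the gene's direction, subtract 1 otherwise."""
    score = 0
    for gene, action in targets.items():
        gene_direction = signature.get(gene)
        if gene_direction is None:
            continue
        reverses = ((gene_direction == "up" and action == "inhibitor")
                    or (gene_direction == "down" and action == "activator"))
        score += 1 if reverses else -1
    return score


ranked = sorted(drug_targets, key=lambda d: (-reversal_score(drug_targets[d], shared), d))
print(f"Overlap p-value: {p_overlap:.6f}")
for drug in ranked:
    print(drug, "baseline =", baseline_score(drug_targets[drug], shared),
          "reversal =", reversal_score(drug_targets[drug], shared))
```

In this toy data, drug_X and drug_Y tie on the overlap baseline, but only drug_X pushes both genes back toward normal, so the direction-aware score separates them. Comparing the two rankings on your real data is the kind of baseline check the steps call for.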

Common Pitfalls

Mixing gene symbols, Ensembl gene IDs, and protein IDs, which creates false mismatches between OpenTargets, STRING, and transcriptome files.
Comparing datasets with different tissues or platforms without normalization or batch correction, which makes the overlap look stronger or weaker than it really is.
Treating every shared gene as a drug hint, which ignores whether the gene points in the right direction for repurposing.
Using one dataset as proof, which can make a noisy signature look real even when it does not repeat.
Ignoring multiple-testing control, which can flood the final list with false positives from pathway or enrichment scans.
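
For the multiple-testing pitfall, one widely used fix is the Benjamini-Hochberg procedure, which adjusts p-values to control the false discovery rate. The sketch below is a plain-Python version for learning purposes. The pathway names and p-values are invented, and in a real analysis you would more likely rely on a tested library routine.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so adjusted values never
    # increase as the raw p-values get smaller.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment p-values for five placeholder pathways.
pathways = ["pathway_A", "pathway_B", "pathway_C", "pathway_D", "pathway_E"]
raw = [0.001, 0.012, 0.035, 0.045, 0.20]
adj = benjamini_hochberg(raw)

raw_hits = [p for p, value in zip(pathways, raw) if value < 0.05]
adj_hits = [p for p, value in zip(pathways, adj) if value < 0.05]
print("Significant before correction:", raw_hits)
print("Significant after correction:", adj_hits)
```

With these placeholder numbers, four pathways look significant before correction and only two survive afterward, which shows how easily an uncorrected scan can overstate your findings.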

What Makes This Competitive

A class-level version of this project stops at one dataset and one overlap list. A stronger version checks multiple public datasets, uses a clear null model, and shows that the same drug targets rise to the top again and again. You can make it more competitive by testing whether network distance, pathway score, or target direction gives the best prediction. That kind of careful comparison shows real judgment, not just data collection.
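
A "clear null model" can be as simple as asking how often a random gene list of the same size would overlap a drug's targets as well as your real list does. The sketch below shows one such permutation test using only the standard library. The 200-gene universe, the gene names, the target sets, and the choice of 2000 permutations are all assumptions made for illustration.

```python
import random


def permutation_pvalue(signature, target_set, universe, n_perm=2000, seed=0):
    """Compare the observed overlap with overlaps from random same-size gene lists."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = len(signature & target_set)
    pool = sorted(universe)
    hits = 0
    for _ in range(n_perm):
        random_list = set(rng.sample(pool, len(signature)))
        if len(random_list & target_set) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the estimate from ever being exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 200 placeholder genes, a 10-gene shared signature,
# and two made-up sets of drug targets.
universe = [f"G{i}" for i in range(200)]
shared_signature = set(universe[:10])
drug_targets = {"G0", "G1", "G2", "G3", "G4", "G150", "G151", "G152"}
unrelated_targets = {"G190", "G191", "G192"}

obs_hit, p_hit = permutation_pvalue(shared_signature, drug_targets, universe)
obs_miss, p_miss = permutation_pvalue(shared_signature, unrelated_targets, universe)
print(f"Candidate drug: overlap = {obs_hit}, permutation p = {p_hit:.4f}")
print(f"Unrelated drug: overlap = {obs_miss}, permutation p = {p_miss:.4f}")
```

A drug whose targets overlap your signature far more often than random lists do earns a small p-value, while a drug with no overlap gets a p-value of 1. A stricter null model would also match random genes on properties such as network degree, which is a reasonable extension to discuss.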

Project Variations

Compare both blood-based and tissue-based long-COVID signatures with ME/CFS (myalgic encephalomyelitis/chronic fatigue syndrome) transcriptomes instead of using only one tissue source.
Replace STRING with a pathway enrichment approach and see whether the repurposing shortlist changes.
Test whether hub genes from the network point to different drug classes than the full gene-overlap method.
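
For the hub-gene variation, a simple starting definition of a hub is a gene with many direct network partners, also called high degree. The sketch below counts partners from a plain edge list. The edges and gene names are placeholders, and degree is only one of several centrality measures you could try in Cytoscape or networkx.

```python
from collections import defaultdict

# Hypothetical protein-protein edges (placeholders, not real STRING output).
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
         ("G2", "G3"), ("G5", "G1"), ("G6", "G7")]


def degree_table(edges):
    """Return {gene: number of distinct direct partners}."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


def top_hubs(edges, k=2):
    """Return the k highest-degree genes, breaking ties alphabetically."""
    degrees = degree_table(edges)
    return sorted(degrees, key=lambda gene: (-degrees[gene], gene))[:k]


degrees = degree_table(edges)
hubs = top_hubs(edges, k=2)
print("Degrees:", dict(sorted(degrees.items())))
print("Top hubs:", hubs)
```

Once you have a hub list, you can run your drug scoring on the hubs alone and on the full overlap list, then compare which drug classes each version favors.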

Learn More

OpenTargets Platform: Search gene-disease evidence and drug target links on the OpenTargets website.
STRING: Explore protein-protein interaction networks and network neighbors on the STRING website.
NCBI GEO: Find public long-COVID and chronic fatigue syndrome transcriptome datasets in the Gene Expression Omnibus.
PubMed: Search review articles on long COVID, ME/CFS, transcriptomics, and drug repurposing.
Enrichr: Run pathway enrichment on shared gene lists at the Enrichr website.

Biomedical and Health Sciences Category Guide

How to Do Real Biomedical and Health Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
