Polypharmacology Mapping for Diabetic Nephropathy

ISEF Category: Translational Medical Science

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship.

Subcategory: Drug Identification and Testing · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

One drug can hit more than one target. That can help, or it can cause side effects. In diabetic nephropathy, a disease that damages the kidneys, multi-target drugs may work better than single-target drugs. You can use public drug and gene databases to search for those candidates.

What Is It?

Polypharmacology means one molecule affects more than one biological target. Think of it like one key fitting several locks, not just one. That can be useful when a disease comes from a tangled network of genes and proteins, like diabetic nephropathy.

In this project, you would use public databases such as DrugBank, STITCH, and OpenTargets to build a map of drug-target and gene-disease links. Then you would apply matrix factorization, a math method that finds hidden patterns in a big table, to predict which small molecules may hit several nodes in the diabetic-nephropathy network. You would also score off-target liability, which means estimating whether a drug may bind places you do not want it to hit.
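To make the matrix-factorization idea concrete, here is a minimal sketch using a truncated SVD, which is only one of several factorization methods you could choose. The drug names, target names, and the tiny link table are hypothetical placeholders, not real DrugBank or STITCH records.

```python
import numpy as np

# Hypothetical toy data: rows are drugs, columns are protein targets.
# 1 = a known drug-target link, 0 = no link recorded (not proof of no binding).
drugs = ["drug_A", "drug_B", "drug_C", "drug_D"]
targets = ["T1", "T2", "T3", "T4", "T5"]
known = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)


def low_rank_scores(matrix, rank):
    """Rebuild the matrix from its top `rank` singular vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]


# A small rank forces the model to summarize the table with a few hidden patterns.
scores = low_rank_scores(known, rank=2)

# Keep only pairs that are not already known links, then sort by score.
candidates = [
    (drugs[i], targets[j], scores[i, j])
    for i in range(known.shape[0])
    for j in range(known.shape[1])
    if known[i, j] == 0
]
candidates.sort(key=lambda row: row[2], reverse=True)

for drug, target, score in candidates[:3]:
    print(f"{drug} -> {target}: {score:.2f}")
```

The rank is a tuning choice: too low and every drug looks the same, too high and the model simply memorizes the known table. In a real project you would pick it using held-out links rather than by eye.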

Why This Is a Good Topic

This topic works well because it gives you a real biomedical problem, public data, and a clear computational outcome. You can test whether your model recovers known drug-target links and whether it ranks multi-target candidates better than simple one-gene approaches. The project connects to kidney disease drug discovery, but you can still do the work with open datasets, careful coding, and solid evaluation. You can learn data cleaning, network analysis, and model validation in one project.

Research Questions

How does matrix-factorization ranking compare with simple target-count ranking for finding candidate drugs in the diabetic-nephropathy network?
What is the effect of adding STITCH interaction data on the number of predicted drugs that hit at least 3 disease nodes?
Does filtering out low-confidence drug-target pairs change the top-ranked multi-target candidates?
To what extent do known diabetic-nephropathy drugs appear near the top of the model's ranked list?
Which network features best predict off-target liability for candidate small molecules?
How does the prediction set change when you train on different versions of the gene-disease network?
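The simple target-count baseline and the "at least 3 disease nodes" threshold from the questions above can be sketched in a few lines. The gene and drug names here are hypothetical; in a real project the sets would come from the tables you assemble.

```python
# Hypothetical disease-gene set and drug-target links for illustration only.
disease_genes = {"G1", "G2", "G3", "G4", "G5"}
drug_targets = {
    "drug_A": {"G1", "G2", "G9"},
    "drug_B": {"G1", "G2", "G3", "G4"},
    "drug_C": {"G7", "G8"},
    "drug_D": {"G3", "G4", "G5", "G6"},
}


def disease_hits(targets, disease_genes):
    """Count how many of a drug's targets are in the disease network."""
    return len(targets & disease_genes)


hit_counts = {drug: disease_hits(t, disease_genes) for drug, t in drug_targets.items()}

# Baseline ranking: most disease hits first, ties broken alphabetically.
baseline_ranking = sorted(hit_counts, key=lambda drug: (-hit_counts[drug], drug))

# Drugs that hit at least 3 disease nodes.
multi_target = [drug for drug in baseline_ranking if hit_counts[drug] >= 3]

print(baseline_ranking)
print(multi_target)
```

Because this baseline uses only known links, it cannot propose anything new. That is what makes it a useful comparison point for a predictive model.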

Basic Materials

Computer with internet access.
Python installed with Jupyter Notebook.
Public datasets from DrugBank, STITCH, and OpenTargets.
Spreadsheet software for tracking candidate drugs and scores.
Reference manager or notes app for logging sources.
Basic graphing tool for visualizing rankings and evaluation metrics.

Advanced Materials

High-performance computer access for repeated model training.
Python scientific stack, including pandas, NumPy, SciPy, scikit-learn, and networkx.
Jupyter Notebook or JupyterLab.
Version control such as Git for tracking code and dataset snapshots.
Network visualization software such as Cytoscape.
Optional access to a small GPU or cloud notebook for faster experiments.

Software & Tools

Python: Runs data cleaning, matrix factorization, and evaluation scripts for your predictions.
Jupyter Notebook: Lets you document code, plots, and notes in one place.
pandas: Organizes drug, target, and disease tables for analysis.
scikit-learn: Supports model splitting, scoring, and baseline comparisons.
Cytoscape: Visualizes the gene and drug network so you can inspect patterns.
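As one illustration of the pandas role listed above, the sketch below joins a link table to an ID mapping table and applies a confidence cutoff. The column names, the IDs, and the 0 to 1000 score scale with a cutoff of 700 are assumptions for the example, so check the actual file formats before reusing any of it.

```python
import pandas as pd

# Hypothetical drug-protein links with an evidence score per row.
links = pd.DataFrame({
    "drug_name": ["drug_A", "drug_A", "drug_B", "drug_C"],
    "protein_id": ["P001", "P002", "P001", "P999"],
    "confidence": [900, 350, 720, 810],
})

# Hypothetical mapping table from protein IDs to gene symbols.
id_map = pd.DataFrame({
    "protein_id": ["P001", "P002", "P003"],
    "gene_symbol": ["GENE1", "GENE2", "GENE3"],
})

# A left join keeps every link; the indicator column flags rows with no mapping.
merged = links.merge(id_map, on="protein_id", how="left", indicator=True)
unmapped = merged[merged["_merge"] == "left_only"]

# Keep only mapped rows that pass the assumed confidence cutoff.
clean = merged[(merged["_merge"] == "both") & (merged["confidence"] >= 700)]
clean = clean.drop(columns="_merge").reset_index(drop=True)

print(f"unmapped rows: {len(unmapped)}")
print(clean)
```

Logging the unmapped rows instead of dropping them silently lets you report how much data was lost at each cleaning step.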

Experiment Steps

Define the disease network you will study and decide how you will label nodes and edges.
Assemble a clean table that combines drug-target, chemical-protein interaction, and gene-disease links from public databases.
Choose one matrix-factorization approach and one simple baseline so you can compare them fairly.
Plan how you will score prediction quality, including known links, ranking metrics, and off-target risk signals.
Test how your results change when you remove noisy data or lower-confidence interactions.
Build a final ranking that balances predicted disease coverage with liability penalties.
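The last step above, balancing disease coverage against liability, could look something like this minimal sketch. The scores are made-up numbers, and the linear penalty with a weight of 0.5 is an assumption rather than an established formula, so treat the weight as something to vary and report.

```python
import numpy as np

# Hypothetical per-drug scores, each scaled to the range 0 to 1.
drug_names = ["drug_A", "drug_B", "drug_C", "drug_D"]
coverage = np.array([0.8, 0.6, 0.9, 0.3])   # predicted share of disease nodes hit
liability = np.array([0.1, 0.0, 0.7, 0.2])  # predicted off-target risk


def final_score(coverage, liability, penalty_weight=0.5):
    """Reward disease coverage and subtract a weighted liability penalty."""
    return coverage - penalty_weight * liability


def rank_drugs(names, scores):
    """Return drug names ordered from highest to lowest score."""
    order = np.argsort(-scores)
    return [names[i] for i in order]


penalized = rank_drugs(drug_names, final_score(coverage, liability))
unpenalized = rank_drugs(drug_names, final_score(coverage, liability, penalty_weight=0.0))

print("with penalty:   ", penalized)
print("without penalty:", unpenalized)
```

Comparing the two lists shows which candidates owe their rank to coverage alone, which is a useful sensitivity check to include in your results.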

Common Pitfalls

Mixing drug names, gene symbols, and protein IDs without a mapping table, which drops or duplicates nodes when you join tables and breaks the network.
Treating every database link as equally reliable, which can flood the model with noise.
Using the same known interactions for training and testing, which inflates performance scores.
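Ignoring class imbalance: known links are a tiny fraction of all possible drug-target pairs, so metrics such as accuracy can look strong even when the model rarely ranks true links near the top.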
Ranking candidates only by predicted disease hits and forgetting off-target liability.
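Two of the pitfalls above, reusing training links for testing and ignoring class imbalance, can be handled by hiding some known links before training and then checking where they land in the ranking. This is a minimal sketch on hypothetical toy data. It uses a truncated SVD as a stand-in model and recall at k as the ranking metric, and both are illustrative choices.

```python
import numpy as np


def split_known_links(matrix, test_fraction=0.2, seed=0):
    """Hide a random share of known links; return the training matrix and hidden pairs."""
    rng = np.random.default_rng(seed)
    positives = np.argwhere(matrix == 1)
    n_test = max(1, int(round(test_fraction * len(positives))))
    picked = rng.choice(len(positives), size=n_test, replace=False)
    test_pairs = positives[picked]
    train = matrix.copy()
    train[test_pairs[:, 0], test_pairs[:, 1]] = 0
    return train, test_pairs


def low_rank_scores(matrix, rank):
    """Rebuild the matrix from its top `rank` singular vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]


def recall_at_k(scores, train, test_pairs, k):
    """Share of hidden links found among the k best-scored non-training pairs."""
    masked = np.where(train == 1, -np.inf, scores)  # never re-rank training links
    top = np.argsort(masked, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(top, scores.shape)
    top_pairs = {(int(i), int(j)) for i, j in zip(rows, cols)}
    hits = sum((int(i), int(j)) in top_pairs for i, j in test_pairs)
    return hits / len(test_pairs)


# Hypothetical drug-by-target table of known links.
known = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

train, test_pairs = split_known_links(known, test_fraction=0.2, seed=0)
model_scores = low_rank_scores(train, rank=2)
recall = recall_at_k(model_scores, train, test_pairs, k=3)

print("held-out links:", len(test_pairs))
print("recall@3:", recall)
```

Repeating the split with several seeds and reporting the spread gives a more honest picture than a single run, especially on small networks.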

What Makes This Competitive

A stronger project will not just rank drugs. It will test whether the ranking is better than a simple baseline and explain why. You can push the work further by comparing several network definitions, adding confidence weights, or checking whether the model recovers known diabetes or kidney drugs. Strong validation and clear error analysis matter more than a flashy algorithm.

Project Variations

Swap diabetic nephropathy for another kidney disease and compare whether the same model finds different multi-target candidates.
Focus on approved drugs only, then test whether repurposing candidates rise to the top when you remove experimental compounds.
Replace matrix factorization with graph embedding methods and compare which approach better predicts known drug-target links.

Learn More

OpenTargets Platform: Search disease-target evidence and target prioritization data for kidney disease and related genes.
DrugBank: Search drug-target interaction summaries and drug mechanism information in the public database sections.
STITCH Database: Find protein-chemical interaction data and evidence scores for small molecules.
NIH PubMed: Search for review articles on polypharmacology, diabetic nephropathy, and network pharmacology.
MIT OpenCourseWare, Introduction to Algorithms and Data Science materials: Use free course notes to review matrix methods, model validation, and network thinking.

Translational Medical Science Category Guide

How to Do Real Translational Medical Science Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases
