DiffDock Drug Docking for Antimalarial Targets
ISEF Category: Computational Biology and Bioinformatics
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Computational Pharmacology · Difficulty: Advanced · Setup: University Lab · Time: Full Year
The Hook
Malaria still sickens hundreds of millions of people worldwide, yet many parasite proteins remain under-studied. That gap creates a chance for you to hunt for drugs in a smarter way. Instead of testing one compound at a time in a wet lab, you can rank thousands of candidates with modeling tools first. Then you can check whether your predictions recover known antimalarial hits.
What Is It?
This project uses computer models to predict how a drug fits into a target protein, a bit like testing which key shape slips best into a lock. DiffDock predicts the pose, which means the 3D way a molecule sits in the binding site. AutoDock Vina then gives each pose a score in kcal/mol that estimates how strongly the drug may bind; more negative scores predict stronger binding.
Your focus is antimalarial targets such as PfATP4 and PfDHODH. PfATP4 helps the parasite control sodium balance, while PfDHODH helps the parasite make pyrimidines, which are building blocks of DNA and RNA. If an FDA-approved drug scores well against one of these targets, that drug may deserve closer study as a repurposing candidate. You can also compare your ranked list with MMV-Pathogen-Box hits, which gives you a built-in reality check.
Why This Is a Good Topic
This is a strong science fair topic because the core question is measurable. You can compare predicted binding poses, docking scores, and hit recovery rates. You also connect to a real problem, malaria drug discovery, where researchers need faster ways to find new candidates. Along the way, you can learn structure-based drug design, ranking methods, and validation with known reference compounds.
Research Questions
- How does DiffDock pose prediction change the ranking of FDA-approved repurposing candidates against PfATP4 compared with AutoDock Vina alone?
- What is the effect of target choice, PfATP4 versus PfDHODH, on the number of known MMV-Pathogen-Box hits recovered in the top-ranked compounds?
- Does rescoring DiffDock poses with AutoDock Vina improve agreement with retrospective antimalarial hit lists?
- To what extent does docking score correlate with known antimalarial activity across FDA-approved compounds?
- Which protein target gives the clearest separation between known hits and likely false positives in a retrospective screen?
- How does the top-hit list change when you compare FDA-approved drugs with the MMV-Pathogen-Box compound set?
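One way to approach the correlation question above is a Spearman rank correlation, which compares the ordering of two lists rather than their raw values. The sketch below uses only the Python standard library. The function names are illustrative, the activity measure (pIC50) is an assumption, and the numbers are made-up placeholders, not real results.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if a list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Placeholder values for five hypothetical compounds (not real data).
docking_scores = [-9.1, -8.4, -7.9, -7.2, -6.5]  # kcal/mol, lower is better
pic50 = [7.8, 7.1, 6.0, 6.4, 5.2]                # higher is more active

rho = spearman(docking_scores, pic50)
print(f"Spearman rho = {rho:.2f}")
```

Because lower docking scores and higher pIC50 values both mean "better", a useful workflow should give a negative correlation here.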
Basic Materials
- Computer with a modern multi-core processor.
- Stable internet connection for downloading protein and ligand files.
- Protein structure files for PfATP4 and PfDHODH from the Protein Data Bank or related repositories.
- FDA-approved drug library in SDF, MOL2, or SMILES format.
- MMV-Pathogen-Box compound list for retrospective validation.
- Python installed with scientific libraries for data handling.
- Molecular visualization software such as PyMOL or UCSF ChimeraX.
- Spreadsheet software for tracking compounds, scores, and rankings.
- Basic reference manager for saving papers and notes.
Advanced Materials
- High-memory workstation or access to a university computing cluster.
- GPU access if your DiffDock workflow supports acceleration.
- Protein preparation software for protonation, cleanup, and file conversion.
- Molecular dynamics tools for optional pose refinement or stability checks.
- Docking workflow scripts for batch processing large ligand sets.
- Benchmark dataset of known antimalarial actives and decoys.
- Version control system such as Git for tracking code changes.
Software & Tools
- DiffDock: Predicts likely ligand poses in protein binding sites.
- AutoDock Vina: Rescores docked poses and estimates binding affinity.
- PyMOL: Helps you inspect binding poses and compare how ligands sit in each target.
- Python: Lets you clean results, calculate rankings, and make plots.
- RDKit: Converts chemical formats and filters ligand libraries for analysis.
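As a small example of the Python cleanup step, AutoDock Vina writes one `REMARK VINA RESULT:` line per pose in its docking output file, with the score in kcal/mol as the first number. The sketch below pulls those scores out of the text. The function name and the sample text are illustrative only, and a score-only rescoring run reports its result differently, so check what your own run produces.

```python
def vina_scores(pdbqt_text):
    """Collect the score from every 'REMARK VINA RESULT:' line, in file order."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, score, RMSD lower bound, RMSD upper bound
            scores.append(float(line.split()[3]))
    return scores

# Illustrative snippet shaped like a Vina output file (atom records omitted).
sample_output = """MODEL 1
REMARK VINA RESULT:      -8.3      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -7.9      1.512      2.204
ENDMDL
"""

scores = vina_scores(sample_output)
best_score = min(scores)  # most negative score = strongest predicted binding
print(scores, best_score)
```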
Experiment Steps
- Define the exact comparison you want to make, such as pose prediction alone versus pose prediction plus rescoring.
- Select a small set of antimalarial targets and collect clean protein structures for each one.
- Prepare a ligand library that includes FDA-approved drugs as repurposing candidates and a retrospective validation set with known MMV-Pathogen-Box hits.
- Build a scoring plan that ranks compounds the same way across every target and every method.
- Set up controls that let you tell real signal from docking bias, such as random compounds or decoy sets.
- Plan how you will measure success, using metrics like hit recovery, rank enrichment, and agreement with known actives.
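The success metrics in the last step can be sketched in a few lines. Hit recovery is the fraction of known hits that land in your top N, and the enrichment factor compares the hit rate in the top N with the hit rate in the whole library. The compound IDs, scores, and hit labels below are hypothetical placeholders, and the function names are illustrative.

```python
def hit_recovery(ranked_ids, known_hits, top_n):
    """Fraction of all known hits that appear in the top_n ranked compounds."""
    return len(set(ranked_ids[:top_n]) & known_hits) / len(known_hits)

def enrichment_factor(ranked_ids, known_hits, top_n):
    """Hit rate in the top_n divided by the hit rate in the full library."""
    top_rate = len(set(ranked_ids[:top_n]) & known_hits) / top_n
    base_rate = len(set(ranked_ids) & known_hits) / len(ranked_ids)
    return top_rate / base_rate

# Hypothetical docking scores (kcal/mol) for a ten-compound toy library.
scores = {
    "cpd_A": -9.3, "cpd_B": -8.8, "cpd_C": -8.1, "cpd_D": -7.9, "cpd_E": -7.4,
    "cpd_F": -7.0, "cpd_G": -6.6, "cpd_H": -6.1, "cpd_I": -5.7, "cpd_J": -5.2,
}
known_hits = {"cpd_A", "cpd_C", "cpd_H"}  # placeholder "known actives"

# Rank from most negative (best) to least negative score.
ranked = sorted(scores, key=scores.get)

recovery = hit_recovery(ranked, known_hits, top_n=3)
ef = enrichment_factor(ranked, known_hits, top_n=3)
print(f"recovery = {recovery:.2f}, enrichment factor = {ef:.2f}")
```

An enrichment factor near 1 means the ranking does no better than picking compounds at random; values well above 1 suggest the workflow is pulling known hits toward the top.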
Common Pitfalls
- Using poor-quality protein structures, which can distort the binding pocket and give misleading poses.
- Mixing file formats or protonation states, which can change docking results without warning.
- Treating the raw docking score as truth, which ignores pose quality and target-specific bias.
- Comparing targets without standardizing the ligand set, which makes one protein look better for the wrong reason.
- Skipping validation against known hits, which leaves you with ranked compounds but no proof that the workflow works.
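Two of these pitfalls, unmatched ligand sets and target-specific score bias, can be reduced with a simple preprocessing step. The sketch below keeps only the compounds scored on every target and converts each target's scores to z-scores so the targets share a scale. This is one possible normalization, not the only valid one, and the names and numbers are placeholders.

```python
from statistics import mean, pstdev

def standardize_across_targets(scores_by_target):
    """Restrict to ligands scored on every target, then z-score within each target."""
    common = sorted(set.intersection(*(set(s) for s in scores_by_target.values())))
    z_scores = {}
    for target, scores in scores_by_target.items():
        values = [scores[c] for c in common]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            raise ValueError(f"All scores identical for {target}; cannot standardize")
        z_scores[target] = {c: (scores[c] - mu) / sigma for c in common}
    return z_scores

# Placeholder scores (kcal/mol); cpd_D was only docked against one target.
raw = {
    "PfATP4": {"cpd_A": -9.0, "cpd_B": -8.0, "cpd_C": -7.0, "cpd_D": -6.0},
    "PfDHODH": {"cpd_A": -6.0, "cpd_B": -7.0, "cpd_C": -8.0},
}

z = standardize_across_targets(raw)
print(z["PfATP4"])
print(z["PfDHODH"])
```

After this step, a z-score of -1 means "one standard deviation better than the average ligand for that target", which makes cross-target comparisons fairer than comparing raw scores.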
What Makes This Competitive
A competitive project would do more than run docking software once. You would build a careful benchmark, compare methods, and test whether the workflow actually recovers known antimalarial hits better than chance. Strong entries also explain failure cases, not just top hits. A deeper analysis, such as enrichment curves, pose clustering, or target-to-target comparison, can make the project much stronger.
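To check "better than chance" with numbers, one option is a permutation test: draw many random top-N lists from the library and count how often they recover at least as many known hits as your real ranking did. The sketch below is a minimal version under that assumption. The library, hit set, and function names are hypothetical, and the fixed seed is only there to make the run repeatable.

```python
import random

def hits_in_top(ranked_ids, known_hits, top_n):
    """Number of known hits among the first top_n ranked compounds."""
    return len(set(ranked_ids[:top_n]) & known_hits)

def permutation_p_value(ranked_ids, known_hits, top_n, n_perm=2000, seed=0):
    """Estimate how often a random top_n does at least as well as the real one."""
    rng = random.Random(seed)
    observed = hits_in_top(ranked_ids, known_hits, top_n)
    at_least_as_good = 0
    for _ in range(n_perm):
        random_top = rng.sample(ranked_ids, top_n)
        if len(set(random_top) & known_hits) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (at_least_as_good + 1) / (n_perm + 1)

# Toy library of 50 compounds, already ordered best score first (placeholder).
library = [f"cpd_{i:02d}" for i in range(50)]
known_hits = {"cpd_00", "cpd_01", "cpd_02", "cpd_03", "cpd_30"}

observed = hits_in_top(library, known_hits, top_n=5)
p_value = permutation_p_value(library, known_hits, top_n=5)
print(f"{observed} hits in top 5, p = {p_value:.4f}")
```

A small p-value says the recovery is unlikely to come from random picking, but it does not show that any individual compound truly binds.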
Project Variations
- Test whether the same workflow works on a different parasite target, such as another malaria enzyme with a known structure.
- Compare FDA-approved drugs with natural products or clinical candidates to see which library gives better hit recovery.
- Add a pose-clustering step before rescoring to see whether consensus poses improve ranking quality.
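The pose-clustering variation can be prototyped with a greedy RMSD clustering. The sketch below assumes every pose of a ligand lists the same atoms in the same order and sits in the same protein coordinate frame, so no superposition is applied. It does not handle symmetric molecules, and the 2.0 Å cutoff is a common choice rather than a fixed rule. The coordinates are toy values.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with matching atom order."""
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(pose_a))

def greedy_cluster(poses, cutoff=2.0):
    """Assign each pose to the first cluster whose representative is within cutoff.

    Each cluster is a list of pose indices; the first index is its representative.
    """
    clusters = []
    for i, pose in enumerate(poses):
        for cluster in clusters:
            if rmsd(pose, poses[cluster[0]]) <= cutoff:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no close cluster found, start a new one
    return clusters

# Three toy poses of a two-atom "ligand" (x, y, z in angstroms).
poses = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],    # 0.5 A from pose 0
    [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)],  # far from both
]

clusters = greedy_cluster(poses)
print(clusters)
```

The largest cluster gives you a consensus pose candidate; you could then rescore only the cluster representatives and compare the ranking with the unclustered workflow.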
Learn More
- PubMed: Search for review articles on malaria drug discovery, PfATP4, PfDHODH, and structure-based virtual screening.
- Protein Data Bank: Find experimental 3D structures for protein targets and learn how structures are reported.
- NIH Guide to Malaria Research Resources: Look for background on parasite biology and target classes through NIH pages and linked resources.
- AutoDock Vina documentation: Read the original software manual and tutorial pages for docking workflow details.
- RDKit documentation: Use the official docs to handle chemical file conversion, descriptors, and library filtering.
- Nature Reviews Drug Discovery: Search the journal for review articles on antimalarial target discovery and computational screening.
Computational Biology and Bioinformatics Category Guide
How to Do Real Computational Biology Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
