Coronary PRS Transferability

ISEF Category: Biomedical and Health Sciences

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Genetics and Molecular Biology of Disease · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

A polygenic risk score can look strong in one ancestry group and stumble in another. That matters because the same raw score can place a person in the high-risk tail when compared against one reference population and near the middle when compared against another. You can turn that gap into a clean science fair project by measuring where the score holds up, where it slips, and how a corrected version changes the result.

What Is It?

A polygenic risk score, or PRS, turns many tiny DNA effects into one prediction number. A genome-wide association study, or GWAS, supplies the weights, like a recipe that says how much each ingredient should count. If the recipe came from one group of people, it may not work as well when you test it on another group, because allele frequencies and the correlation patterns between nearby variants (linkage disequilibrium, or LD) can differ.
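
As a minimal sketch of the recipe idea above, the snippet below computes a score as the sum of effect-allele dosage times GWAS weight. The variant IDs, weights, and dosages are invented for illustration, and `polygenic_score` is a hypothetical helper, not a function from PLINK or any PRS package.

```python
# Toy GWAS weights: one effect size per variant (all values are made up).
gwas_weights = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}


def polygenic_score(dosages, weights):
    """Sum dosage x weight over the variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(dosages[v] * weights[v] for v in shared)


# Dosage = number of copies of the effect allele (0, 1, or 2).
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}
score = polygenic_score(person, gwas_weights)
print(round(score, 3))
```

A real pipeline also has to match effect alleles between files and handle missing genotypes, which this toy version skips.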

Your project asks how much that drop in performance shows up for coronary artery disease across ancestry groups. A fairness-corrected score tries to shrink the gap by recalibrating the score, adjusting the weights, or adding ancestry-aware scaling so the prediction is less uneven across groups.

Why This Is a Good Topic

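This is a strong science fair topic because you can test it with public summary statistics and reference panels, clear metrics, and no wet lab work. Scoring real people also requires individual-level genotype and outcome data, which cohorts such as UK Biobank and All of Us release only through approved access, so plan for that early. You can measure discrimination, calibration, and error gaps across ancestry groups, which gives you real numbers instead of vague claims. The topic connects directly to health equity, since a score that works better for one group can change who gets flagged for follow-up care.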
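
To make "discrimination across ancestry groups" concrete, here is a small sketch that computes the area under the ROC curve (AUC) separately for each group. The helper names `auc` and `auc_by_group` and all data values are hypothetical; in R, the pROC package listed later would do this job.

```python
def auc(scores, labels):
    """Chance that a random case outscores a random control (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def auc_by_group(scores, labels, groups):
    """Return {group: AUC} using only the samples in that group."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return result


# Invented toy data: 1 = case, 0 = control.
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.6, 0.5, 0.3]
labels = [0, 0, 1, 1, 0, 0, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
result = auc_by_group(scores, labels, groups)
print(result)
```

The difference between two group AUCs is one simple way to put a number on the transferability gap.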

Research Questions

How does coronary artery disease PRS discrimination change across self-reported ancestry groups when you use the same GWAS weights?
What is the effect of ancestry-matched LD reference panels on PRS calibration error?
Does group-specific recalibration reduce false-risk inflation (predicted risk that runs higher than observed risk) more than a single global correction?
To what extent do multi-ancestry GWAS weights improve transferability compared with European-only weights?
Which fairness correction method best preserves AUC while shrinking the gap in calibration slope?
How does standardizing PRS within each ancestry group change the apparent performance gap?
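
The last question can be explored with a sketch like the one below, which converts raw scores to z-scores inside each group. This is a simplified stand-in for ancestry-aware scaling, chosen for illustration; `standardize_within_group` is a hypothetical helper and the scores are invented.

```python
from statistics import mean, pstdev


def standardize_within_group(scores, groups):
    """Return z-scores computed from each group's own mean and SD."""
    stats = {}
    for g in set(groups):
        vals = [s for s, x in zip(scores, groups) if x == g]
        sd = pstdev(vals)
        if sd == 0:
            raise ValueError(f"group {g!r} has no score variation")
        stats[g] = (mean(vals), sd)
    return [(s - stats[g][0]) / stats[g][1] for s, g in zip(scores, groups)]


# Invented raw scores: group B sits on a shifted scale.
raw_scores = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
groups = ["A", "A", "A", "B", "B", "B"]
z = standardize_within_group(raw_scores, groups)
print([round(v, 3) for v in z])
```

Because z-scoring keeps the rank order inside a group, each group's own AUC stays the same. What changes is any result that depends on a shared cutoff or on pooling the groups together.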

Basic Materials

Laptop or desktop computer with at least 16 GB RAM.
Python 3 or R installed.
Public coronary artery disease GWAS summary statistics.
An ancestry reference panel or allele frequency table for comparison groups.
Spreadsheet software or a CSV editor for cleaning files.
Cloud storage or GitHub for version control and backups.

Advanced Materials

Secure-access research workstation with enough storage for large genotype and summary-stat files.
PLINK 2.0 for variant filtering and score preparation.
PRSice-2 or LDpred2 for building the PRS from summary statistics and an LD reference, so you can compare the same score across ancestry groups.
R with bigsnpr, data.table, ggplot2, and pROC for analysis and plots.
Ancestry principal-components scripts or reference files for subgroup checks.

Software & Tools

Python: Cleans summary statistics, runs bootstrap tests, and makes plots.
R: Calculates calibration, discrimination, and group-level error metrics.
PLINK 2.0: Filters variants and prepares inputs for PRS pipelines.
PRSice-2: Builds clumping-and-thresholding scores for quick comparisons.
LDpred2: Estimates PRS weights from summary statistics with LD information.

Experiment Steps

Define the ancestry groups, outcome, and performance metric you will compare.
Choose one baseline PRS method and one fairness correction to test against it.
Build a scoring pipeline that keeps training, validation, and test sets separate.
Decide which discrimination, calibration, and uncertainty checks will answer your question.
Plan a sensitivity analysis for LD reference panels, variant thresholds, or ancestry labels.
Map out the tables and figures that will show whether the correction narrows the gap.
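
One way to keep the sets separate, as the steps above call for, is to split sample IDs within each ancestry group using a fixed random seed. The sketch below assumes a 60/20/20 split; the function name `split_by_group`, the fractions, and the sample IDs are all hypothetical choices.

```python
import random
from collections import defaultdict


def split_by_group(sample_ids, groups, fractions=(0.6, 0.2, 0.2), seed=42):
    """Shuffle IDs inside each group, then cut into train/validation/test."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_group = defaultdict(list)
    for sid, g in zip(sample_ids, groups):
        by_group[g].append(sid)
    splits = {"train": [], "validation": [], "test": []}
    for g in sorted(by_group):
        ids = by_group[g][:]
        rng.shuffle(ids)
        n_train = int(fractions[0] * len(ids))
        n_val = int(fractions[1] * len(ids))
        splits["train"] += ids[:n_train]
        splits["validation"] += ids[n_train:n_train + n_val]
        splits["test"] += ids[n_train + n_val:]  # remainder goes to test
    return splits


sample_ids = [f"s{i}" for i in range(20)]
groups = ["A"] * 10 + ["B"] * 10
splits = split_by_group(sample_ids, groups)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting inside each group keeps every ancestry represented in all three sets. Tune any correction on the validation set only, and check separately that the people behind the GWAS weights do not overlap with your scored cohort.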

Common Pitfalls

Mixing self-reported ancestry with genetically inferred ancestry, which makes your comparison groups inconsistent.
Judging the score by AUC alone, which can hide miscalibration and false-risk inflation.
Using a European LD reference for every group, which can make transferability look better or worse for the wrong reason.
Tuning the fairness correction on the test set, which leaks information and makes the gap look smaller than it really is.
Comparing raw PRS values across cohorts without standardizing them, which turns scale shifts into fake performance differences.
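
The AUC pitfall above can be checked with a calibration slope: regress the outcome on the logit of the predicted risk, where a slope near 1 suggests well-scaled predictions and a slope below 1 suggests predictions that are too extreme. The sketch below fits that one-predictor logistic regression with Newton's method in plain Python. The helper names and toy data are hypothetical, and a real analysis would more likely use R's `glm` or a statistics library.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def logit(p, eps=1e-9):
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    return math.log(p / (1 - p))


def fit_logistic(x, y, iters=50, tol=1e-10):
    """Fit P(y=1) = sigmoid(a + b*x) by Newton's method; return (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        p = [sigmoid(a + b * xi) for xi in x]
        w = [pi * (1 - pi) for pi in p]
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            raise ValueError("cannot fit: data may be perfectly separated")
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def calibration_slope(pred_probs, outcomes):
    """Slope from regressing outcomes on the logit of predicted risk."""
    return fit_logistic([logit(p) for p in pred_probs], outcomes)[1]


# Invented data: the second model doubles every logit (overconfident).
raw = [-2, -1, -0.5, 0, 0.5, 1, 1.5, 2]
outcomes = [0, 0, 1, 0, 1, 0, 1, 1]
modest = [sigmoid(v) for v in raw]
extreme = [sigmoid(2 * v) for v in raw]
slope_modest = calibration_slope(modest, outcomes)
slope_extreme = calibration_slope(extreme, outcomes)
print(round(slope_modest, 3), round(slope_extreme, 3))
```

Both toy models rank people identically, so their AUC is the same, yet the second one's calibration slope is half the first. That is the kind of problem AUC alone cannot show.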

What Makes This Competitive

A strong version of this project does more than report a gap. You compare several correction methods, use confidence intervals or bootstrap tests, and check both discrimination and calibration. If you validate the pattern on an external cohort or show that one fix helps a weaker group without hurting another, the work starts to look research-grade.
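
As one hedged example of the bootstrap idea above, the sketch below estimates a percentile confidence interval for the AUC gap between two groups. It resamples cases and controls separately so every resample has both. The data are simulated, and `bootstrap_auc_gap`, the group sizes, and the effect sizes are illustrative assumptions.

```python
import random


def auc(scores, labels):
    """Chance that a random case outscores a random control (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def bootstrap_auc_gap(sa, la, sb, lb, n_boot=500, alpha=0.05, seed=0):
    """Percentile CI for AUC(group A) - AUC(group B)."""
    rng = random.Random(seed)

    def resample(scores, labels):
        cases = [s for s, y in zip(scores, labels) if y == 1]
        controls = [s for s, y in zip(scores, labels) if y == 0]
        rc = rng.choices(cases, k=len(cases))        # sample with replacement
        rk = rng.choices(controls, k=len(controls))
        return rc + rk, [1] * len(rc) + [0] * len(rk)

    gaps = []
    for _ in range(n_boot):
        ra, rla = resample(sa, la)
        rb, rlb = resample(sb, lb)
        gaps.append(auc(ra, rla) - auc(rb, rlb))
    gaps.sort()
    lower = gaps[int((alpha / 2) * n_boot)]
    upper = gaps[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def simulate(n_cases, n_controls, shift, rng):
    """Simulated scores: cases are shifted upward by `shift` SD units."""
    scores = [rng.gauss(shift, 1) for _ in range(n_cases)]
    scores += [rng.gauss(0, 1) for _ in range(n_controls)]
    return scores, [1] * n_cases + [0] * n_controls


data_rng = random.Random(1)
scores_a, labels_a = simulate(30, 30, 1.0, data_rng)  # stronger signal
scores_b, labels_b = simulate(30, 30, 0.3, data_rng)  # weaker signal
lower, upper = bootstrap_auc_gap(scores_a, labels_a, scores_b, labels_b)
print(round(lower, 3), round(upper, 3))
```

If the interval excludes zero, the gap is unlikely to be sampling noise alone. With small groups the interval will be wide, which is worth reporting honestly.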

Project Variations

Compare coronary PRS transferability across European, African, East Asian, and South Asian groups using one fixed scoring pipeline.
Test whether ancestry-matched LD panels improve calibration more than simple score rescaling.
Build a blended fairness-corrected score and compare it with a single global score and group-specific scores.

Learn More

GWAS Catalog: Search for coronary artery disease studies, summary statistics, and linked trait annotations.
PubMed: Search review articles on polygenic risk scores, ancestry bias, and score portability.
National Human Genome Research Institute: Read background pages on GWAS and polygenic risk scores on the NHGRI site.
All of Us Research Program: Read the public cohort documentation and data browser to understand ancestry reporting and available summary resources.
UK Biobank: Read the resource overview and phenotype documentation to understand cohort structure and ancestry notes.

Biomedical and Health Sciences Category Guide

How to Do Real Biomedical and Health Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
