How to Do Real Translational Medical Science Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor.
Translational medical science used to live inside hospitals, pharma campuses, and university labs with locked doors. That barrier is gone for a high school student with a laptop, a smartphone, and an internet connection.
This guide is your starting point. It walks you through three things: the home kit (consumer health devices and invertebrate models), the free software (docking, molecular dynamics, medical-image and signal ML), and the public databases (de-identified clinical records, wearables, omics, imaging, and drug data) that together let you run a project a judge will take seriously.
Why this is possible now
The first shift is open clinical data. Massive de-identified datasets like MIMIC-IV, eICU, the All of Us public tier, NHANES, and UK Biobank summary statistics are now downloadable (some after a free training course). You can study real patients, real waveforms, and real outcomes from your bedroom.
The second shift is free GPU compute. Google Colab's free tier offers session-limited GPU and TPU time at no cost. That is enough to run AlphaFold, GROMACS molecular dynamics, transformer models on ECG and PPG signals, and graph neural networks on drug libraries without owning any hardware.
The third shift is consumer-grade biosensing. A $30 USB microscope, a $20 pulse oximeter, a $200 BITalino or OpenBCI board, and the phone in your pocket together capture vital signs, images, and biosignals that 10 years ago required a clinic.
Put those three together and a kitchen counter plus a laptop can now host a virtual drug screen, a wearable-signal analysis, and a real preclinical assay in the same week.
The translational medical science home kit
Group your kit by what each item lets you measure or model.
Consumer vitals and point-of-care strips
- Pulse oximeter (SpO2 and pulse waveform, ~$20)
- Home blood pressure cuff (~$30)
- Smart scale and a body-tape measure
- Glucose, ketone, and urinalysis dipsticks (~$15 to $25 per pack)
- pH strips and saliva-pH strips
- Smartphone with a flashlight and decent camera (you already have one)
Imaging and optics
- USB microscope, 40x to 1000x (~$30)
- Clip-on smartphone macro lens (~$15)
- A $5 plastic diffraction grating for crude spectroscopy
- A printed color-checker card for calibrated photography
Biosignal hardware
- BITalino or OpenBCI Ganglion (~$200) for EMG, ECG, EEG, EDA
- Optional Arduino + cheap sensors for custom rigs (IMU, NDIR CO2, TENS-style stimulation)
- A wearable you can export data from: Apple Watch, Fitbit, or Garmin
Invertebrate and surrogate models
- C. elegans starter culture (~$30) for lifespan and movement assays
- Daphnia magna for cardiotoxicity microscopy
- Planaria for regeneration and wound-healing studies
- Galleria mellonella larvae (sold as fishing bait, ~$0.20 each) for infection models
- Baker's yeast and kombucha SCOBY as BSL-1 microbial surrogates
- Pre-poured agar plates and Kirby-Bauer disks (~$1 each, ordered online)
Pantry-grade reagents
- Store-bought enzymes: papain, bromelain, lactase
- Probiotic capsules, sourdough starter, yogurt
- Spice extracts and herbal preparations from a grocery or health-food store
- Common controls: DMSO, ethanol, distilled water
A reasonable starter kit runs roughly $150 to $400, depending on whether you add the BITalino board.
Signature technique: wearable signal analysis on a free Colab GPU
Translational medicine lives or dies on signals: heart rate, oxygen saturation, gait, sleep, EEG. The single technique that unlocks the most projects in this category is loading a public waveform dataset into Google Colab and training a model on it. Here is the five-step workflow.
- Pick a signal and a dataset. PhysioNet hosts MIT-BIH ECG, Sleep-EDF, PPG-DaLiA, WESAD, and MIMIC-IV waveforms. Choose one signal you care about (PPG, ECG, accelerometry).
- Load the data in Colab. Use the wfdb Python package to stream records straight from PhysioNet. No local storage needed.
- Preprocess. Filter the signal (band-pass for ECG, low-pass for PPG), segment it into windows, and normalize.
- Train a model. Start with a 1D CNN or a small transformer in PyTorch. Use the free GPU runtime. Save checkpoints to Google Drive.
- Explain it. Run SHAP or Captum on the trained model to show which segments of the waveform drove each prediction. Judges reward interpretability.
That same five-step shape works for sound (cough audio with MFCCs), images (chest X-rays with MONAI), or text (clinical notes with HuggingFace).
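As a minimal sketch of the preprocessing step in the workflow above, the snippet below smooths a signal, cuts it into overlapping windows, and normalizes each window. It uses only the Python standard library and a synthetic sine wave as a stand-in for a PPG trace. The 50 Hz sampling rate, the 4-second window, the moving-average filter, and all function names are illustrative assumptions, not values taken from any PhysioNet dataset.

```python
import math
import statistics

def moving_average(signal, width):
    """Crude low-pass filter: average each sample with its neighbours."""
    half = width // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

def segment(signal, window, step):
    """Split a signal into fixed-length windows, hopping `step` samples."""
    return [signal[s:s + window]
            for s in range(0, len(signal) - window + 1, step)]

def zscore(window):
    """Normalize one window to zero mean and unit standard deviation."""
    mean = statistics.fmean(window)
    sd = statistics.pstdev(window)
    if sd == 0:
        return [0.0 for _ in window]
    return [(x - mean) / sd for x in window]

# Synthetic stand-in for a PPG trace (assumption): a 1.2 Hz pulse plus
# a faster 15 Hz ripple that plays the role of noise.
FS = 50  # assumed sampling rate in Hz
raw = [math.sin(2 * math.pi * 1.2 * n / FS)
       + 0.3 * math.sin(2 * math.pi * 15 * n / FS)
       for n in range(FS * 20)]  # 20 seconds of signal

filtered = moving_average(raw, width=5)
windows = [zscore(w) for w in segment(filtered, window=FS * 4, step=FS * 2)]
print(len(windows), "windows of", len(windows[0]), "samples each")
```

On real recordings you would likely swap the moving average for a proper band-pass or low-pass filter, but the window-then-normalize shape stays the same, and the resulting list of windows is what a 1D CNN would take as input.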
The dry-lab side: free software you can install today
Group these by what each tool does.
Structure and modeling
- PyMOL and ChimeraX: view and annotate protein structures from the PDB.
- AlphaFold DB: pre-computed structures for almost every human protein.
- AlphaFold2 / AlphaFold-Multimer (Colab): predict structures of proteins and complexes from sequence.
- ESMFold and ESM-2: faster structure prediction and protein embeddings on Colab.
Docking and virtual screening
- AutoDock Vina and Smina: classic molecular docking, runs on a laptop.
- DiffDock: diffusion-based docking that handles flexible binding.
- RDKit: cheminformatics toolkit for handling molecules in Python.
- ChemProp: graph neural network for bioactivity prediction.
Molecular dynamics
- GROMACS and OpenMM: all-atom molecular dynamics simulations that run on free Colab GPUs.
- PK-Sim: physiologically-based pharmacokinetic modeling, free and open source.
ADMET and drug-likeness
- SwissADME, pkCSM, and ADMET-AI: predict absorption, toxicity, and drug-likeness from SMILES.
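One widely used drug-likeness filter is Lipinski's rule of five, and it helps to see how simple the check is before trusting a web tool's verdict. The sketch below applies the rule to descriptor values you have already obtained, for example from RDKit or a web server. The `Descriptors` class, the function names, and both example molecules are hypothetical, and the numbers are illustrative placeholders rather than measured values.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    name: str
    mol_weight: float   # molecular weight in g/mol
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors

def lipinski_violations(d):
    """Return the rule-of-five criteria that a molecule breaks."""
    checks = {
        "molecular weight > 500": d.mol_weight > 500,
        "logP > 5": d.logp > 5,
        "H-bond donors > 5": d.h_donors > 5,
        "H-bond acceptors > 10": d.h_acceptors > 10,
    }
    return [rule for rule, broken in checks.items() if broken]

def is_drug_like(d):
    """Common reading of the rule: tolerate at most one violation."""
    return len(lipinski_violations(d)) <= 1

# Illustrative descriptor values only (assumptions, not real compounds).
candidates = [
    Descriptors("small_polar_example", 180.2, 1.2, 1, 4),
    Descriptors("large_greasy_example", 720.0, 6.5, 2, 12),
]

for c in candidates:
    verdict = "passes" if is_drug_like(c) else "fails"
    print(c.name, verdict, lipinski_violations(c))
```

The rule is a coarse heuristic for oral absorption, so treat a failure as a flag to investigate rather than a reason to discard a compound.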
Medical imaging and signals
- MONAI: PyTorch-based medical image deep learning.
- MediaPipe: pose, hand, and face landmark detection from a webcam.
- OpenCV and ImageJ / Fiji: image processing and measurement.
- wfdb: read PhysioNet waveform records in Python.
Modeling and ML
- scikit-learn, PyTorch, HuggingFace Transformers: the standard ML stack.
- SHAP and Captum: model interpretability.
- PhysiCell: agent-based tumor and tissue simulation.
- Mesa and NetLogo: agent-based public-health simulation.
- R with the survival, MatchIt, and tidymv packages: epidemiology and causal inference.
Running these tools yourself is what changes how research feels. You stop reading about science and start producing it.
Public databases that count as real data
Group these by what they contain.
De-identified clinical records and registries
- MIMIC-IV and eICU: ICU records, vitals, labs, notes (free after a short credentialing course).
- HiRID: high-resolution ICU data from Bern.
- NHANES and BRFSS: nationally representative US health surveys.
- All of Us public tier: diverse US cohort, demographics, vitals, surveys.
- UK Biobank summary statistics: GWAS and phenotype summaries.
- CDC WONDER and CDC PLACES: mortality, chronic-disease prevalence, census-tract health.
- ClinicalTrials.gov and FDA Orange Book / 510(k) database: trials and approvals.
Genomics and variant data
- GEO and ArrayExpress: gene expression studies.
- TCGA / GDC and cBioPortal: cancer multi-omics.
- GTEx: tissue-level gene expression.
- ClinVar, gnomAD, OMIM: variants, allele frequencies, clinical interpretation.
- GWAS Catalog, PheWAS Catalog, FinnGen public: published associations.
- OpenTargets, DisGeNET: disease-gene-drug links.
- KEGG, Reactome, STRING: pathways and protein interaction networks.
Medical imaging
- ISIC: dermatology images.
- CheXpert, MIMIC-CXR, NIH ChestX-ray14: chest radiographs.
- BraTS, LIDC-IDRI, fastMRI, ADNI, OASIS: brain MRI, lung CT, neurodegeneration imaging.
- BUSI, Camelyon, Kermany OCT, PathMNIST / MedMNIST: ultrasound, pathology, retinal OCT.
- RSNA challenges and TCIA: curated radiology challenge sets.
Wearables and physiological signals
- PhysioNet: ECG, EEG, PPG, gait, sleep waveforms (MIT-BIH, Sleep-EDF, WESAD, PPG-DaLiA).
- Coswara and COUGHVID: cough audio.
- Apple Health export, Fitbit research releases: consumer wearable streams.
Chemistry and pharmacology
- PDB and AlphaFold DB: 3D structures.
- UniProt: protein sequence and annotation.
- PubChem, ChEMBL, DrugBank, BindingDB, ZINC: compounds, bioactivities, drug targets.
- STITCH: chemical-protein interactions.
- JUMP-CP and BBBC: Cell Painting images.
- LIT-PCBA: virtual-screening benchmark.
Re-analyzing one of these datasets with a fresh question is itself a legitimate research path, and many of the strongest student projects never collect a single new sample.
How to combine wet and dry: the strongest project shape
Pattern A: home measurement, public-data anchor. Run a small, careful at-home study (urinalysis strips for 30 days, a tongue-photo dataset, a cold-pressor cohort) and use a much larger public dataset to calibrate, contextualize, or validate your finding. The public data gives statistical power; your data gives a new endpoint.
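One simple way to contextualize a home measurement against public data is to ask where it falls in the reference distribution. The sketch below computes an empirical percentile rank with the standard library. The reference list stands in for values you would pull from a public dataset, and every number in it, along with the home readings, is hypothetical.

```python
from bisect import bisect_right
import statistics

def percentile_rank(reference, value):
    """Percent of reference values that are at or below `value`."""
    ordered = sorted(reference)
    return 100.0 * bisect_right(ordered, value) / len(ordered)

# Hypothetical resting heart rates (bpm) standing in for a public sample.
reference_resting_hr = [58, 61, 64, 66, 68, 70, 71, 73, 75, 78,
                        80, 84, 88, 92, 97, 60, 67, 72, 76, 82]

# Hypothetical week of home pulse-oximeter readings (bpm).
home_readings = [71, 74, 69, 72, 73, 70, 75]

home_median = statistics.median(home_readings)
rank = percentile_rank(reference_resting_hr, home_median)
print(f"Home median {home_median} bpm sits at percentile {rank:.0f}")
```

With a real dataset you would first restrict the reference sample to people comparable to your participants (age band, sex, measurement method) before reading anything into the percentile.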
Pattern B: in-silico prediction, invertebrate validation. Run a docking, generative-chemistry, or pathway-mining pipeline on Colab to nominate a small set of compounds, then test the top candidates on C. elegans lifespan, Daphnia heart rate, planaria regeneration, or yeast growth. The computation gives novelty; the assay gives biological signal.
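Invertebrate assays usually produce small samples, so a permutation test is one reasonable way to compare a treated group with a control group without assuming a normal distribution. The sketch below is a two-sided test on the difference in means. The function name, the group sizes, and the Daphnia heart rates are all hypothetical and exist only to show the mechanics.

```python
import random
import statistics

def permutation_test(control, treated, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    observed = statistics.fmean(treated) - statistics.fmean(control)
    pooled = list(control) + list(treated)
    n_control = len(control)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle the labels: any split is equally likely under the null.
        rng.shuffle(pooled)
        diff = (statistics.fmean(pooled[n_control:])
                - statistics.fmean(pooled[:n_control]))
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical Daphnia heart rates in beats per minute.
control_bpm = [182, 190, 176, 185, 188, 179, 184, 191]
treated_bpm = [160, 171, 158, 166, 174, 163, 169, 157]

diff, p_value = permutation_test(control_bpm, treated_bpm)
print(f"Mean difference: {diff:.1f} bpm, permutation p-value: {p_value:.4f}")
```

Decide on the test and the number of animals per group before you collect data, and record that decision in your lab notebook, so the analysis cannot drift toward whatever result looks best.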
Judges reward hybrid shapes because they mirror how real translational pipelines work, from molecule to model to patient.
Choosing a phenomenon that has not been done
- Search Google Scholar for your candidate phrase plus terms like "high school," "ISEF," and the closest method ("smartphone PPG sepsis," "AutoDock Vina KRAS"). Look at the last three years.
- Browse the Society for Science abstracts archive for past ISEF and Regeneron STS projects in Translational Medical Science. Search by keyword and by subcategory.
- Search PubMed and ClinicalTrials.gov for the disease plus your method. Read the most recent review article. Note what is missing: an under-studied population, a missing modality, a method that has not been tried on this dataset.
If you find adjacent prior work, that is good news, not bad news. It means the question is alive, and your job is to find the next step nobody has taken yet.
A realistic timeline
- 1 to 2 weeks: replicate a published wearable or imaging analysis on one public dataset, or run a single home assay (Daphnia heart rate, urinalysis time-series) with proper controls.
- 1 to 2 months: a hybrid project for a regional fair, combining one at-home dataset or assay with one public-data analysis, plus a written report.
- Full year: an ISEF-track project with a real research question, a registered protocol, a deep computational pipeline (docking plus MD, or a transformer with fairness audits), and either an invertebrate validation or a substantial human-volunteer cohort with IRB-equivalent SRC paperwork.
If this is your first project, start with the 1 to 2 week version. You learn more from finishing a small project than from stalling on a big one.
A starter checklist
- Set up a free Google Colab account and verify GPU access.
- Install a local Python environment (Anaconda or uv) with NumPy, pandas, scikit-learn, PyTorch, RDKit, wfdb, and OpenCV.
- Install PyMOL or ChimeraX for structure viewing, plus AutoDock Vina if you are heading toward drug screening.
- Pick one wearable or sensor (pulse oximeter, USB microscope, BITalino) and confirm you can capture clean data from it.
- Pick one public database and complete its access steps (PhysioNet credentialing if you want MIMIC-IV).
- Start a lab notebook, paper or digital, with dated entries from day one.
- Write a one-sentence research question with a measurable outcome.
If you can check all seven, you are ready to pick a phenomenon.
Where to go next
Translational Medical Science has six ISEF subcategories. Each has its own MehtA+ project guide that plugs directly into the kit on this page. Pick the one that pulls you in.
- Disease Detection and Diagnosis (DIS): smartphone-based biomarkers, ML on medical images and signals, at-home screening tools.
- Disease Prevention (PRE): behavioral, environmental, and lifestyle interventions, plus causal inference on public health data.
- Disease Treatment and Therapies (TRE): non-pharmacologic interventions, biofeedback devices, adaptive dosing controllers, digital health apps.
- Drug Identification and Testing (DRU): virtual screening, generative chemistry, repurposing pipelines, antimicrobial assays.
- Pre-Clinical Studies (PCS): invertebrate models, in-silico tissue and pharmacokinetic simulations, surrogate biofilm and yeast assays.
- Other (OTH): knowledge graphs, equity audits, cost-effectiveness models, synthetic data, and clinical-decision-support prototypes.
A kitchen counter plus a laptop is enough to start any of them.
Project ideas in this category (72)
- [title missing]: Pre-Clinical Studies · Intermediate
- AI Chronic Pain Coaching Chatbots: Disease Treatment and Therapies · Advanced
- AI Insulin Dosing in Virtual Diabetes Simulators: Disease Treatment and Therapies · Advanced
- AI Peptide Design for Drug-Resistant Pseudomonas: Drug Identification and Testing · Advanced
- Alzheimer’s Prevention Scorecard From Genetic Data: Disease Prevention · Advanced
- Amyloid-Beta Drug Repurposing With Molecular Modeling: Drug Identification and Testing · Advanced
- Aptamer Design for Parkinson’s Protein Targets: Drug Identification and Testing · Advanced
- Audio Tones for Tension Headache Relief: Disease Treatment and Therapies · Intermediate
- Auricular Vagus Nerve Stimulation and HRV: Disease Treatment and Therapies · Advanced
- Bayesian Drug Screening for Mpro Lead Discovery: Drug Identification and Testing · Advanced
- Bench-To-Bedside Lag Time in Drug Development: Other · Advanced
- C. elegans Polyphenol Heat-Stress Screening: Pre-Clinical Studies · Advanced
- CAR-T Tumor Microenvironment Modeling: Pre-Clinical Studies · Advanced
- Cell-Painting AI for Drug Mechanism Clues: Pre-Clinical Studies · Advanced
- Chatbot Vaping Cessation Message Study: Disease Prevention · Advanced
- Classroom CO2, Ventilation, and Illness Risk: Disease Prevention · Intermediate
- Clinical Note Text Mining for Drug Effects: Other · Advanced
- Cold Pressor Pain Relief With Music and Breathing: Disease Treatment and Therapies · Intermediate
- Consent Readability Rewriting for Clinical Trials: Other · Advanced
- Daphnia Heart Rate Toxicity Screening Project: Pre-Clinical Studies · Intermediate
- Dried Blood Spot Anemia Detection with CNNs: Disease Detection and Diagnosis · Advanced
- Drug Target Success Prediction With Knowledge Graphs: Other · Advanced
- EMG Biofeedback for Trapezius Relaxation: Disease Treatment and Therapies · Advanced
- FDA AI Medical Device Fairness Audit: Other · Advanced
- Federated AKI Detection and Privacy Tradeoffs: Disease Detection and Diagnosis · Advanced
- Fitbit Recovery Tracking for Post-Op Patients: Other · Advanced
- Handwriting AI for Tremor and Parkinson’s Detection: Disease Detection and Diagnosis · Advanced
- Heatwave Forecasts and ER Visit Alerts: Other · Advanced
- Herbal Pathway Mapping for Arthritis Tea Blends: Disease Treatment and Therapies · Advanced
- HET-CAM Eye Drop Irritancy Test Project: Pre-Clinical Studies · Intermediate
- Hsp90 Inhibitor Design with AI Tools: Drug Identification and Testing · Advanced
- Kombucha Biofilm Oil Penetration Kinetics Project: Drug Identification and Testing · Intermediate
- KRAS-G12D Inhibitor Prediction with ChemProp: Drug Identification and Testing · Advanced
- LNP-mRNA Permeability Modeling for BBB Delivery: Pre-Clinical Studies · Advanced
- Long COVID Drug Repurposing With Text Mining: Disease Treatment and Therapies · Advanced
- Low-Cost Tremor-Canceling Utensil Design: Disease Treatment and Therapies · Advanced
- Low-Cost Triage Device With TinyML: Other · Advanced
- Mendelian Randomization for Nutrients and Migraine: Disease Prevention · Advanced
- Mobile CBT-I Sleep App Prototype: Disease Treatment and Therapies · Intermediate
- Nanobody Design for PD-L1 Binding: Drug Identification and Testing · Advanced
- PBPK Modeling for Safer Acetaminophen Dosing: Pre-Clinical Studies · Advanced
- Planaria Wound Healing Modulators Study: Pre-Clinical Studies · Advanced
- Polypharmacology Mapping for Diabetic Nephropathy: Drug Identification and Testing · Advanced
- PPG Sepsis Detection With Wearables: Disease Detection and Diagnosis · Advanced
- Rehydration Drink Testing for Athletes: Disease Treatment and Therapies · Intermediate
- Root Growth Screen for Topical Emulsion Safety: Pre-Clinical Studies · Intermediate
- School Cafeteria Disease Spread Simulation: Disease Prevention · Intermediate
- Sleep Apnea and Hypertension Causality Study: Other · Advanced
- Smartphone Cough Biomarkers for Disease Detection: Disease Detection and Diagnosis · Advanced
- Smartphone Nailbed Anemia Screening: Disease Detection and Diagnosis · Advanced
- Smartphone Neck-Vein Waveform Analysis: Disease Detection and Diagnosis · Advanced
- Smartphone Pupillometry for Concussion Screening: Disease Detection and Diagnosis · Advanced
- Smartphone Saliva pH Tracking for Dental Risk: Disease Prevention · Advanced
- Smartphone Urinalysis Time-Series for Early Signals: Disease Detection and Diagnosis · Advanced
- Smell Test for Early Biomarker Screening: Disease Detection and Diagnosis · Intermediate
- Spice Antimicrobial Testing with Smartphone Analysis: Drug Identification and Testing · Intermediate
- Sunscreen Nudge Framing for Teen Reapplication: Disease Prevention · Intermediate
- Synthetic EHRs for Rare Disease Research: Other · Advanced
- Tech-Neck Feedback to Improve Homework Posture: Disease Prevention · Intermediate
- Tongue Image CNN for Disease Detection: Disease Detection and Diagnosis · Advanced
- Transparent Clinical Risk Tool With SHAP Explanations: Other · Advanced
- Triple-Burden Health Risk Mapping Project Ideas: Disease Prevention · Advanced
- Ultra-Processed Food, Sleep, and Metabolic Risk: Disease Prevention · Advanced
- UV-C Mouthguard Decontamination Project: Disease Prevention · Intermediate
- Virtual Screening for SARS-CoV-2 Protease Inhibitors: Drug Identification and Testing · Advanced
- Voice Analysis for Parkinson’s Screening: Disease Detection and Diagnosis · Advanced
- VR Exposure Therapy for Needle Phobia: Disease Treatment and Therapies · Advanced
- Wax Moth Infection Model for Antibiotic Adjuvants: Pre-Clinical Studies · Advanced
- Wearable AFib Screening Cost Model: Other · Advanced
- Wearable Signals for Hypertension Risk Prediction: Disease Prevention · Advanced
- xTB Screening of Psychedelic Analog Selectivity: Drug Identification and Testing · Advanced
- Yeast Stress Assay for Metabolic Interactions: Pre-Clinical Studies · Intermediate
