How to Do Real Cellular and Molecular Biology Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Cellular and molecular biology used to live behind the doors of university labs with six-figure instruments. That world has cracked open. A high school student with a laptop, a USB microscope, and a yeast packet can now run experiments and analyze datasets that look a lot like graduate work.

This guide is your starting point. It walks you through three things: the affordable kit you can keep on a desk, the free software professional biologists actually use, and the public databases that store decades of real human and model-organism data.

Why this is possible now

Sequencing got cheap, and the data got public. Most published sequencing experiments are deposited in a free archive such as NCBI GEO, and large consortia such as GTEx and the Human Cell Atlas release much of their data openly. You can download the same files a professor downloads.

Compute got free. Google Colab's free tier gives you a GPU in your browser, subject to usage limits. You can fold a protein with AlphaFold, run molecular dynamics in OpenMM, or train a deep-learning model on single-cell data without owning a graphics card.

Imaging got pocket-sized. A 30 dollar USB microscope plugged into a laptop, or a phone clipped to a cheap adapter, gives you frame rates and resolution that were lab-grade 20 years ago. Pair that with open computer vision libraries and you have a quantitative imaging pipeline.

Put it together and, with a kitchen counter and a laptop, you can now culture model organisms, image live cells, mine terabytes of human sequencing data, and simulate a protein, all in one project.

The cellular and molecular biology home kit

You do not need a wet lab. You need a clean surface, a few model organisms, a small microscope, and some cheap dyes.

Model organisms you can keep at home

  • Baker's or brewer's yeast (a few dollars at the grocery store).
  • Drosophila fruit flies (wild-caught with a banana trap, or ordered cheap).
  • Daphnia water fleas (from a pet store aquarium section).
  • Planaria (pond water, or 10 to 20 dollars from a biological supplier).
  • Tetrahymena and Paramecium (pond water or a 20 dollar culture).
  • C. elegans (a small per-strain fee from the Caenorhabditis Genetics Center, ordered through a mentor sponsor).
  • Sprouted seeds, onion bulbs, Lemna duckweed, Elodea (under 10 dollars total).

Imaging hardware

  • USB digital microscope, 30 to 100 dollars.
  • Smartphone microscope adapter, 15 to 30 dollars.
  • Foldscope paper microscope, about 1.50 dollars.
  • A tripod or phone clamp for steady time-lapse.

Dyes, reagents, and consumables

  • Methylene blue, neutral red, and trypan blue (a few dollars each, classic viability and uptake stains).
  • Food dyes as cheap uptake tracers.
  • Agar plates and petri dishes from a biology supplier, around 20 dollars for a starter pack.
  • Resazurin powder for metabolic readouts.

Affordable kits and mail-in services

  • Carolina or Bio-Rad Explorer wet-lab kits, 10 to 50 dollars each (electrophoresis, transformation, PCR-by-mail).
  • Homemade agarose gel rig with a baking-soda buffer and a few 9V batteries wired in series.
  • Mail-in Sanger sequencing, around 5 to 15 dollars per sample.

Optional, project-dependent

  • Raspberry Pi for automated time-lapse rigs, around 40 dollars.
  • A consumer EEG headband for neurobiology-leaning projects, 100 to 250 dollars secondhand.

A full starter setup lands between 100 and 300 dollars, and most projects use only a fraction of that.

Signature technique: smartphone microscopy and video analysis

The technique that unlocks the most projects in this category is quantitative smartphone microscopy. You film living cells through a cheap lens, then you let software count, track, and measure for you.

  1. Mount your phone or USB microscope over a slide containing your model organism (yeast, Paramecium, onion epidermis, planaria). Use a desk lamp on white paper for even backlight.
  2. Record a short video or time-lapse. Keep the focus and exposure locked so frame-to-frame comparisons stay valid.
  3. Drop the video into Fiji (ImageJ) or napari. Use plugins like TrackMate for cell tracking, or Cellpose and StarDist for segmenting individual cells.
  4. Export the per-cell measurements (area, velocity, intensity, count) as a CSV.
  5. Open the CSV in a Colab notebook with pandas and scikit-learn. Fit a kinetic, dose-response, or Arrhenius model to your data.

That five-step loop, video to CSV to model, is the backbone of dozens of projects in this guide.
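
Here is a minimal sketch of step 5 for the Arrhenius case, using only the Python standard library so it runs anywhere. The column names `temp_c` and `rate`, the helper names, and the demo numbers are all assumptions made for illustration. The data is synthetic, generated from a known activation energy so you can check that the fit recovers it. In a Colab notebook you would more likely load the CSV with pandas and fit with scikit-learn or SciPy.

```python
import csv
import io
import math

R = 8.314  # gas constant in J/(mol*K)


def fit_arrhenius(temps_c, rates):
    """Least-squares fit of ln(rate) = ln(A) - Ea / (R * T).

    Returns (Ea in J/mol, A in the same units as rate).
    """
    xs = [1.0 / (t + 273.15) for t in temps_c]  # 1/T in 1/K
    ys = [math.log(r) for r in rates]           # ln(rate)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # equals -Ea / R
    intercept = mean_y - slope * mean_x  # equals ln(A)
    return -slope * R, math.exp(intercept)


def load_measurements(csv_text):
    """Read the hypothetical 'temp_c' and 'rate' columns from CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    temps = [float(row["temp_c"]) for row in rows]
    rates = [float(row["rate"]) for row in rows]
    return temps, rates


# Synthetic demo data (NOT real measurements), built from known parameters.
TRUE_EA = 50_000.0  # J/mol, arbitrary illustrative value
TRUE_A = 1.0e9      # arbitrary illustrative pre-exponential factor


def synthetic_rate(temp_c):
    return TRUE_A * math.exp(-TRUE_EA / (R * (temp_c + 273.15)))


demo_csv = "temp_c,rate\n" + "\n".join(
    f"{t},{synthetic_rate(t)!r}" for t in (10, 20, 30, 40)
)

temps, rates = load_measurements(demo_csv)
ea, a = fit_arrhenius(temps, rates)
print(f"Fitted activation energy: {ea / 1000:.1f} kJ/mol")
```

With real measurements the points will scatter around the line, so plot ln(rate) against 1/T before trusting the fitted slope.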

The dry-lab side: free software you can install today

Cell and molecular biology is half wet, half computational. The computational half is entirely free.

Sequence and structure

  • UniProt and NCBI BLAST: look up any protein and find related sequences.
  • PyMOL and ChimeraX: view and annotate 3D protein structures.
  • AlphaFold (via Colab or AlphaFold DB): predict a protein structure from a sequence.
  • Boltz and OpenFold: newer open-source structure predictors with confidence scoring.
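
One quick way to use those confidence scores: AlphaFold writes its per-residue confidence (pLDDT, on a 0 to 100 scale) into the B-factor column of the PDB files it produces. The sketch below reads that column for the alpha-carbon atoms and summarizes it. The function names and the three-residue demo structure are made up for illustration, and the coordinates and scores are not from any real protein. To try it on a real prediction, pass in the text of a PDB file downloaded from AlphaFold DB.

```python
def atom_line(serial, name, res, chain, resseq, x, y, z, bfactor):
    """Format one fixed-width PDB ATOM record (demo helper only)."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
        f"{x:>8.3f}{y:>8.3f}{z:>8.3f}{1.0:>6.2f}{bfactor:>6.2f}"
    )


def ca_plddt(pdb_text):
    """Return the B-factor (pLDDT in AlphaFold files) of every CA atom."""
    scores = []
    for line in pdb_text.splitlines():
        # Atom name is in columns 13-16, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


# Made-up three-residue fragment, just to exercise the parser.
demo_pdb = "\n".join([
    atom_line(1, "N", "MET", "A", 1, 0.0, 0.0, 0.0, 95.0),
    atom_line(2, "CA", "MET", "A", 1, 1.5, 0.0, 0.0, 95.0),
    atom_line(3, "N", "LYS", "A", 2, 2.5, 1.0, 0.0, 80.0),
    atom_line(4, "CA", "LYS", "A", 2, 3.8, 1.2, 0.0, 80.0),
    atom_line(5, "N", "GLY", "A", 3, 4.9, 2.0, 0.0, 45.0),
    atom_line(6, "CA", "GLY", "A", 3, 6.1, 2.4, 0.0, 45.0),
])

scores = ca_plddt(demo_pdb)
mean_plddt = sum(scores) / len(scores)
# AlphaFold DB treats pLDDT of 70 or above as "confident".
confident_fraction = sum(s >= 70 for s in scores) / len(scores)
print(f"mean pLDDT {mean_plddt:.1f}, confident residues {confident_fraction:.0%}")
```

Regions with low scores are often flexible or disordered, so check the score before you interpret a pocket or loop in a predicted structure.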

Docking and dynamics

  • AutoDock Vina: dock small molecules into protein pockets.
  • GROMACS and OpenMM: run molecular dynamics simulations, free on Colab GPUs.

Genomics and transcriptomics

  • DESeq2, edgeR, and limma (R): differential gene expression from RNA-seq counts.
  • scanpy (Python) and Seurat (R): single-cell RNA-seq analysis.
  • scVelo: RNA velocity for developmental trajectories.
  • PLINK and Hail: population genetics and GWAS analysis.
  • MAGMA and FUMA: gene-level GWAS interpretation.
  • Ensembl VEP and AlphaMissense: variant annotation and pathogenicity scoring.
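
To get a feel for what differential expression means before you open R, the sketch below does only the core arithmetic: scale raw counts to counts per million, then take a log2 ratio between two samples. This is a deliberately simplified illustration, not what DESeq2, edgeR, or limma actually compute; those packages also model variability across replicates and run statistical tests. The gene names and counts are invented.

```python
import math


def counts_per_million(counts):
    """Scale one sample's raw read counts so they sum to one million."""
    total = sum(counts.values())
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per gene on CPM values.

    The pseudocount avoids dividing by zero for unexpressed genes.
    """
    control_cpm = counts_per_million(control)
    treated_cpm = counts_per_million(treated)
    return {
        gene: math.log2(
            (treated_cpm[gene] + pseudocount) / (control_cpm[gene] + pseudocount)
        )
        for gene in control
    }


# Invented counts for three hypothetical genes in two samples.
control_counts = {"geneA": 100, "geneB": 400, "geneC": 500}
treated_counts = {"geneA": 400, "geneB": 400, "geneC": 200}

lfc = log2_fold_change(control_counts, treated_counts)
for gene, value in sorted(lfc.items()):
    print(f"{gene}: log2FC = {value:+.2f}")
```

A log2 fold change of +2 means roughly four times more expression and 0 means no change. With a single sample per condition you cannot tell a real change from noise, which is why the real packages require replicates.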

Machine learning for biology

  • scikit-learn and PyTorch: general ML.
  • ESM2 and ProtBERT: pretrained protein language models you can fine-tune.
  • RNA-FM: pretrained RNA language model.

Imaging

  • Fiji (ImageJ): the standard for scientific image analysis.
  • CellProfiler: pipeline-based cell quantification.
  • napari: modern Python image viewer.
  • DeepLabCut: tracks animals or cells in video with deep learning.
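
If you are curious what "cell quantification" boils down to, here is a toy sketch of the simplest version: threshold an image, then count connected blobs of bright pixels and measure their areas. It is written in plain Python on a tiny invented grid so you can trace it by hand. Real tools use far more robust methods, and the function name, threshold, and pixel values here are assumptions for illustration.

```python
from collections import deque


def blob_areas(image, threshold):
    """Return the pixel area of each 4-connected blob brighter than threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill from this unvisited bright pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and not seen[ny][nx]
                        and image[ny][nx] > threshold
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


# Invented 6 x 8 grayscale frame containing three bright "cells".
frame = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0, 8, 0],
    [0, 9, 9, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 7, 0, 0, 0],
    [0, 0, 0, 7, 7, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

areas = blob_areas(frame, threshold=5)
print(f"{len(areas)} objects with areas {areas}")
```

Touching cells merge into one blob under this approach, which is one reason dedicated segmentation tools exist.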

Running these tools yourself changes how research feels, because you stop asking "what did the paper find" and start asking "what happens if I rerun this differently."

Public databases that count as real data

Re-analyzing a public dataset is real research. Many of the strongest projects in this category never touch a pipette.

Bulk and single-cell transcriptomics

  • NCBI GEO: hundreds of thousands of expression studies.
  • GTEx: human gene expression across 50-plus tissues.
  • Expression Atlas: curated expression data across species.
  • Human Cell Atlas and Single Cell Portal: single-cell atlases.
  • Tabula Sapiens and Tabula Muris: whole-organism single-cell references for human and mouse.

Cancer and disease

  • TCGA (via cBioPortal): tumor genomics across cancer types.
  • DepMap, GDSC, and CCLE: cancer cell-line dependencies and drug responses.
  • PsychENCODE: brain transcriptomics for psychiatric disease.
  • LINCS L1000: drug-induced expression signatures.

Human variation

  • gnomAD: population frequencies for almost every human variant.
  • ClinVar: clinical interpretations of variants.
  • GWAS Catalog: summary statistics from thousands of GWAS studies.
  • 1000 Genomes: full sequencing data from a global reference cohort.
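
A first calculation you can do with a population allele frequency is to turn it into expected genotype frequencies. The sketch below assumes Hardy-Weinberg equilibrium (random mating, no selection, and so on), which real populations only approximate. The 1 percent frequency is a hypothetical input, not a value looked up from gnomAD.

```python
def expected_genotype_freqs(alt_allele_freq):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium."""
    if not 0.0 <= alt_allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = alt_allele_freq  # frequency of the alternate allele
    p = 1.0 - q          # frequency of the reference allele
    return {"hom_ref": p * p, "het": 2 * p * q, "hom_alt": q * q}


freqs = expected_genotype_freqs(0.01)  # hypothetical 1% allele frequency
for genotype, freq in freqs.items():
    print(f"{genotype}: {freq:.4%}")
```

At a 1 percent allele frequency, about 2 percent of people are expected to carry one copy, while only 1 in 10,000 is expected to carry two. That gap is why recessive variants can be common in a population even when the associated condition is rare.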

Immunology

  • ImmPort, ImmuneSpace, and iReceptor: human immunology cohorts and BCR or TCR repertoires.
  • IEDB: peptide-MHC binding data.
  • SAbDab: antibody structures.

Structures and pathways

  • PDB and AlphaFold DB: experimental and predicted protein structures.
  • Pfam, InterPro, Ensembl, RefSeq: sequence and domain references.
  • miRBase, TargetScan, and Rfam: RNA references.
  • KEGG, Reactome, STRING, BioGRID, and WikiPathways: pathways and protein-protein interactions.

Neuroscience and imaging

  • Allen Brain Atlas: gene expression and connectivity in the brain.
  • OpenNeuro: shared neuroimaging datasets.
  • CeNGEN: single-neuron transcriptomes in C. elegans.

Re-analysis of public data is a legitimate research path, and reviewers know it.

How to combine wet and dry: the strongest project shape

Pattern A: home experiment, public-data interpretation. Run a controlled experiment on a home model organism (yeast under heat stress, planaria regenerating after a drug exposure, onion cells under osmotic stress). Then pull the matching public transcriptome or single-cell atlas and use it to predict which genes or pathways drive what you measured. The wet side gives you phenotype, the dry side gives you mechanism.

Pattern B: public-data discovery, home validation. Mine a public dataset for a new signal (a stress-resilient cell subtype, a candidate riboswitch, a regulatory variant). Then design a tiny home assay (a colorimetric leak test, a chemotaxis assay, a yeast reporter) that probes one prediction from your computational hit. The dry side gives you the hypothesis, the wet side gives you a tangible test.

Judges respond to this hybrid shape because it shows you can both generate and interpret data, which is what working biologists actually do.

Choosing a phenomenon that has not been studied

  1. Take your draft research question and search it in Google Scholar with quotes around the most specific phrase. Filter to the last five years and read the top 10 abstracts.
  2. Search the Society for Science abstracts archive for the same keywords across past ISEF finalists.
  3. Search PubMed for review articles on the broader topic, and skim one recent review. Reviews tell you what the field considers solved and what it considers open.

If you find adjacent prior work, that is good news, not bad news. It tells you the question is real, and it points you to the angle nobody has tried yet.

A realistic timeline

  • 1 to 2 weeks: A focused replication or single measurement, like betalain leakage from beetroot across temperatures, or a re-analysis of one GEO dataset.
  • 1 to 2 months: A full hybrid project ready for a regional fair, with a clear wet experiment, a public-data layer, and a fitted model.
  • Full year: An ISEF-track project with multiple model organisms or datasets, replicate runs, statistical robustness checks, and an original computational contribution.

If this is your first research project, start with the 1 to 2 week version. Finishing a small project teaches you more than planning a big one.

A starter checklist

  1. A clean, well-lit workspace with a phone or USB microscope mount.
  2. A free Google account with Colab access (and Colab GPU enabled for at least one test notebook).
  3. A local Python environment with pandas, numpy, scikit-learn, scanpy, and biopython installed.
  4. Fiji (ImageJ) and PyMOL or ChimeraX installed on your laptop.
  5. A bound lab notebook or a dated digital notebook for daily entries.
  6. One model organism alive on your desk (yeast, planaria, Daphnia, Lemna, or sprouted seeds).
  7. A one-sentence written research question and the one public database you expect to use.

Once those are in place, you are ready to pick a phenomenon and start collecting data.

Where to go next

Cellular and Molecular Biology has six ISEF subcategories. Each one has its own MehtA+ project guide that uses the kit on this page, so pick the subcategory that grabs you most.

  • Cell Physiology (PHY): how cells move ions, water, and signals, and how they respond to stress. Strong home-experiment fit with yeast, plant cells, and protists.
  • Cellular Immunology (IMM): how immune cells recognize threats. Heavy on public scRNA-seq, repertoire data, and in silico vaccine and antibody design.
  • Genetics (GEN): variation, inheritance, and the link from DNA to trait. Built around gnomAD, ClinVar, GWAS Catalog, and small model-organism crosses.
  • Molecular Biology (MOL): DNA, RNA, and protein at the molecular level. Mixes home PCR or cell-free kits with ribosome profiling, RNA structure, and ML on sequence.
  • Neurobiology (NEU): how neurons, circuits, and brains work. Combines behavioral assays in C. elegans, planaria, snails, and flies with public neuroimaging and brain atlases.
  • Other (OTH): cross-cutting projects, synthetic biology, microbiome work, aging, and whole-cell modeling that span the boundaries above.

A kitchen counter and a laptop can now do what a small lab used to. Pick a subcategory, pick a phenomenon, and start.

Project ideas in this category (66)

5’UTR RNA Structure and Gene Repression

Cellular and Molecular Biology · Molecular Biology · Advanced

Algae Light Response and Cell Signaling

Cellular and Molecular Biology · Cell Physiology · Advanced

ALS Splicing Signature Discovery

Cellular and Molecular Biology · Molecular Biology · Advanced

Antibody Binding ML for HIV Antibody Maturation

Cellular and Molecular Biology · Cellular Immunology · Advanced

Autism Exon-Usage Analysis in Cortex RNA-Seq

Cellular and Molecular Biology · Neurobiology · Advanced

Autophagy Genes and Mammal Longevity

Cellular and Molecular Biology · Other · Advanced

B-Cell Affinity Maturation Trade-Offs

Cellular and Molecular Biology · Cellular Immunology · Advanced

Beetroot Membrane Leakage and Colorimetry

Cellular and Molecular Biology · Cell Physiology · Intermediate

Binaural Beats, EEG, and Working Memory

Cellular and Molecular Biology · Neurobiology · Advanced

Biofilm Growth on Surface Coatings

Cellular and Molecular Biology · Other · Intermediate

C. elegans Chemotaxis and Preservative Effects

Cellular and Molecular Biology · Neurobiology · Intermediate

C. elegans Epigenetic Inheritance Project

Cellular and Molecular Biology · Genetics · Advanced

Cancer Gene G-Quadruplex Motifs in Promoters

Cellular and Molecular Biology · Molecular Biology · Advanced

Cell-Free Gene Expression and Codon Usage

Cellular and Molecular Biology · Molecular Biology · Intermediate

Ciliate Ribosome Stalling and tRNA Evolution

Cellular and Molecular Biology · Molecular Biology · Advanced

Circular RNA Patterns in Colorectal Cancer Stages

Cellular and Molecular Biology · Molecular Biology · Advanced

Codon Optimization for Efficient Protein Expression

Cellular and Molecular Biology · Molecular Biology · Advanced

Contractile Vacuole Pumping in Pond Microbes

Cellular and Molecular Biology · Cell Physiology · Advanced

Cortical Microcolumn Simulation with Wnt and SHH

Cellular and Molecular Biology · Neurobiology · Advanced

CRISPR Guide Design with Structure and Chromatin

Cellular and Molecular Biology · Genetics · Advanced

Daphnia Heat-Shock Survival and HSP70 Memory

Cellular and Molecular Biology · Cell Physiology · Advanced

Design an AND-Gate Probiotic Biosensor

Cellular and Molecular Biology · Other · Advanced

DIY PCR and SNP Genotype Correlation Project

Cellular and Molecular Biology · Molecular Biology · Advanced

Drosophila Light Color and Sleep Rhythm Effects

Cellular and Molecular Biology · Neurobiology · Intermediate

Drosophila Wing Vein Genetics Across Two Cities

Cellular and Molecular Biology · Genetics · Advanced

Fetal Blood Cell Switching With RNA Velocity

Cellular and Molecular Biology · Other · Advanced

Finding Macrophage States in Long COVID scRNA-Seq

Cellular and Molecular Biology · Cellular Immunology · Advanced

Finding New Imprinted Genes in GTEx RNA-seq

Cellular and Molecular Biology · Genetics · Advanced

GWAS Meta-Analysis Across Ancestries

Cellular and Molecular Biology · Genetics · Advanced

In Silico Multi-Epitope Vaccine Design for Pathogens

Cellular and Molecular Biology · Cellular Immunology · Advanced

Measuring Cytoplasmic Streaming in Elodea Cells

Cellular and Molecular Biology · Cell Physiology · Intermediate

Minimal CD8 T Cell Marker Panel for Tumor Samples

Cellular and Molecular Biology · Cellular Immunology · Advanced

Mining Modifier SNPs in Mendelian Disease

Cellular and Molecular Biology · Genetics · Advanced

miRNA Target Prediction with CLIP-Seq Data

Cellular and Molecular Biology · Molecular Biology · Advanced

Mitochondrial Heteroplasmy and Age Estimation Models

Cellular and Molecular Biology · Genetics · Advanced

Mitochondrial Stress Assay With Redox Dyes

Cellular and Molecular Biology · Cell Physiology · Intermediate

ML Analysis of Urban 16S Soil and Water Data

Cellular and Molecular Biology · Other · Advanced

ML-Designed Riboswitches for Caffeine Sensing

Cellular and Molecular Biology · Molecular Biology · Advanced

Model Nanobody Affinity Maturation Strategies

Cellular and Molecular Biology · Cellular Immunology · Advanced

Model Tau Spread with Brain Connectome Data

Cellular and Molecular Biology · Neurobiology · Advanced

Mycoplasma Protein Allocation Under Nutrient Limits

Cellular and Molecular Biology · Other · Advanced

Onion Cell Plasmolysis and Water Loss

Cellular and Molecular Biology · Cell Physiology · Intermediate

Parkinson’s Drug Repurposing With Gene Signatures

Cellular and Molecular Biology · Neurobiology · Advanced

Planarian Memory Transfer After Regeneration

Cellular and Molecular Biology · Neurobiology · Advanced

Planarian Regeneration Under Caffeine, Melatonin, and Tea

Cellular and Molecular Biology · Cell Physiology · Advanced

Plant Immune Priming in Seedlings

Cellular and Molecular Biology · Cellular Immunology · Intermediate

Polygenic Risk Score Portability Across Ancestries

Cellular and Molecular Biology · Genetics · Advanced

Predicting Depression Treatment Response From Brain Scans

Cellular and Molecular Biology · Neurobiology · Advanced

Predicting Noncoding Variant Pathogenicity With ML

Cellular and Molecular Biology · Genetics · Advanced

Red Blood Cell Osmotic Fragility Project Ideas

Cellular and Molecular Biology · Cell Physiology · Advanced

Regulatory T Cell Genes in Allergic Rhinitis

Cellular and Molecular Biology · Cellular Immunology · Advanced

Sensory Genetics and Pedigree Reconstruction

Cellular and Molecular Biology · Genetics · Advanced

Shared Cytokine Storm Hub Target Discovery

Cellular and Molecular Biology · Cellular Immunology · Advanced

Single-Cell Fibroblast Stress Signatures

Cellular and Molecular Biology · Other · Advanced

Smartphone Pupillometry for Anxiety and Cognitive Load

Cellular and Molecular Biology · Neurobiology · Advanced

Snail Habituation With Nicotine And Herbal Teas

Cellular and Molecular Biology · Neurobiology · Intermediate

Stomatal Aperture Changes Under CO2 and Light

Cellular and Molecular Biology · Cell Physiology · Intermediate

Stress Granules in Yeast and Plants

Cellular and Molecular Biology · Other · Advanced

TCR-pMHC Binding Changes From Tumor Mutations

Cellular and Molecular Biology · Cellular Immunology · Advanced

Tetrahymena Phagocytosis Under Stress

Cellular and Molecular Biology · Cellular Immunology · Intermediate

Transcriptional Drift and Aging Clocks in Human Tissues

Cellular and Molecular Biology · Other · Advanced

Tumor Margin Spatial Transcriptomics Analysis

Cellular and Molecular Biology · Other · Advanced

Yeast Cell-Cycle Synchronization and Modeling

Cellular and Molecular Biology · Cell Physiology · Advanced

Yeast Colony Morphology With Deep Learning

Cellular and Molecular Biology · Other · Advanced

Yeast Fitness Screens and Buffering Networks

Cellular and Molecular Biology · Genetics · Advanced

Yeast Galactose Reporter Dose-Response Science Project

Cellular and Molecular Biology · Molecular Biology · Advanced

To discover more projects, visit the MehtA+ Science Fair Hub →