How to Do Real Cellular and Molecular Biology Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor.
Cellular and molecular biology used to live behind the doors of university labs with six-figure instruments. That world has cracked open. A high school student with a laptop, a USB microscope, and a yeast packet can now run experiments and analyze datasets that look a lot like graduate work.
This guide is your starting point. It walks you through three things: the affordable kit you can keep on a desk, the free software professional biologists actually use, and the public databases that store decades of real human and model-organism data.
Why this is possible now
Sequencing got cheap, and the data got public. Almost every major experiment a research lab runs ends up in a free archive like NCBI GEO, GTEx, or the Human Cell Atlas. You can download the same files a professor downloads, on the same day.
Compute got free. Google Colab's free tier gives you a GPU in your browser, subject to usage limits. You can fold a protein with AlphaFold, run molecular dynamics in OpenMM, or train a deep-learning model on single-cell data without owning a graphics card.
Imaging got pocket-sized. A 30 dollar USB microscope plugged into a laptop, or a phone clipped to a cheap adapter, gives you frame rates and resolution that were lab-grade 20 years ago. Pair that with open computer vision libraries and you have a quantitative imaging pipeline.
Put it together and a kitchen counter plus a laptop can now culture model organisms, image live cells, mine terabytes of human sequencing data, and simulate a protein, all in one project.
The cellular and molecular biology home kit
You do not need a wet lab. You need a clean surface, a few model organisms, a small microscope, and some cheap dyes.
Model organisms you can keep at home
- Baker's or brewer's yeast (a few dollars at the grocery store).
- Drosophila fruit flies (wild-caught with a banana trap, or ordered cheap).
- Daphnia water fleas (from a pet store aquarium section).
- Planaria (pond water, or 10 to 20 dollars from a biological supplier).
- Tetrahymena and Paramecium (pond water or a 20 dollar culture).
- C. elegans (low-cost from the Caenorhabditis Genetics Center, which generally requires a lab or mentor sponsor and may charge a small per-strain fee).
- Sprouted seeds, onion bulbs, Lemna duckweed, Elodea (under 10 dollars total).
Imaging hardware
- USB digital microscope, 30 to 100 dollars.
- Smartphone microscope adapter, 15 to 30 dollars.
- Foldscope paper microscope, about 1.50 dollars.
- A tripod or phone clamp for steady time-lapse.
Dyes, reagents, and consumables
- Methylene blue, neutral red, and trypan blue (a few dollars each, classic viability and uptake stains).
- Food dyes as cheap uptake tracers.
- Agar plates and petri dishes from a biology supplier, around 20 dollars for a starter pack.
- Resazurin powder for metabolic readouts.
Affordable kits and consumables
- Carolina or Bio-Rad Explorer wet-lab kits, 10 to 50 dollars each (electrophoresis, transformation, PCR-by-mail).
- Homemade agarose gel rig with a baking-soda buffer and a 9V battery setup.
- Mail-in Sanger sequencing, around 5 to 15 dollars per sample.
Optional, project-dependent
- Raspberry Pi for automated time-lapse rigs, around 40 dollars.
- A consumer EEG headband for neurobiology-leaning projects, 100 to 250 dollars secondhand.
A full starter setup lands between 100 and 300 dollars, and most projects use only a fraction of that.
Signature technique: smartphone microscopy and video analysis
The technique that unlocks the most projects in this category is quantitative smartphone microscopy. You film living cells through a cheap lens, then you let software count, track, and measure for you.
- Mount your phone or USB microscope over a slide containing your model organism (yeast, Paramecium, onion epidermis, planaria). Use a desk lamp on white paper for even backlight.
- Record a short video or time-lapse. Keep the focus and exposure locked so frame-to-frame comparisons stay valid.
- Drop the video into Fiji (ImageJ) or napari. Use plugins like TrackMate for cell tracking, or Cellpose and StarDist for segmenting individual cells.
- Export the per-cell measurements (area, velocity, intensity, count) as a CSV.
- Open the CSV in a Colab notebook with pandas, then use SciPy or scikit-learn to fit a kinetic, dose-response, or Arrhenius model to your data.
That five-step loop, video to CSV to model, is the backbone of dozens of projects in this guide.
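The last two steps of that loop (CSV in, fitted model out) can be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed pipeline: the column names `temp_c` and `rate_per_min` and every number in the sample CSV are made-up placeholders standing in for whatever your image-analysis tool exports. It uses only the Python standard library so it runs anywhere; in Colab you would more likely load the file with pandas.

```python
import csv
import io
import math

R_GAS = 8.314  # gas constant, J/(mol*K)

# Placeholder data standing in for an exported measurements CSV.
# Column names and values are invented for illustration only.
SAMPLE_CSV = """temp_c,rate_per_min
15,0.8
20,1.2
25,1.7
30,2.5
35,3.4
"""

def load_rates(csv_text):
    """Return (temperature in kelvin, rate) pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row["temp_c"]) + 273.15, float(row["rate_per_min"]))
            for row in reader]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_arrhenius(points):
    """Fit ln(rate) = ln(A) - Ea / (R * T); return (Ea in J/mol, A)."""
    inv_t = [1.0 / t for t, _ in points]
    ln_rate = [math.log(rate) for _, rate in points]
    slope, intercept = fit_line(inv_t, ln_rate)
    return -slope * R_GAS, math.exp(intercept)

points = load_rates(SAMPLE_CSV)
ea, prefactor = fit_arrhenius(points)
print(f"Apparent activation energy: {ea / 1000:.1f} kJ/mol")
```

Linearizing the Arrhenius equation keeps the fit to a straight line of ln(rate) against 1/T, which is easy to plot and sanity-check by eye. Rates must be positive for the logarithm to be defined, so drop or handle zero measurements before fitting.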
The dry-lab side: free software you can install today
Cell and molecular biology is half wet, half computational. The computational half is entirely free.
Sequence and structure
- UniProt and NCBI BLAST: look up any protein and find related sequences.
- PyMOL and ChimeraX: view and annotate 3D protein structures.
- AlphaFold (via Colab or AlphaFold DB): predict a protein structure from a sequence in Colab, or look up a precomputed prediction in AlphaFold DB.
- Boltz and OpenFold: newer open-source structure predictors with confidence scoring.
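As a rough illustration of the lookup step, the sketch below builds a UniProt download URL for one accession and parses FASTA text into headers and sequences. Treat the details as assumptions: the URL pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta` should be checked against the current UniProt documentation, the accession P69905 (human hemoglobin subunit alpha) is only an example, and the FASTA record is a toy placeholder so the code runs without a network connection.

```python
from urllib.parse import quote

def uniprot_fasta_url(accession):
    """Build a FASTA download URL for one UniProt accession (assumed URL pattern)."""
    return f"https://rest.uniprot.org/uniprotkb/{quote(accession)}.fasta"

def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def residue_counts(sequence):
    """Count how often each residue letter appears in a sequence."""
    counts = {}
    for residue in sequence:
        counts[residue] = counts.get(residue, 0) + 1
    return counts

# Toy placeholder record for illustration, not a real UniProt entry.
EXAMPLE_FASTA = """>example|TOY1|Toy peptide for illustration
MKTAYIAKQR
QISFVKSHFS
"""

print(uniprot_fasta_url("P69905"))
for header, seq in parse_fasta(EXAMPLE_FASTA):
    print(header, len(seq), residue_counts(seq))
```

In a real notebook you would fetch the URL, pass the response text to `parse_fasta`, and then hand the sequence to BLAST or a structure predictor.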
Docking and dynamics
- AutoDock Vina: dock small molecules into protein pockets.
- GROMACS and OpenMM: run molecular dynamics simulations, free on Colab GPUs.
Genomics and transcriptomics
- DESeq2, edgeR, and limma (R): differential gene expression from RNA-seq counts.
- scanpy (Python) and Seurat (R): single-cell RNA-seq analysis.
- scVelo: RNA velocity for developmental trajectories.
- PLINK and Hail: population genetics and GWAS analysis.
- MAGMA and FUMA: gene-level GWAS interpretation.
- Ensembl VEP and AlphaMissense: variant annotation and pathogenicity scoring.
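To give a feel for what a differential expression tool computes, here is a deliberately simplified sketch: a log2 fold change and a Welch t-statistic per gene. It is not the method used by DESeq2, edgeR, or limma, which fit more careful statistical models, and the gene names and expression values are invented placeholders that are assumed to be already normalized for sequencing depth.

```python
import math
from statistics import mean, variance

# Invented toy data: gene -> (control replicates, treated replicates).
# Values are assumed to be already normalized for sequencing depth.
TOY_EXPRESSION = {
    "geneA": ([100.0, 110.0, 90.0], [400.0, 380.0, 420.0]),
    "geneB": ([50.0, 55.0, 45.0], [52.0, 48.0, 50.0]),
    "geneC": ([200.0, 210.0, 190.0], [60.0, 50.0, 55.0]),
}

def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of group means; the pseudocount avoids dividing by zero."""
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))

def welch_t(control, treated):
    """Welch's t-statistic (unequal variances).

    Needs at least two replicates per group and nonzero variance
    in at least one group.
    """
    standard_error = math.sqrt(
        variance(control) / len(control) + variance(treated) / len(treated)
    )
    return (mean(treated) - mean(control)) / standard_error

def summarize(table):
    """Return {gene: (log2 fold change, Welch t)} for every gene in the table."""
    return {
        gene: (log2_fold_change(ctrl, trt), welch_t(ctrl, trt))
        for gene, (ctrl, trt) in table.items()
    }

for gene, (lfc, t_stat) in summarize(TOY_EXPRESSION).items():
    print(f"{gene}: log2FC = {lfc:+.2f}, t = {t_stat:+.1f}")
```

With three replicates per group a t-statistic is very noisy, which is one reason the dedicated packages borrow information across genes rather than testing each gene in isolation.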
Machine learning for biology
- scikit-learn and PyTorch: general ML.
- ESM2 and ProtBERT: pretrained protein language models you can fine-tune.
- RNA-FM: pretrained RNA language model.
Imaging
- Fiji (ImageJ): the standard for scientific image analysis.
- CellProfiler: pipeline-based cell quantification.
- napari: modern Python image viewer.
- DeepLabCut: tracks animals or cells in video with deep learning.
Running these tools yourself changes how research feels, because you stop asking "what did the paper find" and start asking "what happens if I rerun this differently."
Public databases that count as real data
Re-analyzing a public dataset is real research. Many of the strongest projects in this category never touch a pipette.
Bulk and single-cell transcriptomics
- NCBI GEO: hundreds of thousands of expression studies.
- GTEx: human gene expression across 50-plus tissues.
- Expression Atlas: curated expression data across species.
- Human Cell Atlas and Single Cell Portal: single-cell atlases.
- Tabula Sapiens and Tabula Muris: whole-organism single-cell references for human and mouse.
Cancer and disease
- TCGA (via cBioPortal): tumor genomics across cancer types.
- DepMap, GDSC, and CCLE: cancer cell-line dependencies and drug responses.
- PsychENCODE: brain transcriptomics for psychiatric disease.
- LINCS L1000: drug-induced expression signatures.
Human variation
- gnomAD: population allele frequencies for variants observed across hundreds of thousands of sequenced people.
- ClinVar: clinical interpretations of variants.
- GWAS Catalog: curated associations and summary statistics from thousands of genome-wide association studies.
- 1000 Genomes: full sequencing data from a global reference cohort.
Immunology
- ImmPort, ImmuneSpace, and iReceptor: human immunology cohorts and BCR or TCR repertoires.
- IEDB: peptide-MHC binding data.
- SAbDab: antibody structures.
Structures and pathways
- PDB and AlphaFold DB: experimental and predicted protein structures.
- Pfam, InterPro, Ensembl, RefSeq: sequence and domain references.
- miRBase, TargetScan, and Rfam: RNA references.
- KEGG, Reactome, STRING, BioGRID, and WikiPathways: pathways and protein-protein interactions.
Neuroscience and imaging
- Allen Brain Atlas: gene expression and connectivity in the brain.
- OpenNeuro: shared neuroimaging datasets.
- CeNGEN: single-neuron transcriptomes in C. elegans.
Re-analysis of public data is a legitimate research path, and reviewers know it.
How to combine wet and dry: the strongest project shape
Pattern A: home experiment, public-data interpretation. Run a controlled experiment on a home model organism (yeast under heat stress, planaria regenerating after a drug exposure, onion cells under osmotic stress). Then pull the matching public transcriptome or single-cell atlas and use it to predict which genes or pathways drive what you measured. The wet side gives you phenotype, the dry side gives you mechanism.
Pattern B: public-data discovery, home validation. Mine a public dataset for a new signal (a stress-resilient cell subtype, a candidate riboswitch, a regulatory variant). Then design a tiny home assay (a colorimetric leak test, a chemotaxis assay, a yeast reporter) that probes one prediction from your computational hit. The dry side gives you the hypothesis, the wet side gives you a tangible test.
Judges respond to this hybrid shape because it shows you can both generate and interpret data, which is what working biologists actually do.
Choosing a phenomenon that has not been done
- Take your draft research question and search it in Google Scholar with quotes around the most specific phrase. Filter to the last five years and read the top 10 abstracts.
- Search the Society for Science abstracts archive for the same keywords across past ISEF finalists.
- Search PubMed for review articles on the broader topic, and skim one recent review. Reviews tell you what the field considers solved and what it considers open.
If you find adjacent prior work, that is good news, not bad news. It tells you the question is real, and it points you to the angle nobody has tried yet.
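The PubMed review search can also be scripted. The sketch below only builds a query URL for NCBI's E-utilities ESearch endpoint and does not send it. The search phrase is a hypothetical example, the `[Title/Abstract]` and `[Publication Type]` field tags follow PubMed's search syntax, and the parameter names should be confirmed against the current E-utilities documentation before you rely on them.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_review_query(phrase, this_year, years_back=5, max_results=10):
    """Build an ESearch URL for recent review articles matching an exact phrase."""
    # Quotes force an exact-phrase match; the second clause keeps reviews only.
    term = f'"{phrase}"[Title/Abstract] AND review[Publication Type]'
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",  # filter on publication date
        "mindate": str(this_year - years_back),
        "maxdate": str(this_year),
        "retmax": str(max_results),
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"

# Hypothetical research phrase, used only to show the shape of the URL.
url = pubmed_review_query("planarian regeneration", this_year=2025)
print(url)
```

Pasting the printed URL into a browser should return a JSON list of PubMed IDs, which you can then open one at a time. If you script repeated requests, keep them slow and infrequent to respect NCBI's usage guidelines.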
A realistic timeline
- 1 to 2 weeks: A focused replication or single measurement, like betalain leakage from beetroot across temperatures, or a re-analysis of one GEO dataset.
- 1 to 2 months: A full hybrid project ready for a regional fair, with a clear wet experiment, a public-data layer, and a fitted model.
- Full year: An ISEF-track project with multiple model organisms or datasets, replicate runs, statistical robustness checks, and an original computational contribution.
If this is your first research project, start with the 1 to 2 week version. Finishing a small project teaches you more than planning a big one.
A starter checklist
- A clean, well-lit workspace with a phone or USB microscope mount.
- A free Google account with Colab access (and Colab GPU enabled for at least one test notebook).
- A local Python environment with pandas, numpy, scikit-learn, scanpy, and biopython installed.
- Fiji (ImageJ) and PyMOL or ChimeraX installed on your laptop.
- A bound lab notebook or a dated digital notebook for daily entries.
- One model organism alive on your desk (yeast, planaria, Daphnia, Lemna, or sprouted seeds).
- A one-sentence written research question and the one public database you expect to use.
Once those are in place, you are ready to pick a phenomenon and start collecting data.
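One quick way to confirm the Python part of the checklist is a short script that reports which packages can be found in the current environment. This is a minimal sketch: the mapping below pairs each package name with the module name it is imported under (scikit-learn imports as `sklearn`, biopython as `Bio`), and it only checks that a module can be located, not that it works.

```python
import importlib.util

# Package name on the checklist -> module name used in "import ...".
CHECKLIST_MODULES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "scanpy": "scanpy",
    "biopython": "Bio",
}

def check_environment(modules=CHECKLIST_MODULES):
    """Return {package name: True/False} for whether each module can be found."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in modules.items()
    }

for package, found in check_environment().items():
    print(f"{package:<14}{'ok' if found else 'MISSING'}")
```

Any package reported as MISSING can usually be installed with pip or conda, depending on how the environment was set up.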
Where to go next
Cellular and Molecular Biology has six ISEF subcategories, counting the catch-all Other. Each one has its own MehtA+ project guide that uses the kit on this page, so pick the subcategory that grabs you most.
- Cell Physiology (PHY): how cells move ions, water, and signals, and how they respond to stress. Strong home-experiment fit with yeast, plant cells, and protists.
- Cellular Immunology (IMM): how immune cells recognize threats. Heavy on public scRNA-seq, repertoire data, and in silico vaccine and antibody design.
- Genetics (GEN): variation, inheritance, and the link from DNA to trait. Built around gnomAD, ClinVar, GWAS Catalog, and small model-organism crosses.
- Molecular Biology (MOL): DNA, RNA, and protein at the molecular level. Mixes home PCR or cell-free kits with ribosome profiling, RNA structure, and ML on sequence.
- Neurobiology (NEU): how neurons, circuits, and brains work. Combines behavioral assays in C. elegans, planaria, snails, and flies with public neuroimaging and brain atlases.
- Other (OTH): cross-cutting projects, synthetic biology, microbiome work, aging, and whole-cell modeling that span the boundaries above.
A kitchen counter and a laptop can now do what a small lab used to. Pick a subcategory, pick a phenomenon, and start.
Project ideas in this category (66)
- Molecular Biology · Advanced
- Algae Light Response and Cell Signaling (Cell Physiology · Advanced)
- ALS Splicing Signature Discovery (Molecular Biology · Advanced)
- Antibody Binding ML for HIV Antibody Maturation (Cellular Immunology · Advanced)
- Autism Exon-Usage Analysis in Cortex RNA-Seq (Neurobiology · Advanced)
- Autophagy Genes and Mammal Longevity (Other · Advanced)
- B-Cell Affinity Maturation Trade-Offs (Cellular Immunology · Advanced)
- Beetroot Membrane Leakage and Colorimetry (Cell Physiology · Intermediate)
- Binaural Beats, EEG, and Working Memory (Neurobiology · Advanced)
- Biofilm Growth on Surface Coatings (Other · Intermediate)
- C. elegans Chemotaxis and Preservative Effects (Neurobiology · Intermediate)
- C. elegans Epigenetic Inheritance Project (Genetics · Advanced)
- Cancer Gene G-Quadruplex Motifs in Promoters (Molecular Biology · Advanced)
- Cell-Free Gene Expression and Codon Usage (Molecular Biology · Intermediate)
- Ciliate Ribosome Stalling and tRNA Evolution (Molecular Biology · Advanced)
- Circular RNA Patterns in Colorectal Cancer Stages (Molecular Biology · Advanced)
- Codon Optimization for Efficient Protein Expression (Molecular Biology · Advanced)
- Contractile Vacuole Pumping in Pond Microbes (Cell Physiology · Advanced)
- Cortical Microcolumn Simulation with Wnt and SHH (Neurobiology · Advanced)
- CRISPR Guide Design with Structure and Chromatin (Genetics · Advanced)
- Daphnia Heat-Shock Survival and HSP70 Memory (Cell Physiology · Advanced)
- Design an AND-Gate Probiotic Biosensor (Other · Advanced)
- DIY PCR and SNP Genotype Correlation Project (Molecular Biology · Advanced)
- Drosophila Light Color and Sleep Rhythm Effects (Neurobiology · Intermediate)
- Drosophila Wing Vein Genetics Across Two Cities (Genetics · Advanced)
- Fetal Blood Cell Switching With RNA Velocity (Other · Advanced)
- Finding Macrophage States in Long COVID scRNA-Seq (Cellular Immunology · Advanced)
- Finding New Imprinted Genes in GTEx RNA-seq (Genetics · Advanced)
- GWAS Meta-Analysis Across Ancestries (Genetics · Advanced)
- In Silico Multi-Epitope Vaccine Design for Pathogens (Cellular Immunology · Advanced)
- Measuring Cytoplasmic Streaming in Elodea Cells (Cell Physiology · Intermediate)
- Minimal CD8 T Cell Marker Panel for Tumor Samples (Cellular Immunology · Advanced)
- Mining Modifier SNPs in Mendelian Disease (Genetics · Advanced)
- miRNA Target Prediction with CLIP-Seq Data (Molecular Biology · Advanced)
- Mitochondrial Heteroplasmy and Age Estimation Models (Genetics · Advanced)
- Mitochondrial Stress Assay With Redox Dyes (Cell Physiology · Intermediate)
- ML Analysis of Urban 16S Soil and Water Data (Other · Advanced)
- ML-Designed Riboswitches for Caffeine Sensing (Molecular Biology · Advanced)
- Model Nanobody Affinity Maturation Strategies (Cellular Immunology · Advanced)
- Model Tau Spread with Brain Connectome Data (Neurobiology · Advanced)
- Mycoplasma Protein Allocation Under Nutrient Limits (Other · Advanced)
- Onion Cell Plasmolysis and Water Loss (Cell Physiology · Intermediate)
- Parkinson’s Drug Repurposing With Gene Signatures (Neurobiology · Advanced)
- Planarian Memory Transfer After Regeneration (Neurobiology · Advanced)
- Planarian Regeneration Under Caffeine, Melatonin, and Tea (Cell Physiology · Advanced)
- Plant Immune Priming in Seedlings (Cellular Immunology · Intermediate)
- Polygenic Risk Score Portability Across Ancestries (Genetics · Advanced)
- Predicting Depression Treatment Response From Brain Scans (Neurobiology · Advanced)
- Predicting Noncoding Variant Pathogenicity With ML (Genetics · Advanced)
- Red Blood Cell Osmotic Fragility Project Ideas (Cell Physiology · Advanced)
- Regulatory T Cell Genes in Allergic Rhinitis (Cellular Immunology · Advanced)
- Sensory Genetics and Pedigree Reconstruction (Genetics · Advanced)
- Shared Cytokine Storm Hub Target Discovery (Cellular Immunology · Advanced)
- Single-Cell Fibroblast Stress Signatures (Other · Advanced)
- Smartphone Pupillometry for Anxiety and Cognitive Load (Neurobiology · Advanced)
- Snail Habituation With Nicotine And Herbal Teas (Neurobiology · Intermediate)
- Stomatal Aperture Changes Under CO2 and Light (Cell Physiology · Intermediate)
- Stress Granules in Yeast and Plants (Other · Advanced)
- TCR-pMHC Binding Changes From Tumor Mutations (Cellular Immunology · Advanced)
- Tetrahymena Phagocytosis Under Stress (Cellular Immunology · Intermediate)
- Transcriptional Drift and Aging Clocks in Human Tissues (Other · Advanced)
- Tumor Margin Spatial Transcriptomics Analysis (Other · Advanced)
- Yeast Cell-Cycle Synchronization and Modeling (Cell Physiology · Advanced)
- Yeast Colony Morphology With Deep Learning (Other · Advanced)
- Yeast Fitness Screens and Buffering Networks (Genetics · Advanced)
- Yeast Galactose Reporter Dose-Response Science Project (Molecular Biology · Advanced)
