How to Do Real Biochemistry Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Biochemistry used to mean a sterile university lab, a tenured PI, and a five-figure instrument budget. That barrier is gone. You can now run real enzyme kinetics on your kitchen counter and dock real drug candidates against real protein structures from your bedroom.

This guide is your starting point. It covers three things: the affordable home kit you can buy this week, the free professional software you can install today, and the public databases that already hold decades of high-quality data waiting to be re-analyzed.

Why This Is Possible Now

Three shifts in the last decade rewrote what a high school student can do alone.

Structures became free and nearly complete. The AlphaFold Protein Structure Database released predicted structures for over 200 million proteins, covering nearly every catalogued protein sequence. The Protein Data Bank holds over 200,000 experimental structures. You no longer need a crystallography lab to study a protein. You need a laptop and curiosity.

Compute became free and serious. Google Colab's free tier offers GPU access (subject to usage limits) strong enough to run short molecular dynamics simulations, train a graph neural network, or fold a small protein. The same software the pharmaceutical industry uses (GROMACS, OpenMM, AutoDock Vina, RDKit) is open source and free to install.

Your phone became a spectrophotometer. A modern smartphone camera plus a free app can measure color with enough precision to quantify enzymes, sugars, polyphenols, and dyes. Add a $5 Arduino and a photodiode and you have a three-wavelength spectrometer for the price of lunch.

A kitchen counter plus a laptop now does what a small university lab did 15 years ago.

The Biochemistry Home Kit

You can buy a complete starter kit for the price of a textbook. Here is what to gather, grouped by purpose.

Enzyme sources (cheap and biological)

  • Fresh potato or beef liver as a source of catalase.
  • Pineapple juice for bromelain, papaya for papain, fig sap for ficin.
  • Store-bought lactase drops (e.g., Lactaid) for hydrolyzing lactose.
  • Baker's yeast and brewer's yeast for fermentation and glycolysis experiments.
  • A kombucha SCOBY as a living, self-replenishing biochemistry platform.
  • Sprouted seeds (mung, wheat) as a rich source of amylase and protease.

Reagents and indicator chemistry ($10 to $40 total)

  • pH strips covering the 0 to 14 range.
  • Iodine solution for starch detection.
  • Benedict's reagent for reducing sugars and Biuret reagent for proteins.
  • DNS (dinitrosalicylic acid) reagent for reducing-sugar quantification.
  • DPPH or ABTS kits (around $30) for antioxidant assays.
  • Griess reagent for nitrate/nitrite quantification.
  • Folin-Ciocalteu reagent for polyphenol assays.

Hardware (under $80 total)

  • Smartphone with a recent camera (you already own this).
  • Arduino Uno or Nano, a photodiode, and three LEDs (red, green, blue) to build a DIY spectrophotometer.
  • A sous-vide cooker or rice cooker for stable temperature control during isothermal assays.
  • A consumer USB microscope (optional but useful for biofilm and crystal work).
  • TLC plates at roughly $1 each, plus filter paper for chromatography.
  • A DIY gel-electrophoresis box (acrylic, alligator clips, a 9V battery rig) with agarose.

Glassware and lab discipline

  • A set of disposable plastic cuvettes.
  • Plastic pipettes or a $15 micropipette set.
  • A lab notebook (paper or digital, but one you commit to).

Total realistic cost to assemble the full kit: $100 to $200.

The Signature Technique: Smartphone Colorimetry

This is the single move that unlocks the most projects. A color change is a concentration measurement if you treat the phone correctly. Here is the five-step workflow you will reuse across half the projects in this category.

  1. Build a light box. A shoebox lined with white paper, with a single hole for the phone camera and a constant LED light source, removes the biggest source of error: ambient light.
  2. Prepare a standard curve. Make a serial dilution of your analyte at five to seven known concentrations. Photograph each one inside the light box, same distance, same exposure, same white-balance lock.
  3. Extract RGB values. Use a free image-analysis tool like ImageJ or a Python script with Pillow to pull the average R, G, and B values from a fixed region of each photo.
  4. Fit your calibration. Convert each color-channel reading to absorbance using A = -log10(I/I0), where I is the sample's channel intensity and I0 is the same channel's intensity for the blank. Fit a line (Beer-Lambert) or train a small regression model if the chemistry is nonlinear.
  5. Validate. Run a blind sample at a known concentration and check that your model predicts within 10 percent. If it does not, the most likely culprit is inconsistent lighting, not the chemistry.
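
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a finished pipeline: the intensity numbers are made up to show the arithmetic, the function names are our own, and the choice of the green channel is an assumption you should revisit for your own dye. The only third-party piece is Pillow, which is imported inside mean_rgb so the rest of the sketch runs without it.

```python
import math


def mean_rgb(image_path, box):
    """Average R, G, B inside a fixed crop box (left, upper, right, lower).

    Needs Pillow (pip install pillow). Imported here so the calibration
    math below still runs on a machine without it.
    """
    from PIL import Image, ImageStat

    with Image.open(image_path) as img:
        region = img.convert("RGB").crop(box)
        return tuple(ImageStat.Stat(region).mean)


def absorbance(sample_intensity, blank_intensity):
    """A = -log10(I / I0), with I0 taken from the blank photo."""
    return -math.log10(sample_intensity / blank_intensity)


def fit_line(xs, ys):
    """Ordinary least-squares line: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_concentration(a, slope, intercept):
    """Invert the calibration line to turn an absorbance into a concentration."""
    return (a - intercept) / slope


def within_tolerance(predicted, known, tolerance=0.10):
    """Step 5: is the prediction within 10 percent of the known value?"""
    return abs(predicted - known) / known <= tolerance


# Made-up green-channel readings (0 to 255 scale) for illustration only.
# In practice each number would come from mean_rgb(photo_path, box)[1].
blank_green = 200.0
standards = {1.0: 158.9, 2.0: 126.2, 4.0: 79.6, 8.0: 31.7}  # concentration -> intensity

concentrations = list(standards)
absorbances = [absorbance(i, blank_green) for i in standards.values()]
slope, intercept = fit_line(concentrations, absorbances)

# Blind sample prepared at a known concentration of 3.0 (same units as the standards).
estimate = predict_concentration(absorbance(100.0, blank_green), slope, intercept)
print(f"slope={slope:.4f} intercept={intercept:.4f} estimate={estimate:.2f}")
print("passes 10% check:", within_tolerance(estimate, 3.0))
```

Keeping the crop box fixed across every photo matters more than the fitting method. If the validation check fails, re-photograph before you touch the math.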

Once this workflow is solid, swap the chemistry in and out: Griess for nitrates, Folin-Ciocalteu for polyphenols, DNS for reducing sugars, Maillard for browning kinetics, anthocyanins for pH.

The Dry-Lab Side: Free Software You Can Install Today

Every tool below is what working biochemists use. Every one is free.

Structure viewing and analysis

  • PyMOL (educational license): the most widely used molecular viewer. Rotate, color, measure, and produce publication-ready figures.
  • ChimeraX: a more modern viewer with excellent surface and electrostatics rendering.
  • Fpocket: detects binding pockets, including allosteric ones, on any protein structure.

Docking

  • AutoDock Vina: the standard free docking engine. Predicts how a small molecule binds to a protein.
  • Smina: a Vina fork with better scoring options.
  • GNINA: adds a convolutional neural network for re-scoring poses.
  • DiffDock: a diffusion-model docking tool that handles flexible binding.

Molecular dynamics

  • GROMACS: the workhorse for simulating proteins, lipids, and complexes over nanoseconds to microseconds.
  • OpenMM: a Python API on top of a compiled, GPU-accelerated engine. Runs well on a free Colab GPU.
  • NAMD: scalable MD with extensive tutorials.
  • PLUMED: adds enhanced sampling methods like metadynamics on top of GROMACS or OpenMM.

Cheminformatics and machine learning

  • RDKit: the standard library for handling molecules in Python.
  • scikit-learn and PyTorch: general ML and deep learning.
  • DeepChem: ML built specifically for chemistry.
  • ESM2 and ProtBERT: protein language models you can call in a notebook.
  • SwissADME (web tool): drug-likeness and ADMET prediction.
  • ViennaRNA: RNA secondary-structure folding and ΔG calculation.

Running professional tools yourself changes how research feels. You are no longer reading about science. You are doing it.

Public Databases That Count as Real Data

Re-analysis of public data is a legitimate research path on its own. Judges respect it because the datasets are large, curated, and often tied to peer-reviewed studies.

Protein structures and sequences

  • PDB (Protein Data Bank): experimental 3D structures.
  • AlphaFold DB: predicted structures for over 200 million proteins.
  • UniProt: protein sequences with functional annotation.
  • Pfam and InterPro: protein family and domain databases.

Small molecules and drugs

  • PubChem: over 100 million compounds with structure and bioactivity.
  • ChEMBL: curated bioactivity data, ideal for ML training.
  • DrugBank: FDA-approved drugs with targets and mechanisms.
  • ZINC: purchasable compounds for virtual screening.
  • COCONUT: natural-product structures.
  • IMPPAT and TCMSP: Ayurvedic and traditional Chinese medicine compound databases.

Sequences, expression, and metabolism

  • NCBI and Ensembl: genome and transcript sequences.
  • KEGG and MetaCyc: metabolic pathways and reactions.
  • GEO and Expression Atlas: gene expression datasets.
  • TCGA: cancer genomics and expression data.

Metabolomics

  • HMDB (Human Metabolome Database): a comprehensive catalog of known human metabolites.
  • MetaboLights: raw metabolomics studies you can reanalyze.

Pulling a dataset from one of these and asking a new question of it is real science.
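
As a small taste of what "pulling a dataset" looks like, here is a sketch that downloads one protein sequence from UniProt and computes its amino acid composition using only the Python standard library. The URL pattern is UniProt's public REST endpoint for FASTA downloads as we understand it; check it against the current UniProt help pages before you depend on it. The function names and the demo peptide are our own inventions for illustration.

```python
from collections import Counter
from urllib.request import urlopen

# Assumed URL pattern for UniProt's REST FASTA download; verify before relying on it.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fetch_fasta(accession, timeout=30):
    """Download one UniProt entry as FASTA text (needs internet access)."""
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA string."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    header = lines[0][1:]
    sequence_lines = []
    for line in lines[1:]:
        if line.startswith(">"):  # stop at the next record
            break
        sequence_lines.append(line)
    return header, "".join(sequence_lines)


def composition(sequence):
    """Fraction of each amino acid letter in the sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return {aa: counts[aa] / total for aa in sorted(counts)}


# Offline demo with a made-up peptide, so the sketch runs without a network.
demo_text = ">demo|made-up peptide\nMKTAYIAK\nQRQISFVK\n"
header, sequence = parse_fasta(demo_text)
fractions = composition(sequence)
print(header, len(sequence), fractions)

# With a connection, the same two calls work on a real entry, for example:
# header, sequence = parse_fasta(fetch_fasta("P04637"))  # human p53
```

From here, a "new question" might be as simple as comparing composition across the same protein in ten species, then asking why one residue stands out.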

How to Combine Wet and Dry: The Strongest Project Shape

The most defensible projects bridge a hands-on measurement and a computational analysis. There are two reliable patterns.

Pattern A: Measure something at home, then explain it computationally. Run an enzyme kinetics experiment, get your Michaelis-Menten parameters, then dock candidate inhibitors against the same enzyme's AlphaFold structure and show which residues your data implicates. Your wet measurement constrains your computational story.

Pattern B: Predict something computationally, then validate one prediction in the kitchen. Screen a public compound library against a target in silico, pick the top three candidates that happen to be available as kitchen ingredients or cheap supplements, and test one prediction with a simple assay. Your computation guides your experiment.

Judges respond to this hybrid shape because it shows you can connect a measurement to a mechanism, which is what real biochemistry research does.
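
For Pattern A, the step from raw rates to Michaelis-Menten parameters can be sketched as follows. This version uses the Hanes-Woolf linearization ([S]/v plotted against [S]) because it needs nothing beyond the standard library; it is a teaching sketch, and a nonlinear least-squares fit is generally the better choice for a final analysis. The rate values are made up to illustrate the arithmetic, and the function names are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Rate predicted by v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, Km) from the Hanes-Woolf line.

    Rearranging Michaelis-Menten gives [S]/v = [S]/Vmax + Km/Vmax,
    so a straight-line fit of [S]/v against [S] has
    slope = 1/Vmax and intercept = Km/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept / slope
    return vmax, km


# Made-up initial rates for illustration (arbitrary units).
substrate_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
initial_rates = [2.0, 3.33, 5.0, 6.67, 8.0]

vmax, km = hanes_woolf_fit(substrate_conc, initial_rates)
print(f"Vmax ~ {vmax:.2f}, Km ~ {km:.2f}")
print(f"predicted rate at [S]=3: {michaelis_menten(3.0, vmax, km):.2f}")
```

Run the fit once without an inhibitor and once with it. How Vmax and Km shift between the two runs is the measurement your docking story then has to explain.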

Choosing a Phenomenon That Has Not Been Done

Novelty is a process, not a guess. Run these three checks before you commit to a project.

  1. Google Scholar. Search your candidate phrase in quotes plus a year filter for the last five years. Skim the top 20 results. If your exact study exists, narrow your angle (a different enzyme, a different condition, a different organism, a different model).
  2. Society for Science abstracts archive. Search the public ISEF and Regeneron STS abstract databases for your keywords. This tells you what other high school students have already done.
  3. PubMed. Search for review articles on your topic. Reviews tell you the frontier and, more usefully, the gaps the field admits it has.

If you find adjacent prior work, that is good news, not bad news. It means your topic is real, and you now know exactly where the unanswered question lives.

A Realistic Timeline

  • 1 to 2 weeks (focused replication or measurement). Pick one assay, build the light box, run a calibration curve, and write it up cleanly. This is a strong middle-school-to-9th-grade entry.
  • 1 to 2 months (full hybrid project for regional fair). One wet-lab measurement plus one computational analysis, tied together with a clear hypothesis and statistics.
  • Full year (ISEF-track project). A multi-experiment hybrid pipeline with a novel computational contribution, careful controls, and a written paper ready for review.

First-time researchers should start with the 1 to 2 week version. You learn more from finishing one small thing than from half-finishing a big thing.

A Starter Checklist

Before you pick a specific phenomenon, set these up. They take an afternoon.

  1. A clean, well-lit workspace with a flat surface and a sink nearby.
  2. A free Google account with Colab opened at least once and the runtime type switched to GPU (this confirms you can get GPU access).
  3. A local Python environment (Anaconda or Miniconda) with RDKit, scikit-learn, NumPy, pandas, and Pillow installed.
  4. PyMOL or ChimeraX installed and a test PDB structure loaded.
  5. AutoDock Vina installed and the included tutorial run end to end.
  6. A lab notebook (paper or a dated Google Doc) with the date and a one-line goal on the first page.
  7. A single written sentence: "My research question is whether X affects Y, measured by Z."

If you have all seven, you are ready.
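
A quick way to confirm item 3 is to ask Python which of the packages it can actually find. The sketch below only checks that each package is importable, not that it works correctly, and the function name is our own. Note that two packages import under a different name than they install under.

```python
from importlib.util import find_spec

# Install name -> import name. Two differ: scikit-learn imports as
# "sklearn" and Pillow imports as "PIL".
REQUIRED = {
    "RDKit": "rdkit",
    "scikit-learn": "sklearn",
    "NumPy": "numpy",
    "pandas": "pandas",
    "Pillow": "PIL",
}


def check_environment(required=REQUIRED):
    """Return {package label: True if Python can locate the module}."""
    return {label: find_spec(module) is not None for label, module in required.items()}


if __name__ == "__main__":
    for label, found in check_environment().items():
        print(f"{label:14s} {'ok' if found else 'MISSING'}")
```

Run it inside the environment you plan to work in. A package installed in one conda environment will show as MISSING in another.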

Where to Go Next

Biochemistry at ISEF splits into five subcategories. Pick the one that pulls you in most.

  • Analytical Biochemistry (ANB): measuring and quantifying biochemicals. Smartphone colorimetry, paper microfluidics, DIY spectrometry, and titration projects live here.
  • General Biochemistry (GNR): enzyme kinetics, fermentation, pigment chemistry, and Maillard reactions. The widest entry point for hands-on work.
  • Medicinal Biochemistry (MED): drug discovery, virtual screening, ADMET prediction, and natural-product pharmacology. Heavy on computation, light on equipment.
  • Structural Biochemistry (STR): protein structure, molecular dynamics, AlphaFold-based modeling, and de novo design. Almost entirely on a laptop.
  • Other (OTH): hybrid pipelines, educational tools, benchmarking studies, and meta-analyses that do not fit cleanly into one of the above.

Each subcategory has its own MehtA+ project guide built around the kit on this page. Pick the one that interests you most and follow that link from our blog. Biochemistry used to live behind a locked lab door. The door is open now, and you have the key.

Project ideas in this category (56)

AI-Powered Enzyme Inhibition Experiment Design Guide

Other · Advanced

Alginate Bead Release Kinetics

Medicinal Biochemistry · Intermediate

Allosteric Docking Benchmarking

Other · Advanced

Allosteric Pocket Mapping for Resistance Enzyme Targets

Medicinal Biochemistry · Advanced

AlphaFold p53 Disorder Across Mammals

Structural Biochemistry · Advanced

Antibody Fc Glycosylation Mutation Effects

Structural Biochemistry · Advanced

Artificial Sweeteners and Yeast Fermentation Kinetics

General Biochemistry · Intermediate

Bisphenol Detection in Sunlit Bottled Water

Analytical Biochemistry · Intermediate

BRCA1 Variant Effects on Protein Structure

Structural Biochemistry · Advanced

Catalase Kinetics With Microplastics

General Biochemistry · Intermediate

Computational Drug Screening

Other · Advanced

CRISPR-Cas12a Guide RNA Loading

Structural Biochemistry · Advanced

Curcumin Analogs for NF-κB Docking

Medicinal Biochemistry · Advanced

De Novo Viral Protein Binder Design

Structural Biochemistry · Advanced

Designing Peptides to Block Amyloid-β Clumping in Alzheimer’s

Medicinal Biochemistry · Advanced

DIY Arduino Spectrophotometer for Berry Anthocyanins

Analytical Biochemistry · Intermediate

Early Alzheimer’s Multi-Omics Network Science Project

Other · Advanced

Egg White Protein Unfolding by Salt

General Biochemistry · Intermediate

Enzyme Inhibition Simulator for Science Fair

Other · Intermediate

Enzyme Kinetics Identifiability

Other · Intermediate

Fruit Protease Kinetics in Gelatin

General Biochemistry · Intermediate

Glycolysis Bottlenecks in Cancer Cells

General Biochemistry · Advanced

Green Tea Polyphenol Kinetics

Analytical Biochemistry · Intermediate

hERG Risk in Antimalarials

Medicinal Biochemistry · Advanced

Honey Adulteration Detection

Analytical Biochemistry · Intermediate

Kitchen Polyphenol Synergy in DPPH Antioxidant Tests

General Biochemistry · Intermediate

Kombucha Antioxidant Claims and Publication Bias Study

Other · Advanced

Kombucha Fermentation Drivers

Other · Intermediate

KRAS G12D Cryptic Pockets

Structural Biochemistry · Advanced

Lactase Drop Activity and Reuse

General Biochemistry · Intermediate

Machine Learning Food Dye Spectrum Deconvolution Project

Analytical Biochemistry · Intermediate

Maillard Browning in Baking

General Biochemistry · Intermediate

Membrane Lipids That Block Alpha-Synuclein Aggregation

Structural Biochemistry · Advanced

Orphan GPCR Pocket Clustering

Structural Biochemistry · Advanced

Paper Microfluidic Urine Test Device

Analytical Biochemistry · Intermediate

PCSK9 Pharmacophore Mining in Ayurvedic and TCM Databases

Medicinal Biochemistry · Advanced

Peptide Pore Formation in Membranes

Structural Biochemistry · Advanced

Predicting Blood-Brain Barrier Permeability with Machine Learning

Medicinal Biochemistry · Advanced

Predicting G-Quadruplex Stability in Oncogene Promoters

Structural Biochemistry · Advanced

Predicting Kinase Drug Response in Pediatric Tumors

Medicinal Biochemistry · Advanced

PROTAC Linker Design for Kinase Targeting

Medicinal Biochemistry · Advanced

PubChem Transfer Learning for Tiny Bioactivity Sets

Other · Advanced

Red Cabbage Anthocyanin pH Indicator

General Biochemistry · Intermediate

SARS-CoV-2 5’UTR RNA Folding

Other · Advanced

SARS-CoV-2 Spike Variant Binding Study

Structural Biochemistry · Advanced

Small-Molecule Synthesizability Classifiers

Other · Advanced

Smartphone LAMP Plant Pathogen Test

Analytical Biochemistry · Advanced

Smartphone Nitrate Testing in Vegetables and Water

Analytical Biochemistry · Intermediate

Spice Oil Effects on Bacterial Biofilms

Medicinal Biochemistry · Intermediate

Spinach Chloroplast Light-Response Curves Project Ideas

General Biochemistry · Intermediate

Tea Polyphenols and SCOBY Acid Output

General Biochemistry · Intermediate

Turmeric Brand Analysis with TLC and DPPH Science Fair

Analytical Biochemistry · Intermediate

Urine Biomarker Discovery for Diabetes

Analytical Biochemistry · Advanced

Virtual Drug Screening for Viral Polymerase

Medicinal Biochemistry · Advanced

Vitamin C Decay in Juice

Analytical Biochemistry · Intermediate

Yeast Fermentation and Artificial Sweeteners Study

General Biochemistry · Intermediate
