How to Do Real Biochemistry Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship.
Biochemistry used to mean a sterile university lab, a tenured PI, and a five-figure instrument budget. That barrier is gone. You can now run real enzyme kinetics on your kitchen counter and dock real drug candidates against real protein structures from your bedroom.
This guide is your starting point. It covers three things: the affordable home kit you can buy this week, the free professional software you can install today, and the public databases that already hold decades of high-quality data waiting to be re-analyzed.
Why This Is Possible Now
Three shifts in the last decade rewrote what a high school student can do alone.
Structures became free and nearly complete. The AlphaFold Protein Structure Database released predicted structures for over 200 million proteins, covering most of the sequences cataloged in UniProt. The Protein Data Bank holds over 200,000 experimental structures. You no longer need a crystallography lab to study a protein. You need a laptop and curiosity.
Compute became free and serious. Google Colab gives you a free GPU strong enough to run molecular dynamics, train a graph neural network, or fold a protein. The same software the pharmaceutical industry uses (GROMACS, OpenMM, AutoDock Vina, RDKit) is open source and free to install.
Your phone became a spectrophotometer. A modern smartphone camera plus a free app can measure color with enough precision to quantify enzymes, sugars, polyphenols, and dyes. Add a $5 Arduino and a photodiode and you have a three-wavelength spectrometer for the price of lunch.
A kitchen counter plus a laptop now does what a small university lab did 15 years ago.
The Biochemistry Home Kit
You can buy a complete starter kit for the price of a textbook. Here is what to gather, grouped by purpose.
Enzyme sources (cheap and biological)
- Fresh potato or beef liver as a source of catalase.
- Pineapple juice for bromelain, papaya for papain, fig sap for ficin.
- Store-bought lactase drops (e.g., Lactaid) for hydrolyzing lactose.
- Baker's yeast and brewer's yeast for fermentation and glycolysis experiments.
- A kombucha SCOBY as a living, self-replenishing biochemistry platform.
- Sprouted seeds (mung, wheat) as a rich source of amylase and protease.
Reagents and indicator chemistry ($10 to $40 total)
- pH strips covering the 0 to 14 range.
- Iodine solution for starch detection.
- Benedict's reagent for reducing sugars and Biuret reagent for proteins.
- DNS (dinitrosalicylic acid) reagent for reducing-sugar quantification.
- DPPH or ABTS kits (around $30) for antioxidant assays.
- Griess reagent for nitrite quantification (nitrate has to be reduced to nitrite first before it can be measured this way).
- Folin-Ciocalteu reagent for polyphenol assays.
Hardware (under $80 total)
- Smartphone with a recent camera (you already own this).
- Arduino Uno or Nano, a photodiode, and three LEDs (red, green, blue) to build a DIY spectrophotometer.
- A sous-vide cooker or rice cooker for stable temperature control during isothermal assays.
- A consumer USB microscope (optional but useful for biofilm and crystal work).
- TLC plates at roughly $1 each, plus filter paper for chromatography.
- A DIY gel-electrophoresis box (acrylic, alligator clips, a 9V battery rig) with agarose.
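For the Arduino photodiode build in the list above, the raw numbers you log are ADC counts, not absorbance. Here is a minimal sketch of the conversion, assuming you have already copied one reading per LED color for a dark measurement (LED off), a blank cuvette, and a sample cuvette. The reading values and the `absorbance` helper are hypothetical and only there for illustration.

```python
import math


def absorbance(sample_counts, blank_counts, dark_counts=0.0):
    """Absorbance from raw photodiode readings, A = -log10(I / I0).

    The dark reading (LED off) is subtracted from both the sample and
    the blank so that stray light and sensor offset cancel out.
    """
    signal = sample_counts - dark_counts
    reference = blank_counts - dark_counts
    if signal <= 0 or reference <= 0:
        raise ValueError("readings must be above the dark level")
    return -math.log10(signal / reference)


# Hypothetical 10-bit ADC readings (0 to 1023), one per LED color.
dark = {"red": 12, "green": 10, "blue": 11}
blank = {"red": 812, "green": 760, "blue": 655}
sample = {"red": 790, "green": 410, "blue": 598}

# One absorbance value per LED: a three-point "spectrum" of the sample.
spectrum = {color: absorbance(sample[color], blank[color], dark[color])
            for color in blank}

for color, a in spectrum.items():
    print(f"{color}: A = {a:.3f}")
```

In this made-up data the sample absorbs green light most strongly, which is what you would expect from a red or pink solution.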
Glassware and lab discipline
- A set of disposable plastic cuvettes.
- Plastic pipettes or a $15 micropipette set.
- A lab notebook (paper or digital, but one you commit to).
Total realistic cost to assemble the full kit: $100 to $200.
The Signature Technique: Smartphone Colorimetry
This is the single move that unlocks the most projects. A color change is a concentration measurement if you treat the phone correctly. Here is the five-step workflow you will reuse across half the projects in this category.
- Build a light box. A shoebox lined with white paper, with a single hole for the phone camera and a constant LED light source, removes the biggest source of error: ambient light.
- Prepare a standard curve. Make a serial dilution of your analyte at five to seven known concentrations. Photograph each one inside the light box, same distance, same exposure, same white-balance lock.
- Extract RGB values. Use a free image-analysis tool like ImageJ or a Python script with Pillow to pull the average R, G, and B values from a fixed region of each photo.
- Fit your calibration. Convert RGB to absorbance using A = -log10(I/I0), where I0 is the blank. Fit a line (Beer-Lambert) or train a small regression model if the chemistry is nonlinear.
- Validate. Run a blind sample at a known concentration and check that your model predicts within 10 percent. If it does not, the most likely culprit is inconsistent lighting, not the chemistry.
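The last two steps of the workflow can be sketched in plain Python. This is a rough illustration under stated assumptions, not a finished pipeline: it assumes you have already extracted the mean value of one color channel from each photo (green here, but pick whichever channel changes most for your chemistry), and every number below is invented for the example.

```python
import math


def absorbance(intensity, blank_intensity):
    """A = -log10(I / I0), with I0 taken from the blank photo."""
    return -math.log10(intensity / blank_intensity)


def fit_line(xs, ys):
    """Ordinary least-squares line; returns slope, intercept, R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical standards: (concentration in mg/L, mean green value 0-255).
# The first entry is the blank.
standards = [(0.0, 240.0), (5.0, 214.0), (10.0, 191.0),
             (20.0, 151.0), (40.0, 96.0), (80.0, 38.0)]
blank_green = standards[0][1]

concs = [c for c, _ in standards]
absorbances = [absorbance(g, blank_green) for _, g in standards]
slope, intercept, r_squared = fit_line(concs, absorbances)

# Validation: a blind sample you prepared at a known 30 mg/L.
known_conc = 30.0
blind_green = 120.0  # hypothetical measured value
predicted = (absorbance(blind_green, blank_green) - intercept) / slope
percent_error = abs(predicted - known_conc) / known_conc * 100

print(f"slope={slope:.4f} per mg/L, intercept={intercept:.4f}, R^2={r_squared:.4f}")
print(f"blind sample: predicted {predicted:.1f} mg/L, error {percent_error:.1f}%")
```

If the percent error lands above 10, recheck the light box before you blame the reagents.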
Once this workflow is solid, swap the chemistry in and out: Griess for nitrates, Folin-Ciocalteu for polyphenols, DNS for reducing sugars, Maillard for browning kinetics, anthocyanins for pH.
The Dry-Lab Side: Free Software You Can Install Today
Every tool below is what working biochemists use. Every one is free.
Structure viewing and analysis
- PyMOL (educational license): the most widely used molecular viewer. Rotate, color, measure, and produce publication-ready figures.
- ChimeraX: a more modern viewer with excellent surface and electrostatics rendering.
- Fpocket: detects binding pockets, including allosteric ones, on any protein structure.
Docking
- AutoDock Vina: the standard free docking engine. Predicts how a small molecule binds to a protein.
- Smina: a Vina fork with better scoring options.
- GNINA: adds a convolutional neural network for re-scoring poses.
- DiffDock: a diffusion-model docking tool that handles flexible binding.
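One practical detail when you start docking: AutoDock Vina needs a search box, given as `center_x`, `center_y`, `center_z` and `size_x`, `size_y`, `size_z` in its configuration. A common way to choose it is to wrap the box around a known ligand or pocket. The sketch below assumes you have those atoms as PDB-format `ATOM`/`HETATM` lines; the 5 Å padding, the helper names, and the three demo atoms are assumptions for illustration, not recommended values.

```python
def parse_coords(pdb_text):
    """Read x, y, z from ATOM/HETATM records (fixed PDB columns 31-54)."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    return coords


def search_box(coords, padding=5.0):
    """Box center and edge lengths that enclose the atoms plus padding."""
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = [(lo + hi) / 2 for lo, hi in zip(mins, maxs)]
    size = [(hi - lo) + 2 * padding for lo, hi in zip(mins, maxs)]
    return center, size


def demo_line(serial, name, x, y, z):
    """Build one correctly aligned HETATM line for the demo below."""
    return f"HETATM{serial:5d} {name:<4s} LIG A   1    {x:8.3f}{y:8.3f}{z:8.3f}"


# Three made-up ligand atoms standing in for a real pocket or ligand file.
demo_pdb = "\n".join([
    demo_line(1, "C1", 10.0, 20.0, 30.0),
    demo_line(2, "C2", 14.0, 22.0, 27.0),
    demo_line(3, "O1", 12.0, 26.0, 31.0),
])

center, size = search_box(parse_coords(demo_pdb))

# Print the six values in the key = value form a Vina config file uses.
for axis, c, s in zip("xyz", center, size):
    print(f"center_{axis} = {c:.3f}")
    print(f"size_{axis} = {s:.3f}")
```

A box that is too small can cut off valid poses, and one that is too large slows the search, so treat the padding as something to vary and report.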
Molecular dynamics
- GROMACS: the workhorse for simulating proteins, lipids, and complexes over nanoseconds to microseconds.
- OpenMM: scriptable entirely from Python, runs well on a free Colab GPU.
- NAMD: scalable MD with extensive tutorials.
- PLUMED: adds enhanced sampling methods like metadynamics on top of GROMACS or OpenMM.
Cheminformatics and machine learning
- RDKit: the standard library for handling molecules in Python.
- scikit-learn and PyTorch: general ML and deep learning.
- DeepChem: ML built specifically for chemistry.
- ESM-2 and ProtBERT: protein language models you can call in a notebook.
- SwissADME (web tool): drug-likeness and ADMET prediction.
- ViennaRNA: RNA secondary-structure folding and ΔG calculation.
Running professional tools yourself changes how research feels. You are no longer reading about science. You are doing it.
Public Databases That Count as Real Data
Re-analysis of public data is a legitimate research path on its own. Judges respect it because the data is usually curated, often tied to peer-reviewed studies, and large.
Protein structures and sequences
- PDB (Protein Data Bank): experimental 3D structures.
- AlphaFold DB: predicted structures for over 200 million proteins.
- UniProt: protein sequences with functional annotation.
- Pfam and InterPro: protein family and domain databases.
Small molecules and drugs
- PubChem: tens of millions of compounds with structure and bioactivity.
- ChEMBL: curated bioactivity data, ideal for ML training.
- DrugBank: FDA-approved drugs with targets and mechanisms.
- ZINC: purchasable compounds for virtual screening.
- COCONUT: natural-product structures.
- IMPPAT and TCMSP: Ayurvedic and traditional Chinese medicine compound databases.
Sequences, expression, and metabolism
- NCBI and Ensembl: genome and transcript sequences.
- KEGG and MetaCyc: metabolic pathways and reactions.
- GEO and Expression Atlas: gene expression datasets.
- TCGA: cancer genomics and expression data.
Metabolomics
- HMDB (Human Metabolome Database): all known human metabolites.
- MetaboLights: raw metabolomics studies you can reanalyze.
Pulling a dataset from one of these and asking a new question of it is real science.
How to Combine Wet and Dry: The Strongest Project Shape
The most defensible projects bridge a hands-on measurement and a computational analysis. There are two reliable patterns.
Pattern A: Measure something at home, then explain it computationally. Run an enzyme kinetics experiment, get your Michaelis-Menten parameters, then dock candidate inhibitors against the same enzyme's AlphaFold structure and show which residues your data implicates. Your wet measurement constrains your computational story.
Pattern B: Predict something computationally, then validate one prediction in the kitchen. Screen a public compound library against a target in silico, pick the top three candidates that happen to be available as kitchen ingredients or cheap supplements, and test one prediction with a simple assay. Your computation guides your experiment.
Judges respond to this hybrid shape because it shows you can connect a measurement to a mechanism, which is what real biochemistry research does.
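For Pattern A, the step from raw rates to Michaelis-Menten parameters can be sketched without any extra libraries. The example below uses a Hanes-Woolf linearization ([S]/v plotted against [S], where the slope is 1/Vmax and the intercept is Km/Vmax). That is a simple stand-in chosen for this sketch; a nonlinear fit is generally the better choice once you are comfortable with one. The substrate concentrations, rates, and units are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hanes_woolf(substrate, rates):
    """Estimate Vmax and Km from [S]/v = [S]/Vmax + Km/Vmax."""
    s_over_v = [s / v for s, v in zip(substrate, rates)]
    slope, intercept = fit_line(substrate, s_over_v)
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def michaelis_menten(s, vmax, km):
    """Predicted initial rate at substrate concentration s."""
    return vmax * s / (km + s)


# Hypothetical data: substrate in mM, initial rate in umol/min.
substrate = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
rates = [0.333, 0.571, 1.000, 1.333, 1.600, 1.818]

vmax, km = hanes_woolf(substrate, rates)
print(f"Vmax = {vmax:.2f} umol/min, Km = {km:.2f} mM")

# Compare the fitted curve with each measurement as a quick sanity check.
for s, v in zip(substrate, rates):
    print(f"[S]={s:5.1f}  measured={v:.3f}  fitted={michaelis_menten(s, vmax, km):.3f}")
```

Run the same fit with and without a candidate inhibitor: how Vmax and Km shift is the measurement your docking story then has to explain.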
Choosing a Phenomenon That Has Not Been Done
Novelty is a process, not a guess. Run these three checks before you commit to a project.
- Google Scholar. Search your candidate phrase in quotes plus a year filter for the last five years. Skim the top 20 results. If your exact study exists, narrow your angle (a different enzyme, a different condition, a different organism, a different model).
- Society for Science abstracts archive. Search the public ISEF and Regeneron STS abstract databases for your keywords. This tells you what other high school students have already done.
- PubMed. Search for review articles on your topic. Reviews tell you the frontier and, more usefully, the gaps the field admits it has.
If you find adjacent prior work, that is good news, not bad news. It means your topic is real, and you now know exactly where the unanswered question lives.
A Realistic Timeline
- 1 to 2 weeks (focused replication or measurement). Pick one assay, build the light box, run a calibration curve, and write it up cleanly. This is a strong middle-school-to-9th-grade entry.
- 1 to 2 months (full hybrid project for regional fair). One wet-lab measurement plus one computational analysis, tied together with a clear hypothesis and statistics.
- Full year (ISEF-track project). A multi-experiment hybrid pipeline with a novel computational contribution, careful controls, and a written paper ready for review.
First-time researchers should start with the 1 to 2 week version. You learn more from finishing one small thing than from half-finishing a big thing.
A Starter Checklist
Before you pick a specific phenomenon, set these up. They take an afternoon.
- A clean, well-lit workspace with a flat surface and a sink nearby.
- A free Google account with Colab opened at least once and the runtime type switched to GPU (this confirms GPU access).
- A local Python environment (Anaconda or Miniconda) with RDKit, scikit-learn, NumPy, pandas, and Pillow installed.
- PyMOL or ChimeraX installed and a test PDB structure loaded.
- AutoDock Vina installed and the included tutorial run end to end.
- A lab notebook (paper or a dated Google Doc) with the date and a one-line goal on the first page.
- A single written sentence: "My research question is whether X affects Y, measured by Z."
If you have all seven, you are ready.
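A quick way to confirm the Python part of the checklist is to ask Python itself which packages it can find. This is a small sketch using only the standard library. The import names on the right (`rdkit`, `sklearn`, `PIL`) are the module names these packages install under, which differ from the names you type when installing them.

```python
import importlib.util

# Display name -> module name used in an import statement.
REQUIRED = {
    "RDKit": "rdkit",
    "scikit-learn": "sklearn",
    "NumPy": "numpy",
    "pandas": "pandas",
    "Pillow": "PIL",
}


def check_packages(required):
    """Return {display name: True/False} without importing anything."""
    return {name: importlib.util.find_spec(module) is not None
            for name, module in required.items()}


status = check_packages(REQUIRED)
for name, found in status.items():
    print(f"{name:<14}{'installed' if found else 'MISSING'}")
```

Run it inside the environment you plan to work in; anything marked MISSING needs to be installed there, not just somewhere on your computer.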
Where to Go Next
Biochemistry at ISEF splits into five subcategories. Pick the one that pulls you in most.
- Analytical Biochemistry (ANB): measuring and quantifying biochemicals. Smartphone colorimetry, paper microfluidics, DIY spectrometry, and titration projects live here.
- General Biochemistry (GNR): enzyme kinetics, fermentation, pigment chemistry, and Maillard reactions. The widest entry point for hands-on work.
- Medicinal Biochemistry (MED): drug discovery, virtual screening, ADMET prediction, and natural-product pharmacology. Heavy on computation, light on equipment.
- Structural Biochemistry (STR): protein structure, molecular dynamics, AlphaFold-based modeling, and de novo design. Almost entirely on a laptop.
- Other (OTH): hybrid pipelines, educational tools, benchmarking studies, and meta-analyses that do not fit cleanly into one of the above.
Each subcategory has its own MehtA+ project guide built around the kit on this page. Pick the one that interests you most and follow that link from our blog. Biochemistry used to live behind a locked lab door. The door is open now, and you have the key.
Project ideas in this category (56)
- [title missing in source] (Other · Advanced)
- Alginate Bead Release Kinetics (Medicinal Biochemistry · Intermediate)
- Allosteric Docking Benchmarking (Other · Advanced)
- Allosteric Pocket Mapping for Resistance Enzyme Targets (Medicinal Biochemistry · Advanced)
- AlphaFold p53 Disorder Across Mammals (Structural Biochemistry · Advanced)
- Antibody Fc Glycosylation Mutation Effects (Structural Biochemistry · Advanced)
- Artificial Sweeteners and Yeast Fermentation Kinetics (General Biochemistry · Intermediate)
- Bisphenol Detection in Sunlit Bottled Water (Analytical Biochemistry · Intermediate)
- BRCA1 Variant Effects on Protein Structure (Structural Biochemistry · Advanced)
- Catalase Kinetics With Microplastics (General Biochemistry · Intermediate)
- Computational Drug Screening (Other · Advanced)
- CRISPR-Cas12a Guide RNA Loading (Structural Biochemistry · Advanced)
- Curcumin Analogs for NF-κB Docking (Medicinal Biochemistry · Advanced)
- De Novo Viral Protein Binder Design (Structural Biochemistry · Advanced)
- Designing Peptides to Block Amyloid-β Clumping in Alzheimer’s (Medicinal Biochemistry · Advanced)
- DIY Arduino Spectrophotometer for Berry Anthocyanins (Analytical Biochemistry · Intermediate)
- Early Alzheimer’s Multi-Omics Network Science Project (Other · Advanced)
- Egg White Protein Unfolding by Salt (General Biochemistry · Intermediate)
- Enzyme Inhibition Simulator for Science Fair (Other · Intermediate)
- Enzyme Kinetics Identifiability (Other · Intermediate)
- Fruit Protease Kinetics in Gelatin (General Biochemistry · Intermediate)
- Glycolysis Bottlenecks in Cancer Cells (General Biochemistry · Advanced)
- Green Tea Polyphenol Kinetics (Analytical Biochemistry · Intermediate)
- hERG Risk in Antimalarials (Medicinal Biochemistry · Advanced)
- Honey Adulteration Detection (Analytical Biochemistry · Intermediate)
- Kitchen Polyphenol Synergy in DPPH Antioxidant Tests (General Biochemistry · Intermediate)
- Kombucha Antioxidant Claims and Publication Bias Study (Other · Advanced)
- Kombucha Fermentation Drivers (Other · Intermediate)
- KRAS G12D Cryptic Pockets (Structural Biochemistry · Advanced)
- Lactase Drop Activity and Reuse (General Biochemistry · Intermediate)
- Machine Learning Food Dye Spectrum Deconvolution Project (Analytical Biochemistry · Intermediate)
- Maillard Browning in Baking (General Biochemistry · Intermediate)
- Membrane Lipids That Block Alpha-Synuclein Aggregation (Structural Biochemistry · Advanced)
- Orphan GPCR Pocket Clustering (Structural Biochemistry · Advanced)
- Paper Microfluidic Urine Test Device (Analytical Biochemistry · Intermediate)
- PCSK9 Pharmacophore Mining in Ayurvedic and TCM Databases (Medicinal Biochemistry · Advanced)
- Peptide Pore Formation in Membranes (Structural Biochemistry · Advanced)
- Predicting Blood-Brain Barrier Permeability with Machine Learning (Medicinal Biochemistry · Advanced)
- Predicting G-Quadruplex Stability in Oncogene Promoters (Structural Biochemistry · Advanced)
- Predicting Kinase Drug Response in Pediatric Tumors (Medicinal Biochemistry · Advanced)
- PROTAC Linker Design for Kinase Targeting (Medicinal Biochemistry · Advanced)
- PubChem Transfer Learning for Tiny Bioactivity Sets (Other · Advanced)
- Red Cabbage Anthocyanin pH Indicator (General Biochemistry · Intermediate)
- SARS-CoV-2 5’UTR RNA Folding (Other · Advanced)
- SARS-CoV-2 Spike Variant Binding Study (Structural Biochemistry · Advanced)
- Small-Molecule Synthesizability Classifiers (Other · Advanced)
- Smartphone LAMP Plant Pathogen Test (Analytical Biochemistry · Advanced)
- Smartphone Nitrate Testing in Vegetables and Water (Analytical Biochemistry · Intermediate)
- Spice Oil Effects on Bacterial Biofilms (Medicinal Biochemistry · Intermediate)
- Spinach Chloroplast Light-Response Curves Project Ideas (General Biochemistry · Intermediate)
- Tea Polyphenols and SCOBY Acid Output (General Biochemistry · Intermediate)
- Turmeric Brand Analysis with TLC and DPPH Science Fair (Analytical Biochemistry · Intermediate)
- Urine Biomarker Discovery for Diabetes (Analytical Biochemistry · Advanced)
- Virtual Drug Screening for Viral Polymerase (Medicinal Biochemistry · Advanced)
- Vitamin C Decay in Juice (Analytical Biochemistry · Intermediate)
- Yeast Fermentation and Artificial Sweeteners Study (General Biochemistry · Intermediate)
