How to Do Real Mathematics Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor.

Mathematics research used to live inside university seminar rooms and subscription-only journals. Today it runs on a laptop, a free cloud notebook, and a stack of open tools that professional mathematicians actually use.

This guide is your starting point. It covers three things: the home kit you need, the free software that powers modern math research, and the public databases that turn your laptop into a real research workstation.

Why this is possible now

Open mathematical databases changed everything. The OEIS catalogs hundreds of thousands of integer sequences. The LMFDB exposes deep number-theoretic objects (L-functions, modular forms, elliptic curves) that once required specialist software. House of Graphs, ISGCI, and SNAP give you graphs and networks at every scale. You can ask questions against real data on day one.

Professional-grade open-source mathematics is now free. SageMath, PARI/GP, GAP, Macaulay2, Singular, Lean 4 with mathlib, Coq, and Isabelle/HOL match or exceed what was locked behind expensive licenses ten years ago. The same goes for numerics (NumPy, SciPy, Julia) and optimization (OR-Tools, Z3, MiniZinc, SCIP).

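Free cloud compute closed the last gap. Google Colab and Kaggle give you free CPU and GPU sessions, subject to session time limits and usage quotas, which is enough for most small-case enumerations if you save results as you go. arXiv and OpenAlex let you read the literature for free. Overleaf and Quarto handle the writeup.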

A laptop on your kitchen counter can now enumerate algebraic structures, search for new prime patterns, formalize a theorem, and typeset the result in publication-ready LaTeX, all before dinner.

The mathematics home kit

Math is one of the lowest-cost ISEF categories, but a few deliberate setup choices make a real difference.

Hardware

  • A modern laptop with 8+ GB RAM (anything from the last five years works).
  • An external monitor or a tablet as a second screen for reading papers while coding (optional but nice).
  • A notebook and pen for proof sketching. Paper still beats screens for diagrams.
  • Optional: a small home FDM 3D printer if your project touches geometry or origami rigidity (under $300 for a starter unit).

Workspace

  • A quiet, distraction-free desk.
  • A free Google account for Colab and Drive.
  • A free GitHub account for version control and to share code with a mentor.

Cloud compute

  • Google Colab (free GPU/CPU notebooks).
  • Kaggle Notebooks (another free compute pool with persistent datasets).
  • Your school's library access for MathSciNet and zbMATH, if available.

Writing setup

  • Overleaf free tier for LaTeX.
  • Quarto for combined code-and-prose reports.
  • Inkscape for diagrams that go beyond TikZ.

Approximate total cost: $0, or up to about $300 if you add a 3D printer.

The signature technique: running professional math software in a free Colab notebook

The single move that unlocks the most projects in mathematics is learning to drive SageMath (or PARI/GP, or GAP) inside a Colab notebook. Here is the five-step workflow.

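  1. Open a fresh Colab notebook. In the first cell, install the tool you need. The exact command depends on the tool and on Colab's current system image: PARI/GP is usually a quick apt install, while SageMath is a much larger install that can take several minutes. Restart the runtime if the installer asks you to, so the kernel picks up the new packages.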
  2. Pull in a public dataset or a generated object. Load an OEIS sequence, an LMFDB curve, a House of Graphs file, or a family of objects you build yourself (semirings of order n, Latin squares, polytopes).
  3. Run an exhaustive or Monte Carlo experiment. Enumerate small cases, compute the invariant you care about, and save the results to a CSV or JSON file in your Google Drive.
  4. Look for a pattern. Plot with matplotlib, query OEIS to see if your sequence already exists, and form a conjecture.
  5. Try to prove the conjecture. Sketch a proof by induction, by generating function, or by reduction to a known theorem. If the proof is small and clean, formalize it in Lean 4 to lock it in.

This loop is what real mathematicians do. Experiment, conjecture, prove. The fact that the tools are free does not make the work less real.
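
Steps 3 to 5 of the loop can be sketched in plain Python before any specialist software is installed. The example below is a minimal illustration, not a prescribed method: it counts derangements (permutations with no fixed point) by brute force, saves the counts to a CSV file, and tests a conjectured recurrence on the computed cases. The function names and the output path are hypothetical, and in Colab the path could point at a mounted Drive folder instead of a temporary directory.

```python
import csv
import itertools
import os
import tempfile


def count_derangements(n):
    """Count permutations of range(n) with no fixed point, by brute force."""
    return sum(
        all(p[i] != i for i in range(n))
        for p in itertools.permutations(range(n))
    )


def run_experiment(max_n, csv_path):
    """Enumerate n = 0..max_n, record the invariant, and save it as CSV."""
    rows = [(n, count_derangements(n)) for n in range(max_n + 1)]
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["n", "derangements"])
        writer.writerows(rows)
    return rows


def recurrence_holds(values):
    """Check the conjecture D(n) = (n - 1) * (D(n - 1) + D(n - 2))."""
    return all(
        values[n] == (n - 1) * (values[n - 1] + values[n - 2])
        for n in range(2, len(values))
    )


# Hypothetical output location, chosen only for this illustration.
csv_path = os.path.join(tempfile.gettempdir(), "derangements.csv")
rows = run_experiment(7, csv_path)
counts = [count for _, count in rows]
print(counts)  # [1, 0, 1, 2, 9, 44, 265, 1854]
print("recurrence holds on all computed cases:", recurrence_holds(counts))
```

A check like `recurrence_holds` only confirms the pattern on the cases computed. The proof in step 5 is what turns it into a theorem.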

The dry-lab side: free software you can install today

Computer algebra and discrete math

  • SageMath: one Python-flavored environment that wraps most open math software.
  • PARI/GP: fast number-theoretic computations, especially for sieving and large-integer experiments.
  • GAP: groups, semigroups, near-rings, and combinatorial structures.
  • Macaulay2 and Singular: commutative algebra, Gröbner bases, polynomial ideals.
  • Magma Calculator (online): free web access to a subset of Magma for short jobs.

Proof assistants and formal math

  • Lean 4 with mathlib: the fastest-growing formal math library. Modern, well-documented, friendly community.
  • Coq (recently renamed the Rocq Prover): long-established, used widely in formal verification.
  • Isabelle/HOL and Agda: alternative ecosystems with strong communities.
  • Metamath: a minimalist proof system, good for understanding foundations.
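
To give a feel for what a formalized proof looks like, here is a small sketch in Lean 4 that uses only the core library, with no mathlib import. The theorem name `zero_add_sketch` is made up for this illustration, and tactic behavior can shift between Lean releases, so treat it as a starting point to try in your own installation rather than a guaranteed-stable snippet.

```lean
-- A term-mode proof that reuses a lemma from Lean's core library.
example (a b : Nat) : a + b = b + a := Nat.add_comm a b

-- A proof by induction, mirroring the pen-and-paper argument:
-- the base case holds by computation, and the inductive step
-- rewrites with the definition of addition and then the hypothesis.
theorem zero_add_sketch (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ k ih => rw [Nat.add_succ, ih]
```

The second proof has the same shape as a handwritten induction, which is why small, clean results are good first candidates for formalization.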

Numerics, ML, and scientific computing

  • Python with NumPy, SciPy, SymPy, mpmath, NetworkX, and igraph: the default toolkit.
  • Julia with JuMP, Graphs.jl, DifferentialEquations.jl, and DynamicalSystems.jl: fast, modern, great for dynamics and optimization.
  • PyTorch and JAX: for neural networks, including physics-informed networks and ML for math.
  • scikit-learn, XGBoost, Optuna, statsmodels, PyMC, Stan, and R: statistics and machine learning.
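
One habit worth building early is to separate exact arithmetic from floating-point arithmetic, so that rounding error is never mistaken for a pattern. The sketch below uses the standard-library `fractions` module as a stand-in for what SymPy or mpmath offer at larger scale. The function names are illustrative and do not come from any of those libraries.

```python
from fractions import Fraction


def continued_fraction(x):
    """Return the continued-fraction terms of a non-negative rational x."""
    terms = []
    while True:
        whole = x.numerator // x.denominator
        terms.append(whole)
        remainder = x - whole
        if remainder == 0:
            return terms
        x = 1 / remainder  # stays an exact Fraction


def harmonic(n):
    """Exact n-th harmonic number 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


print(continued_fraction(Fraction(355, 113)))  # [3, 7, 16]
print(harmonic(10))  # 7381/2520
```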

Optimization, SAT, and SMT

  • Z3 and CVC5: SMT solvers that can check constraint systems and small algebraic conjectures.
  • OR-Tools, MiniZinc, and SCIP: constraint and integer programming.
  • Kissat and CaDiCaL: state-of-the-art SAT solvers.
  • Gurobi and CPLEX: commercial solvers that offer free academic licenses for serious mixed-integer programs. Check whether your school affiliation qualifies.
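
The kind of question these solvers answer is "does any assignment satisfy all of these constraints?". The sketch below asks that question for graph coloring using plain exhaustive search in standard-library Python. It does not use the API of Z3 or any other solver, and it only scales to tiny instances, but it shows the shape of the problem you would later hand to a real solver. All names are illustrative.

```python
import itertools


def is_colorable(num_vertices, edges, num_colors):
    """Return True if some coloring gives every edge two distinct colors."""
    for coloring in itertools.product(range(num_colors), repeat=num_vertices):
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True
    return False


def chromatic_number(num_vertices, edges):
    """Smallest number of colors that admits a proper coloring."""
    for k in range(1, num_vertices + 1):
        if is_colorable(num_vertices, edges, k):
            return k
    return 0  # reached only for a graph with no vertices


# Two small test graphs: the 5-cycle and the complete graph on 4 vertices.
five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
complete_4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]

print(chromatic_number(5, five_cycle))  # 3
print(chromatic_number(4, complete_4))  # 4
```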

Visualization and writing

  • matplotlib, plotly, manim, TikZ, Asymptote, GeoGebra, and Desmos: figures and animations.
  • Overleaf, Quarto, and Inkscape: typesetting and final-figure polish.

Running the same software professional mathematicians run changes how the work feels. You are not simulating research; you are doing it.

Public databases that count as real data

Sequences and number-theoretic objects

  • OEIS: the encyclopedia of integer sequences. Contributing a new sequence is itself a research output.
  • LMFDB: L-functions, modular forms, elliptic curves, number fields. A treasure trove for number theory projects.
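
Checking whether a computed sequence is already known can be scripted. The sketch below only builds a search URL from a list of terms and makes no network call. It assumes the OEIS search page accepts `q` and `fmt=json` query parameters, which is worth confirming against the current OEIS documentation before relying on it. The function name is hypothetical.

```python
from urllib.parse import urlencode


def oeis_search_url(terms, fmt="json"):
    """Build an OEIS search URL for a list of integer terms.

    Assumes the endpoint https://oeis.org/search with `q` and `fmt`
    parameters; confirm this against the OEIS site before relying on it.
    """
    query = ",".join(str(term) for term in terms)
    return "https://oeis.org/search?" + urlencode({"q": query, "fmt": fmt})


url = oeis_search_url([1, 8, 113])
print(url)  # https://oeis.org/search?q=1%2C8%2C113&fmt=json

# Fetching is left out on purpose. In a notebook you could pass `url` to
# urllib.request.urlopen and inspect the response by hand.
```

Searching with four to six consecutive terms is usually enough to narrow the results. Fewer terms tends to return too many unrelated sequences.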

Graphs and networks

  • House of Graphs: curated graphs with known invariants, great for testing conjectures.
  • ISGCI: the information system on graph classes and their inclusions.
  • Stanford SNAP and Network Repository: large real-world networks (social, road, citation).
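
Many of these collections distribute graphs as edge lists, so a useful first exercise is computing basic invariants from one. The sketch below uses only the standard library and a made-up five-vertex edge list. In a real project NetworkX would do the same job with less code. Note that vertices with no edges do not appear in an edge list, so they are invisible to this code.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_sequence(adjacency):
    """Vertex degrees in non-increasing order."""
    return sorted((len(nbrs) for nbrs in adjacency.values()), reverse=True)


def connected_components(adjacency):
    """Connected components found by breadth-first search."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nbr in adjacency[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(sorted(component))
    return components


# Illustrative data: a triangle plus a separate edge.
edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
adjacency = build_adjacency(edges)
print(degree_sequence(adjacency))       # [2, 2, 2, 1, 1]
print(connected_components(adjacency))  # [[0, 1, 2], [3, 4]]
```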

Machine learning and stats benchmarks

  • UCI Machine Learning Repository and Kaggle Datasets: classic and current ML benchmarks.
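  • MIMIC-IV (credentialed access through PhysioNet, which requires completing a human-subjects training course and signing a data use agreement): medical time series, when your project goes applied.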

Real-world signals

  • OpenStreetMap and GTFS feeds: road networks and public-transit schedules for graph and percolation projects.
  • EPA AirNow, NOAA, and USGS: environmental data for change-point and time-series work.
  • FRED, Yahoo Finance, FAOSTAT, and USDA: economic and agricultural data.
  • FiveThirtyEight data repo: clean datasets with documented context.
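
As a taste of the change-point work mentioned above, the sketch below locates a single shift in the mean of a series by least squares: it tries every split point and keeps the one that minimizes the total squared error of the two segments. The data is synthetic and built in the code, not drawn from any of the sources listed. Practical methods also handle multiple change points and noise models.

```python
def sum_squared_error(values):
    """Squared deviation of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_single_split(series):
    """Index k minimizing SSE(series[:k]) + SSE(series[k:])."""
    best_k, best_cost = None, float("inf")
    for k in range(1, len(series)):
        cost = sum_squared_error(series[:k]) + sum_squared_error(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Synthetic series: 20 readings near 0, then 20 readings near 5.
before = [0.1 * (-1) ** i for i in range(20)]
after = [5 + 0.1 * (-1) ** i for i in range(20)]
series = before + after

print(best_single_split(series))  # 20
```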

Literature and metadata

  • arXiv: free preprints. Set up an RSS feed for the subcategories you care about.
  • OpenAlex: open scholarly metadata. Useful for bibliometric and co-authorship projects.

Re-analyzing public data with a new method is itself a legitimate research path. A clean, reproducible result about a known dataset is worth more than a sloppy result about a new one.

How to combine theory and computation: the strongest project shape

Pattern A: experiment, conjecture, prove. Enumerate small cases of an object (groups of order ≤ 24, polytopes with ≤ 20 lattice points, semirings of order ≤ 12) in SageMath or GAP. Spot a pattern in the resulting sequence using OEIS. Prove the pattern for a tractable sub-family by induction, generating functions, or a structural argument.
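
A minimal sketch of the enumeration step in Pattern A, in plain Python rather than SageMath or GAP: it counts the associative binary operations on a set of n labeled elements, that is, semigroup multiplication tables counted without identifying isomorphic ones. The function names are illustrative. Brute force is only practical for n up to 3 here, since the number of candidate tables is n to the power n squared.

```python
import itertools


def is_associative(table, n):
    """Check (a*b)*c == a*(b*c) for all triples, where a*b = table[a][b]."""
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a in range(n)
        for b in range(n)
        for c in range(n)
    )


def count_associative_operations(n):
    """Count associative multiplication tables on {0, ..., n-1}."""
    count = 0
    for flat in itertools.product(range(n), repeat=n * n):
        # Reshape the flat tuple into an n-by-n table.
        table = [flat[i * n:(i + 1) * n] for i in range(n)]
        if is_associative(table, n):
            count += 1
    return count


counts = [count_associative_operations(n) for n in range(1, 4)]
print(counts)  # [1, 8, 113]
```

Three terms are too few to conjecture much, which is the point: the next step is a smarter enumeration (symmetry reduction, a SAT encoding, or GAP's semigroup libraries) that reaches more terms, followed by an OEIS lookup.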

Pattern B: bound, then verify. Prove an analytic or combinatorial bound (a Hardy-type inequality, a cop-number lower bound, a Berry-Esseen rate). Verify the bound numerically on a large parameter sweep in Python or Julia. Show your bound is sharp by exhibiting an extremal example.
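
Pattern B can be rehearsed on a bound simpler than the ones named above. Comparing the harmonic sum with the integral of 1/x gives ln(n + 1) ≤ H_n ≤ 1 + ln(n) for every n ≥ 1. The sketch below checks that bound numerically over a parameter sweep, shows that the upper bound is attained at n = 1, and prints the gap H_n − ln(n) at the largest n. The tolerance value and the function names are choices made for this illustration.

```python
import math


def harmonic_numbers(max_n):
    """Floating-point values H_1, ..., H_max_n as a list."""
    total = 0.0
    values = []
    for k in range(1, max_n + 1):
        total += 1.0 / k
        values.append(total)
    return values


def verify_bounds(max_n, tolerance=1e-12):
    """Check ln(n + 1) <= H_n <= 1 + ln(n) for every 1 <= n <= max_n."""
    for n, h in enumerate(harmonic_numbers(max_n), start=1):
        lower = math.log(n + 1) - tolerance
        upper = 1 + math.log(n) + tolerance
        if not (lower <= h <= upper):
            return False
    return True


max_n = 100_000
values = harmonic_numbers(max_n)
print(verify_bounds(max_n))             # True
print(values[0] == 1 + math.log(1))     # True: upper bound attained at n = 1
print(values[-1] - math.log(max_n))     # gap at the largest n in the sweep
```

The last printed value is close to 0.5772, consistent with the known limit of H_n − ln(n), the Euler-Mascheroni constant. A sweep like this is evidence, not proof. The integral comparison is what establishes the bound for all n.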

This hybrid shape resonates with judges because it shows both rigorous proof and concrete evidence the proof reflects reality.

Choosing a phenomenon that has not been done

  1. Search Google Scholar for the closest two or three keywords to your candidate question. Skim the most recent five years of results. If the exact question appears, narrow your scope (a smaller parameter range, a stricter hypothesis, a new invariant).
  2. Check the Society for Science abstracts archive for ISEF and Regeneron STS finalists. Search by keyword to see which angles students have already pursued.
  3. Search arXiv and MathSciNet (or zbMATH) by MSC code for the subject area. Read the introductions of three recent papers, and note which open questions they list. Open questions in published papers are gold.

Finding adjacent prior work is good news, not bad news. It means your question lives in an active area and you now know exactly where the frontier is.

A realistic timeline

  • One to two weeks (focused replication or experiment): reproduce a small published result, extend its data by one parameter, and write a 5-page report.
  • One to two months (full hybrid project for a regional fair): run a full Pattern A or Pattern B project, with a clean writeup in LaTeX and reproducible code on GitHub.
  • Full year (ISEF-track project): combine theory, large-scale computation, and a formalization or public-data contribution (a new OEIS sequence, a mathlib pull request, an LMFDB-style table).

If this is your first research project, start with the one-to-two-week version. Finishing a small project teaches you more than half-finishing a big one.

A starter checklist

  1. A quiet workspace and a paper notebook for proof sketches.
  2. A free Google Colab account, plus a Google Drive folder for the project.
  3. A local Python environment (Anaconda or plain venv) with NumPy, SciPy, SymPy, NetworkX, and matplotlib installed.
  4. SageMath installed locally or available as a Colab cell.
  5. A free Overleaf account with a blank LaTeX project named for your research question.
  6. A GitHub repo for your code, with a README and a license.
  7. A single written sentence stating your research question, taped above your desk.

If you have all seven, you are ready to pick a phenomenon.

Where to go next

Mathematics at ISEF splits into seven subcategories, counting Other. Each one has its own MehtA+ project guide that uses the kit and tools on this page. Pick the one that pulls you in.

  • Algebra (ALB): structure of groups, rings, semirings, and their representations. Heavy use of SageMath and GAP.
  • Analysis (ANL): real, complex, and functional analysis. Convergence rates, inequalities, dynamical systems, and PINNs.
  • Combinatorics, Graph Theory, and Game Theory (CGG): counting, graphs, and games. Heavy use of NetworkX, OEIS, and SAT/SMT solvers.
  • Geometry and Topology (GEO): shapes, knots, polytopes, persistent homology, and billiards.
  • Number Theory (NUM): primes, modular forms, elliptic curves, and sieves. Heavy use of LMFDB and PARI/GP.
  • Probability and Statistics (PRO): concentration inequalities, conformal prediction, change-point detection, and applied statistics.
  • Other (OTH): formal verification, mathematical modeling, category theory, and cross-disciplinary work.

Pick the subcategory that interests you most and open its MehtA+ guide. The kitchen counter and laptop you have now are enough.

Project ideas in this category (60)

3D-Printed Geodesics on Translation Surfaces

Mathematics · Geometry and Topology · Advanced

Berry-Esseen Bounds for Wage Inequality Studies

Mathematics · Probability and Statistics · Advanced

Bird Migration Loop Signatures

Mathematics · Geometry and Topology · Advanced

Carmichael Function Divisibility Search

Mathematics · Number Theory · Advanced

Cayley Graph Spectra and Expansion in Small Groups

Mathematics · Algebra · Advanced

Change-Point Detection for Sensor Data

Mathematics · Probability and Statistics · Advanced

Community Detection Thresholds in Network Data

Mathematics · Probability and Statistics · Advanced

Commuting Probability in Finite Semigroups

Mathematics · Algebra · Advanced

Conformal Ranking Algorithms for Public Data

Mathematics · Probability and Statistics · Advanced

Continued Fraction Multifractal Spectra

Mathematics · Analysis · Advanced

Cunningham Chains in Number Fields

Mathematics · Number Theory · Advanced

Discrete Hardy Inequalities on Trees

Mathematics · Analysis · Advanced

Doubly Robust Network Treatment Effects

Mathematics · Probability and Statistics · Advanced

Elliptic Curves for Cubic Sum Solutions

Mathematics · Number Theory · Advanced

Fair Ranking Calibration for ML Models

Mathematics · Probability and Statistics · Advanced

Fast Road Path Algorithms With Tropical Geometry

Mathematics · Other · Advanced

Finite Field Polynomial Factor Patterns

Mathematics · Algebra · Advanced

Finite Semiring Classification with Z3

Mathematics · Algebra · Advanced

Formal Game Theory Proofs in Lean 4

Mathematics · Other · Advanced

Formal Verification for Climate Model Accuracy

Mathematics · Other · Advanced

Fractional Logistic Map Dynamics and Bifurcations

Mathematics · Analysis · Advanced

Gröbner Bases for Sudoku-Style Design Counting

Mathematics · Algebra · Advanced

Group Ring Subalgebras and Möbius Counting

Mathematics · Algebra · Advanced

High School Course Choice Equilibrium Models

Mathematics · Other · Advanced

Higher-Order Data Smoothing With Mollifiers

Mathematics · Analysis · Advanced

Hyperbolic Rep-Tiles and Self-Similarity

Mathematics · Geometry and Topology · Advanced

Icosahedral Polytope Classification in 4D

Mathematics · Geometry and Topology · Advanced

Isogeny Hashes and Collision Resistance

Mathematics · Number Theory · Advanced

Iwasawa Lambda Patterns in Quadratic Fields

Mathematics · Number Theory · Advanced

Lattice-Point Polytopes in 3D

Mathematics · Geometry and Topology · Advanced

Leibniz Algebra Automorphism Groups

Mathematics · Algebra · Advanced

Lucas Pseudoprime Tests in Quadratic Fields

Mathematics · Number Theory · Advanced

Manhattan Polygon Inequalities in Geometry

Mathematics · Geometry and Topology · Advanced

Modeling Misinformation Spread on Twitter/X

Mathematics · Other · Advanced

Modeling School Crowd Flow With Math

Mathematics · Other · Advanced

Near-Rings From Polynomial Maps Mod n

Mathematics · Algebra · Advanced

Noisy Secretary Problem Thresholds

Mathematics · Probability and Statistics · Advanced

NYC Subway Delay Percolation Model

Mathematics · Probability and Statistics · Advanced

Origami Fold Rigidity in Twist Tessellations

Mathematics · Other · Advanced

Percolation Cluster Size Concentration

Mathematics · Probability and Statistics · Advanced

Picard Iteration for Delay Equations

Mathematics · Analysis · Advanced

PINN Error Analysis for Burgers’ Equation

Mathematics · Analysis · Advanced

Polygon Billiard Entropy in Rational-Angled Shapes

Mathematics · Geometry and Topology · Advanced

Polyhedron Net Counts and Shape Perturbations

Mathematics · Geometry and Topology · Advanced

Prime Factors of n! + 1

Mathematics · Number Theory · Advanced

Proving a Thermostat Periodic Orbit with Python

Mathematics · Analysis · Advanced

Quantitative Ergodic Theorems for Torus Rotations

Mathematics · Analysis · Advanced

Randomized Kaczmarz With Noisy Linear Systems

Mathematics · Analysis · Advanced

Ranked-Choice Tie-Breaking in Elections

Mathematics · Other · Advanced

Reaction Network Reasoning With Python Spans

Mathematics · Other · Advanced

Riffle Shuffle Mixing Time and Markov Chains

Mathematics · Probability and Statistics · Advanced

Rook Polynomials on Skew Boards

Mathematics · Algebra · Advanced

Spectral Gaps in Fractal Rectangles

Mathematics · Analysis · Advanced

Squarefree Gaps in Arithmetic Progressions

Mathematics · Number Theory · Advanced

Stern-Brocot Depth Statistics in Rational Numbers

Mathematics · Number Theory · Advanced

Ternary ABC Quality in Number Theory

Mathematics · Number Theory · Advanced

Theta-Graph Tricolorability Invariants

Mathematics · Geometry and Topology · Advanced

Urban Street Network Topology and City Scaling

Mathematics · Geometry and Topology · Advanced

Wasserstein Dialect Boundary Detection

Mathematics · Other · Advanced

Zero-Divisor Graphs of Quotient Rings

Mathematics · Algebra · Advanced
