How to Do Real Mathematics Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
Mathematics research used to live inside university seminar rooms and subscription-only journals. Today it runs on a laptop, a free cloud notebook, and a stack of open tools that professional mathematicians actually use.
This guide is your starting point. It covers three things: the home kit you need, the free software that powers modern math research, and the public databases that turn your laptop into a real research workstation.
Why this is possible now
Open mathematical databases changed everything. The OEIS catalogs hundreds of thousands of integer sequences. The LMFDB exposes deep number-theoretic objects (L-functions, modular forms, elliptic curves) that once required specialist software. House of Graphs, ISGCI, and SNAP give you graphs and networks at every scale. You can ask questions against real data on day one.
Professional-grade open-source mathematics is now free. SageMath, PARI/GP, GAP, Macaulay2, Singular, Lean 4 with mathlib, Coq, and Isabelle/HOL match or exceed what was locked behind expensive licenses ten years ago. The same goes for numerics (NumPy, SciPy, Julia) and optimization (OR-Tools, Z3, MiniZinc, SCIP).
Free cloud compute closed the last gap. Google Colab and Kaggle give you free GPU and CPU time, within session and usage limits that change over time. arXiv and OpenAlex let you read the literature for free. Overleaf and Quarto handle the writeup.
A laptop on your kitchen counter can now enumerate algebraic structures, search for new prime patterns, formalize a theorem, and typeset the result in publication-ready LaTeX, all before dinner.
The mathematics home kit
Math is one of the lowest-cost ISEF categories, but a few deliberate setup choices make a real difference.
Hardware
- A modern laptop with 8+ GB RAM (anything from the last five years works).
- An external monitor or a tablet as a second screen for reading papers while coding (optional but nice).
- A notebook and pen for proof sketching. Paper still beats screens for diagrams.
- Optional: a small home FDM 3D printer if your project touches geometry or origami rigidity (under $300 for a starter unit).
Workspace
- A quiet, distraction-free desk.
- A free Google account for Colab and Drive.
- A free GitHub account for version control and to share code with a mentor.
Cloud compute
- Google Colab (free GPU/CPU notebooks).
- Kaggle Notebooks (another free compute pool with persistent datasets).
- Your school's library access for MathSciNet and zbMATH, if available.
Writing setup
- Overleaf free tier for LaTeX.
- Quarto for combined code-and-prose reports.
- Inkscape for diagrams that go beyond TikZ.
Approximate total cost: $0, or up to about $300 if you add a 3D printer.
The signature technique: running professional math software in a free Colab notebook
The single move that unlocks the most projects in mathematics is learning to drive SageMath (or PARI/GP, or GAP) inside a Colab notebook. Here is the five-step workflow.
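- Open a fresh Colab notebook. In the first cell, install the software you need. PARI/GP is usually the quickest to set up, through the system package manager or through Python bindings installed with pip. A full SageMath install is heavier and typically goes through a conda-style environment. Restart the runtime if the installer asks you to, so the kernel picks up the new packages.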
- Pull in a public dataset or a generated object. Load an OEIS sequence, an LMFDB curve, a House of Graphs file, or a family of objects you build yourself (semirings of order n, Latin squares, polytopes).
- Run an exhaustive or Monte Carlo experiment. Enumerate small cases, compute the invariant you care about, and save the results to a CSV or JSON file in your Google Drive.
- Look for a pattern. Plot with matplotlib, query OEIS to see if your sequence already exists, and form a conjecture.
- Try to prove the conjecture. Sketch a proof by induction, by generating function, or by reduction to a known theorem. If the proof is small and clean, formalize it in Lean 4 to lock it in.
This loop is what real mathematicians do. Experiment, conjecture, prove. The fact that the tools are free does not make the work less real.
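The loop above can be sketched in plain Python before any math software is installed. This is a minimal illustration, not a prescribed method: the object (derangements, meaning permutations with no fixed point), the function names, and the CSV layout are all assumptions chosen for the example. It enumerates small cases, records the counts, and tests a conjectured recurrence on the computed data. A passing check is evidence, not a proof.

```python
import csv
import io
from itertools import permutations


def count_derangements(n):
    """Brute-force count of permutations of range(n) with no fixed point."""
    return sum(
        all(image != i for i, image in enumerate(perm))
        for perm in permutations(range(n))
    )


def run_experiment(max_n):
    """Step 3: enumerate small cases and record the invariant."""
    return [(n, count_derangements(n)) for n in range(max_n + 1)]


def recurrence_holds(rows):
    """Step 4: test the conjecture D(n) = (n - 1) * (D(n-1) + D(n-2))."""
    counts = [d for _, d in rows]
    return all(
        counts[n] == (n - 1) * (counts[n - 1] + counts[n - 2])
        for n in range(2, len(counts))
    )


def to_csv(rows):
    """Serialize results as CSV text (hypothetical column names)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["n", "derangements"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = run_experiment(7)
print(to_csv(rows))
print("recurrence holds on all computed cases:", recurrence_holds(rows))
```

In Colab, the CSV text could be written to a file in a mounted Drive folder. Step 5 is then the real work: proving the recurrence for every n, not just the cases checked.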
The dry-lab side: free software you can install today
Computer algebra and discrete math
- SageMath: one Python-flavored environment that wraps most open math software.
- PARI/GP: fast number-theoretic computations, especially for sieving and large-integer experiments.
- GAP: groups, semigroups, near-rings, and combinatorial structures.
- Macaulay2 and Singular: commutative algebra, Gröbner bases, polynomial ideals.
- Magma Calculator (online): free web access to a subset of Magma for short jobs.
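A small brute-force computation in plain Python is a useful cross-check for results from a system such as GAP or SageMath. The sketch below is an illustration only (the function names are hypothetical, and it does not use GAP). It computes the probability that two random elements of the symmetric group S_n commute, by testing every pair.

```python
from fractions import Fraction
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p' on range(len(p))."""
    return tuple(p[q[i]] for i in range(len(q)))


def commuting_probability(n):
    """Probability that two uniformly random elements of S_n commute."""
    group = list(permutations(range(n)))
    commuting_pairs = sum(
        compose(g, h) == compose(h, g) for g in group for h in group
    )
    return Fraction(commuting_pairs, len(group) ** 2)


for n in range(1, 5):
    print(n, commuting_probability(n))
```

Exhaustive pair checking grows like (n!)^2, so it only works for tiny cases. That limit is the reason to move to a dedicated system once the small cases agree.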
Proof assistants and formal math
- Lean 4 with mathlib: a large and rapidly growing formal math library. Modern, well-documented, friendly community.
- Coq: long-established, used widely in formal verification.
- Isabelle/HOL and Agda: alternative ecosystems with strong communities.
- Metamath: a minimalist proof system, good for understanding foundations.
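To give a feel for what a formalized proof looks like, here is a short Lean 4 example of proof by induction on the natural numbers. It is a sketch for orientation: the theorem name `zero_add_example` is made up for this guide, and the proof uses only Lean's core library, not mathlib.

```lean
-- Claim: adding zero on the left leaves a natural number unchanged.
theorem zero_add_example (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl                          -- base case: 0 + 0 = 0 by computation
  | succ k ih => rw [Nat.add_succ, ih]   -- step: rewrite with the induction hypothesis
```

The two cases mirror a paper proof: a base case and an inductive step that uses the hypothesis `ih`.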
Numerics, ML, and scientific computing
- Python with NumPy, SciPy, SymPy, mpmath, NetworkX, and igraph: the default toolkit.
- Julia with JuMP, Graphs.jl, DifferentialEquations.jl, and DynamicalSystems.jl: fast, modern, great for dynamics and optimization.
- PyTorch and JAX: for neural networks, including physics-informed networks and ML for math.
- scikit-learn, XGBoost, Optuna, statsmodels, PyMC, Stan, and R: statistics and machine learning.
Optimization, SAT, and SMT
- Z3 and CVC5: SMT solvers that can check constraint systems and small algebraic conjectures.
- OR-Tools, MiniZinc, and SCIP: constraint and integer programming.
- Kissat and CaDiCaL: state-of-the-art SAT solvers.
- Gurobi and CPLEX: commercial mixed-integer solvers with free academic licenses. Check whether your school affiliation qualifies before planning a project around them.
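SAT solvers such as Kissat and CaDiCaL read problems in the DIMACS CNF text format, so much of the work is encoding a question as clauses. The sketch below is one possible encoding, with hypothetical function names. It turns "can this graph be properly colored with k colors?" into CNF, prints the DIMACS text, and uses a tiny brute-force check as a stand-in for a real solver.

```python
from itertools import product


def coloring_cnf(num_vertices, edges, num_colors):
    """Encode proper graph coloring as CNF clauses (lists of signed ints)."""
    def var(v, c):
        # DIMACS variables are numbered from 1.
        return v * num_colors + c + 1

    clauses = []
    for v in range(num_vertices):
        # Each vertex receives at least one color.
        clauses.append([var(v, c) for c in range(num_colors)])
    for u, v in edges:
        for c in range(num_colors):
            # Adjacent vertices never share color c.
            clauses.append([-var(u, c), -var(v, c)])
    return clauses


def to_dimacs(clauses, num_vars):
    """Render clauses in DIMACS CNF: a header line, then 0-terminated clauses."""
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    lines += [" ".join(map(str, clause)) + " 0" for clause in clauses]
    return "\n".join(lines) + "\n"


def brute_force_satisfiable(clauses, num_vars):
    """Tiny reference check; real instances go to a SAT solver instead."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False


triangle = [(0, 1), (1, 2), (0, 2)]
two_colors = coloring_cnf(3, triangle, 2)
three_colors = coloring_cnf(3, triangle, 3)
print(to_dimacs(two_colors, 6))
print("triangle 2-colorable:", brute_force_satisfiable(two_colors, 6))
print("triangle 3-colorable:", brute_force_satisfiable(three_colors, 9))
```

The brute-force check tries every assignment, so it is only a sanity test for instances with a handful of variables.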
Visualization and writing
- matplotlib, plotly, manim, TikZ, Asymptote, GeoGebra, and Desmos: figures and animations.
- Overleaf, Quarto, and Inkscape: typesetting and final-figure polish.
Running the same software professional mathematicians run changes how the work feels. You are not simulating research; you are doing it.
Public databases that count as real data
Sequences and number-theoretic objects
- OEIS: the encyclopedia of integer sequences. Contributing a new sequence is itself a research output.
- LMFDB: L-functions, modular forms, elliptic curves, number fields. A treasure trove for number theory projects.
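Checking whether a computed sequence is already known can be scripted. The sketch below builds an OEIS search URL that requests JSON output. The helper names are hypothetical. The layout of the JSON response is not assumed here, so inspect what comes back before relying on any particular keys. The download function needs network access and is defined but not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

OEIS_SEARCH = "https://oeis.org/search"


def oeis_search_url(terms):
    """Build a search URL asking OEIS for JSON results matching the terms."""
    query = ",".join(str(t) for t in terms)
    return OEIS_SEARCH + "?" + urlencode({"q": query, "fmt": "json"})


def fetch_oeis(terms, timeout=30):
    """Download and parse the response (needs network access)."""
    with urlopen(oeis_search_url(terms), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(oeis_search_url([1, 0, 1, 2, 9, 44, 265]))
```

Pasting the printed URL into a browser is a quick way to see the raw response before writing any parsing code.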
Graphs and networks
- House of Graphs: curated graphs with known invariants, great for testing conjectures.
- ISGCI: the information system on graph classes and their inclusions.
- Stanford SNAP and Network Repository: large real-world networks (social, road, citation).
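Small graphs are often exchanged in graph6, a compact text format associated with the nauty tools. House of Graphs appears to offer it among its download formats, but check the site. The decoder below is a minimal sketch that handles only graphs with at most 62 vertices, and its function names are hypothetical. In practice a library such as NetworkX can read the format for you.

```python
def decode_graph6(text):
    """Decode a graph6 string for a graph with at most 62 vertices."""
    data = [ord(ch) - 63 for ch in text.strip()]
    n = data[0]
    if not 0 <= n <= 62:
        raise ValueError("this sketch only handles graphs with <= 62 vertices")
    # Remaining characters each carry 6 bits, most significant bit first.
    bits = [(value >> shift) & 1
            for value in data[1:] for shift in range(5, -1, -1)]
    edges = []
    position = 0
    # Bits list the upper triangle column by column: (0,1), (0,2), (1,2), ...
    for j in range(1, n):
        for i in range(j):
            if bits[position]:
                edges.append((i, j))
            position += 1
    return n, edges


def degree_sequence(n, edges):
    """Return vertex degrees sorted from largest to smallest."""
    degrees = [0] * n
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return sorted(degrees, reverse=True)


n, edges = decode_graph6("Bw")  # encodes the triangle K3
print(n, edges, degree_sequence(n, edges))
```

Once a graph is an edge list, any invariant you can define becomes a function you can run across a whole downloaded family.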
Machine learning and stats benchmarks
- UCI Machine Learning Repository and Kaggle Datasets: classic and current ML benchmarks.
- MIMIC-IV (credentialed access through PhysioNet, after completing the required training and data use agreement): medical time series, when your project goes applied.
Real-world signals
- OpenStreetMap and GTFS feeds: road networks and public-transit schedules for graph and percolation projects.
- EPA AirNow, NOAA, and USGS: environmental data for change-point and time-series work.
- FRED, Yahoo Finance, FAOSTAT, and USDA: economic and agricultural data.
- FiveThirtyEight data repo: clean datasets with documented context.
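For change-point work on series like these, it helps to test a method on synthetic data where the answer is known before trusting it on real measurements. The sketch below is one simple approach, not a recommended standard. It finds the single split that minimizes squared error around two segment means. The data are simulated, and the function name is hypothetical.

```python
import random
from statistics import fmean


def best_single_change_point(series):
    """Return the split index minimizing total squared error around two means."""
    def sse(segment):
        mean = fmean(segment)
        return sum((x - mean) ** 2 for x in segment)

    candidates = range(1, len(series))
    return min(candidates, key=lambda k: sse(series[:k]) + sse(series[k:]))


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
before = [rng.gauss(10.0, 1.0) for _ in range(60)]  # mean 10 for 60 points
after = [rng.gauss(15.0, 1.0) for _ in range(40)]   # mean jumps to 15
estimate = best_single_change_point(before + after)
print("true change at index 60, estimated at", estimate)
```

This version recomputes each segment from scratch, which is fine for a few thousand points. Real sensor data usually also needs handling for missing values, trends, and more than one change.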
Literature and metadata
- arXiv: free preprints. Set up an RSS feed for the subcategories you care about.
- OpenAlex: open scholarly metadata. Useful for bibliometric and co-authorship projects.
Re-analyzing public data with a new method is itself a legitimate research path. A clean result on a known dataset is worth more than a sloppy result on a new one.
How to combine theory and computation: the strongest project shape
Pattern A: experiment, conjecture, prove. Enumerate small cases of an object (groups of order ≤ 24, polytopes with ≤ 20 lattice points, semirings of order ≤ 12) in SageMath or GAP. Spot a pattern in the resulting sequence using OEIS. Prove the pattern for a tractable sub-family by induction, generating functions, or a structural argument.
Pattern B: bound, then verify. Prove an analytic or combinatorial bound (a Hardy-type inequality, a cop-number lower bound, a Berry-Esseen rate). Verify the bound numerically on a large parameter sweep in Python or Julia. Show your bound is sharp by exhibiting an extremal example.
This hybrid shape resonates with judges because it shows both rigorous proof and concrete evidence the proof reflects reality.
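Pattern B can be illustrated with the classical discrete Hardy inequality: for nonnegative numbers a_1, ..., a_N with partial sums A_n = a_1 + ... + a_n, the sum of (A_n / n)^2 is at most 4 times the sum of a_n^2. The sketch below takes the inequality as the proved bound and checks it numerically. The random sweep and the near-extremal family a_n = n^(-1/2) are choices made for this example, and the function name is hypothetical.

```python
import random


def hardy_ratio(a):
    """Ratio sum((A_n / n)^2) / sum(a_n^2), where A_n = a_1 + ... + a_n."""
    partial = 0.0
    lhs = 0.0
    for n, value in enumerate(a, start=1):
        partial += value
        lhs += (partial / n) ** 2
    rhs = sum(value * value for value in a)
    return lhs / rhs


rng = random.Random(1)  # fixed seed for a reproducible sweep
worst = 0.0
for trial in range(2000):
    length = rng.randint(1, 200)
    a = [rng.random() + 1e-9 for _ in range(length)]  # strictly positive entries
    worst = max(worst, hardy_ratio(a))

# Near-extremal family a_n = n^(-1/2): the ratio approaches 4 slowly as N grows.
slow = hardy_ratio([n ** -0.5 for n in range(1, 100001)])
print("largest ratio in random sweep:", worst)
print("ratio for a_n = n^(-1/2), N = 100000:", slow)
```

Random inputs tend to sit far below the constant, so a sweep alone says little about sharpness. A structured family that pushes the ratio upward is what supports the claim that the bound cannot be improved.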
Choosing a phenomenon that has not been done
- Search Google Scholar for the closest two or three keywords to your candidate question. Skim the most recent five years of results. If the exact question appears, narrow your scope (a smaller parameter range, a stricter hypothesis, a new invariant).
- Check the Society for Science abstracts archive for ISEF and Regeneron STS finalists. Search by keyword to see which angles students have already pursued.
- Search arXiv and MathSciNet (or zbMATH) by MSC code for the subject area. Read the introductions of three recent papers, and note which open questions they list. Open questions in published papers are gold.
Finding adjacent prior work is good news, not bad news. It means your question lives in an active area and you now know exactly where the frontier is.
A realistic timeline
- One to two weeks (focused replication or experiment): reproduce a small published result, extend its data by one parameter, and write a 5-page report.
- One to two months (full hybrid project for a regional fair): run a full Pattern A or Pattern B project, with a clean writeup in LaTeX and reproducible code on GitHub.
- Full year (ISEF-track project): combine theory, large-scale computation, and a formalization or public-data contribution (a new OEIS sequence, a mathlib pull request, an LMFDB-style table).
If this is your first research project, start with the one-to-two-week version. Finishing a small project teaches you more than half-finishing a big one.
A starter checklist
- A quiet workspace and a paper notebook for proof sketches.
- A free Google Colab account, plus a Google Drive folder for the project.
- A local Python environment (Anaconda or plain venv) with NumPy, SciPy, SymPy, NetworkX, and matplotlib installed.
- SageMath installed locally or available as a Colab cell.
- A free Overleaf account with a blank LaTeX project named for your research question.
- A GitHub repo for your code, with a README and a license.
- A single written sentence stating your research question, taped above your desk.
If you have all seven, you are ready to pick a phenomenon.
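The Python part of the checklist can be verified with a few lines. This is a small sketch with a hypothetical function name. It reports which of the listed packages Python cannot find, without importing them.

```python
from importlib.util import find_spec

# Import names (not always identical to the pip package names).
CHECKLIST_MODULES = ["numpy", "scipy", "sympy", "networkx", "matplotlib"]


def missing_modules(names):
    """Return the modules from `names` that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(CHECKLIST_MODULES)
if missing:
    print("Still to install:", ", ".join(missing))
else:
    print("All checklist packages are importable.")
```

Run it inside the same environment (venv, Anaconda, or Colab runtime) you plan to use for the project, since each environment has its own set of installed packages.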
Where to go next
Mathematics at ISEF splits into six named subcategories plus an Other category. Each one has its own MehtA+ project guide that uses the kit and tools on this page. Pick the one that pulls you in.
- Algebra (ALB): structure of groups, rings, semirings, and their representations. Heavy use of SageMath and GAP.
- Analysis (ANL): real, complex, and functional analysis. Convergence rates, inequalities, dynamical systems, and PINNs.
- Combinatorics, Graph Theory, and Game Theory (CGG): counting, graphs, and games. Heavy use of NetworkX, OEIS, and SAT/SMT solvers.
- Geometry and Topology (GEO): shapes, knots, polytopes, persistent homology, and billiards.
- Number Theory (NUM): primes, modular forms, elliptic curves, and sieves. Heavy use of LMFDB and PARI/GP.
- Probability and Statistics (PRO): concentration inequalities, conformal prediction, change-point detection, and applied statistics.
- Other (OTH): formal verification, mathematical modeling, category theory, and cross-disciplinary work.
Pick the subcategory that interests you most and open its MehtA+ guide. The kitchen counter and laptop you have now are enough.
Project ideas in this category (60)
Geometry and Topology · Advanced
Berry-Esseen Bounds for Wage Inequality Studies · Probability and Statistics · Advanced
Bird Migration Loop Signatures · Geometry and Topology · Advanced
Carmichael Function Divisibility Search · Number Theory · Advanced
Cayley Graph Spectra and Expansion in Small Groups · Algebra · Advanced
Change-Point Detection for Sensor Data · Probability and Statistics · Advanced
Community Detection Thresholds in Network Data · Probability and Statistics · Advanced
Commuting Probability in Finite Semigroups · Algebra · Advanced
Conformal Ranking Algorithms for Public Data · Probability and Statistics · Advanced
Continued Fraction Multifractal Spectra · Analysis · Advanced
Cunningham Chains in Number Fields · Number Theory · Advanced
Discrete Hardy Inequalities on Trees · Analysis · Advanced
Doubly Robust Network Treatment Effects · Probability and Statistics · Advanced
Elliptic Curves for Cubic Sum Solutions · Number Theory · Advanced
Fair Ranking Calibration for ML Models · Probability and Statistics · Advanced
Fast Road Path Algorithms With Tropical Geometry · Other · Advanced
Finite Field Polynomial Factor Patterns · Algebra · Advanced
Finite Semiring Classification with Z3 · Algebra · Advanced
Formal Game Theory Proofs in Lean 4 · Other · Advanced
Formal Verification for Climate Model Accuracy · Other · Advanced
Fractional Logistic Map Dynamics and Bifurcations · Analysis · Advanced
Gröbner Bases for Sudoku-Style Design Counting · Algebra · Advanced
Group Ring Subalgebras and Möbius Counting · Algebra · Advanced
High School Course Choice Equilibrium Models · Other · Advanced
Higher-Order Data Smoothing With Mollifiers · Analysis · Advanced
Hyperbolic Rep-Tiles and Self-Similarity · Geometry and Topology · Advanced
Icosahedral Polytope Classification in 4D · Geometry and Topology · Advanced
Isogeny Hashes and Collision Resistance · Number Theory · Advanced
Iwasawa Lambda Patterns in Quadratic Fields · Number Theory · Advanced
Lattice-Point Polytopes in 3D · Geometry and Topology · Advanced
Leibniz Algebra Automorphism Groups · Algebra · Advanced
Lucas Pseudoprime Tests in Quadratic Fields · Number Theory · Advanced
Manhattan Polygon Inequalities in Geometry · Geometry and Topology · Advanced
Modeling Misinformation Spread on Twitter/X · Other · Advanced
Modeling School Crowd Flow With Math · Other · Advanced
Near-Rings From Polynomial Maps Mod n · Algebra · Advanced
Noisy Secretary Problem Thresholds · Probability and Statistics · Advanced
NYC Subway Delay Percolation Model · Probability and Statistics · Advanced
Origami Fold Rigidity in Twist Tessellations · Other · Advanced
Percolation Cluster Size Concentration · Probability and Statistics · Advanced
Picard Iteration for Delay Equations · Analysis · Advanced
PINN Error Analysis for Burgers’ Equation · Analysis · Advanced
Polygon Billiard Entropy in Rational-Angled Shapes · Geometry and Topology · Advanced
Polyhedron Net Counts and Shape Perturbations · Geometry and Topology · Advanced
Prime Factors of n! + 1 · Number Theory · Advanced
Proving a Thermostat Periodic Orbit with Python · Analysis · Advanced
Quantitative Ergodic Theorems for Torus Rotations · Analysis · Advanced
Randomized Kaczmarz With Noisy Linear Systems · Analysis · Advanced
Ranked-Choice Tie-Breaking in Elections · Other · Advanced
Reaction Network Reasoning With Python Spans · Other · Advanced
Riffle Shuffle Mixing Time and Markov Chains · Probability and Statistics · Advanced
Rook Polynomials on Skew Boards · Algebra · Advanced
Spectral Gaps in Fractal Rectangles · Analysis · Advanced
Squarefree Gaps in Arithmetic Progressions · Number Theory · Advanced
Stern-Brocot Depth Statistics in Rational Numbers · Number Theory · Advanced
Ternary ABC Quality in Number Theory · Number Theory · Advanced
Theta-Graph Tricolorability Invariants · Geometry and Topology · Advanced
Urban Street Network Topology and City Scaling · Geometry and Topology · Advanced
Wasserstein Dialect Boundary Detection · Other · Advanced
Zero-Divisor Graphs of Quotient Rings · Algebra · Advanced
