Adaptive Graph Algorithm Selection for Shortest Paths

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Algorithms · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

Some graph algorithms are fast on one network and slow on another. That means the best method is not always the most famous one. Your project asks a smart question: can a model predict which algorithm will win before you run it? If you get this right, you learn both graph theory and real performance engineering.

What Is It?

This topic is about algorithm selection. You feed a graph into a system, and the system chooses between two or more shortest-path methods before solving the problem. One class of methods uses matrix multiplication ideas, which can be fast on some inputs. Another class uses combinatorial tricks, which often work better on real-world graphs. Your meta-algorithm is like a coach picking the right runner for each race.

The key idea is graph features. These are measurable properties of a graph, such as number of nodes, number of edges, density, degree distribution, clustering, and whether the graph looks sparse or dense. A learned model uses those features to predict which solver should finish first, or which solver gives the best speed-quality tradeoff. The hard part is making those predictions without losing correctness guarantees. That makes this topic a mix of machine learning, graph theory, and systems thinking.

Why This Is a Good Topic

This is a strong science fair topic because you can test a clear prediction, compare real algorithms, and measure speed on many graphs. It connects to routing, social networks, web graphs, and large-scale data systems. You can also study a real engineering question, not just a toy benchmark, because different graph families behave very differently. A student can learn feature engineering, benchmarking, and statistical comparison while building something that feels like a real research tool.

Research Questions

How does graph density affect which shortest-path algorithm finishes first?
What is the effect of using degree-based features on algorithm-choice accuracy?
Does a learned selector beat a fixed rule such as always choosing the combinatorial solver?
To what extent do clustering coefficient and degree variance predict runtime on SNAP graphs?
Which graph features best separate cases where matrix-based methods outperform combinatorial methods?
How does training on one graph family affect selector performance on a different graph family?

Basic Materials

Laptop or desktop computer with at least 16 GB RAM.
Python 3 installation.
NetworkX for graph handling.
Pandas for feature tables.
NumPy for numeric work.
scikit-learn for classifier training.
Access to SNAP graph datasets or other public graph collections.
Spreadsheet software for organizing benchmark results.

Advanced Materials

Linux workstation or server with strong CPU and enough memory for large graphs.
Python with SciPy, scikit-learn, and NetworkX.
C or C++ compiler for faster graph kernels.
Implementation of at least two shortest-path or APSP methods.
Job scheduler or batch runner for repeated benchmarks.
Large public graph sets from SNAP or similar sources.
Memory profiler and timing tools for detailed runtime analysis.

Software & Tools

Python: Runs graph feature extraction, benchmarking scripts, and model training.
NetworkX: Loads graphs and computes structural features for each dataset.
scikit-learn: Trains and evaluates a selector model for algorithm choice.
Pandas: Organizes graph metrics, runtimes, and prediction results.
Matplotlib: Plots runtime, accuracy, and feature importance comparisons.

Experiment Steps

Define the graph problem you will predict, such as all-pairs shortest paths or another shortest-path task, and list the candidate algorithms you will compare.
Choose graph features that a selector can measure before running the solver, then decide how you will store them in a table.
Build a benchmark set of graphs with different sizes, densities, and structures so your model sees varied cases.
Train a simple classifier first, then compare it with a hand-built rule such as density thresholding.
Test whether the selector improves runtime without hurting correctness, and plan how you will measure worst-case behavior.
Analyze which features matter most, then check whether the same rule works across different graph families.

Common Pitfalls

Using only one graph family, which makes the selector look better than it really is on varied inputs.
Measuring runtime once per graph, which hides noise from caching, background processes, and OS scheduling.
Letting graph size alone decide the winner, which misses structural features that actually drive performance.
Comparing algorithms with different output definitions, which makes the accuracy and runtime numbers unfair.
Training and testing on nearly identical graphs, which causes data leakage and inflates performance.

What Makes This Competitive

A competitive version of this project would test many graph families, not just one dataset. You would also compare a learned selector against strong human-designed rules and report where each one fails. Better projects explain the tradeoff between speed and correctness, then back it up with clean statistics and repeated trials. The strongest entries often include a feature-importance analysis or a cross-dataset test that shows the idea can generalize.

Project Variations

Use weighted social networks instead of unweighted graphs, and test whether edge weights change the selector’s best choice.
Swap shortest paths for connected components or centrality, then study whether the same feature set still predicts the fastest solver.
Compare a learned classifier with a regression model that predicts actual runtime, then pick the algorithm with the lower predicted cost.

Learn More

SNAP datasets: Find large real-world graph datasets from Stanford Network Analysis Project, then use them for benchmarking and feature extraction.
MIT OpenCourseWare: Search for algorithms and graph theory course materials, then review shortest paths, complexity, and graph representations.
USGS data portals: Explore public network-style datasets and geospatial graphs, then practice cleaning large real-world structures.
arXiv: Search for papers on algorithm selection, graph algorithms, and runtime prediction to see current research methods.
PubMed: Search for graph-based machine learning review articles if you want examples of feature-based model comparison, especially in network analysis.

Systems Software Category Guide

How to Do Real Systems Software Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →