OpenStax Prerequisite Routing System
ISEF Category: Systems Software
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Online Learning · Difficulty: Advanced · Setup: University Lab · Time: Full Year
The Hook
Most textbooks are written like straight roads, but students do not learn in a straight line. When you get stuck, the fastest fix is often one missing idea, not the whole chapter. This project builds a system that maps those missing ideas and sends you to the cheapest next step.
What Is It?
This project turns a textbook into a map of concepts. Each concept becomes a node, and each prerequisite becomes an arrow pointing from the prerequisite to the concept that depends on it. That kind of map is called a directed acyclic graph, or DAG: every arrow points one way, and no chain of arrows ever leads back to where it started.
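To make the DAG idea concrete, here is a minimal sketch that stores a tiny prerequisite map and checks that it really is acyclic. The concept names are made up for illustration, and the sketch uses Python's standard-library graphlib module (Python 3.9 or later) rather than NetworkX, which offers similar checks.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical concepts: each key maps to the set of concepts it directly depends on.
PREREQS = {
    "fractions": set(),
    "ratios": {"fractions"},
    "linear_equations": {"ratios"},
    "slope": {"ratios", "linear_equations"},
}

def study_order(prereq_map):
    """Return concepts ordered so every prerequisite comes before its dependents."""
    return list(TopologicalSorter(prereq_map).static_order())

def is_dag(prereq_map):
    """Return True if the prerequisite map has no cycles."""
    try:
        study_order(prereq_map)
        return True
    except CycleError:
        return False

print(study_order(PREREQS))
print(is_dag({"a": {"b"}, "b": {"a"}}))  # a two-concept loop is not a DAG
```

A cycle check like this is worth running every time you change your graph-building rules, because a single wrong edge can make a concept its own prerequisite.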
Think of it like a subway system for learning. If you miss one stop, the system can route you to the nearest useful stop instead of making you start over. The goal is not just to find any review page. The goal is to find the path that repairs the gap with the least extra work, which is not always the path with the fewest stops.
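One possible reading of "repairing the gap" is: review every unmastered prerequisite of the concept you are stuck on, and nothing else. The sketch below follows that reading. The concept names, the review costs in minutes, and the assumption that a mastered concept does not need its own prerequisites rechecked are all illustrative choices, not part of any fixed method.

```python
# Hypothetical prerequisite map: concept -> concepts it directly depends on.
PREREQS = {
    "fractions": [],
    "ratios": ["fractions"],
    "exponents": [],
    "linear_equations": ["ratios"],
    "slope": ["ratios", "linear_equations"],
}
# Hypothetical review cost per concept, in minutes.
COST = {"fractions": 10, "ratios": 15, "exponents": 20,
        "linear_equations": 30, "slope": 25}

def remediation_plan(target, mastered, prereqs=PREREQS, cost=COST):
    """Return (ordered list of concepts to review, total cost in minutes).

    Walks backward from the target and stops at mastered concepts.
    Appending a concept only after its prerequisites puts the plan in study order.
    """
    plan, seen = [], set()

    def visit(concept):
        if concept in seen or concept in mastered:
            return
        seen.add(concept)
        for parent in prereqs[concept]:
            visit(parent)
        plan.append(concept)

    visit(target)
    return plan, sum(cost[c] for c in plan)

plan, total = remediation_plan("slope", mastered={"fractions", "ratios"})
print(plan, total)
```

Notice that "exponents" never appears in the plan, because nothing on the way to "slope" depends on it. That skipped review is the time saving the routing system is meant to deliver.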
The evaluation part matters too. Instead of testing only with real students, you can train simulated student models on EdNet, a large dataset of student answer histories. Then you can ask whether your routing system helps a model recover faster, make fewer mistakes, or spend less time on unnecessary review.
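Before training anything on EdNet, it can help to see what a very simple simulated student looks like. The sketch below uses Bayesian Knowledge Tracing, a classic student model that is swapped in here purely as a stand-in. The slip, guess, and learn values are invented defaults, not parameters fitted to EdNet.

```python
def bkt_update(p_known, correct, slip=0.1, guess=0.2, learn=0.15):
    """One Bayesian Knowledge Tracing step.

    First condition the mastery estimate on the observed answer,
    then apply the chance that the student learned during this step.
    """
    if correct:
        evidence = p_known * (1 - slip)
        posterior = evidence / (evidence + (1 - p_known) * guess)
    else:
        evidence = p_known * slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - guess))
    return posterior + (1 - posterior) * learn

def steps_to_mastery(answers, p_known=0.3, threshold=0.95):
    """Count answers until the mastery estimate reaches the threshold.

    Returns None if the threshold is never reached.
    """
    for step, correct in enumerate(answers, start=1):
        p_known = bkt_update(p_known, correct)
        if p_known >= threshold:
            return step
    return None

print(steps_to_mastery([True, True, True, True, True]))
print(steps_to_mastery([False, False, False]))
```

A count like steps_to_mastery gives you one candidate "recovery speed" metric: run it once for each routing method and compare the results.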
Why This Is a Good Topic
This is a strong science fair topic because you can measure it with clear numbers. You can compare routing methods, test graph-building choices, and check whether one remediation path beats another on simulated students. The project connects to personalized learning, textbook design, and student support tools, so the real-world use is easy to explain. You can also learn how to work with text data, graph algorithms, and evaluation metrics without needing a wet lab.
Research Questions
- How does a prerequisite DAG built from OpenStax compare with a chapter-order baseline for finding the next best review node?
- What is the effect of different graph construction rules on the accuracy of remediation path recommendations?
- Does routing by lowest estimated review cost improve simulated student recovery more than routing by shortest path?
- To what extent do concept embeddings improve prerequisite detection compared with keyword overlap alone?
- Which textbook sections produce the most common dead ends when routed through the concept graph?
- How does a student model trained on EdNet respond to remediation paths chosen by different scoring methods?
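The question about lowest review cost versus shortest path can be explored with one routing function run under two scoring rules. The sketch below uses Dijkstra's algorithm with Python's heapq module. The concepts, edges, and minute values are hypothetical and chosen so the two rules disagree.

```python
import heapq

# Hypothetical edges: prerequisite -> [(dependent concept, minutes to review it)].
EDGES = {
    "ratios": [("proportions", 40), ("unit_rates", 10)],
    "unit_rates": [("rate_tables", 10)],
    "rate_tables": [("slope", 10)],
    "proportions": [("slope", 10)],
    "slope": [],
}

def best_route(start, goal, edges, use_cost=True):
    """Dijkstra's algorithm from start to goal.

    With use_cost=True an edge weighs its review minutes;
    with use_cost=False every edge weighs 1, which gives the fewest-hops route.
    Returns (total weight, path) or None if the goal is unreachable.
    """
    queue = [(0, start, [start])]
    settled = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, minutes in edges.get(node, []):
            if nxt not in settled:
                step = minutes if use_cost else 1
                heapq.heappush(queue, (dist + step, nxt, path + [nxt]))
    return None

print(best_route("ratios", "slope", EDGES, use_cost=False))
print(best_route("ratios", "slope", EDGES, use_cost=True))
```

In this toy graph the fewest-hops route passes through the expensive "proportions" node, while the lowest-cost route takes one extra hop and saves 20 minutes. Cases like this are where the two scoring methods should produce different student outcomes.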
Basic Materials
- Laptop with at least 16 GB RAM.
- OpenStax textbook PDFs or HTML pages.
- EdNet dataset access or a public subset.
- Python 3 installation.
- Jupyter Notebook.
- Text editor such as VS Code.
- Graph visualization tool such as Gephi or NetworkX plotting.
- Spreadsheet software for tracking experiments.
- Version control with Git.
- External storage for datasets and model files.
Advanced Materials
- University workstation or cloud compute access.
- GPU access for training larger student models.
- Natural language processing library such as spaCy or sentence-transformers.
- PyTorch or TensorFlow for model training.
- NetworkX or igraph for graph construction and analysis.
- Gephi for interactive graph inspection.
- Evaluation dataset split scripts.
- Annotation tool for marking prerequisite relations.
- Database system for storing concept nodes and edges.
- Statistical testing tools for comparison studies.
Software & Tools
- Python: Runs data cleaning, graph building, routing logic, and evaluation scripts.
- Jupyter Notebook: Lets you test ideas step by step and keep results in one place.
- NetworkX: Builds and analyzes the prerequisite graph structure.
- Gephi: Helps you inspect whether the concept graph has strange clusters or missing links.
- PyTorch: Trains simulated student models and compares routing strategies.
Experiment Steps
- Define the textbook scope and decide which OpenStax subject you will model first.
- Extract candidate concepts and prerequisite links, then choose rules for turning text into graph nodes and edges.
- Build a baseline route that follows chapter order, so you have something simple to beat.
- Design a cost score for remediation nodes, then decide what counts as the lowest-cost useful fix.
- Train or adapt simulated student models from EdNet, then test how they respond to different routes.
- Compare methods with the same metrics, then check whether your gains hold across multiple textbook sections.
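The baseline and comparison steps above can be sketched as two small routing functions scored with the same cost metric. Everything here is illustrative: the section names, the minute values, and the assumption that chapter order never places a concept before its own prerequisites.

```python
# Hypothetical chapter order, with review minutes per section.
CHAPTER_ORDER = [("fractions", 10), ("exponents", 20), ("ratios", 15),
                 ("linear_equations", 30), ("slope", 25)]
# Hypothetical prerequisite map: concept -> concepts it directly depends on.
PREREQS = {
    "fractions": [],
    "exponents": [],
    "ratios": ["fractions"],
    "linear_equations": ["ratios"],
    "slope": ["ratios", "linear_equations"],
}

def chapter_order_route(target, mastered):
    """Baseline: review every unmastered section from the chapter start up to the target."""
    route = []
    for name, _minutes in CHAPTER_ORDER:
        if name not in mastered:
            route.append(name)
        if name == target:
            break
    return route

def graph_route(target, mastered):
    """Graph method: review only the unmastered prerequisites of the target."""
    needed, stack = set(), [target]
    while stack:
        concept = stack.pop()
        if concept in mastered or concept in needed:
            continue
        needed.add(concept)
        stack.extend(PREREQS[concept])
    # Present the needed concepts in chapter order (assumed to respect prerequisites).
    return [name for name, _minutes in CHAPTER_ORDER if name in needed]

def route_cost(route):
    """Total review minutes for a route."""
    minutes = dict(CHAPTER_ORDER)
    return sum(minutes[name] for name in route)

mastered = {"fractions"}
baseline = chapter_order_route("slope", mastered)
routed = graph_route("slope", mastered)
print(baseline, route_cost(baseline))
print(routed, route_cost(routed))
```

Scoring both routes with the same route_cost function keeps the comparison fair. In a full project you would repeat this for many targets and many simulated students, not just one.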
Common Pitfalls
- Treating every heading as a concept, which makes the graph noisy and hard to use.
- Building prerequisite links from word overlap alone, which confuses related terms with true dependencies.
- Testing only one chapter, which makes the routing results too narrow to mean much.
- Treating a simulated student's score as if it were a real learning gain, which can hide weak recommendations.
- Ignoring baseline comparisons, which makes it impossible to tell whether the graph system helps at all.
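The word-overlap pitfall is easy to see in code. The sketch below computes Jaccard overlap, one common keyword-overlap score, on two made-up sentences. Because the score is symmetric, it can say that two concepts are related but not which one has to come first.

```python
def jaccard(text_a, text_b):
    """Share of distinct words that two texts have in common (0.0 to 1.0)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

# Hypothetical one-line descriptions of two concepts.
slope_text = "slope measures the rate of change of a line"
rate_text = "a rate of change compares two quantities"

forward = jaccard(slope_text, rate_text)
backward = jaccard(rate_text, slope_text)
print(round(forward, 3), round(backward, 3))
print(forward == backward)  # overlap alone cannot give an edge its direction
```

Any prerequisite rule built on overlap therefore needs a second signal for direction, such as which concept appears first in the book or a hand-labeled sample of edges.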
What Makes This Competitive
A stronger version of this project goes past a simple concept map. You could compare several graph-building rules, test multiple routing objectives, and report which one helps simulated students recover fastest. You could also add error analysis, such as which textbook topics create bad recommendations and why. That kind of careful evaluation shows real systems thinking, not just code that runs.
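To back up a claim that one method "recovers fastest", you could pair the results section by section and run a significance test. The sketch below is an exact sign-flip (paired permutation) test, chosen here only because it needs no extra libraries. The step counts are invented numbers, not results.

```python
from itertools import product

def sign_flip_p_value(baseline, treatment):
    """Two-sided exact sign-flip test on paired differences.

    Under the null hypothesis each paired difference is equally likely
    to be positive or negative, so every sign pattern is enumerated.
    Practical only for a small number of pairs (2**n patterns).
    """
    diffs = [t - b for b, t in zip(baseline, treatment)]
    observed = abs(sum(diffs))
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            hits += 1
    return hits / total

# Hypothetical steps-to-recover for six textbook sections (lower is better).
chapter_order_steps = [9, 8, 10, 7, 9, 8]
graph_route_steps = [6, 7, 7, 6, 6, 7]

print(sign_flip_p_value(chapter_order_steps, graph_route_steps))
```

With only six sections the smallest possible two-sided p-value is 2/64, which is one more reason to test across many sections rather than a single chapter.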
Project Variations
- Build the same routing system for one OpenStax biology textbook and compare it with a computer science textbook.
- Replace manual prerequisite extraction with embeddings from sentence-transformers and test whether the graph improves.
- Use real student response patterns from EdNet to estimate remediation cost instead of counting text length or page count.
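For the last variation, one hypothetical way to turn response logs into a remediation cost is "seconds spent per correct answer" for each concept. The record layout below is made up for illustration. Real EdNet files would need to be mapped onto this shape, and other cost definitions are equally defensible.

```python
from collections import defaultdict

# Hypothetical interaction records: (concept tag, seconds spent, answered correctly).
LOG = [
    ("ratios", 30, True),
    ("ratios", 50, False),
    ("slope", 40, False),
    ("slope", 60, False),
    ("slope", 20, True),
    ("fractions", 20, True),
]

def estimate_costs(log):
    """Return {concept: total seconds / number of correct answers}.

    A concept with no correct answers gets an infinite cost,
    so a router never treats it as a cheap fix.
    """
    seconds = defaultdict(int)
    correct = defaultdict(int)
    for concept, spent, was_correct in log:
        seconds[concept] += spent
        correct[concept] += int(was_correct)
    return {
        concept: seconds[concept] / correct[concept] if correct[concept] else float("inf")
        for concept in seconds
    }

print(estimate_costs(LOG))
```

Costs estimated this way could replace page counts as the weights in your routing step, which lets you test whether data-driven costs change the recommended paths.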
Learn More
- OpenStax: Free online textbooks to use as source material, available on the OpenStax website.
- EdNet: A student interaction dataset used for knowledge tracing, described in the original dataset paper and related GitHub materials.
- Google Scholar or ERIC: Search for review articles on knowledge tracing, adaptive learning, and educational recommendation systems.
- MIT OpenCourseWare: Search for free course notes on algorithms, graph theory, and machine learning.
- NetworkX Documentation: Free guide for building and analyzing graphs in Python, found on the official NetworkX site.
- Gephi: Free graph visualization software, with tutorials on the official Gephi website.
Systems Software Category Guide
How to Do Real Systems Software Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
