Conformal Ranking Algorithms for Public Data

ISEF Category: Mathematics

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Probability and Statistics · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

Rankings look simple, but a tiny change in the data can flip the order. That makes recommendation systems risky when you need reliable top choices, not just a single guess. This project asks a sharp question: can you rank items and still know how much to trust the list?

What Is It?

Conformal prediction is a way to wrap uncertainty around a machine learning model. Instead of saying, “this item is number one,” it says, “the true answer should land inside this set or this range with a chosen level of confidence.” For classification, that means a set of likely labels. For ranking, it means a set of plausible orderings, or a range of plausible positions for each item, with a mathematical guarantee about how often the true ranking falls inside.

Think of it like a weather forecast for a playlist or a restaurant list. A normal ranking model gives you one ordered list, a single best guess. A conformal ranking method adds a safety margin around that list, so you can see which swaps are shaky and which positions are reliable. The phrase “marginal coverage” means that, averaged over many test cases, the method includes the truth at least as often as promised (for example, 90% of the time), even though any one case can still miss. The guarantee assumes that calibration and test data are exchangeable, which roughly means they come from the same source in no special order.
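
To make the idea concrete, here is a minimal sketch of split conformal calibration in Python. It is an illustration under stated assumptions, not a specific published ranking method: the nonconformity score is assumed to be the absolute error between a true and a predicted rating, the errors are simulated from a normal distribution instead of being taken from Yelp or MovieLens, and the function names conformal_quantile and empirical_coverage are made up for this example.

```python
import math
import numpy as np


def conformal_quantile(cal_scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points, only an infinite threshold is valid.
    return math.inf if k > n else float(scores[k - 1])


def empirical_coverage(alpha, n_cal=200, n_test=200, n_trials=300, seed=0):
    """Average fraction of test errors that fall under the calibrated threshold."""
    rng = np.random.default_rng(seed)
    per_trial = []
    for _ in range(n_trials):
        # ASSUMPTION: simulated absolute prediction errors stand in for real ones.
        cal_errors = np.abs(rng.normal(size=n_cal))
        test_errors = np.abs(rng.normal(size=n_test))
        q = conformal_quantile(cal_errors, alpha)
        per_trial.append(np.mean(test_errors <= q))
    return float(np.mean(per_trial))


if __name__ == "__main__":
    print("average coverage at alpha = 0.1:", empirical_coverage(alpha=0.1))
```

With alpha = 0.1 you would expect the average to sit close to 90 percent, while individual trials land a little above or below it. That spread across trials is what marginal coverage looks like in practice.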

Why This Is a Good Topic

This is a strong science fair topic because you can test a real algorithmic claim with real public data. You do not need a biology wet lab or expensive hardware, and you can measure success with clear statistics like coverage rate, ranking error, and list stability. It also connects to a real problem: recommendation systems often need uncertainty estimates, not just a best guess.

Research Questions

How does the conformal ranking method change coverage as the confidence level changes?
What is the effect of training data size on ranking coverage and list stability?
Does the method keep its coverage on Yelp data and MovieLens data equally well?
To what extent do noisy or sparse ratings reduce the quality of the predicted ranking set?
Which score function gives the best balance between coverage and narrow ranking sets?
How does the conformal ranking approach compare with a standard point prediction ranking model?
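
Several of these questions depend on what a "ranking set" looks like and how to measure its width. One hypothetical way to build one, sketched below, is to put an interval of half-width q around each predicted score and ask which positions each item could occupy. The function name rank_bounds, the shared half-width q, and the example scores are assumptions for illustration, and q is assumed to come from a conformal calibration step you run separately.

```python
import numpy as np


def rank_bounds(pred_scores, q):
    """Best and worst possible rank per item (rank 1 = highest score).

    ASSUMPTION: each true score lies in [pred - q, pred + q].
    """
    pred = np.asarray(pred_scores, dtype=float)
    lo, hi = pred - q, pred + q
    # certainly_above[i, j] is True when item j must beat item i.
    certainly_above = lo[None, :] > hi[:, None]
    # possibly_above[i, j] is True when item j could beat item i.
    possibly_above = hi[None, :] > lo[:, None]
    np.fill_diagonal(possibly_above, False)  # an item never outranks itself
    best = 1 + certainly_above.sum(axis=1)
    worst = 1 + possibly_above.sum(axis=1)
    return best, worst


if __name__ == "__main__":
    best, worst = rank_bounds([4.5, 4.0, 2.0], q=0.3)
    for i, (b, w) in enumerate(zip(best, worst)):
        print(f"item {i}: rank between {b} and {w}, width {w - b}")
```

In this toy case the first two items overlap, so either could be first, while the third is safely last. The average of worst minus best is one possible width measure. Keep in mind that a per-rating coverage guarantee does not automatically carry over to all items at once, so checking how often the true rank actually falls inside these bounds is part of the experiment.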

Basic Materials

Laptop with enough memory to run data analysis software.
Python installed with NumPy, pandas, SciPy, and scikit-learn.
Public Yelp or MovieLens dataset files.
Text editor or notebook environment for code and notes.
Spreadsheet software for tracking experiments and results.
Plotting tool such as Matplotlib or Seaborn.
External storage or cloud drive for versioned data backups.

Advanced Materials

University workstation or high-memory laptop for repeated resampling experiments.
Python with specialized conformal prediction code you write or adapt from open-source examples.
Public ranking benchmark datasets such as Yelp and MovieLens, stored in a clean research folder.
Statistical testing tools for calibration checks and uncertainty comparison.
Version control software such as Git for tracking code changes.
Optional GPU access if you test larger model families or repeated simulations.

Software & Tools

Python: Runs the ranking model, calibration code, and statistical tests.
Jupyter Notebook: Lets you document each experiment and keep code, plots, and notes together.
pandas: Organizes ratings, rankings, and evaluation tables.
NumPy: Handles fast numerical calculations and resampling loops.
Matplotlib: Makes coverage, error, and comparison plots.

Experiment Steps

Define the ranking task you will test, such as predicting top items from user preference data.
Choose one conformal ranking method and one standard ranking baseline so you can compare them fairly.
Decide which outcome measures matter most, such as coverage, average list size, and rank error.
Build a calibration plan that separates training, calibration, and test data without leakage.
Plan sensitivity tests for data size, noise level, or sparsity so you can see when the method breaks down.
Prepare a statistical analysis plan that checks whether the promised coverage holds across many trials.
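
The calibration and repeated-trial steps above can be sketched as follows. This is a rough outline under stated assumptions, not a finished pipeline: the ratings are synthetic, the column names item and rating are placeholders for whatever your dataset uses, the predictor is a simple per-item mean, and the helper names three_way_split, make_synthetic_ratings, and one_trial are invented for this example.

```python
import numpy as np
import pandas as pd


def three_way_split(df, seed, frac_train=0.6, frac_cal=0.2):
    """Shuffle rows once and cut them into disjoint train, calibration, and test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(df))
    n_train = int(frac_train * len(df))
    n_cal = int(frac_cal * len(df))
    return (df.iloc[idx[:n_train]],
            df.iloc[idx[n_train:n_train + n_cal]],
            df.iloc[idx[n_train + n_cal:]])


def make_synthetic_ratings(n_items=50, n_ratings=5000, seed=0):
    """ASSUMPTION: toy data, each rating is a hidden item quality plus noise."""
    rng = np.random.default_rng(seed)
    quality = rng.normal(3.5, 0.7, size=n_items)
    item = rng.integers(0, n_items, size=n_ratings)
    rating = quality[item] + rng.normal(0.0, 1.0, size=n_ratings)
    return pd.DataFrame({"item": item, "rating": rating})


def one_trial(df, alpha, seed):
    """Fit on train, calibrate on calibration data, report coverage on test data."""
    train, cal, test = three_way_split(df, seed)
    item_mean = train.groupby("item")["rating"].mean()
    global_mean = train["rating"].mean()

    def predict(part):
        # Items unseen in training fall back to the global mean.
        return part["item"].map(item_mean).fillna(global_mean)

    cal_scores = np.sort(np.abs(cal["rating"] - predict(cal)).to_numpy())
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.inf if k > n else cal_scores[k - 1]
    covered = np.abs(test["rating"] - predict(test)) <= q
    return float(covered.mean())


df = make_synthetic_ratings()
coverages = [one_trial(df, alpha=0.1, seed=s) for s in range(50)]
print("mean coverage:", np.mean(coverages), "std across splits:", np.std(coverages))
```

The test rows never touch the model fit or the threshold, which is the point of the three-way split. Reporting both the mean and the spread across seeds gives you a first check on whether the promised coverage holds over many trials.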

Common Pitfalls

Mixing calibration and test data, which makes the coverage result look better than it really is.
Using only one random split, which can make the ranking method seem stable when it is not.
Measuring only top-one accuracy, which misses the point of uncertainty-aware ranking.
Comparing models on different train-test splits, which makes the baseline comparison unfair.
Ignoring sparse rating patterns, which can distort rankings on Yelp and MovieLens data.
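
To see why top-one accuracy alone can mislead, here is a small sketch that reports it next to two whole-list measures. The scores are invented for illustration, and the helper names top_k_overlap and ranking_report are assumptions. Kendall's tau comes from SciPy's kendalltau function.

```python
import numpy as np
from scipy.stats import kendalltau


def top_k_overlap(true_scores, pred_scores, k):
    """Fraction of the true top-k items that also appear in the predicted top-k."""
    true_top = set(np.argsort(true_scores)[::-1][:k].tolist())
    pred_top = set(np.argsort(pred_scores)[::-1][:k].tolist())
    return len(true_top & pred_top) / k


def ranking_report(true_scores, pred_scores, k=3):
    """Top-one match plus two measures that look at the whole list."""
    tau, _ = kendalltau(true_scores, pred_scores)
    return {
        "top1_match": int(np.argmax(true_scores) == np.argmax(pred_scores)),
        "kendall_tau": float(tau),
        "top_k_overlap": top_k_overlap(true_scores, pred_scores, k),
    }


if __name__ == "__main__":
    # ASSUMPTION: made-up scores where only the best item is ranked correctly.
    true_scores = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
    pred_scores = np.array([5.0, 1.0, 2.0, 3.0, 4.0])
    print(ranking_report(true_scores, pred_scores))
```

Here the predicted list gets the top item right, yet the rest of the order is reversed, so Kendall's tau is negative and only one of the true top three survives. A project that tracked only top-one accuracy would call this ranking perfect.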

What Makes This Competitive

A competitive version of this project goes beyond checking whether the code runs. You would test several ranking settings, compare multiple uncertainty scores, and report whether the coverage guarantee survives harder data conditions. Strong projects also use repeated trials, clear calibration logic, and careful statistics, not just one lucky split. A novel comparison, such as how the method behaves on sparse versus dense rating data, can make the work stand out.

Project Variations

Test the method on a different public ranking dataset, such as movie, product, or restaurant preference data.
Compare conformal ranking against pairwise ranking models instead of only listwise predictors.
Study how the method changes when ratings are binarized, scaled, or filtered for sparse users.
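
For the last variation, the preprocessing could look roughly like this in pandas. The column names user and rating, the thresholds, and the helper names filter_sparse_users and binarize are placeholders, so adjust them to match the files you actually download.

```python
import pandas as pd


def filter_sparse_users(ratings, min_ratings):
    """Keep only rows from users who have at least min_ratings ratings."""
    counts = ratings.groupby("user")["rating"].transform("count")
    return ratings[counts >= min_ratings]


def binarize(ratings, threshold):
    """Add a 0/1 'liked' column: 1 when the rating is at or above the threshold."""
    out = ratings.copy()
    out["liked"] = (out["rating"] >= threshold).astype(int)
    return out


if __name__ == "__main__":
    # ASSUMPTION: a tiny hand-made table, not real Yelp or MovieLens data.
    toy = pd.DataFrame({
        "user": ["a", "a", "a", "b", "c", "c"],
        "rating": [5, 3, 4, 2, 4, 1],
    })
    dense = filter_sparse_users(toy, min_ratings=2)
    print(binarize(dense, threshold=4))
```

Running the same calibration and coverage checks before and after each transformation lets you compare how the ranking sets change, using the same splits for both versions.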

Learn More

MIT OpenCourseWare: Search for probability, statistics, and machine learning course notes that explain calibration, resampling, and model evaluation.
PubMed: Search for review articles on uncertainty quantification and predictive intervals in machine learning applications.
arXiv: Search for recent preprints on conformal prediction and conformal ranking methods.
NASA CMR or NOAA data portals: Browse examples of how large public datasets support reproducible analysis and uncertainty reporting.
The Elements of Statistical Learning: A widely used textbook with chapters on prediction, model assessment, and regularization, often accessible through libraries or previews.

Mathematics Category Guide

How to Do Real Mathematics Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
