Predicting Glass Transition Temperature With ML


ISEF Category: Materials Science

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Ceramic and Glasses  ·  Difficulty: Advanced  ·  Setup: University Lab  ·  Time: Full Year

The Hook

Glass can behave like a rigid solid or a slow-flowing liquid, depending on its temperature. The switch happens around the glass-transition temperature, or Tg, the narrow temperature range where a rigid glass softens into a viscous state. If you can predict Tg from composition alone, you can help design better screens, seals, and optical materials faster. Machine learning can make that search far less trial-and-error.

What Is It?

This project uses machine learning to predict the glass-transition temperature of ternary oxide systems. A ternary oxide system means a glass made from three oxide components, such as silica (SiO2), boron oxide (B2O3), and alumina (Al2O3). The goal is to teach a model, like a random forest, how composition links to Tg.

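Think of it like teaching a friend to guess a recipe's baking behavior from its ingredients. The model looks for patterns in data from the Materials Project and published glass studies. Keep in mind that the Materials Project mainly holds computed properties of crystalline compounds, so it is most useful as a source of descriptors for the component oxides, while the measured Tg values come from the published studies. A random forest trains many decision trees, each on a random sample of the data, then averages their predictions. That helps the model handle messy real-world data better than a single formula would.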
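To make the averaging idea concrete, here is a minimal sketch using scikit-learn. The data is synthetic: the mole fractions are random and the "Tg" formula is invented purely for illustration, so the numbers carry no chemical meaning. The sketch only shows that the forest's prediction is the mean of its individual trees' predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic, made-up data: mole fractions of three oxides (each row sums to 1).
rng = np.random.default_rng(0)
X = rng.dirichlet([2.0, 2.0, 2.0], size=200)

# Invented target plus noise, for illustration only (not real chemistry).
y = 900 * X[:, 0] + 500 * X[:, 1] + 1100 * X[:, 2] + rng.normal(0, 10, 200)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# One hypothetical composition to predict.
new_glass = np.array([[0.6, 0.3, 0.1]])

# Each fitted tree makes its own guess...
per_tree = np.array([tree.predict(new_glass)[0] for tree in forest.estimators_])

# ...and the forest reports the average of those guesses.
forest_pred = forest.predict(new_glass)[0]

print(f"tree guesses range from {per_tree.min():.1f} to {per_tree.max():.1f}")
print(f"mean of tree guesses: {per_tree.mean():.1f}")
print(f"forest prediction:    {forest_pred:.1f}")
```

The spread of the individual tree guesses is also a rough, informal signal of how unsure the model is about a given composition.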

You then test the model on literature compositions it did not see during training. If the predicted Tg lines up with published values, your model has learned something useful. If it misses, you can study which chemistry features or data gaps caused the error.
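One way to put numbers on "lines up with published values" is to compute a few standard error metrics on the held-out compositions. The sketch below assumes you already have reported and predicted Tg values side by side; the four values in each array are placeholders, not real measurements.

```python
import numpy as np

# Placeholder values in kelvin; replace with your own held-out data.
reported_tg = np.array([750.0, 800.0, 820.0, 900.0])   # from the literature
predicted_tg = np.array([760.0, 790.0, 850.0, 880.0])  # from the model

errors = predicted_tg - reported_tg

# Mean absolute error: the typical miss, in kelvin.
mae = np.mean(np.abs(errors))

# Root-mean-square error: penalizes large misses more heavily.
rmse = np.sqrt(np.mean(errors ** 2))

# R^2: fraction of the variance in reported Tg that the model explains.
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((reported_tg - reported_tg.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# Index of the composition with the largest miss, a starting point for error analysis.
worst = int(np.argmax(np.abs(errors)))

print(f"MAE = {mae:.1f} K, RMSE = {rmse:.1f} K, R^2 = {r2:.3f}")
print(f"largest miss: sample {worst}, error {errors[worst]:+.1f} K")
```

Reporting MAE in kelvin alongside R^2 tends to be easier for judges to interpret than R^2 alone.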

Why This Is a Good Topic

This is a strong science fair topic because you can test a clear question with real data and measurable output. You are not guessing whether your model worked, because Tg gives you a numeric target. The project connects to glass design, coatings, electronics, and manufacturing, so the real-world value is easy to explain. You can also learn data cleaning, feature selection, model evaluation, and error analysis, which are valuable research skills.

Research Questions

  • How does the choice of input features change the accuracy of Tg prediction for ternary oxide glasses?
  • What is the effect of adding literature-only compositions to the training set on model performance?
  • Does a random forest outperform linear regression for predicting glass-transition temperature in ternary oxide systems?
  • To what extent do oxide family labels improve prediction compared with composition alone?
  • Which chemical features, such as ionic radius or oxygen fraction, most strongly influence predicted Tg?
  • How does the model error change across different ranges of glass-transition temperature?
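As an illustration of how the random forest versus linear regression question could be tested, the sketch below runs both models through the same 5-fold cross-validation. The dataset is synthetic and the target formula is invented, so the printed scores say nothing about real glasses; the fold count and forest size are arbitrary choices, not recommendations.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic mole fractions of three oxides (each row sums to 1).
rng = np.random.default_rng(42)
X = rng.dirichlet([2.0, 2.0, 2.0], size=300)

# Invented target with an interaction term and noise (illustration only).
y = (800 + 300 * X[:, 0] - 200 * X[:, 1]
     + 600 * X[:, 0] * X[:, 2] + rng.normal(0, 15, 300))

# Use the same folds for both models so the comparison is fair.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "linear regression": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

mae = {}
for name, model in models.items():
    # scikit-learn returns negative MAE so that "higher is better"; flip the sign.
    scores = cross_val_score(model, X, y, cv=cv,
                             scoring="neg_mean_absolute_error")
    mae[name] = -scores.mean()
    print(f"{name}: cross-validated MAE = {mae[name]:.1f}")
```

Swapping in a different feature table for X while keeping the folds fixed is one way to approach the feature-choice question with the same harness.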

Basic Materials

  • Laptop or desktop computer with enough memory for Python data analysis.
  • Python installed with pandas, scikit-learn, numpy, matplotlib, and seaborn.
  • Access to Materials Project data through its public website or API.
  • Spreadsheet software for tracking compositions and literature values.
  • PubMed, Google Scholar, or journal access through your school library for collecting validation data.
  • Digital notebook for recording data cleaning decisions and model settings.

Advanced Materials

  • Workstation with higher RAM for larger feature sets and cross-validation runs.
  • Python with scikit-learn, xgboost, shap, and scipy.
  • Materials Project API key (free with an account) for automated data pulls.
  • Reference management software for tracking source papers.
  • Statistical software or notebooks for confidence intervals and residual analysis.
  • University library access for full-text glass property papers and supplemental data tables.

Software & Tools

  • Python: Builds the dataset, trains the model, and evaluates prediction error.
  • Jupyter Notebook: Keeps code, notes, plots, and results in one place.
  • scikit-learn: Trains the random forest and compares it with baseline models.
  • pandas: Cleans composition tables and merges data from multiple sources.
  • matplotlib: Plots predicted versus actual Tg values and error patterns.

Experiment Steps

  1. Define the exact ternary oxide family you will study and decide which compositions count as valid samples.
  2. Gather a clean training set of measured Tg values from published glass data, add any oxide descriptors you need from the Materials Project, then decide how you will handle missing values and duplicate compositions.
  3. Choose your input features and make a baseline model so you have something simple to beat.
  4. Plan a cross-validation scheme that tests generalization, not memorization, then train the random forest.
  5. Build a validation set from literature compositions the model has not seen, then compare predicted and reported Tg values.
  6. Examine errors by composition class so you can explain when and why the model works or fails.
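Steps 1 and 2 above can be sketched with pandas. Everything in the table below is hypothetical: the column names, the source labels, and the Tg values are placeholders. The rules used here (mole fractions must sum to 1, rows with missing Tg are dropped, duplicate compositions are averaged) are one reasonable set of choices, not the only one, and whatever you choose should be recorded in your notebook.

```python
import pandas as pd

# Hypothetical records; column names and all values are made up.
raw = pd.DataFrame({
    "SiO2":   [0.60, 0.60, 0.70, 0.55, 0.50],
    "B2O3":   [0.30, 0.30, 0.20, 0.35, 0.30],
    "Al2O3":  [0.10, 0.10, 0.10, 0.10, 0.10],
    "Tg_K":   [850.0, 860.0, 900.0, None, 870.0],
    "source": ["paper_A", "paper_B", "paper_A", "paper_C", "paper_C"],
})
oxides = ["SiO2", "B2O3", "Al2O3"]

# Step 1: a valid sample is one whose three mole fractions sum to 1.
is_ternary = (raw[oxides].sum(axis=1) - 1.0).abs() < 1e-6
valid = raw[is_ternary].copy()

# Step 2a: drop rows with no reported Tg instead of inventing a value.
valid = valid.dropna(subset=["Tg_K"])

# Step 2b: round compositions so tiny formatting differences do not hide duplicates.
valid[oxides] = valid[oxides].round(4)

# Step 2c: merge duplicate compositions by averaging Tg and counting reports.
clean = valid.groupby(oxides, as_index=False).agg(
    Tg_K=("Tg_K", "mean"),
    n_reports=("Tg_K", "count"),
)

print(clean)
```

Keeping the n_reports column lets you check later whether compositions reported by several papers are predicted more reliably than single-report ones.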

Common Pitfalls

  • Mixing Tg values measured by different methods, such as calorimetry and dilatometry, without recording the method, which adds scatter that has nothing to do with composition.
  • Training and testing on near-duplicate compositions, which inflates accuracy and hides weak generalization.
  • Treating missing composition features as zeros, which can create fake chemical patterns.
  • Using too many rare oxide combinations, which leaves the model with tiny sample sizes and unstable predictions.
  • Comparing predictions to literature values without checking whether the papers used the same heating rate or definition of Tg.
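The near-duplicate pitfall above can be guarded against with grouped cross-validation. The sketch below uses scikit-learn's GroupKFold so that compositions treated as "the same glass" never land on both sides of a split. The compositions and Tg values are made up, and the rule that two compositions match when they agree after rounding to two decimals is an assumption you would need to justify for your own dataset.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Made-up mole fractions; rows 0-1 and rows 2-3 are near-duplicates.
X = np.array([
    [0.600, 0.300, 0.100],
    [0.601, 0.299, 0.100],
    [0.700, 0.200, 0.100],
    [0.699, 0.201, 0.100],
    [0.500, 0.400, 0.100],
    [0.400, 0.400, 0.200],
])
y = np.array([850.0, 852.0, 900.0, 898.0, 820.0, 880.0])  # fake Tg values

# Assumption: compositions equal after rounding to 2 decimals share a group.
group_ids = {}
groups = np.array([
    group_ids.setdefault(tuple(row), len(group_ids))
    for row in np.round(X, 2)
])

# GroupKFold keeps every group entirely in train or entirely in test.
leaks = []
for train_idx, test_idx in GroupKFold(n_splits=2).split(X, y, groups):
    shared = set(groups[train_idx]) & set(groups[test_idx])
    leaks.append(shared)
    print(f"test rows {test_idx.tolist()}, groups shared with train: {len(shared)}")
```

Comparing grouped and ungrouped cross-validation scores on the same data is a simple way to show how much near-duplicates were inflating accuracy.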

What Makes This Competitive

A competitive version goes beyond a simple model fit. You compare several feature sets, test multiple validation schemes, and report where the model breaks down. You also explain the chemistry behind the errors, not just the score. Strong entries often add a novelty angle, like testing whether one oxide family is easier to predict than another or whether uncertainty estimates improve trust in the model.

Project Variations

  • Use binary oxide glasses instead of ternary systems to see whether simpler chemistry improves prediction accuracy.
  • Compare random forest with gradient boosting or support vector regression to test whether another model handles Tg better.
  • Add structural descriptors from literature, such as network former fraction or modifier content, to see whether chemistry-aware features improve performance.
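The chemistry-aware features in the last variation can be computed directly from composition. The sketch below is a minimal version: the function name role_fractions and the two lookup sets are hypothetical, and the former/modifier assignments are a simplified textbook classification. Oxides such as Al2O3 are often treated as intermediates, so they fall into the "other" bucket here.

```python
# Simplified role assignments (an assumption; real glasses can blur these roles).
NETWORK_FORMERS = {"SiO2", "B2O3", "P2O5", "GeO2"}
MODIFIERS = {"Na2O", "K2O", "Li2O", "CaO", "MgO"}


def role_fractions(composition):
    """Return former, modifier, and other fractions for a composition.

    `composition` maps oxide formulas to amounts in any consistent unit
    (for example mol%); amounts are normalized by their total.
    """
    total = sum(composition.values())
    former = sum(v for k, v in composition.items() if k in NETWORK_FORMERS) / total
    modifier = sum(v for k, v in composition.items() if k in MODIFIERS) / total
    return {
        "former_fraction": former,
        "modifier_fraction": modifier,
        "other_fraction": 1.0 - former - modifier,
    }


# Hypothetical glass given in mol%.
features = role_fractions({"SiO2": 70, "Na2O": 20, "Al2O3": 10})
print(features)
```

These fractions can be appended as extra columns to the composition table and tested against a composition-only feature set.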

Learn More

  • Materials Project: Search the public database for oxide compositions, structures, and property data relevant to glass prediction.
  • scikit-learn User Guide: Read the model selection, regression, and random forest sections for practical Python methods.
  • NIST Chemistry WebBook: Look up basic property context and compound information for oxide-related chemistry.
  • PubMed: Search for review articles on glass transition temperature, oxide glasses, and data-driven materials discovery.
  • MIT OpenCourseWare Materials Science courses: Review free lecture notes and problem sets on structure, properties, and phase behavior.

