Predicting Bulk Modulus From Crystal Structure
ISEF Category: Materials Science
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Computation and Theory · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months
The Hook
A crystal can look simple on the outside and still behave very differently from its neighbors under pressure. How strongly a material resists being compressed is captured by its bulk modulus. You can train a model to predict it from structure alone, then ask which parts of the crystal matter most. That turns your project from a prediction task into a source of real materials insight.
What Is It?
Bulk modulus measures how much a material resists being squeezed from all sides. Think of it as a stiffness score for compression: the higher the bulk modulus, the more pressure it takes to shrink the material's volume by a given fraction. A lower value means the structure gives way more easily.
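For reference, the usual textbook definition can be written compactly. This is the standard thermodynamic form, not something specific to this guide; the symbols K (bulk modulus), V (volume), P (pressure), and T (temperature) are the conventional ones.

```latex
K = -V \left( \frac{\partial P}{\partial V} \right)_T
\qquad\Longrightarrow\qquad
\frac{\Delta V}{V} \approx -\frac{\Delta P}{K} \quad \text{(small compressions)}
```

As a rough illustration with made-up round numbers: a material with K = 100 GPa squeezed by an extra 1 GPa of pressure would shrink in volume by about 1%. Bulk modulus has units of pressure and is often reported in GPa, so check the units in whatever table you download.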
Graph neural networks, or GNNs, are machine learning models that work well on connected things. A crystal fits that idea because atoms connect through bonds and local neighborhoods. Instead of treating the material as a simple table of numbers, a GNN treats the crystal like a network. That helps the model learn patterns from how atoms sit next to each other.
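To make the "crystal as a network" idea concrete, here is a minimal sketch that turns a toy cubic cell into a list of graph edges using a distance cutoff. The function name `build_crystal_graph`, the cell size, and the cutoff are all illustrative assumptions; a library such as pymatgen would handle general lattices and neighbor searches in a real project.

```python
import itertools
import math

def build_crystal_graph(lattice_a, frac_coords, cutoff):
    """Return edges (i, j, distance) for atoms in a cubic cell of edge lattice_a.

    Periodic images are included by shifting atom j by -1, 0, or +1 cells
    along each axis, so this toy version assumes cutoff <= lattice_a.
    """
    edges = []
    shifts = list(itertools.product((-1, 0, 1), repeat=3))
    for i, fi in enumerate(frac_coords):
        for j, fj in enumerate(frac_coords):
            for shift in shifts:
                if i == j and shift == (0, 0, 0):
                    continue  # skip an atom paired with itself
                delta = [(fj[k] + shift[k] - fi[k]) * lattice_a for k in range(3)]
                dist = math.sqrt(sum(d * d for d in delta))
                if dist <= cutoff:
                    edges.append((i, j, round(dist, 4)))
    return edges

# Hypothetical two-atom cell: one atom at the corner, one at the body center.
a = 4.0
coords = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
edges = build_crystal_graph(a, coords, cutoff=3.6)

neighbors_of_0 = [e for e in edges if e[0] == 0]
print(len(neighbors_of_0))  # prints 8: the corner atom sees 8 body-center images
```

Changing `cutoff` changes which atoms count as neighbors, which is one simple way to produce the different crystal graph definitions mentioned in the research questions.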
Interpretability asks a second question. After the model makes a prediction, you check which atoms, bonds, or neighborhoods influenced the result most. Attention weights are one way to do that. They act like a spotlight, showing which parts of the crystal the model weighted most heavily when it predicted the bulk modulus.
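The spotlight idea can be sketched in a few lines. In the simplified version below, each neighbor of an atom gets a raw score, a softmax turns the scores into weights that sum to 1, and the neighbor features are averaged using those weights. The scores here are made up by hand; in a real attention-based GNN they would be computed from learned parameters, and the helper names are hypothetical.

```python
import math

def attention_weights(scores):
    """Softmax: turn raw neighbor scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(neighbor_features, scores):
    """Weighted average of neighbor feature vectors using attention weights."""
    weights = attention_weights(scores)
    dim = len(neighbor_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, neighbor_features))
        for d in range(dim)
    ]
    return pooled, weights

# Three hypothetical neighbors with 2-dimensional features and hand-picked scores.
neighbor_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.5, 0.5]

pooled, weights = aggregate(neighbor_features, scores)
for idx, w in enumerate(weights):
    print(f"neighbor {idx}: weight {w:.3f}")
```

The neighbor with the highest score receives the largest weight. Inspecting weights like these is the raw material for an attention-based explanation, though a large weight alone does not prove that a neighbor causes the predicted property.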
Why This Is a Good Topic
This is a strong science fair topic because you can test a clear numeric target, compare model choices, and measure whether structure really predicts a material property. The Materials Project dataset gives you real research data, so your work connects to how materials scientists screen new compounds. You can learn data cleaning, graph-based modeling, validation, and basic interpretability without needing a physical lab.
Research Questions
- How does a graph neural network’s bulk modulus prediction accuracy compare with a random forest trained on composition-only features?
- What is the effect of different crystal graph definitions on bulk modulus prediction error?
- Does adding attention-based interpretability change which structural features best explain high bulk modulus materials?
- To what extent do model errors vary across crystal families with similar chemistry but different bonding patterns?
- Which structure descriptors most strongly correlate with residual error after GNN prediction?
- How does the size of the training set affect bulk modulus prediction performance on held-out materials?
Basic Materials
- Laptop with a modern CPU and at least 8 GB RAM.
- Stable internet connection for downloading dataset files and documentation.
- Python installed with Jupyter Notebook.
- Access to the Materials Project dataset through its public website or API.
- Spreadsheet software for tracking samples, features, and model runs.
- Text editor or notebook environment for coding and notes.
Advanced Materials
- Laptop or workstation with a dedicated GPU.
- Python environment with PyTorch and a graph neural network library such as PyTorch Geometric or DGL.
- Materials Project structure files and property table exports.
- Crystal graph featurization tools such as pymatgen.
- Version control software such as Git for tracking experiments.
- Plotting and analysis tools for model explainability and error inspection.
Software & Tools
- Python: Runs data cleaning, feature engineering, model training, and evaluation scripts.
- Jupyter Notebook: Helps you test ideas, inspect outputs, and keep a clear research log.
- pandas: Organizes the Materials Project table and helps you filter, merge, and audit records.
- pymatgen: Reads crystal structures and converts them into forms a model can use.
- PyTorch Geometric: Builds and trains graph neural networks on crystal graphs.
- Matplotlib: Plots accuracy, error trends, and attention-based comparisons.
Experiment Steps
- Define the prediction target and decide which materials you will include or exclude from the dataset.
- Build one clear crystal graph representation and decide how atoms, bonds, and neighbors become model inputs.
- Split the data in a way that prevents leakage and tests whether the model generalizes to unseen materials.
- Train a baseline model first, then compare it against the graph neural network.
- Plan an interpretability method that lets you compare attention weights with known chemical or structural features.
- Evaluate error patterns by material class, composition range, or structural family, then decide what patterns matter most.
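The comparison and error-analysis steps above can be sketched as follows. This example uses mean absolute error (MAE) and, for brevity, swaps in the simplest possible baseline (always predicting the training-set mean) rather than the random forest mentioned earlier. All numbers and family labels are made up for illustration; they are not real Materials Project values.

```python
from collections import defaultdict

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mae_by_group(y_true, y_pred, groups):
    """MAE computed separately for each group label (e.g. crystal family)."""
    buckets = defaultdict(list)
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g].append(abs(t - p))
    return {g: sum(errs) / len(errs) for g, errs in buckets.items()}

# Made-up bulk modulus values in GPa, for illustration only.
train_targets = [100.0, 150.0, 200.0, 50.0]
test_targets = [120.0, 180.0, 60.0, 40.0]
model_preds = [110.0, 170.0, 90.0, 70.0]   # pretend GNN output
test_families = ["oxide", "oxide", "halide", "halide"]

# Trivial baseline: predict the training mean for every test material.
baseline_value = sum(train_targets) / len(train_targets)
baseline_preds = [baseline_value] * len(test_targets)

model_mae = mean_absolute_error(test_targets, model_preds)
baseline_mae = mean_absolute_error(test_targets, baseline_preds)
family_mae = mae_by_group(test_targets, model_preds, test_families)

print(f"model MAE: {model_mae} GPa, baseline MAE: {baseline_mae} GPa")
print(family_mae)
```

In this toy data the model beats the baseline overall but does three times worse on one family than the other, which is the kind of pattern a single accuracy score would hide.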
Common Pitfalls
- Using a random train-test split on nearly duplicate structures, which can make the model look better than it really is.
- Mixing different target definitions or units for bulk modulus, which breaks comparisons across samples.
- Trusting attention weights as direct proof of causation, which can lead you to overclaim what the model learned.
- Splitting rare crystal families across both the training set and the test set, which hides how poorly the model may generalize to families it has not seen.
- Skipping a simple baseline model, which makes it hard to tell whether the GNN adds real value.
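One way to avoid the duplicate-structure and shared-family pitfalls above is to split by group instead of by individual record, so that every record sharing a label lands entirely in training or entirely in testing. The sketch below assumes you have already assigned a grouping label to each record; the field names `group` and `k_gpa`, the labels, and the values are all hypothetical. scikit-learn's `GroupShuffleSplit` offers a similar ready-made tool if you prefer a library.

```python
import random
from collections import defaultdict

def group_split(records, group_key, test_fraction=0.25, seed=0):
    """Split records so that no group appears in both train and test."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    names = sorted(groups)                 # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(names)
    n_test = max(1, round(len(names) * test_fraction))
    test_names = set(names[:n_test])

    train_set = [r for n in names if n not in test_names for r in groups[n]]
    test_set = [r for n in names if n in test_names for r in groups[n]]
    return train_set, test_set

# Placeholder records: near-duplicate entries share the same group label.
records = [
    {"id": "m1", "group": "family_A", "k_gpa": 150.0},
    {"id": "m2", "group": "family_A", "k_gpa": 152.0},
    {"id": "m3", "group": "family_B", "k_gpa": 80.0},
    {"id": "m4", "group": "family_B", "k_gpa": 78.0},
    {"id": "m5", "group": "family_C", "k_gpa": 210.0},
    {"id": "m6", "group": "family_D", "k_gpa": 35.0},
]

train_set, test_set = group_split(records, "group")
train_groups = {r["group"] for r in train_set}
test_groups = {r["group"] for r in test_set}
print("train groups:", sorted(train_groups))
print("test groups:", sorted(test_groups))
```

Running the same model with a plain random split and with a group split, then comparing the two test errors, gives a direct measure of how much leakage was inflating the score.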
What Makes This Competitive
A strong version of this project does more than report one accuracy score. You compare several split strategies, test at least one baseline, and analyze where the model fails. You also check whether attention highlights chemically meaningful neighbors, not just random graph edges. If you connect those patterns to known materials trends, your project starts to look like real computational materials research.
Project Variations
- Use formation energy or band gap instead of bulk modulus to see whether the same graph model works across different materials properties.
- Restrict the dataset to one crystal family, such as oxides or perovskites, and test whether the model learns family-specific structure-property rules.
- Compare attention-based explanation with a simpler feature-importance method to see whether both methods point to the same structural cues.
Learn More
- Materials Project: Search the public materials database and documentation to learn how computed materials properties are organized and accessed.
- MIT OpenCourseWare, Introduction to Machine Learning: Use the course materials to review model training, validation, and overfitting.
- pymatgen documentation: Learn how to load crystal structures and prepare them for graph-based modeling.
- Nature Communications: Search for review articles and research papers on graph neural networks for materials property prediction.
- PubMed: Search for papers on interpretable machine learning in materials science, especially work on feature attribution and attention.
Materials Science Category Guide
How to Do Real Materials Science Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
