Fish Stock Collapse Prediction with LSTMs | Science Fair
ISEF Category: Animal Sciences
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Ecology and Agriculture · Difficulty: Advanced · Setup: Home Setup · Time: 1 to 2 Months
The Hook
Fish stocks can slide toward collapse long before a fishery looks empty. That makes early warning models useful for scientists, managers, and local communities. NOAA publishes stock-assessment data that you can turn into a prediction problem. By building a model on that data, you can test whether past patterns reveal future risk.
What Is It?
This project asks whether a machine learning model can spot warning signs in fish-stock data before a stock crashes. You feed the model a sequence of past assessments, catch levels, or biomass estimates, and it learns how those numbers usually move before trouble starts. Think of it like reading a comic strip frame by frame and guessing what happens in the next panel.
An LSTM, short for long short-term memory, is a type of recurrent neural network built for sequences. It works well when order matters, as with yearly stock assessments. Instead of treating each year as an isolated row, it carries a memory of earlier trends, dips, and rebounds from one time step to the next, then uses that history to estimate collapse risk.
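To make the "memory" idea concrete, here is a minimal sketch of the update inside a single-unit LSTM cell, written in plain Python. The weights are hand-picked for illustration, not trained, and the function names (`lstm_step`, `run_sequence`) are hypothetical. In a real project a library such as PyTorch would handle this for you with many units and learned weights.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell (toy, untrained weights)."""
    pre = w["w"] * x + w["u"] * h_prev + w["b"]  # same weights for every gate, for brevity
    forget_gate = sigmoid(pre)   # how much of the old memory to keep
    input_gate = sigmoid(pre)    # how much of the new candidate to store
    output_gate = sigmoid(pre)   # how much of the memory to expose
    candidate = math.tanh(pre)   # new information proposed for the memory
    c = forget_gate * c_prev + input_gate * candidate  # updated cell state (the memory)
    h = output_gate * math.tanh(c)                     # updated hidden state (the output)
    return h, c

def run_sequence(xs, w):
    """Feed a sequence through the cell one step at a time and return the final output."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w)
    return h

# Assumed toy weights and made-up biomass values; neither comes from NOAA data.
weights = {"w": 1.0, "u": 0.5, "b": 0.0}
declining = [1.0, 0.5, 0.2]
rising = [0.2, 0.5, 1.0]

out_declining = run_sequence(declining, weights)
out_rising = run_sequence(rising, weights)
print(out_declining, out_rising)
```

Both sequences contain the same three numbers, yet the final outputs differ because the cell state depends on the order in which the values arrive. That order sensitivity is what a model looking at one row at a time lacks.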
Why This Is a Good Topic
This is a strong science fair topic because it turns a real public dataset into a clear prediction question. You can test a model, compare it with simpler baselines, and measure whether it gives an early warning or just reacts late. The project also connects to food supply, fisheries management, and ecosystem health, so your results have real stakes. You can learn data cleaning, time-series modeling, and model evaluation without needing a wet lab.
Research Questions
- How does the length of the lookback window affect collapse-risk prediction accuracy?
- What is the effect of adding environmental variables to catch-history-only models?
- Does an LSTM outperform logistic regression on NOAA stock-collapse labels?
- To what extent do class-balancing methods change false-negative rates?
- Which feature set gives the best early-warning signal for stocks with sparse data?
- How does temporal cross-validation change model performance compared with random splitting?
Basic Materials
- Laptop with 8 GB RAM or more.
- Internet access for downloading NOAA datasets.
- Python 3.11 installed.
- Jupyter Notebook or VS Code.
- Spreadsheet software or a CSV viewer.
- NOAA fisheries stock-assessment data files.
Advanced Materials
- GPU-enabled workstation or university cluster account.
- University library access to fisheries assessment reports.
- Large-memory storage for merged stock tables.
- Python environment with PyTorch, pandas, and scikit-learn.
- Access to published NOAA stock assessment archives and supporting metadata.
Software & Tools
- Python: Runs the preprocessing, modeling, and evaluation code.
- Jupyter Notebook: Keeps code, notes, and plots in one place.
- pandas: Cleans and merges NOAA tables.
- PyTorch: Trains the LSTM model.
- scikit-learn: Trains and scores baseline models and handles cross-validation.
Experiment Steps
- Define the collapse label and decide exactly which future year your model will try to predict.
- Choose the input window and feature set, then decide whether you will use catch history, assessment variables, or both.
- Build a simple baseline first so you can compare the LSTM against something easier to explain.
- Split the data by time, not at random, so future information does not leak into training.
- Set up metrics that punish false reassurance (false negatives), such as recall on collapse cases, then check calibration, not just accuracy.
- Plan one test that checks whether the model gives an early warning before a stock crosses a danger threshold.
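The first few steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the biomass values are made up (not NOAA data), the collapse cutoff of 0.20 is a placeholder you would need to choose and justify yourself, and the helper names (`make_windows`, `split_by_time`) are hypothetical.

```python
# Toy series for one stock: (year, biomass relative to a reference level).
# All values are invented for illustration.
toy_stock = [
    (2000, 1.00), (2001, 0.95), (2002, 0.90), (2003, 0.80), (2004, 0.65),
    (2005, 0.50), (2006, 0.35), (2007, 0.25), (2008, 0.18), (2009, 0.15),
]

COLLAPSE_THRESHOLD = 0.20  # assumed cutoff, not an official definition
LOOKBACK = 3               # years of history fed to the model
HORIZON = 2                # how many years ahead the label looks

def make_windows(series, lookback, horizon, threshold):
    """Build (label_year, features, label) examples from one stock's series.

    features: the `lookback` most recent biomass values.
    label: 1 if biomass `horizon` years after the window ends is below `threshold`.
    """
    examples = []
    for end in range(lookback - 1, len(series) - horizon):
        window = [biomass for _, biomass in series[end - lookback + 1 : end + 1]]
        label_year, future_biomass = series[end + horizon]
        label = 1 if future_biomass < threshold else 0
        examples.append((label_year, window, label))
    return examples

def split_by_time(examples, first_test_year):
    """Train on examples labeled before the cutoff year, test on the rest."""
    train = [e for e in examples if e[0] < first_test_year]
    test = [e for e in examples if e[0] >= first_test_year]
    return train, test

examples = make_windows(toy_stock, LOOKBACK, HORIZON, COLLAPSE_THRESHOLD)
train, test = split_by_time(examples, first_test_year=2008)

for label_year, window, label in examples:
    print(label_year, window, label)
```

Every training example here is labeled with a year earlier than any test example, so the model never sees the future it is asked to predict. With a single toy stock the training set contains no collapse cases at all, which is a reminder that a real project needs many stocks and a check on class balance in each split.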
Common Pitfalls
- Shuffling years from the same stock randomly across train and test sets, which lets the model train on years that come after the ones it is tested on and leaks future information.
- Using raw accuracy on a highly imbalanced dataset, which can hide a model that misses most collapse cases.
- Feeding missing assessment values straight into the network, which can turn data gaps into fake trends.
- Combining stocks with different assessment methods without tracking the method, which makes one label mean different things across rows.
- Judging the model only by final predictions, which misses whether it gives an early warning or just reacts late.
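Two of the pitfalls above, misleading accuracy and late warnings, are easy to check with a few lines of Python. The sketch below uses invented labels, risk scores, and biomass values, and the function names and cutoffs are assumptions chosen for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of true collapse cases (label 1) that the model caught."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = true_pos + false_neg
    return true_pos / total if total else 0.0

def warning_lead_time(years, risk, biomass, risk_cutoff, biomass_threshold):
    """Years between the first risk alarm and the first threshold crossing.

    Positive means the alarm came early; zero or negative means it came late.
    Returns None if the model never raised an alarm or the stock never crossed.
    """
    alarm_year = next((y for y, r in zip(years, risk) if r >= risk_cutoff), None)
    crash_year = next((y for y, b in zip(years, biomass) if b < biomass_threshold), None)
    if alarm_year is None or crash_year is None:
        return None
    return crash_year - alarm_year

# Imbalanced toy data: 2 collapse cases among 20 stock-years.
y_true = [0] * 18 + [1] * 2
always_safe = [0] * 20  # a "model" that never predicts collapse

acc = accuracy(y_true, always_safe)
rec = recall(y_true, always_safe)
print(acc, rec)

# Invented risk scores and biomass for one stock.
years = [2004, 2005, 2006, 2007, 2008, 2009]
risk = [0.1, 0.2, 0.6, 0.8, 0.9, 0.9]
biomass = [0.65, 0.50, 0.35, 0.25, 0.18, 0.15]
lead = warning_lead_time(years, risk, biomass, risk_cutoff=0.5, biomass_threshold=0.20)
print(lead)
```

The always-safe model scores high on accuracy while catching none of the collapse cases, which is why recall on the collapse class belongs in your results table. The lead-time function gives you one number per stock that says whether the alarm arrived before the crash.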
What Makes This Competitive
A strong version of this project does more than train an LSTM. It defines collapse risk carefully, uses time-based validation, and compares the network against simpler baselines like logistic regression. The best entries also test calibration and false-negative rates, since a missed collapse warning usually costs more than a false alarm. If you can check whether the model transfers across regions or species groups, the project starts to look much closer to real fisheries work.
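Calibration asks whether predicted probabilities match observed frequencies: among stock-years given a risk near 0.8, roughly 80% should actually collapse. A minimal sketch of two common checks, the Brier score and a binned reliability table, is shown here in plain Python. The data are invented and the function names are hypothetical.

```python
def brier_score(y_true, p_pred):
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def reliability_table(y_true, p_pred, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns rows of (bin_low, bin_high, mean_predicted, observed_rate, count)
    for every bin that contains at least one prediction.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, p_pred):
        index = min(int(p * n_bins), n_bins - 1)  # keep p == 1.0 in the last bin
        bins[index].append((y, p))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue
        mean_predicted = sum(p for _, p in members) / len(members)
        observed_rate = sum(y for y, _ in members) / len(members)
        rows.append((i / n_bins, (i + 1) / n_bins, mean_predicted, observed_rate, len(members)))
    return rows

# Invented outcomes and predicted collapse probabilities.
y_true = [0, 0, 1, 1]
p_pred = [0.1, 0.3, 0.7, 0.9]

score = brier_score(y_true, p_pred)
table = reliability_table(y_true, p_pred, n_bins=2)
print(score)
for row in table:
    print(row)
```

In a well-calibrated model the `mean_predicted` and `observed_rate` columns stay close to each other in every bin. With a real dataset you would want many more predictions per bin than this toy example has before reading much into the table.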
Project Variations
- Swap the target from collapse risk to one-year biomass decline and see whether the same model learns a cleaner signal.
- Compare an LSTM with a random forest or XGBoost baseline to test whether sequence memory adds value.
- Add ocean variables, such as sea surface temperature or chlorophyll, to see whether environmental context improves early warnings.
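For the first variation, the target changes from a threshold crossing to a year-over-year drop. One possible way to build that label is sketched here, assuming positive biomass values and a hypothetical cutoff of a 10% decline; both the cutoff and the function name are placeholders, and the numbers are invented.

```python
def decline_labels(biomass, min_drop=0.10):
    """Label each year 1 if biomass falls by at least `min_drop` (as a fraction) the next year.

    Assumes all biomass values are positive. The last year gets no label
    because there is no following year to compare against.
    """
    labels = []
    for current, following in zip(biomass, biomass[1:]):
        fractional_drop = (current - following) / current
        labels.append(1 if fractional_drop >= min_drop else 0)
    return labels

# Invented relative biomass values for five consecutive years.
toy_biomass = [1.00, 0.95, 0.90, 0.80, 0.65]
labels = decline_labels(toy_biomass)
print(labels)
```

Because declines are more common than full collapses, this target usually gives the model more positive examples to learn from, which is one reason the signal may come out cleaner.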
Learn More
- NOAA Fisheries Stock Assessment Reports: Official stock status summaries and methods, found on the NOAA Fisheries website.
- NOAA Fisheries Data Access Portal: Download public catch and assessment datasets, found through NOAA Fisheries data pages.
- FishBase: Species biology and life-history context, found by searching FishBase online.
- PubMed: Review articles on fisheries stock assessment, time-series forecasting, and neural networks, found by searching PubMed.
- MIT OpenCourseWare: Free lectures on machine learning and time series, found by searching MIT OpenCourseWare.
Animal Sciences Category Guide
How to Do Real Animal Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
