Smartphone Microbiology Image Dataset Project
ISEF Category: Microbiology
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Other · Difficulty: Advanced · Setup: School Lab · Time: Full Year
The Hook
A handful of bad photos can undermine a whole image dataset. In microscopy, blurry frames and inconsistent labels lead to weak models and results nobody can trust. You can help fix that by building a clean, shared benchmark for simple microbiology images. That is a real research contribution, not just a class demo.
What Is It?
This project asks you to build a public image dataset of simple microbiology samples, then train a baseline convolutional neural network, or CNN. A CNN is a type of machine learning model that learns visual patterns from images. Think of it like training a friend to spot patterns in a stack of microscope photos, except the model can work through thousands of images far faster than a person could.
Your job is not just to take pictures. You also need consistent labels, the same imaging setup, and a clear data structure so other people can reuse your work. That means thinking like a dataset curator. You decide what counts as one class, how to name files, how to handle duplicates, and how to document the whole pipeline so someone else can test it later.
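To make the curation idea concrete, here is a minimal sketch of a file-naming rule and an exact-duplicate check. The naming pattern (`<class>_<slide>_<frame>.jpg`), the `ImageRecord` fields, and both function names are hypothetical choices made for illustration, not part of any standard; adapt them to your own protocol.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical naming rule: <class>_<slide id>_<frame number>.jpg
# e.g. "yeast_s03_0012.jpg". This pattern is an assumption, not a standard.
NAME_PATTERN = re.compile(
    r"^(?P<label>[a-z]+)_(?P<slide>s\d{2})_(?P<frame>\d{4})\.jpg$"
)


@dataclass(frozen=True)
class ImageRecord:
    label: str
    slide: str
    frame: int


def parse_filename(name: str) -> ImageRecord:
    """Turn a file name into a structured record, or fail loudly."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"File name does not follow the naming rule: {name!r}")
    return ImageRecord(match["label"], match["slide"], int(match["frame"]))


def find_exact_duplicates(files: dict[str, bytes]) -> list[tuple[str, str]]:
    """Return (original, duplicate) pairs whose bytes are identical.

    This only catches byte-for-byte copies; near-duplicate frames
    need a separate, image-based check.
    """
    seen: dict[str, str] = {}
    duplicates = []
    for name in sorted(files):
        digest = hashlib.sha256(files[name]).hexdigest()
        if digest in seen:
            duplicates.append((seen[digest], name))
        else:
            seen[digest] = name
    return duplicates
```

Rejecting any file that breaks the naming rule, rather than guessing its class, keeps labeling mistakes visible at the moment they happen.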
Why This Is a Good Topic
This is a strong science fair topic because you can test a real method, not just report observations. You can compare how image quality, sample prep, label quality, and model settings change classification accuracy. The project also connects to a real need, since low-cost microscopy and citizen science depend on clean shared data. You can learn imaging, labeling, data management, and basic machine learning in one project.
Research Questions
- How does image resolution affect CNN accuracy on smartphone microscope images?
- What is the effect of consistent lighting on label accuracy and model performance?
- Does adding more labeled images improve classification performance for rare sample classes?
- To what extent do different smartphone camera models change image quality and downstream CNN accuracy?
- Which preprocessing method produces the best classification results for low-cost microscope images?
- How does including duplicate or near-duplicate images affect validation accuracy?
- What is the effect of manual label review on dataset consistency and model reliability?
Basic Materials
- Smartphone with a high-resolution camera.
- Clip-on smartphone microscope lens or basic phone microscope adapter.
- Basic light microscope with a camera adapter, or a stable phone-to-microscope mount.
- Safe BSL-1 sample set, such as baker's yeast, pond water, onion epidermis, or yogurt bacteria from supervised school sources.
- Glass slides, cover slips, and slide storage boxes.
- Disposable transfer tools, such as plastic pipettes or inoculation loops, if approved by your teacher.
- Lens cleaning cloth and alcohol wipes for optics care.
- Uniform background card or stage setup for repeatable imaging.
- Spreadsheet software for labels and metadata.
- Digital notebook for filenames, class labels, and capture conditions.
Advanced Materials
- Research-grade brightfield microscope with camera port.
- Smartphone adapters with fixed alignment for repeatable capture.
- Calibration slide for scale correction.
- Controlled illumination setup with consistent color temperature.
- Image annotation software for dataset auditing.
- Computing laptop or desktop with enough memory for CNN training.
- External hard drive or cloud storage for raw images and backups.
- Basic lab supplies approved by your supervisor for BSL-1 sample preparation.
- Reference slide set for quality control and class verification.
Software & Tools
- Python: Trains a baseline CNN and handles dataset cleaning, splitting, and evaluation.
- ImageJ: Checks image quality, scale, and contrast across the dataset.
- LabelImg: Draws and reviews bounding-box annotations, which helps if you label individual cells or objects inside an image and want to audit those labels.
- Google Sheets: Tracks sample IDs, imaging conditions, and class labels in a clean table.
- Jupyter Notebook: Keeps your analysis, code, and plots in one reproducible file.
Experiment Steps
- Define the image classes you will include and write strict label rules for each one.
- Choose one imaging setup, then lock it down so you can collect consistent photos across sessions.
- Plan a metadata scheme that records sample source, magnification, lighting, and file name for every image.
- Build a quality-control system that removes blurry frames, duplicates, and mislabeled samples before training.
- Split the dataset into train, validation, and test sets in a way that avoids leakage across near-identical images.
- Train a baseline CNN, then compare its performance across image quality and preprocessing choices.
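The leakage-avoiding split in the steps above can be sketched as follows. The idea is to assign whole groups of related frames (here, every frame from one slide) to a single split, so near-identical images never straddle train and test. Treating the slide ID as the group, the 70/15/15 proportions, and the function names are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def assign_split(group_id: str, val_pct: int = 15, test_pct: int = 15) -> str:
    """Map a group (e.g. one slide) to a split using a stable hash.

    The same group_id always lands in the same split, even after
    new images are added to the dataset.
    """
    digest = hashlib.sha256(group_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable value in 0..99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"


def split_by_group(records: list[tuple[str, str]]) -> dict[str, list[str]]:
    """records holds (file_name, group_id) pairs; returns split -> file names."""
    splits = defaultdict(list)
    for file_name, group_id in records:
        splits[assign_split(group_id)].append(file_name)
    return dict(splits)
```

With only a few slides, the actual proportions can drift well away from the targets, so check the per-class counts in each split before training.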
Common Pitfalls
- Mixing images from different lighting setups, which makes the model learn brightness instead of biology.
- Letting near-duplicate frames appear in both training and test sets, which inflates accuracy.
- Using vague labels, which turns classes like "cell" or "colony" into inconsistent categories.
- Collecting too few images for one class, which makes the dataset unbalanced and weakens the baseline model.
- Skipping quality control on focus and scale, which creates noisy data that other people cannot reuse.
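One way to put a number on focus quality is the variance of the Laplacian, a commonly used sharpness measure: sharp images have strong edges, so the Laplacian response varies widely, while blurry images give a flatter response. The sketch below is written in plain Python on a nested list so it runs without extra libraries; a real pipeline would likely use an image library for speed. The function name is hypothetical, and any cutoff for "too blurry" is something you would need to calibrate on your own images.

```python
from statistics import pvariance


def laplacian_variance(gray: list[list[float]]) -> float:
    """Variance of a 4-neighbour Laplacian over the interior pixels.

    `gray` is a grayscale image given as rows of pixel intensities.
    Higher values mean more sharp edges, so blurrier frames score lower.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("Image must be at least 3x3 pixels")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)
```

Scores are only comparable between images taken at the same resolution and magnification, which is one more reason to lock down the imaging setup.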
What Makes This Competitive
A strong version of this project does more than collect pictures. You would need a clear curation protocol, clean metadata, and a dataset split that avoids leakage. You would also compare at least one thoughtful design choice, such as lighting control, preprocessing, or class balance, and back that comparison with a statistical test rather than a single accuracy number. If you document the workflow well enough that another student can reuse it, the project starts to feel like a research tool, not just a photo archive.
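The guide does not prescribe a particular statistical test. One reasonable option, when two models are evaluated on the same test images, is McNemar's test, which looks only at the images where exactly one model is correct. The sketch below computes the exact two-sided version; the function name and argument layout are assumptions made for illustration.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b) -> float:
    """Two-sided exact McNemar p-value for two models on the same test images."""
    if not (len(y_true) == len(pred_a) == len(pred_b)):
        raise ValueError("All three sequences must have the same length")
    # Count only the images where exactly one model is correct.
    only_a = sum(a == t and b != t for t, a, b in zip(y_true, pred_a, pred_b))
    only_b = sum(b == t and a != t for t, a, b in zip(y_true, pred_a, pred_b))
    n = only_a + only_b
    if n == 0:
        return 1.0  # the models never disagree on correctness
    k = min(only_a, only_b)
    # Under the null hypothesis, each discordant image is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A small p-value suggests the two setups differ by more than chance on this test set; it says nothing about how large or how useful the difference is, so report the accuracies alongside it.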
Project Variations
- Build the dataset around one sample type, such as yeast, algae, or onion cells, to reduce label noise and make the benchmark more precise.
- Compare smartphone microscopy against a school microscope camera to test which setup produces cleaner images for CNN training.
- Add a data-quality study that measures how blur, crop size, and contrast change classification accuracy across the same labeled images.
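For the data-quality variation, one simple way to create controlled blur levels is to apply a box (mean) filter of increasing radius to the same labeled images and re-evaluate the model at each level. This is a minimal sketch in plain Python on a nested list; the function name and the choice of a box filter, rather than for example a Gaussian, are assumptions made for illustration.

```python
def box_blur(gray: list[list[float]], radius: int = 1) -> list[list[float]]:
    """Replace each pixel with the mean of its (2r+1) x (2r+1) neighbourhood.

    Neighbourhoods are clipped at the image borders, so edge pixels
    average over fewer values. radius=0 returns an unblurred copy.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    height, width = len(gray), len(gray[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            total = sum(gray[j][i] for j in ys for i in xs)
            row.append(total / (len(ys) * len(xs)))
        blurred.append(row)
    return blurred
```

Because every blur level is generated from the same source images, any change in accuracy can be attributed to the blur itself rather than to differences in samples or labels.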
Learn More
- PubMed: Search review articles on smartphone microscopy, image analysis, and low-cost diagnostics.
- NIH NCBI Bookshelf: Find free background chapters on microscopy, cell imaging, and machine learning basics.
- NASA Open Science Data Repository: Study how well-documented public datasets are organized and described.
- ImageJ documentation: Learn free tools for image inspection, scale calibration, and contrast checks.
- MIT OpenCourseWare: Search for free courses on machine learning and computer vision to understand CNN basics.
Microbiology Category Guide
How to Do Real Microbiology Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
