Urban Tree Phylogeny From Leaf Features
ISEF Category: Plant Sciences
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Systematics and Evolution · Difficulty: Advanced · Setup: Home Setup · Time: Full Year
The Hook
A single leaf can tell you a lot about a tree. Shape, edge, vein pattern, and symmetry all hold clues about which species are related. Your job is to turn those clues into data, then test whether they match DNA-based family trees.
What Is It?
This project asks a simple question with a deep biology twist: can leaf shape help you rebuild relationships among tree species? A phylogeny is a family tree for organisms. Instead of people and grandparents, you compare species and shared traits. If two trees have similar leaves, they may be related, but leaf shape can also change because of climate, light, or city stress, so similar leaves do not prove shared ancestry.
You will measure leaf traits in two ways. ImageJ can pull numerical features from leaf photos, such as area, perimeter, length, width, and shape ratios. A CNN, or convolutional neural network, can learn patterns from images and sort species or predict relatedness. Then you compare that leaf-based tree to a molecular phylogeny built from public ITS and rbcL sequences, which are common DNA barcode regions used in plant studies. Think of it like comparing a tree built from fingerprints on the outside with one built from DNA on the inside.
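The shape ratios mentioned above can be computed from the basic measurements once you export them to a spreadsheet. The sketch below is one minimal way to do it in Python. The function name, the `rectangularity` ratio, and the sample numbers are illustrative assumptions, not part of any required workflow. The circularity formula, 4π × area / perimeter², matches the definition ImageJ uses for its own circularity descriptor.

```python
import math

def shape_descriptors(area, perimeter, length, width):
    """Return simple shape ratios from basic leaf measurements.

    All inputs must use the same unit system (for example cm and cm^2).
    """
    if min(area, perimeter, length, width) <= 0:
        raise ValueError("all measurements must be positive")
    return {
        # 1.0 for a perfect circle, smaller for lobed or elongated outlines
        "circularity": 4 * math.pi * area / perimeter ** 2,
        # how many times longer than wide the blade is
        "aspect_ratio": length / width,
        # fraction of the length-by-width box that the blade fills
        "rectangularity": area / (length * width),
    }

# Hypothetical measurements for one leaf (cm^2 and cm), for illustration only
leaf = shape_descriptors(area=24.0, perimeter=22.0, length=8.0, width=4.0)
print(leaf)
```

Ratios like these are unitless, so they stay comparable between a small leaf and a large one, which raw area and perimeter do not.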
Why This Is a Good Topic
This is a strong science fair topic because it has real data, clear variables, and a built-in comparison. You can test whether morphology, which you can measure from photos, agrees with molecular evidence from public databases. That makes the project more than a photo classification task. It connects to plant systematics, urban ecology, and species identification, and you can learn image analysis, basic machine learning, and phylogenetic thinking in one project.
Research Questions
- How does leaf-shape similarity from ImageJ measurements compare with species relatedness in the ITS and rbcL phylogeny?
- What is the effect of using CNN-extracted image features versus hand-measured traits on tree topology agreement?
- Does adding vein pattern metrics improve agreement between the morphology-based phylogeny and the molecular phylogeny?
- To what extent do urban growing conditions change leaf measurements within the same tree species?
- Which leaf traits best predict clades in local urban tree species?
- How does species identification accuracy change when you train a CNN on leaves from only one neighborhood versus multiple neighborhoods?
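For the first question in the list above, one common way to compare leaf-shape similarity with relatedness is a Mantel-style test: correlate the pairwise morphological distances with the pairwise molecular distances, then shuffle the species labels to see how often chance alone does as well. The sketch below is a plain-Python version under stated assumptions. The function names, the one-sided test direction, and the two small distance matrices are all made up for illustration. With real data you would use many more species and distances from your own measurements and alignments.

```python
import random
from math import sqrt

def upper_triangle(matrix):
    """Flatten the values above the diagonal of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(morph, mol, permutations=999, seed=0):
    """One-sided Mantel-style test for a positive association."""
    rng = random.Random(seed)
    n = len(morph)
    mol_flat = upper_triangle(mol)
    observed = pearson(upper_triangle(morph), mol_flat)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Relabel species in the morphology matrix (rows and columns together)
        shuffled = [[morph[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(shuffled), mol_flat) >= observed:
            hits += 1
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value

# Hypothetical distances among four species, for illustration only
morph_dist = [[0, 2, 6, 6], [2, 0, 6, 6], [6, 6, 0, 3], [6, 6, 3, 0]]
mol_dist = [[0, 2, 5, 7], [2, 0, 5, 7], [5, 5, 0, 7], [7, 7, 7, 0]]
r, p = mantel(morph_dist, mol_dist)
print(f"correlation = {r:.2f}, permutation p = {p:.3f}")
```

Whole rows and columns are shuffled together because the entries of a distance matrix are not independent, so an ordinary correlation p-value would be too optimistic.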
Basic Materials
- Digital camera or smartphone with consistent photo settings.
- Flat scanner or light box for leaf imaging.
- Ruler or scale bar for photos.
- White background and dark background for contrasting images.
- Computer with ImageJ installed.
- Spreadsheet software for organizing measurements.
- Internet-connected computer for searching public NCBI sequence databases.
- Local tree identification guide or campus arborist list.
- Leaves from multiple urban tree species, collected legally and ethically.
- Optional tripod or copy stand for fixed-distance photography.
Advanced Materials
- Herbarium-quality drying materials for archiving voucher leaves.
- High-resolution flatbed scanner or macro camera setup.
- Computer with Python, TensorFlow, or PyTorch.
- External storage for large image datasets.
- Access to MEGA, IQ-TREE, or similar phylogeny software.
- Access to MUSCLE or MAFFT for sequence alignment.
- Geometric morphometrics tools for landmark analysis.
- GPS-enabled phone or device for sampling location metadata.
- Access to a university herbarium or botany lab for specimen verification.
Software & Tools
- ImageJ: Measures leaf area, perimeter, length, width, and shape descriptors from images.
- Python: Runs data cleaning, feature analysis, and CNN workflows.
- Google Colab: Gives you a free cloud notebook for Python and model training.
- MEGA: Aligns sequences and builds simple molecular phylogenies from public ITS and rbcL data.
- iNaturalist: Helps you confirm likely tree species and compare your field identifications.
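Before handing alignments to a tool such as MEGA, it can help to see what a molecular distance actually is. The sketch below computes the simplest one, the p-distance: the fraction of aligned sites where two sequences differ. The function name, the choice to skip gaps and ambiguous bases, and the toy sequences are assumptions for illustration. Phylogeny software offers more realistic substitution models, so treat this as a learning aid rather than a replacement.

```python
def p_distance(seq_a, seq_b, skip_chars="-N?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

# Toy aligned fragments, invented for illustration (not real ITS or rbcL data)
species_1 = "ACGT-ACGTA"
species_2 = "ACGTTACGAA"
print(p_distance(species_1, species_2))
```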
Experiment Steps
- Define the tree species set you will compare, and make sure each species has both usable leaves and public ITS or rbcL data.
- Decide which leaf traits you will measure by hand, by ImageJ, and by CNN features, so your image pipeline stays consistent.
- Build a clean photo standard, then test whether your images can be measured reliably across different lighting and backgrounds.
- Gather public sequence data, align the DNA regions, and plan how you will build the molecular phylogeny for comparison.
- Choose the statistic that will compare the two trees, such as topology agreement, distance correlation, or trait-clade association.
- Plan a validation step that checks whether your model and measurements hold up across neighborhoods, seasons, or nearby species.
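To make the tree-comparison step above concrete, here is a minimal sketch of one possible approach: build a rooted tree from each distance matrix with average-linkage clustering (UPGMA), list the groups of species (clades) each tree contains, and count the clades found in one tree but not the other. That count is a rooted, simplified form of the Robinson-Foulds distance. UPGMA is used here only because it is short to write. Your molecular tree would normally come from MEGA or IQ-TREE, and the function name, species labels, and distances below are hypothetical.

```python
def upgma_clades(names, dist):
    """Cluster species by average linkage and return the non-trivial clades."""
    clusters = {i: frozenset([name]) for i, name in enumerate(names)}
    n = len(names)
    # Distances between current clusters, keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    clades = set()
    next_id = n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        size_a, size_b = len(clusters[a]), len(clusters[b])
        new_d = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # Size-weighted mean keeps this a true average over all leaf pairs
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        merged = clusters.pop(a) | clusters.pop(b)
        clusters[next_id] = merged
        if len(clusters) > 1:  # skip the root, which contains every species
            clades.add(merged)
        next_id += 1
    return clades

# Hypothetical species labels and distances, for illustration only
names = ["A", "B", "C", "D"]
morph_dist = [[0, 2, 6, 6], [2, 0, 6, 6], [6, 6, 0, 3], [6, 6, 3, 0]]
mol_dist = [[0, 2, 5, 7], [2, 0, 5, 7], [5, 5, 0, 7], [7, 7, 7, 0]]

morph_clades = upgma_clades(names, morph_dist)
mol_clades = upgma_clades(names, mol_dist)
shared = morph_clades & mol_clades
mismatch = len(morph_clades ^ mol_clades)  # clades present in only one tree
print("shared clades:", sorted(sorted(c) for c in shared))
print("clades in only one tree:", mismatch)
```

A mismatch of zero means the two trees group the species identically. Reporting which clades disagree is usually more informative than the count alone.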
Common Pitfalls
- Treating many leaves from a single tree as if they were independent samples of the species, which makes your trait data look more uniform than it really is.
- Photographing curved or damaged leaves, which distorts area, perimeter, and shape ratios.
- Using inconsistent image backgrounds or scaling, which makes ImageJ measurements impossible to compare across samples.
- Building a CNN with too few species or too few images per species, which leads to overfitting instead of real pattern learning.
- Comparing the morphology tree to poorly chosen DNA sequences, which can give a misleading mismatch because the sequences are badly aligned or the gene region varies too little to separate your species.
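The scaling pitfall above comes down to simple arithmetic: lengths scale with the pixels-per-centimeter factor, but areas scale with its square. ImageJ's Set Scale command handles this conversion for you. The sketch below only shows the calculation so you can check exported numbers by hand. The function names and sample values are assumptions for illustration.

```python
def pixels_per_cm(scale_bar_px, scale_bar_cm):
    """Calibration factor from a scale bar of known length in the photo."""
    if scale_bar_px <= 0 or scale_bar_cm <= 0:
        raise ValueError("scale bar lengths must be positive")
    return scale_bar_px / scale_bar_cm

def to_real_units(area_px, perimeter_px, px_per_cm):
    """Convert pixel measurements to centimeters and square centimeters."""
    return {
        "area_cm2": area_px / px_per_cm ** 2,    # area uses the squared factor
        "perimeter_cm": perimeter_px / px_per_cm,
    }

# Hypothetical photo: a 5 cm scale bar spans 600 pixels
scale = pixels_per_cm(scale_bar_px=600, scale_bar_cm=5)
leaf = to_real_units(area_px=144000, perimeter_px=1800, px_per_cm=scale)
print(scale, leaf)
```

Recompute the factor for every photo unless the camera distance is truly fixed, because a small change in distance changes every area measurement by the square of the error.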
What Makes This Competitive
A competitive version of this project does more than classify leaves. It asks whether morphology really tracks evolutionary history, and it tests that claim with careful statistics. Strong projects compare several feature sets, use blinded validation, and explain where leaf shape fails. The best entries also show why urban growth conditions may blur or distort signals that taxonomy textbooks assume are stable.
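One way to make validation harder to fool is to hold out an entire neighborhood at a time, so the model is always tested on trees it has never seen. The sketch below shows such a leave-one-group-out split in plain Python. The record fields (`file`, `species`, `neighborhood`) and the sample records are hypothetical. It is a data-splitting aid only and does not train a model.

```python
from collections import defaultdict

def leave_one_group_out(records, group_key="neighborhood"):
    """Yield (held_out_group, train_records, test_records) for each group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)
    for held_out in sorted(groups):
        train = [r for g, recs in groups.items() if g != held_out for r in recs]
        yield held_out, train, groups[held_out]

# Hypothetical image records, for illustration only
records = [
    {"file": "img001.jpg", "species": "Acer rubrum", "neighborhood": "north"},
    {"file": "img002.jpg", "species": "Quercus alba", "neighborhood": "north"},
    {"file": "img003.jpg", "species": "Acer rubrum", "neighborhood": "south"},
    {"file": "img004.jpg", "species": "Quercus alba", "neighborhood": "east"},
    {"file": "img005.jpg", "species": "Acer rubrum", "neighborhood": "east"},
]

for held_out, train, test in leave_one_group_out(records):
    print(held_out, "train:", len(train), "test:", len(test))
```

If accuracy drops sharply on held-out neighborhoods, the model may be learning local lighting or background cues rather than leaf shape.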
Project Variations
- Compare street trees, campus trees, and park trees to see whether urban stress changes leaf-based phylogeny agreement.
- Replace CNN features with geometric morphometrics landmarks to test whether hand-engineered shape features beat deep learning on the same dataset.
- Focus on one genus, such as maple or oak, and test whether closely related species are easier to separate by leaves than by public DNA barcodes.
Learn More
- NCBI GenBank: Search public ITS and rbcL sequences for your target tree species and download records for phylogeny work.
- NCBI Taxonomy Browser: Check accepted scientific names and synonym history before you build your species list.
- US National Arboretum Plant Hardiness and tree resources: Use government plant references to confirm common urban tree identities.
- ImageJ documentation: Learn how to measure leaf images and extract shape descriptors. Find the official documentation and tutorials online.
- MIT OpenCourseWare, Introduction to Computational Thinking and Data Science: Review free Python-based data analysis ideas that help with data cleaning and comparison.
- MEGA user documentation and tutorials: Learn how to align sequences and build basic phylogenetic trees using a widely used free tool.
Plant Sciences Category Guide
How to Do Real Plant Sciences Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Databases →
To discover more projects, visit the MehtA+ Science Fair Project Discovery Hub →
