AI Phishing Email Detection With Tiny On-Device Models

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Cybersecurity · Difficulty: Advanced · Setup: University Lab · Time: Full Year

The Hook

Phishing emails get past people because they copy the look of real messages. AI now makes those fakes cleaner and faster to produce. That creates a sharp test for a student project: can software spot clues humans miss? You can turn that question into a real detector and measure how well it works.

What Is It?

This project asks whether a small model can flag phishing emails by combining three kinds of clues. First, headers, which are the hidden routing details of an email. Second, link entropy, which measures how random or messy a URL looks. Third, stylometric drift, which means the writing style shifts away from the sender's normal pattern.

Think of it like checking a note for three signs of a fake signature. The ink can look right, the paper can look right, but the spacing, pressure, and wording can still feel off. Here, the email body, the link text, and the header data each give you a different clue. A tiny on-device model tries to learn from all of them without sending private mail to the cloud.
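
To make the link clue concrete, here is a minimal sketch of one common way to score how random a URL looks: Shannon entropy over its characters. The function names and the choice to score only the host and path are assumptions for illustration, not a fixed definition of link entropy.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of the characters in text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def link_entropy(url: str) -> float:
    """Entropy of the host plus path, ignoring the scheme (an illustrative choice)."""
    parts = urlsplit(url)
    return shannon_entropy(parts.netloc + parts.path)


if __name__ == "__main__":
    # Both URLs are made-up examples on reserved example domains.
    for url in ("https://example.com/login",
                "http://x7f3q9z.example.net/a8Kd2LpQ0vZ"):
        print(f"{link_entropy(url):.2f}  {url}")
```

A URL with many distinct, evenly spread characters scores higher than a short, readable one. Long legitimate tracking links can also score high, so treat this as one feature among several rather than a verdict.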

Why This Is a Good Topic

This is a strong science fair topic because you can measure real performance, not just build a demo. You can test accuracy, false positives, false negatives, and speed on known phishing sets and synthetic AI-made phish. It connects to a real problem that affects schools, families, and businesses. You can also explore tradeoffs between privacy, model size, and detection quality, which gives you real research depth.

Research Questions

How does adding header features change phishing detection accuracy compared with text only?
What is the effect of including link entropy on false positive rates?
Does a tiny on-device model perform better than a baseline keyword filter on the Nazario phishing corpus paired with a set of legitimate emails?
To what extent do GPT-generated phishing emails evade a detector trained only on older phishing data?
Which feature set best separates phishing emails from legitimate emails when the model must run locally?
How does stylometric drift differ between real phishing emails and synthetic AI phishing emails?
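
For the stylometric drift question, one possible way to turn "style shift" into a number is sketched below. It builds a small style vector per email and measures how far a new message sits from a sender's earlier messages, in units of that sender's own variation. The four features and the function names are hypothetical choices; a real project would likely test a richer set.

```python
import re
import statistics


def style_features(text: str) -> list[float]:
    """Return a small, hypothetical style vector for one email body."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = max(len(text), 1)
    return [
        sum(len(w) for w in words) / max(len(words), 1),  # mean word length
        len(words) / max(len(sentences), 1),               # words per sentence
        sum(c in ",;:!?" for c in text) / n_chars,         # punctuation rate
        sum(c.isupper() for c in text) / n_chars,          # uppercase rate
    ]


def stylometric_drift(baseline_texts: list[str], new_text: str) -> float:
    """Mean absolute z-score of new_text against the sender's baseline emails."""
    baseline = [style_features(t) for t in baseline_texts]
    new = style_features(new_text)
    z_scores = []
    for i, value in enumerate(new):
        column = [row[i] for row in baseline]
        mean = statistics.fmean(column)
        # If a feature never varies in the baseline, fall back to a scale of 1.0
        # so the division is defined.
        spread = statistics.pstdev(column) or 1.0
        z_scores.append(abs(value - mean) / spread)
    return statistics.fmean(z_scores)
```

A higher score means the new email looks less like the sender's usual writing. With only a few baseline emails the estimate is noisy, which is itself worth reporting.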

Basic Materials

Laptop or desktop computer with at least 8 GB RAM.
Thunderbird or Outlook test account access.
Python installed locally.
Email corpora, such as the Nazario phishing corpus and a set of legitimate emails you are allowed to use.
Text editor or IDE, such as VS Code.
Spreadsheet software for tracking model results.
Basic graphics tool for confusion matrices and charts.

Advanced Materials

University or school server with GPU access for model testing.
Email header parsing library in Python, such as the standard library's email package.
Machine learning and natural language processing libraries, such as scikit-learn and spaCy.
TinyML or lightweight inference framework, such as ONNX Runtime or TensorFlow Lite.
Secure sandbox environment for handling email corpora.
Version control system, such as Git.
Annotation tool for labeling edge cases in phishing and legitimate mail.
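
As a starting point for header parsing, here is a minimal sketch that uses Python's built-in email package to pull a few routing clues from a raw message. The sample message, the feature names, and the choice of mismatch checks are assumptions for illustration.

```python
from email import message_from_string
from email.utils import parseaddr

# A made-up message that uses reserved example domains and documentation IPs.
RAW_EMAIL = """\
From: "IT Support" <support@example.com>
Reply-To: helpdesk@example.net
Return-Path: <bounce@mailer.example.org>
Received: from mail.example.org (mail.example.org [192.0.2.10])
Received: from relay.example.net (relay.example.net [198.51.100.7])
Subject: Password reset required

Please reset your password.
"""


def domain_of(header_value) -> str:
    """Return the lowercased domain of an address header, or '' if absent."""
    _, address = parseaddr(header_value or "")
    return address.rpartition("@")[2].lower() if "@" in address else ""


def header_features(raw: str) -> dict:
    """Extract a few hypothetical header features from a raw email string."""
    msg = message_from_string(raw)
    from_domain = domain_of(msg["From"])
    reply_domain = domain_of(msg["Reply-To"])
    return_domain = domain_of(msg["Return-Path"])
    return {
        "received_hops": len(msg.get_all("Received", [])),
        "reply_to_mismatch": bool(reply_domain) and reply_domain != from_domain,
        "return_path_mismatch": bool(return_domain) and return_domain != from_domain,
        "has_authentication_results": msg["Authentication-Results"] is not None,
    }


if __name__ == "__main__":
    print(header_features(RAW_EMAIL))
```

Legitimate bulk mail often has a Return-Path on a different domain than From, so a mismatch is a clue to weigh, not proof of phishing.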

Software & Tools

Python: Runs data cleaning, feature extraction, model training, and evaluation.
Jupyter Notebook: Lets you test features and compare classifiers step by step.
scikit-learn: Builds baseline models and reports precision, recall, and F1 score.
pandas: Organizes email metadata, text features, and evaluation tables.
Thunderbird add-on development tools: Help you prototype a local detector inside an email client.

Experiment Steps

Define the detection task by choosing exactly what counts as phishing, AI-generated phishing, and legitimate mail.
Collect and label a balanced dataset, then separate training, validation, and test sets so no email, and no near-duplicate from the same campaign, appears in more than one set.
Design feature groups for headers, links, and writing style, then decide which groups you will compare alone and together.
Build a baseline classifier first, then add the tiny model and compare accuracy, precision, recall, and false positive rate.
Plan an evaluation set that includes older phishing, modern phishing, and synthetic AI phishing so you can test generalization.
Test whether the detector can run fast enough inside an email client without sending message content off device.
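
The baseline and the comparison metrics in the steps above can be sketched in a few lines. The keyword list is a hypothetical placeholder, and labels are assumed to be 1 for phishing and 0 for legitimate mail.

```python
KEYWORDS = ("verify your account", "urgent", "password", "click here")  # hypothetical list


def keyword_baseline(text: str) -> int:
    """Return 1 (phishing) if any keyword appears in the text, else 0 (legitimate)."""
    lowered = text.lower()
    return int(any(keyword in lowered for keyword in KEYWORDS))


def evaluate(y_true, y_pred) -> dict:
    """Compute confusion-matrix counts and the metrics named in the steps."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(numerator, denominator):
        # Report 0.0 instead of failing when a class is missing.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "false_positive_rate": ratio(fp, fp + tn),
        "f1": ratio(2 * precision * recall, precision + recall),
    }
```

Running the same evaluate function on the baseline and on the tiny model keeps the comparison fair, since both are scored on identical test emails with identical definitions.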

Common Pitfalls

Mixing training and test emails from the same campaign, which makes the model look better than it really is.
Using only one kind of phishing email, which teaches the detector one template instead of the broader pattern.
Letting legitimate training mail come from a very different source than the phishing mail, which adds dataset bias.
Parsing headers incorrectly, which drops useful routing clues and creates noisy features.
Evaluating only overall accuracy, which hides the false positive problem that matters most in email security.
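
One way to avoid the campaign leakage pitfall is to split by campaign rather than by individual email. The sketch below assumes each email already carries a campaign label under a hypothetical "campaign" key; how you define a campaign (for example, by sending domain plus a normalized subject line) is a design decision to justify in your write-up.

```python
import hashlib


def assign_split(campaign_id: str, val_pct: int = 15, test_pct: int = 15) -> str:
    """Map a campaign ID to 'train', 'val', or 'test' using a stable hash."""
    digest = hashlib.sha256(campaign_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable value from 0 to 99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"


def split_by_campaign(emails) -> dict:
    """Group emails into splits so every campaign lands in exactly one split.

    emails: iterable of dicts with a hypothetical 'campaign' key.
    """
    splits = {"train": [], "val": [], "test": []}
    for email in emails:
        splits[assign_split(email["campaign"])].append(email)
    return splits
```

Because the assignment depends only on the campaign ID, rerunning the split gives the same result, and new emails from a known campaign go to the same set. The split sizes are approximate and can be uneven when there are few campaigns.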

What Makes This Competitive

A strong version of this project goes beyond a simple classifier. You compare feature families, test on old and new phishing styles, and report where the model fails. You also measure speed and memory use, since local protection only matters if it can run inside an email client. The best projects include careful error analysis, not just one accuracy score.
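
Speed and memory can be measured with Python's standard library, as in the rough sketch below. The predict argument stands for whatever function wraps your model, which is an assumption here. Note that tracemalloc only tracks memory allocated by Python, so memory used inside a native runtime such as ONNX Runtime or TensorFlow Lite would need a separate measurement.

```python
import statistics
import time
import tracemalloc


def benchmark(predict, samples, repeats: int = 5) -> dict:
    """Time predict() per email and record peak Python memory during one pass."""
    timings = []
    for _ in range(repeats):
        for sample in samples:
            start = time.perf_counter()
            predict(sample)
            timings.append(time.perf_counter() - start)

    # Measure memory in a separate pass so tracing overhead does not skew timings.
    tracemalloc.start()
    for sample in samples:
        predict(sample)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "median_ms": statistics.median(timings) * 1000,
        "max_ms": max(timings) * 1000,
        "peak_kib": peak_bytes / 1024,
    }
```

Reporting the median alongside the worst case shows both typical speed and the slowest email a user would actually wait on.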

Project Variations

Test whether the detector works better on corporate phishing emails than on consumer scam emails.
Replace stylometric features with URL and sender-domain features to see how much text style really matters.
Compare a tiny model against a rules-based filter and a larger ML model to study the privacy-speed-accuracy tradeoff.

Learn More

Nazario Phishing Corpus: Search for the public phishing corpus used in security research and download it from mirrored academic or archived sources when available.
PubMed: Search for review articles on phishing and social engineering, keeping in mind that its coverage is mainly biomedical, so results lean toward healthcare settings.
IEEE Xplore: Search for peer-reviewed papers on email phishing detection, stylometry, and lightweight classifiers.
NIST Computer Security Resource Center: Read guidance on phishing, email authentication, and security evaluation methods.
MIT OpenCourseWare: Search for machine learning and cybersecurity course materials that explain classification, text features, and model evaluation.
ACM Digital Library: Read abstracts, or full papers when available through school access, for recent work on phishing detection and email analysis.
