AI Art Watermarking for Provenance Testing

ISEF Category: Technology Enhances the Arts

Ready to Turn This Idea Into a Real Project?

This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.

For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →

Subcategory: Music and Image Manipulation  ·  Difficulty: Advanced  ·  Setup: University Lab  ·  Time: Full Year

The Hook

AI images can spread fast, and their source can vanish just as fast. That makes provenance, the record of where an image came from, a big deal. A watermark works like a digital fingerprint that should survive reposts and filters. Your project asks a sharp question: can an invisible mark stay readable after common edits? If it can, you have a real tool for trust in AI art.

What Is It?

This project studies digital watermarking for AI-generated images. A watermark is a hidden signal embedded in an image that identifies where it came from. In this case, the embedded signal is a perceptual hash: a compact code built from image features that stay similar even after mild edits.

Think of it like writing a name in pencil under layers of paint. If the paint gets scratched, the name should still be there. Your job is to test whether the mark survives common image changes, such as JPEG compression, cropping, resizing, and social media filters. If the detector still reads the mark, the image keeps its provenance. If not, the signal is too fragile for real-world use.
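
To make the idea of a perceptual hash concrete, here is a minimal sketch. It assumes a simple "average hash" (one common perceptual hashing scheme, chosen here only for illustration) and uses synthetic NumPy arrays as stand-ins for real grayscale images. The function names `average_hash` and `hamming_distance` are hypothetical, not part of any library.

```python
import numpy as np

def average_hash(gray, hash_size=8):
    """Return hash_size * hash_size bits: True where a block is brighter than average."""
    gray = np.asarray(gray, dtype=np.float64)
    height, width = gray.shape
    block_h, block_w = height // hash_size, width // hash_size
    # Trim edges so the image divides evenly into blocks.
    trimmed = gray[:block_h * hash_size, :block_w * hash_size]
    # Downsample by taking the mean of each block.
    blocks = trimmed.reshape(hash_size, block_h, hash_size, block_w).mean(axis=(1, 3))
    return (blocks > blocks.mean()).ravel()

def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two hashes."""
    return int(np.count_nonzero(hash_a != hash_b))

rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:256, 0:256]
# Synthetic "image": smooth structure, standing in for a generated picture.
image = 128 + 60 * np.sin(xx / 40.0) + 40 * np.cos(yy / 25.0)
mild_edit = image + rng.normal(0, 2.0, image.shape)   # small pixel noise
unrelated = rng.uniform(0, 255, image.shape)          # a different image

d_mild = hamming_distance(average_hash(image), average_hash(mild_edit))
d_unrelated = hamming_distance(average_hash(image), average_hash(unrelated))
print("distance after mild edit:", d_mild)
print("distance to unrelated image:", d_unrelated)
```

A small Hamming distance means the hash survived the edit. In your real experiment you would replace the synthetic arrays with images loaded through Pillow or OpenCV and replace the noise with JPEG compression, cropping, resizing, or filters.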

The core challenge is a tradeoff. A watermark must stay invisible to viewers, but strong enough for a detector to find later. That makes this a good mix of computer vision, signal processing, and experimental design.
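
The tradeoff can be seen in a few lines of code. The sketch below is a toy, not the perceptual-hash method this project describes: it assumes a simple additive pseudo-random pattern as the mark, uses PSNR (peak signal-to-noise ratio) as a rough stand-in for invisibility, and uses a correlation score as a stand-in for detectability. The names `embed`, `detect_score`, and the strength parameter `alpha` are hypothetical.

```python
import numpy as np

def psnr(original, marked, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means the change is less visible."""
    diff = original.astype(np.float64) - marked.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

def embed(image, pattern, alpha):
    """Add the +/-1 pattern to the image, scaled by the strength alpha."""
    return np.clip(image + alpha * pattern, 0, 255)

def detect_score(image, pattern):
    """Correlate the image with the known pattern; higher means easier to detect."""
    centered = image - image.mean()
    return float(np.mean(centered * pattern))

rng = np.random.default_rng(1)
image = rng.uniform(50, 200, (128, 128))              # synthetic stand-in image
pattern = rng.choice([-1.0, 1.0], size=image.shape)   # secret pseudo-random mark

results = {}
for alpha in (1.0, 4.0):
    marked = embed(image, pattern, alpha)
    results[alpha] = (psnr(image, marked), detect_score(marked, pattern))
    print(f"alpha={alpha}: PSNR={results[alpha][0]:.2f} dB, score={results[alpha][1]:.2f}")
```

Raising `alpha` raises the detection score but lowers PSNR, which is the same tension your real embedding method will face. PSNR is only a rough proxy for what people actually see, so a visual check is still worth doing.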

Why This Is a Good Topic

This is a strong science fair topic because you can test it with clear inputs and measurable outputs. You can compare how different image transformations affect detection accuracy, false positives, and signal strength. It also connects to a real problem: identifying which images came from a model, and whether that provenance can survive reposting, editing, or platform compression. Along the way, you learn how to design controlled experiments, run batch image tests, and analyze accuracy with real numbers.

Research Questions

  • How does JPEG compression change watermark detection accuracy in AI-generated images?
  • What is the effect of cropping on the detector's ability to recover the embedded perceptual hash?
  • Does resizing before or after compression change the watermark's survival rate?
  • To what extent do common social media filters reduce similarity between the embedded hash and the detected signal?
  • Which transformation causes the largest drop in detection precision across Stable Diffusion images?
  • How does adding multiple mild edits together affect the detector compared with a single edit?
  • To what extent does the watermark remain detectable in images from different Stable Diffusion prompts?

Basic Materials

  • A computer with a modern GPU or access to a school or university workstation.
  • Stable Diffusion image generation software or an approved interface.
  • Python with image processing libraries such as Pillow, OpenCV, and NumPy.
  • A dataset of AI-generated test images from multiple prompts.
  • A folder system for organizing original images and transformed copies.
  • A spreadsheet or notebook for recording detection outcomes.
  • A digital image viewer for checking visual invisibility.
  • A JPEG compression and image editing tool for controlled transformations.

Advanced Materials

  • A university GPU workstation or compute cluster.
  • Python environment with PyTorch, OpenCV, NumPy, SciPy, and scikit-image.
  • Model access for Stable Diffusion fine-tuning or latent-space modification experiments.
  • Image attack and augmentation scripts for controlled robustness testing.
  • A labeled benchmark set of generated and transformed images.
  • A secure storage system for large image batches and logs.
  • A plotting package such as Matplotlib or Seaborn.
  • A statistical analysis toolkit for ROC curves, confusion matrices, and significance tests.

Software & Tools

  • Python: Runs the image generation, transformation, detection, and analysis scripts.
  • OpenCV: Applies image edits and measures pixel-level changes.
  • Pillow: Handles batch image loading, saving, and format conversion.
  • ImageJ: Lets you inspect images and compare them before and after a transformation.
  • Jupyter Notebook: Keeps code, results, and notes in one place for analysis.

Experiment Steps

  1. Define the watermark goal, then choose whether you care most about invisibility, detectability, or both.
  2. Select one embedding method and one detector, then keep them fixed while you test transformations.
  3. Build a transformation set that covers real reposting behavior, including compression, crop, resize, and filter changes.
  4. Create a labeled image set with originals, transformed copies, and negative controls from unmarked images.
  5. Decide your success metrics, such as detection rate, false positive rate, and similarity score.
  6. Plan your comparison strategy so you can test each transformation alone and in combination.
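
One way to plan steps 3 and 6 is to write the test conditions down as data before touching any images. The sketch below is only an illustration: the transformation names, the levels (JPEG quality values, crop and resize fractions), and the filter names are all assumptions you would replace with your own choices.

```python
from itertools import combinations

# Hypothetical transformation levels; pick values that match real reposting behavior.
TRANSFORMS = {
    "jpeg": [90, 70, 50],             # JPEG quality setting
    "crop": [0.9, 0.75],              # fraction of each side kept
    "resize": [0.5, 0.25],            # scale factor
    "filter": ["sepia", "contrast"],  # placeholder filter names
}

def single_conditions(transforms):
    """Each transformation level applied on its own."""
    return [((name, level),) for name, levels in transforms.items() for level in levels]

def paired_conditions(transforms):
    """Every level of one transformation combined with every level of another."""
    pairs = []
    for first, second in combinations(transforms, 2):
        for level_a in transforms[first]:
            for level_b in transforms[second]:
                pairs.append(((first, level_a), (second, level_b)))
    return pairs

singles = single_conditions(TRANSFORMS)
pairs = paired_conditions(TRANSFORMS)
print(len(singles), "single-edit conditions")
print(len(pairs), "two-edit conditions")
```

`combinations` ignores order, so "resize then compress" and "compress then resize" count as one condition here. If the order of edits is part of your research question, `itertools.permutations` would list both orders separately.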

Common Pitfalls

  • Testing only one prompt style, which makes the watermark look stronger than it really is across diverse AI art.
  • Forgetting negative control images, which hides false positives and weakens your claim.
  • Changing image size and compression together without separating the effects, which makes the results hard to interpret.
  • Judging watermark survival by eye instead of detector scores, which does not measure provenance performance.
  • Mixing up original files and transformed copies, which breaks your labels and ruins the benchmark.
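
Several of these pitfalls come down to bookkeeping, and a small manifest check can catch them early. The sketch below is one possible layout, not a required one: the `Record` fields, the file names, and the `check_manifest` rules are all hypothetical, and byte strings stand in for real image files.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    file_name: str
    sha256: str        # hash of the file bytes, used to spot accidental duplicates
    watermarked: bool  # False for negative controls
    transform: str     # "none" for untouched originals
    parent: str        # file_name of the original; "" for originals

def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def check_manifest(records):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    originals = {r.file_name for r in records if r.transform == "none"}
    seen = {}
    for r in records:
        if r.transform != "none" and r.parent not in originals:
            problems.append(f"{r.file_name}: parent {r.parent!r} not in manifest")
        if r.sha256 in seen:
            problems.append(f"{r.file_name}: same bytes as {seen[r.sha256]}")
        seen.setdefault(r.sha256, r.file_name)
    if all(r.watermarked for r in records):
        problems.append("no negative controls (unmarked images) found")
    return problems

records = [
    Record("img001.png", content_hash(b"original-1"), True, "none", ""),
    Record("img001_jpeg70.jpg", content_hash(b"jpeg-1"), True, "jpeg70", "img001.png"),
    Record("ctrl001.png", content_hash(b"control-1"), False, "none", ""),
]
problems = check_manifest(records)
print(problems or "manifest looks consistent")
```

In a real run you would compute `content_hash` from each file's bytes on disk and rerun the check whenever you add a new batch of transformed copies.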

What Makes This Competitive

A class-level version of this project only shows that a watermark works on a few images. A stronger version tests many prompts, many attacks, and clean controls, then reports precision, recall, and false positive rates. You can also compare your method against a baseline watermark or hash method, which gives your results real context. If you add a new attack pipeline, or a better way to score survival across edits, the project becomes much more compelling.
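
The metrics named above are easy to compute once you have a detector score for every image. Here is a minimal sketch, assuming the detector outputs a score per image and you pick a threshold for calling an image "marked". The labels, scores, and threshold below are made-up illustration values, not results.

```python
def confusion_counts(labels, scores, threshold):
    """Count true/false positives and negatives at a given score threshold."""
    tp = fp = tn = fn = 0
    for is_marked, score in zip(labels, scores):
        predicted_marked = score >= threshold
        if predicted_marked and is_marked:
            tp += 1
        elif predicted_marked and not is_marked:
            fp += 1
        elif not predicted_marked and is_marked:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def rates(tp, fp, tn, fn):
    """Precision, recall, and false positive rate (0.0 when a denominator is zero)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, recall, false_positive_rate

# 1 = watermarked image, 0 = negative control (illustrative values only)
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4]

tp, fp, tn, fn = confusion_counts(labels, scores, threshold=0.5)
precision, recall, fpr = rates(tp, fp, tn, fn)
print(f"precision={precision:.2f} recall={recall:.2f} FPR={fpr:.2f}")
# prints: precision=0.75 recall=0.75 FPR=0.25
```

Repeating this for several thresholds, and separately for each transformation, gives you the points for an ROC curve and shows which edits hurt the detector most.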

Project Variations

  • Test whether the watermark survives AI-upscaling tools instead of only social media edits.
  • Compare latent-space watermarking with a visible or semi-visible watermark on the same image set.
  • Measure whether different prompt categories, such as portraits, landscapes, and text-heavy art, change watermark survival.

Learn More

  • PubMed: Search for review articles on digital watermarking, image forensics, and robustness to compression. PubMed mainly indexes biomedical literature, so expect results that focus on medical imaging.
  • arXiv: Search for preprints on AI-generated image provenance, watermarking, and latent diffusion models.
  • MIT OpenCourseWare: Look for free computer vision and signal processing course materials.
  • NVIDIA Research publications: Search for papers on AI image watermarking and detection methods.
  • IEEE Xplore: Use the abstract search to find peer-reviewed papers on perceptual hashing and image authentication.
