Local-First Encrypted Note Search

ISEF Category: Systems Software

Subcategory: Databases  ·  Difficulty: Advanced  ·  Setup: University Lab  ·  Time: Full Year

The Hook

Your phone can feel instant even when the network is slow or missing. That same responsiveness gets much harder to deliver when your notes must stay private, sync later, and still search fast. This project asks a big systems question: can one device keep a huge encrypted note store usable without asking a server for help?

What Is It?

This project combines three hard software problems. First, the app stores everything locally, so your phone can read and edit notes without internet access. Second, the data stays encrypted, which means the app must search without exposing plain text to a server. Third, the data structure must keep multiple copies of the notes in sync after people edit offline, even if those edits happen at the same time.
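
One way to picture the second problem is a keyed token index, sometimes called a blind index. The sketch below is a minimal illustration under stated assumptions, not a vetted security design: `INDEX_KEY` is a hypothetical per-device secret, the helper names (`tokenize`, `blind_token`, `build_blind_index`, `search`) are made up for this example, and encrypting the note bodies themselves is left out of scope. It uses HMAC-SHA256 from Python's standard library to turn each word into an opaque token, so the index never stores plain words.

```python
import hashlib
import hmac
import re
from collections import defaultdict

# Assumption: a per-device secret. A real app would load it from a key store.
INDEX_KEY = b"demo-key-not-for-real-use"

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def blind_token(word, key=INDEX_KEY):
    """Map a word to an opaque hex token using HMAC-SHA256."""
    return hmac.new(key, word.encode("utf-8"), hashlib.sha256).hexdigest()

def build_blind_index(notes, key=INDEX_KEY):
    """Return {token: set(note_id)} without storing any plain words."""
    index = defaultdict(set)
    for note_id, text in notes.items():
        for word in set(tokenize(text)):
            index[blind_token(word, key)].add(note_id)
    return dict(index)

def search(index, query, key=INDEX_KEY):
    """Return the ids of notes that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    matches = [index.get(blind_token(w, key), set()) for w in words]
    return set.intersection(*matches)

# Hypothetical sample notes, for illustration only.
notes = {
    1: "Buy milk and eggs",
    2: "Lab notes: milk protein assay",
    3: "Call the dentist",
}
index = build_blind_index(notes)
print(sorted(search(index, "milk")))        # ids of notes mentioning "milk"
print(sorted(search(index, "milk assay")))  # ids of notes matching both words
```

A design like this still leaks some information, because the same word always maps to the same token, so anyone who sees the index can count how often each token appears. Measuring that kind of tradeoff against search speed is exactly the sort of comparison this project can make.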

A simple way to picture it is a shared notebook that gets copied to several phones. Each phone can write in its own copy. Later, the copies merge. A CRDT, or conflict-free replicated data type, is a design that helps those copies merge in a predictable way. The formal proof of convergence asks a deep question: if each phone applies the same merge rules, do all copies end up in the same final state once every offline change has reached every phone, no matter what order the changes arrive in?
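
To make the merge idea concrete, here is a minimal sketch of one simple CRDT style, a last-writer-wins map. It is an illustration, not the only or best merge rule. It assumes each note entry is a tuple of `(timestamp, replica_id, text)`, that timestamps are simple counters, and that no two writes share the same `(timestamp, replica_id)` pair. The phone names and note contents are hypothetical.

```python
def merge(a, b):
    """Merge two replicas of a note map.

    For each note id, keep the entry with the larger (timestamp, replica_id)
    pair. The replica id breaks ties so every phone picks the same winner.
    """
    merged = dict(a)
    for note_id, entry in b.items():
        if note_id not in merged or entry[:2] > merged[note_id][:2]:
            merged[note_id] = entry
    return merged

# Three phones edited offline. Entries are (timestamp, replica_id, text).
phone_a = {"n1": (1, "A", "Groceries: milk"), "n2": (3, "A", "Call dentist at 9")}
phone_b = {"n1": (2, "B", "Groceries: milk, eggs"), "n3": (1, "B", "Trip ideas")}
phone_c = {"n2": (3, "C", "Call dentist at 10")}

# Order of merging should not matter (commutative and associative).
assert merge(phone_a, phone_b) == merge(phone_b, phone_a)
assert merge(merge(phone_a, phone_b), phone_c) == merge(phone_a, merge(phone_b, phone_c))
# Receiving the same state twice should change nothing (idempotent).
assert merge(phone_a, phone_a) == phone_a

final = merge(merge(phone_a, phone_b), phone_c)
for note_id in sorted(final):
    print(note_id, final[note_id])
```

These three checks on one example are not a proof, but they are the properties a convergence proof for a state-based CRDT needs to establish in general. Notice the cost of this rule too: the two concurrent edits to `n2` do not combine, and one of them is silently dropped.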

Why This Is a Good Topic

This makes a strong science fair topic because you can test real system limits with measurable results. You can compare search speed, index size, sync correctness, and battery or memory use across different designs. The project connects to privacy, offline access, and resilient software, which are real problems in note apps, medical apps, and field tools. You can learn database indexing, distributed systems ideas, encryption tradeoffs, and how to evaluate performance with data instead of guesses.

Research Questions

  • How does the indexing method affect search latency on an encrypted local note store?
  • What is the effect of note count on full-text search speed and memory use?
  • How much does adding semantic search increase query time compared with keyword search alone?
  • To what extent does offline edit rate affect merge time after devices reconnect?
  • Which CRDT merge strategy loses or overwrites the fewest user edits after concurrent changes?
  • How much does database compression reduce storage size, and does it change search accuracy?
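
The first two questions can be explored with a small timing harness. The sketch below is one possible starting point, not a finished benchmark: it uses an unencrypted in-memory SQLite database as a baseline, synthetic notes from a made-up `make_notes` helper, and a substring scan with `LIKE` compared against SQLite's FTS5 index. It assumes your Python build of SQLite may or may not include FTS5, so it checks for that and skips the FTS5 measurement if the module is missing.

```python
import sqlite3
import statistics
import time

def make_notes(count):
    """Generate synthetic notes. Every 50th note contains the rare word 'zebra'."""
    notes = []
    for i in range(count):
        rare = " zebra" if i % 50 == 0 else ""
        notes.append(f"note {i} about project planning and lab results{rare}")
    return notes

def build_db(notes):
    """Load notes into a plain table and, when available, an FTS5 table."""
    conn = sqlite3.connect(":memory:")
    rows = [(n,) for n in notes]
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO notes (body) VALUES (?)", rows)
    try:
        conn.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(body)")
        conn.executemany("INSERT INTO notes_fts (body) VALUES (?)", rows)
        has_fts = True
    except sqlite3.OperationalError:
        has_fts = False  # this SQLite build does not include FTS5
    conn.commit()
    return conn, has_fts

def median_latency_ms(conn, sql, params, repeats=20):
    """Run one query several times. Return (median latency in ms, row count)."""
    samples = []
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql, params).fetchall()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples), len(rows)

SCAN_SQL = "SELECT id FROM notes WHERE body LIKE ?"
FTS_SQL = "SELECT rowid FROM notes_fts WHERE notes_fts MATCH ?"

for count in (1_000, 10_000):
    conn, has_fts = build_db(make_notes(count))
    scan_ms, scan_hits = median_latency_ms(conn, SCAN_SQL, ("%zebra%",))
    print(f"{count} notes, LIKE scan: {scan_ms:.3f} ms, {scan_hits} hits")
    if has_fts:
        fts_ms, fts_hits = median_latency_ms(conn, FTS_SQL, ("zebra",))
        print(f"{count} notes, FTS5 index: {fts_ms:.3f} ms, {fts_hits} hits")
    conn.close()
```

Because every run after the first hits data that is already in memory, these numbers are warm-cache numbers. A real experiment would also add the encryption layer and repeat the measurements on the target device.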

Basic Materials

  • Laptop or phone for testing local search performance.
  • Free note database sample set or self-made text corpus.
  • Computer with enough storage to hold at least one large note index.
  • External SSD or cloud backup for copies of test data.
  • Spreadsheet software for logging latency, size, and merge results.
  • Network simulator or airplane mode for offline testing.
  • Stopwatch or built-in timing tools for quick checks.
  • Basic scripting environment such as Python for automation.

Advanced Materials

  • Android phone or older smartphone with developer access for real device testing.
  • Desktop or laptop with SSD and enough RAM to build large test indexes.
  • SQLite or another embedded database for local-first storage experiments.
  • Full-text search engine library such as Tantivy, Lucene, or SQLite FTS for comparison.
  • Vector embedding model for semantic search experiments.
  • Encryption library that supports local file or field-level encryption.
  • CRDT library or reference implementation for merge testing.
  • Profiling tools for memory, CPU, and battery analysis.
  • Packet capture or sync logging tools for reconnection tests.
  • Statistical software or Python notebooks for comparing groups and plotting results.

Software & Tools

  • Python: Automates benchmarks, logs results, and runs statistical analysis.
  • SQLite: Stores notes locally and supports full-text search experiments.
  • Jupyter Notebook: Organizes tests, charts, and comparisons in one place.
  • PubMed: Indexes biomedical literature, so expect limited coverage of search systems, encryption, and CRDTs; it is most useful for background reading on medical note-taking use cases.
  • Git: Tracks code changes and helps you compare versions of your prototype.

Experiment Steps

  1. Define the core system claim you want to test, such as faster local search, safer offline sync, or better merge behavior.
  2. Choose one baseline design and one improved design so you can measure real tradeoffs instead of building a single prototype.
  3. Plan your data set size, note structure, and query types so your results reflect realistic use, not a tiny toy example.
  4. Design controls that separate search speed, encryption overhead, and sync overhead, because these costs can hide inside each other.
  5. Build a benchmark plan that measures latency, memory use, index size, and merge success after a network partition with offline edits on each side.
  6. Decide which statistical comparison will answer your research question, then predefine how you will judge success.

Common Pitfalls

  • Mixing encrypted and unencrypted storage paths, which makes performance numbers hard to trust.
  • Testing only tiny note sets, which hides the slowdown that appears at larger scale.
  • Measuring search only with a warm cache, which makes the app look faster than it is on a cold start.
  • Ignoring offline conflict cases, which means you never test the CRDT part of the system.
  • Comparing semantic search and keyword search with different query sets, which makes the latency results unfair.

What Makes This Competitive

A competitive project here needs more than a working demo. You would need a clear benchmark plan, a fair baseline, and evidence that your design stays correct after many offline edits. Strong entries often compare multiple indexing or merge strategies, then use careful statistics to explain the tradeoffs. A formal convergence argument, paired with real performance data on a phone, would make the work much stronger.
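
One way to gather evidence of correctness after many offline edits is a randomized convergence check: deliver the same set of replica states in many shuffled orders, with some delivered twice, and confirm the merged result never changes. The sketch below is a minimal version under stated assumptions. It uses a two-phase set of tags (a pair of add and remove sets, merged by union) as a stand-in for your own CRDT, and the helper names and tag values are hypothetical.

```python
import random

def merge(a, b):
    """Merge two two-phase sets by taking the union of adds and of removes."""
    return (a[0] | b[0], a[1] | b[1])

def visible(state):
    """Items that were added and never removed."""
    return state[0] - state[1]

def random_offline_state(rng, items, edits=5):
    """Simulate one phone making random adds and removes while offline."""
    added, removed = set(), set()
    for _ in range(edits):
        item = rng.choice(items)
        if item in added and rng.random() < 0.3:
            removed.add(item)
        else:
            added.add(item)
    return (frozenset(added), frozenset(removed))

def converges(states, rng, merge_fn=merge, trials=200):
    """Merge the same states in many random orders, with some redelivered.

    Returns True only if every trial ends in the same final state.
    """
    reference = None
    for _ in range(trials):
        order = states + rng.sample(states, k=len(states) // 2)
        rng.shuffle(order)
        result = order[0]
        for state in order[1:]:
            result = merge_fn(result, state)
        if reference is None:
            reference = result
        elif result != reference:
            return False
    return True

rng = random.Random(42)
tags = ["todo", "lab", "urgent", "draft", "done"]
phones = [random_offline_state(rng, tags) for _ in range(4)]
print("converged:", converges(phones, rng))
```

A check like this can only find counterexamples; it cannot prove convergence. For a state-based design, the formal argument is that the merge function is commutative, associative, and idempotent. You can pass a different `merge_fn` to see whether the check catches a rule that lacks those properties.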

Project Variations

  • Test the same local-first search design on text notes versus mixed notes with tags, attachments, and metadata.
  • Compare keyword search, vector search, and a hybrid search pipeline on the same encrypted note database.
  • Study how different CRDT merge rules change correctness and performance when several phones edit the same notebook offline.

Learn More

  • MIT OpenCourseWare, Distributed Systems: Search lecture notes and assignments on replication, consistency, and fault tolerance.
  • SQLite Documentation: Read about FTS, indexing, and embedded database design in the official docs.
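  • PubMed: Search for review articles on searchable encryption and privacy-preserving search in health-data settings; the index is biomedical, so coverage of general computer science topics is limited.
  • NIH PMC: Find open-access papers on data synchronization, indexing, and secure storage, mostly in biomedical and health-informatics contexts.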
  • NASA Open MCT documentation: Explore a real open-source, web-based framework for visualizing telemetry data; it is not a note app, but it offers general ideas about structuring a data-heavy application.
  • ACM Digital Library abstract pages: Search for recent papers on CRDTs, local-first apps, and encrypted search, then read the abstracts and open-access versions where available.