Feature-complete · pure Rust · MIT

The semantic filesystem
for coding agents.

SynaFS fuses a vector index, symbol graph, and version DAG into the filesystem write boundary. Write a file and its semantic index updates atomically — then read those results back through a plain read(), a syscall, or an agent API. No external reindex pipeline. No stale index.

~/myrepo — syna
# index a repo, then search by meaning — not grep
$ syna index .
indexed 312 files · 1,840 chunks · 0.07s

$ syna query "where do we validate the auth token?" --top 3
src/auth/token.rs:42  0.91  fn validate_token(raw: &str) -> Result<Claims>
src/middleware.rs:118  0.74  let claims = validate_token(&header)?;
src/auth/mod.rs:9      0.69  pub use token::validate_token;

# or mount it — every tool (cat/ls/grep) gains semantic search
$ ls /mnt/repo/.syna/query/validate%20token/
token.rs:42→58  middleware.rs:118→121  mod.rs:9→9
The problem

An agent's bottleneck is context acquisition.

Every turn repeats ls → grep → read → embed → rerank, and the external index is always one edit behind. SynaFS pushes search and understanding down to the filesystem write boundary, so the index can never be stale.

01 Zero-integration

Any tool or agent that reads files gains semantic search — through magic directories, symlinks, and xattrs. No SDK to adopt.

02 Always-fresh

The index lives at the write boundary. The moment a file is written, its chunks are re-embedded transactionally — with read-your-writes consistency.

03 Unix-composable

Magic paths, xattrs, and an NDJSON device stream compose with the existing ecosystem instead of replacing it.

What's inside

A hybrid index, fused at the write path.

One engine, four signals — combined with reciprocal-rank fusion and graph reranking.

Hybrid search

Vector (HNSW) + BM25 + trigram, fused with RRF. Tree-sitter semantic chunking for Rust, Python, JS, Go.
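To make the fusion step concrete, here is a minimal sketch of reciprocal-rank fusion over three ranked lists. The function name `rrf_fuse`, its signature, and the constant k = 60 (the value commonly used for RRF) are assumptions for illustration; this is not taken from the SynaFS source.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists with reciprocal-rank fusion (RRF).
/// Each list contributes 1 / (k + rank) per result, with rank starting at 1.
/// Hypothetical helper for illustration, not the SynaFS API.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            let rank = i as f64 + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the order is deterministic.
    fused.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap()
            .then_with(|| a.0.cmp(&b.0))
    });
    fused
}
```

Calling `rrf_fuse(&[vector_hits, bm25_hits, trigram_hits], 60.0)` returns every result seen in any list, ordered by fused score. A result that ranks well in several lists rises above one that tops only a single list, which is why RRF needs no score normalisation across signals.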

Symbol graph

defs / refs / callers / callees / importers resolved across the repo. Rerank results along the call graph.

Version DAG & as-of

Commit snapshots, --as-of HEAD~3 time-travel search, symbol-level diffs, GC that preserves shared chunks.

Transactional writes

blob → WAL → token → async reindex. Consistency tokens give strong / read-your-writes reads. Crash-safe via WAL replay.
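The write path above can be pictured with a toy in-memory model: a write is logged first, the caller gets a token, and a read that presents the token forces the index to catch up before answering. Everything here (`Engine`, `Token`, `catch_up`, `read_at`, `read_stale_ok`) is a hypothetical sketch under that reading of the pipeline, not the syna-engine implementation; a real engine would persist the log and reindex in the background.

```rust
use std::collections::HashMap;

/// Hypothetical consistency token: the WAL sequence number of a write.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Token(pub u64);

pub struct Engine {
    wal: Vec<(u64, String, String)>, // (seq, path, content): stand-in for a durable log
    index: HashMap<String, String>,  // path -> indexed content: stand-in for the real index
    applied: u64,                    // highest WAL seq already reflected in the index
}

impl Engine {
    pub fn new() -> Self {
        Engine { wal: Vec::new(), index: HashMap::new(), applied: 0 }
    }

    /// Write path: append to the log, return a token, leave indexing for later.
    pub fn write(&mut self, path: &str, content: &str) -> Token {
        let seq = self.wal.len() as u64 + 1;
        self.wal.push((seq, path.to_string(), content.to_string()));
        Token(seq)
    }

    /// Apply logged writes to the index up to and including `upto`.
    /// Replaying from `applied` forward is also what crash recovery would do.
    pub fn catch_up(&mut self, upto: u64) {
        while self.applied < upto.min(self.wal.len() as u64) {
            let (seq, path, content) = self.wal[self.applied as usize].clone();
            self.index.insert(path, content);
            self.applied = seq;
        }
    }

    /// Read-your-writes: make sure the index covers the token before answering.
    pub fn read_at(&mut self, path: &str, token: Token) -> Option<&String> {
        self.catch_up(token.0);
        self.index.get(path)
    }

    /// Read without a token: returns whatever the index currently holds.
    pub fn read_stale_ok(&self, path: &str) -> Option<&String> {
        self.index.get(path)
    }
}
```

In this model a reader holding the token from its own write can never observe an index older than that write, while readers without a token pay no waiting cost.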

Degraded mode

If the embedder or HNSW fails, search falls back to BM25/lexical and flags degraded:true — it never goes dark.
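The fallback can be sketched as a small wrapper that tries the vector path and, on error, returns lexical hits with the flag set. The names `SearchResult` and `search_with_fallback` are hypothetical, and the sketch simplifies the healthy path to vector hits only, whereas the real pipeline fuses several signals.

```rust
/// Hypothetical result type mirroring the degraded flag described above.
#[derive(Debug, PartialEq)]
pub struct SearchResult {
    pub hits: Vec<String>,
    pub degraded: bool,
}

/// Try the vector path first; if it fails, answer from the lexical path
/// and mark the result as degraded instead of returning an error.
pub fn search_with_fallback<V, L>(query: &str, vector: V, lexical: L) -> SearchResult
where
    V: Fn(&str) -> Result<Vec<String>, String>,
    L: Fn(&str) -> Vec<String>,
{
    match vector(query) {
        Ok(hits) => SearchResult { hits, degraded: false },
        Err(_) => SearchResult { hits: lexical(query), degraded: true },
    }
}
```

Callers can then branch on `degraded` to decide whether to trust the ranking or to widen the query.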

Pure Rust

No async runtime, no protoc, no libfuse required. Native engines slot behind traits. Heavy bits (TLS, io_uring) are opt-in features.

Triple API surface

Reach the same engine three ways.

The same query semantics — and the same results — whether you read a file, call a syscall, or hit the network.

NATIVE · FUSE

Filesystem

Mount the repo passthrough with a virtual /.syna/** namespace. cat, ls, grep just work.

$ syna mount /mnt/repo
$ cat /mnt/repo/.syna/\
      symbol/validate_token/callers
SYSCALL · libsyna

Device & C ABI

/dev/synafs (CUSE) with a write-query / read-NDJSON model, plus libsyna.so for any FFI caller. io_uring batch.

echo '{"text":"parse config"}' \
  > /dev/synafs
cat /dev/synafs   # NDJSON hits
WEB · MCP / gRPC / REST

Network

REST + WebSocket, gRPC-Web and native HTTP/2 (hand-rolled HPACK), MCP for agents. TLS/mTLS opt-in.

$ syna serve --addr :5200   # REST
$ syna grpc  --addr :5201   # HTTP/2
$ syna-mcp .                # agents
Performance

Fast to index. Faster to search.

Measured on real codebases with a reproducible harness (bench/run_bench.py) — not a synthetic corpus. Full method in docs/benchmarks.md.

Search latency

p50 / p99 · in-process · lower is better

p50 0.21 ms · p99 0.24 ms · in-process · 5,000-chunk engine

Indexing throughput

files / second on real repos · hash embedder

nidavellir 840/s · SynaFS 476/s · rogers 320/s · 552 / 45 / 467 files · 0.5–43 MB

Semantic vs lexical

recall@5 & MRR · CodeRankEmbed vs hash

recall@5: 0.56 semantic vs 0.39 lexical · MRR: 0.43 semantic vs 0.22 lexical · 18-query gold set

Embed-cache hit rate

on a 1-line edit

80% cached · 20% re-embedded · touched chunks only

Incremental edit cost

a 1-line edit re-embeds only neighbour chunks — p50 ~20 ms

80% served from embed-cache · 20% re-embedded · p50 ~20 ms for a single-line edit · whole-file reindex avoided

Incremental invariant

A 1-line edit re-embeds only the touched chunks (≈neighbour count), with an 80% embed-cache hit rate on edits — not the whole file.
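One way to get this behaviour is a cache keyed by chunk content, so an unchanged chunk never reaches the embedder. The sketch below assumes that design; `EmbedCache`, `embed_all`, and `chunk_id` are hypothetical names, and the standard-library hasher stands in for the BLAKE3 IDs listed under syna-core.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a content-addressed chunk ID (assumption: the real ID is a BLAKE3 hash).
fn chunk_id(text: &str) -> u64 {
    let mut h = DefaultHasher::new();
    text.hash(&mut h);
    h.finish()
}

/// Hypothetical embed cache keyed by chunk content.
pub struct EmbedCache {
    vectors: HashMap<u64, Vec<f32>>,
    pub hits: usize,
    pub misses: usize,
}

impl EmbedCache {
    pub fn new() -> Self {
        EmbedCache { vectors: HashMap::new(), hits: 0, misses: 0 }
    }

    /// Return one vector per chunk, calling `embed` only for content not seen before.
    pub fn embed_all<F>(&mut self, chunks: &[&str], embed: F) -> Vec<Vec<f32>>
    where
        F: Fn(&str) -> Vec<f32>,
    {
        let mut out = Vec::with_capacity(chunks.len());
        for &chunk in chunks {
            let id = chunk_id(chunk);
            let cached = self.vectors.get(&id).cloned();
            let vector = match cached {
                Some(v) => {
                    self.hits += 1;
                    v
                }
                None => {
                    self.misses += 1;
                    let v = embed(chunk);
                    self.vectors.insert(id, v.clone());
                    v
                }
            };
            out.push(vector);
        }
        out
    }
}
```

Re-running `embed_all` on a file after editing one of its chunks embeds only that chunk; the `hits` and `misses` counters give the cache hit rate for the edit.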

Deterministic

Identical input → identical ChunkID and vector. The benchmark harness reports throughput, p50/p99 latency, cache hit-rate, and recall@k / MRR.

Quality is an 18-query NL→code gold set over the SynaFS source, graded at file level, run through the same hybrid pipeline with only the embedder swapped — lexical hash vs the local CodeRankEmbed model (137 M params, 768-dim, pure-Rust candle, CPU). Throughput is real source trees (code files only); search/reindex latency is the in-process engine. The vector engine is still M0 brute-force, so these do not reflect 100k–1M-chunk behaviour. Reproduce: cargo build --release --features coderank && python3 bench/run_bench.py. Numbers vary by corpus, embedder, and hardware.
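For readers unfamiliar with the two quality metrics, the sketch below computes them for file-level grading. It assumes one gold file per query, which the text above does not state; `Graded`, `recall_at_k`, and `mrr` are hypothetical names and are not part of bench/run_bench.py.

```rust
/// One graded query: the ranked file paths a search returned, plus the gold file.
/// Assumption for this sketch: exactly one gold file per query.
pub struct Graded<'a> {
    pub ranked: Vec<&'a str>,
    pub gold: &'a str,
}

/// Fraction of queries whose gold file appears in the top `k` results.
pub fn recall_at_k(queries: &[Graded], k: usize) -> f64 {
    if queries.is_empty() {
        return 0.0;
    }
    let hits = queries
        .iter()
        .filter(|q| q.ranked.iter().take(k).any(|f| *f == q.gold))
        .count();
    hits as f64 / queries.len() as f64
}

/// Mean reciprocal rank: average of 1 / rank of the gold file, 0 when it is absent.
pub fn mrr(queries: &[Graded]) -> f64 {
    if queries.is_empty() {
        return 0.0;
    }
    let sum: f64 = queries
        .iter()
        .map(|q| {
            q.ranked
                .iter()
                .position(|f| *f == q.gold)
                .map_or(0.0, |i| 1.0 / (i as f64 + 1.0))
        })
        .sum();
    sum / queries.len() as f64
}
```

recall@5 only asks whether the right file made the top five, while MRR also rewards placing it higher, so the two numbers can move independently when the embedder is swapped.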

Quickstart

Build it, index a repo, ask a question.

Pure-Rust workspace — cargo build needs no C/C++ toolchain for the core.

# clone & build (84 tests pass: core + write-path + symbols + as-of + FUSE + web/gRPC + CUSE)
git clone https://github.com/021flow/synafs && cd synafs
cargo build --release

# index and search by meaning
./target/release/syna index .
./target/release/syna query "reciprocal rank fusion" --top 3 --lang rust

# write path → consistency token → instant read-your-writes
./target/release/syna edit src/auth.rs --content "$(cat new.rs)"

# surfaces: mount, serve, native gRPC, agent MCP
./target/release/syna mount /mnt/repo
SYNA_TOKEN=secret ./target/release/syna serve --addr 127.0.0.1:5200
./target/release/syna grpc --addr 127.0.0.1:5201
./target/release/syna-mcp .

# real semantic embeddings (pure-Rust candle) and benchmarks
cargo build --release --features coderank
cargo run --release -- bench
Architecture

One pipeline, twelve crates.

Core → store → {chunk, embed} → index → query → engine → surfaces. Native engines slot in behind traits.

syna-core: BLAKE3 IDs · data model · symbol-graph types · traits
syna-store: content-addressed blob store + index snapshot
syna-chunk: tree-sitter + symbols
syna-embed: cache + CodeRankEmbed
syna-index: vector + BM25 + trigram · RRF fusion · symbol graph
syna-query: parser · hybrid executor · graph rerank · as-of filter
syna-engine: write path · consistency · WAL · recovery · version DAG
syna-fuse (NATIVE): FUSE mount · xattr · /.syna magic paths
syna-sys (SYSCALL): CUSE /dev/synafs · libsyna C ABI · io_uring
syna-web (WEB): REST · WebSocket · gRPC-Web · TLS/mTLS
syna-grpc (WEB): native HTTP/2 · hand-rolled HPACK
syna-mcp (AGENTS): MCP server · stdio JSON-RPC