SynaFS fuses a vector index, symbol graph, and version DAG into the filesystem write boundary. Write a file and its semantic index updates atomically — then read those results back through a plain read(), a syscall, or an agent API. No external reindex pipeline. No stale index.
# index a repo, then search by meaning — not grep $ syna index . indexed 312 files · 1,840 chunks · 0.07s $ syna query "where do we validate the auth token?" --top 3 src/auth/token.rs:42 0.91 fn validate_token(raw: &str) -> Result<Claims> src/middleware.rs:118 0.74 let claims = validate_token(&header)?; src/auth/mod.rs:9 0.69 pub use token::validate_token; # or mount it — every tool (cat/ls/grep) gains semantic search $ ls /mnt/repo/.syna/query/validate%20token/ token.rs:42→58 middleware.rs:118→121 mod.rs:9→9
Every turn repeats ls → grep → read → embed → rerank, and the external index is always one edit behind. SynaFS pushes search and understanding down to the filesystem write boundary, so the index can never be stale.
Any tool or agent that reads files gains semantic search — through magic directories, symlinks, and xattrs. No SDK to adopt.
The index lives at the write boundary. The moment a file is written, its chunks are re-embedded transactionally — with read-your-writes consistency.
Magic paths, xattrs, and an NDJSON device stream compose with the existing ecosystem instead of replacing it.
One engine, four signals — combined with reciprocal-rank fusion and graph reranking.
Vector (HNSW) + BM25 + trigram, fused with RRF. Tree-sitter semantic chunking for Rust, Python, JS, Go.
defs / refs / callers / callees / importers resolved across the repo. Rerank results along the call graph.
Commit snapshots, --as-of HEAD~3 time-travel search, symbol-level diffs, GC that preserves shared chunks.
blob → WAL → token → async reindex. Consistency tokens give strong / read-your-writes reads. Crash-safe via WAL replay.
If the embedder or HNSW fails, search falls back to BM25/lexical and flags degraded:true — it never goes dark.
No async runtime, no protoc, no libfuse required. Native engines slot behind traits. Heavy bits (TLS, io_uring) are opt-in features.
The same query semantics — and the same results — whether you read a file, call a syscall, or hit the network.
Mount the repo passthrough with a virtual /.syna/** namespace. cat, ls, grep just work.
$ syna mount /mnt/repo
$ cat /mnt/repo/.syna/\
symbol/validate_token/callers/dev/synafs (CUSE) with a write-query / read-NDJSON model, plus libsyna.so for any FFI caller. io_uring batch.
echo '{"text":"parse config"}' \
> /dev/synafs
cat /dev/synafs # NDJSON hitsREST + WebSocket, gRPC-Web and native HTTP/2 (hand-rolled HPACK), MCP for agents. TLS/mTLS opt-in.
$ syna serve --addr :5200 # REST $ syna grpc --addr :5201 # HTTP/2 $ syna-mcp . # agents
Measured on real codebases with a reproducible harness (bench/run_bench.py) — not a synthetic corpus. Full method in docs/benchmarks.md.
p50 / p99 · in-process · lower is better
files / second on real repos · hash embedder
recall@5 & MRR · CodeRankEmbed vs hash
on a 1-line edit
a 1-line edit re-embeds only neighbour chunks — p50 ~20 ms
A 1-line edit re-embeds only the touched chunks (≈neighbour count), with an 80% embed-cache hit rate on edits — not the whole file.
Identical input → identical ChunkID and vector. The benchmark harness reports throughput, p50/p99 latency, cache hit-rate, and recall@k / MRR.
Quality is an 18-query NL→code gold set over the SynaFS source, graded at file level, run through the same hybrid pipeline with only the embedder swapped — lexical hash vs the local CodeRankEmbed model (137 M params, 768-dim, pure-Rust candle, CPU). Throughput is real source trees (code files only); search/reindex latency is the in-process engine. The vector engine is still M0 brute-force, so these do not reflect 100k–1M-chunk behaviour. Reproduce: cargo build --release --features coderank && python3 bench/run_bench.py. Numbers vary by corpus, embedder, and hardware.
Pure-Rust workspace — cargo build needs no C/C++ toolchain for the core.
# clone & build (84 tests pass: core + write-path + symbols + as-of + FUSE + web/gRPC + CUSE) git clone https://github.com/021flow/synafs && cd synafs cargo build --release # index and search by meaning ./target/release/syna index . ./target/release/syna query "reciprocal rank fusion" --top 3 --lang rust # write path → consistency token → instant read-your-writes ./target/release/syna edit src/auth.rs --content "$(cat new.rs)" # surfaces: mount, serve, native gRPC, agent MCP ./target/release/syna mount /mnt/repo SYNA_TOKEN=secret ./target/release/syna serve --addr 127.0.0.1:5200 ./target/release/syna grpc --addr 127.0.0.1:5201 ./target/release/syna-mcp . # real semantic embeddings (pure-Rust candle) and benchmarks cargo build --release --features coderank cargo run --release -- bench
Core → store → {chunk, embed} → index → query → engine → surfaces. Native engines slot in behind traits.