SynaFS is one point in a crowded design space — lexical search, external dense-RAG indexers, code-intelligence formats, and semantic filesystems. This page positions it two ways: a capability matrix against those approaches, and a measured, apples-to-apples retrieval comparison on the same corpus, queries, and scoring. See the Agents page for end-to-end token / tool-call results and Related research for sources.
The matrix shows where each approach keeps its index and what it can answer. The dividing line is freshness: among the approaches that offer semantic retrieval, SynaFS is the only one that updates the index inside the write, so it is never stale and needs no separate pipeline.
| Approach | Always-fresh (index-at-write) | Zero-integration (read()/FUSE) | Hybrid signals (vector + BM25) | Symbol graph | Version / as-of | Agent API (MCP) |
|---|---|---|---|---|---|---|
| grep / ripgrep (lexical tool-loop) | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| External dense-RAG indexer (Cursor / Continue / Cody-style) | ◑ | ✗ | ✓ | ◑ | ✗ | ◑ |
| SCIP / LSIF (code-intelligence format) | ◑ | ✗ | ✗ | ✓ | ◑ | ✗ |
| LSFS (2024) (LLM semantic filesystem) | ◑ | ◑ | ◑ | ✗ | ✗ | ◑ |
| Semantic File System (1991) (Gifford et al.) | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| SynaFS (this project) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
✓ = native · ◑ = partial / bolt-on · ✗ = no. External RAG indexers can do hybrid retrieval, but their index lives outside the filesystem and re-syncs after the edit; SCIP is incremental yet still a separate build step; LSFS adds LLM file ops above the FS but re-embeds out of band.
[Chart: how many of the six capabilities each approach covers, native vs partial]
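The coverage counts behind that chart can be recomputed from the matrix itself. The sketch below is illustrative only: the marks are transcribed by hand from the table above, and the names `MATRIX`, `coverage`, and `COVERAGE` are made up for this example rather than taken from the SynaFS code.

```python
# Capability matrix transcribed from the table above (one mark per column).
# Column order: always-fresh, zero-integration, hybrid signals,
# symbol graph, version / as-of, agent API (MCP).
MATRIX = {
    "grep / ripgrep": "✓✓✗✗✗✗",
    "External dense-RAG indexer": "◑✗✓◑✗◑",
    "SCIP / LSIF": "◑✗✗✓◑✗",
    "LSFS (2024)": "◑◑◑✗✗◑",
    "Semantic File System (1991)": "✓✓✗✗✗✗",
    "SynaFS": "✓✓✓✓✓✓",
}


def coverage(marks: str) -> tuple[int, int]:
    """Return (native, partial) capability counts for one row of marks."""
    return marks.count("✓"), marks.count("◑")


# Hypothetical helper structure: approach name -> (native, partial).
COVERAGE = {name: coverage(marks) for name, marks in MATRIX.items()}

if __name__ == "__main__":
    for name, (native, partial) in COVERAGE.items():
        print(f"{name:30s} native={native} partial={partial}")
```

Counting native and partial marks separately keeps the distinction the legend draws: a bolt-on capability is not the same as one the approach provides natively.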
We pulled the 1,603 indexed chunks straight from SynaFS's index and re-ranked the 23 gold queries with three pure dense retrievers, scoring them exactly as the benchmark does. This isolates the one variable that matters: the embedding model. Two jumps stand out. First, any dense model leaps over keyword search at rank 1 (recall@1 rises from 0.04 to about 0.39, because grep rarely puts the exact file first on paraphrased queries). Second, a code-specialised embedder beats the generic ones on every metric. SynaFS's hybrid fusion adds more still (recall@1 of 0.57 against 0.43 for the best dense model) and reaches the answer while reading the least code.
Charts in this section:
- recall@1: share of the 23 queries with the gold file ranked first (higher is better)
- recall@5: gold file within the top 5 results (higher is better)
- MRR: 1/rank of the gold file, averaged (higher is better)
- Median KB of whole files ingested before the gold file (lower is better)
- files→gold: median distinct files before the gold file (lower is better)
- recall@1 split by easy / medium / hard (paraphrased) queries, for grep vs code-dense vs SynaFS
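The rank-based metrics above follow standard definitions, so they can be sketched in a few lines. This is a minimal illustration, not the benchmark's scoring code: the function names and the toy `runs` data are hypothetical, and it assumes a query that never retrieves the gold file contributes 0 to both recall and MRR.

```python
from statistics import mean


def gold_rank(ranked_files, gold):
    """1-based rank of the gold file, or None if it was not retrieved."""
    for rank, path in enumerate(ranked_files, start=1):
        if path == gold:
            return rank
    return None


def recall_at_k(ranks, k):
    """Share of queries whose gold file appears within the top k results."""
    return mean(1.0 if r is not None and r <= k else 0.0 for r in ranks)


def mrr(ranks):
    """Mean of 1/rank; a query that misses the gold file contributes 0."""
    return mean(1.0 / r if r is not None else 0.0 for r in ranks)


# Toy data (hypothetical, not the benchmark's queries): (ranking, gold file).
runs = [
    (["a.py", "b.py", "c.py"], "a.py"),          # gold at rank 1
    (["b.py", "a.py", "c.py"], "a.py"),          # gold at rank 2
    (["b.py", "c.py"], "a.py"),                  # gold never retrieved
    (["c.py", "b.py", "d.py", "a.py"], "a.py"),  # gold at rank 4
]
ranks = [gold_rank(files, gold) for files, gold in runs]

if __name__ == "__main__":
    print("recall@1:", recall_at_k(ranks, 1))
    print("recall@5:", recall_at_k(ranks, 5))
    print("MRR:", mrr(ranks))
```

On the toy data one query in four has the gold file first and three in four have it in the top five, which is the same reading as the recall@1 and recall@5 columns in the results table.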
| Retriever | recall@1 | recall@5 | MRR | files→gold |
|---|---|---|---|---|
| grep (lexical tool-loop) | 0.04 | 0.52 | 0.28 | 5 |
| syna-lex (BM25 / hash index) | 0.04 | 0.17 | 0.08 | 5 |
| Dense · MiniLM (all-MiniLM-L6-v2 · 384d · generic) | 0.39 | 0.61 | 0.48 | 3 |
| Dense · BGE-base (bge-base-en-v1.5 · 768d · generic) | 0.39 | 0.57 | 0.49 | 2 |
| Dense · CodeRankEmbed (768d · code-specialised) | 0.43 | 0.70 | 0.56 | 1.5 |
| SynaFS-sem (CodeRankEmbed + BM25 + trigram · RRF) | 0.57 | 0.83 | 0.67 | 1 |
File-level relevance, 23 NL→code gold queries, 259-file corpus. Dense rows are pure cosine over identical chunks (MiniLM 384d, BGE-base 768d, CodeRankEmbed 768d). grep / syna-lex / SynaFS-sem are from the Performance benchmark; SynaFS-sem = CodeRankEmbed + BM25 + trigram fused with RRF. This is an embedder-isolation study, not a reproduction of any product's full pipeline.
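For readers unfamiliar with RRF (reciprocal rank fusion), the sketch below shows the general technique of merging several ranked lists by summing 1 / (k + rank). It is not SynaFS's implementation: the constant k = 60 is the value commonly used in the literature and is an assumption here, and the function name and toy rankings are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank).

    k=60 is a conventional default, assumed here; SynaFS may use another value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy rankings (hypothetical) standing in for the three signals.
dense = ["auth.py", "session.py", "db.py"]
bm25 = ["session.py", "auth.py", "cache.py"]
trigram = ["session.py", "cache.py", "auth.py"]

fused = rrf_fuse([dense, bm25, trigram])

if __name__ == "__main__":
    print(fused)
```

RRF uses only ranks, never raw scores, so cosine similarities and BM25 scores on different scales can be combined without normalisation. A file that two signals place first can overtake one that a single signal places first.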
The measured table is an embedder-isolation study: every retriever sees the exact same chunks, queries, and scoring, so the only thing that varies is the embedding model. Here is the procedure end to end, and the commands to run it yourself.
The chunks come straight from SynaFS's index (.syna/index.coderank.json); each carries its source text and file path, so no retriever gets a different chunking.

```sh
# 1 · isolated Python env (no system pip/venv required)
curl -LsSf https://astral.sh/uv/install.sh | sh
cd experiments
uv venv .venv
uv pip install --python .venv torch --index-url https://download.pytorch.org/whl/cpu
uv pip install --python .venv numpy sentence-transformers einops

# 2 · embedder-isolation benchmark (same chunks, queries, scoring)
.venv/bin/python harness/exp7_compare.py

# 3 · results: dense baselines vs grep / syna-lex / SynaFS-sem
cat results/exp7_compare.json
```
Pure-CPU run; CodeRankEmbed embeds ~1,600 chunks in roughly ten minutes (no GPU). The harness reads the corpus path from a constant at the top of exp7_compare.py — point it at any repo you've indexed with SynaFS. Full code: experiments/harness/exp7_compare.py.
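As a rough picture of what a pure-cosine dense baseline does with chunk embeddings, here is a dependency-free sketch. It is not taken from exp7_compare.py. It assumes each file is scored by its best-matching chunk, which is one common way to turn chunk scores into file-level relevance, and the names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_files(query_vec, chunks):
    """Rank files by their best chunk.

    chunks: iterable of (file_path, embedding) pairs.
    Assumption: a file's score is the maximum cosine over its chunks.
    """
    best = {}
    for path, vec in chunks:
        score = cosine(query_vec, vec)
        if score > best.get(path, -2.0):  # cosine is never below -1
            best[path] = score
    return sorted(best, key=lambda p: (-best[p], p))


# Toy 2-d embeddings (hypothetical); real models emit 384 or 768 dimensions.
chunks = [
    ("auth.py", [1.0, 0.0]),
    ("auth.py", [0.0, 1.0]),
    ("db.py", [0.6, 0.8]),
    ("cache.py", [-1.0, 0.0]),
]
ranking = rank_files([0.0, 2.0], chunks)

if __name__ == "__main__":
    print(ranking)
```

Because every retriever in the study sees the same chunks, swapping the embedding model changes only the vectors fed to a function like `rank_files`. The ranking logic stays fixed.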
Better retrieval only matters if an agent converts it into fewer tokens and tool-calls. It does — for the right model and task shape (and it can backfire for others). The full live A/B across Haiku, Sonnet, Opus 4.8, and Codex is on the Agents page.