If you’ve been running OpenClaw for a while, you’ve noticed: the more your agent remembers, the worse it gets at finding what it already knows.
Your agent writes daily logs, saves preferences to MEMORY.md, and accumulates weeks of context.
But ask it to recall a specific decision from three weeks ago and it either misses it or pulls up something tangentially related.
The problem isn’t that the memory is gone. It’s that the default search can’t find it.
I wrote about setting up OpenClaw for daily intelligence briefings a few weeks ago.
Since then, I’ve been digging into the memory side and landed on QMD.
TL;DR: OpenClaw’s default SQLite memory search struggles as your agent accumulates context. QMD replaces it with a local hybrid search engine that runs entirely on your machine. Install with `bun install -g https://github.com/tobi/qmd`, set `memory.backend = "qmd"` in your config, and restart OpenClaw.
What is QMD?
QMD (Query Markup Documents) is a local search engine for Markdown files created by Tobi Lütke (of Shopify fame).
It combines three search approaches:
- BM25 full-text search — fast keyword matching. Great for exact terms, error messages, code symbols, and IDs.
- Vector semantic search — finds conceptually similar content even when the wording differs. Uses local GGUF embedding models.
- Hybrid search with LLM re-ranking — runs both in parallel, merges results using Reciprocal Rank Fusion, then re-ranks with a local language model.
Everything runs locally, no API keys, no cloud dependencies.
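The merge step is worth a closer look. Reciprocal Rank Fusion is simple: each document earns 1/(k + rank) from every ranked list it appears in, so anything ranked highly by either BM25 or the vector search floats to the top. Here is a minimal sketch of the idea, assuming the standard formulation with the usual smoothing constant k = 60 (the constant from the original RRF paper, not a value confirmed for QMD; document IDs are made up):

```python
def rrf_merge(bm25_ranking, vector_ranking, k=60):
    """Merge two ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores 1 / (k + rank) per list it appears in; documents
    ranked highly by either list float to the top of the merged result.
    k=60 is the constant from the original RRF paper, not QMD's actual value.
    """
    scores = {}
    for ranking in (bm25_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

merged = rrf_merge(
    ["gateway-note", "dns-note", "backup-note"],   # BM25 order
    ["infra-note", "gateway-note", "dns-note"],    # vector order
)
# "gateway-note" wins: it appears near the top of both lists
```

Because only ranks matter, RRF sidesteps the problem of BM25 and cosine-similarity scores living on incompatible scales.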
Three small GGUF models auto-download on first run:
| Model | Purpose | Size |
|---|---|---|
| embedding-gemma-300M | Vector embeddings | ~300MB |
| qwen3-reranker-0.6b | Result re-ranking | ~640MB |
| qmd-query-expansion-1.7B | Query expansion | ~1.1GB |
Here’s why this matters.
Say your agent saved this note three weeks ago:
Decided to run the gateway on the Mac Mini in the closet. Port 18789, Cloudflare tunnel for external access.
Search for “gateway server setup” with the default SQLite backend and it misses this — the note doesn’t contain “server” or “setup.”
QMD finds it.
The vector search catches the conceptual match, BM25 hits on “gateway,” and query expansion fills in related phrasings like “infrastructure configuration” before the re-ranker sorts the results.
The trade-off is speed.
Hybrid searches take a few seconds instead of being instant.
For me, accurate recall is worth more than a couple seconds of latency.
Comparing OpenClaw’s memory options
OpenClaw’s memory system supports three search backends.
The default SQLite with vector search works out of the box.
It handles paraphrases well but misses exact tokens like IDs, error strings, and code symbols.
SQLite with hybrid search adds BM25 keyword matching alongside the vectors, plus optional MMR deduplication and temporal decay; no extra install needed.
QMD goes further with query expansion and LLM re-ranking.
| | SQLite (Vector) | SQLite (Hybrid) | QMD |
|---|---|---|---|
| Setup | None (built-in) | Config change | Install binary + config |
| Search types | Semantic only | Semantic + BM25 | Semantic + BM25 + LLM re-ranking |
| Query expansion | No | No | Yes |
| LLM re-ranking | No | No | Yes |
| Result diversity (MMR) | No | Yes (optional) | Built into ranking |
| Temporal decay | No | Yes (optional) | No (use dated file organization) |
| Embedding provider | Local, OpenAI, Gemini, Voyage | Local, OpenAI, Gemini, Voyage | Local only (GGUF) |
| API keys needed | Optional (local model works) | Optional (local model works) | None |
| Disk overhead | ~600MB (local model) | ~600MB (local model) | ~2GB (3 GGUF models) |
| Privacy | Full (with local embeddings) | Full (with local embeddings) | Full (always local) |
| Speed | Fast | Fast | Fast (BM25) to moderate (hybrid) |
| External dir indexing | Via extraPaths | Via extraPaths | Via paths[] with patterns |
| Fallback on failure | N/A (built-in) | Falls back to vector-only | Falls back to SQLite |
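The MMR (Maximal Marginal Relevance) option in the hybrid backend trades a little relevance for diversity, so you don’t get five near-identical snippets of the same note. A minimal sketch of the greedy algorithm, with made-up scores and an illustrative trade-off parameter (this is the textbook technique, not OpenClaw’s actual implementation):

```python
def mmr(candidates, relevance, similarity, lam=0.7, k=3):
    """Greedy Maximal Marginal Relevance selection.

    candidates: list of doc ids; relevance: doc -> query relevance score;
    similarity: (doc, doc) -> pairwise similarity. Each step picks the doc
    that balances relevance against similarity to already-selected docs.
    lam=0.7 is an illustrative trade-off, not a default from OpenClaw.
    """
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(d):
            # Penalize docs that look like something we already picked
            penalty = max((similarity(d, s) for s in selected), default=0.0)
            return lam * relevance[d] - (1 - lam) * penalty
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
sims = {frozenset(("a", "b")): 0.95,   # a and b are near-duplicates
        frozenset(("a", "c")): 0.1,
        frozenset(("b", "c")): 0.1}
sim = lambda x, y: sims.get(frozenset((x, y)), 0.0)

picks = mmr(["a", "b", "c"], relevance, sim, lam=0.7, k=2)
# → ["a", "c"]: the near-duplicate "b" is skipped despite its high relevance
```

Here the second-most-relevant document loses to a less relevant one because it duplicates the top hit, which is exactly the behavior you want in a memory search.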
My recommendation: If you just set up OpenClaw and have a few daily logs, the default is fine.
If you’ve been running for weeks and noticing gaps in recall, switch to QMD.
If you want a middle ground without installing anything extra, enable hybrid search on SQLite first:
memorySearch: {
query: {
hybrid: {
enabled: true,
vectorWeight: 0.7,
textWeight: 0.3,
candidateMultiplier: 4
}
}
}
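One plausible reading of those `vectorWeight`/`textWeight` settings is a weighted blend of per-retriever scores after normalization. The sketch below shows that idea under stated assumptions (min-max normalization, hypothetical scores; not OpenClaw’s actual fusion code):

```python
def blend(vector_hits, text_hits, vector_weight=0.7, text_weight=0.3):
    """Blend normalized scores from two retrievers.

    vector_hits / text_hits: dicts of doc_id -> raw score. Scores are
    min-max normalized per retriever so the weights compare like with like.
    A sketch of weighted hybrid fusion, not OpenClaw's implementation.
    """
    def normalize(hits):
        if not hits:
            return {}
        lo, hi = min(hits.values()), max(hits.values())
        span = (hi - lo) or 1.0
        return {d: (s - lo) / span for d, s in hits.items()}

    v, t = normalize(vector_hits), normalize(text_hits)
    docs = set(v) | set(t)
    key = lambda d: vector_weight * v.get(d, 0.0) + text_weight * t.get(d, 0.0)
    return sorted(docs, key=key, reverse=True)

order = blend({"a": 0.9, "b": 0.5}, {"b": 3.0, "c": 1.0})
# → ["a", "b", "c"]: "a" wins on vector score alone at 0.7/0.3 weights
```

The 0.7/0.3 split in the config above favors semantic matches; nudge `textWeight` up if you mostly search for exact tokens like IDs and error strings.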
Prerequisites
- OpenClaw running — follow my OpenClaw setup guide if you haven’t
- Node.js 22+ or Bun 1.0+
- ~2GB disk space for GGUF models (auto-downloaded on first run)
- macOS or Linux (Windows via WSL2)
- On macOS: `brew install sqlite` for SQLite extension support
Setting up QMD as your OpenClaw memory backend
Install QMD
bun install -g https://github.com/tobi/qmd
Or with npm:
npm install -g @tobilu/qmd
If you installed via Bun, add $HOME/.bun/bin to your PATH:
export PATH="$HOME/.bun/bin:$PATH"
Verify:
qmd --help
Enable the QMD backend
memory: {
backend: "qmd",
citations: "auto"
}
`citations: "auto"` is optional but adds source paths and line numbers to results.
What happens on boot
When OpenClaw starts with QMD enabled:
- Creates a QMD environment at `~/.openclaw/agents/<agentId>/qmd/`
- Indexes your workspace memory files (`MEMORY.md` and `memory/**/*.md`) plus any configured external paths
- Runs `qmd update` (text indexing) and `qmd embed` (vector embeddings, slower on first run while the models download)
- Re-indexes every 5 minutes in the background
The refresh runs asynchronously, so your agent is available for chat right away.
Verify it works
Restart OpenClaw, then ask your agent about something in its memory.
If it returns relevant results with source citations, QMD is working.
You can also verify directly:
STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
export XDG_CONFIG_HOME="$STATE_DIR/agents/main/qmd/xdg-config"
export XDG_CACHE_HOME="$STATE_DIR/agents/main/qmd/xdg-cache"
qmd status
qmd search "test query" -c memory-root
Advanced configuration
Search modes
memory: {
backend: "qmd",
qmd: {
searchMode: "search" // "search", "vsearch", or "query"
}
}
- `search` (default) — BM25 keyword search. Fast, usually instant.
- `vsearch` — semantic vector search. Slower, but finds conceptually similar results.
- `query` — full hybrid pipeline with LLM re-ranking. Highest quality, slowest.
Start with the default and switch to `query` once you’re comfortable with slightly longer response times.
Indexing external Markdown directories
You can point QMD at any Markdown directory — Obsidian vaults, project docs, meeting notes — and search across all of them in a single query.
memory: {
backend: "qmd",
qmd: {
includeDefaultMemory: true,
paths: [
{ name: "notes", path: "~/notes", pattern: "**/*.md" },
{ name: "obsidian", path: "~/Documents/Obsidian", pattern: "**/*.md" },
{ name: "work-docs", path: "~/work/docs", pattern: "**/*.md" }
]
}
}
Each path gets its own named collection. `includeDefaultMemory: true` keeps your agent’s own memory files indexed alongside the external directories.
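If you’re unsure what a `pattern` like `**/*.md` will pick up: `**` matches zero or more directory levels, so top-level files count too. A quick illustration using Python’s `pathlib` glob semantics (QMD’s matcher may differ in edge cases; the file names here are throwaway examples):

```python
import tempfile
from pathlib import Path

# Build a tiny throwaway directory tree to glob against
root = Path(tempfile.mkdtemp())
(root / "projects").mkdir()
(root / "ideas.md").write_text("# Ideas\n")
(root / "projects" / "gateway.md").write_text("# Gateway\n")
(root / "projects" / "diagram.png").write_bytes(b"")  # not matched

# "**" matches zero or more directory levels, so top-level files count too
matches = sorted(str(p.relative_to(root)) for p in root.glob("**/*.md"))
print(matches)  # ['ideas.md', 'projects/gateway.md']
```

Non-Markdown files like the `.png` are skipped, which is what you want: QMD is a Markdown search engine, and binary files would only pollute the index.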
Tuning result limits and intervals
memory: {
backend: "qmd",
qmd: {
update: {
interval: "5m",
debounceMs: 15000,
onBoot: true,
waitForBootSync: false
},
limits: {
maxResults: 6,
maxSnippetChars: 700,
timeoutMs: 4000
}
}
}
The defaults are sensible; only adjust them if you’re seeing too many or too few results, or hitting timeouts.
Pre-warming the index
The first run downloads models and builds embeddings from scratch.
To avoid a slow first interaction, pre-warm manually:
STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
export XDG_CONFIG_HOME="$STATE_DIR/agents/main/qmd/xdg-config"
export XDG_CACHE_HOME="$STATE_DIR/agents/main/qmd/xdg-cache"
qmd update && qmd embed
qmd query "test" -c memory-root --json >/dev/null 2>&1
Enable the automatic memory flush
Easy to miss, but it matters.
OpenClaw has a “pre-compaction ping” — when a session approaches context compaction, it silently prompts the model to write important context to disk before the window resets.
Without it, your agent loses decisions, preferences, and facts that were discussed but never explicitly saved.
Better search is only useful if the memories make it into the files.
agents: {
defaults: {
compaction: {
memoryFlush: {
enabled: true,
softThresholdTokens: 4000
}
}
}
}
The agent responds with NO_REPLY so you never see the interaction, but the memories it saves show up in QMD searches later.
Wrapping up
The whole thing takes about five minutes: install QMD, change one config line, restart.
If you haven’t set up OpenClaw yet, start with my guide on automating daily intelligence briefings first.
Once you’re running, come back here and upgrade the memory.