
How to Fix OpenClaw's Memory Search with QMD

Last updated on Mar 3, 2026

If you’ve been running OpenClaw for a while, you’ve probably noticed a pattern: the more your agent remembers, the worse it gets at finding what it already knows.

Your agent writes daily logs, saves preferences to MEMORY.md, and accumulates weeks of context.

But ask it to recall a specific decision from three weeks ago and it either misses it or pulls up something tangentially related.

The problem isn’t that the memory is gone. It’s that the default search can’t find it.

I wrote about setting up OpenClaw for daily intelligence briefings a few weeks ago.

Since then, I’ve been digging into the memory side and landed on QMD.

TL;DR: OpenClaw’s default SQLite memory search struggles as your agent accumulates context. QMD replaces it with a local hybrid search engine that runs entirely on your machine. Install with bun install -g https://github.com/tobi/qmd, set memory.backend = "qmd" in your config, and restart OpenClaw.


What is QMD?

QMD (Query Markup Documents) is a local search engine for Markdown files created by Tobi Lütke (of Shopify fame).

It combines three search approaches:

  1. BM25 full-text search — fast keyword matching. Great for exact terms, error messages, code symbols, and IDs.
  2. Vector semantic search — finds conceptually similar content even when the wording differs. Uses local GGUF embedding models.
  3. Hybrid search with LLM re-ranking — runs both in parallel, merges results using Reciprocal Rank Fusion, then re-ranks with a local language model.
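The merge step can be sketched with Reciprocal Rank Fusion: each document earns `1/(k + rank)` from every list it appears in, and the sums decide the merged order. This is a minimal illustration of the general RRF technique, not QMD's actual code — the constant `k=60` is the common default from the literature, and the document IDs are made up:

```python
def rrf_merge(rankings, k=60):
    """Merge several ranked result lists with Reciprocal Rank Fusion.

    rankings: ranked lists of document IDs, best first.
    Each document scores 1/(k + rank) per list; scores are summed,
    so documents ranked well by BOTH searches float to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: BM25 and vector search mostly agree on two notes.
bm25 = ["note-gateway", "note-dns", "note-backup"]
vector = ["note-closet-mini", "note-gateway", "note-dns"]
merged = rrf_merge([bm25, vector])
```

A note that appears in both lists ("note-gateway") beats a note that tops only one of them, which is exactly why fusion is more robust than either search alone.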

Everything runs locally, no API keys, no cloud dependencies.

Three small GGUF models auto-download on first run:

| Model | Purpose | Size |
| --- | --- | --- |
| embedding-gemma-300M | Vector embeddings | ~300MB |
| qwen3-reranker-0.6b | Result re-ranking | ~640MB |
| qmd-query-expansion-1.7B | Query expansion | ~1.1GB |

Here’s why this matters.

Say your agent saved this note three weeks ago:

Decided to run the gateway on the Mac Mini in the closet. Port 18789, Cloudflare tunnel for external access.

Search for “gateway server setup” with the default SQLite backend and it misses this — the note doesn’t contain “server” or “setup.”

QMD finds it.

The vector search catches the conceptual match, BM25 hits on “gateway,” and query expansion fills in related phrasings like “infrastructure configuration” before the re-ranker sorts the results.
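You can see the keyword gap with a toy overlap check. This is a hypothetical sketch of why pure keyword matching comes up empty here, not QMD's or SQLite's actual matching logic:

```python
note = ("Decided to run the gateway on the Mac Mini in the closet. "
        "Port 18789, Cloudflare tunnel for external access.")

def matching_terms(query, text):
    """Return the query terms that literally appear in the text."""
    text_tokens = set(text.lower().replace(".", " ").replace(",", " ").split())
    return [t for t in query.lower().split() if t in text_tokens]

overlap = matching_terms("gateway server setup", note)
# Only "gateway" overlaps; "server" and "setup" never occur in the note.
```

A keyword-only ranker has a single term to work with, while an embedding of the note sits close to "gateway server setup" in vector space regardless of wording.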

The trade-off is speed.

Hybrid searches take a few seconds instead of being instant.

For me, accurate recall is worth more than a couple seconds of latency.


Comparing OpenClaw’s memory options

OpenClaw’s memory system supports three search backends.

The default SQLite with vector search works out of the box.

It handles paraphrases well but misses exact tokens like IDs, error strings, and code symbols.

SQLite with hybrid search adds BM25 keyword matching alongside vectors, with optional MMR deduplication and temporal decay, no extra install needed.
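MMR (Maximal Marginal Relevance) deduplication picks each next result by trading off relevance against similarity to results already chosen. A minimal greedy sketch with made-up scores, to show the idea rather than OpenClaw's implementation:

```python
def mmr_select(candidates, relevance, similarity, lam=0.7, k=2):
    """Greedy MMR: balance relevance against redundancy.

    relevance: doc -> query relevance score
    similarity: (doc_a, doc_b) -> pairwise similarity
    lam: weight on relevance (1.0 would be a pure relevance ranking)
    """
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(doc):
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

# "a" and "b" are near-duplicates; "c" is less relevant but novel.
relevance = {"a": 0.9, "b": 0.85, "c": 0.6}
pair_sim = {frozenset(p): s
            for p, s in [(("a", "b"), 0.95), (("a", "c"), 0.1), (("b", "c"), 0.1)]}
picked = mmr_select(["a", "b", "c"], relevance,
                    lambda x, y: pair_sim[frozenset((x, y))])
```

The near-duplicate "b" gets skipped in favor of the novel "c", which is what keeps three copies of the same daily-log entry from crowding out a distinct memory.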

QMD goes further with query expansion and LLM re-ranking.

| | SQLite (Vector) | SQLite (Hybrid) | QMD |
| --- | --- | --- | --- |
| Setup | None (built-in) | Config change | Install binary + config |
| Search types | Semantic only | Semantic + BM25 | Semantic + BM25 + LLM re-ranking |
| Query expansion | No | No | Yes |
| LLM re-ranking | No | No | Yes |
| Result diversity (MMR) | No | Yes (optional) | Built into ranking |
| Temporal decay | No | Yes (optional) | No (use dated file organization) |
| Embedding provider | Local, OpenAI, Gemini, Voyage | Local, OpenAI, Gemini, Voyage | Local only (GGUF) |
| API keys needed | Optional (local model works) | Optional (local model works) | None |
| Disk overhead | ~600MB (local model) | ~600MB (local model) | ~2GB (3 GGUF models) |
| Privacy | Full (with local embeddings) | Full (with local embeddings) | Full (always local) |
| Speed | Fast | Fast | Fast (BM25) to moderate (hybrid) |
| External dir indexing | Via extraPaths | Via extraPaths | Via paths[] with patterns |
| Fallback on failure | N/A (built-in) | Falls back to vector-only | Falls back to SQLite |

My recommendation: If you just set up OpenClaw and have a few daily logs, the default is fine.

If you’ve been running for weeks and noticing gaps in recall, switch to QMD.

If you want a middle ground without installing anything extra, enable hybrid search on SQLite first:

memorySearch: {
  query: {
    hybrid: {
      enabled: true,
      vectorWeight: 0.7,
      textWeight: 0.3,
      candidateMultiplier: 4
    }
  }
}
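Conceptually, those weights blend the two score lists into one ranking. The sketch below shows weighted score fusion under the assumption that both score sets are already normalized to 0–1; it is an illustration of what the knobs control, not OpenClaw's exact code:

```python
def blend(vector_scores, text_scores, vector_weight=0.7, text_weight=0.3):
    """Combine normalized vector and BM25 scores into one ranking."""
    docs = set(vector_scores) | set(text_scores)
    combined = {
        d: vector_weight * vector_scores.get(d, 0.0)
           + text_weight * text_scores.get(d, 0.0)
        for d in docs
    }
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical scores: one note matches semantically, another
# contains an exact error string that only BM25 catches.
vector_scores = {"note-gateway": 0.82, "note-dns": 0.40}
text_scores = {"note-gateway": 0.30, "note-error-log": 0.95}
ranked = blend(vector_scores, text_scores)
```

With `vectorWeight: 0.7`, a strong semantic match outranks a strong keyword-only match; flip the weights and exact-token hits like error strings win instead.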

Prerequisites

  • OpenClaw running — follow my OpenClaw setup guide if you haven’t
  • Node.js 22+ or Bun 1.0+
  • ~2GB disk space for GGUF models (auto-downloaded on first run)
  • macOS or Linux (Windows via WSL2)
  • On macOS: brew install sqlite for SQLite extension support

Setting up QMD as your OpenClaw memory backend

Install QMD

bun install -g https://github.com/tobi/qmd

Or with npm:

npm install -g @tobilu/qmd

If you installed via Bun, add $HOME/.bun/bin to your PATH:

export PATH="$HOME/.bun/bin:$PATH"

Verify:

qmd --help

Enable the QMD backend

memory: {
  backend: "qmd",
  citations: "auto"
}

citations: "auto" is optional but adds source path and line numbers to results.

What happens on boot

When OpenClaw starts with QMD enabled:

  1. Creates a QMD environment at ~/.openclaw/agents/<agentId>/qmd/
  2. Indexes your workspace memory files (MEMORY.md and memory/**/*.md) plus any configured external paths
  3. Runs qmd update (text indexing) and qmd embed (vector embeddings — slower on first run as models download)
  4. Re-indexes every 5 minutes in the background

The refresh runs asynchronously, so your agent is available for chat right away.

Verify it works

Restart OpenClaw, then ask your agent about something in its memory.

If it returns relevant results with source citations, QMD is working.

You can also verify directly:

STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
export XDG_CONFIG_HOME="$STATE_DIR/agents/main/qmd/xdg-config"
export XDG_CACHE_HOME="$STATE_DIR/agents/main/qmd/xdg-cache"

qmd status
qmd search "test query" -c memory-root

Advanced configuration

Search modes

memory: {
  backend: "qmd",
  qmd: {
    searchMode: "search"  // "search", "vsearch", or "query"
  }
}
  • search (default) — BM25 keyword search. Fast, usually instant.
  • vsearch — semantic vector search. Slower, but finds conceptually similar results.
  • query — full hybrid pipeline with LLM re-ranking. Highest quality, slowest.

Start with the default and switch to query once you’re comfortable with slightly longer response times.

Indexing external Markdown directories

You can point QMD at any Markdown directory — Obsidian vaults, project docs, meeting notes — and search across all of them in a single query.

memory: {
  backend: "qmd",
  qmd: {
    includeDefaultMemory: true,
    paths: [
      { name: "notes", path: "~/notes", pattern: "**/*.md" },
      { name: "obsidian", path: "~/Documents/Obsidian", pattern: "**/*.md" },
      { name: "work-docs", path: "~/work/docs", pattern: "**/*.md" }
    ]
  }
}

Each path gets its own named collection. includeDefaultMemory: true keeps your agent’s own memory files indexed alongside external directories.

Tuning result limits and intervals

memory: {
  backend: "qmd",
  qmd: {
    update: {
      interval: "5m",
      debounceMs: 15000,
      onBoot: true,
      waitForBootSync: false
    },
    limits: {
      maxResults: 6,
      maxSnippetChars: 700,
      timeoutMs: 4000
    }
  }
}

The defaults are sensible.

Only adjust if you’re seeing too many/few results or timeouts.

Pre-warming the index

The first run downloads models and builds embeddings from scratch.

To avoid a slow first interaction, pre-warm manually:

STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
export XDG_CONFIG_HOME="$STATE_DIR/agents/main/qmd/xdg-config"
export XDG_CACHE_HOME="$STATE_DIR/agents/main/qmd/xdg-cache"

qmd update && qmd embed
qmd query "test" -c memory-root --json >/dev/null 2>&1

Enable the automatic memory flush

Easy to miss, but it matters.

OpenClaw has a “pre-compaction ping” — when a session approaches context compaction, it silently prompts the model to write important context to disk before the window resets.

Without it, your agent loses decisions, preferences, and facts that were discussed but never explicitly saved.

Better search is only useful if the memories make it into the files.

agents: {
  defaults: {
    compaction: {
      memoryFlush: {
        enabled: true,
        softThresholdTokens: 4000
      }
    }
  }
}

The agent responds with NO_REPLY so you never see the interaction, but the memories it saves show up in QMD searches later.
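Conceptually, the trigger is just a token-budget check: once the remaining context headroom drops below the soft threshold, the flush prompt fires. A hypothetical sketch of the idea (OpenClaw's actual internals may differ; `soft_threshold_tokens` mirrors the `softThresholdTokens` key above):

```python
def should_flush(context_limit, tokens_used, soft_threshold_tokens=4000):
    """Fire the memory-flush prompt once headroom drops below the threshold."""
    return (context_limit - tokens_used) <= soft_threshold_tokens

# With a hypothetical 128k-token window, the flush fires once fewer
# than 4k tokens of headroom remain.
fires = should_flush(128_000, 125_000)
```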


Wrapping up

The whole thing takes about five minutes: install QMD, change one config line, restart.

If you haven’t set up OpenClaw yet, start with my guide on automating daily intelligence briefings first.

Once you’re running, come back here and upgrade the memory.

Frequently Asked Questions

What is QMD?
QMD (Query Markup Documents) is a local search engine for Markdown files that combines BM25 keyword search, vector semantic search, and LLM re-ranking. It was created by Tobi Lütke and runs entirely on your machine with no API keys or cloud dependencies.
Does QMD require an API key?
No. QMD runs entirely locally using three small GGUF models totaling about 2GB. No API keys, no cloud services, and no data leaves your machine.
How much disk space does QMD need?
About 2GB for the three GGUF models (embedding, re-ranking, and query expansion). The SQLite search index grows with your content but is typically small.
Can I use QMD with Obsidian or other Markdown files?
Yes. QMD can index any directory of Markdown files alongside your OpenClaw agent's memory. Configure additional paths in your OpenClaw config to search across Obsidian vaults, project docs, meeting notes, and more in a single query.
