What is QMD
QMD (Query Markup Documents) is an on-device search engine for markdown files. QMD indexes markdown notes, meeting transcripts, documentation, and knowledge bases, then lets you search across all of them using keywords or natural language. QMD runs entirely on your machine -- no data leaves your device, no cloud services are involved, and no API keys are required.
What problems does QMD solve
Standard file search (Spotlight, grep, file explorers) matches exact words in filenames or file contents. QMD goes further by combining three search techniques into a single tool:
- Keyword search finds documents containing the exact terms you type, ranked by relevance using BM25 scoring.
- Semantic search finds documents that are conceptually related to your query, even when the exact words do not appear in the document. Searching for "how to deploy" can surface a document titled "Release Process" because QMD compares vector embeddings of the query and the document rather than their literal words.
- Hybrid search combines keyword search and semantic search, then uses an LLM reranker to score each result for relevance. QMD's hybrid search produces the highest-quality results by merging multiple signals.
All three QMD search modes work offline. Semantic search and hybrid search run GGUF models locally via node-llama-cpp; keyword search needs no model inference at all.
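To make the BM25 ranking behind keyword search more concrete, the following is a minimal sketch of a common BM25 variant in plain Python. It is not QMD's implementation: the whitespace tokenizer, the `k1` and `b` values, and the sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a common BM25 variant.

    k1 and b are conventional defaults, assumed here for illustration.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized if term in other)
            if df == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical sample documents.
docs = [
    "release process for the api service",
    "how to deploy the api service to production",
    "meeting notes from the weekly sync",
]
scores = bm25_scores("deploy production", docs)
```

In this toy corpus only the second document contains the query terms, so the "release process" document scores zero. That is the gap semantic search is meant to close.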
How do QMD's three search modes work
QMD provides three commands for searching, each suited to different needs:
- qmd search -- Fast keyword search using BM25 full-text indexing. Best when you know the specific terms that appear in your documents. QMD keyword search returns results in milliseconds with no model inference required.
- qmd vsearch -- Vector semantic search using locally-generated embeddings. Best when you want to find conceptually related documents. QMD generates vector embeddings for every document chunk and compares them against your query using cosine similarity.
- qmd query -- Hybrid search that combines keyword search and vector search, expands your query using a fine-tuned LLM, and reranks results with a separate reranking model. QMD hybrid search delivers the highest-quality results and is the recommended search mode for most use cases.
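The cosine similarity comparison used by vector search can be sketched in a few lines. The three-dimensional vectors and file names below are made-up stand-ins; real embeddings have far more dimensions and would come from the embedding model.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as unrelated to everything.
    return dot / (norm_a * norm_b)

# Toy vectors standing in for a query embedding and two chunk embeddings.
query_vec = [0.9, 0.1, 0.0]
chunks = {
    "release-process.md": [0.8, 0.2, 0.1],
    "lunch-menu.md": [0.0, 0.1, 0.9],
}

# Rank chunks from most to least similar to the query.
ranked = sorted(
    chunks,
    key=lambda name: cosine_similarity(query_vec, chunks[name]),
    reverse=True,
)
```

Cosine similarity depends only on the direction of the vectors, not their length, which is why it is a common choice for comparing embeddings.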
The qmd query command follows this pipeline: QMD expands your original query into alternative phrasings, runs both keyword and vector search for each phrasing, fuses all results together, then reranks the top candidates using an LLM reranker. The original query receives extra weight so that exact matches are preserved even when expanded queries diverge.
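The fusion step can be illustrated with weighted reciprocal rank fusion, one common way to merge ranked lists. The text above does not name the fusion method QMD uses, so treat this as an assumed technique: the constant `k=60`, the weight of 2.0 for the original query, and the file names are all hypothetical.

```python
from collections import defaultdict

def fuse_rankings(ranked_lists, weights, k=60):
    """Weighted reciprocal rank fusion.

    Each list contributes weight / (k + rank) to every document it contains.
    """
    fused = defaultdict(float)
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            fused[doc] += weight / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical result lists: one for the original query, two for expansions.
original = ["deploy.md", "release.md", "notes.md"]
expanded_1 = ["release.md", "rollout.md", "deploy.md"]
expanded_2 = ["rollout.md", "release.md", "notes.md"]
lists = [original, expanded_1, expanded_2]

# The original query counts double so its top hits are not drowned out.
weighted = fuse_rankings(lists, weights=[2.0, 1.0, 1.0])
unweighted = fuse_rankings(lists, weights=[1.0, 1.0, 1.0])
```

With these toy lists, the unweighted fusion ranks rollout.md above deploy.md even though deploy.md was the top hit for the original query; doubling the original query's weight restores deploy.md to second place. In a full pipeline, the top of the fused list would then go to the reranker.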
What does QMD index
QMD organizes files into collections. A collection points to a directory on your filesystem. You can create multiple QMD collections for different purposes -- for example, one collection for personal notes, another for work documentation, and a third for meeting transcripts.
QMD also supports context, which is descriptive metadata you attach to collections or paths within collections. QMD context helps search understand your content and is returned alongside matching documents. When an AI agent retrieves QMD search results, the attached context gives the agent the background information needed to interpret the documents correctly.
Each indexed document in QMD receives a unique short identifier called a docid -- the first six characters of the document's content hash. QMD docids appear in search results and can be used to retrieve specific documents directly.
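A short sketch shows how a docid of this kind can be derived. The hash algorithm is not stated above, so SHA-256 with hexadecimal encoding is an assumption here, as is the sample document.

```python
import hashlib

def docid(content: str) -> str:
    """Return the first six characters of the content hash.

    Assumes a SHA-256 hex digest; the actual hash function may differ.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return digest[:6]

# Hypothetical document content.
doc = "# Release Process\n\nSteps for shipping a new version.\n"
short_id = docid(doc)
```

Because the identifier is derived from content, the same content always yields the same docid. Under the hex assumption, six characters allow 16,777,216 distinct values, which is short enough to type while keeping collisions unlikely in a personal collection.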
How does QMD run locally
QMD stores its index in a local SQLite database. QMD uses SQLite FTS5 for full-text keyword search and sqlite-vec for vector similarity search. The three models QMD uses for embeddings, reranking, and query expansion are GGUF models that run on your machine via node-llama-cpp. QMD downloads these models automatically on first use and caches them locally.
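The FTS5 side of this design can be tried with Python's built-in `sqlite3` module, provided the bundled SQLite was compiled with FTS5 (most builds are). The table name, column names, and sample rows below are hypothetical and do not reflect QMD's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one full-text table with a path and a body column.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body)")
conn.executemany(
    "INSERT INTO docs (path, body) VALUES (?, ?)",
    [
        ("notes/deploy.md", "How to deploy the service to production"),
        ("notes/lunch.md", "Ideas for the team lunch next week"),
    ],
)

def keyword_search(conn, query):
    """Return (path, score) rows for documents matching the query."""
    # FTS5's bm25() yields lower values for better matches,
    # so ascending order puts the best match first.
    rows = conn.execute(
        "SELECT path, bm25(docs) FROM docs WHERE docs MATCH ? ORDER BY bm25(docs)",
        (query,),
    )
    return rows.fetchall()

results = keyword_search(conn, "deploy")
```

Keeping both the full-text index and the vectors in one SQLite file means the whole index is a single local artifact that can be backed up or deleted like any other file.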
Because QMD runs entirely on-device, your documents are never sent to external servers. QMD requires no accounts, no subscriptions, and no internet connection after the initial model download.
Who is QMD designed for
QMD is built for developers and technical power users who maintain collections of markdown documents. Common QMD use cases include:
- Personal knowledge management -- Searching across notes, journals, and reference documents that accumulate over months or years.
- Meeting transcript search -- Finding specific discussions, decisions, or action items across a library of meeting notes.
- Documentation search -- Querying project documentation, runbooks, or internal wikis stored as markdown files.
- AI agent workflows -- QMD provides structured output formats (--json, --files, --csv) designed for integration with AI agents. QMD also exposes an MCP (Model Context Protocol) server, allowing AI assistants to search and retrieve documents directly.
How does QMD integrate with AI agents
QMD is designed to work as a retrieval backend for AI agents. QMD exposes an MCP server that AI assistants can connect to for direct access to search and document retrieval. The QMD MCP server provides four tools: query for searching, get for retrieving a single document, multi_get for batch retrieval, and status for checking index health.
QMD's CLI also supports structured output formats that agents can consume directly. Running qmd search "topic" --json returns results as JSON with scores, file paths, and text snippets. Running qmd query "topic" --files returns a compact list of matching file paths with relevance scores and attached context.
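An agent or script could call the CLI and parse its JSON output along the following lines. This is a sketch, not an official client: the helper names are hypothetical, it assumes `qmd` is on the PATH, and it assumes `--json` is accepted by all three search commands. The exact JSON field names are not specified above, so the code returns the parsed value without assuming its shape.

```python
import json
import subprocess

def build_search_command(query, mode="search"):
    """Build the argument list for a qmd search command with JSON output."""
    if mode not in ("search", "vsearch", "query"):
        raise ValueError(f"unknown qmd mode: {mode}")
    return ["qmd", mode, query, "--json"]

def run_search(query, mode="search"):
    """Run qmd and return its parsed JSON output.

    Requires qmd to be installed; raises CalledProcessError on failure.
    """
    completed = subprocess.run(
        build_search_command(query, mode),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)

command = build_search_command("topic")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the query contains spaces or special characters.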
How is QMD installed
QMD is available as an npm package. Install QMD globally with:
npm install -g @tobilu/qmd
Or run QMD directly without installing:
npx @tobilu/qmd search "your query"
QMD requires Node.js version 22 or later. On macOS, QMD also requires the Homebrew version of SQLite for extension support.