SQLite FTS5 full-text search
Source: Artifex (image/tag search) · markstack (bookmark search) Category: Pattern — search
SQLite FTS5 — a virtual-table module built into SQLite that indexes text columns for fast full-text queries. MATCH operator, BM25 ranking, phrase / prefix / boolean queries. All in the same file as your app data.
What it is
Section titled “What it is”You declare a virtual table over the columns you want searchable. SQLite builds and maintains the inverted index. Queries use MATCH 'search terms' and order by the built-in rank function (BM25). Triggers keep the index in sync with the real table.
Why it exists
Section titled “Why it exists”The problem: self-hosted apps hit the wall on LIKE '%term%' quickly — no relevance ranking, linear scan, no phrase matching. Alternatives:
- Run Elasticsearch / Meilisearch / Typesense — another daemon, another place for data to drift out of sync
- Roll your own inverted index — spend three weeks not shipping the actual app
- Skip search, require users to scroll — fine at 20 rows, unworkable at 2000
The fix: FTS5. Ships inside every modern SQLite. No separate process. Adequate for most self-hosted apps up to millions of rows.
Create the virtual table and set up sync triggers:
-- Real content tableCREATE TABLE bookmarks ( id INTEGER PRIMARY KEY, title TEXT, description TEXT, url TEXT, created_at INTEGER);
-- Virtual FTS table (indexes the text columns)CREATE VIRTUAL TABLE bookmarks_fts USING fts5( title, description, url, content='bookmarks', -- external content; index references real rows content_rowid='id', tokenize='porter unicode61');
-- Triggers to keep the index in syncCREATE TRIGGER bookmarks_ai AFTER INSERT ON bookmarks BEGIN INSERT INTO bookmarks_fts(rowid, title, description, url) VALUES (new.id, new.title, new.description, new.url);END;
CREATE TRIGGER bookmarks_ad AFTER DELETE ON bookmarks BEGIN INSERT INTO bookmarks_fts(bookmarks_fts, rowid, title, description, url) VALUES ('delete', old.id, old.title, old.description, old.url);END;
CREATE TRIGGER bookmarks_au AFTER UPDATE ON bookmarks BEGIN INSERT INTO bookmarks_fts(bookmarks_fts, rowid, title, description, url) VALUES ('delete', old.id, old.title, old.description, old.url); INSERT INTO bookmarks_fts(rowid, title, description, url) VALUES (new.id, new.title, new.description, new.url);END;Query with ranking:
SELECT b.*, bm25(bookmarks_fts) AS scoreFROM bookmarks_ftsJOIN bookmarks b ON b.id = bookmarks_fts.rowidWHERE bookmarks_fts MATCH 'portfolio AND (react OR typescript)'ORDER BY score -- lower score = better matchLIMIT 50;FTS5 query syntax supports: phrases "exact phrase", prefix term*, boolean AND / OR / NOT, column-scoped title:react.
How it’s used
Section titled “How it’s used”- Artifex — search across image tags, captions, filenames, prompts
- markstack — search bookmarks by title, description, or URL
- Pattern generalizes — any app where “search box” is a feature and data lives in SQLite
Gotchas
Section titled “Gotchas”- Choose the tokenizer carefully.
unicode61handles accents and case folding;porterstems English (run → running);trigramis for fuzzy-ish prefix matching. Mixing modes is possible but rare. - FTS5 is external-content by default in this pattern (
content='bookmarks'). Triggers are required; forget one and the index gets stale silently. - Rebuilding the index is expensive. For a one-time rebuild:
INSERT INTO bookmarks_fts(bookmarks_fts) VALUES('rebuild');. On production, plan downtime or do it offline. - Ranking is backwards.
bm25()returns smaller scores for better matches.ORDER BY bm25(...) ASC. Easy to flip by accident and get the worst matches first. - Schema changes require re-creating the virtual table.
ALTER TABLEworks on the content table, not on the FTS table. Drop + recreate + rebuild. - Queries with special characters need escaping. User-entered
"or*can either be interpreted as FTS5 syntax or break the parse. Either sanitize or wrap the whole query in quotes to disable syntax parsing. - No language-aware ranking out of the box. BM25 treats terms as tokens; it doesn’t know “running” ≈ “run” unless your tokenizer does stemming.
See also
Section titled “See also”- projects/artifex · projects/markstack
- patterns/sqlite-job-queue — sibling “use SQLite for the thing you’d normally reach for another service” pattern