Build a RAG Search for Your Blog with Open‑Source Tools — The Agentic Web

Your blog doesn’t need a heavyweight stack to get great semantic search. In this guide, we’ll build a small, open‑source RAG pipeline that indexes your markdown posts and powers fast, relevant retrieval. You’ll see how to chunk content, pick embeddings, choose a vector store you control, and evaluate quality before you ship.

Goals

Accurate results for “what’s that thing I wrote about tokens?”‑style queries.
Simple, cheap infra you can run locally or on your favorite VPS.
Clear evaluation, so you know when a change helps instead of hoping it does.

Architecture at a glance

Ingest: read markdown files, parse titles/tags/front‑matter.
Chunk: split into passage‑sized chunks with headers preserved.
Embed: generate dense vectors for each chunk.
Store: write to a vector DB you control.
Retrieve: hybrid search (BM25 + vector) or vector‑only with filters.
Rank + Answer: return the best passages; optionally synthesize a summary.

Ingestion and chunking

Keep chunks short enough to match query intent but long enough to be meaningful. A good starting point:

300–600 tokens per chunk
Overlap 50–80 tokens to preserve continuity
Carry section headers into chunk metadata
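
The starting point above can be sketched in a few lines of plain Python. This is a minimal illustration, not a finished chunker: `chunk_markdown` is a hypothetical helper, and it approximates tokens with whitespace-separated words, so swap in your embedding model's tokenizer if you need exact counts. It also treats any line starting with `#` as a header, which will misfire on comments inside fenced code.

```python
import re

def chunk_markdown(text, max_tokens=400, overlap=60):
    """Split markdown into overlapping chunks, tagging each with its section header.

    Assumption: a "token" is a whitespace-separated word.
    """
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be non-negative and smaller than max_tokens")

    # Group words by the header they appear under, so no chunk spans two sections.
    sections = []  # list of (header, words)
    header, words = "", []
    for line in text.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)", line)
        if match:
            if words:
                sections.append((header, words))
            header, words = match.group(1).strip(), []
        else:
            words.extend(line.split())
    if words:
        sections.append((header, words))

    chunks = []
    step = max_tokens - overlap  # each window starts `overlap` words before the previous one ended
    for header, words in sections:
        for start in range(0, len(words), step):
            window = words[start:start + max_tokens]
            chunks.append({
                "section": header,        # header carried as metadata
                "position": len(chunks),  # running index within the post
                "text": " ".join(window),
            })
            if start + max_tokens >= len(words):
                break  # this window already reached the end of the section
    return chunks
```

Calling `chunk_markdown(post_text)` returns a list of dicts you can pass straight to the embedding step; the `section` and `position` fields are what the result list and deduplication rely on later.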

Practical tips:

Strip boilerplate (nav/footers) from pages.
Keep slug, title, section, and position in metadata.
Store the raw markdown and a cleaned text version.
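
As a rough sketch of the ingest step, the helper below separates front matter from the body and keeps both the raw markdown and a lightly cleaned version. It assumes simple `key: value` front matter between `---` lines and a bracketed, comma-separated `tags` list; real front matter is YAML, so a proper YAML parser is the safer choice for anything more complex. `parse_post` and `clean_markdown` are hypothetical names.

```python
import re

def clean_markdown(body):
    """Very light cleanup: drop images, keep link text, strip emphasis/code markers."""
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", "", body)      # remove images entirely
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)  # [text](url) -> text
    return re.sub(r"[*`]+", "", text).strip()             # bold, italics, backticks

def parse_post(raw, slug):
    """Return metadata plus raw and cleaned text for one markdown post."""
    lines = raw.splitlines()
    meta, body = {}, raw
    if lines and lines[0].strip() == "---":
        fields = {}
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                # Only trust the fields once the closing delimiter is found.
                meta, body = fields, "\n".join(lines[i + 1:])
                break
            key, sep, value = lines[i].partition(":")
            if sep:
                fields[key.strip()] = value.strip().strip("\"'")
    tags = [t.strip() for t in meta.get("tags", "").strip("[]").split(",") if t.strip()]
    return {
        "slug": slug,
        "title": meta.get("title", slug),  # fall back to the slug if no title is set
        "tags": tags,
        "raw": raw,
        "text": clean_markdown(body),
    }
```

Keeping `raw` alongside `text` means you can change the cleaning rules later and re-index without re-reading the files.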

Embeddings: small and strong

Start with a compact, high‑quality open model:

bge-small-en or bge-base-en: general retrieval; strong bang-for-buck.
all-MiniLM-L6-v2: tiny, runs almost anywhere.
jina-embeddings-v2-base-en: good out of the box for English.

If you need on-device embeddings, use ONNX or WebAssembly builds (e.g., @xenova/transformers). Otherwise, a small CPU or GPU VM is plenty.

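Normalize vectors to unit length and use cosine similarity unless your DB defaults to a different metric. On unit-length vectors, cosine similarity and dot product produce the same ranking, so what matters is applying the same normalization and metric at index time and at query time.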
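
To make the consistency point concrete, here is a minimal pure-Python sketch of normalization and cosine similarity. The function names are illustrative; in practice your embedding library or vector DB will likely do this for you, and the goal is only to show what it is doing.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: the dot product of the unit-length versions."""
    return dot(normalize(a), normalize(b))
```

If you normalize once at index time and once per query, you can store unit vectors and use the cheaper `dot` everywhere, which gives the same scores as `cosine` on the originals.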

Vector database options you control

SQLite + VSS (sqlite‑vss): simplest possible setup, great for single‑node blogs.
Postgres + pgvector: robust, familiar admin, good filtering and joins.
Qdrant (or Milvus): feature‑rich standalone vector DB with HNSW indexes.

For most solo blogs, SQLite or pgvector is enough. Qdrant shines once you want collections, payload filters, and distributed deployment.
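
If you want to see the storage shape before committing to any of these, the sketch below uses only Python's built-in `sqlite3` module. It is a stand-in, not sqlite-vss, pgvector, or Qdrant: vectors are stored as float32 blobs and search is a brute-force scan, which is an assumption that only holds up for a few thousand chunks. The table layout and function names are hypothetical.

```python
import sqlite3
import struct

def pack(vec):
    """Serialize a list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)

def unpack(blob):
    """Inverse of pack(): bytes back to a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

def open_index(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS chunks (
               id INTEGER PRIMARY KEY,
               slug TEXT NOT NULL,
               section TEXT,
               position INTEGER NOT NULL,
               text TEXT NOT NULL,
               vector BLOB NOT NULL,
               UNIQUE (slug, position)
           )"""
    )
    return db

def add_chunk(db, slug, section, position, text, vec):
    # Re-indexing a post replaces the old row for the same (slug, position).
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT OR REPLACE INTO chunks (slug, section, position, text, vector) "
            "VALUES (?, ?, ?, ?, ?)",
            (slug, section, position, text, pack(vec)),
        )

def search(db, query_vec, k=5):
    """Brute-force dot-product search; assumes stored and query vectors are normalized."""
    rows = db.execute("SELECT slug, section, text, vector FROM chunks").fetchall()
    scored = [
        (sum(q * v for q, v in zip(query_vec, unpack(blob))), slug, section, text)
        for slug, section, text, blob in rows
    ]
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:k]
```

Once the scan gets slow, the same schema maps naturally onto a real vector extension or a standalone DB; only `search` needs to change.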

Retrieval strategies

Vector‑only: fast and simple; start here.
Hybrid (BM25 + vector): keyword matching catches exact terms and names; vectors catch paraphrases.
Filters: use tags or dates to scope retrieval.

A pragmatic hybrid approach:

Run BM25 and keep the top 100 candidates.
Re-rank those candidates by vector similarity to the query embedding.
Return top 5–10 with titles, sections, and snippets.
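
The three steps above can be sketched without any dependencies. This is a small, illustrative Okapi BM25 plus a dot-product re-rank; the tokenizer, the `k1` and `b` defaults, and the choice to drop candidates with a BM25 score of zero are all assumptions you should tune. It expects the query and chunk vectors to be normalized already.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document in `docs` for `query`."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n if n else 0.0
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        counts = Counter(toks)
        score = 0.0
        for term in query_terms:
            tf = counts[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores

def hybrid_search(query, query_vec, docs, doc_vecs, candidates=100, k=5):
    """BM25 shortlist, then re-rank the shortlist by vector similarity."""
    lexical = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: lexical[i], reverse=True)
    # Keep only chunks that matched at least one query term.
    shortlist = [i for i in order[:candidates] if lexical[i] > 0]
    reranked = sorted(
        shortlist,
        key=lambda i: sum(q * v for q, v in zip(query_vec, doc_vecs[i])),
        reverse=True,
    )
    return reranked[:k]  # indexes into `docs`, best first
```

One caveat with this ordering: a passage that shares no keywords with the query never reaches the re-rank step. If your evaluation shows paraphrased queries missing, consider also pulling candidates from a vector search and merging the two lists before re-ranking.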

Evaluating retrieval (don’t skip this)

Create a tiny labeled set—10–50 queries with expected passages. Measure:

Recall@k (k=5,10): the fraction of queries for which a correct passage appears in the top k.
MRR (mean reciprocal rank): how high the first correct passage appears.
NDCG: graded relevance if you label multiple acceptable answers.
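
For reference, here is one way these three metrics could be computed. It is a sketch under a few assumptions: passages are identified by hashable ids, Recall@k follows the definition used here (a query counts as a hit if any correct passage is in the top k), and NDCG takes a dict of graded gains per passage id. The function names are illustrative.

```python
import math

def recall_at_k(ranked, relevant, k):
    """1.0 if any relevant id appears in the top k results, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was returned."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, gains, k):
    """NDCG@k with graded relevance; `gains` maps passage id -> gain (0 if absent)."""
    dcg = sum(
        gains.get(doc, 0.0) / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

def evaluate(runs, k=5):
    """Average the per-query metrics. `runs` is a list of (ranked_ids, relevant_ids)."""
    if not runs:
        raise ValueError("need at least one labeled query")
    n = len(runs)
    return {
        f"recall@{k}": sum(recall_at_k(r, rel, k) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
    }
```

With 10 to 50 labeled queries, a single hit or miss moves these numbers by several points, so compare runs on the same query set and look at which queries changed, not only the averages.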

Iterate on chunking and model choice one variable at a time, keeping the labeled queries and index settings constant so results stay comparable. When recall@5 is stable at your target, ship.

UX pattern that works

Single search box; results show title → section → snippet.
Keyboard navigation (j/k) and quick open of the underlying post.
Optional answer synthesis, only when the top-k similarity scores clear a threshold you have chosen.
Always show sources; never answer without them.
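
One possible way to wire the last two rules together is a small gate in front of the synthesis step. The thresholds below (`min_score`, `min_hits`) and the result fields are placeholders, not recommendations; pick values by looking at score distributions from your own evaluation set.

```python
def build_response(results, min_score=0.75, min_hits=2):
    """Always return sources; flag synthesis only when enough results score highly.

    `results` is a list of dicts with 'score', 'title', 'section', and 'snippet',
    sorted best first. Both thresholds are illustrative assumptions.
    """
    sources = [
        {"title": r["title"], "section": r["section"], "snippet": r["snippet"]}
        for r in results
    ]
    strong = [r for r in results if r["score"] >= min_score]
    return {"sources": sources, "synthesize": len(strong) >= min_hits}
```

The UI can render `sources` unconditionally and only call the summarizer when `synthesize` is true, so a weak match never produces an unsourced answer.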

Rollout checklist

Index job runs locally and in CI (protect against broken parsers).
A one‑line command to rebuild the index.
Evaluations live next to the indexer; one script prints recall@k.
Search UI ships with analytics: queries, clicks, and abandonment.
Document how to add a new collection (e.g., notes or docs) later.

Troubleshooting

Irrelevant results? Reduce chunk size, try bge-base-en, or add hybrid retrieval.
Duplicates? Deduplicate by URL+position and trim overlap.
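Slow queries? Build an HNSW or IVF index ahead of time and return only the payload fields the UI needs.
Wrong passages rank higher? Prepend section headers to the chunk text before embedding, so the vector reflects the section as well as the passage.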
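
For the duplicates case, deduplicating by URL and position can be as small as the sketch below. It assumes each result is a dict with `url` and `position` keys and that the list is already sorted best first, so the highest-ranked copy is the one that survives.

```python
def dedupe(results):
    """Drop repeated (url, position) pairs, keeping the first (best-ranked) copy."""
    seen = set()
    unique = []
    for result in results:
        key = (result["url"], result["position"])
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique
```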

Start small, measure, and iterate. A simple RAG stack beats a complicated one you don’t understand—and you can always layer in sophistication once you’re confident in the basics.