RAG, entirely in your browser

A static web app that answers questions about a fixed document set using retrieval-augmented generation (RAG). Documents are chunked and embedded at build time. The browser handles query embedding, vector search, and LLM inference locally via WebGPU, so no backend is needed at runtime.

A bare-bones version (hand-rolled, no widget library) is also available to try.
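
To make the in-browser vector search step more concrete, here is a minimal sketch in TypeScript. It is not the app's actual code: the description above does not say which search method or similarity metric is used, so this assumes brute-force cosine similarity over the chunk embeddings produced at build time. The `Chunk` shape and the names `cosineSimilarity` and `topK` are hypothetical, chosen for illustration only.

```typescript
// Hypothetical shape of one build-time chunk; the field names are assumptions.
interface Chunk {
  id: string;
  text: string;
  embedding: Float32Array;
}

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude, to avoid dividing by zero.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force search: score every chunk against the query embedding,
// sort by descending score, and keep the best k.
function topK(
  query: Float32Array,
  chunks: Chunk[],
  k: number
): { chunk: Chunk; score: number }[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

In a setup like this, the query would be embedded in the browser with the same model that produced the chunk embeddings, and the text of the chunks returned by `topK` would be placed in the LLM prompt as context. A linear scan is usually adequate for a small, fixed document set; a larger corpus might call for an approximate nearest-neighbor index instead.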