cronokirby

(2026-04) zkRAG; Efficiently Proving RAG Retrieval in Zero Knowledge

2026-04-11

Abstract

Retrieval-augmented generation (RAG) systems critically depend on a vector-retrieval stage that selects relevant documents from a large embedding database. In a RAG-as-a-Service setting, however, clients have no visibility into the server's proprietary embeddings and index structures, and thus cannot distinguish faithful execution from arbitrary deviations. This motivates service consistency for retrieval: the returned results must be consistent with executing an agreed approximate nearest neighbor search (ANNS) algorithm on a fixed, committed database together with a fixed, committed ANNS index, while revealing nothing beyond the outputs themselves.

We present the first polynomial interactive oracle proof (PIOP) tailored to the widely deployed HNSW ANNS algorithm. Building on this PIOP, we introduce zkRAG, a zero-knowledge, succinct, non-interactive argument for RAG retrieval that enables practical verification. Our design achieves asymptotically optimal online prover efficiency, with prover time linear in the HNSW search trace length, while keeping verification succinct. We introduce several new techniques that may be of independent interest, including a hybrid lookup argument, a highly efficient checker-based PIOP for checking priority-queue updates, and an efficient checker for membership selector vectors. For a benchmark with 10610^6 vectors of dimension 128128, single-thread zkRAG proves a typical HNSW query in 5050 seconds-over 1000×1000\times faster than existing baselines--while keeping verification lightweight, demonstrating the feasibility of efficient zero-knowledge service-consistent RAG retrieval.