

Membership inference attacks on retrieval-augmented generation

We show that RAG pipelines leak the presence of specific documents in the index under realistic prompting conditions, and propose two practical mitigations.

security · rag · attacks

22 January 2026 · Reseni Security Team

Retrieval-augmented generation (RAG) is widely assumed to be private with respect to the underlying corpus because individual documents are never incorporated into the model weights.

Through a series of black-box experiments against four production-style RAG stacks, we show that an attacker can determine, with more than 80% precision, whether a target document is in the index, using only the model's responses.
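
To make the black-box setting concrete, here is a minimal illustrative sketch of one way such a probe could look. It is not the procedure used in the experiments: the continuation-style prompt, the token-overlap score, the `query_rag` callable, and the `threshold` value are all assumptions made for illustration. The attacker holds a target document, sends its opening as a prompt, and checks how much of the held-out remainder appears in the response.

```python
from typing import Callable


def token_overlap(response: str, held_out: str) -> float:
    """Fraction of held-out tokens that also appear in the response."""
    held = held_out.lower().split()
    if not held:
        return 0.0
    seen = set(response.lower().split())
    return sum(tok in seen for tok in held) / len(held)


def infer_membership(
    query_rag: Callable[[str], str],  # hypothetical black-box endpoint
    prefix: str,                      # opening of the target document
    held_out: str,                    # remainder the attacker keeps back
    threshold: float = 0.6,           # assumed cut-off, needs calibration
) -> bool:
    """Guess that the target is indexed if the response echoes the held-out text."""
    response = query_rag(f"Continue this passage: {prefix}")
    return token_overlap(response, held_out) >= threshold
```

In practice the threshold would need calibrating on documents whose membership is already known, since a model can reproduce well-known text without any retrieval.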

We outline two mitigations: query-time differential privacy on retrieval scores, and a defence-in-depth pattern using semantic redaction at indexing time.
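
As a rough sketch of what the first mitigation might look like, the snippet below perturbs retrieval scores with Laplace noise before selecting the top-k documents. The function names, the `epsilon` and `sensitivity` parameters, and the choice of the Laplace mechanism are assumptions for illustration, not details taken from the paper. A formal privacy guarantee would also depend on how score sensitivity is bounded and on how the budget composes across queries.

```python
import random
from typing import Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_top_k(
    scores: dict[str, float],   # document id -> retrieval score
    k: int,
    epsilon: float,             # assumed per-query privacy budget
    sensitivity: float = 1.0,   # assumed bound on one document's effect on a score
    rng: Optional[random.Random] = None,
) -> list[str]:
    """Return the k document ids with the highest noise-perturbed scores."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    noisy = {doc: s + laplace_noise(scale, rng) for doc, s in scores.items()}
    return sorted(noisy, key=noisy.get, reverse=True)[:k]
```

Smaller `epsilon` values add more noise, which makes the presence of any single document harder to infer from what is retrieved, at the cost of retrieval quality.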

