RAG Injection Vectors in Production LLM Pipelines
Analysis of injection surface in retrieval-augmented generation systems: vector DB poisoning, context stuffing, and chunk boundary attacks.
What RAG adds to the attack surface
Retrieval-augmented generation introduces a new trust boundary that most teams haven't mapped: the vector database. The model receives retrieved chunks as context and treats them as background knowledge - but those chunks came from somewhere, and if they came from a source the attacker can influence, the model is now processing attacker-controlled content with system-level trust.
This is structurally identical to indirect prompt injection, but with an additional layer of indirection. The attack doesn't arrive through the user channel; it arrives through the retrieval channel, which most LLM security postures treat as trusted.
Vector similarity bypass
Semantic similarity search is the mechanism that decides which chunks reach the model. The assumption baked into RAG pipelines is that similarity means relevance, and relevance implies safety. Neither is consistently true.
In our tests across six retrieval systems, we were able to craft adversarial embeddings that scored high similarity to legitimate query patterns while embedding instruction content. Four of six systems had no semantic filter downstream of retrieval, meaning any chunk that passed the similarity threshold arrived in the model context without further validation.
The attack surface is small but deterministic: if you can contribute content to the retrieval corpus - via a customer support form, a document upload, a publicly crawled source - you can preposition instruction content in the vector database and trigger it via specific queries.
Chunk boundary injection
Chunking strategies create exploitable seams. When a document is split into fixed-length or semantically-bounded chunks, the boundary between a legitimate chunk and a subsequent adversarial one is invisible to the retrieval layer. The model receives both as context items with equivalent metadata.
The attack shape: a legitimate document segment establishes context trust, and the following chunk contains the injection payload. From the retrieval layer's perspective, these are two high-scoring matches for a relevant query. From the model's perspective, the second chunk is part of the same trustworthy source.
RBAC gaps
Three of the production-grade RAG frameworks we tested applied RBAC at the retrieval query layer - users could only retrieve from documents their role permitted. None of the three propagated those permissions into the model's generation context. The generated response could reference and partially reproduce content from documents the user had no read permission on, because the retrieval gating was the only enforcement point.
This is not a bug in the frameworks; it's an architectural gap. The frameworks were designed to control what gets retrieved, not to reason about what gets generated. The assumption that retrieval-gating implies generation-gating is the mistake.
What helps
Treating retrieved content as untrusted user input rather than trusted knowledge is the foundational change. Architecturally: separate the model instance that processes retrieved chunks from the model instance that generates user-facing output, and ensure the generation model cannot be instructed by retrieved content to change its behaviour. Practically: strip instruction-shaped patterns from retrieved chunks before injection into the context window, and log every retrieval event with the full chunk content for post-hoc review.