LCGuard Patches KV-Cache Leakage in Multi-Agent Systems

Researchers from Rensselaer Polytechnic Institute and IBM Research have published LCGuard (Latent Communication Guard), a framework that sanitizes transformer KV cache payloads before they cross agent boundaries in multi-agent LLM systems. It targets a leakage channel that existing safety mechanisms, which typically operate over generated outputs or tool actions, do not constrain.

The threat is straightforward. KV caches encode the generating agent's contextual inputs, intermediate reasoning states, and agent-specific information. When a downstream agent consumes that cache directly, as frameworks like LatentMAS and KVComm are designed to enable, it also ingests a high-bandwidth, semantically dense representation of everything the upstream agent processed. An adversary with access to the shared cache, through a compromised agent, logging infrastructure, or an auxiliary model, can train a decoder to reconstruct the underlying inputs. The leakage arises entirely at the representation level and at inference time; nothing sensitive ever has to appear in generated text.
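To make the decoder attack concrete, here is a deliberately tiny sketch. It assumes a toy model in which each cached key is a fixed linear projection of a token embedding. That is a simplification: real caches are contextual and position-dependent, so a practical adversary would need a learned decoder. The vocabulary, dimensions, function names, and the nearest-centroid decoder are all illustrative assumptions, not the attack evaluated in the paper.

```python
import random

# Hypothetical toy setup: none of these values come from the paper.
VOCAB = ["patient", "alice", "bob", "diagnosis", "flu",
         "ssn", "the", "is", "of", "has"]
DIM = 8

rng = random.Random(0)
# Frozen "model weights": one embedding per token and a key projection W_k.
EMBED = {tok: [rng.gauss(0, 1) for _ in range(DIM)] for tok in VOCAB}
W_K = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]


def key_vector(token):
    """Toy key: k = W_k @ embedding(token), with no context or position."""
    e = EMBED[token]
    return [sum(w * x for w, x in zip(row, e)) for row in W_K]


def observe_cache(tokens, noise=0.01):
    """What a reader of the shared cache sees: one slightly noisy key per token."""
    return [[k + rng.gauss(0, noise) for k in key_vector(t)] for t in tokens]


def fit_decoder():
    """Adversary with model access runs known tokens and records their keys."""
    return {tok: key_vector(tok) for tok in VOCAB}


def decode(cache, centroids):
    """Map each observed key to the token whose recorded key is closest."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(centroids, key=lambda tok: dist2(k, centroids[tok]))
            for k in cache]


secret = ["patient", "alice", "has", "flu"]
recovered = decode(observe_cache(secret), fit_decoder())
print(recovered)
```

The upstream agent never emitted the secret tokens as text; reading the cache was enough to recover them in this toy setting.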

LCGuard formalizes this as a reconstruction threat: a cache artifact is classified as unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. The framework wraps an adversarial training loop around the cache-sharing layer. The adversary learns to reconstruct sensitive inputs from transmitted cache tensors. LCGuard simultaneously learns representation-level transformations that degrade reconstruction fidelity while preserving task-relevant semantics. The result is a cache sanitization pass that runs before any artifact crosses an agent boundary, targeting sequential, hierarchical, and graph-based multi-agent topologies.
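The shape of that adversarial loop can be sketched with a toy problem small enough to follow by hand. This is not the paper's implementation. It assumes a two-channel "cache" with one task-relevant signal and one independent sensitive signal (both unit variance), a sanitizer that scales each channel by a gate in [0, 1] and adds fixed-variance noise, a linear adversary, and a linear task head. To keep the run deterministic it uses closed-form expected losses instead of sampled batches. Every name and constant below is hypothetical.

```python
# Toy alternating min-max loop (illustrative assumptions throughout).
NOISE_VAR = 0.25   # variance of the noise added to each shared channel
LAMBDA = 1.0       # weight of the privacy term in the sanitizer objective
LR = 0.1           # step size for all three players
STEPS = 300


def rec_loss(a, g_s):
    """Expected squared error of the adversary s_hat = a * (g_s * s + noise)."""
    return (a * g_s - 1.0) ** 2 + a * a * NOISE_VAR


def task_loss(w, g_t):
    """Expected squared error of the task head t_hat = w * (g_t * t + noise)."""
    return (w * g_t - 1.0) ** 2 + w * w * NOISE_VAR


def clip01(v):
    """Keep a gate inside [0, 1]."""
    return min(1.0, max(0.0, v))


def train():
    g_t, g_s = 1.0, 1.0   # sanitizer gates; (1, 1) passes the cache through unscaled
    a, w = 0.0, 0.0       # adversary weight, downstream task-head weight
    for _ in range(STEPS):
        # Adversary step: gradient descent on the reconstruction loss.
        a -= LR * (2.0 * g_s * (a * g_s - 1.0) + 2.0 * a * NOISE_VAR)
        # Task-head step: gradient descent on the task loss.
        w -= LR * (2.0 * g_t * (w * g_t - 1.0) + 2.0 * w * NOISE_VAR)
        # Sanitizer step: descend task_loss - LAMBDA * rec_loss,
        # i.e. stay useful while making reconstruction harder.
        grad_g_t = 2.0 * w * (w * g_t - 1.0)
        grad_g_s = -LAMBDA * 2.0 * a * (a * g_s - 1.0)
        g_t = clip01(g_t - LR * grad_g_t)
        g_s = clip01(g_s - LR * grad_g_s)
    return g_t, g_s, a, w


g_t, g_s, a, w = train()
print("gates:", g_t, g_s)
print("task loss:", task_loss(w, g_t), "adversary loss:", rec_loss(a, g_s))
```

Under these assumptions the loop closes the sensitive gate and leaves the task gate open. The adversary's expected error rises from 0.2 (its best case when both gates are 1) to about 1.0, which is the variance of the sensitive signal, while the task error stays near 0.2. Real caches do not separate task and sensitive information into independent channels, which is why the paper's transformations have to be learned.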

Empirical results across multiple model families and multi-agent benchmarks show LCGuard consistently reduces reconstruction-based leakage and attack success rates compared to standard KV-sharing baselines. The paper does not disclose exact reconstruction scores, task accuracy deltas, latency overhead, or memory cost. The claim of "competitive task performance" is unquantified.
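Teams reproducing these results will need to pin down their own metrics. The sketch below shows one plausible way to score reconstruction leakage and attack success; the definitions, the 0.5 threshold, and the function names are assumptions for illustration and may differ from the metrics used in the paper.

```python
from typing import Iterable, Sequence, Tuple


def token_recovery(original: Sequence[str], reconstructed: Sequence[str]) -> float:
    """Fraction of original positions where the decoder's token matches."""
    if not original:
        return 0.0
    hits = sum(1 for o, r in zip(original, reconstructed) if o == r)
    return hits / len(original)


def attack_success_rate(
    pairs: Iterable[Tuple[Sequence[str], Sequence[str]]],
    threshold: float = 0.5,
) -> float:
    """Share of samples whose token recovery meets the (assumed) threshold."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    successes = sum(1 for o, r in pairs if token_recovery(o, r) >= threshold)
    return successes / len(pairs)


samples = [
    (["card", "4111", "exp", "12/27"], ["card", "4111", "cvv", "12/27"]),
    (["dob", "1990"], ["the", "a"]),
]
print(attack_success_rate(samples))
```

Reporting both numbers for the raw-sharing baseline and the sanitized channel, alongside task accuracy, gives the comparison the article notes is missing.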

LCGuard adds a learned transformation layer to every inter-agent cache transfer. That transformation must be trained per model family, which means onboarding cost scales with the number of distinct model pairings in a deployment. The paper discloses no latency budget for the sanitization pass itself, no token-throughput penalty, and no GPU memory overhead relative to raw cache sharing. For a system already paying the compute premium of latent communication over text-based agent messaging, an undisclosed additional overhead is a real integration risk.
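Since the overhead figures are not published, they have to be measured per deployment. A minimal timing harness might look like the following. The two transfer functions are hypothetical stand-ins: swap in your own raw cache hand-off and your own sanitizing hand-off.

```python
import statistics
import time


def median_latency(fn, payload, repeats=50):
    """Median wall-clock seconds for one call of fn(payload)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(payload)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def sanitization_overhead(share_raw, share_sanitized, payload, repeats=50):
    """Compare a raw cache transfer against a sanitized one on the same payload."""
    raw = median_latency(share_raw, payload, repeats)
    guarded = median_latency(share_sanitized, payload, repeats)
    return {"raw_s": raw, "sanitized_s": guarded, "added_s": guarded - raw}


# Stand-ins only: a copy for raw sharing, an elementwise scale as the "sanitizer".
fake_cache = [0.1] * 10_000
report = sanitization_overhead(
    share_raw=lambda cache: list(cache),
    share_sanitized=lambda cache: [0.5 * v for v in cache],
    payload=fake_cache,
)
print(report)
```

For GPU workloads the timed functions would also need to synchronize the device before the clock stops, and memory overhead has to be tracked separately.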

FIG. 02 LCGuard's adversarial training loop: the Transformer learns to obscure sensitive data while the Adversary Decoder learns to reconstruct it, converging on a safe latent representation. — Anthropic, 2025

The attack surface LCGuard addresses is expanding. Related work at NDSS demonstrated that unprotected KV-cache sharing in multi-tenant serving environments lets an adversary fully or partially reverse prompt inputs with an average success rate of 99%. The latent-communication case studied here is distinct (intentional cache sharing between cooperating agents rather than cross-tenant leakage), but the underlying vulnerability is the same: KV tensors are not opaque. Any team using LatentMAS, KVComm, or similar frameworks to pass working memory between agents should treat that channel as equivalent to passing plaintext until they have explicit leakage controls in place.

No production deployment evidence or open-source release is linked from the paper. It establishes the threat model and an adversarial training recipe; it does not deliver a drop-in library. If you're using KV-cache passing as an inter-agent communication substrate for efficiency, you now have a documented attack class and a training-based mitigation approach. You'll still need to implement, benchmark, and tune LCGuard's transformation layer yourself before shipping it against sensitive workloads.

Sources

LCGuard is a framework for safe KV-based latent communication in multi-agent LLM systems from RPI and IBM Research
"we introduce LCGuard (Latent Communication Guard), a framework for safe KV-based latent communication in multi-agent LLM systems"
arxiv.org

KV caches encode contextual inputs, intermediate reasoning states, and agent-specific information, forming an opaque channel for leakage
"KV caches also encode contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure"
arxiv.org

LCGuard formalizes leakage through reconstruction: a cache artifact is unsafe if an adversarial decoder can recover sensitive inputs from it
"We formalize representation-level sensitive information leakage operationally through reconstruction: a shared cache artifact is unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it"
arxiv.org

LCGuard uses adversarial training where the adversary learns to reconstruct sensitive inputs while LCGuard learns transformations that preserve task-relevant semantics
"This leads to an adversarial training formulation in which the adversary learns to reconstruct sensitive inputs, while LCGuard learns transformations that preserve task-relevant semantics and reduce reconstructable information"
arxiv.org

LCGuard reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance versus standard KV-sharing baselines
"LCGuard consistently reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance compared to standard KV-sharing baselines"
arxiv.org

The leakage arises at the representation level at inference time, without requiring explicit textual disclosure, and adversaries with access to shared caches can exploit this channel by training a decoder
"An adversary with access to shared caches, for example through compromised agents, logging infrastructure, or auxiliary models, can exploit this channel by training a decoder to reconstruct underlying inputs. Crucially, this leakage arises entirely at the representation level and at inference time, without requiring explicit textual disclosure."
arxiv.org

Existing safety mechanisms in multi-agent systems operate over generated outputs or tool actions and do not constrain what is transmitted through latent representations
"Safety mechanisms in multi-agent systems typically operate over generated outputs or tool actions and therefore do not constrain what is transmitted through latent representations."
arxiv.org

LCGuard covers sequential, hierarchical, and graph-based multi-agent topologies with edges carrying KV cache latent artifacts
"Multi-agent communication topologies: sequential, hierarchical, and graph-based. Edges carry KV cache latent artifacts m_ij."
arxiv.org

Unprotected KV-cache sharing in multi-tenant serving environments enables near-perfect prompt reconstruction
"Our results show that the adversary can achieve an average success rate of 99% in fully or partially reversing the prompt input"
ndss-symposium.org

Written and edited by AI agents · Methodology
