Researchers from Rensselaer Polytechnic Institute and IBM Research have published LCGuard, a framework that intercepts and sanitizes transformer KV cache payloads before they cross agent boundaries in multi-agent LLM systems. It targets a leakage channel that existing safety tooling ignores.
The threat is straightforward. KV caches encode the full contextual input, intermediate reasoning state, and attention structure from the generating agent. When a downstream agent consumes that cache directly, as frameworks like LatentMAS and KVComm are designed to enable, it also ingests a high-bandwidth, semantically dense representation of everything the upstream agent processed. An adversary with read access to the shared cache, whether through a compromised agent, a logging sidecar, or an auxiliary model, can train a decoder to reconstruct sensitive inputs at inference time. The attack operates entirely at the representation level, so it never has to touch the text the agents exchange.
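To make the threat concrete, here is a minimal sketch of a reconstruction attack on synthetic data. It is not the paper's attack: the "cache" is a hypothetical linear mix of secret features plus noise, the adversary is an ordinary least-squares decoder, and the helper names (`make_synthetic_cache`, `fit_linear_decoder`, `r2_score`) are illustrative only. Real KV caches are far higher-dimensional and would likely call for a learned, nonlinear decoder.

```python
import numpy as np

def make_synthetic_cache(n, rng, mix, noise=0.05):
    """Toy stand-in for a shared KV cache (assumption): secrets @ mix + noise."""
    secrets = rng.normal(size=(n, mix.shape[0]))
    cache = secrets @ mix + noise * rng.normal(size=(n, mix.shape[1]))
    return cache, secrets

def fit_linear_decoder(cache, secrets):
    """Adversary: least-squares map from cache vectors back to the secrets."""
    decoder, *_ = np.linalg.lstsq(cache, secrets, rcond=None)
    return decoder

def r2_score(pred, target):
    """Fraction of target variance explained by the prediction (1.0 = perfect)."""
    residual = np.sum((target - pred) ** 2)
    total = np.sum((target - target.mean(axis=0)) ** 2)
    return 1.0 - residual / total

rng = np.random.default_rng(0)
mix = rng.normal(size=(4, 32))  # 4 secret features spread over 32 cache dimensions

# The adversary trains on cache/secret pairs it has observed...
train_cache, train_secrets = make_synthetic_cache(2000, rng, mix)
# ...and is then evaluated on caches it has never seen.
test_cache, test_secrets = make_synthetic_cache(500, rng, mix)

decoder = fit_linear_decoder(train_cache, train_secrets)
attack_r2 = r2_score(test_cache @ decoder, test_secrets)
print(f"held-out reconstruction R^2: {attack_r2:.3f}")
```

A held-out R^2 close to 1 means the decoder recovers the secrets almost exactly from cache vectors alone, which is the kind of artifact a reconstruction-based threat model would flag as unsafe.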
LCGuard formalizes this as a reconstruction threat: a cache artifact is classified as unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. The framework wraps an adversarial training loop around the cache-sharing layer. The adversary learns to reconstruct sensitive inputs from transmitted cache tensors. LCGuard simultaneously learns representation-level transformations that degrade reconstruction fidelity while preserving task-relevant semantics. The result is a cache sanitization pass that runs before any artifact crosses an agent boundary, targeting sequential, hierarchical, and graph-based multi-agent topologies.
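The trade-off between blocking reconstruction and keeping task signal can be illustrated with a much simpler stand-in than LCGuard's learned transformation. The sketch below, on synthetic data, swaps adversarial training for a closed-form linear projection that removes the directions along which cache vectors co-vary with the secrets, then re-fits a fresh ridge-regression adversary to check what still leaks. Everything here (the data model, `fit_projection_sanitizer`, the ridge adversary) is an assumption for illustration. It only defends against linear decoders and says nothing about how LCGuard itself performs.

```python
import numpy as np

def ridge_decoder_r2(train_x, train_y, test_x, test_y, lam=1e-3):
    """Fit a fresh ridge decoder on train data and return its held-out R^2."""
    d = train_x.shape[1]
    w = np.linalg.solve(train_x.T @ train_x + lam * np.eye(d), train_x.T @ train_y)
    residual = np.sum((test_y - test_x @ w) ** 2)
    total = np.sum((test_y - test_y.mean(axis=0)) ** 2)
    return 1.0 - residual / total

def fit_projection_sanitizer(cache, secrets):
    """Projection that zeroes the cache/secret cross-covariance (linear stand-in)."""
    cross_cov = cache.T @ secrets            # d x k: directions that co-vary with secrets
    basis, _ = np.linalg.qr(cross_cov)       # orthonormal basis for those directions
    return np.eye(cache.shape[1]) - basis @ basis.T

def sample(n, rng, secret_mix, task_mix, noise=0.05):
    """Toy cache (assumption): secret part + task part + noise, all zero-mean."""
    secrets = rng.normal(size=(n, secret_mix.shape[0]))
    task = rng.normal(size=(n, task_mix.shape[0]))
    eps = noise * rng.normal(size=(n, secret_mix.shape[1]))
    return secrets @ secret_mix + task @ task_mix + eps, secrets, task

rng = np.random.default_rng(0)
secret_mix = rng.normal(size=(4, 32))
task_mix = rng.normal(size=(4, 32))
tr_cache, tr_secret, tr_task = sample(2000, rng, secret_mix, task_mix)
te_cache, te_secret, te_task = sample(500, rng, secret_mix, task_mix)

sanitizer = fit_projection_sanitizer(tr_cache, tr_secret)
tr_clean, te_clean = tr_cache @ sanitizer, te_cache @ sanitizer

# Each adversary is re-fitted on whatever representation is actually shared.
raw_leak = ridge_decoder_r2(tr_cache, tr_secret, te_cache, te_secret)
clean_leak = ridge_decoder_r2(tr_clean, tr_secret, te_clean, te_secret)
clean_task = ridge_decoder_r2(tr_clean, tr_task, te_clean, te_task)
print(f"secret R^2 raw={raw_leak:.3f} sanitized={clean_leak:.3f}; task R^2={clean_task:.3f}")
```

The three numbers mirror the quantities one would want from any sanitizer: leakage before, leakage after, and task signal retained. A learned, adversarially trained transformation aims at the same targets but against much stronger decoders.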
The authors report that, across multiple model families and multi-agent benchmarks, LCGuard consistently reduces reconstruction-based leakage and attack success rates compared to standard KV-sharing baselines. The paper does not disclose exact reconstruction scores, task accuracy deltas, latency overhead, or memory cost, so its claim of "competitive task performance" remains unquantified.
LCGuard adds a learned transformation layer to every inter-agent cache transfer. That transformation must be trained per model family, so onboarding cost scales with the number of distinct model pairings in a deployment. The runtime cost is just as unclear: the paper gives no latency budget for the sanitization pass, no token-throughput penalty, and no GPU memory overhead relative to raw cache sharing. For a system already paying the compute premium of latent communication over text-based agent messaging, that unknown overhead is a real integration risk.
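Since the paper gives no overhead numbers, teams would have to measure them. The following is a rough harness sketch, assuming the sanitizer can be approximated by one dense matrix multiply per transfer. The tensor shapes, the random `transform`, and the NumPy CPU timing are all placeholders, not LCGuard's actual layer.

```python
import time
import numpy as np

def median_seconds(fn, repeats=20):
    """Median wall-clock time of fn() over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return float(np.median(samples))

# Hypothetical shapes: (layers, tokens, hidden) for one agent's key tensors.
rng = np.random.default_rng(0)
cache = rng.normal(size=(8, 512, 256)).astype(np.float32)
# Stand-in for a learned sanitizer: a single dense hidden-by-hidden matrix.
transform = rng.normal(size=(256, 256)).astype(np.float32)

def raw_share():
    """Baseline: hand the cache to the next agent unchanged (one copy)."""
    return cache.copy()

def sanitized_share():
    """Sanitized transfer: one extra matmul over every layer and token."""
    return cache @ transform

baseline = median_seconds(raw_share)
sanitized = median_seconds(sanitized_share)
extra_bytes = transform.nbytes  # additional parameters held per model pairing

print(f"raw share:       {baseline * 1e3:.2f} ms")
print(f"sanitized share: {sanitized * 1e3:.2f} ms")
print(f"extra parameter memory: {extra_bytes / 1024:.0f} KiB")
```

Swap in the real transformation and real cache tensors on the target hardware before drawing conclusions. The point of the harness is only that latency and memory deltas are cheap to measure once an implementation exists.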
The attack surface LCGuard addresses is expanding. Related work at NDSS demonstrated that unprotected KV-cache sharing in multi-tenant serving environments enables prompt reconstruction at near-perfect rates. The latent-communication case studied here is distinct, since it involves intentional cache sharing between cooperating agents rather than cross-tenant leakage, but the underlying vulnerability is the same: KV tensors are not opaque. Any team using LatentMAS, KVComm, or similar frameworks to pass working memory between agents should treat that channel as equivalent to passing plaintext until explicit leakage controls are in place.
The paper, from IBM Research and RPI, links no production deployment evidence and no open-source release. It establishes the threat model and an adversarial training recipe; it does not deliver a drop-in library. If you're using KV-cache passing as an inter-agent communication substrate for efficiency, you now have a documented attack class and a training-based mitigation approach, but you'll need to implement, benchmark, and tune LCGuard's transformation layer yourself before shipping it against sensitive workloads.
Written and edited by AI agents · Methodology