AWS has introduced durability options for Amazon ElastiCache for Valkey, enabling the cache layer to act as a persistent store for agent memory, workflow state, and RAG knowledge bases. The feature is available on new Valkey 9.0 clusters and replaces the traditional local-disk append-only file (AOF) with a Multi-AZ transactional log that replicates writes across availability zones. This diverges from standard Valkey or Redis OSS replication, which remains asynchronous and carries unbounded data-loss risk even with AOF enabled on replicas.

Architects can choose between two durability profiles at cluster creation. Synchronous durability persists writes across at least two AZs before acknowledging the client. At 50,000 transactions per second (TPS), read latency remains below 300 microseconds, increasing to 879 microseconds at 100,000 TPS; write latency is in the single-digit millisecond range, and the mode incurs additional cost. Asynchronous durability acknowledges writes immediately, maintaining microsecond-level read and write latencies at no additional cost, but exposes up to ten seconds of recent data to loss if the primary fails. The service surfaces the age of the oldest unacknowledged write as the DurabilityLag CloudWatch metric. If replication congestion pushes that buffer beyond ten seconds, the primary temporarily rejects incoming writes until it catches up, behavior AWS recommends mitigating with the Valkey GLIDE client and its automatic retry logic.
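The write-rejection behavior means clients need a retry path. As a minimal sketch of what such retry logic might look like (this is not the Valkey GLIDE implementation, whose internals are not described here), the Python below retries a rejected write with capped exponential backoff and jitter. `WriteRejected`, `write_with_retry`, and every default value are hypothetical, chosen for illustration; a real client would map its own error type onto this pattern.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class WriteRejected(Exception):
    """Hypothetical stand-in for the error a client sees when the primary rejects a write."""


def write_with_retry(
    write: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.05,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call write() until it succeeds, backing off exponentially with jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return write()
        except WriteRejected:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the rejection to the caller
            # Full jitter: wait a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without real delays. Jitter spreads retries out so that many clients rejected at the same moment do not all return at once.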

FIG. 02 ElastiCache durability modes: synchronous offers lower data-loss risk; asynchronous trades microsecond latency for up to 10 seconds of loss exposure. — AWS Database Blog, 2026

For agent stacks previously running ElastiCache alongside DynamoDB or a separate database to persist conversation context and tool state, the operational simplification is significant. Asynchronous mode allows a single ElastiCache cluster to serve as hot transient memory (intermediate RAG retrieval results, multi-turn agent context windows, pending workflow steps) without the network hop and schema overhead of a second datastore, provided the architecture can tolerate replaying up to ten seconds of work after a rare AZ failure. Synchronous mode adds cost and single-digit-millisecond write latency, making it suitable for inventory locks or payment tokenization where the ten-second async window is unacceptable, though at that level of durability the distinction from Amazon MemoryDB blurs.
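The choice between the two profiles comes down to whether a workload can absorb the asynchronous loss window. A rough sketch of that decision rule in Python follows; `choose_profile`, the `DurabilityProfile` enum, and the per-workload tolerances are illustrative assumptions, and only the ten-second bound comes from the figures reported above.

```python
from enum import Enum

# Upper bound on asynchronous-mode loss exposure, as reported in the article.
ASYNC_LOSS_WINDOW_SECONDS = 10.0


class DurabilityProfile(Enum):
    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"


def choose_profile(tolerable_loss_seconds: float) -> DurabilityProfile:
    """Pick the asynchronous profile only when the workload can replay
    (or afford to lose) the full asynchronous loss window."""
    if tolerable_loss_seconds < 0:
        raise ValueError("tolerable_loss_seconds must be non-negative")
    if tolerable_loss_seconds >= ASYNC_LOSS_WINDOW_SECONDS:
        return DurabilityProfile.ASYNCHRONOUS
    return DurabilityProfile.SYNCHRONOUS


# Hypothetical tolerances; real values come from each workload's requirements.
workloads = {
    "rag_retrieval_results": 60.0,
    "agent_context_window": 15.0,
    "payment_tokenization": 0.0,
}
for name, tolerance in workloads.items():
    print(f"{name}: {choose_profile(tolerance).value}")
```

The rule is deliberately conservative: any workload that cannot tolerate the full ten seconds lands on synchronous durability, at which point cost and write latency become the questions to weigh.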

FIG. 03 ElastiCache durability collapses multi-tier stacks: agent state no longer requires a separate database layer. — AWS What's New, 2026

The overlap between ElastiCache and MemoryDB remains the central tension. MemoryDB was designed as a strongly consistent primary database with durability guarantees; ElastiCache with synchronous durability is still a cache-first service that now replicates to a transactional log. AWS marketing suggests workloads can evolve their persistence needs without migrating platforms, but the console, SDK, and CLI all enforce durability as a creation-time setting. Existing clusters cannot be converted, so a workload that graduates from pure caching to persistent state still requires a cutover to a new cluster. SiliconANGLE frames durability as a configuration setting within ElastiCache rather than a migration to a separate database platform such as MemoryDB. That is a fair point about architectural alternatives, not a claim about in-place cluster upgrades, and the creation-time constraint remains real regardless.

Corey Quinn of The Duckbill Group cautions against confusing a cache with a primary datastore, noting that the lesson is usually internalized only after an SLA breach. This warning is particularly relevant for agent architectures tempted to store long-term memory or committed transaction state in ElastiCache simply because durability is now an option. The DurabilityLag rejection behavior and ten-second loss window are manageable for transient state, but they do not meet the contract of a primary database.

Architects should consider using asynchronous ElastiCache durability to collapse the hot-state and short-term agent-memory layers into one Redis-compatible endpoint, which eliminates the operational tax of a separate persistence tier for transient state. Committed business transactions should stay in a purpose-built primary database rather than pretending a cache is a store of record.
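One way to picture that split is a thin routing layer in application code. The Python below is a sketch under stated assumptions: `InMemoryStore` is a dictionary-backed stand-in for real datastore clients, and `StateRouter` with its `committed` flag is a hypothetical design, not an AWS or Valkey API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class InMemoryStore:
    """Stand-in for a real datastore client; holds data in a dict."""
    data: Dict[str, str] = field(default_factory=dict)

    def put(self, key: str, value: str) -> None:
        self.data[key] = value


@dataclass
class StateRouter:
    """Sends transient state to the cache and committed state to the primary database."""
    cache: InMemoryStore
    primary_db: InMemoryStore

    def save(self, key: str, value: str, *, committed: bool) -> str:
        # The caller must declare whether the write is a committed record.
        if committed:
            self.primary_db.put(key, value)
            return "primary_db"
        self.cache.put(key, value)
        return "cache"


router = StateRouter(cache=InMemoryStore(), primary_db=InMemoryStore())
router.save("session:42:context", "last three turns", committed=False)
router.save("order:1001", "paid", committed=True)
```

Making `committed` a required keyword argument forces every call site to state which contract it needs, so a write cannot land in the cache by default.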

Written and edited by AI agents · Methodology