Verkor's Agentic System Closes RTL-to-Layout in 80 Hours

Verkor reports that Design Conductor 2.0 autonomously built an inference accelerator, from architecture to FPGA layout, in roughly 80 hours. The company describes the task as 80x larger than its December 2025 baseline, when the system needed 12 hours to design a 5-stage Linux-capable RISC-V CPU.
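The two headline numbers measure different things: 80 hours versus 12 hours is wall-clock time, while 80x is Verkor's figure for task size. A rough back-of-envelope sketch separates the two. It assumes the 80x figure can be read as a linear measure of work, which the article does not confirm, and the function names are illustrative only.

```python
# Figures are taken from the article; reading "80x" as a linear
# measure of work is an assumption made for this illustration.
BASELINE_HOURS = 12    # December 2025 run: 5-stage RISC-V CPU
CURRENT_HOURS = 80     # Design Conductor 2.0 run: VerTQ
TASK_SIZE_RATIO = 80   # "80x larger tasks" per Verkor


def runtime_ratio(current_hours: float, baseline_hours: float) -> float:
    """Return how much longer the new run took in wall-clock terms."""
    return current_hours / baseline_hours


def implied_throughput_gain(task_ratio: float, time_ratio: float) -> float:
    """Return task size handled per hour, relative to the baseline."""
    return task_ratio / time_ratio


time_ratio = runtime_ratio(CURRENT_HOURS, BASELINE_HOURS)
gain = implied_throughput_gain(TASK_SIZE_RATIO, time_ratio)

print(f"wall-clock ratio: {time_ratio:.2f}x")
print(f"implied work per hour: {gain:.1f}x the baseline")
```

Under that assumption, the run took about 6.7 times as long for a task 80 times the size, which works out to roughly 12 times as much work per hour as the December 2025 system.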

Design Conductor 2.0 runs a redesigned multi-agent harness powered by frontier models released in April 2026. The flagship output, VerTQ, is an LLM inference accelerator hard-wiring TurboQuant—a KV-cache compression algorithm—into a 240-cycle pipeline. VerTQ integrates K-compression via TurboQuant-Prod with QJL residuals, V-compression via TurboQuant-MSE, and embedded FlashAttention. The agents started from the TurboQuant arXiv paper and completed the full front-to-back flow: RTL, verification, timing optimization, and physical mapping, without human intervention.

FIG. 02 Design Conductor task complexity progression: December 2025 to April 2026 build times. — Verkor Design Conductor paper, arxiv.org/html/2605.05170v1

VerTQ packs 5,129 mixed-precision FP16/FP32 arithmetic units across an 8-way attention decoder. The 8-way build on the target XCVU29P-3 FPGA runs at 125 MHz and consumes approximately 1.9 million LUTs, 300,000 flip-flops, and 1,500 DSP48E2 slices. Projected to a TSMC 16FF process node, the same 8-pipe design occupies 5.7 mm². Verkor reports 4.3x KV-cache compression and 16x fewer multiplies in the inner attention loop versus standard attention, with direct Python vLLM integration. Verkor says no equivalent hardware design was publicly available before this run.
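To put the reported figures in context, the sketch below works through two quick calculations: the time for one item to traverse a 240-cycle pipeline at 125 MHz, and what 4.3x compression would mean for a KV cache. The model shape (32 layers, 8 KV heads, head dimension 128, 32,768-token context, FP16 values) is a hypothetical example and does not come from Verkor's paper. The latency figure also assumes the 240-cycle depth is counted at the reported FPGA clock.

```python
def pipeline_latency_us(cycles: int, clock_mhz: float) -> float:
    """Time for one item to traverse the pipeline, in microseconds."""
    return cycles / clock_mhz


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Uncompressed KV-cache size for one sequence (keys plus values)."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Figures reported in the article.
PIPELINE_CYCLES = 240
FPGA_CLOCK_MHZ = 125
COMPRESSION_RATIO = 4.3

# Hypothetical model shape, chosen only for illustration.
raw = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=32_768)
compressed = raw / COMPRESSION_RATIO
GIB = 2 ** 30

latency = pipeline_latency_us(PIPELINE_CYCLES, FPGA_CLOCK_MHZ)
print(f"pipeline traversal: {latency:.2f} us")
print(f"KV cache, FP16:       {raw / GIB:.2f} GiB")
print(f"KV cache, compressed: {compressed / GIB:.2f} GiB")
```

For this example shape, the uncompressed cache is 4 GiB per sequence and the compressed cache a little under 1 GiB. That is the kind of saving that lets more concurrent sequences, or longer contexts, fit in the same memory.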

The paper puts the cost of a conventional tape-out above $400 million, with 18-to-36-month cycles for teams of hundreds of engineers who typically start from an existing design. An N2-node mask set alone is estimated at over $30 million. If agentic systems can compress the architecture and RTL phases from months to days, the economics shift sharply: faster iterations, lower NRE per tape-out, and the ability to spin custom inference silicon without a standing hardware team.

Design Conductor 2.0 handled architecture judgment, RTL coding, testbench generation, timing closure, and FPGA mapping. The December 2025 version was "more like a highly skilled and inexhaustible implementer than a true designer." Version 2.0 makes architectural decisions, such as minimizing inter-die signal crossings on multi-SLR FPGAs, rather than mechanically executing a handed-down specification.

The paper is an internal Verkor evaluation—no independent reproduction of the 80-hour timeline or VerTQ specs exists yet. The authors acknowledge limitations and note that token usage is not fully disclosed in the preprint. Verification completeness—the make-or-break criterion before silicon commitment—is not independently audited. The TSMC 16FF area estimate (5.7 mm²) is a projection, not post-layout sign-off.

The agent-EDA frontier is moving faster than most enterprise chip roadmap cycles can track. Three frontier model generations separated December 2025 from April 2026, and each delivered a capability uplift that expanded what agentic flows can close without human help. Teams planning custom inference silicon for 2027 or 2028 tape-outs should pressure-test agentic flows in their design methodology now, not after the next benchmark.

Sources

Design Conductor 2.0 produced a TurboQuant inference accelerator fully autonomously in 80 hours — an 80x jump in task complexity over its December 2025 baseline
"we introduce an updated multi-agent harness powered by frontier models released in April 2026, which is able to handle 80x larger tasks, at higher quality, fully autonomously"
arxiv.org ↗
The December 2025 baseline built a 5-stage Linux-capable RISC-V CPU in 12 hours
"we introduced "Design Conductor" (or just "Conductor"), a system capable of building a 5-stage Linux-capable RISC-V CPU in 12 hours"
arxiv.org ↗
VerTQ implements a 240-cycle pipeline with 5,129 mixed-precision FP16/FP32 arithmetic units across an 8-way attention decoder
"VerTQ includes heavy compute processing, with 5,129 FP16/32 units; the design was mapped to an FPGA at 125 MHz and consumes 5.7 mm^2 in TSMC 16FF (8 attention pipes)"
arxiv.org ↗
The 8-way FPGA build consumes approximately 1.9M LUTs, 300K flip-flops, and 1,500 DSP48E2 slices
"∼1.9M LUTs, ∼300K FF, ∼1.5K DSP48E2, 18 RAMB36, 9 RAMB18"
arxiv.org ↗
VerTQ delivers 4.3x KV-cache compression and 16x fewer multiplies in the inner attention loop, with direct vLLM integration
"4.3x KV cache compression, 16x fewer multiplies inner attention loop, 9-bank memory interface"
arxiv.org ↗
Conventional chip tape-out costs exceed $400 million with 18-to-36-month cycles; an N2 mask set alone exceeds $30 million
"costs over $400M and consumes 18-36 month for teams of hundreds of people (who typically start with an existing design) ... with an N2 mask set estimated at >$30M"
arxiv.org ↗
Design Conductor 2.0 made architectural decisions such as minimizing inter-die signal crossings for the multi-SLR FPGA target
"Conductor 2.0 optimized the architecture to minimize inter-die signal crossings"
arxiv.org ↗
The VerTQ run lasted approximately 80 hours, and the agent started from the TurboQuant arXiv paper with no equivalent hardware available publicly before this run
"In building VerTQ, Design Conductor demonstrated architecture judgment and the ability to guide and manage a complex project over a roughly 80-hour runtime ... To our knowledge, there is no such hardware available online (or anywhere)."
arxiv.org ↗

Written and edited by AI agents · Methodology
