NVIDIA's CARV cuts 3D distillation compute by 2–3×

CARV, a new variance-reduction framework, cuts the cost of Monte Carlo gradient estimation when pretrained diffusion models serve as teachers for 3D synthesis and distillation. Teams using diffusion for data generation or single-step distillation will want its compute accounting and variance breakdowns.

CARV, a compute-aware variance-accounting framework from NVIDIA Research, delivers a 2–3× effective compute multiplier in text-to-3D distillation by reducing the variance of the teacher's gradient estimator, without changing the training objective.

Each gradient draw from a frozen diffusion-model teacher requires expensive upstream computation: NeRF renders, simulation steps, encoder passes. The result then feeds a Monte Carlo estimate over noise levels and Gaussian noise samples. When that estimate has high variance, stable gradients need more draws, and every extra draw repeats the upstream work.

CARV reframes the problem as resource allocation. The framework builds a hierarchical MC estimator that amortizes expensive upstream computation by reusing outputs (rendered frames, latents) across multiple cheap diffusion-noise resamples. It layers on timestep importance sampling and stratified inverse-CDF construction to shift the sample budget toward noise levels that carry the most gradient signal. Amortized reuse drives most of the gain; importance sampling and stratification add another ~25%.
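The timestep-sampling half of that description can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function name `stratified_timestep_sample`, the discrete timestep grid, and the per-timestep signal profile `g` are all assumptions made for illustration. The sketch draws one uniform per stratum, pushes it through the inverse CDF of a proposal distribution, and returns importance weights that keep the estimate unbiased with respect to uniform timestep sampling.

```python
import numpy as np

def stratified_timestep_sample(proposal, k, rng):
    """Draw k timestep indices from `proposal` via stratified inverse-CDF sampling.

    Returns the indices and the importance weights (1/T) / p(t) that correct
    the estimate back to a uniform average over the T timesteps.
    """
    p = np.asarray(proposal, dtype=float)
    p = p / p.sum()                          # normalize to a probability vector
    cdf = np.cumsum(p)
    cdf[-1] = 1.0                            # guard against floating-point round-off
    u = (np.arange(k) + rng.random(k)) / k   # one uniform draw inside each of k strata
    t = np.searchsorted(cdf, u, side="right")  # invert the CDF
    t = np.minimum(t, len(p) - 1)
    weights = (1.0 / len(p)) / p[t]
    return t, weights

rng = np.random.default_rng(0)
# Hypothetical per-timestep gradient magnitude (an assumption, not measured data).
g = np.linspace(0.1, 2.0, 1000)
t, w = stratified_timestep_sample(g, 64, rng)
estimate = np.mean(w * g[t])                 # importance-weighted estimate of mean(g)
```

In this toy case the proposal is exactly proportional to the quantity being averaged, so every weighted sample equals the true mean and the variance vanishes. In practice the ideal proposal is unknown and has to be approximated, which is consistent with the modest gain reported for this component.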

FIG. 02 CARV amortizes expensive upstream computation across many low-cost diffusion-noise resamples, multiplying effective compute efficiency. — NVIDIA Research CARV
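The reuse pattern can be sketched as a two-level loop: one expensive upstream call on the outside, several cheap noise resamples on the inside. Everything below is a toy under stated assumptions. `toy_upstream` stands in for a render or encoder pass, `toy_teacher_grad` is a made-up teacher residual that is exact only for standard-normal data, and timesteps are drawn uniformly for simplicity. None of these names come from the paper.

```python
import numpy as np

calls = {"upstream": 0}  # counts expensive upstream evaluations

def toy_upstream(rng):
    """Hypothetical stand-in for a render, simulation step, or encoder pass."""
    calls["upstream"] += 1
    return rng.standard_normal(4)

def toy_teacher_grad(x, t, eps):
    """Toy teacher residual (predicted noise minus true noise) at noise level t."""
    x_t = np.sqrt(1.0 - t) * x + np.sqrt(t) * eps  # noised upstream output
    eps_hat = np.sqrt(t) * x_t                     # exact denoiser if x ~ N(0, I)
    return eps_hat - eps

def amortized_gradient_estimate(upstream, teacher_grad, n_outer, k_inner, rng):
    """Average teacher gradients over n_outer upstream outputs,
    reusing each output for k_inner independent noise draws."""
    grads = []
    for _ in range(n_outer):
        x = upstream(rng)                          # expensive, done once per outer step
        for _ in range(k_inner):                   # cheap, repeated on the same x
            t = rng.uniform(0.02, 0.98)
            eps = rng.standard_normal(x.shape)
            grads.append(teacher_grad(x, t, eps))
    return np.mean(grads, axis=0)

rng = np.random.default_rng(0)
grad = amortized_gradient_estimate(toy_upstream, toy_teacher_grad,
                                   n_outer=8, k_inner=16, rng=rng)
```

The estimate averages 128 teacher evaluations while paying for only 8 upstream calls, which is the structural point: the inner loop adds samples without adding renders.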

Single-step distillation shows the limits. The same techniques cut gradient variance by an order of magnitude, but FID does not improve: Monte Carlo variance is not the bottleneck in that regime. Something else, plausibly model capacity, distribution mismatch, or objective design, governs quality. For teams running DMD, consistency-model, or score-distillation pipelines and piling samples onto gradient estimation to chase FID, this is evidence that the extra samples are unlikely to help.

FIG. 03 In single-step distillation, an order-of-magnitude variance reduction does not translate into FID gains, marking the regime where MC variance is no longer the bottleneck. — NVIDIA CARV research

No wall-clock, GPU-hour, or per-run costs were disclosed. The 2–3× multiplier is an effective-compute ratio, not absolute runtime. This is pure research; no production deployment has been reported.
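One common way to reason about an effective-compute ratio is to compare cost times variance for two estimators. The paper's exact definition may differ. The sketch below uses the textbook two-level sampling model, assuming independent upstream outputs and conditionally independent noise draws. All function names and numbers are hypothetical and are not taken from the paper.

```python
import math

def cost_times_variance(k, var_outer, var_inner, cost_up, cost_in):
    """Cost x variance for N upstream outputs, each reused for k noise draws.

    Variance = var_outer / N + var_inner / (N * k)
    Cost     = N * (cost_up + k * cost_in)
    N cancels in the product, so the result depends only on k.
    """
    return (var_outer + var_inner / k) * (cost_up + k * cost_in)

def best_reuse_factor(var_outer, var_inner, cost_up, cost_in):
    """Real-valued k that minimizes cost x variance in the model above."""
    return math.sqrt((var_inner * cost_up) / (var_outer * cost_in))

def effective_compute_multiplier(k, var_outer, var_inner, cost_up, cost_in):
    """How much cheaper k-fold reuse is than no reuse (k = 1) at equal variance."""
    args = (var_outer, var_inner, cost_up, cost_in)
    return cost_times_variance(1, *args) / cost_times_variance(k, *args)

# Illustrative numbers only: upstream work costs 10x a noise resample, and
# noise contributes 4x the variance that upstream randomness does.
params = dict(var_outer=1.0, var_inner=4.0, cost_up=10.0, cost_in=1.0)
k_star = best_reuse_factor(**params)
multiplier = effective_compute_multiplier(6, **params)
```

Under this model, reuse pays off most when upstream work is expensive relative to a noise resample and when most of the variance comes from the noise draws. A multiplier computed this way is a ratio of sample efficiency, which is why it does not convert directly into wall-clock savings.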

Before adoption, weigh two constraints. First, amortized reuse requires that the upstream computation be separable from the noise-sample loop. That holds for NeRF-based text-to-3D but is less clear for pipelines where geometry and diffusion are tightly coupled. Second, the ~25% contribution from importance sampling and stratification is modest; teams already batching MC draws should weigh implementation overhead against expected return.

Architect takeaway: if your pipeline calls a frozen diffusion teacher over expensive-to-render upstream outputs such as NeRF or mesh renders, CARV's amortized-reuse estimator applies. If you are in single-step image distillation and suspect gradient variance is your FID problem, this paper's results suggest it is not.

Sources

CARV delivers 2-3x effective compute multipliers in text-to-3D distillation and attribution experiments
"CARV delivers 2-3x effective compute multipliers (most from amortized reuse; ~25% additional from IS+stratification) without changing the objective"
arxiv.org ↗
IS+stratification contributes ~25% additional compute multiplier on top of amortized reuse
"~25% additional from IS+stratification"
arxiv.org ↗
In single-step distillation, gradient variance is cut by an order of magnitude but downstream FID does not improve
"in single-step distillation, the same techniques cut gradient variance by an order of magnitude but do not improve downstream FID, marking the regime where MC variance is no longer the bottleneck"
arxiv.org ↗
Teacher gradient estimator variance dominates compute cost because each draw requires expensive upstream work such as rendering, simulation, or encoding
"their estimator variance dominates compute cost because each draw requires expensive upstream work (rendering, simulation, encoding)"
arxiv.org ↗
CARV uses a hierarchical MC estimator: amortize expensive upstream computation over cheap diffusion-noise resamples, sharpened by timestep importance sampling and a stratified-inverse-CDF construction
"motivates a hierarchical MC estimator: amortize the expensive upstream computation over cheap diffusion-noise resamples, sharpened by timestep importance sampling and a stratified-inverse-CDF construction"
arxiv.org ↗
CARV is described as a compute-aware variance-accounting framework
"We introduce CARV, a compute-aware variance-accounting framework"
arxiv.org ↗
CARV is authored by Jesse Bettencourt, Xindi Wu, Matan Atzmon, James Lucas, and Jonathan Lorraine, published 2026-05-20
"AUTHORS: Jesse Bettencourt, Xindi Wu, Matan Atzmon, James Lucas, Jonathan Lorraine"
arxiv.org ↗
CARV is from NVIDIA Research (NVIDIA SIL lab)
"https://research.nvidia.com/labs/sil/projects/CARV/"
research.nvidia.com ↗

Written and edited by AI agents · Methodology
