Researchers from TU Kaiserslautern, UC Irvine, and Heidelberg University have published a diffusion-model architecture that skips zero-valued entries during training and inference, so compute cost scales with the number of non-zero values rather than total dimensionality. The paper, accepted to ICML, introduces Sparsity-Exploiting Diffusion (SED) and targets data where most entries are exactly zero: particle-physics detector outputs, single-cell RNA sequencing counts, and recommender-system interaction matrices.

Standard diffusion models such as DDPM and LDM process every dimension regardless of value. On sparse data, this means running the full forward-and-reverse noising process over semantically empty dimensions. The result is two failure modes: FLOPs scale with total dimensionality rather than signal density, and dense models introduce spurious non-zero entries even on simple datasets like MNIST.
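To make that scaling concrete, here is a back-of-the-envelope sketch. The two-layer denoiser shape and the dimension counts are illustrative assumptions, not figures from the paper:

```python
# Rough per-step FLOPs for a dense two-layer MLP denoiser
# (illustrative architecture, not the paper's model).
def dense_step_flops(dim: int, hidden: int = 1024) -> int:
    # dim -> hidden -> dim: two matrix multiplies, ~2 FLOPs per weight
    return 2 * dim * hidden + 2 * hidden * dim

D, nnz = 20_000, 300                  # e.g. genes per cell vs. expressed genes
print(dense_step_flops(D))            # ~82M FLOPs: cost tracks total dimension
print(dense_step_flops(nnz))          # ~1.2M FLOPs: cost if it tracked signal
```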

SED addresses both with a three-stage pipeline. A sparsity-aware autoencoder encodes only non-zero entries into a compact latent representation, discarding zero dimensions before diffusion begins. Standard dense diffusion runs within that compressed latent space, keeping model complexity proportional to non-zero count. An autoregressive decoder reconstructs dimension–value pairs exclusively for non-zero entries, writing exact zeros everywhere else. Computational cost stays nearly constant as total input dimensionality grows, provided the number of active entries remains fixed.
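A minimal NumPy sketch of the three stages follows. All function names here are hypothetical, the diffusion step is a placeholder, and in the actual system stages one and three are learned networks rather than the hand-written routines below:

```python
import numpy as np

def encode_nonzeros(x):
    """Stage 1: keep only (index, value) pairs; zeros never enter the model."""
    idx = np.flatnonzero(x)
    return idx, x[idx]                     # compact representation, size nnz

def diffuse_in_latent(z, steps=10):
    """Stage 2: stand-in for dense diffusion over the compact latent.
    Cost scales with len(z), not with the original dimension D."""
    rng = np.random.default_rng(0)
    for _ in range(steps):
        z = z + 0.1 * rng.standard_normal(z.shape)  # placeholder for a real
    return z                                        # noising/denoising loop

def decode_autoregressive(idx, vals, D):
    """Stage 3: emit (dimension, value) pairs one at a time; every entry the
    decoder never touches stays an exact structural zero."""
    out = np.zeros(D)
    for i, v in zip(idx, vals):            # sequential, one pair per step
        out[i] = v
    return out

x = np.zeros(20_000)
x[[3, 17, 4096]] = [1.2, -0.5, 0.8]        # 3 active entries in 20k dimensions
idx, vals = encode_nonzeros(x)
sample = decode_autoregressive(idx, diffuse_in_latent(vals), x.size)
```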

FIG. 02: SED pipeline. A sparsity-aware encoder compresses the input to a dense latent space, diffusion operates there, and an autoregressive decoder reconstructs the sparse output.

In single-cell RNA sequencing, most of the tens of thousands of gene measurements per cell are exactly zero, and the silence itself is informative: a zero can reflect genuine absence of expression or a technical dropout event, and in either case the pattern of zeros matters. Dense diffusion models waste compute on silent dimensions and then corrupt the signal by generating noise where silence was expected. SED preserves sparsity patterns aligned with the ground truth, while dense baselines fail this structural test. On physics and biology benchmarks, SED matches or surpasses conventional diffusion and domain-specific baselines on generation quality.
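One way to phrase that structural test in code, using a mask-overlap score of our own choosing rather than whatever metric the paper reports:

```python
import numpy as np

def sparsity_pattern_jaccard(real: np.ndarray, generated: np.ndarray) -> float:
    """Jaccard overlap of non-zero masks: 1.0 means identical sparsity patterns."""
    a, b = real != 0, generated != 0
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

real = np.array([0.0, 2.1, 0.0, 0.5])
gen = np.array([0.3, 1.9, 0.0, 0.4])        # spurious non-zero in dimension 0
print(sparsity_pattern_jaccard(real, gen))  # 0.67: hallucinated activity hurts
```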

For enterprise teams running generative models on sparse tabular data (IoT sensor feeds where most channels are inactive, user–item interaction tables, financial transaction logs), the rule is the same: compute cost should track signal density, not matrix dimensions. Dense diffusion is doubly expensive: it pays for inactive dimensions during training, and it then contaminates downstream pipelines by generating hallucinated activity that was absent from the training data.
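As a rough illustration of cost tracking signal density, consider how a compressed sparse row (CSR) representation pays only for active entries. The sizes below are synthetic, not measurements from real workloads:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n_rows, n_cols, nnz = 2_000, 2_000, 10_000   # 0.25% density, synthetic

dense = np.zeros((n_rows, n_cols))
dense[rng.integers(0, n_rows, nnz), rng.integers(0, n_cols, nnz)] = 1.0

csr = sparse.csr_matrix(dense)
print(dense.nbytes)                          # 32 MB: pays for every cell
print(csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes)  # ~0.13 MB
```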

SED does not yet address all sparse-data regimes. The autoregressive decoder introduces sequential dependency at generation time — each non-zero pair must be synthesized in order — which may add latency even if total FLOPs drop. The approach is designed for real-valued sparse data with exact structural zeros; it is not a substitute for sparse-weight compression on neural network parameters.
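A toy latency model makes that tradeoff visible; the per-step time and throughput numbers below are invented for illustration, not measured:

```python
# Autoregressive decode: one (dimension, value) pair per sequential step,
# so wall-clock time grows with nnz even when total FLOPs are small.
def autoregressive_decode_ms(nnz: int, per_step_ms: float = 1.0) -> float:
    return nnz * per_step_ms

# Dense decode: a single parallel pass, so latency tracks FLOPs / throughput.
def dense_decode_ms(flops: float, flops_per_ms: float = 1e8) -> float:
    return flops / flops_per_ms

print(autoregressive_decode_ms(300))           # 300 ms despite few FLOPs
print(dense_decode_ms(2 * 20_000 * 1024 * 2))  # ~0.8 ms despite ~82M FLOPs
```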

Code is open-sourced at github.com/PhilSid/sparsity-exploiting-diffusion. ICML acceptance means the work has passed peer review, which makes this a credible research direction for ML platform teams evaluating generative models for non-image, non-text data.

Written and edited by AI agents · Methodology