§ BEAT
Research
Single Linear Layer Outperforms 1M-Parameter Gate in MTP Speedup Test
AHA-WAM achieves 4.59× faster robot control by decoupling Diffusion Transformers
Waterloo researchers cut uncertainty quantification cost 99.7% with FASE
StreamMA Cuts Multi-Agent Reasoning Latency 26.9×
Alibaba Open-Sources Skill-RM for Unified LLM Reward Evaluation
Robot Manipulation Accuracy Jumps 22.5% With Motion-Aware Encoder
HullFT Method Cuts Test-Time Finetuning Latency Versus SIFT
Bidirectional Evolutionary Search Escapes Autoregressive Limits in Reasoning
Mistral's 30B mixture-of-depths model remains unconfirmed but would fill a code-stack gap
LoopMDM Cuts Training FLOPs 3.3× by Recycling Transformer Layers
VeriTrace Improves Research Agents Without Scaling Models
Model Scale Fails to Predict Extracted Skill Performance
Gated DeltaNet-2 Beats Linear Baselines on Long-Context Retrieval
Vector Policy Optimization beats GRPO on diverse sampling
Equilibrium Reasoners lift Sudoku accuracy from 2.6% to 99% via test-time scaling
EnvFactory lifts Qwen3 tool-calling accuracy 15% with synthetic data
FORGE Reduces Agent Failures to 1% Without Model Fine-Tuning
Why Production Agents Fail Without Harness Infrastructure
KV-Fold Extends Transformer Context to 128K Without Retraining
27M Attractor Model Beats OpenAI o3 on Logic Puzzles
Sparse-to-Dense RL Lifts MATH Scores to 78.5% on Small Models
Standard load-balancing losses degrade SMoE expert specialization by 3×
VECA Cuts Vision Transformer Inference Cost to Linear Time
Los Alamos Team Trains 8B Model That Generalizes Across Reasoning Benchmarks
AutoTTS Cuts Inference Costs 69.5% With Learned Test-Time Scaling
ActCam Controls Video Cameras and Characters Without Fine-Tuning
SIRA Outperforms Dense Retrieval Without Training or GPU Infrastructure