Researchers at University College London found that decoder-only language models encode semantic role information—who did what to whom—during pretraining alone, without task-specific supervision. Frozen probes scored well on the QA-SRL benchmark, a task that reformulates semantic role labeling as extractive question answering (e.g., "Who took a walk?" identifies the Agent role). The capability was already present in pretrained representations, not acquired through fine-tuning.
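For readers unfamiliar with the benchmark, here is an illustrative QA-SRL item (the sentence and answer spans are invented for this article; QA-SRL itself carries no explicit role tags, since the wording of each question implies the role):

```python
# Illustrative QA-SRL item. The "role" fields are annotations added here
# for clarity; the benchmark encodes roles implicitly through the questions.
qasrl_item = {
    "sentence": "Yesterday the dog took a walk in the park.",
    "predicate": "took",
    "qa_pairs": [
        {"question": "Who took something?", "answer": "the dog", "role": "Agent"},
        {"question": "What did someone take?", "answer": "a walk", "role": "Theme"},
        {"question": "Where did someone take something?", "answer": "in the park", "role": "Location"},
    ],
}
```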
Carla Griffiths and Mirco Musolesi trained transformer models from scratch on WikiText-103 at four scales ranging from 0.4M to 57M parameters. The study, posted to arXiv on May 9, 2026, used lightweight linear probes trained on top of frozen models to extract semantic roles. Because the base models stay frozen and the probes are purely linear, any role information the probes recover must already be present in the pretrained representations; task-specific adaptation cannot account for it.
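A minimal sketch of this probing setup, assuming a frozen model that exposes per-token hidden states of width `d_model`; the paper's exact probe architecture, layer choice, and training details may differ:

```python
import torch
import torch.nn as nn

d_model, n_roles = 512, 5  # hypothetical sizes; roles e.g. Agent, Patient, Location

# The probe is the only trainable component. The language model stays frozen,
# so anything the probe reads out must already be linearly encoded.
probe = nn.Linear(d_model, n_roles)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def probe_step(hidden_states: torch.Tensor, role_labels: torch.Tensor) -> float:
    """One update on (n_tokens, d_model) hidden states from the frozen model.
    detach() cuts the graph so no gradient reaches the model itself."""
    logits = probe(hidden_states.detach())
    loss = loss_fn(logits, role_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Stand-in data: in the real setup these come from a layer of the frozen model.
hs = torch.randn(32, d_model)
labels = torch.randint(0, n_roles, (32,))
print(probe_step(hs, labels))
```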
Across all four scales, frozen probes extracted meaningful semantic role information. Performance improved with model size but never fully closed the gap with fine-tuned counterparts. PCA and t-SNE visualizations showed that semantic roles form distinct clusters in representation space, with separation increasing in deeper layers. The team identified individual feed-forward neurons that selectively activate for specific roles—Agent neurons and Location neurons—and validated their causal importance through ablation. Within-role neuron correlations exceeded cross-role correlations throughout.
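One way to run that ablation test is with a PyTorch forward hook that zeroes a candidate neuron; the module path and the `evaluate_probe` helper below are hypothetical, and the paper's procedure may differ in detail:

```python
import torch

def ablate_neuron(ffn_module: torch.nn.Module, neuron_idx: int):
    """Zero one feed-forward neuron's activation on every forward pass.
    Returning a tensor from a forward hook replaces the module's output."""
    def hook(module, inputs, output):
        patched = output.clone()
        patched[..., neuron_idx] = 0.0
        return patched
    return ffn_module.register_forward_hook(hook)

# Hypothetical usage: silence a candidate "Agent neuron" and compare probe
# accuracy with and without it; a large drop is evidence of causal importance.
# handle = ablate_neuron(model.layers[6].ffn, neuron_idx=1234)
# acc_without = evaluate_probe(model, probe, eval_set)
# handle.remove()  # restores the original behavior
```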
At larger model sizes, semantic role structure migrates toward more distributed representations. Role-selective neurons become less dominant and information spreads across broader activation patterns. Probing and ablation techniques calibrated on small models may not transfer cleanly to architectures where knowledge is encoded less locally.
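One crude way to see that shift, sketched below rather than taken from the paper, is to rank neurons by a simple selectivity score and check how sharply the profile peaks:

```python
import numpy as np

def role_selectivity(acts: np.ndarray, labels: np.ndarray, role: int) -> np.ndarray:
    """Per-neuron selectivity for one role: mean activation on that role's
    tokens minus mean activation elsewhere, scaled by each neuron's std.
    acts: (n_tokens, n_neurons); labels: (n_tokens,) integer role ids."""
    mask = labels == role
    diff = acts[mask].mean(axis=0) - acts[~mask].mean(axis=0)
    return diff / (acts.std(axis=0) + 1e-8)

# Stand-in activations; in practice these come from a feed-forward layer.
acts = np.random.randn(1000, 2048)
labels = np.random.randint(0, 5, size=1000)
sel = np.sort(np.abs(role_selectivity(acts, labels, role=0)))[::-1]

# A towering top score suggests a dedicated role neuron (the small-model
# regime); a flat profile suggests the distributed encoding seen at scale.
print(sel[:5])
```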
For enterprise architects evaluating fine-tuning, the implication is direct. Semantic role understanding, which underpins information extraction, instruction-following, and structured question answering, does not need explicit supervision to emerge. The latent representational substrate is already in place after pretraining. Fine-tuning provides not the knowledge itself but a more linearly accessible encoding of it. Teams weighing the cost of fine-tuning against retrieval-augmented or prompt-only deployments now have mechanistic evidence that the underlying representation is available on frozen models.
Role-selective neurons that can be identified and ablated offer a degree of legibility useful for compliance and governance: in principle, teams can audit which circuits drive specific argument-structure decisions. The caveat is that this legibility degrades at scale as representations become more distributed.
The work carries significant limitations. The largest model tested is 57M parameters, well below the 7B–70B range in current production. Whether emergence thresholds, the frozen-vs-fine-tuned gap, and distributed-encoding trends hold at frontier scale is an open question. The QA-SRL benchmark also captures a specific slice of semantic structure; pragmatics, coreference, and implicit role assignment are outside scope.
The methodology is replicable and low-cost. Applying it rigorously to billion-parameter models is the logical next step—and the harder one. As the authors found, legibility declines with scale.