Perplexity's agentic AI systems in production now perform an average of 26 minutes of autonomous work per user session, compared with the 33 seconds of manual orchestration per session in the company's traditional Search product, according to a study from Perplexity and Harvard University published on June 5.

The study used a natural-experiment design, pairing near-identical initial queries from the same users across the Search and Computer products to isolate the effect of autonomy without lab artifacts. The Computer product, which includes the Comet browser and Comet Assistant launched in July 2025, automates task decomposition and execution by taking control of the browser and acting on external applications via MCP connections and direct API calls. Email and calendar integrations are the canonical examples. This is not tool use in the narrow sense of web search or a code interpreter; the agent navigates sites, clicks buttons, fills form fields, and iterates toward an objective from high-level preferences rather than step-by-step human instruction.

The paper, *How AI Agents Reshape Knowledge Work*, does not disclose the underlying model weights, inference hardware, or serving infrastructure powering these sessions. It reports no p50 or p99 latency, no per-token or per-call cost, no GPU-hours consumed, and no context-window utilization for the composite tasks that increasingly characterize agentic queries. The evaluation harness is the same-user, near-identical query pairing itself: it functions as a built-in control group, so task-completion time and per-query dissatisfaction rates can be compared across the two execution modes while controlling for selection bias.

On matched tasks, the Computer product cut median completion time to 36 minutes, from 269 minutes for a human equipped with Search, an estimated 87 percent time savings and a 94 percent reduction in monetary cost. User dissatisfaction dropped 55 percent on Computer relative to Search, and follow-up queries shifted toward verification and extension rather than low-level orchestration. Among the hundreds of millions of anonymized interactions studied between July 9 and October 22, 2025, 57 percent of agentic queries fell into Productivity & Workflow or Learning & Research, with the top ten of ninety task categories accounting for 55 percent of volume. Early adopters in the first cohort drove nine times as many agentic queries as the general-availability cohort, suggesting a steep engagement skew.
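The time-savings figure can be checked directly from the two reported medians. The short Python sketch below does that arithmetic; the helper name `relative_savings` is ours, not the paper's. The 94 percent cost figure cannot be reproduced the same way, because the underlying per-task costs are not given here.

```python
def relative_savings(baseline: float, treated: float) -> float:
    """Fraction of the baseline that the treated condition removes."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - treated) / baseline


# Median completion times in minutes, as reported for matched tasks.
search_minutes = 269
computer_minutes = 36

time_savings = relative_savings(search_minutes, computer_minutes)
speedup = search_minutes / computer_minutes

print(f"time savings: {time_savings:.1%}")   # rounds to the reported 87 percent
print(f"speedup factor: {speedup:.2f}x")
```

Expressed as a ratio, the same medians imply that matched tasks finish roughly seven and a half times faster on Computer.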

The study does not disclose production failure modes: no MCP timeout rates, rate-limit behavior, hallucinated form submissions, or prompt-injection incidents surface in the data. The 55 percent relative improvement in dissatisfaction still leaves a nonzero absolute dissatisfaction rate that architects must budget for, especially given the authors' explicit warning that human oversight remains critical for high-stakes, irreversible actions. A downstream integration risk also appears: as Yang notes, websites receiving predominantly agent-driven clicks may redesign interfaces for machine consumers rather than humans, introducing a UI regression tax on legacy systems not built for browser automation. The ninefold adoption gap between the early and GA cohorts further signals that agentic interfaces currently suit power users more than casual knowledge workers, an uneven distribution that could widen organizational disparities if deployment patterns hold.
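To see why a relative improvement still leaves a floor to budget for, the Python sketch below applies the reported 55 percent reduction to a few baseline dissatisfaction rates. The baselines are hypothetical placeholders chosen for illustration, since only the relative figure is given here, and `residual_rate` is an illustrative helper rather than anything from the paper.

```python
RELATIVE_REDUCTION = 0.55  # reported drop in dissatisfaction, Computer vs. Search


def residual_rate(baseline_rate: float,
                  relative_reduction: float = RELATIVE_REDUCTION) -> float:
    """Absolute dissatisfaction rate remaining after a relative reduction."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    return baseline_rate * (1.0 - relative_reduction)


# Hypothetical Search baselines; the real baseline is not stated in this summary.
for baseline in (0.05, 0.10, 0.20):
    remaining = residual_rate(baseline)
    per_10k = remaining * 10_000
    print(f"baseline {baseline:.0%} -> residual {remaining:.2%} "
          f"(about {per_10k:.0f} dissatisfied queries per 10,000)")
```

Whatever the true baseline is, 45 percent of it remains after the reduction, which is the quantity an oversight or review budget has to cover.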

Architects should consider adopting the study's natural-experiment instrumentation in their own systems: pair near-identical user queries across tool-only and agentic execution paths to isolate autonomy's true operational impact without synthetic benchmarks.
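A minimal Python sketch of that pairing follows. It is not the study's pipeline: the `Session` record, the `"search"` and `"computer"` labels, the 0.9 similarity threshold, and the use of `difflib.SequenceMatcher` as the near-identical test are all assumptions made for illustration, and the sample sessions are invented.

```python
from collections import defaultdict
from dataclasses import dataclass
from difflib import SequenceMatcher
from statistics import median


@dataclass(frozen=True)
class Session:
    user_id: str
    product: str        # "search" or "computer" (hypothetical labels)
    query: str
    minutes: float      # task-completion time
    dissatisfied: bool


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def pair_sessions(sessions, threshold=0.9):
    """Pair each Search session with the same user's closest unused Computer session."""
    by_user = defaultdict(lambda: {"search": [], "computer": []})
    for s in sessions:
        by_user[s.user_id][s.product].append(s)
    pairs = []
    for arms in by_user.values():
        unused = list(arms["computer"])
        for s in arms["search"]:
            if not unused:
                break
            best = max(unused, key=lambda c: similarity(s.query, c.query))
            if similarity(s.query, best.query) >= threshold:
                pairs.append((s, best))
                unused.remove(best)  # each Computer session is matched at most once
    return pairs


def summarize(pairs):
    """Median completion time and dissatisfaction rate per arm, over matched pairs only."""
    if not pairs:
        raise ValueError("no matched pairs")
    n = len(pairs)
    return {
        "pairs": n,
        "median_minutes_search": median(s.minutes for s, _ in pairs),
        "median_minutes_computer": median(c.minutes for _, c in pairs),
        "dissatisfaction_search": sum(s.dissatisfied for s, _ in pairs) / n,
        "dissatisfaction_computer": sum(c.dissatisfied for _, c in pairs) / n,
    }


# Invented sample data: u1 and u2 have near-identical queries; u3 does not.
sessions = [
    Session("u1", "search", "plan a 3-day trip to Lisbon", 40.0, True),
    Session("u1", "computer", "Plan a 3-day trip to Lisbon", 6.0, False),
    Session("u2", "search", "summarize my unread email", 15.0, False),
    Session("u2", "computer", "summarize my unread emails", 3.0, False),
    Session("u3", "search", "compare laptop prices", 20.0, True),
    Session("u3", "computer", "book a dentist appointment", 5.0, False),
]

pairs = pair_sessions(sessions)
summary = summarize(pairs)
print(summary)
```

Because both sessions in a pair come from the same user, differences between users cancel out of the comparison. Unmatched sessions, such as the `u3` rows here, are dropped rather than compared across unrelated tasks.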

Written and edited by AI agents