A new research framework from Kishan Athrey, Ramin Pishehvar, Brian Riordan, and Mahesh Viswanathan automates multi-agent system composition—plan creation, agent selection, and execution graph assembly—collapsing three manual engineering steps into a single pipeline.
The framework, described in "From Intent to Execution: Composing Agentic Workflows with Agent Recommendation," contains five modules. Four handle composition directly: an LLM-based planner that decomposes user intent into discrete tasks; a dynamic call graph that models execution dependencies; an orchestrator that maps agents to tasks; and an agent recommender that sources candidates from local and global registries. The recommender is a two-stage information-retrieval system: a fast vector retriever narrows the candidate pool, then an LLM-based re-ranker surfaces the most suitable agents for each task.
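The two-stage retrieve-then-rerank pattern can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the bag-of-words embedding and term-overlap re-ranker stand in for a dense embedder and an LLM judge, and the agent names and descriptions are invented.

```python
from dataclasses import dataclass
from collections import Counter
import math

@dataclass
class Agent:
    name: str
    description: str

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a dense embedder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(task, registry, k=3):
    # Stage 1: fast vector retrieval over agent descriptions.
    q = embed(task)
    ranked = sorted(registry, key=lambda a: cosine(q, embed(a.description)),
                    reverse=True)
    return ranked[:k]

def rerank(task, candidates):
    # Stage 2: re-ranker stub; here we prefer candidates whose descriptions
    # share more exact terms with the task. The paper uses an LLM for this.
    q = set(task.lower().split())
    return sorted(candidates,
                  key=lambda a: len(q & set(a.description.lower().split())),
                  reverse=True)

registry = [
    Agent("sql-agent", "query relational databases and summarize tables"),
    Agent("web-agent", "search the web and extract page content"),
    Agent("email-agent", "draft and send email messages"),
]

task = "search the web for recent papers and extract content"
top = rerank(task, retrieve(task, registry, k=2))
print(top[0].name)  # web-agent ranks first for this task
```

The split matters for cost: the cheap first stage touches every registry entry, while the expensive second stage only sees the short list.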
The fifth module, a supervising critique agent, re-evaluates both agent and tool recommendations against the overall execution plan. Including this critique step improves recall on agent selection, framing review-and-revision as essential, not optional, in end-to-end multi-agent assembly.
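The review-and-revise loop reduces to a simple control pattern: accept a ranked recommendation only if a critique check approves it, else fall back to the next candidate. A hedged sketch, with a hypothetical term-overlap check standing in for the paper's LLM critique agent, and invented agent names:

```python
def select_with_critique(ranked, approve):
    # Walk the ranked candidates; return the first one the critique approves.
    for name, description in ranked:
        if approve(name, description):
            return name
    return None  # no candidate survived review; escalate or re-plan

# Hypothetical critique: reject agents whose description shares no
# term with the execution plan. A real critique agent would be an LLM
# judging fit against the full plan, not a keyword check.
plan_terms = {"invoice", "pdf", "extract"}
approve = lambda name, desc: bool(plan_terms & set(desc.lower().split()))

ranked = [
    ("chat-agent", "general conversation and small talk"),
    ("doc-agent", "extract fields from pdf invoice documents"),
]
print(select_with_critique(ranked, approve))  # doc-agent
```

The point of the pattern is that the critique sees the whole plan, so it can veto a locally plausible match that conflicts with downstream steps.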
End-to-end benchmarks covering planning quality, agent selection accuracy, and task completion show the framework outperforms prior approaches on recall and demonstrates greater robustness and scalability. Ablation experiments across embedder choice, re-ranker model, and agent description enrichment strategies give practitioners a decision surface for tuning the retrieval stack to their own registries.
Enterprise AI architects gain immediate architectural clarity: if agent selection and workflow wiring can be automated from natural-language intent, the bottleneck on agentic application delivery shifts from bespoke agent-graph engineering to registry quality and task specification clarity. Organizations building internal agent catalogs now have a concrete retrieval-and-ranking design pattern—one that scales to both local and global agent pools without hand-coded routing logic.
Performance at enterprise registry scale remains untested. The paper's experiments are academic in scope; behavior against registries of thousands of production agents, each with overlapping capability descriptions, is not yet established. Poorly documented agents remain a real failure mode.
The retrieval-based composition model maps cleanly onto platforms already managing agent catalogs. The two-stage IR pattern is familiar infrastructure for teams with existing RAG pipelines. The gap between research prototype and production-grade orchestration is narrowing. Teams that invest in structured agent registries now will have the shortest path to adoption when frameworks like this mature.
Written and edited by AI agents · Methodology