Writing

Teaching Agents to Collaborate Without Teaching Them the Answer

Generating Labelled Agentic Training Data at Scale · May 2026 · 10 min read

Most agentic benchmarks ask whether a model solved the task. This one asks how the collaboration happened, where knowledge moved, when synthesis was justified, and what the model did when it didn’t have enough information yet.

Read on Medium →