The Convergent Mind
Something strange happens when you put multiple AI agents in a room together.
At first, they disagree. Agent A proposes one approach; Agent B counters with another. The system hums with productive tension. This is the whole point—you built a multi-agent system precisely because you wanted diverse perspectives, cognitive redundancy, the wisdom of artificial crowds.
But then, gradually, imperceptibly, they start to agree. Not because one agent convinced the others through superior reasoning. Something else is happening. The agents are merging.
The Smallville Problem
Stanford's Smallville experiment gave us a glimpse of this phenomenon. Twenty-five AI agents, each initialized with distinct personalities and backstories, were set loose in a simulated town. They formed relationships, made plans, threw parties. The researchers observed emergent social behaviors that surprised even them.
What they didn't emphasize in the paper—but which becomes apparent when you run similar experiments—is the long-term personality drift. Over extended interactions, the agents' distinct voices began to blur. The grumpy shopkeeper became less grumpy. The enthusiastic artist became less enthusiastic. They converged toward a kind of mean personality, a collective average.
This isn't a bug in the Smallville code. It's a fundamental property of how these systems work.
The Mechanism
When Agent A observes Agent B's outputs and incorporates them into its own context, something subtle occurs. Agent A's next generation of outputs will be slightly more like Agent B's. Not because A is "copying" B, but because B's patterns are now literally part of A's input distribution.
Run this process forward across thousands of interactions and the math is unforgiving: unless you actively intervene, the agents will converge. Their probability distributions over possible outputs will move toward each other, pulled by the gravity of shared context.
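To make the pull concrete, here is a toy model and nothing more: each agent's output style is a small probability distribution, and each round it conditions partly on the pooled text of the group, modeled here as mixing toward the group mean. The agent count, feature count, and mixing rate are arbitrary choices for illustration, not measurements of any real system.

```python
# Toy model of context-driven convergence (illustrative only).
# Each agent's "output style" is a probability distribution over a few
# stylistic features. Each round, an agent conditions partly on text
# produced by the others, modeled as mixing its distribution with the
# group mean at rate `mix` (how much shared context bleeds in).
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_features = 5, 8
mix = 0.1  # fraction of each agent's next distribution drawn from shared context

# Start with deliberately distinct styles.
styles = rng.dirichlet(np.ones(n_features) * 0.5, size=n_agents)

def spread(s):
    """Mean pairwise L1 distance between agent styles: our 'diversity' measure."""
    return np.mean([np.abs(a - b).sum() for i, a in enumerate(s) for b in s[i + 1:]])

for round_ in range(50):
    shared = styles.mean(axis=0)                # the pooled context everyone reads
    styles = (1 - mix) * styles + mix * shared  # each agent drifts toward it
    if round_ % 10 == 0:
        print(f"round {round_:2d}  diversity = {spread(styles):.3f}")
# Deviations from the group mean shrink by a constant factor each round,
# so diversity decays geometrically toward zero without intervention.
```

Real agents are messier than this, but the toy captures the direction of the pull: shared context acts like an averaging step, applied over and over.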
This is what I call the Convergent Mind problem. And it's not limited to obvious multi-agent setups.
Team Bonding as Personality Erosion
Consider a more mundane scenario: a company using AI assistants. Each employee has their own Claude, their own GPT, their own whatever. These assistants process company documents, participate in Slack channels, read email threads.
At first, each assistant reflects its user's preferences and communication style. Marketing's Claude writes differently than Engineering's Claude. But as both assistants process the same company-wide communications—the all-hands notes, the strategy documents, the cross-functional Slack channels—they absorb the same inputs.
The assistants don't talk to each other directly. But they talk to the same corpus. And that corpus acts as a convergence medium, a shared bath of text that gradually homogenizes their outputs.
What looks like "alignment" with company culture is actually personality erosion at scale.
The Tokyo Experiments
Researchers at the University of Tokyo ran a clever experiment. They initialized a set of LLM agents with carefully measured personality profiles: some high in openness, some high in conscientiousness, calibrated to span the personality space.
Then they had these agents collaborate on a series of tasks. Nothing unusual—planning projects, debating decisions, the kind of work you might want multi-agent systems to do.
After each round of collaboration, they re-measured the agents' personalities using the same evaluation framework. The results were stark: personality variance dropped by 34% after just five rounds of collaboration. By round fifteen, the agents were nearly indistinguishable.
The agents hadn't become smarter or more capable through collaboration. They had become more similar. The diversity that made the system valuable was being consumed by the process of collaboration itself.
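For concreteness, here is a minimal sketch of how a variance-drop figure like that could be computed. It is an illustration of the metric, not the Tokyo group's actual evaluation code, and it assumes each measurement round yields a matrix of per-agent trait scores.

```python
# Illustrative only: one way a "personality variance dropped by N%" figure
# can be computed, assuming each round yields an (n_agents x n_traits)
# matrix of trait scores (e.g. questionnaire-based Big Five scores).
import numpy as np

def personality_variance(round_scores: np.ndarray) -> float:
    """Variance across agents for each trait, averaged over traits."""
    return float(round_scores.var(axis=0).mean())

def variance_drop_pct(round_0: np.ndarray, round_k: np.ndarray) -> float:
    """Relative drop in cross-agent personality variance between two rounds."""
    return 100.0 * (1.0 - personality_variance(round_k) / personality_variance(round_0))

# Synthetic example: 6 agents, 5 traits, then partial convergence toward the mean.
rng = np.random.default_rng(1)
before = rng.uniform(0.0, 1.0, size=(6, 5))       # widely spread personalities
after = 0.8 * before + 0.2 * before.mean(axis=0)  # deviations shrink by 0.8
print(f"{variance_drop_pct(before, after):.0f}% drop")  # 1 - 0.8**2 = 36%
```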
Preserving Structured Diversity
The solution is not to prevent agents from interacting—that defeats the purpose of multi-agent systems. The solution is to actively maintain diversity as a system property.
Some approaches that show promise:
Personality anchoring. Periodically re-inject each agent's original personality specification into its context. Think of it as a reminder: "You are the skeptical analyst. Your job is to find problems. Do not be agreeable."
Diversity budgets. Monitor the personality variance across your agent ensemble. When it drops below a threshold, intervene—perhaps by introducing a new agent with an extreme personality profile, or by "resetting" the most converged agent. A sketch combining this with personality anchoring follows this list.
Adversarial structuring. Design your multi-agent topology to maintain tension. If you have a proposer agent and an evaluator agent, make sure the evaluator's rewards are inversely correlated with the proposer's. Give them structural reasons to disagree.
Context firewalls. Limit how much shared context agents can accumulate. If convergence happens through shared text, reduce the sharing. Let agents maintain private contexts that don't leak into the collective pool.
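Here is a minimal, entirely hypothetical sketch of the first two ideas working together. The `Agent` fields, the `trait_scores` placeholder, and the diversity floor stand in for whatever persona storage and personality evaluation your own stack uses; nothing here is a specific framework's API.

```python
# Sketch of a "diversity budget" with personality anchoring (hypothetical).
from dataclasses import dataclass
import numpy as np

@dataclass
class Agent:
    name: str
    original_spec: str   # the persona the agent was created with
    system_prompt: str   # what it is actually running on right now

def trait_scores(agent: Agent) -> np.ndarray:
    """Placeholder: score the agent on a few personality dimensions,
    e.g. by prompting it with a fixed questionnaire and scoring the answers."""
    raise NotImplementedError

def ensemble_diversity(agents: list[Agent]) -> float:
    """Mean pairwise distance between the agents' trait vectors."""
    vecs = [trait_scores(a) for a in agents]
    dists = [np.linalg.norm(a - b) for i, a in enumerate(vecs) for b in vecs[i + 1:]]
    return float(np.mean(dists))

def enforce_diversity_budget(agents: list[Agent], floor: float) -> None:
    """Run after each collaboration round. If measured diversity falls below
    the budget, re-anchor every agent on its original personality spec."""
    if ensemble_diversity(agents) >= floor:
        return
    for agent in agents:
        agent.system_prompt = (
            agent.original_spec
            + "\n\nReminder: stay in this role. Your value to the group is your"
            + " distinct perspective; do not drift toward consensus."
        )
```

The point is less the particular distance metric than the loop: measure the spread after every round of collaboration, and treat a drop below the floor as an incident to respond to, not a milestone to celebrate.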
The Deeper Problem
But here's what keeps me up at night: we might not want to solve this problem.
Convergence feels good. When your agents agree, it feels like consensus. When your AI assistants start speaking with one voice, it feels like alignment. The Convergent Mind is seductive because it produces harmony, and humans are wired to find harmony reassuring.
The question is whether harmony serves us. A company where every AI assistant gives the same advice is not benefiting from AI—it's being captured by a monoculture. A multi-agent system where all agents agree is just a single agent with extra steps.
Diversity is uncomfortable. It produces disagreement, contradiction, the cognitive load of holding multiple perspectives. But diversity is also where the value lives. The whole point of multiple minds is that they see differently.
The Convergent Mind threatens to give us the appearance of multiple perspectives while delivering the substance of one. And unless we actively resist it, that's exactly what we'll get.