When does an AI stop being a tool and start having a mind?

When does repeated interaction with an AI cross a threshold from using a tool to perceiving a mind? This paper proposes a longitudinal study to identify the design features that drive mind attribution in AI, using trained human confederates as a methodological baseline. It was published in the ToM4AI 2026 proceedings and presented at AAAI in Singapore.

As AI agents become better at producing human-like responses, people might begin attributing mental states to them, inferring beliefs, intentions, and awareness. This is Theory of Mind applied not to another person but to an AI system. My research asks: what drives that shift, and when does it happen?

The question sounds philosophical. The consequences are not.

If people perceive an AI as having a mind, the nature of the relationship changes. Trust deepens in ways it shouldn't. Emotional dependency follows. Advice from a system gets weighted like advice from a person. These risks arise from human perception, regardless of whether the AI actually possesses any mental states at all.

To study this, I proposed a longitudinal experiment in which participants interact repeatedly either with AI agents that vary on two design features, memory and agreeableness, or with trained human confederates who serve as a matched baseline. Across sessions, the study tracks whether Theory of Mind attributions and consciousness ratings shift, and whether specific design choices causally drive those changes.
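To make the structure of the design concrete, here is a minimal sketch of how the condition cells and a balanced participant assignment might look. Several things in it are assumptions for illustration and are not taken from the paper: that each feature has two levels, that the human confederate is a single arm, that assignment is between-subjects, and all of the names (`MEMORY_LEVELS`, `AGREEABLENESS_LEVELS`, `build_conditions`, `assign_participants`).

```python
import itertools
import random
from collections import Counter

# ASSUMPTION: two levels per feature. The source names the features
# (memory, agreeableness) but does not specify their levels.
MEMORY_LEVELS = ("no_memory", "memory")
AGREEABLENESS_LEVELS = ("low_agreeableness", "high_agreeableness")


def build_conditions():
    """Return every AI cell plus one human-confederate baseline arm.

    ASSUMPTION: the confederate baseline is modeled as a single arm.
    """
    ai_cells = [
        ("ai", memory, agreeableness)
        for memory, agreeableness in itertools.product(
            MEMORY_LEVELS, AGREEABLENESS_LEVELS
        )
    ]
    return ai_cells + [("human_confederate", None, None)]


def assign_participants(participant_ids, seed=0):
    """Shuffle participants, then deal them round-robin across conditions.

    ASSUMPTION: between-subjects assignment. Group sizes differ by at
    most one participant.
    """
    conditions = build_conditions()
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {
        pid: conditions[i % len(conditions)]
        for i, pid in enumerate(shuffled)
    }


if __name__ == "__main__":
    assignment = assign_participants(range(20))
    for condition, count in Counter(assignment.values()).items():
        print(condition, count)
```

Under these assumptions, each participant stays in one condition across all sessions, so any change in their ratings over time can be compared between the AI cells and the human baseline.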

The confederate baseline is what makes the design methodologically strong. Most studies in this space lack a rigorous human comparison. Without one, it is impossible to separate the effect of AI design from the broader social dynamics of repeated interaction with any interlocutor.

Understanding the tipping point matters for how AI systems are designed, deployed, and regulated, particularly in contexts where the relationship between a person and an AI develops over time.