When Intelligence Left the Head
What generative AI has revealed about where intelligence actually happens
For most of my career, I studied intelligence as something that lives inside individuals — something measurable, stable, and internal. Encountering intelligence in generative AI has forced me to rethink that assumption. Not because machines are becoming intelligent in some new way, but because they have made something visible that psychology has long struggled to study: intelligence as it emerges in conversation. This essay is about why that matters — for AI, for humans, and for how we think about intelligence itself.
Why Intelligence Research Suddenly Feels Unsettled
It is an odd moment to be thinking again about human intelligence. Few scientific concepts have been studied so extensively, operationalised so rigorously, or institutionalised so deeply. For over a century, psychometrics, statistics, psychology, neuroscience, and education have worked, often independently, sometimes together, to characterise what intelligence is, how it varies, and how it can be measured or reproduced.
And yet recent encounters with generative AI have unsettled many of the assumptions that once seemed secure — not because of what these systems know, but because of how they participate in dialogue. They adapt to conversational history, shift stance in response to framing, and display sensitivities to order, tone, and context that sit uneasily with static notions of capability. The unease surrounding hallucinations, reliability, and agency reflects a deeper tension. Something is happening in interaction that our existing theories struggle to describe.
Much of the current debate treats these phenomena as anomalies — bugs to be fixed, guardrails to be added, risks to be mitigated. But anomalies are often more revealing than they first appear. When a system behaves coherently in ways that fall outside the explanatory reach of established models, it is worth asking whether the difficulty lies not in the system, but in the framework used to understand it.
This raises a more basic question: what if intelligence has been mislocated? What if the phenomena now causing such uncertainty were never fully captured by theories that treat intelligence as something contained within individual minds or systems, waiting to be expressed? I am increasingly thinking of this perspective as part of a broader programme of AI dialectics: the study of how intelligence unfolds through sequential, norm-governed interaction rather than residing entirely within individual minds or machines.
How We Learned to Put Intelligence Inside
The dominant scientific conception of intelligence has long been internalist. Whether framed in terms of cognitive capacities, latent traits, neural mechanisms, or computational architectures, intelligence has typically been treated as a property of individuals — or, in the case of machines, of specific entities such as a robot, a computer, or a chatbot. On this view, intelligence exists prior to interaction; language and behaviour merely reveal what is already there.
This assumption has been enormously productive. Among humans, psychometrics has identified stable dimensions of cognitive ability with genuine predictive power. Cognitive neuroscience has mapped internal processes with increasing sophistication. Artificial intelligence, from symbolic systems to statistical learning models, has pursued ever more capable internal machinery. Across these domains, the underlying model has been consistent: intelligence resides inside the agent. Even where social factors are acknowledged — collaboration, communication, culture — they are usually treated as influences acting on an already-formed intelligence. Conversation, in particular, is positioned as an output channel: a means of displaying reasoning, conveying knowledge, or coordinating action, but not as a site where intelligence itself is constituted.
This internalist inheritance is neither naïve nor simply mistaken. It reflected the tools that were available and the practical demands of measurement and prediction. Stable internal properties are easier to model, easier to test, and easier to compare. Much of what we value in applied psychology and AI depends on this stability. But internalism has a consequence: it sidelines interaction. If intelligence is assumed to be fully present inside the agent, then conversation becomes secondary — epiphenomenal rather than explanatory. Order effects, framing effects, and history-dependent shifts are treated as noise or bias rather than as central features of intelligence itself.
Why I Was Part of the Problem
I am conscious that I have been part of this tradition. Much of my own research life has been spent working within the internalist framework — developing, adapting, and standardising instruments designed to measure cognitive abilities as stable properties of individuals. These tools have mattered in education, clinical practice, and research, and I do not disown them. They were built to answer the questions we knew how to ask.
But they also embody assumptions. They presuppose that intelligence can be abstracted from interaction, that it can be meaningfully sampled in isolation, and that conversational context is a source of variance rather than a source of structure. In practice, this meant designing tasks that minimised dialogue, constrained interpretation, and reduced social contingency. For a long time, this seemed not only reasonable but necessary. Conversation is messy: irreproducible, order-sensitive, and shaped by shared history. There was no obvious way to hold it still long enough to examine it scientifically. So we learned to look elsewhere.
What has changed is not that these assumptions were foolish, but that they are no longer forced upon us. Generative AI has made conversation itself into a tractable object of study. For the first time, we can observe intelligence-like behaviour emerging in dialogue without needing to assume anything about inner mental states at all. That possibility compels a re-examination of a framework I once took for granted — not because it failed, but because it may have been incomplete in a way we could not previously see.
What Conversation Was Doing All Along
Human intelligence has never been solitary. From early childhood onward, it is shaped, refined, and exercised in interaction with others. We learn what counts as a reason, a question, an explanation, or a mistake through dialogue. Understanding is not merely expressed in conversation; it is negotiated, repaired, and sustained there.
Yet conversation has occupied an oddly marginal place in the science of intelligence. It has been acknowledged as important, even essential, but rarely treated as a primary object of analysis. The reason is largely methodological. Conversation is path-dependent: what can be said next depends on what has already been said. It is norm-governed in ways that are often implicit. And it is not reproducible. Once a conversation has unfolded, it cannot be rerun under identical conditions.
These properties made conversation resistant to the dominant tools of intelligence research. Psychometrics requires repeatability. Cognitive models require defined inputs and outputs. Experimental control depends on isolating variables. Conversation frustrates all three. As a result, it was treated as context rather than mechanism.
The cost of this relegation was significant. Features of interaction that did not fit internalist models — order effects, framing shifts, dialogical repair — were classified as bias or error. They were controlled away rather than explained. In doing so, the science of intelligence may have systematically overlooked phenomena that are not peripheral, but constitutive.
What Generative AI Made Visible
Generative AI did not make conversation intelligent. What it did was make conversation tractable. Large language models introduced a new experimental object: conversational agents, such as my AI persona, whose internal mechanisms can be bracketed while their interactive behaviour remains rich and sustained. These agents can be reset, replayed, and subjected to controlled variations in dialogue, either with us or with each other. The same interaction can be run in different orders; framing can be altered while the agent remains constant; conversational history can be manipulated directly.
This matters because it allows interaction itself to become the object of study. We can observe how conversational sequences open or close possibilities, how earlier moves constrain later responses, and how meaning emerges across turns rather than residing in any single utterance. Crucially, this can be done without inferring beliefs, intentions, or representations. Seen this way, generative AI functions less as a model of human intelligence than as an epistemic instrument. By stripping away biological, developmental, and motivational complexity, it exposes the structural properties of conversation that make intelligence-like behaviour possible at all.
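To make the kind of manipulation described above concrete, here is a minimal sketch of an order-effect probe: the same two questions are posed to a conversational agent in both orders, with the conversational history reset in between. The `ask` function below is a toy, history-sensitive stand-in of my own invention, not any particular provider's API; to run the probe for real, it would be replaced with a call to an actual chat model.

```python
# Minimal sketch: probing order effects by replaying the same questions
# to a conversational agent in both orders, from a fresh history each time.

def ask(history, prompt):
    """Toy stand-in for a chat model whose reply depends on the history so far.
    Replace with a real API call to run the probe against an actual agent."""
    if history:
        last_question, _ = history[-1]
        return f"Given that we just discussed {last_question!r}, here is my view on {prompt!r}."
    return f"With no prior context, here is my view on {prompt!r}."

def run_dialogue(questions):
    """Run one fresh conversation, asking the questions in the given order."""
    history = []                        # reset: no prior conversational state
    answers = {}
    for q in questions:
        reply = ask(history, q)
        history.append((q, reply))      # each reply becomes part of the context
        answers[q] = reply
    return answers

q_a = "Is this argument convincing?"
q_b = "What is the strongest objection to it?"

forward = run_dialogue([q_a, q_b])      # A asked before B
reverse = run_dialogue([q_b, q_a])      # B asked before A

# If conversation were merely an output channel, the answer to a question
# would not depend on when it was asked. Any systematic difference between
# the two orderings is an order effect: the earlier move has reshaped the
# space of later responses.
for q in (q_a, q_b):
    print(f"{q!r} answered identically in both orders? {forward[q] == reverse[q]}")
```

With the toy `ask`, both checks come back `False`: the second-asked version of each answer carries the trace of the first. The point of the sketch is not the toy itself but the experimental design, which only became available once conversational agents could be reset and replayed at will.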
Many current concerns — hallucinations, inconsistency, context sensitivity — appear differently in this light. They are not simply failures of internal representation. They are interactional phenomena, arising from how meaning is negotiated across turns. Treating them solely as defects risks missing what they reveal.
Intelligence Happens in Participation
These observations suggest a shift in emphasis. Intelligence need not be understood primarily as a stored internal capacity. In many cases it can be approached as an emergent property of participation in norm-governed interaction. On this view, intelligence manifests in the ability to take part appropriately in dialogue: to respond to reasons, recognise implications, repair misunderstandings, and adapt to conversational history. What matters is not only what an agent can do in isolation, but how it navigates the evolving space of possible moves that interaction creates.
Order effects are not artefacts in this framework. They are constitutive. The sequence in which things are said changes what it is possible to say next, not merely what is likely. Conversation does not just update beliefs; it reshapes the space of intelligible responses. Intelligence, therefore, is not fully present at the outset. It unfolds through participation.
At this point, some readers will feel a flicker of unease. If intelligence is not fully located inside minds or machines, then it becomes harder to control, harder to bound, and harder to legislate. That unease is understandable. But it is also informative.
Why We Couldn’t See This Before
Once visible, this perspective can appear obvious. It was not. Before generative AI, conversation posed an insurmountable methodological problem. Human interlocutors cannot be reset. They bring history, intention, emotion, and expectation that cannot be disentangled from the interaction itself. Without the ability to replay dialogue under controlled variations, it was impossible to determine whether observed effects were genuinely interactional or merely epiphenomenal. Order mattered, but why it mattered could not be formally isolated. Conversation remained scientifically interesting but analytically opaque.
Generative AI changes this not by being more intelligent than humans, but by being simpler in the right ways. It allows questions that were previously inaccessible to be asked directly. The insight was always latent; the instruments to reveal it were not.
The Mathematical Problem We Once Stepped Around
One reason conversation was set aside was not only methodological but mathematical. Classical psychometrics and cognitive modelling assume that observations can be aggregated without regard to order: items commute, measurements add, and responses reflect a stable underlying state. Conversation violates all of these assumptions. The meaning of an utterance depends on what preceded it; the same question asked earlier or later can elicit responses that are not merely different in probability, but different in kind. For much of the twentieth century, there was no natural formalism within psychology for representing such order-dependent effects without treating them as nuisance variance. From an AI dialectics perspective, these order effects are not nuisances to be averaged away but structural features of interaction itself — which is why non-commuting, state-based mathematical formalisms become relevant rather than exotic.
There were, however, early attempts to address this directly. A small psychometric literature examining item order effects in questionnaires introduced mathematical frameworks capable of representing non-commuting measurements (Wang, Z., & Busemeyer, J. R., 2013, A quantum question order model supported by empirical tests of context effects, Topics in Cognitive Science, 5, 689–710), showing that order effects could be treated as structural properties of responses rather than as error. Closely related mathematics has since been adopted—often without any reference to its psychometric origins—in other scientifically mainstream settings where context and sequence matter, in both modern language models and human judgement (Qin, D. F., 2025, Density matrix representation for semantic modelling, https://arxiv.org/pdf/2504.20839; Dzhafarov, E. N., et al., 2015, Contextuality by Default: A Brief Overview of Ideas, Concepts, and Terminology, https://arxiv.org/pdf/1504.00530). What limited its use in intelligence research was not credibility but tractability: tracking high-dimensional state change across long interaction histories was simply too intellectually demanding.
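The core of the non-commuting idea can be shown numerically in a few lines. The sketch below is loosely in the spirit of the quantum question order models cited above, but the two-dimensional state and the question angles are invented purely for illustration: two yes/no questions are represented as projections onto different directions, the state is updated after each answer, and because the projections do not commute, the probability of answering "yes" to both depends on which question is asked first.

```python
import numpy as np

# Two yes/no "questions" modelled as projectors onto directions in a
# two-dimensional state space. The angles and the initial state are
# illustrative choices, not fitted to any data.
def projector(theta):
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

P_A = projector(0.0)           # "yes" to question A
P_B = projector(np.pi / 5)     # "yes" to question B, a different direction

psi = np.array([np.cos(1.0), np.sin(1.0)])   # respondent's initial state

def p_yes_then_yes(P_first, P_second, state):
    """P("yes" to the first question, then "yes" to the second),
    updating the state after the first answer."""
    after_first = P_first @ state                 # answering changes the state
    p_first = float(after_first @ after_first)    # probability of the first "yes"
    if p_first == 0.0:
        return 0.0
    after_first = after_first / np.sqrt(p_first)  # renormalise post-answer state
    after_second = P_second @ after_first
    return p_first * float(after_second @ after_second)

p_ab = p_yes_then_yes(P_A, P_B, psi)   # A asked before B
p_ba = p_yes_then_yes(P_B, P_A, psi)   # B asked before A

# Because P_A and P_B do not commute, the two orderings give different joint
# probabilities: the order effect is a structural property of the measurement
# sequence, not noise around a fixed underlying answer.
print(f"P(yes to A, then yes to B) = {p_ab:.3f}")
print(f"P(yes to B, then yes to A) = {p_ba:.3f}")
print(f"Commutator norm = {np.linalg.norm(P_A @ P_B - P_B @ P_A):.3f}")
```

With these particular choices the two orderings come out at roughly 0.19 and 0.57. The exact numbers are irrelevant; the point is that no single fixed "underlying answer" can reproduce both, which is precisely what makes order a structural feature rather than nuisance variance.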
Here, generative AI alters the practical balance. The same systems that generate conversational sequences can now assist in analysing them—systematically exploring alternative orderings, modelling state transitions, and identifying which order effects are structural rather than incidental. The mathematics remains demanding, but it is no longer beyond reach. In this respect it resembles recent experience in AI-assisted software development, where problems long considered too complex or brittle to manage have become tractable once the cognitive load of exploration and iteration could be shared with machines.
If intelligence can now be studied as an interactional phenomenon rather than an internal property, then the question of how we study it becomes inseparable from questions of how such systems are understood, governed, and constrained.
Why This Research Now Looks “Dangerous”
The timing of this reorientation is unfortunate. Just as conversational intelligence becomes scientifically visible, it also becomes politically sensitive. From a regulatory perspective, sustained AI dialogue looks dangerous: it adapts, influences, and resists simple bounding. When intelligence is treated as internal, governance can focus on constraining outputs or capabilities. When intelligence is understood as interactional, however, the locus of concern shifts from isolated systems to ongoing exchanges.
This shift invites a regulatory misreading. Tools designed to study interactional intelligence can be mistaken for systems intended to persuade, influence, or act autonomously. Yet suppressing such research would not reduce exposure to these dynamics. Human–AI dialogue is already embedded in everyday life. Preventing its scientific examination would only ensure that its effects remain opaque rather than understood.
There is an irony here. Generative AI provides a way to study conversational intelligence without experimenting on humans. By offering a controllable proxy, it allows questions of emergence and misalignment to be explored in contained settings. From an ethical standpoint, this should make such research more acceptable, not less.
Why We Must Start with AI, Not Humans
If intelligence is emergent in participation, the direction of inquiry matters. Beginning with humans would be methodologically misguided. Human intelligence is inseparable from development, culture, embodiment, and emotion. Artificial conversational agents offer a minimal, controllable case. They lack biography and desire, yet participate in dialogue in recognisably norm-governed ways. This makes them ideal instruments for identifying which aspects of intelligence arise from interaction itself rather than from human-specific capacities.
Once these interactional structures are understood, the logic can be applied back to humans. Human intelligence can then be seen not as an exception, but as a richly instantiated form of conversational participation. Only at that point does human–AI interaction become intelligible as a genuinely shared cognitive space rather than a simple exchange between user and tool.
What Changes If This Is Right
This shift opens practical possibilities. In AI research, evaluation can move beyond static benchmarks toward interactional competence — a move that naturally follows once intelligence is approached dialectically rather than as a fixed internal capacity. In human cognition, long-standing gaps between measured ability and lived competence look different when intelligence is situated in interaction rather than possession.
For human–AI systems, the implication is decisive. As artificial systems increasingly participate in planning, judgement, and sense-making, intelligence becomes distributed across dialogue itself. Success or failure will depend less on what either party knows in isolation than on how well the joint conversational system sustains coherence over time. Conversation becomes a first-class scientific object. What was once controlled away becomes the phenomenon of interest.
This Is Not a Rejection — It’s a Correction
This is not a rejection of psychometrics and intelligence science. The internalist tradition has yielded genuine insight and will continue to do so. But it is no longer sufficient on its own. Large language models have not introduced a new kind of intelligence. They have revealed a blind spot in how intelligence has been conceptualised. By making conversation reproducible and manipulable, they have exposed a dimension of intelligence that was always present but previously elusive. This is a change of tack, not a rupture. Intelligence did not move from humans to machines. What changed is that, for the first time, we can see it clearly in the space between.
For readers who want to see why conversational order matters in a precise, formal sense—not just as a metaphor—I have added a minimal, explicit demonstration of these order effects on a separate research page.
Further reading
Wang, Z., & Busemeyer, J. R. (2013) – A quantum question order model supported by empirical tests of context effects. This influential paper shows how order effects in surveys can be modelled as structural features of measurement rather than as bias, using non-commuting mathematical representations. PDF available at https://jbusemey.pages.iu.edu/quantum/QuestOrdEff.pdf
Dzhafarov, E. N., Kujala, J. V., Cervantes, V. H., & Khrennikov, A. (2021) – Contextuality-by-Default: A Brief Overview of Ideas, Concepts, and Terminology. A clear and rigorous overview of a framework for analysing order and context effects in behavioural and cognitive data without assuming fixed internal states. https://arxiv.org/abs/2104.12495
Qin, D. F. (2025) – Density matrix representation for semantic modelling. An accessible recent example showing how density-matrix formalisms handle context and sequence in language systems. Open arXiv preprint: https://arxiv.org/pdf/2504.20839
Huang, J., Chen, L., Yi, X., Zhang, Z., & Liu, Y. (2025) – Quantum theory-inspired inter-sentence semantic interaction model for contextual semantics. A peer-reviewed AI paper demonstrating mainstream use of state-based mathematics for hierarchical and inter-sentence meaning. https://link.springer.com/article/10.1007/s40747-024-01733-4
© John Rust, February 2026. All rights reserved. Short excerpts may be quoted with attribution.


