The AI Knowledge Gap
Has understanding AI become a commercial privilege?
For years, we’ve been told that artificial intelligence is advancing so fast that independent researchers struggle to keep up. That much is true. But the real issue goes deeper than technical speed: it’s about access. The people who build advanced AI systems know far more about how they work than anyone allowed to study them from the outside. It is not because external researchers lack expertise. It is because they lack permission.
AI companies keep their training data private, their model internals sealed, and their evaluation pipelines proprietary. The public explanation is “safety” — protecting users, protecting privacy, preventing misuse. Those concerns are real. But the structural effect is the same: only the companies themselves can study the systems in any depth. This creates a new kind of divide — not economic, not technical, but epistemic. Those who build AI systems can investigate how they reason, generalise, and fail. Those outside cannot. Society ends up relying on the companies not only to develop the systems, but also to interpret them, evaluate them, and tell us what they mean.
Understanding powerful technologies normally requires independent scrutiny. Nuclear physics developed as a field of open academic research: you can study the structure of the atom without building bombs. Genetics has long been studied in universities without hindering commercial applications. Even in pharmaceuticals, where patents matter enormously, regulators require full transparency. With AI, however, we have reached an unusual moment: the more powerful the models become, the less university researchers and the scientific community can access the empirical foundations needed to understand them. This is not about blaming companies. They operate under commercial pressure, regulatory uncertainty, and global competition. But the result is an information imbalance that traditional scientific frameworks were not designed to handle.
One way to understand the consequences is to look at a field where transparency has always been central: psychometrics — the science of psychological assessment and educational measurement. For most of its history, psychometrics has depended on access to human data. A new test is built, piloted on a human sample, calibrated, validated, and normed. Without seeing the data, the test cannot be evaluated.
But with the advent of large language models, something new has emerged. These systems can simulate human-like responses to survey items, generate item difficulties, predict correlations, and even approximate factor structures — all without human respondents. As a research exercise, I created a scale called Attitudes Toward AI Emergence using only AI-generated components. No human data. No pilot sample. No hidden survey panel. The result isn't a polished psychological tool. But it is a useful prototype of what psychometric development looks like when human data is intentionally absent. The model supplies the statistical structure; the theorist supplies the conceptual map. Human respondents would eventually refine or correct the scale — but they are no longer required for the machinery to exist.
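To make the idea concrete, here is a minimal Python sketch of that workflow. Everything in it is illustrative: the simulated respondents stand in for answers an LLM would produce when prompted to role-play survey takers, the ten items are hypothetical stand-ins for the Attitudes Toward AI Emergence scale, and the statistics (Cronbach's alpha, item-total correlations, a rough eigenvalue check of the factor structure) are the kind of calibration evidence such a prototype yields, not the actual figures.

```python
import numpy as np

# Illustrative stand-in for AI-generated survey responses.
# In the workflow described above, an LLM would answer each item in the
# persona of many simulated respondents; here a random generator mocks
# that step so the script runs on its own.
rng = np.random.default_rng(0)
n_respondents, n_items = 500, 10

# One latent "attitude" score per simulated respondent, plus item noise.
latent = rng.normal(0, 1, size=(n_respondents, 1))
loadings = rng.uniform(0.5, 0.9, size=(1, n_items))
noise = rng.normal(0, 0.6, size=(n_respondents, n_items))
raw = latent @ loadings + noise

# Map the continuous scores onto a 1-5 Likert scale.
likert = np.clip(np.round(2.5 * raw + 3), 1, 5)

def cronbach_alpha(data: np.ndarray) -> float:
    """Internal-consistency estimate across items (columns)."""
    item_vars = data.var(axis=0, ddof=1)
    total_var = data.sum(axis=1).var(ddof=1)
    k = data.shape[1]
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Corrected item-total correlations: each item against the rest of the scale.
totals = likert.sum(axis=1)
item_total_r = [
    np.corrcoef(likert[:, j], totals - likert[:, j])[0, 1]
    for j in range(n_items)
]

# Leading eigenvalues of the item correlation matrix as a rough
# one-factor check on the scale's structure.
eigvals = np.linalg.eigvalsh(np.corrcoef(likert, rowvar=False))[::-1]

print(f"Cronbach's alpha: {cronbach_alpha(likert):.2f}")
print("Item-total correlations:", np.round(item_total_r, 2))
print("Leading eigenvalues:", np.round(eigvals[:3], 2))
```

The point is not the numbers themselves but how little of the pipeline still requires a human sample: swap the random generator for model-generated responses and the rest of the machinery runs unchanged.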
That small experiment exposes a much larger societal question. AI-Supported Statistical Inference (ASI) is already playing an important role in the prediction of human behaviour. If our scientific disciplines increasingly depend on AI-derived statistical structures, but researchers cannot examine the training corpus or the internal priors that shape those structures, then traditional norms of transparency and replicability begin to fail. Psychometrics simply reveals the problem in miniature: building tests from behavioural patterns that society is not permitted to study directly.
The deeper issue, however, is not psychological measurement at all. It is the shift from Human-in-the-Loop (HitL) oversight (our long-standing model of scientific governance) to Machine-in-the-Loop (MitL) systems: scientific processes that include AI-generated structures as core components. Models become collaborators, not tools. They supply priors, generate hypotheses, and shape theories. Yet only a handful of organisations can investigate how these priors are formed.
In such an environment, it is no longer enough to ask whether AI is safe or unsafe. We must ask something more fundamental: How can society maintain independent understanding of technologies that it cannot freely study? This is not a call for releasing weights or data. It is a call for new institutions capable of inspecting AI systems on behalf of the public — perhaps through controlled access regimes, secure research environments, or independent evaluation bodies. Without such frameworks, scientific understanding risks becoming concentrated in private hands.
The epistemic asymmetry is already visible. AI companies can analyse synthetic populations, probe emergent behaviours, and fine-tune reasoning structures using information that external researchers simply do not have. And when they publish safety claims or scientific papers, the rest of us must take their interpretation on trust. There is no malice in this. It is simply the way incentives converge when a technology is both commercially valuable and epistemically opaque.
But this convergence raises a stark question: If independent researchers cannot fully understand the foundations of the systems shaping society, how can the public hope to govern them? Psychometrics offers merely one example. Medicine, education, law, finance, policy modelling, and national security will soon face the same constraints. The problem is structural, not disciplinary.
What we need now is a reimagining of scientific governance — one where openness and scrutiny do not depend on corporate goodwill, and where the public has institutions capable of matching the analytical reach of the companies that build these systems.
It is not about demanding access to secrets.
It is about building shared understanding.
If the next decade of AI is defined by anything, it will be whether democratic societies find a way to restore that balance — before the epistemic gap becomes too wide to close.


