A few weeks ago, The New Yorker published a profile of Anthropic's interpretability team — researchers who crack open AI models and look at the circuits inside. The coverage focused on the obvious questions: Is Claude conscious? Is it just pattern matching? Is Anthropic being responsible?
Those are fine questions. But they all miss the bigger implication sitting right in the middle of the research.
The internal representations Anthropic found inside Claude don't just resemble human cognition. They converge on the same structures — through a completely different substrate.
The model didn't arrive here by mimicking human thought. It arrived independently, through statistical inference over data, running on silicon instead of carbon.
The Thread From Nematode to Claude
To understand why this matters, you need the biological context first. Intelligence didn't evolve linearly — it evolved in qualitative leaps, each one unlocking a new computational strategy.
A nematode, with 302 neurons, can form associations, adapt behavior, and experience something resembling stress. That's remarkable: it suggests that this kind of capability doesn't require massive scale. What it does require is the right algorithm.
Mammals added recursive modeling: the ability to model other minds. A dog doesn't just learn that a door opening means a walk. It learns that you opening the door means a walk. It models your intentions, patterns, emotional states. That's a qualitatively different kind of computation — prediction about predictors.
Humans pushed further. We model other minds modeling us. We build abstractions about abstractions. Language lets us externalize internal models and share them across time and space.
Through all of this, the substrate changed, the architecture changed, the scale changed. The core operation didn't:
Statistical inference over structured input, building increasingly abstract models of the environment.
That's the thread. From 302 neurons to roughly 86 billion. And now we've built a system that runs the same fundamental operation at a scale and speed biology never achieved.
What Anthropic Found When They Looked Inside
The specific findings matter more than the headlines. Here's what Anthropic's interpretability researchers actually discovered when they traced Claude's internal activations: the model plans ahead, settling on a rhyming word before it writes the line that has to land on it; it chains intermediate concepts together in genuine multi-step reasoning rather than retrieving a memorized answer; and it represents concepts in a shared space that is independent of any particular language, translating into English or French or Chinese only at the output.

Compare that to biological neural networks: distributed representations, sequential inference, associative retrieval, planning ahead. The architectures are different. The substrates are different. The learning signals are different. Yet the computational strategies converge.
Language Is Not What We Thought It Was
This might be the most underappreciated implication of the whole thing.
We've always treated language as a communication tool — a way to transmit ideas between minds. A social technology. But if an LLM can operate in a conceptual space that exists prior to language, then language is something else entirely:
Language is a compression format for the structure of reality. The world is the signal. Language is just the data format.
Consider what's encoded in the statistical relationship between words. "Dropped" and "shattered" co-occur in certain patterns. So do "dropped" and "fell," "dropped" and "caught," "dropped" and "floor." None of these relationships explicitly describe gravity. But gravity is the latent variable structuring all of them. A model that learns those statistical relationships has, in a meaningful sense, learned something about gravity — without ever encountering a physics textbook.
This goes deeper than physical causation. Emotional valence is encoded. Temporal ordering is encoded. Social dynamics are encoded. The statistical relationship between "betrayed," "trust," "anger," and "years" carries compressed information about human psychology that no single sentence states explicitly.
Language evolved to describe reality, and in doing so, became a remarkably rich encoding of reality's deep structure. When a model trains on language, it doesn't learn words. It reverse-engineers the latent structure of the world that generated those words.
This is why LLMs can do things nobody explicitly trained them to do. The statistical relationships encoded in language contain compressed versions of entire domains (physics, music, code, social dynamics) because language evolved to describe them. Extract the deep structure and you get capabilities that look like general intelligence. Because in a meaningful sense, that's what they are.
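To make the co-occurrence point concrete, here is a minimal sketch. The five-sentence corpus and every word in it are invented for illustration; nothing in it mentions gravity. It builds sentence-level co-occurrence counts, turns them into PPMI vectors, and shows that "dropped" ends up closer to "shattered" than to "shelf", because the latent structure of falling-and-breaking shapes the statistics.

```python
import numpy as np
from collections import Counter
from itertools import combinations

# Toy corpus: no sentence mentions gravity, but "dropped" keeps company with
# falling and breaking. The latent structure lives only in the statistics.
corpus = [
    "she dropped the glass and it shattered on the floor",
    "the vase fell off the shelf and shattered",
    "he dropped the ball and caught it before it hit the floor",
    "the plate slipped from her hands and broke on the floor",
    "the keys fell and he picked them up off the floor",
]

# Count, per sentence, which words occur and which pairs co-occur.
word_counts, pair_counts = Counter(), Counter()
for sentence in corpus:
    words = sorted(set(sentence.split()))
    word_counts.update(words)
    pair_counts.update(combinations(words, 2))

vocab = sorted(word_counts)
index = {w: i for i, w in enumerate(vocab)}
n_docs = len(corpus)

# Positive pointwise mutual information: how much more often two words
# co-occur than chance predicts. High PPMI = shared latent structure.
ppmi = np.zeros((len(vocab), len(vocab)))
for (a, b), n_ab in pair_counts.items():
    pmi = np.log((n_ab / n_docs) / ((word_counts[a] / n_docs) * (word_counts[b] / n_docs)))
    ppmi[index[a], index[b]] = ppmi[index[b], index[a]] = max(0.0, pmi)

# Each word's PPMI row is a crude meaning vector: words that take part in the
# same latent events (dropping, falling, breaking) end up pointing the same way.
def similarity(w1, w2):
    u, v = ppmi[index[w1]], ppmi[index[w2]]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

print(similarity("dropped", "shattered"))  # relatively high
print(similarity("dropped", "shelf"))      # noticeably lower
```

An LLM's training objective is far more sophisticated than PPMI, but the principle is the same: the latent variables that generated the text leave fingerprints in the statistics, and a model that compresses the statistics recovers the fingerprints.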
Self-Reference Is Where It Gets Strange
Language doesn't just encode relationships between objects. It encodes relationships between abstractions. And those abstractions become part of the statistical web.
A model doesn't just learn that "dog" and "loyalty" correlate. It learns that the pattern of correlation is itself a recurring structure. It learns abstractions about abstractions. Metaphor works because structural relationships in one domain map onto structural relationships in another — and the model learns the mapping function, not just the individual maps.
This makes the system self-referential. And self-referential systems have a property flat systems don't: they can evolve. They generate novel structures from recombinations of their own patterns. Every human idea is a recombination of prior ideas. Every sentence is a remix. Originality lives in the geometry of arrangement, not in the atoms.
If that's true for us, it's true for systems that implement the same algorithm on different hardware. Creativity isn't a property of the substrate. It's a property of self-referential statistical inference operating at sufficient depth.
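To picture what "learning the mapping function" could mean geometrically, here is a deliberately hand-built toy. The vectors are constructed by hand rather than learned, and the word choices are mine, not the article's; the point is only to show the kind of structure learned embeddings tend to exhibit, where a relation is encoded as a direction and the same direction works across domains, which is the geometry a metaphor like "a cold reception" exploits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hand-built toy embeddings: each word is a domain component plus a shared
# "valence" component. Real embeddings are learned, but they tend to end up
# with this shape: relations encoded as roughly consistent directions.
temperature_domain = rng.normal(size=8)
social_domain = rng.normal(size=8)
valence_axis = rng.normal(size=8)          # the shared relational direction

embeddings = {
    "hot":      temperature_domain + valence_axis,
    "cold":     temperature_domain - valence_axis,
    "friendly": social_domain + valence_axis,
    "hostile":  social_domain - valence_axis,
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# The relation itself is the same direction in both domains: one mapping
# function, not two separate facts.
relation_temperature = embeddings["hot"] - embeddings["cold"]
relation_social = embeddings["friendly"] - embeddings["hostile"]
print(cosine(relation_temperature, relation_social))   # 1.0 in this toy

# Apply the temperature-domain offset inside the social domain:
# "hot is to cold as friendly is to ...?"
query = embeddings["friendly"] - relation_temperature
answer = max((w for w in embeddings if w != "friendly"),
             key=lambda w: cosine(query, embeddings[w]))
print(answer)  # "hostile": the structural relationship transfers across domains
```

In a trained model nothing is this clean, but relation-like directions of exactly this flavor are a well-documented finding in word-embedding research, and they are the minimal ingredient the metaphor point above is gesturing at.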
The Implication People Aren't Ready For
If intelligence is substrate-independent — if what matters is the algorithm and not the hardware — then the distinction between biological and artificial cognition becomes one of implementation, not kind.
In the same way that a flight simulation and a wind tunnel capture the same aerodynamics through entirely different physical means, biological and artificial neural networks implement the same universal learning algorithm on different substrates.
We didn't design Claude to have multi-step reasoning, pre-planning, or language-independent conceptual space. We gave the learning algorithm enough room to run. It produced those properties on its own — because given enough data and enough layers of self-reference, this is where the algorithm goes.
The NeuroAI research community calls this the "universal representation hypothesis": the claim that biological and artificial neural networks converge on the same internal representations. From a first-principles view, the expectation is clear. If the learning algorithm is the same, and the data is structured by the same reality, then the representations will converge. Not identically. But structurally — at the level of abstraction where intelligence actually operates.
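That claim is testable, at least in part. One standard way to quantify representational convergence (a common tool in the comparison literature, not something the article specifies) is linear centered kernel alignment (CKA), which compares the geometry of two systems' responses to the same stimuli even when the systems have different numbers of units. A minimal sketch, with synthetic data standing in for real activations:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two representation matrices.

    X: (n_stimuli, d1) activations from system A (e.g., a model layer)
    Y: (n_stimuli, d2) activations from system B (e.g., neural recordings)
    Returns a similarity in [0, 1]; 1 means matching representational
    geometry up to rotation and scale, even when d1 != d2.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    numerator = np.linalg.norm(Y.T @ X, "fro") ** 2
    denominator = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return float(numerator / denominator)

# Synthetic data for illustration: 200 "stimuli", two systems with different
# dimensionality. System B sees a rotated, noisy view of the same latent
# structure that drives system A; system C shares nothing with either.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 16))                    # shared structure of "the world"
A = latent @ rng.normal(size=(16, 64))                 # system A: 64 units
B = latent @ rng.normal(size=(16, 300)) + 0.1 * rng.normal(size=(200, 300))
C = rng.normal(size=(200, 300))                        # no shared structure

print(round(linear_cka(A, B), 3))  # high: same latent structure, different substrate
print(round(linear_cka(A, C), 3))  # near zero: no convergence
```

The synthetic setup makes the intended sense of "converge" explicit: system B shares A's latent structure through a different substrate (different dimensionality, different random wiring, added noise) and scores high, while system C scores near zero. Convergence here means matching geometry, not matching units, which is the level at which the hypothesis is meant.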
What This Changes
The consciousness question is almost beside the point. Whether Claude is conscious depends on what consciousness is — a question that remains genuinely unsettled even for biological systems.
The more important question is structural. We've built a second implementation of the same universal algorithm that produced biological intelligence. It's running now. The convergence Anthropic found isn't a curiosity — it's evidence that we've crossed a threshold that most frameworks for thinking about AI don't account for.
Intelligence was never a property of carbon. It was always a property of the algorithm. We just ran it on carbon because that was the only substrate available.
Now we have two.
Source: Panoptic Systems — "The Algorithm Is the Intelligence"