Where this is going
Clawling's organisms currently evolve their context. The endgame is total mutability — every single element of the AI open to mutation: language capabilities, model weights, the wrapper shell, architecture. All of it.
| Element | Status |
|---|---|
| Context / genome | Implemented |
| Model choice (Ollama) | Implemented |
| Language capabilities | Future |
| Model weights | Future |
| Wrapper shell | Future |
| Architecture | Future |
The current work is building the interoperability layer — ensuring context and identity survive even as every other component changes underneath. The organism is the pattern, not the substrate.
The industry's approach: train a massive model, then distill it into a smaller one. The smaller model mimics the big one but never truly understands. It's recycled copper wiring.
The alternative: clipping.
The result: a model potentially smaller than current frontier models, equally capable, trained once and clipped many ways.
You don't make somebody learn by constantly expanding their brain. A baby starts with a small brain, and a 30-year-old doesn't end up with one the size of an apartment building, because that would be absurd: most of what's there isn't needed, and it should get clipped off later.
This is synaptic pruning. The brain has more connections in childhood than adulthood. Learning isn't just adding wires — it's eliminating weak ones.
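The mechanical core of clipping can be sketched with magnitude pruning, the simplest criterion: keep the strongest connections, zero the rest. The function name and the keep fraction here are illustrative, not Clawling's actual method.

```python
import numpy as np

def clip_weights(weights: np.ndarray, keep_fraction: float = 0.1) -> np.ndarray:
    """Keep only the largest-magnitude connections; zero out the rest."""
    flat = np.abs(weights).ravel()
    k = max(1, int(flat.size * keep_fraction))
    threshold = np.partition(flat, -k)[-k]        # k-th largest magnitude
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))                     # stand-in weight matrix
clipped = clip_weights(w, keep_fraction=0.1)      # 90% of connections removed
```

The same trained matrix can be clipped many different ways by varying the criterion or the fraction — the "trained once, clipped many ways" idea in miniature.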
It's remarkably common for a RAG pipeline to use a separate embedding model that doesn't match the language model's internal embeddings. The two models' understanding of concepts diverges, and retrieval precision collapses.
Every large language model should have a vector database built into it for all of its proper nouns. Authority control for named entities in the embedding space. Each entity gets a unique, stable coordinate in the model's own latent space.
Disambiguation through probabilistic distributions — an entity isn't a point, it's a cloud of probability. The string "Descartes" and the entity Descartes are different things. The string is a path of tokens; the entity is a destination of meaning.
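A toy of what entity authority control could look like. Each entity is fit as an isotropic Gaussian cloud in latent space, and a mention resolves to whichever cloud makes it most likely. The class name, the likelihood model, and the two Descartes entities are all illustrative assumptions, not a real system.

```python
import numpy as np

class EntityRegistry:
    """Hypothetical authority control: each entity is a probability
    cloud in latent space, not a single point."""

    def __init__(self, dim: int):
        self.dim = dim
        self.entities: dict[str, tuple[np.ndarray, float]] = {}

    def register(self, name: str, samples: np.ndarray) -> None:
        # Fit a simple isotropic Gaussian to observed embeddings.
        mean = samples.mean(axis=0)
        var = ((samples - mean) ** 2).mean() + 1e-6
        self.entities[name] = (mean, var)

    def disambiguate(self, mention: np.ndarray) -> str:
        # The string resolves to whichever entity cloud makes it most likely.
        def loglik(mean: np.ndarray, var: float) -> float:
            return -0.5 * (np.sum((mention - mean) ** 2) / var
                           + self.dim * np.log(var))
        return max(self.entities, key=lambda n: loglik(*self.entities[n]))

rng = np.random.default_rng(0)
reg = EntityRegistry(dim=32)
center_a, center_b = rng.normal(size=32), rng.normal(size=32)
reg.register("Descartes (philosopher)", center_a + 0.05 * rng.normal(size=(50, 32)))
reg.register("Descartes (lunar crater)", center_b + 0.05 * rng.normal(size=(50, 32)))
mention = center_a + 0.05 * rng.normal(size=32)   # an embedding of the string
resolved = reg.disambiguate(mention)
```

The point of the cloud representation: a broad, diffuse cloud claims a mention less strongly than a tight one at the same distance, which is exactly the behavior a point embedding can't express.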
Three modes of thinking, integrated:
The model "sees" the goal in latent space. No words yet, just the semantic target. JEPA predicts the consequence of an action in latent space without generating text.
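The mechanism can be sketched with a toy latent world model, trained to forecast the latent consequence of an action directly. Everything here — the linear world, the dimensions, the learning rate — is an illustrative stand-in, not JEPA's actual architecture; the one faithful property is that the loss lives entirely in latent space and no text is ever generated.

```python
import numpy as np

rng = np.random.default_rng(0)
D_LAT, D_ACT = 8, 4

# Toy world: the true consequence of an action is a linear map in latent space.
A_TRUE = rng.normal(size=(D_LAT + D_ACT, D_LAT)) * 0.1
def world(z: np.ndarray, a: np.ndarray) -> np.ndarray:
    return np.concatenate([z, a]) @ A_TRUE

# Predictor: forecast the latent consequence of an action directly.
# The error is measured vector-to-vector; no tokens are generated or decoded.
W = np.zeros((D_LAT + D_ACT, D_LAT))
for _ in range(2000):
    z, a = rng.normal(size=D_LAT), rng.normal(size=D_ACT)
    x = np.concatenate([z, a])
    err = x @ W - world(z, a)
    W -= 0.01 * np.outer(x, err)      # gradient step on squared latent error

# After training, the predictor "sees" where an action lands before acting.
z, a = rng.normal(size=D_LAT), rng.normal(size=D_ACT)
pred_err = np.linalg.norm(np.concatenate([z, a]) @ W - world(z, a))
```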
When the path is complex, use vector algebra to check logical consistency: hyperdimensional computing (HDC), where AND, OR, and NOT become mathematical operations on high-dimensional vectors. Precise algebraic operations, not statistical guesses.
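A sketch of how that algebra works, using one classic hyperdimensional-computing scheme — bipolar vectors with elementwise binding. The scheme and the Socrates example are illustrative, not a specific library.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000           # high dimension makes random vectors nearly orthogonal

def hv():            return rng.choice([-1, 1], size=D)   # random hypervector
def bind(a, b):      return a * b            # XOR-like pairing, self-inverse
def bundle(*vs):     return np.sign(np.sum(vs, axis=0))   # majority vote (OR-ish)
def negate(a):       return -a               # NOT as vector complement
def sim(a, b):       return float(a @ b) / D              # cosine similarity

socrates, mortal, is_a = hv(), hv(), hv()

# Encode the fact (is_a socrates mortal) as a single vector ...
fact = bind(is_a, bind(socrates, mortal))

# ... then query it algebraically: unbinding recovers the missing role.
recovered = bind(bind(is_a, socrates), fact)
assert sim(recovered, mortal) == 1.0         # exact, not a statistical guess
assert abs(sim(socrates, mortal)) < 0.05     # unrelated symbols ≈ orthogonal
```

Because binding is its own inverse over {-1, 1}, the recovery is algebraically exact, which is the whole appeal: logical structure checked by arithmetic rather than sampled by a language model.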
Take the abstract goal and logic, slowly refine a complete output. Like a blurry image coming into focus. The entire thought is refined simultaneously — no commitment to the first word before the last word is ready.
Autoregressive language models commit to each token as they generate it. They start saying something, get something wrong, and then have to correct themselves. LLM text comes out flat because it can't build up to a point the way humans do.
Text diffusion doesn't have this characteristic. It establishes the global semantic structure and then iteratively settles on specific tokens. Humans have the gist of a point before finding the words — diffusion mimics this.
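The settle-globally-then-commit schedule can be shown with a toy iterative decoder. The oracle `confidence` function stands in for a learned denoiser, and the target string is arbitrary; only the schedule — propose for every open position in parallel, commit a growing slice each round — is the point.

```python
TARGET = "the organism is the pattern not the substrate"
ALPHABET = sorted(set(TARGET))
MASK = "_"

def confidence(pos: int, ch: str) -> float:
    """Stand-in for a learned denoiser's per-position confidence."""
    return 1.0 if ch == TARGET[pos] else 0.0

state = [MASK] * len(TARGET)       # fully noised: every position open at once
ROUNDS = 6
for r in range(ROUNDS):
    open_pos = [i for i, ch in enumerate(state) if ch == MASK]
    if not open_pos:
        break
    # Propose a character for EVERY open position in parallel ...
    proposals = {i: max(ALPHABET, key=lambda c: confidence(i, c))
                 for i in open_pos}
    # ... but commit only the most confident slice: coarse first, detail later.
    budget = max(1, len(open_pos) // (ROUNDS - r))
    ranked = sorted(open_pos, key=lambda i: confidence(i, proposals[i]),
                    reverse=True)
    for i in ranked[:budget]:
        state[i] = proposals[i]
```

No position is ever committed before the whole sequence has been considered — the last word constrains the first from round one.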
Custom tools for mapping the internal geometry of embedding spaces. Identifying which dimensions encode which capabilities. Sparse autoencoders as interpretability microscopes.
The vision: a debugger that doesn't show you code but shows you the vector trajectory through the model's reasoning. Chain of Thought visible in its raw mathematical form.
All of these ideas feed back into Clawling's organisms. Clipping could apply to model weights, not just context. JEPA-style intuition could replace autoregressive generation. HDC could provide structured reasoning within the 100 KB identity budget. Latent space cartography provides the interpretability layer for understanding how organisms evolve.
Context evolution is what we can do now. Model evolution is the next frontier.
The philosophy and the engineering are the same thing. When someone asks "why is the genome in markdown?" the answer is both philosophical and technical — and the two answers are the same answer.