2025 was a strong and eventful year of progress in Large Language Models, marked by fundamental changes in how these systems are built and understood. The following is a list of personally notable and mildly surprising "paradigm changes": things that altered the landscape and stood out to me conceptually.
The nature of this emerging intelligence presents a paradox, defying simple categorization — LLMs are emerging as a new kind of intelligence, simultaneously a lot smarter than I expected and a lot dumber than I expected.
Insights
A major evolution in training methodology has established a new standard, enabling models to exhibit reasoning-like behaviors — In 2025, Reinforcement Learning from Verifiable Rewards (RLVR) emerged as the de facto major new stage of the training pipeline. By training LLMs against automatically verifiable rewards across a number of environments... the LLMs spontaneously develop strategies that look like "reasoning" to humans.
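As a hedged illustration of what "verifiable" means here, the sketch below scores a completion against a known-correct answer with no human in the loop. The checker and the toy math problem are invented for this example; a real RLVR pipeline would feed rewards like this into a policy-gradient optimizer rather than use them in isolation.

```python
import re

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Return 1.0 if the last number in the completion matches the
    known-correct answer, else 0.0. No human judgment is needed."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == ground_truth else 0.0

# A toy "environment": a math problem with an automatically checkable answer.
assert verifiable_reward("Let me think... 17*3 = 51. The answer is 51.", "51") == 1.0
assert verifiable_reward("The answer is 50.", "51") == 0.0
```

Because the reward is computed mechanically, it can be evaluated millions of times during training, which is what lets reasoning-like strategies emerge through optimization rather than imitation.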
We must rethink our understanding of these models, moving away from biological analogies to more accurate conceptualizations — We're not "evolving/growing animals", we are "summoning ghosts". ... we are getting very different entities in the intelligence space, which are inappropriate to think about through an animal lens.
This difference manifests in a unique performance profile where capabilities are jagged rather than uniform — they are at the same time a genius polymath and a confused and cognitively challenged grade schooler, seconds away from getting tricked by a jailbreak to exfiltrate your data.
Consequently, the reliability of traditional metrics has been compromised by these new training techniques — The core issue is that benchmarks are almost by construction verifiable environments and are therefore immediately susceptible to RLVR and weaker forms of it via synthetic data generation.
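To make the "weaker form via synthetic data generation" concrete, here is a minimal, hypothetical sketch: because a benchmark's grader is automatic, one can rejection-sample completions against it and fine-tune on the survivors, so the benchmark effectively leaks into training. `sample_model`, `checker`, and the benchmark item are all invented stand-ins for this illustration.

```python
import random

random.seed(0)  # deterministic for the sketch

def sample_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM sampler: random guesses, no reasoning.
    return f"The answer is {random.randint(40, 60)}."

def checker(completion: str, answer: str) -> bool:
    # The benchmark's automatic grader: exact final-answer match.
    return completion.strip().endswith(f"{answer}.")

benchmark_item = {"prompt": "What is 17 * 3?", "answer": "51"}

# Rejection sampling: keep only completions the grader accepts, then
# fine-tune on them -- at that point the benchmark has become training data.
synthetic_data = [
    c for c in (sample_model(benchmark_item["prompt"]) for _ in range(200))
    if checker(c, benchmark_item["answer"])
]
```

Any metric whose scoring function can be run in a loop like this is, by construction, also an optimization target.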
Practices
A fundamental democratization of software creation has occurred, changing the very nature of code — With vibe coding, programming is no longer strictly reserved for highly trained professionals; it is something anyone can do. ... code is suddenly free, ephemeral, malleable, discardable after a single use.
The deployment architecture for AI assistance is shifting towards local environments to leverage context and speed — it makes more sense to run the agents directly on the developer's computer. Note that the primary distinction that matters is not about where the "AI ops" happen to run... but about everything else: the already-existing and booted-up computer, its installation, context, data.
This evolves the interaction model from a passive tool to an active, resident collaborator — it's not just a website you go to like Google, it's a little spirit/ghost that "lives" on your computer. This is a new, distinct paradigm of interaction with an AI.
Future interfaces will likely move beyond text to more efficient, visual formats — LLMs should speak to us in our favored format - in images, infographics, slides, whiteboards, animations/videos, web apps, etc.