Mine definitely would. This sounds so clichéd, but Claude (Opus 4.5, but also the others) just "gets how I think" better. I've tried Gemini 3 and GPT 5.2 and didn't like them at all -- not when I know I can have Claude. I mostly code Python + Django, so that could also be part of it.
Gemini 3 has this extremely annoying habit of bleeding its reasoning process into comments, which are hard to read and not very human-like (they're not "reasoning", they're questioning for the sake of questioning, which I get as part of the process, but not as a comment in the code!). I've seen it do things like this many times:
# Because so and so and so and so we must do x(param1=True, param2=False)
# Actually! No, wait! It is better if we do x(param1=True, param2=True)
x(param1=True, param2=True, param3=False) # This one is even better!
Beyond that, it just does not produce what I consider good Python code. I daily-drove Gemini 2.5 before I realized how good Anthropic's models were (or perhaps before they punched back after 2.5?) and haven't been able to go back.
As for GPT 5.2, I just feel like it doesn't really follow my instructions or way of thinking. Like it's dead set on following whatever best practices it has learned, and if I disagree with them, well tough luck. Plus, and I have no better way of saying this, it's just rude and cold, and I hate it for it.
I recently discovered Claude, and it does much better than Codex or Gemini for Python code.
Gemini seems to lean toward making everything a script, disconnected from the larger vision. Sure, it uses our existing libraries, but the files it writes and the functions it makes can’t be integrated back in.
Codex is fast. Very fast. Which makes it great for a conversational UI, and for answering questions about the codebase or proposing alternatives, but when it writes code it’s too clever. The code is valid but not pythonic. Like inventing one-line functions just to optimize a situation that could have been parameterized in three places.
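To make the “too clever” point concrete, here’s a made-up Python illustration (not actual output from any of these models) of the kind of pattern I mean: single-purpose one-liners invented to shave characters off three call sites, when a parameter would have done.

def shorten(text: str, limit: int, suffix: str = "...") -> str:
    """Truncate text to limit characters, appending suffix if it was cut."""
    return text if len(text) <= limit else text[: limit - len(suffix)] + suffix

# the "clever" version: three single-purpose one-liners wrapping the same call
shorten_title = lambda t: shorten(t, 60)
shorten_summary = lambda t: shorten(t, 200)
shorten_slug = lambda t: shorten(t, 40, suffix="")

# the plain version: just pass the limit at each of the three call sites
print(shorten("A fairly long article title that will not fit anywhere", 30))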
Claude on the other hand makes code that is simple to understand and has enough architecture that you can lift it out and use as is without too much rewriting.
Because the concept of skills is not tied to code development :) Of course, if that's what you're talking about, you are already very close to the "interface" that skills are presented in, and they are obvious (and perhaps not so useful).
But think of your dad or grandma using a generic agent, and simply selecting that they want to have certain skills available to it. Don't even think of it as a chat interface. This is just some option that they set in their phone assistant app. Or, rather, it may be that they actually selected "Determine the best skills based on context", and the assistant has "skill packs" which it periodically determines it needs to enable based on key moments in the conversation or latest interactions.
These are all workarounds for the problems of learning, memory...and, ultimately, limited context. But they for sure will be extremely useful.
I'm so excited for the future, because _clearly_ our technology has loads to improve. Even if new models don't come out, the tooling we build upon them, and the way we use them, is sure to improve.
One particular way I can imagine this is with some sort of "multipass makeshift attention system" built on top of the mechanisms we have today. I think for sure we can store the available skills in one place and look only at the last part of the query, asking the model: "Given this small, self-contained bit of the conversation, do you think any of these skills is a prime candidate to be used?" or "Do you need a little bit more context to make that decision?". We then pass along that model's final answer as a suggestion to the actual model creating the answer. There is a delicate balance between "leading the model on" with imperfect information (because we cut the context) and actually "focusing it" on the task at hand and the skill selection. Well, and, of course, there's the issue of time and cost.
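Just to make that concrete, here's a rough Python sketch of what I'm imagining. Everything here is hypothetical: call_model() stands in for whatever LLM API you actually use, and the skills catalog is made up.

SKILLS = {
    "calendar": "Create, move, or cancel calendar events.",
    "travel": "Look up flights, trains, and hotels.",
    "code_review": "Review diffs and suggest improvements.",
}

def call_model(prompt: str, model: str) -> str:
    """Placeholder for a real LLM call (Anthropic, OpenAI, local, ...)."""
    raise NotImplementedError("wire this up to your provider of choice")

def suggest_skill(conversation: list[str]) -> str | None:
    """Ask a small, cheap model whether any skill is a prime candidate,
    looking only at the tail of the conversation."""
    tail = "\n".join(conversation[-3:])  # only the last few turns, not the whole context
    catalog = "\n".join(f"- {name}: {desc}" for name, desc in SKILLS.items())
    prompt = (
        "Given this small, self-contained bit of a conversation:\n"
        f"{tail}\n\nAnd these available skills:\n{catalog}\n\n"
        "Answer with exactly one skill name if it is a prime candidate to be used, "
        "or 'none' if nothing fits or you would need more context."
    )
    answer = call_model(prompt, model="small-cheap-model").strip().lower()
    return answer if answer in SKILLS else None

def answer_with_hint(conversation: list[str]) -> str:
    """The big model still sees the full conversation; the small model's pick is
    passed along only as a suggestion, to focus it without leading it on."""
    hint = suggest_skill(conversation)
    preamble = f"(A routing step suggests the '{hint}' skill may be relevant.)\n\n" if hint else ""
    return call_model(preamble + "\n".join(conversation), model="big-context-model")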
I actually believe we will see several solutions make use of techniques such as this, where some model determines what the "big context" model should be focusing on as part of its larger context (in which it may get lost).
In many ways, this is similar to what modern agents already do. Cursor doesn't keep files in the context: it constantly re-reads only the parts it believes are important. But I think it might be useful to keep the files in the context (so we don't make an egregious mistake) while at the same time finding which parts of the context are more important and re-feeding them to the model or highlighting them somehow.
You mean it hasn’t been that way for the last 14 years and it hasn’t survived 5 different computer changes and a dozen or so OS migrations and you don’t still have a tiny document with “fun business ideas” to start that company with your fresh out of college gang right next to that “how to organize a great 10 year reunion” sheet from 3 years ago?
I'm 32, but I noticed I started making similar mistakes around 28 or so. Occasionally I write out words which are completely wrong, but sound similar.
It's as if one part of the brain is doing the thinking, and another one is "listening" to it and transcribing/typing it out, making mistakes.
For a little while I was a bit worried, but I then realized nothing else had changed, so I've just gotten used to it and like to jokingly say "I've become so fast at thinking that even I can't keep up!"
I suspect that as I gained more vocabulary (and more languages) and started to actually speak the language more, I began remembering the "sound" of the word where I used to remember the spelling. Like "it's" and "its" wasn't a problem before, but now I catch it somewhere in the text. At least I consistently write more than 50% of the articles nowadays. Languages are weird.
While I was writing the reply, I considered giving the absolutely exact same explanation, which I agree with, down to the example you gave.
Two examples in Portuguese that always trip me up, and absolutely never used to before, are (i) "voz" (voice) and "vós" (you); and (ii) "trás" (back) and "traz" (bring).
I think the logic can be applied to humans as well as AI:
Sure, the AI _can_ code integrations, but it now has to maintain them, and might be tempted to modify them when it doesn't need to (leaky abstractions), adding cognitive load (in LLM parlance: "context pollution") and leading to worse results.
Batteries-included = AI and humans write less code, get more "headspace"/"free context" to focus on what "really matters".
As a very very heavy LLM user, I also notice that projects tend to be much easier for LLMs (and humans alike) to work on when they use opinionated well-established frameworks.
Nonetheless, I'm positive in a couple of years we'll have found a way for LLMs to be equally good, if not better, with other frameworks. I think we'll find mechanisms to have LLMs learn libraries and projects on the fly much better. I can imagine crazy scenarios where LLMs train smaller LLMs on project parts or libraries so they don't get context pollution but also don't need a full retraining (or incredibly pricey inference). I can also think of a system in line with Anthropic's view of skills, where LLMs very intelligently switch their knowledge on or off. The technology isn't there yet, but we're moving FAST!
> As a very very heavy LLM user, I also notice that projects tend to be much easier for LLMs (and humans alike) to work on when they use opinionated well-established frameworks.
i have the exact opposite experience. it's far better to have llms start from scratch than use batteries that are just slightly the wrong shape... the llm will run circles and hallucinate nonexistent solutions.
that said, i have had a lot of success having llms write opinionated (my opinions) packages that are shaped in the way that llms like (very little indirection, breadcrumbs to follow for code paths etc), and then have the llm write its own documentation.
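for what it's worth, here's a made-up miniature of what "shaped in the way that llms like" can look like (the billing API and names are invented, it's just the shape that matters): one flat module, basically no indirection, and breadcrumb comments saying where each code path goes next.

import json
import urllib.request

# breadcrumb: entry point, called by the (hypothetical) cli wrapper
def sync_invoices(invoices: list[dict], base_url: str) -> list[dict]:
    """push every invoice and return the ones that failed"""
    failed = []
    for invoice in invoices:
        try:
            push_invoice(invoice, base_url)  # breadcrumb: next stop is push_invoice() below
        except OSError:
            failed.append(invoice)
    return failed

# breadcrumb: only caller is sync_invoices() above; no other code path reaches this
def push_invoice(invoice: dict, base_url: str) -> None:
    """POST a single invoice to the billing API. one request, no retries, no magic"""
    req = urllib.request.Request(
        f"{base_url}/invoices",
        data=json.dumps(invoice).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req).read()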
- Strict team separation (frontend versus backend)
- Moving all state-management out of the backend and onto the frontend, into a supposedly easier-to-manage system
- Page refreshes are indeed jarring to users and more prone to leading to sudden context losses
- Desktop applications did not behave like web apps: they are "SPAs" in their own sense, without jarring refreshes or code that gets "yanked" out of execution. Since the OS has been increasingly abstracted under the browser, and the average computer user has moved more and more towards web apps[1], it stands to reason that the behavior of web apps should become more like that of desktop apps (i.e. "SPAs")[2]
(Not saying I agree with these, merely pointing them out)
[1] These things are not entirely independent. It can be argued that the same powers that be (big corps) that pushed SPAs onto users are also pushing the "browser as OS" concept.
[2] I know you can get desktop-like behavior from non-SPAs, but it is definitely not as easy to do it or at least to _learn it_ now.
My actual opinion: I think it's a little bit of everything, with a big part of it coming from the fact that the web was the easiest way to build something you could share with people effortlessly. Sharing desktop apps wasn't particularly easy (different targets, Java was never truly "run anywhere", etc.), but to share a web app you just put it online very quickly and have someone else point their browser to a URL -- often all they'll do is click a link! And in general it is definitely easier to build an SPA (from the frontender's perspective) than something else.
This creates a chain:
If I can create and share easily
-> I am motivated to do things easily
-> I learn the specific technology that is easiest
-> the market is flooded with people who know this technology better than everything else
-> the market must now hire from this pool to get the cheapest workers (or those who cost less to acquire due to quicker hiring processes)
-> new devs know that they need to learn this technology to get hired
-> the cycle continues
So, TL;DR: Much lower barrier to entry + quick feedback loops
P.S (and on topic): I am an extremely satisfied django developer, and very very very rarely touch frontend. Django is A-M-A-Z-I-N-G.
These days, with 90% of SPAs being broken piles of browser-standard-breaking stuff, I find a page refresh or page load to be a soothing experience. They are like checkpoints in the process of using a website: points I can go back to using my browser's back button, and I can trust that my browser keeps track of them.
In contrast, when I see an SPA, I need to worry about the whole site going to shit, because I blocked some third-party unwanted script, and then I need to fear not being able to go back properly, and having to re-do everything. Now that is a jarring experience.
I feel you. It's definitely a tradeoff. SPAs do tend to be buggier but I can't deny that, when done right, they also tend to be better.
Unfortunately, there's _more_ people, building _more_ stuff, so there's _more_ terrible stuff out there. The amount of new apps (and new developers, especially ones with quite limited skills) is immense compared to something like ten years ago. This means that there's just more room for things to be poorly-built.
Not being able to open the app in a new tab as a fresh instance is terrible. Even super complex SPAs like Facebook mostly let me right-click and open a new tab to create a new instance while preserving the old one.
100% agree. This is exactly the kind of big-picture thinking that so many people seem to miss. I did too, when I was young and thought the world was just filled with black-and-white, good-vs-evil dichotomies.
It is a tale as old as time, and one which no doubt all of us repeat at some point in our lives. There are hundreds of clichéd books, hundreds of songs, and thousands of letters that echo this sentiment.
We are rarely capable of valuing the freedoms we have never been deprived of.
To be privileged is to live at the quiet centre of a never-ending cycle: between taking a freedom for granted (only to eventually lose it), and fighting for that freedom, which we by then so desperately need.
And as Thomas Paine put it: "Those who expect to reap the blessings of freedom, must, like men, undergo the fatigues of supporting it."
(I make this remark merely as a gag. I think you pinpoint an issue which has been unresolved for ages and is knee-deep in ethics. One could argue that many of our disagreements about AI and progress (in a broader sense) stem from different positions on ethics, including utilitarianism.)
I’m a utilitarian, but a transparent one. I think most people are uncomfortable saying “I acknowledge the pain and suffering it took to make my iPhone, and the tradeoff is worth it.”