I've been using Claude Code for about two weeks and, to be honest, as a vibe coding skeptic, I was amazed. It has a learning curve. You need to learn how to give it proper context, how to chunk up the work, etc. And you need to know how to program, obviously. Asking it to do something you don't know how to do yourself is just asking for disaster. I have more than 25 years of experience, so I'm confident with anything Claude Code will try to do and can review it, or stop and redirect it. About 10-15 years ago, I was dreaming about some kind of neural interface where I could program without writing any code. And I realized that with Claude Code, it's kind of here.
A couple of times I hit the daily limits and decided to try Gemini CLI with the 2.5 pro model as a replacement. That's not even comparable to Claude Code. The frustration with Gemini is just not worth it.
I couldn't imagine paying >$100/month for a dev tool in the past, but I'm seriously considering upgrading to the Max plans.
If you are a Senior Developer who is comfortable giving a Junior tips and then guiding them to a fix (or just stepping in for a brief moment and writing the part they missed), this is for you. I'm hearing from Senior devs all over, though, that Junior developers are just garbage at it. They produce slow, insecure, or just outright awful code with it, and then they PR code they don't even understand.
For me the sweet spot is boilerplate (give me a blueprint of a class based on a description), or translating some JSON into a class or into some other format. Questions like "what's wrong with this code? How would a Staff Level Engineer write it?" are also useful. I've found bugs before hitting debug by asking what's wrong with the code I just pounded out on my keyboard by hand.
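To make that concrete, here's a toy sketch of the JSON-to-class chore (payload and names are invented for illustration, in Python):

    # Hypothetical example: given a payload like
    # {"id": 7, "name": "anna", "tags": ["a", "b"]},
    # ask for the matching class plus a parser.
    from dataclasses import dataclass, field

    @dataclass
    class User:
        id: int
        name: str
        tags: list[str] = field(default_factory=list)

        @classmethod
        def from_json(cls, data: dict) -> "User":
            # Tolerate a missing "tags" key instead of raising.
            return cls(id=data["id"], name=data["name"], tags=data.get("tags", []))

Trivial on its own, but multiplied across dozens of payload shapes it's exactly the kind of tedium an LLM removes.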
Yes, can confirm that as a senior developer who has needed to spend huge amounts of time reviewing junior code from off-shore contractors with very detailed and explicit instructions, dabbling in agentic LLM coding tools like Claude Code has felt like a gift from heaven.
I also have concerns about said junior developers wielding such tools, because yes, without being able to supply the right kind of context and being able to understand the difference between a good solution and a bad solution, they will produce tons of awful, but technically working code.
Totally agree with the off-shore component of this. I'm already going to have to break a task down into clear detail and resolve any anticipated blocker myself upfront to avoid multi-timezone multi-day back and forth.
Now that I'm practiced at that, the off-shored part is no longer valuable.
Many companies that see themselves as non-technical at the core prefer building solutions with an army of intermediate developers that are hot swappable. Having highly skilled developers is a risk for them.
Unlikely. Microsoft had layoffs everywhere except India. There they keep hiring more. As long as they can keep upskilling themselves while still being much cheaper than US workers, they won't fear unemployment.
Just yesterday I saw on X a video of a Miami hotel where the check-in procedure was via a video call to a receptionist in India.
Six months from now, that singular worker, if they are still employed, will manage a high number of receptionist avatars. And then they themselves will be replaced. It will still lead to a massive collapse in the labor market, and with all of that excess labor, those who keep their jobs will be overworked while seeing flat to decreasing wages.
Most people underestimate how strongly capital wants to displace labor, even if the outcomes are demonstrably worse. Especially in a captured scenario like hotel reception: you have already booked, you aren't going anywhere else.
Blowing away the junior -> senior pipeline would, on average, hit every country the same.
Though it raises an interesting point: if a country like India or China did make the investment in hiring, paying, and mentoring junior people but e.g. the US didn't, then you could see a massive shift in the global center of gravity around software expertise in 10 years (plus or minus).
Someone is going to be the best at planning for and investing in the future on this, someone is going to go maximally wishful-thinking / short-term-thinking on it, and seductive-but-not-really-there vibe coding is probably going to be a major pivot point there.
This is such an important point. Not sure about India, which is still very much market-forces driven, but China can just force its employers to do whatever is of strategic importance. That's long gone in the US. Market forces here will only ever optimize for short-term gain, shooting ourselves in the chest.
I've gotten myself into a PILE of trouble when trying to use LLMs with languages/technologies I am unfamiliar with (React, don't judge me).
But with something that I am familiar with (say Go, or Python) LLMs have improved my velocity massively, with the caveat that I have had to explicitly tell the LLM when it is producing something that I know that I don't want (me arguing with an LLM was an experience too!)
Ah mate, I can't relate more to the offshore component. I had a very sad experience where I recently had to let go of an offshore team because they were providing devs that were essentially 'a junior with Copilot' but labelled as 'senior'.
Time and time again I would find telltale signs of LLM output dumped into PRs and then claimed as their own. Not a problem in itself, but the code didn't do what the detailed ticket asked and introduced other bugs as a result.
It ultimately became a choice of ‘go through the hassle of making a detailed brief for it to just be put in copilot verbatim and then go through the hassle of reviewing it and explaining the issues back to the offshore dev’ or ‘brief Claude directly’
I hate to say it but from a business perspective the latter won outright. It tears me up as it goes against my morality.
I know what you mean it just feels a bit inhumane to me. Sort of like defining a value for a living being and then determining that they fell beneath said value.
> I'm hearing from Senior devs all over, though, that Junior developers are just garbage at it. They produce slow, insecure, or just outright awful code with it, and then they PR code they don't even understand.
If this is the case then we had better have fully AI-generated code within the next 10 years, since those "juniors" will remain atrophied juniors forever and the old timers will be checking in with the big clock in the sky. If we, as a field, believe that this cannot possibly happen, then we are making a huge mistake leaning on a tool that requires "deep [orthogonal] experience" to operate properly.
You can't atrophy if you never grew in the first place. The juniors will be stunted. It's the seniors who will become atrophied.
As for whether it's a mistake, isn't that just the way of things these days? The current world is about extracting as much as you can while you're still here. Look around. Nobody is building for the future. There are a few niche groups that talk about it, but nobody is really doing it. It's just take, take, take.
This just seems more of the same, but we're speeding up. We started by extracting fossil fuels deposited over millions of years, then extracting resources and technology from civilisations deposited over millennia, then from the Victorians deposited only a century or two ago, and now it's software deposited over only mere decades. Someone is going to be left holding the bag, we just hope it's not us. Meanwhile most of the population aren't even thinking about it, and most of the fraction that do think are dreaming that technology is going to save us before it's payback time.
IT education and computer science (at least part of it) will need a stronger focus on software engineering and software architecture skills to teach developers how to be in control of an AI dev tool.
The fastest way is via struggle. Learn to do it yourself first. Understand WHY it does not work. What's good code? What's bad code? What are conventions?
There are no shortcuts - you are not an accountant just because you have a calculator.
With that mindset you don't have to go to school, you could learn everything through struggle... Ideally it's a bit of both, you need theory and experience to succeed.
Brains are not computers and we don't learn by being given abstract rules. We also don't learn nearly as well from classroom teaching as we do from doing things IRL for a real purpose; the brain always knows the difference, and knows that the (real, non-artificially created) stakes are low in a teaching environment.
That's also the huge difference between AI and brains: AI does not work on the real world but on our communication (and even that is limited to text, missing all the nuance that face-to-face communication includes). The brain works based on sensor data from the real world. The communication method, language, is a very limited add-on on top of how the brain really works. We don't think in language; doing even some abstract language-based thinking, e.g. when doing formal math, requires a lot of concentration and effort and still uses a lot of "under the hood" intuition.
That is why, even with years of learning the same curriculum, we still need to make a significant effort for every single concrete example to "get everyone on the same page", creating compatible internal models under the hood. Everybody's internal model of even simple things is slightly different, depending on what brain they brought to learning and what exactly they learned, where even things like social classroom interactions went into how the connections were formed. Only on top of a huge amount of effort can we then use language to communicate in the abstract, and even then, when we leave the central corridor of ideas, people will start arguing forever about definitions. No matter that the written text is the same, the internal model is different for every person.
As someone who took neuroscience, I found this surprisingly well written:
"The brain doesn't like to abstract unless you make it"
> This resource, prepared by members of the University of London Centre for Educational Neuroscience (CEN), gives a brief overview of how the brain works for a general audience. It is based on the most recent research. It aims to give a gist of the brain’s principles of function, covering the brain’s evolutionary origin, how it develops, and how it copes in the modern world.
The best way to learn is to do things IRL that matter. School is a compromise and not really all that great. People motivated by actual need often can learn things that take years in school with middling results significantly faster and with better and deeper results.
Yeah. The only, and I mean only, non-social/networking advantage of universities stems from forced learning/reasoning about complex theoretical concepts that form the requisite base knowledge for learning the practical requirements of your field on the job.
Trade schools and certificate programs are designed to churn out people with journeyman-level skills in some field. They repeatedly drill you on the practical day-in-day-out requirements, tasks, troubleshooting tools and techniques, etc. that you need to walk up to a job site and be useful. The fields generally have a predictable enough set of technical problems to deal with that a deep theoretical exploration is unnecessary. This is just as true for electricians and auto mechanics as it is for people doing limited but logistically complex technical work, like orchestrating a big fleet of windows workstations with all the Microsoft enterprise tools.
In software development and lots of other fields that require grappling with complex theoretical stuff, you really need both the practical and the theoretical background to be productive. That would be a ridiculous undertaking for a school, and it’s why we have internships/externships/jr positions.
Between these tools letting the seniors in a department do all of the work, so companies don't have to invest in interns/juniors and there's no reliable entry point into the field, and the even bigger disconnect between what schools offer and the skills graduates need to compete, the industry has some rough days ahead, and a whole lot of people trying to get a foothold right now are screwed. I'm kind of surprised how little so many people in tech seem to care about the impending rough road for entry-level folks in the industry. I guess it's a combination of how little most higher-level developers have to interact with them, and the fact that everybody was tripping over themselves to hire developers when a lot of today's seniors joined the industry.
It's not a particularly moral way to think, but if you're currently mid level or senior, the junior dev pipeline being cut off will be beneficial to you personally in a few years' time.
Potentially very beneficial, if it turns out software engineers are still needed but nobody has been training them for half a decade.
It's clear that it harms those who get to keep their jobs somewhat less (though when you've got a glut of talent and few jobs, the only winners are employers, because salaries eventually tank). But frankly, the pervasiveness of that intense greed and self-absorption used to be anathema to the American software industry. Now it looks a lot more like a bunch of private equity bros than a bunch of people who stood to make good money selling creative solutions to the world's problems. Even worse, the developers that built this business still think they're part of the in-club, and too special and talented to get tossed out like a bag of moldy peaches. They're wrong, and it's sad to watch.
And that is the best thing about AI, it allows you to do and try so much more in the limited time you have. If you have an idea, build it with AI, test it, see where it breaks. AI is going to be a big boost for education, because it allows for so much more experimentation and hands-on.
By using AI, you learn how to use AI, not necessarily how to build architecturally sound and maintainable software, so being able to do much more in a limited amount of time will not necessarily make you a more knowledgeable programmer, or at least that knowledge will most likely only be surface-level pattern recognition. It still needs to be combined with hands-on building your own thing, to truly understand the nuts and bolts of such projects.
If you end up with a working project where you understand all the moving parts, I think AI is great for learning, and the ultimate proof of whether the learning was successful is whether you can actually build (and ship) things.
So human teachers are good to have as well, but I remember they were of limited use for me when I was learning programming without AI. So many concepts they tried to teach me without having understood them themselves first. AI would likely have helped me get better answers instead of "because that is how you do it" when asking why something is done a certain way.
So obviously I would have preferred competent teachers all the time, and would now prefer competent teachers with unlimited time over faulty AIs for the students, but in reality human time is limited and humans are flawed as well. So I don't see the doomsday expectations for the new generation of programmers. The ultimate goal, building something that works to spec, did not change, and horrible unmaintainable code was also shipped 20 years ago.
I don't agree. To me, switching from hand-coded source code to AI-coded source code is like going from a hand saw to an electric saw for your woodworking projects. In the end you still have to know woodworking, but you experiment much more, so you learn more.
Or maybe it's more like going from analog photography to digital photography. Whatever it is, you get more programming done.
Just like when you go from assembly to C to a memory-managed language like Java. I did some 6502 and 68000 assembly over 35 years ago; now nobody knows assembly.
Key words there. To you, it's an electric saw because you already know how to program, and that's the other person's point; it doesn't necessarily empower people to build software. You? Yes. Generally though, when you hand the public an electric saw and say "have at it, build stuff", you end up with a lot of lost appendages.
Sadly, in this case the "lost appendages" are going to be man-decades of time spent undoing all the landmines vibecoders are going to plant around the digital commons. Which means AI even fails as a metaphorical "electric saw", because a good electric saw should strike fear into the user by promising mortal damage through misuse. AI has no such misuse deterrent, so people will freely misuse it until consequences swing back wildly, and the blast radius is community-scale.
> more like going from analog photography to digital photography. Whatever it is, you get more programming done.
By volume, the primary outcome of digital photography has been a deluge of pointless photographs to the extent we've had to invent new words to categorize them. "selfies". "sexts". "foodstagramming". Sure, AI will increase the actual programming being done, the same way digital photography gave us more photography art. But much more than that, AI will bring the equivalent of "foodstagramming" but for programs. Kind of like how the Apple App Store brought us some good apps, but at the same time 9 bajillion travel guides and flashlight apps. When you lower the bar you also open the flood gates.
Being able to do it quicker and cheaper will often ensure more people will learn the basics. Electrical tools open up woodworking to more people, same with digital photography, more people take the effort to learn the basics. There will also be many more people making rubbish, but is that really a problem?
With AI it's cheap and fast for a professional to ask the AI: what does this rubbish software do, and can you create me a more robust version following these guidelines?
> With AI it's cheap and fast for a professional to ask the AI: what does this rubbish software do, and can you create me a more robust version following these guidelines?
This falls apart today with sufficiently complex software and also seems to require source availability (or perfect specifications).
One of the things I keep an eye out for in terms of "have LLMs actually cracked large-product complexity yet" (vs human-overseen patches or greenfield demos) is exactly that sort of re-implementation-and-improvement you talk about. Like a greenfield Photoshop substitute.
Your last point is also something that happened when the big game engines such as Unity became free to use. All of a sudden, Steam Greenlight was getting flooded with gems such as "potato peeling simulator" et al. I suppose it is just a natural side effect of making things more accessible.
> Sadly, in this case the "lost appendages" are going to be man-decades of time spent undoing all the landmines vibecoders are going to plant around the digital commons.
Aren't you being overly optimistic that these would even get traction?
Pessimistic, but yeah. It's just my whole life has been a string of the absolute worst ideas being implemented at scale, so I don't see why this would buck the trend.
> By using AI, you learn how to use AI, not necessarily how to build architecturally sound and maintainable software
> will not necessarily make you a more knowledgeable programmer
I think we'd better start separating "building software" from programming, because the act of programming is going to continue to get less and less valuable.
I would argue that programming has been very overvalued for a while even before AI, and the industry believes its own hype, with a healthy dose of elitism mixed in.
But now AI is removing the facade and showing that the idea and the architecture are actually the important part, not the coding of it.
Ok. But most developers aren't building AI tech. Instead, they're coding a SPA or CRUD app or something else that's been done 10000 times before, but just doing it slightly differently. That's exactly why LLMs are so good at this kind of (programming) work.
I would say most people are dealing with tickets and meetings about the tickets more than they are actually spending time in their editor. It may be similar, but that 1 percent difference needs to be nailed down right, as that's where the business lifeline lies.
Unfortunately, education everywhere is getting really hurt by access to AI, both from students who are enabled not to do their homework, and from teacher review/feedback being replaced by chatbots.
In Germany, software engineering is a trade: you go to trade school for three years while working at a company in parallel. I don't think that IT education and computer science in universities should have a stronger focus on SE, as universities are basically a trade school for being a researcher.
Yes, but it is more of a cultural thing than anything else. Studying computer science to be a software developer* is like studying mechanical engineering to be a machine operator.
* except if you are developing complicated algorithms or do numeric stuff. However, I believe that the majority of developers will never be in such a situation.
A software degree or a CS degree with a more applied focus will teach you way better than the trade schools will. It'd be nice if that weren't the case, but from all I've seen it is.
So you end up in that weird spot where it would work very well for someone with a strong focus on self-learning and a company investing in their side of the training, but at that point you could almost skip the formal part completely and just start directly, assuming you have some self-taught base. Or work part-time while studying on the side, and get the more useful degree that way. Plenty of places will hire promising first-year uni students.
Yeah I noticed the issue with more Junior developers right away. Some developers, Junior or not, have yet to be exposed to environments where their PRs are put under HEAVY scrutiny. They are used to loosey-goosey and unfortunately they are not prepared to put LLM changes under the level of scrutiny they require.
The worst is getting even smallish PRs with a bunch of changes that look extraneous or otherwise off. After asking questions, the code changes without the questions being answered, and likely with a new set of problems. I swear I've been prompting an LLM through an engineer/PR middleman :(
That is how you get Oracle source code. It broke my illusions after entering real life big company coding after university, many years ago. It also led to this gem of an HN comment: https://news.ycombinator.com/item?id=18442637
A couple of weeks ago, I had a little downtime and thought about a new algorithm I wanted to implement. In my head it seemed simple enough that 1) I thought the solution was already known, and 2) it would be fairly easy to write. So I asked Claude to "write me a python function that does Foo". I spent a whole morning going back and forth getting crap and nothing at all like what I wanted.
I don't know what inspired me, but I just started to pretend that I was talking to one of my junior engineers. I first asked for a much simpler function that was on the way to what I wanted (well, technically, it was the mathematical inverse of what I wanted), then I asked it to modify it to add one transform, and then another, and then another. And then finally, once the function was doing what I wanted, I asked it to write me the inverse function. And it got it right.
What was cool about it is that it turned out to involve more complex linear algebra and more edge cases than I originally thought, and it would have taken me weeks to figure all of that out. But using it as a research tool and junior engineer in one was the key.
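To give a flavor of the shape of the trick (a made-up toy in Python, not my actual problem): when the forward function is a composition of simple steps, the inverse you ask for at the end is just the inverse steps applied in reverse order.

    # Hypothetical sketch: a transform built one step at a time
    # (scale, then rotate, then translate), and its inverse obtained
    # by undoing each step in reverse order.
    import numpy as np

    def forward(p, scale, theta, offset):
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
        return rot @ (scale * p) + offset

    def inverse(q, scale, theta, offset):
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
        # Rotation matrices are orthogonal, so the transpose undoes the rotation.
        return (rot.T @ (q - offset)) / scale

Building it up one transform at a time gave the model a verifiable checkpoint at each step, which is what made the final "now write the inverse" request tractable.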
I think if we go down the "vibe coding" route, we will end up with hordes of juniors who don't understand anything, and the stuff they produce with AI will be garbage and brittle. But using AI as a tool is starting to feel more compelling to me.
The LLM will never admit it doesn't have a clue what's going on, but over time you develop a sense of when it's onto something and when it's trapped in a loop of plausible-sounding nonsense.
Edit: Also, it's funny how often you can get it to improve its output by just saying "this looks kind of bad for x reason, there must be a way to make it better"
I have experimented with instructing CC to doubt itself greatly and presume it is not validating anything properly.
It caused it to throw out good ideas for validation and working code.
I want to believe there is some sweet spot.
The constant "Aha!" type responses, followed by self-validating prose that the answer is at hand or within reach, can be intoxicating and cannot be trusted.
The product also seems to be in constant tuning flux, where some sessions result in great progress, and in others the AI seems as if it is deliberately trying to steer you into traffic.
Anthropic has alluded to this being the result of load. They mentioned in their memo about new limits for Max users that abuse of the subscription levels resulted in subpar product experiences. It's possible they meant response times and the overloaded 500 responses or lower-than-normal TPS, but there are many anecdotal accounts of CC suddenly having a bad day from "longtime" users, including myself.
I don’t understand how load would impact the actual model’s performance.
It seems like only load-based impacts on an individual session's context would result in degraded outputs. But I know nothing of serving LLMs at scale.
Can anyone explain how high load might result in an unchanged product performing objectively worse?
An observation. If we stipulate that it is true that a 'senior developer' benefits from Claude Code but junior developers do not, then I'm wondering if that creates a gap where you have a bunch of newly minted '10x' engineers doing the work that a bunch of junior devs used to help with, and now you're not training any new junior devs because they are unemployable. Is that correct?
It already was the case wasn't it, that you could either get one senior dev to build your thing in a week, or give them a team of juniors and it would take the whole team 4 weeks and be worse.
Yet somehow companies continued to opt for the second approach. Something to do with status from headcount?
Yes, there are companies that opt for broken organizations for a variety of reasons. The observation, though, is this: does this lead to a world where the 'minimum' programmer is what we consider today to be a 'Senior Dev'? It echoes the transition of machinists into operators of CAD/CAM workstations driving machining centers, rather than hands on the dials of a mill or lathe. It certainly seems like it might make entering the field through a "coder camp" no longer practical.
It'll be interesting to see if in a decade when a whole cohort of juniors didn't get trained whether LLMs will be able to do the whole job. I'm guessing a lot of companies are willing to bet on yes.
The issue is there's a kind of prisoner's dilemma going on - probably some people can see that there's a serious risk of still needing software engineers in 10 years' time and there not being enough because nobody is training juniors in 2025.
However, noticing this doesn't help, because if you invest in training juniors in 2025 but nobody else does, someone else can just recruit them in 2030 and benefit from your investment.
Yes, exactly: if workers can just up and leave and treat the job transactionally, that creates a race to the bottom. Workers have to train themselves, then.
“Wasting” effort on juniors is where seniors come from. So that first approach is only valid at a sole proprietorship, at an early stage startup, or in an emergency.
I'm getting my money's worth having Claude write tools. We've reached the dream where I can vibe out some one-off software and it's great; today I made two different (shitty but usable!) GUI programs in seconds that let me visually describe some test data. The alternative was probably half an hour of putting something together, if my first idea was good. Then I deleted them and moved on.
It still writes insane things all the time, but I find it really helpful for spitting out single-use stuff and for brainstorming. I try to get it to perform tasks I don't know how to accomplish (e.g. computer vision experiments) and it never really works out in the end, but I often learn something and I'm still very happy with my subscription.
I've also found it good at catching mistakes and helping write commit messages.
"Review the top-most commit. Did I make any mistakes? Did I leave anything out of the commit message?"
Sometimes I let it write the message for me:
"Write a new commit message for the current commit."
I've had to tell it how to write commit messages though. It likes to offer subjective opinions, use superlatives, and guess at why something was done. I've had to tell it to cut that out: "Summarize what has changed. Be concise but thorough. Avoid adjectives and superlatives. Use imperative mood."
Review your own code. Understand why you made the changes. And then clearly describe why you made them. If you can't do that yourself, I think that's a huge gap in your own skills.
Making something else do it means you don't internalize the changes that you made.
Your comment is not a fair interpretation of what I wrote.
For the record, I write better and more detailed commit messages than almost anyone I know across a decades[^0] long career[^1,^2,^3,^4,^5]. But I'm not immune from making mistakes, and everyone can use an editor, or just runs out of mental energy. Unfortunately, I find it hard to get decent PR reviews from my colleagues at work.
So yeah, I've started using Claude Code to help review my own commits. That doesn't mean I don't understand my changes or that I don't know why I made them. And CC is good at banging out a first draft of a commit message. It's also good at catching tiny logic errors that slip through tests and human review. Surprisingly good. You should try it.
I have plenty of criticisms for CC too. I'm not sure it's actually saving me any time. I've spent the last two weeks working 10 hour days with it. For some things it shines. For other things, I would've been better off writing the code from scratch myself, something I've had to do maybe 40% of the time now.
[^5]: None of these are my best examples, just the ones I found quickly. Most of my commit messages are obviously locked away by my employer. Somewhere in the git history is a paragraphs-long commit message from Jeff King (peff) explaining a one-line diff. That's probably my favorite commit message of all time. But I also know that at work I've got a message somewhere explaining a single-character diff.
What I can recommend is to tell it that for all documentation, readmes and PR descriptions to keep it "tight, no purple-prose, no emojis". That cuts everything down nicely to to-the-point docs without GPTisms and without the emoji storm that makes it look like yet another frontend framework Readme.
My commits' description part, if warranted, is about the reason for the changes, not the specifics of the solution. It's a little memo to the person reading the diff, not a long monograph. And the diff is usually small.
Can also confirm. Almost any output from Claude Code needs my careful input for corrections, which you could only spot and provide if you have experience. There is no way a junior is able to command these tools, because the main competency for using them correctly is the ability to guide and teach others in software development, which by definition is only possible if you have senior experience in this field. The sycophancy provided by these models will outright damage the skill progression of juniors, but on the other hand there is no way to not use them. So we are in a state where the future seems really uncertain for most of us.
I find the "killer app" right now is anything where you need to integrate information you don't already have in your brain. A new language or framework, a third-party API, etc. Something straightforward but foreign, and well-documented. You'll save so much time because Claude has already read the docs
The interesting thing about all of this vibe coding skepticism, cynicism, and backlash is that many people have their expectations set extremely low. They’re convinced everything the tools produce will be junk or that the worst case examples people provide are representative of the average.
Then they finally go out and use the tools and realize that they exceed their (extremely low) expectations, and are amazed.
Yeah we all know Claude Code isn’t going to generate a $10 billion SaaS with a team of 10 people or whatever the social media engagement bait VCs are pushing this week. However, the tools are more powerful than a lot of people give them credit for.
In case some people haven't realized it by now: it's not just the code, it's also/mostly the marketing, unless you make something useful that's hard to replicate.
I have recently found something that's needed but very niche, the sort of problem where Claude can only give tips on how to go about it.
People are using different definitions of "vibe coding". If you expect to just prompt without even looking at the code and being involved in the process the result will be crap. This doesn't preclude the usefulness of models as tools, and maybe in the future vibe coding will actually work. Essentially every coder I respect has an opinion that is some shade of this.
There are the social media types you mention and their polar opposites, the "LLMs have no possible use" crowd. These people are mostly delusional. At the grown-ups table, there is a spectrum of opinions about the relative usefulness.
It's not contradictory to believe that the average programmer right now has his head buried in the sand and should at least take time to explore what value LLMs can provide, while at the same time taking a more conservative approach when using them to do actual work.
> maybe in the future vibe coding will actually work
Vibe coding works today at a small enough scale.
I'm building a personal app to help me track nutrition, and I only needed to get involved in the code when Claude would hit its limits for a single file and produce a broken program (and this was via the UI, not Claude Code). Now at ~3000 lines of Python.
After I told it to split the code into a few files, I don't think I've had to talk about anything at the code level. Note that I eventually did switch to using Claude Code, which might have helped (it gets annoying copy/pasting multiple files, and then my prompts hit max limits).
I just prompt it like an experienced QA/product person to tell it how to build things, point out bugs (as experienced by a user), point out bad data, etc.
A few of my recent prompts (each is a separate prompt):
>for foods found but not in database, list the number of times each shows up
>sort the list by count descending
>Period surplus/deficit seems too low. looking at 2025/07/24 to 2025/07/31
>do not require beige color (but still track it). combine blue/purple as one in stats (but keep separate colors). data has both white and White; should use standard case and not show as two colors
Hmm not my experience. I've been aggressively trying to use both Cursor and Claude Code. I've done maybe 20-30 attempts with Code at different projects, a couple of them personal small projects. All of them resulted in sub-par results, essentially unusable.
I tried to use it for Python, Rust and Bash. I also tried to use it for crawling and organizing information. I also tried to use it as a debugging buddy. All of the attempts failed.
I simply don't understand how people are using it in a way that improves productivity. For me, all of this is so far a huge timesink with essentially nothing to show for it.
The single positive result was when I asked it to optimize a specific SQL query, and it managed to do it.
Anyway I will keep trying to use it, maybe something needs to click first and it just hasn't yet.
I asked it to implement a C++ backend for an audio plug-in API (CLAP) for the DAW I'm developing and it got it right in maybe less than ten interactions. Implementing other plug-in APIs such as VST3 took me weeks to get to the same level of support.
I've been delegating all of my tedious tasks with as much context as I would give a person, and my personal win rate at this is substantially higher than I expected.
If you give it trash and expect gold, sure, gambling.
Which is what I meant by, "you need to be very deliberate with it", you have to spend a lot of time on the inputs to get good outputs. Which makes it feel a fair bit less like "Intelligence" and a lot more like a calculator.
Context is everything. Whether you're talking to your junior employee or to an LLM, if you don't say what it is you want, don't be surprised when it's left to interpretation and comes out wrong.
Specifically, Claude Code is really good at making markdown files of plans, and if you review them and add in context, you can let it run a little more freely than you would otherwise.
If you don't feel like giving it the right amount of context, make the job smaller, so there's just less of it to begin with.
I wouldn't tell my interns to change the formatting of these printf statements, because I don't feel like it, but Claude does that stuff pretty well and doesn't complain as much.
You're probably in an obscure niche domain, or asking it to do something creative.
Try like upgrading JS package dependencies, or translating between languages, limited tedious things, and you will be surprised how much better it does.
Hmmmm.. I am working in a niche domain (Confidential Computing) and the work is fairly creative, although I wouldn't say I asked it domain-specific things. I didn't ask it to come up with encryption schemes or security protocols, I learned very quickly that it cannot even start on those problems. "Design discussions" were just sycophantic affirmations of whatever I wrote. What I mostly tried were "add this function" or "refactor this based on XY" or "analyze this piece of code for race conditions".
(Un?)fortunately my work doesn't involve a lot of "drone coding". With personal projects I let it do whatever it wanted including picking the language and libraries. With one of them it ended up so confused with the Redis API(!!!) that it kept going back and forth between different versions to "fix" the issues until it literally removed the functionality it was supposed to add. Problem solved, eh?
Oh I've definitely seen that too, even with common front-end stuff.
I think people might be exaggerating how much they are out of the loop. Claude often needs to be guided away from stupid decisions, and it will absolutely delete code in order to "fix" a problem.
Still, it saves me work on tedious stuff, and makes it more pleasant. I wouldn't ask it to do anything I don't understand, unless I don't care about the result.
I think it's some variation of the efficient markets hypothesis. There are no problems that are both that lucrative and that easy to solve; if they existed, they would get dogpiled and stop being lucrative. Even in this day and age, $10B of revenue is an incredibly high bar.
On the other hand, $10B as valuation (not revenue) just requires a greater fool. Maybe it's possible, but I doubt there are too many of those fools available.
The question is not whether you can or can't, but whether it is still worth it long term:
- Whether there is a moat in doing so (i.e. will people actually pay for your SaaS knowing that they could build it themselves via AI), and...
- How many large-scale ideas you need post-AI. Many SaaS products are subscription-based and loaded with features you don't need. Most people would prefer a simple product that just does what they need without the ongoing costs.
There will be more software. The question is who accrues the economic value of this additional software - the SWE/tech industry (incumbent), the AI industry (disruptor?) and/or the consumer. For the SWE's/tech workers it probably isn't what they envisioned when they started/studied for this industry.
It seems obvious to me it is the consumer who will benefit most.
I had been thinking of buying an $80 license for a piece of software but ended up knocking off a version in Claude Code over a weekend.
It is not even close to something commercial-grade that I could sell as a competitor, but it is good enough for me not to spend $80 on the license. The huge upside is that I can customize the software in any way I like. I don't care that it isn't maintainable either. Making a new version in ChatGPT 5 is going to be my first project.
Just a few hours ago I was thinking about how I would like to customize the fitness/calorie tracking app I use. There are so many features I'd like that would be tightly coupled to my own situation and not a mass-market product.
This, to me, seems like the obvious future of software for everything but mission-critical software.
This has a lot of future implications for employment in tech of course, architecture/design decisions, etc. Why would a non-tech company use a SaaS when it can just AI up something and have 1-2 engineers accountable for the build? It's a lot cheaper and amortisable over many products, saving some companies millions. Not just tech implementors but sales staff would be disrupted, especially when the SaaS is implementing a standard or requires significant customisation anyway. Buy vs build, product vs implementation, it should all change soon - the silver lining in all of this.
> The interesting thing about all of this vibe coding skepticism, cynicism, and backlash is that many people have their expectations set extremely low.
Or they have actually used all these tools, know how they work, and don't buy into hype and marketing.
It doesn't help that a lot of skeptics are also dishonest. A few days ago someone here tried to claim that inserting verbose debug logging, something Claude Code would be very good at, is "actually programming" and it's important work for humans to do.
No, Claude can create logs all across my codebase with much better formatting far faster than I can, so I can focus on actual problem solving. It's frustrating, but par for the course for this forum.
Edit: Dishonest isn't correct, I should have said I just disagree with their statements. I do apologize.
That's not pedantry, pedantry would be if it were a very minor or technical detail, but being dishonest doesn't have anything to do with having a different opinion.
No, some skeptics are actually dishonest. It's part of trolling, and trolling is in fashion right now. Granted, some skeptics are fair, but many do it strictly for the views, without any due diligence.
No, coding can be done by machines. But if you're telling a machine what to program, you're not coding. The machine is. You're no longer a programmer, you're just a user.
I suppose a more rigorous definition would be useful. We can probably make it more narrow as time goes on
To me, the essence of coding is about using formal languages and definable state machines (i.e, your toolchain) to manipulate the state of a machine in a predictable way.
C and C++, even with their litany of undefined behavior, are still formal languages, and their compilers can still be predicted and understood (no matter how difficult that is). If the compiler does something unexpected, it's because you, the programmer, lacked the knowledge of either the language or the compiler's state.
Vibe coding uses natural languages, and interacts with programs whose state is not only unknown, but unknowable. The machine, for the same input, may produce wildly different output. If the machine produces unexpected code, it's not because of a lack of knowledge on the part of its programmer; it's because the machine is inherently unpredictable and requires more prodding in soft, fuzzy, natural language.
Telling something what outcomes you want, even if described in technical terms only a programmer would understand, is not coding. It's essentially just being a project manager.
Now you may ask: who cares about this no-true-Scotsman fallacy? Whether it's coding or not coding, we are still producing a program which serves the product needs of the customer.
Personally, I did not learn to code because I give a shit about the product needs of the customer, or the financial wellbeing of the business. I enjoy coding for its own sake, because it is fun to use systems of well-defined rules to solve problems. Learning and using C++ is fun for me; it seems every day I learn something new about the language and how the compiler behaves, and I've been using C++ for several years (and I started learning it when I was 12!)
Describing the outcome or goal of a project in natural human language sounds like a nightmare, to be honest. I became a software engineer so I could minimize the amount of natural language required to succeed in life. Natural language has gotten me (and, I suspect, people like me) in trouble over and over again throughout adolescence, but I've never written a piece of code that was misunderstood or ambiguous enough for people to become threatened by or outraged by it.
I think the disconnect is that some people care about products, and some people care about code.
It's qualitatively different to go through source code and specifications to understand how something works than to look at a database with all the weights of an LLM and pretend like you could predict the output.
Ummm, my entire career I have been telling machines what to program; the machines take my garbage C/Go/Python/Perl/whatever prompts and translate them into ASM/machine code that other machines will use to do... stuff.
They're substantively different. Using a compiler requires you to have an internalized model of a state machine and, importantly, a formal language. C, assembler, Java, etc. are all essentially different from using the softness of the English language to coerce results out of a black box.
In both cases, all you need is the ability to communicate with the machine in a way that lets it convert your ideas into actions.
The restricted language of a compiler is a handicap, not evidence of a skill - we've been saying forever that "Natural Language" compilers would be a game changer, and that's all that an AI really is
Edit: It appears that this discussion is going to end up with a definition of "coding"
Is it coding if you tell the computer to perform some action, or is it coding if you tell it how to do that in some highly optimised way (for varying definitions of optimised, eg. Memory efficient, CPU efficient, Dev time efficient... etc)
No one is skeptical of compilers?! I guess you haven’t met many old fashioned C systems programmers, who go out of their way to disable compiler optimisations as much as they can because “it just produces garbage”.
Every generation, we seem to add a level of abstraction because, for most of us, it enhances productivity. And every generation, there is a crowd who rails against the new abstraction, mostly unaware of all of the levels of abstraction they already use in their coding.
Luxury! When I were a lad we didn't have them new fangled compilers, we wrote ASM by hand, because compilers could not (and still cannot, to this day, I think) optimise ASM as well as a human.
Abstractions and compilers are deterministic, no matter if a neckbeard is cranky about the results. LLMs are not deterministic, they are a guessing game. An LLM is not an abstraction, it's a distraction. If you can't tell the difference, then maybe you should lay off the "AI" slop.
I've been thinking about this - you're right that LLMs are not going to be deterministic (AIUI) when it comes to producing code to solve a problem.
BUT neither are humans: if you give two different humans the same task then, unless they copy one another, you will get two different results.
Further, as those humans evolve through their career, the code that they produce will also change.
Now, I do want to point out that I'm very much still at the "LLMs are an aid, not the full answer.. yet" point, but a lot of the argument against them seems to be (rapidly) coming to the point where it's no longer valid (AI slop and all).
You keep making these claims as though you are some sort of authority, but nothing you have said has matched reality.
I mean full credit to you with your disingenuous goalpost shifting and appeals to authority, but reality has no time for you (and neither do I anymore).
Second Edit: Adding the following paragraph from the wikipedia page for emphasis
Researchers have started to experiment with natural language programming environments that use plain language prompts and then use AI (specifically large language models) to turn natural language into formal code. For example Spatial Pixel created a natural language programming environment to turn natural language into P5.js code through OpenAI's API. In 2021 OpenAI developed a natural language programming environment for their programming large language model called Codex.
I think after all the goalpost moving, we have to ask - why the bitflip does it matter what we call it?
Some people are getting a lot of work done using LLMs. Some of us are using them on occasion to handle things we don't understand deeply but can trivially verify. Some of us are using them out of laziness because they help with boilerplate. Everyone who is using them outside of occasional tests is doing so because they find them useful for writing code. If it's not coding, then I personally couldn't care less. Only a True Scotsman should care.
If my boss came to me and said "hey, we're going to start vibe coding everything at work from now on. You can manually edit code, but Claude Code needs to be your primary driver", I would quit and find a new career. I enjoy coding. I like solving puzzles using the specifics of a language's syntax. I write libraries and APIs and I put a great deal of effort into making sure the interface is usable by a human being.
If we get to the point where we are no longer coding, we are just describing things in product language to a computer and letting it do all the real work, then I will find a more fulfilling career because this ain't it
By the time it works flawlessly, it won't be your career anymore, it'll be the product manager's. They will describe what they want and the AI will produce it. You won't be told to "use Claude all the time".
I personally hate coding, but it's a means to an end, and I care about the end. I'm also paranoid about code I don't understand, so I only rarely use AI and even then it's either for things I understand 100% or things that don't matter. But it would be silly to claim they don't produce working code, no matter what we want to call it.
This is the core of the issue. You hate coding, and I love it. I chose to be a software engineer not because I like using software, but because I like writing software.
If we get to a point where the engineers are replaced by machines, I would hope that the project managers were replaced years before that, as a final act of revenge
I enjoy a lot of things (Software Engineering is one of them), but that in NO way determines whether or not AI is coding, nor does it guarantee me a career (just ask all the blacksmiths that disappeared once cars became the mass transport vehicle).
The fact that people are going to (possibly) be able to instruct a computer to do whatever they wish without the need for a four-year degree and several years of experience scares you, I get that, but that's not going to have any effect on reality.
Edit: Have a look at all the people's careers that have ended because software took over.
And more importantly, perhaps, to u/shortrounddev2: if they enjoy coding so much, they'll still be able to do it as a hobby! It's just that there may not be anybody willing to pay for a slow, lumbering human to work their way through the problem.
Technically you’re not vibe coding. You’re using AI to do software engineering. Vibe coding is specifically the process of having AI produce code and plowing ahead without understanding it.
I know I’m being pedantic, but people mean very different things when they talk about this stuff, and I don’t think any credence should be given to vibe coding.
To some extent, OP is still vibe coding, because one has to trust Claude's every single decision, which can't be easily verified at first glance anyway. Agreed that we need a new word for heavily AI-assisted software development though; I once used the phrase "vivid coding" for this kind of process.
I vibe code quite a bit and will plow through a lot of front end code despite being a backend engineer. In my case, it's on personal projects where I'm ambitious and asking the LLM to "replace an entire SaaS" sort of thing. At work most of the code is a couple lines here or there and trivial to review.
When I try the more complex things I will do multiple passes with AI, have 2-3 LLMs review it and delete deprecated code, refactor, interrogate it and ask it to fix bad patterns, etc. In an evening I can refactor a large code base this way. For example Gemini is meh compared to Claude Opus at new code, but somewhat decent for reviewing code that's already there, since the 1M context window allows it to tie things together Claude wouldn't be able to fit in 256k. I might then bounce a suggestion back from Gemini -> Claude -> Grok to fix something. It's kind of like managing a team of interns with different specialties and personalities.
"A key part of the definition of vibe coding is that the user accepts code without full understanding.[1] Programmer Simon Willison said: 'If an LLM wrote every line of your code, but you've reviewed, tested, and understood it all, that's not vibe coding in my book—that's using an LLM as a typing assistant.'"
I wasn't familiar with his full message, so I didn't realize that the current definition of vibe coding was so cynical. Many of us don't see it that way. By the strict definition, vibe coding means:
1. Not looking at the code
2. YOLO everything
3. Paste errors back into the model verbatim
That said, I describe what I do as vibe coding, but I introduce code review bots into the mix. I also roadmap a plan with deep research beforehand and require comprehensive unit and behavioural tests from the model.
Here's the full original definition from Karpathy[*]:
> There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.
Some of the key points here being "forget that the code even exists," "'Accept All' always," "don't read the diffs," and "The code grows beyond my usual comprehension."
Doing software engineering using AI assistance is not vibe coding, by Karpathy's definition.
Just a few months ago I couldn't imagine paying more than $20/mo for any kind of subscription, but here I am paying $200/mo for the Max 20 plan!
Similarly amazed as an experienced dev with 20 YoE (and a fellow Slovak, although US-based). The other tools, while helpful, were just not "there"; they were often more trouble than they were worth, producing a lot of useless garbage. Claude Code is clearly on another level. Yes, it needs A LOT of handholding; my MO is to do Plan Mode until I'm 100% sure it understands the reqs and the planned code changes are reasonable, then let it work, and finally code review what it did (after it auto-fixes things like compiler errors, unit test failures and linting issues). It's kind of like a junior engineer that is a little bit daft but very knowledgeable, works super, super fast, and doesn't talk back :)
It is definitely the future, what can I say? This is clearly the direction software development is heading.
When I first tried letting Cursor loose on a relatively small code base (1500 lines, 2 files), I had it fix a bug (or more than one) with a clear test case and a rough description of the problem, and it was a disaster.
The first commit towards the fix was plausible, though still not fully correct, but in the end it not only failed to fix the bug, each successive commit also became more and more baroque. I cut it off when it wrote almost 100 lines of code to compare version numbers (logic that already existed in the source). The problem with discussing the plan is that, while debugging, you don't yourself have a full idea of the plan.
I don't call it a total failure, because I asked the AI to improve some error messages to help it debug, and I will keep that code. It's pretty good at writing new code, very good at reviewing it, but for me it was completely incapable of performing maintenance.
These tools and LLMs differ in quality; for me, Claude Code with Claude 4 was the first tool that worked well enough. I tried Cursor before, though that was 6+ months ago, and I wasn't very impressed.
Same for me. Cursor was a mess for me. I don't know why and how it works for other people. Claude Code, on the other hand, was a success from day one, and I've been using it happily for months now.
I used Cursor for about 5 months before switching to Claude Code. I was only productive with Cursor when I used it in a very specific way, which was basically me doing by hand what Claude Code does internally. I maintained planning documents, todo lists, used test driven development and linting tools, etc. My .cursorrules file looks like what I imagine the Claude system prompt to be.
Claude Code took the burden of maintaining that off my shoulders.
Also, Cursor was/is utterly useless with any and all non-Anthropic models, which are the default.
This was a problem I regularly had using Copilot w/ GPT4o or Sonnet 3.5/3.7... sometimes I would end up down a rabbit hole and blow multiple days of work, but more typically I'd be out an hour or two and toss everything to start again.
Don't have this w/ Claude Code working over multiple code bases of 10-30k LOC. Part of the reason is that the type of guidance I give in the memory files helps keep this at bay, as does linting (i.e. class/file length), but I also chunk things up into features that I PR review and have it refactor to keep things super tidy.
Yeah, GitHub Copilot just didn't work for me at all. The completions are OK, and I actually still use it for those, but the agent part is completely useless. Claude Code is in another league.
Fwiw, I dipped my toes into AI-assisted coding a few weeks ago and started with Cursor. Was very unimpressed (spent more time prompting and fighting the tool than making forward progress) until I tried Claude Code. Happily dropped Cursor immediately (cancelled my sub) and am now having a great time using CC productively (just the basic $20/mo plan). Still needs hand-holding, but it's a net productivity boost.
May I ask what you use it for? I have been using it for fun mostly: side projects, learning, experimenting. I would never use it on a work codebase unless, well, the company ordered or at least permitted it. And even then, I'm not really sure I would feel comfortable with the level of liberty CC takes. So I'm curious about others.
Of course you need explicit permission from the company to use (non-local) AI tools.
Before that was given, I used AI as a fancier search engine, and for coming up with solutions to problems I explained in abstract (without copy-pasting actual code in or out).
Fascinating, since I found the recent Claude models untrustworthy for writing and editing SQL. E.g. it'd write conditions correctly, but not add parens around ANDs and ORs (which Gemini Pro then highlighted as a bug, correctly).
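To make that trap concrete: in SQL, AND binds tighter than OR, so leaving out parens silently changes the query's meaning. A minimal sqlite3 sketch (the table and values are invented purely for illustration):

    import sqlite3

    # AND binds tighter than OR in every major SQL dialect.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
    con.execute("INSERT INTO t VALUES (1, 0, 0)")

    # Parsed as: a = 1 OR (b = 1 AND c = 1)  -> the row matches
    n1 = con.execute(
        "SELECT COUNT(*) FROM t WHERE a = 1 OR b = 1 AND c = 1").fetchone()[0]
    # Explicit parens change the meaning: (a = 1 OR b = 1) AND c = 1 -> no match
    n2 = con.execute(
        "SELECT COUNT(*) FROM t WHERE (a = 1 OR b = 1) AND c = 1").fetchone()[0]
    print(n1, n2)  # 1 0

The two queries look almost identical but return different rows, which is exactly the kind of bug a second model (or a human reviewer) has to catch.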
If you aren't already (1) telling Claude Code which flavor of SQL you want (there are several major dialects and many more minor ones) and (2) giving it access to up-to-date documentation via MCP (e.g. https://github.com/arabold/docs-mcp-server) so it has direct access to canonical docs for authoritative grounding and syntax references, you'll find that you get much better results by doing one or both of those things.
Documentation on features your SQL dialect supports and key requirements for your query are very important for incentivizing it to generate the output you want.
As a recent example, I am working on a Rust app with integrated DuckDB, and asked it to implement a scoring algorithm query (after chatting with it to generate a Markdown "RFC" file describing how the algorithm works). It started the implementation with an absolutely minimal SQL query that pulled all metrics for a given time window.
I questioned this rather than accepting the change, and it said its plan was to implement the more complex aggregation logic in Rust because 1) it's easier to interpret Rust branching logic than SQL statements (true) and 2) because not all SQL dialects include EXP(), STDDEV(), VAR() support which would be necessary to compute the metrics.
The former point actually seems like quite a reasonable bias to me; personally, I find it harder to review complex aggregations in SQL than to mentally traverse the path of data through a bunch of branches. But if you are familiar with DuckDB, you know that 1) it does support these features and 2) DuckDB's OLAP efficiency makes it a better choice for doing these aggregations performantly than iterating through the results in Rust, so the initial generated output is suboptimal.
I informed it of DuckDB's support for these operations and pointed out the performance consideration and it gladly generated the (long and certainly harder to interpret) SQL query, so it is clearly quite capable, just needs some prodding to go in the right direction.
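Since DuckDB does support those functions, the aggregation can indeed stay in SQL. A minimal sketch via DuckDB's Python API (the table and the scoring formula are made-up stand-ins, not the actual algorithm from the anecdote):

    import duckdb  # pip install duckdb

    con = duckdb.connect()  # in-memory database
    con.sql("""
        CREATE TABLE metrics AS
        SELECT i % 24 AS hour, random() * 100 AS value
        FROM range(10000) t(i)
    """)
    # EXP and STDDEV are built in, so the aggregation can stay in the
    # vectorized OLAP engine instead of a row-by-row loop in the host language.
    print(con.sql("""
        SELECT hour,
               AVG(value)             AS mean,
               STDDEV(value)          AS sd,    -- sample standard deviation
               EXP(-AVG(value) / 100) AS decay  -- toy scoring term
        FROM metrics
        GROUP BY hour
        ORDER BY hour
    """))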
Even when I hand-roll certain things, it's still nice to have Claude Code take over any other grunt work that might come my way. And there are always yaks to shave, always.
I found Claude Sonnet 4 really good at writing SQL if you give it a feedback loop with real data. It will research the problem, research the data, and improve queries until it finds a solution. And then it will optimize it, even for performance, if you ask it to run an explain plan or look at pg_stat_statements (Postgres).
It's outrageously good at performance optimization. There have been multiple really complex queries I've optimized with it that I'd been putting off for a long time. Claude Code figured out the exact indexes to add within seconds (not ones I would have gotten easily manually).
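The feedback loop is the same on any database that can show a plan: run EXPLAIN, read the plan, add an index, re-check. The comment is about Postgres (EXPLAIN, pg_stat_statements), but here is a self-contained sqlite3 sketch of the before/after evidence a model can iterate on (schema and index invented for illustration):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
    con.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                    [(i % 100, i * 1.5) for i in range(1000)])

    q = "SELECT SUM(total) FROM orders WHERE customer_id = 42"
    print(con.execute("EXPLAIN QUERY PLAN " + q).fetchall())  # SCAN orders (full table scan)

    con.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    print(con.execute("EXPLAIN QUERY PLAN " + q).fetchall())  # SEARCH orders USING INDEX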
This kind of thing is a key point. Tell Claude Code to build the project, run linters, run the tests, and fix the errors. This (in my experience) has a good chance of filtering out mistakes. Claude is fully capable of running all of the tools, reading the output, and iterating. Higher level mistakes will need code written in a way that is testable with tests that can catch them, although you probably want that anyway.
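Mechanically, that loop is nothing exotic: run the project's commands, capture the output, hand failures back to the model, repeat until everything is green. A rough sketch of one pass (the make targets are placeholders for whatever build/lint/test tools your project uses):

    import subprocess

    # One pass of the check loop an agent iterates on until everything is green.
    for cmd in (["make", "build"], ["make", "lint"], ["make", "test"]):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # This output is what gets fed back into the model's context to fix.
            print(result.stdout + result.stderr)
            break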
Feels like the most valuable skills to have as a programmer in the era of Claude Code are carefully reading spec documentation and an acute sense of critical thinking when reviewing code.
The critical skill is spotting potential bugs before they happen, but in order to do that you need an extremely acute understanding of, or a lot of experience with, the stack, libs, and programming language of choice. Something that, ironically, you will not get by "vibe coding".
> I have about two weeks of using Claude Code and to be honest, as a vibe coding skeptic, I was amazed.
And, yet, when I asked it to correct a CMake error in a fully open source codebase (broken dependency declaration), it couldn't work it out. It even started hallucinating version numbers and dependencies that were so obviously broken that at least it was obvious to me that it wasn't helping.
This has been, and continues to be, my experience with AI coding. Every time I hit something that I really, really want the AI to do and get right (like correcting my build system errors), it fails and fails miserably.
It seems like everybody who sings the praises of AI coding has one thing in common--JavaScript. Make of that what you will.
This is typically the outcome when you have it look at a generic problem and fix it, especially if the problem depends on external information (like specific version numbers, etc). You have to either tell it where to look things up, or ask it to ask you questions about how things need to be resolved. I personally use it to work on native code: C++ (with CMake), Zig, some Python. Works fine.
I have a similar amount of engineering experience, was highly skeptical, and I've come to similar conclusions with Claude Code after spending two weeks on a greenfield project (TS api, react-native client, TS/React admin panel).
As I've improved planning and context management, the results have been fairly consistent. As long as I can keep a task within the context window, it does a decent job almost every time. And occasionally I have to have it brute-force its way to green lint/typecheck/tests. That's been one of the biggest speed bumps.
I've found that gemini is great at the occasional detailed code-review to help find glaring issues or things that were missed, but having it implement anything has been severely lacking. I have to literally tell it not to do anything because it will gladly just start writing files on a whim. I generally use the opus model to write detailed plans, sonnet to implement, and then opus and gemini to review and plan refactors.
I'm impressed. The progress is SLOW. I'd have gotten to the stage I'm at in 1/3 to 1/2 the time, likely with fewer tests and significantly less process documentation. But the results are otherwise fairly great. And the learning process has kept me motivated to keep this old side-project moving.
I was switching between two accounts for a week while testing, but in the end upgraded to the $100/month plan and I think I've been rate-limited once since. I don't know if I'll be using this for every-day professional work, but I think it's a great tool for a few categories of work.
I found Gemini CLI to be totally useless too. Last week I tried Claude Code with GLM4.5 (via z.ai API), though, and it was genuinely on par with Sonnet.
Thank you for the recommendation. I've been testing this on an open source project and it's indeed good. Not as good as Sonnet 4, but good enough. And the pricing is very reasonable. Don't know if I'd trust it to work on private code, but for public code it's a great option.
I have not tried it, for a variety of reasons, but my (quite limited, anecdotal, and gratis) experience with other such tools is that I can get them to write something I could perhaps get as an answer on StackOverflow: limited scope, limited length, addressing at most one significant issue; and perhaps that has to do with what they are trained on. But once things get complicated, it's hopeless.
You said Claude Code was significantly better than some alternatives, so better than what I describe, but we need to know _on what_.
I've been working on the design of a fairly complicated system using the daffy robots to iterate over a bunch of different ideas. Trying things out (conceptually) to explore the pros and cons of each decision before even writing a single line of code. The code is really a formality at this point as each and every piece is laid out and documented.
Contrast this with the PEG parser VM it basically one-shotted but needed a bunch of debug work. A fuzzy spec (basically just the lpeg paper) and a few iterations, and it produced a fully tested VM. After that, the AST -> opcode compiler was super easy, as it just had to do some simple (fully defined by this point) transforms, and Bob's your uncle. Not the best code ever, but a working and tested system.
Then my predilection for yak shaving took over as the AST needed to be rewritten to make integration as a python C extension module viable (and generated). And why have separate AST and opcode optimization passes when they can be integrated? Oh, and why even have opcodes in the first place when you can rewrite the VM to use Continuation Passing Style and make the entire machine AST-> CPS Transform -> Optimizer -> Execute with a minimum of fuss?
So, yeah, I think it's fair to say the daffy robots are a little more than a StackOverflow chatbot. Plus, what I'm really working on is a lot more complicated than this, needing to redo the AST was just the gateway drug.
Not with Claude Code but with Cursor using Claude Sonnet 4 I coded an entire tower defense game, title, tutorial, gameplay with several waves of enemies, and a “rewind time” mechanic. The whole thing was basically vibe coded, I touched maybe a couple dozen lines of code. Apparently it wasn’t terrible [0]
I completed my degree over 20 years ago, and due to the dot-com bust and the path I took, I never coded in a full-time role; some small bits of dev and scripting, but nothing where I would call myself a developer. I've had loads of ideas down through the years but never had the time to complete them, or to learn the language/stack needed to complete them. Over the last 3 weeks I've been working on something small that should be ready for a beta release by the end of August. The ability to sit down and work on a feature or bug when I only have a spare 30 mins, and be immediately productive without having to get in the zone, is a game changer for me. Also, while I can read and understand the code, writing it would be at least 10 times slower for me. This is a small codebase that will have less than 5k lines and is not complicated, so GitHub Copilot is working well for me in this case.
I could see me paying for higher tiers given the productivity gains.
The only issue I can see is that we might end up with a society where those that can afford the best subscriptions have more free time, get more done, make more money and are more successful in general. Even current base level subscriptions are too expensive for huge percentage of the global population.
One thing I've started doing is using Gemini CLI as a sidecar for Claude Code, loading in a huge amount of context around a set of changes to get a second opinion. It's been pretty handy for that particular use case due to its context size advantage.
> I couldn't imagine paying >100$/month for a dev tool in the past, but I'm seriously considering upgrading to the Max plans.
Sadly, my experience with the Max plan has been extremely poor. It's not even comparable: I've been experimenting heavily with Claude Code in the last few weeks, spending more than $80 per day, and it's amazing. The problem is that on the Max plan you're not the one managing the context length, and this ruins the model's ability to keep things in memory. Of course this is expected (the longer the context, the more expensive to run), but it's so frustrating to fail at a coding task when it's obvious the model lost a crucial part of the context.
My experience has been similar, over perhaps 4-6 weeks of Claude Code. My first few days were a bit rough, and I was tempted to give up and proclaim that all my skeptic's opinions were correct and that it was useless. But there is indeed a learning curve to using it. After a month I'm still learning, but I can get it to give me useful output that I'm happy committing to my projects, after reviewing it line by line.
Agreed that context and chunking are the key to making it productive. The times when I've tried to tell it (in a single prompt) everything I want it to do, were not successful. The code was garbage, and a lot of it just didn't do what I wanted it to do. And when there are a lot of things that need to be fixed, CC has trouble making targeted changes to fix issues one by one. Much better is to build each small chunk, and verify that it fully works, before moving on to the next.
You also have to call its bullshit: sometimes it will try to solve a problem in a way you know is wrong, so you have to stop it and tell it to do it in another way. I suppose I shouldn't call it "bullshit"; if we're going to use the analogy of CC being like an inexperienced junior engineer, then that's just the kind of thing that happens when you pair with a junior.
I still often do find that I give it a task, and when it's done, realize that I could have finished it much faster. But sometimes the task is tedious, and I'm fine with it taking a little longer if I don't have to do it myself. And sometimes it truly does take care of it faster than I would have been able to. In the case of tech that I'm learning myself (React and Tailwind CSS; I dabbled with the former 10 years ago, but my knowledge is completely out of date), CC has been incredibly useful when I don't really know how to do something. I'm fine letting CC do it, and then I read the code and learn something myself, instead of having to pore over various tutorials of varying quality in order to figure it out on my own.
So I think I'm convinced, and I'll continue to make CC more and more a part of my workflow. I'm currently on the Pro plan, and have hit the usage limits a couple times. I'm still a little shy about upgrading to Max and spending $100/mo on a dev tool... not sure if I'll get over that or not.
I haven't used Claude Code, but have been using Amp a lot recently. Amp always hits the target. They created something really special.
Has anyone here used both Claude Code and Amp and can compare the two's effectiveness? I know one is a CLI and the other an editor extension. I'm looking for comparisons beyond that. Thanks!
It burns through credit too quickly. As a previous Sourcegraph Cody user, I tried Amp first, but I spent tens of dollars every day during the trial, and that was with an eye on usage. It felt horrible seeing how I was paying mostly for its mistakes and the time it takes debugging. With CC, I can let go of the anxiety. I get several hours a day out of the Claude Pro plan, and that's mostly good enough for now. If it's not, I'll upgrade to Max; at $100 that's still less than what I'd have spent on Amp.
That's the thing for me too: I don't want to pay for the agent's mistakes, even if those mistakes are in part the fault of my prompt. I'm fine with having usage limits if it means I pay a fixed cost per month. Not sure how long this will last, considering how expensive all this is for the companies to run, though.
I feel like Amp's costs are actually in line with Sourcegraph's costs, and eventually Anthropic, OpenAI, et al. will all be charging a lot more than they are now.
It's the classic play to entice people to something for low cost, and then later ramp it up once they're hooked. Right now they can afford to burn VC money, but that won't last forever.
>I was dreaming about some kind of neural interface, where I could program without writing any code. And I realized that with Claude Code, it's kind of here.
I had a similar thought about the Turing Test…
It was science fiction for decades… then it passed silently in the night and we barely noticed.
Gemini is not that good right now. GLM-4.5, which just came out, is pretty decent and very cheap. I use these with the RooCode plugin for VSCode, which connects to them via OpenRouter. $10 of credits lasts me a day of coding, whereas Claude would run through that in an hour.
Claude code is great until it isn’t. You’re going to get to a point where you need to modify something or add something… a small feature that would have been easy if you wrote everything, and now it’s impossible because the architecture is just a mishmash of vibe coded stuff you don’t understand.
I understand completely what you're saying. But with the delusions that management is under right now, you're just going to seem like someone that's resisting the flow of code and becoming a bottleneck.
So far I'm bullish on subagents to help with that: validating completion status, bullshit detection, catching over-engineering, etc. I can load them with extra context like conventions and specific prompts to clamp down on the Claude-isms during development.
This. It helps to tell it to plan and to then interrogate it about that plan, amend it to spec, etc. Think of it as a refinement session before a pairing session. The results are considerably better if you do it this way. I've written Kubernetes operators, Flask applications, Kivy applications, and a transparent ssh proxy with Claude in the last two months, all outside of work.
It also helps to tell it to write tests first: I lean towards integration tests for most things but it is decent at writing good unit tests etc too. Obviously, review is paramount if TDD is going to work.
As a hobbyist coder, the more time I spend brainstorming with all the platforms about specs and tests and architecture, the better the ultimate results.
Having used Claude Code extensively for the last few months, I still haven't reached this "until it isn't" point. Review the code that comes out. It goes a long way.
Yes, my point is that you don't even have "it compiles" as a way to measure a code review. Maybe you did a great job, maybe you did a terrible job, how do you tell?
You're not setting good enough boundaries or reviewing what it's doing closely enough.
Police it, and give it explicit instructions.
Then after it's done its work, prompt it with something like "You're the staff engineer or team lead on this project, and I want you to go over your own git diff like it's a contribution from a junior team member. Think critically and apply judgement based on the architecture of the project described in @HERE.md and @THERE.md."
Ah yes…the old “you’re holding it wrong”. The problem is these goddamn things don’t learn, so you put in the effort to police it…and you have to keep doing that until the end of time. Better off training someone off the street to be a software engineer.
It's just a tool, not an intelligence or a person.
You use it to make your job easier. If it doesn't make your job easier, you don't use it.
Anybody trying to sell you a bill of goods that this is somehow "automating away engineers" and "replacing expensive software developers" is either stupid or lying (or both).
I find it incredibly useful, but it's garbage-in, garbage-out just like anything else with computers. If your code base is well commented and documented and laid out in a consistent pattern, it will tend to follow that pattern, especially if it follows standards. And it does better in languages (like Rust) that have strict type systems and coding standards.
Even better if you have rigorous tests for it to check its own work against.
Yes, sometimes you are actually indeed holding it wrong. Sometimes a product has to be used in a certain way to get good results. You're not going to blame the shampoo when someone uses only a tiny drop of it, and the hair remains dirty.
This is still early days with LLMs and coding assistants. You do have to hold them in the right way sometimes. If you're not willing to do that, or think that provides less value than doing it another way... great, good for you, do it the way you think is best for you.
I've been a coding assistant skeptic for a long time. I just started playing with Claude Code a month or so ago. I was frustrated for a bit until I learned how to hold it the right way. It is a long, long way from being a substitute for a real human programmer, but it's helpful to me. I certainly prefer it to pair programming with a human (I hate pair programming), so this provides value.
If you don't care to figure out for yourself if it can provide you value, that's your choice. But this technology is going to get better, and you might later find yourself wishing you'd looked into it earlier. Just like any new tool that starts out rough but eventually turns out to be very useful.
They don't learn by themselves, but you can add instructions as they make mistakes, which is effectively them learning. You have to write code review feedback for juniors too, so that's not an appreciable difference.
> Better off training someone off the street to be a software engineer.
And that person is going to quit and you have to start all over again. They also cost at least 100x the price.
I've been telling people this is Uber in 2014: you're getting a benefit, and it's being paid for with venture capital money; it's about as good as it's going to get.
Your claude.md (or equivalent) is the best way to teach them. At the end of any non-trivial coding session, I'll ask for it to propose edits/additions to that file based on both the functional changes and the process we followed to get there.
That's not the end of the story, though. LLMs don't learn, but you can provide them with a "handbook" that they read in every time you start a new conversation with them. While it might take a human months or years to learn what's in that handbook, the LLM digests it in seconds. Yes, you have to keep feeding it the handbook every time you start from a clean slate, and it might have taken you months to get that handbook into the complete state it's in. But maybe that's not so bad.
The good thing about this process is that such a handbook functions as documentation for humans too, if properly written.
Claude is actually quite good at reading project documentation and code comments and acting on them. So it's also useful for encouraging project authors to write such documentation.
I'm now old enough that I need such breadcrumbs around the code to get context anyways. I won't remember why I did things without them.
Not so. Adding to context files helps enormously. Having touchstone files (ARCHITECTURE.md) you can reference helps enormously. The trick is to steer, and create the guardrails.
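For the curious, a hypothetical sketch of what such a context file might contain. Every rule below is invented for illustration; only the CLAUDE.md name itself is the Claude Code convention:

    # CLAUDE.md (excerpt)

    ## Architecture
    - Read ARCHITECTURE.md before changing module boundaries.

    ## Conventions
    - All DB access goes through the repository layer; no inline SQL in handlers.
    - After any change, run the linter and the test suite; fix failures before
      declaring the task done.

    ## Learned the hard way
    - A pagination helper already exists; search for it before writing a new one.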
Honestly, it feels like DevOps had a kid with Product.
> Honestly, it feels like DevOps had a kid with Product.
You've just described a match made in hell. DevOps: let's overcomplicate things (I'm looking at you, K8s). Product: they create pretty screenshots and flows but don't actually think about the product as a system (or set of systems).
How can you end up with code you don't understand, if you review anything it writes? I wouldn't let it deviate from the architecture I want to have for the project. I had problems with junior devs in the past, too eager to change a project, and I couldn't really tell them to stop (need to work on my communication skills). No such problem with Claude Code.
I don’t remember what architecture was used by PRs I reviewed a month ago. I remember what architecture I designed 15 years ago for projects I was part of.
I've only used the agentic tools a bit, but I've found that they're able to generate code at a velocity that I struggle to keep in my head. The development loop also doesn't require me to interact with the code as much, so I have worse retention of things like which functions are in which file, what helper functions already exist, etc.
It's less that I can't understand, and more that my context on the code is very weak.
I might have to try this. Without having tried it, it feels like the context I think I lack is more nitty gritty than would be exposed like this. It's not like I'm unsure of how a request ends up in a database transaction, but more "do we need or already have an abstraction over paging in database queries?". It doesn't feel like mermaid diagrams or design documents would include that, but I'm open to being wrong there.
> a mishmash of vibe coded stuff you don’t understand.
No, there is a difference between "I wrote this code" and "I understand this code". You don't need to write all the code in a project to understand it. Otherwise writing software in a team would not be a viable undertaking.
Yes, the default when it does anything is to try and create. It will read my CLAUDE.md file, it will read the code that is already there, and then it will try to write it again. I have had this happen many times (today, I had to prompt it 5 or 6 times to read the file, as the feature had already been implemented).
...and if something is genuinely complex, it will (imo) generally do a bad job. It will produce something that looks like it works superficially, but as you examine it will either not work in a non-obvious way or be poorly designed.
Still very useful but to really improve your productivity you have to understand when not to use it.
English is much less expressive compared to code. Typing the keys was never the slow part for senior developers.
It does work with an LLM, but you're reinventing the wheel with these crazy markup files. We created a family of languages to express how to move bits around, and replacing that with English is silly.
Vibe coding is fast because you’re ok with not thinking about the code. Anytime you have to do that, an LLM is not going to be much faster.
In theory, there is no reason why this should be the case. For the same reason, there is no reason why juniors can't create perfect code the first time... it is just that the tickets are never detailed enough?
But in reality, it doesn't work like that. The code is just bad.
You are responsible for the layers. You should either do the design on your own, or let the tool ask you questions and guide you. But you should have it write down the plan, and only then you let it code. If it messes up the code, you /clear, load the plan again and tell it to do the code differently.
It's really the same with junior devs. I wouldn't tell a junior dev to implement a CRM app, but I can tell the junior dev to add a second email field to the customer management page.
Completely agree. You really have to learn how to use it.
For example, I've heard many say that doing big refactorings causes problems. I found a way that works for SwiftUI projects. I did a refactoring: moving files, restructuring large files into smaller components, and standardizing the component setup of different views.
The pattern that works for me: 1) ask it to document the architecture and coding standards, 2) ask it to create a plan for the refactoring, 3) ask it to do a low-risk refactoring first, 4) ask it to update the refactoring plan, and then 5) go through all the remaining refactorings.
The refactoring plan comes with timeline estimates in days, but that is complete rubbish with Claude Code. Instead I asked it to estimate in 1) number of chat messages, 2) number of tokens, 3) cost based on number of tokens, and 4) number of files impacted.
Another approach that works well is to first generate a throwaway application. Then ask it to create documentation on how to do it right, incorporating all the learnings and where it got stuck. Finally, redo the application with these guidelines and rules.
Another tip: sometimes when it gets stuck, I open the project in Windsurf and ask another LLM (e.g., Gemini 2.5 Pro or Qwen Coder) to review the project and the problem, and then I ask Windsurf to provide me with a prompt to instruct Claude Code to fix it. Works well in some cases.
Also, biggest insight so far: don't expect it to be perfect first time. It needs a feedback loop: generate code, test the code, inspect the results and then improve the code.
Works well for SQL, especially if it can access real data: inspect the database, try some queries, try to understand the schema from your data and then work towards a SQL query that works. And then often as a final step it will simplify the working query.
I use an MCP tool with full access to a test database, so you can tell it to run explain plan and look at the statistics (pg_stat_statements). It will draw a mermaid diagram of your query, with performance numbers included (nr records retrieved, cache hit, etc), and will come back with optimized query and index suggestions.
Tried it also on CSV and parquet files with duckdb: it will run the explain plan, compare both queries, explain why parquet is better, will see that the query is doing predicate pushdown, etc.
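A minimal duckdb sketch of that parquet check via the Python API (the file name is invented):

    import duckdb  # pip install duckdb

    # Write a small parquet file, then ask for the plan of a filtered read.
    duckdb.sql("COPY (SELECT range AS x FROM range(100000)) "
               "TO 'demo.parquet' (FORMAT PARQUET)")
    # The plan should show the x > 99990 filter pushed down into the
    # parquet scan, which is exactly the evidence the model reads and acts on.
    print(duckdb.sql("EXPLAIN SELECT x FROM 'demo.parquet' WHERE x > 99990"))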
Also when it gets things wrong, instead of inspecting the code, i ask it to create a design document with mermaid diagrams describing what it has built. Quite often that quickly shows some design mistake that you can ask it to fix.
Also, with multiple tools on the same project, you have the problem of each using its own way of keeping track of the plan. I asked Claude Code to come up with rules for itself and Windsurf to collaborate on a project. It came back with a set of rules for CLAUDE.md and .windsurfrules on which files to have and how to use them (PLAN.md, TODO.md, ARCHITECTURE.md, DECISION.md, COLLABORATION.md).
This is not actually such a big change for me. I've been doing mostly architecture for several years now. Thinking about the big picture, how things fit together, and how to make complex things simple is what I care about. I've been jokingly calling what I do "programming without coding" even before the current AIs existed. It's just that now I have an extra tool I can use for writing the code.
Gemini is shockingly, embarrassingly, shamefully bad (for something out of a company like Google). Even the open models like Qwen and Kimi are better on opencode.
In my experience, Gemini is pretty good in multishotting. So just give it a system prompt, some example user/assistant pairs, and it can produce great results!
And this is its biggest weakness for coding. As soon as it makes a single mistake, it's over. It somehow has learned that during this "conversation" it's having, it should make that mistake over and over again. And then it starts saying things like "Wow, I'm really messing up the diff format!"
Ah, I was thinking the Gemini CLI agent itself might be attributable to the problems, so maybe try the opencode/Gemini combo instead...
I'd like to mess around with "opencode + Copilot free-tier auth" or "{opencode|crush} + some model via groq (still free?)" to see what kind of mileage I can get and whether it's halfway decent...
Check out openrouter.ai: you can pay for credits that get used per prompt instead of forking out a fixed lump sum, it rotates keys so you can avoid being throttled, and you can even use the same credits on any model in their index.
The more I use it, the more I realise my first two weeks and the amazement I felt were an illusion.
I'm not going to tell you it's not useful; it is. But the shine wears off pretty fast, and when it does, you're basically left with a faster way to type. At least in my experience.
I really don't know what it is, but Claude Code just seems like an extremely well-tuned package. You can have the same core models, but the internal prompts matter, how they look up extra context matters, how easy it is to add external context matters, how it applies changes matters, how eager it is to actually use an external tool to help you matters. With Claude Code, it just feels right. When I say I want a review, I get a review; when I want code, I get code; when I want just some git housekeeping, I get that.
That has not been my experience: Copilot using Claude is way different from Claude Code for me. Anecdotal and "vibes" based, but it's what I've been experiencing.
I use vim for most of my development, so I'm always in a terminal anyway. I like my editor setup, and getting the benefits of a coding assistant without having to drastically change my editor has huge value to me.
Having spent a couple of weeks putting both IDE-centric (Cursor, Windsurf) and CLI-centric (Claude Code, OpenAI Codex, Gemini CLI) options through real-world tasks, Cursor was one of the least effective tools for me. I ultimately settled on Claude Code and am very happy with it.
I realized Claude Code is the abstraction level I want to work at. Cursor et al. still stick me way down in the code muck, when really I only want to see the code during review. It's an implementation detail that I still have to review because it makes mistakes, even when guided perfectly, but otherwise I want to think in interfaces, architecture, and components. The low-level code? Don't care. Is it up to spec and conventions, does it work? Good enough for me.
The native tool use is a game changer. When I ask it to debug something it can independently add debug logging to a method, run the tests, collect the output, and code based off that until the tests are fixed.