This is my biggest frustration with the code they generate (but it does make it easy to check whether my students have even looked at the generated code). I don't want code that fails silently or hard-codes an error message; it creates a pile of lies to work through in future debugging.
Writing bad tests and bad error handling has been the worst part of Claude's performance for me.
In particular: writing tests that do nothing, writing tests and then skipping them to resolve test failures, and everybody's favorite, writing a test that greps the source code for a string (which is just insane; where did it even get this idea?). A sketch of that last one is below.
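For anyone who hasn't seen it, the grep-the-source anti-pattern looks roughly like this (a hypothetical pytest sketch; the file path and string are made up):

    # tests/test_retry.py -- "testing" by grepping the implementation text
    from pathlib import Path

    def test_retry_logic_exists():
        source = Path("app/client.py").read_text()
        # asserts that a string appears in the source, not that the
        # behavior is correct -- this passes even if retries are broken
        assert "max_retries" in source

It asserts the shape of the text rather than any behavior, so it can never catch a real regression.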
Seriously. Maybe 60% of the time I use Claude for tests, the "fix" for the failing tests is to change the application code so the test passes (in some cases it will want to make massive architecture changes to accommodate the test, even when there's an easy way to adapt the test to better fit the architecture). Maybe half the time that's the right thing to do, but the other half it most definitely is not. That error rate is high enough that the feature is only borderline useful.
Usually you want to fix the code that's failing a test.
The assumption is that your test is right; that's TDD. You write your code to conform to the tests. Otherwise, what's the point of the tests if you just rewrite them until they pass?
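The red/green loop in miniature (a hypothetical example; the slugify function is made up): the test is written first and stays fixed, and the implementation changes until it passes.

    # 1. red: the test comes first and defines the contract
    def test_slugify_lowercases_and_hyphenates():
        assert slugify("Hello World") == "hello-world"

    # 2. green: the implementation is written (and rewritten) to
    # satisfy the test -- the test itself never changes
    def slugify(title: str) -> str:
        return title.lower().replace(" ", "-")

Claude's habit of editing the test to match the code runs that loop backwards.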