
Try giving this prompt to your favorite LLM:

"Write unit tests with full line and branch coverage for this function:

def add_two_numbers(x, y): return x + y + 1 "

Sometimes the LLM will point out that this function does not, in fact, return the sum of x and y. But more often, it will happily write "assert add_two_numbers(1, 1) == 3", without comment.

The big problem is that LLMs will assume that the code they are writing tests for is correct. This defeats the main purpose of writing tests, which is to find bugs in the code.
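The tests you actually want are derived from the function's contract (its name says "add two numbers"), not from its body. A minimal sketch of what that looks like with pytest; the test names and cases here are my own:

    def add_two_numbers(x, y):
        return x + y + 1  # the buggy function under test

    def test_small_ints():
        assert add_two_numbers(1, 1) == 2

    def test_zero_identity():
        assert add_two_numbers(5, 0) == 5

    def test_negatives():
        assert add_two_numbers(-2, 2) == 0

All three fail against this implementation, and that failure is the point: spec-driven tests reject the bug instead of enshrining it the way "assert add_two_numbers(1, 1) == 3" does.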





Tip: teach it how to write tests properly. I’ll share what has worked pretty well for me.

Run Cursor in "agent" mode, or create a Codex or Claude Code "unit test" skill. I recommend Claude Code.

Explain to the LLM that after it creates or modifies a test, it must run the test to confirm it passes. If the test fails, it's not allowed to edit the source code; instead it must determine whether the bug is in the test or in the source code. If the test is buggy, it should try again; if the bug is in the source code, it should pause, propose a fix, and consult with you on next steps.
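If you go the Claude Code route, those rules can live in the skill file itself. A rough sketch of what mine looks like (the path and frontmatter are from memory, so check the docs; the wording is just an example):

    # .claude/skills/unit-test/SKILL.md
    ---
    name: unit-test
    description: Write and run unit tests without modifying source code
    ---
    After creating or modifying a test, run it and confirm it passes.
    If it fails, do NOT edit the source code. Decide whether the bug is
    in the test or in the source. If the test is buggy, fix the test
    and rerun. If the source is buggy, stop, propose a fix, and ask the
    user how to proceed.

The same instructions should work pasted into a Cursor rules file or a Codex AGENTS.md.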

The key insight is that you have to explicitly tell it not to edit the source code just to make the tests pass. I also recommend reviewing the unit tests at a high level, to make sure it didn't hallucinate.



