However, LLMs aren't capable of writing proofs. They can only regurgitate text that looks like a proof, having been trained on proofs already written. There's no reasoning behind what they produce. So why waste a human's time reviewing it?
If the prover's kernel and automation make the proof pass but it makes no sense, what use is it?
People write proofs to make arguments to one another.
I'm not worried about LLMs' impact on my career. In their current form they're nothing more than a trap that will suck in anyone foolish enough to grow a dependence on them, then destroy them, both personally and corporately. (Note the difference between "using them" and "growing a dependence on them".) Code that no human can understand is bad enough; code that no human has ever understood is going to be even worse as it starts piling up. There are many characteristic failure modes of software development that LLMs are going to suffer from worse than humans do. They're not going to be immune to ever-growing piles of code. They're not going to be immune to exponentially increasing complexity. They'll just ensure that on the day you say "OK, now, please add this new feature to my system" and they fail at it, nobody else will be able to fix it either.
If progress froze, over the next few years the programming community would come to an understanding of the rather large amounts of hidden technical, organizational, and even legal and liability debt that depending on LLMs creates.
But LLMs != AI. I make no guarantees about what's going to happen after LLMs. Is it possible to build an AI that can have an actual comprehension of architecture at a symbolic level, and have some realistic chance of being told "Take this code base based on Mac OS and translate it to be a Windows-native program", and even if it has to chew on it for a couple of weeks, succeed? And succeed like a skilled and dedicated human would have, that is, produce an actual, skillful translation, where the code base is still comprehensible to a human at the end? LLMs are not the end state of AI.
I didn't mean to have the AI write the proofs. On the contrary: humans would be responsible for writing them, to ensure that the code generated by the AI meets certain constraints (safety, for example).
It's almost always easier to understand theorems than proofs. People could be writing down properties of their programs and the AI could generate a proof or produce a counterexample. It is not necessary to understand the proof in order to know that it is correct.
At least in principle. Practically, any system at least as powerful as first-order logic is undecidable, so no computer program will ever be able to do this perfectly. It might be that some day, an AI could be just as good as a trained mathematician, but so far, it doesn't look like it.
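To make the point above concrete, here is a minimal, hypothetical sketch in Lean 4 (my own illustration, not something from the thread; `double` and `double_is_even` are made-up names, and it assumes a recent toolchain where the `omega` tactic is available). The one-line theorem statement is the "property of the program" a human has to read and agree with; the proof after `:=` is the part a machine could have generated, and the kernel checks it regardless of who wrote it.

    -- Hypothetical program and property; the names are illustrative only.
    def double (n : Nat) : Nat := n + n

    -- The statement is the part a reviewer needs to read and sign off on.
    theorem double_is_even (n : Nat) : ∃ k, double n = 2 * k :=
      -- The proof below could just as well have been found by automation;
      -- Lean's kernel checks it either way.
      ⟨n, by unfold double; omega⟩

If the statement were false, a tool working in this style would be expected to return a counterexample rather than a proof, which is the workflow described in the comment above.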
> It might be that some day, an AI could be just as good as a trained mathematician, but so far, it doesn't look like it.
One of the ongoing arguments in automated theorem proving has been this central question: would anyone understand or want to read a proof written by a machine?
A big part of why we write them and enjoy them is elegance, a rather subjective quality appreciated by people.
In computer science we don't tend to care so much for this quality; if the checker says the proof is good that's generally about as far as we take it.
But endearing proofs that last? That people want to read? That's much harder to do.
I had mentioned that. This is true today, when software developers are writing proofs, and it's why automation is useful.
You would still need to be able to read the proof that a computer generated and understand what it contains in order to judge whether it is correct with respect to your specifications. So in some sense elegance would be useful since the proof isn’t written by a human but has to be understood by one.
If the input here is still plain-language prompts… then we’ll have to assume that we eventually develop artificial cognition. As far as I know, there isn’t a serious academic attempting this, only VCs with shovels to sell.
I might also add that “they” in the first line refers to software developers. Most are not trained to read proofs. Few enough can write a good test, let alone understand when a test is insufficient evidence of correctness.
I had to learn on my own.
Even if LLMs could start dropping proofs with their PRs, would those proofs be useful to people who couldn’t understand them or what they mean?
Yeah, people fixated on the meaning of "makes no sense" to evade accepting that the proofs LLMs output are not useful at all.
In a similar fashion, almost all LLM-created tests have negative value. They are just easier to verify than proofs, but even the bias toward creating more tests (coming from the LLM fire hose) is already harmful.
I am almost confident enough to make a similarly wide claim about code. But I'm still collecting more data.