
We have pretty much plateaued in base model performance since GPT-4. It's mostly tooling and integration now. The target is also AGI, so no matter your product, you will be measured on your progress towards it. With new "SOTA" models popping up left and right, you also have no good way to retain users, because the user is mostly interested in the model's performance, not the funny meme generator you added. Looking at you, OpenAI...

"They called me bubble boy..." - some dude at Deutsche.



So, how do you feel about the recent IMO stuff? Doesn't it pose a consistency problem for your view that we've plateaued? To me at least, it felt like we were something like two years away from this kind of thing.

Probably very expensive to run, of course, perhaps ridiculously so, but they were able to solve really difficult maths problems.


> So, how do you feel about the recent IMO stuff?

It's not real; they are cheating on benchmarks. (Just like the previous many times this was announced.)


The biological brain of the top human IMO guy runs on 20 watts. I wonder how much electricity Google used to match that performance.


What is the training cost of such a human? Reliability is another concern. There is no manufacturer you can pay 10 billion and get a few thousand trained processors from.


And they still couldn’t solve P6. All that power to perform worse than many human contestants.


>We are pretty much plateauing in base model performance since gpt4.

Reasoning models didn't even exist at the time, and LLMs were struggling a lot with math. It's completely different now with SOTA models; there have been massive improvements since GPT-4.


The transformer paper was published in 2017, and within 8 years (fewer, if I'm being honest), we have bots that have passed the Turing test. To people with longer memories, passing the Turing test was a big deal.

My point is that even if things are plateauing, a lot of these advancements happen in step-change fashion. All it takes is one or two good insights to make massive leaps, and the fact that things are plateauing now is a bad predictor for how things will be in the future.



