
And GPT-OSS's architecture improvements aren't already incorporated in SotA models?




The point is that this contradicts the claim that recent progress has come only from throwing more compute at the problem.

That wasn't the claim made.

The claim made was that improving SotA models has historically taken exponentially more compute.

The claim implies that improving SotA models takes more compute even while integrating technological advancements to make models more efficient.

Unless you think that such advancements have been historically ignored by the curators of SotA models?


No, that was the claim made.

They justified it with the paper, which states what you say, but that's exactly the problem. The paper's statement is significantly weaker than the claim that there's no progress without an exponential increase in compute.

The paper's statement that SotA models require ever-increasing compute does not support "be careful when assuming that model capabilities will continue to grow", because it speaks only of ever-growing models, while the capabilities of models at the same compute cost continue to grow too.



