Ask HN: Do you think GPT-5 will release before Gemini Ultra? Will it be better?

futureshock · on Dec 6, 2023

The Gemini benchmarks appear to be heavily gamed. They seem to be carefully presented and use different prompting techniques to show Gemini beating GPT-4. Since no one can actually use Gemini Ultra to confirm, I think it’s safe to assume this is fluff for shareholders and that once the public gets hands on it, it will be apparent that GPT-4 is still the better model and better aligned, with fewer hallucinations.

There’s very little information on how soon we might see GPT-5. OpenAI seems to have thrown the kitchen sink at GPT-4, so without any new algorithm breakthroughs (which they are rumored to have made with Q*, but that might be just PR) I don’t know that we should expect miracles.

I’m bearish on LLMs in the next 1 year but bullish over the next 5 years.

Cannabat · on Dec 6, 2023

Even if the benchmarks are accurate - or even overly conservative - it's possible that these are the "uncensored" versions of the models, and what the public gets access to in the end is far less capable.

freedomben · on Dec 6, 2023

Ah yes! That's probably what's going on!

That would explain a lot of things, especially why they're letting the beautiful people have access now but not us plebs. They want a few months to work on censoring it before the masses get it and make it say something offensive.

geniium · on Dec 6, 2023

> I’m bearish on LLMs in the next 1 year but bullish over the next 5 years.

I find this sentence particularly interesting.

rkangel · on Dec 6, 2023

This is a good bet, because as a rule people overestimate progress in the short term and underestimate it in the long term (and at the moment 5 years is at least medium term for LLMs!).

https://quoteinvestigator.com/2019/01/03/estimate/

monkeydust · on Dec 7, 2023

Yeah, if Gartner did one good thing, it's the hype cycle, which also supports this thinking: https://en.wikipedia.org/wiki/Gartner_hype_cycle

nolist_policy · on Dec 6, 2023

OTOH, if rumors are true GPT4 is multiple models in a trenchcoat.

PartiallyTyped · on Dec 6, 2023

I thought that had been established :thinking:

kartoolOz · on Dec 6, 2023

The AlphaCode 2 technical paper claims to solve 43% of problems (77 problems from 12 Codeforces competitions), performing at the 85th percentile of all human participants.

The caveat is deep in the technical paper:

1) generates 1M candidates from N different prior models (fine-tuned on previous Codeforces problems)

2) throws away ~95% of candidates (those that fail the test cases or don't compile)

3) groups semantically similar candidates

4) scores candidates from each group (Based on another scoring model, probably latency + descriptions etc)

5) Picks top 10

Makes 10 submissions and finally gets the score ..

sure this is how humans solve problems ... totally awed by AGI /s
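For anyone who wants to see the shape of that pipeline, here is a rough Python sketch of steps 2 to 5 as listed above. It is not AlphaCode 2's actual code: the candidates are plain Python functions standing in for generated programs, and `passes_examples`, `select_submissions`, `probe_inputs`, and the `TOY_SCORES` table are all made-up names for illustration. Grouping candidates by their outputs on a few probe inputs is only one assumed reading of "semantically similar".

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in: a "candidate program" is just a function int -> int.
Candidate = Callable[[int], int]


def passes_examples(cand: Candidate, examples: Sequence[Tuple[int, int]]) -> bool:
    """Step 2: keep a candidate only if it runs and matches every example."""
    try:
        return all(cand(x) == expected for x, expected in examples)
    except Exception:
        # Treat any runtime error like a failed compile or failed test.
        return False


def select_submissions(
    candidates: Sequence[Candidate],
    examples: Sequence[Tuple[int, int]],
    probe_inputs: Sequence[int],
    score: Callable[[Candidate], float],
    k: int = 10,
) -> List[Candidate]:
    # Step 2: filter out candidates that fail the public examples.
    survivors = [c for c in candidates if passes_examples(c, examples)]

    # Step 3: group candidates that behave identically on the probe inputs.
    clusters: Dict[tuple, List[Candidate]] = defaultdict(list)
    for cand in survivors:
        try:
            signature = tuple(cand(x) for x in probe_inputs)
        except Exception:
            continue  # crashes on a probe input, so drop it
        clusters[signature].append(cand)

    # Step 4: keep the best-scoring candidate from each group.
    representatives = [max(group, key=score) for group in clusters.values()]

    # Step 5: submit at most k candidates, best score first.
    return sorted(representatives, key=score, reverse=True)[:k]


# Toy problem: "double the input", with a single public example (2 -> 4).
def double_mul(x: int) -> int:
    return x * 2


def double_add(x: int) -> int:
    return x + x


def square(x: int) -> int:
    return x * x  # matches the example by accident


def plus_one(x: int) -> int:
    return x + 1


def crashes(x: int) -> int:
    return x // 0


# Made-up scores standing in for a learned scoring model.
TOY_SCORES = {"double_mul": 0.9, "double_add": 0.8, "square": 0.4,
              "plus_one": 0.7, "crashes": 0.1}

picks = select_submissions(
    candidates=[double_mul, double_add, square, plus_one, crashes],
    examples=[(2, 4)],
    probe_inputs=[0, 1, 3],
    score=lambda c: TOY_SCORES[c.__name__],
    k=10,
)
print([c.__name__ for c in picks])
```

Note that `square` survives because it happens to pass the one public example, which is exactly why several submissions per problem help this kind of approach.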

futureshock · on Dec 7, 2023

10 submissions is rather arbitrary. Why not just do 1 billion submissions? AI magic!

thorum · on Dec 6, 2023

OpenAI said in Spring that GPT5 was not being trained yet:

https://www.theverge.com/2023/4/14/23683084/openai-gpt-5-rum...

Then hinted a few weeks ago that work had started, but probably not actual training:

https://decrypt.co/206044/gpt-5-openai-development-roadmap-g...

Even if training has started since then, that doesn’t seem like nearly long enough to have it completed by the end of the year, especially if they intend to go through an extended period of testing & aligning work as they did with GPT4 (which they tested for six months before releasing).

verdverm · on Dec 6, 2023

part of the issue could be that v5 is larger than v4 by an amount where they know they will not be able to meet demand for 1+ years with the hardware they have, thus they need to solve that problem first

this would be in line with the whole Altman snafu

willsmith72 · on Dec 6, 2023

Talk is cheap, I'll believe it when I see it from Google.

We also can't expect a gap like gpt3 to 4 from 4 to 5. ML progress has never been linear like that, we could be nearing another plateau. Not that I think it's just hype, the current level is extremely practical and valuable, but we could go another 20 years without huge progress.

stranded22 · on Dec 6, 2023

The biggest pull with Gemini is that there isn’t a subscription fee…

skilled · on Dec 6, 2023

I don’t think OpenAI needs to worry. I honestly did not feel a single emotion from Google’s announcement and it’s also clear that they manipulated the benchmarks, which won’t matter to their average user but anyone with a little bit of skin in the game can see right through it.

What I think will happen is that Ultra will end up being a “stable ChatGPT 3.5” and that’s it.

I am from the EU so I can’t try the Pro version in Bard, but there will be benchmarks to look at in a few hours already.

papichulo2023 · on Dec 6, 2023

I think it will be better. The stakes are high and they can't afford another failure like Bard; their AI reputation would crumble and become a meme. They have the expertise, the data and unlimited computation capacity (hey, maybe they are repurposing all those Stadia GPUs).