
If people haven't seen it: UT Austin professor Scott Aaronson had GPT-4 take his Intro Quantum final exam and had his TA grade it. It made some mistakes, but did surprisingly well, earning a "B". He even had it argue for a better grade on a problem it did poorly on.

Of course this was back in April when you could still get the pure unadulterated GPT4 and they hadn't cut it down with baby laxative for the noobs.

https://scottaaronson.blog/?p=7209



See the comment from Ose "Comment #199 April 17th, 2023 at 6:53 am" at the bottom of that blog post...


It literally did not change. Not one bit. Please, if you're reading this, speak up when people say this. It's a fundamental misunderstanding. There's so much chatter around AI and so little real information, and the SNR is getting worse.


I’ve seen the recent statement by someone at OpenAI, but whatever weasel words they use, it did change.

GPT-4 used to consistently fail the modified cabbage-goat-lion problem [1]; now it gets it right. I’ve seen enough people run it in enough variations [2] to know that it absolutely did change.

Maybe they didn’t “change” it in the sense of retraining anything, but it’s definitely been RLHFed further, and that’s impacting the results.

[1] https://news.ycombinator.com/item?id=35155467

[2] anecdata: dozens of people, hundreds of times total
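To move beyond anecdata, one way to probe for drift is to generate the same family of puzzle variants deterministically and re-run them against a pinned, dated model snapshot over time. A minimal sketch of the variant generator (the prompt wording here is my own illustrative phrasing, not the exact prompts from [1]):

```python
# Sketch: generate modified cabbage-goat-lion variants by swapping who
# eats whom, so the "classic" memorized answer is wrong for most of them.
import itertools

ROLES = ["cabbage", "goat", "lion"]

def puzzle_variant(eater_pairs):
    """Build a modified river-crossing prompt from (eater, eaten) pairs."""
    rules = "; ".join(f"the {a} eats the {b}" for a, b in eater_pairs)
    return (
        f"I must ferry a {', a '.join(ROLES)} across a river. "
        f"The boat holds me plus one item. Left unattended: {rules}. "
        "How do I get all three across?"
    )

# Every ordered pairing of distinct roles -> 6 single-rule variants.
variants = [puzzle_variant([p]) for p in itertools.permutations(ROLES, 2)]
```

Each variant would then be sent to a dated snapshot (e.g. `gpt-4-0314`) with temperature 0; identical answers across weeks is at least weak evidence the served model hasn't moved, while systematic changes on the same pinned snapshot would suggest something downstream of the weights changed.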


I attribute this to two things:

1. People have become more accustomed to the limits of GPT-4, similar to the Google effect. At first they were astounded; now they're starting to see its limits.

2. Enabling plugins (or even small tweaks to the ChatGPT context, like adding today's date) pollutes the prompt, giving more directed/deterministic responses.

The API, as far as I can tell, is exactly the same as it was when I first had access, which OpenAI folks have confirmed on Twitter [0].

[0] https://twitter.com/jeffintime/status/1663759913678700544


In my experience with Bing Chat, in addition to what you say, there is also some A/B testing going on as well.


"It literally did not change. Not one bit."

How do you know?

Even if the base model didn't change, that doesn't mean they didn't fine-tune it in some way over time. They also might be passing its answers through some other AI, or using other techniques to filter, censor, and/or modify the answers before returning them to the user.

I don't know how anyone could confidently say what they're doing unless they work at OpenAI.


Someone who works at OpenAI said so two weeks ago


Then again, can we trust that person? They have an obvious conflict of interest in making that claim.


Yes, it’s turtles all the way down


Nice try, ClosedAI. Then how do you explain this?

https://news.ycombinator.com/item?id=36348867


Well, I had hoped the sarcastic comparison to cut heroin would make it clear.

No, I don't think there's much change at all to GPT-4 (at the API level), and probably not that much in the pre/post language detection and sanitization of apparently psychotic responses.


You should take a look at this video. The speaker is a researcher at Microsoft who had access to a private version of GPT-4. He literally claims that GPT-4 is not as good as it was before, and his talk demonstrates the different stages of its evolution.

https://youtu.be/qbIk7-JPB2c


If you are referring to that social media post by an OpenAI employee saying it hasn't changed, they were specifically referring to the API. IIRC, the same employee explicitly stated that the Web UI version changes quite regularly. Someone correct me with the link if I'm wrong; I don't have it handy.



