It was already quite easy to get GPT-4 to output json. You just append ‘reply in json with this format’ and it does a really good job.
GPT-3.5 was very haphazard though and needed extensive babysitting and reminding, so if this makes it better then it’s useful. It does come with an annoying disclaimer that it ‘may not reply with valid JSON’, though, so we’ll still have to do some sanity checks on the output.
I have been using this to make a few ‘choose your own adventure’ type games and I can see there’s a TONNE of potentially useful things here.
It literally does it every time, perfectly. I remember I put together an entire system that would validate the JSON against a zod schema and use reflection to fix it, and it literally never got triggered because GPT-3.5-turbo always did it right the first time.
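For anyone curious what that kind of safety net looks like, here is a minimal TypeScript sketch of the validate-and-reflect pattern. It deliberately avoids the zod dependency (a hand-rolled validator stands in for the schema), and `callModel` is a hypothetical stand-in for whatever chat-completion call you use:

```typescript
// Sketch of a validate-and-retry ("reflection") loop. A hand-rolled
// validator stands in for a zod schema; `callModel` is a hypothetical
// stand-in for the actual chat-completion API call.
type Character = { name: string; hp: number };

function validate(raw: string): Character | string {
  // Returns the parsed object, or an error message to feed back to the model.
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (e) {
    return `Invalid JSON: ${(e as Error).message}`;
  }
  const obj = data as Record<string, unknown> | null;
  if (typeof obj?.name !== "string") return "Field 'name' must be a string";
  if (typeof obj?.hp !== "number") return "Field 'hp' must be a number";
  return { name: obj.name as string, hp: obj.hp as number };
}

async function getCharacter(
  callModel: (prompt: string) => Promise<string>,
  maxRetries = 2
): Promise<Character> {
  let prompt = 'Reply ONLY with JSON: {"name": string, "hp": number}';
  for (let i = 0; i <= maxRetries; i++) {
    const result = validate(await callModel(prompt));
    if (typeof result !== "string") return result; // valid, we're done
    // Reflection step: feed the validation error back to the model.
    prompt = `Your last reply was invalid (${result}). Reply ONLY with the corrected JSON.`;
  }
  throw new Error("Model never produced valid JSON");
}
```

The nice property is that the happy path costs nothing extra: if the first reply validates, the loop exits immediately, which matches the "it never gets triggered" experience above.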
> It literally does it every time, perfectly. I remember I put together an entire system that would validate the JSON against a zod schema and use reflection to fix it, and it literally never got triggered because GPT-3.5-turbo always did it right the first time.
Danger! There be assumptions!!
gpt-? is a moving target in rapid development: what it does on Tuesday, which it did not do on Monday, it may well not do on Wednesday.
If there is a documented method to guarantee it, it will work that way (modulo OpenAI bugs - and now Microsoft is involved....)
What we had before, what you are talking about, was observed behaviour. An assumption that what we observed in the past will continue in the future is not something to build a business on.
Are you saying that it returned only JSON before? I'm with the other commenters: it was wildly variable and always at least said "Here is your response", which doesn't parse well.
If you want a parsable response, have it wrap that with ```. Include an example request/response in your history. Treat any message you can’t parse as an error message.
This works well because it has a place to put any “keep in mind” noise. You can actually include that in your example.
Coincidentally, I just published this JS library[1] over the weekend that helps prompt LLMs to return typed JSON data and validates it for you. Would love feedback on it if this is something people here are interested in. Haven’t played around with the new API yet but I think this is super exciting stuff!
Looks promising! Do you retry when the returned JSON is invalid? Personally, I used io-ts for parsing, and GPT seems to be able to correct itself easily when confronted with a well-formed error message.
Great idea, I was going to add basic retries but didn’t think to include the error.
Any other features you’d expect in a prompt builder like this? I’m tempted to add lots of other utility methods like classify(), summarize(), language(), etc.
It's harder to form a tree with key-value pairs. I also tried the relational route, but it would always mess up the cardinality (one person should have 0 to n friends, but a person has a single birth date).
Even with GPT-4, it hallucinates enough that it’s not reliable, forgetting to open/close brackets and quotes. This sounds like it’d be a big improvement.
99% of the time still means it's super frustrating when it fails, if you're using it in a consumer-facing app. You have to clean up the output to avoid getting an error. If it goes from 99% to 100% valid JSON, that is a big deal for me; much simpler.
If you're building an app based on LLMs that expects higher than 99% correctness from them, you are bound to fail. Workarounds for negative scenarios, and retries, are mandatory.
Honestly, I suspect asking GPT-4 to fix your JSON (in a new chat) is a good drunken JSON parser. We are only scraping the surface of what's possible with LLMs. If token generation were free and instant, we could come up with a giant schema of interacting model calls that generates 10 suggestions, iterates over them, ranks them, and picks the best one, as silly as it sounds.
It shouldn't be surprising, though. If a human makes an error parsing JSON, what do you do? You make them look over it again. Unless their intelligence is the bottleneck, they might just be able to fix it.
I already do this today to create domain-specific knowledge focused prompts and then have them iterate back and forth and a ‘moderator’ that chooses what goes in and what doesn’t.
In my experience, telling it "no, that's wrong, try again" just gets it to be wrong in a new, different way, or to restate the same wrong answer slightly differently. I've had to explicitly guide it to correct answers or formats at times.
It's fine, but the article makes some good points why: less cognitive load for GPT and fewer tokens. I think the transistor-to-logic-gate analogy makes sense. You can build the thing perfectly with transistors, but just use the logic gate lol.
Is there any publicly available resource to replicate your work? I would love to just find the right kind of "incantation" for gpt-3.5-turbo or gpt-4 to output a meaningful story arc etc.
Any examples of your work would be greatly helpful as well!
I'm not the person you're asking, but I built a site that allows you to generate fiction if you have an OpenAI API key. You can see the prompts sent in console, and it's all open source:
Pass in an assistant message with "Sure, here is the answer in JSON format:" after the user message. GPT will think it has already done the preamble, and the rest of the message will start right with the JSON.
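In the OpenAI chat-message format that looks roughly like this (message shape only; the actual API call is omitted, and `buildMessages` is just an illustrative helper name):

```typescript
// Sketch of the "prefill" trick: append an assistant turn that already
// contains the preamble, so the model's continuation is just the JSON.
// Message shape follows the OpenAI chat format; the endpoint call itself
// is left out.
type Msg = { role: "system" | "user" | "assistant"; content: string };

function buildMessages(userQuestion: string): Msg[] {
  return [
    { role: "system", content: "You answer in JSON." },
    { role: "user", content: userQuestion },
    // The model treats this as text it already wrote and continues after it,
    // skipping any "Here is your response"-style chatter.
    { role: "assistant", content: "Sure, here is the answer in JSON format:" },
  ];
}
```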