It was already quite easy to get GPT-4 to output json. You just append ‘reply in json with this format’ and it does a really good job.
GPT-3.5 was very haphazard though and needed extensive babysitting and reminding, so if this makes it better then it’s useful. It does come with an annoying disclaimer that it ‘may not reply with valid JSON’, though, so we’ll still have to do some sanity checks on the output.
I have been using this to make a few ‘choose your own adventure’ type games and I can see there’s a TONNE of potentially useful things here.
It literally does it every time, perfectly. I remember I put together an entire system that would validate the JSON against a zod schema and use reflection to fix it, and it literally never got triggered because GPT-3.5-turbo always did it right the first time.
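For anyone curious what that kind of safety net looks like, here is a minimal TypeScript sketch of the validate-and-reflect pattern. It deliberately avoids the zod dependency (a hand-rolled validator stands in for the schema), and `callModel` is a hypothetical stand-in for whatever chat-completion call you use:

```typescript
// Sketch of a validate-and-retry ("reflection") loop. A hand-rolled
// validator stands in for a zod schema; `callModel` is a hypothetical
// stand-in for the actual chat-completion API call.
type Character = { name: string; hp: number };

function validate(raw: string): Character | string {
  // Returns the parsed object, or an error message to feed back to the model.
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (e) {
    return `Invalid JSON: ${(e as Error).message}`;
  }
  const obj = data as Record<string, unknown> | null;
  if (typeof obj?.name !== "string") return "Field 'name' must be a string";
  if (typeof obj?.hp !== "number") return "Field 'hp' must be a number";
  return { name: obj.name as string, hp: obj.hp as number };
}

async function getCharacter(
  callModel: (prompt: string) => Promise<string>,
  maxRetries = 2
): Promise<Character> {
  let prompt = 'Reply ONLY with JSON: {"name": string, "hp": number}';
  for (let i = 0; i <= maxRetries; i++) {
    const result = validate(await callModel(prompt));
    if (typeof result !== "string") return result; // valid, we're done
    // Reflection step: feed the validation error back to the model.
    prompt = `Your last reply was invalid (${result}). Reply ONLY with the corrected JSON.`;
  }
  throw new Error("Model never produced valid JSON");
}
```

The nice property is that the happy path costs nothing extra: if the first reply validates, the loop exits immediately, which matches the "it never gets triggered" experience above.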
> It literally does it every time, perfectly. I remember I put together an entire system that would validate the JSON against a zod schema and use reflection to fix it, and it literally never got triggered because GPT-3.5-turbo always did it right the first time.
Danger! There be assumptions!!
gpt-? is a moving target in rapid development: what it does on Tuesday, which it did not do on Monday, it may well not do on Wednesday.
If there is a documented method to guarantee it, it will work that way (modulo OpenAI bugs - and now Microsoft is involved....)
What we had before, what you are talking about, was observed behaviour. An assumption that what we observed in the past will continue in the future is not something to build a business on.
Are you saying that it returned only JSON before? I'm with the other commenters: it was wildly variable and always at least said "Here is your response", which doesn't parse well.
If you want a parsable response, have it wrap that with ```. Include an example request/response in your history. Treat any message you can’t parse as an error message.
This works well because it has a place to put any “keep in mind” noise. You can actually include that in your example.
Coincidentally, I just published this JS library[1] over the weekend that helps prompt LLMs to return typed JSON data and validates it for you. Would love feedback on it if this is something people here are interested in. Haven’t played around with the new API yet but I think this is super exciting stuff!
Looks promising! Do you retry when the returned JSON is invalid? Personally, I used io-ts for parsing, and GPT seems to be able to correct itself easily when confronted with a well-formed error message.
Great idea, I was going to add basic retries but didn’t think to include the error.
Any other features you’d expect in a prompt builder like this? I’m tempted to add lots of other utility methods like classify(), summarize(), language(), etc.
It's harder to form a tree with key-value pairs. I also tried the relational route, but it would always mess up the cardinality (one person should have 0 to n friends, but a person has a single birth date).
Even with GPT-4, it hallucinates enough that it’s not reliable, forgetting to open/close brackets and quotes. This sounds like it’d be a big improvement.
99% of the time still means it's super frustrating when it fails, if you're using it in a consumer-facing app. You have to clean up the output to avoid getting an error. If it goes from 99% to 100% valid JSON, that is a big deal for me; much simpler.
If you're building an app based on LLMs that expects higher than 99% correctness from them, you are bound to fail. Workarounds for negative scenarios, and retries, are mandatory.
Honestly, I suspect asking GPT-4 to fix your JSON (in a new chat) is a good drunken JSON parser. We are only scraping the surface of what's possible with LLMs. If token generation were free and instant, we could come up with a giant schema of interacting model calls that generates 10 suggestions, iterates over them, ranks them, and picks the best one, as silly as it sounds.
It shouldn't be surprising, though. If a human makes an error parsing JSON, what do you do? You make them look over it again. Unless their intelligence is the bottleneck, they might just be able to fix it.
I already do this today to create domain-specific knowledge focused prompts and then have them iterate back and forth and a ‘moderator’ that chooses what goes in and what doesn’t.
In my experience, telling it "no, that's wrong, try again" just gets it to be wrong in a new, different way, or to restate the same wrong answer slightly differently. I've had to explicitly guide it to correct answers or formats at times.
It's fine, but the article makes some good points why: less cognitive load for GPT and fewer tokens. I think the transistor-to-logic-gate analogy makes sense. You can build the thing perfectly with transistors, but just use the logic gate lol.
Is there any publicly available resource to replicate your work? I would love to just find the right kind of "incantation" for gpt-3.5-turbo or gpt-4 to output a meaningful story arc etc.
Any examples of your work would be greatly helpful as well!
I'm not the person you're asking, but I built a site that allows you to generate fiction if you have an OpenAI API key. You can see the prompts sent in console, and it's all open source:
Pass in an assistant message with "Sure, here is the answer in JSON format:" after the user message. GPT will think it has already done the preamble, and the rest of the message will start right with the JSON.
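In the OpenAI chat-message format that looks roughly like this (message shape only; the actual API call is omitted, and `buildMessages` is just an illustrative helper name):

```typescript
// Sketch of the "prefill" trick: append an assistant turn that already
// contains the preamble, so the model's continuation is just the JSON.
// Message shape follows the OpenAI chat format; the endpoint call itself
// is left out.
type Msg = { role: "system" | "user" | "assistant"; content: string };

function buildMessages(userQuestion: string): Msg[] {
  return [
    { role: "system", content: "You answer in JSON." },
    { role: "user", content: userQuestion },
    // The model treats this as text it already wrote and continues after it,
    // skipping any "Here is your response"-style chatter.
    { role: "assistant", content: "Sure, here is the answer in JSON format:" },
  ];
}
```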