> I also read somewhere (not Wikipedia) that they trained on ChatGPT, Claude, and Gemini queries, basically feeding in the output of competitors’ LLMs as training data.
Did OpenAI observe any ToS when scraping content from the Internet? Sorry, but you cannot complain about someone stealing what you stole. OpenAI cannot even hold the copyright on ChatGPT output, because ChatGPT is a tool.
ToS didn't stop the companies that built those models, and it won't stop the companies that bootstrap off them. Until an AI company eats a multibillion-dollar lawsuit for unlawful data use, they will continue to operate this way.
> Until an AI company eats a multibillion-dollar lawsuit for unlawful data use, they will continue to operate this way
If only. That's my dream: massive copyright lawsuits against all of these AI players, and maybe the courts can do something good for a change and put an end to all of this AI bullshit.
All the labs permitting synthetic data do that.