If you didn’t want people to see and use it, maybe you shouldn’t have made it public? I don’t think training AI is fundamentally different from humans learning from things they see, and we don’t restrict that either.
The fundamental difference is that computers can do this at a pace and scale that humans could never aspire to. There is a natural limit to how much a single individual can be informed by previous works, and that limit sets a natural pace of innovation that is sustainable for both the original artist and those making derivative works.
Computers have no such limitation and can consume almost the entirety of a subject’s work in a few weeks or months. I think that alone is enough to say that yes, it is fundamentally different.
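To put rough numbers on that gap, here is a minimal back-of-envelope sketch; every figure in it is an assumption picked only to show the order of magnitude, not a measurement:

```python
# Back-of-envelope comparison of human vs. machine intake.
# All figures below are illustrative assumptions, not measurements.
human_works_per_day = 5           # generous: works studied per day
human_career_days = 50 * 365      # a 50-year learning career

scraper_works_per_second = 100    # a modest crawler's fetch rate
scraper_days = 30                 # one month of continuous scraping

human_total = human_works_per_day * human_career_days
scraper_total = scraper_works_per_second * 60 * 60 * 24 * scraper_days

print(f"human lifetime: {human_total:,} works")       # 91,250
print(f"scraper month:  {scraper_total:,} works")     # 259,200,000
print(f"ratio: {scraper_total / human_total:,.0f}x")  # ~2,841x
```

Even with deliberately generous human numbers, a single month of machine intake dwarfs a lifetime of human study.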
Automated assembly lines have the same properties. Same with transportation. Buses, trains, airplanes, ships. These all work tirelessly at a pace that humans cannot match.
I think you missed the point. The claim is that AI training from public information is no different from humans learning from public information.
My argument is precisely that the mechanisation of information is fundamentally different from the scale at which a human can learn. One immediate consequence is that there is no longer a natural brake on the scale of what can be sourced for use in a derivative work.
To be clear, this is not a value judgement; I am just pointing out that it _is_ different, just as driving is fundamentally different from what one can do with one’s own feet. Of course the mechanisation of transport is history, and it seems daft to argue against it. But it is different. Whether that’s good or bad is a much harder question.
I can agree that it’s different, but driving is not fundamentally different from walking, in that both get you from one place to another. Nobody drives to a place because it’s fundamentally different from walking there, they do it purely because it’s faster and leaves them less tired.
I think the same thing is true for AI. Or at least, for training or acting on public information. It’s not suddenly bad because you are able to do it on all information in existence at the same time.
When it’s big copyright holders, we have very specific, very granular definitions of what constitutes fair or allowed use. But when it comes to smaller creators, the answer is that it’s their fault for trying to promote their work and make a living.
No. It’s not fundamentally different. Big copyright holders are better at asserting their rights, and had their information under lock and key from the start, but I don’t believe for a moment that these models haven’t been trained on all the content on the Disney website (or on every website with Disney derivative work).
Disney is just better at preventing the LLM companies from letting their models regurgitate that stuff.
1. I’m disinclined to believe that, purely because of the inconvenience of doing so. It’s much easier to scrape the entire internet.
2. How so? If you sell your stuff and someone makes it public, it’s still your choice to sell it.
3. That’s only true as far as recreating the content is concerned. Reading and viewing are, by definition, fair use for publicly visible information.
4. Define fair compensation. I feel like the creatives are just upset that their work is “easy” to replicate with these models. And that isn’t even true: the results never appear as unique or interesting as truly new work.
1. If scraping the entire internet is easier but doesn’t give me results as good as including private or licensed data, I will train my LLM on private data or lose the race. It is not about what’s easy. Just look at some of the lawsuits against OpenAI: https://originality.ai/blog/openai-chatgpt-lawsuit-list
2. Think about this: what you are saying is that you can’t sell anything without also making it public. These are not the same thing. I sell something to get value from my labor. I make something public to get eyes on it. The whole issue is that people who want to privately sell things are having their work undercut by LLMs.
3. LLMs are recreating the content.
4. Take Miyazaki. He spends his whole life developing a unique art style and skill. Years. He makes his living and provides a living for others with it. The value of that _used to be_ the movies he was paid to make and the revenue they generated. Now LLMs can produce work in his style for free, and he doesn’t see a dime whenever someone converts their profile picture into his style. This is the ethical problem: he is not compensated, to say nothing of the ethics of upending artists’ years of work for the sake of it.
These LLMs primarily redistribute intellectual and creative wealth from media conglomerates to anyone on earth with $20. (Without fair compensation, agreed.)
AI is also not American, a puppy, or a cardboard box full of fairies.
Your underlying presumption is that there is some property of humans that makes the distinction important. Without identifying that property, it is impossible to evaluate the merit of any claim about it.
> I don’t think training AI is fundamentally different from humans learning from things they see, and we don’t restrict that either.
It places humans and AI on equal footing, which I fundamentally object to. No, we do not restrict how humans learn, nor do I believe we should. I do believe we can, and ought to, have restrictions on how technology is used within human society. Those restrictions may change and adapt over time, but I wholeheartedly disagree with the premise that AI learning and human learning are not fundamentally different: they differ because one involves a human, whose needs should be placed above those of a machine.
It's the tendency to equate humans and AI that I find both distasteful and potentially dangerous.