OpenAI says it’s “impossible” to create useful AI models without copyrighted material
(arstechnica.com)
I think viral outrage aside, there is a very open question about what constitutes fair use in this application. And I think the viral outrage misunderstands the consequences of enforcing the notion that you can't use openly scrapable online data to build ML models.
Effectively, what the copyright argument does here is make it so that ML models can only legally be made by Meta, Google, Microsoft and maybe a couple of other companies. OpenAI can say whatever, I'm not concerned about them, but I am concerned about open source alternatives getting priced out of that market. I am also concerned about what it does to previously available APIs, as we've seen with Twitter and Reddit.
I get that it's fashionable to hate on these things, and it's fashionable to repeat the bit of misinformation about models being a copy or a collage of training data, but there are ramifications here that people aren't talking about. I fear we're heading toward the worst possible future on this, where AI models are effectively ubiquitous but legally limited to major data brokers who added clauses to own the AI training rights to content from their billions of users.
It is an open question. As others have pointed out, a human taking inspiration from the work of others is totally fine. My issue is that AIs are not human.
A human's production of work is limited. A human can only produce so fast for so long. An AI could theoretically be scaled infinitely and produce indefinitely. I don't want to live in a world where FAANGCORP's OmniAI is responsible for 90% of all art, media, and music because humans can't keep pace with it.
A lot of this can be traced back to the invention of photography, which is a fun point of reference, if one goes to dig up the debate at the time.
In any case, the idea that humans can only produce so fast for so long, and that this somehow keeps the channel clean, just doesn't track. We are flooded by low-quality content enabled by social media. There are seven billion of us, two or three billion of those are on social platforms, and a whole bunch of the content being shared in channels is created by using corporate tools to make stuff by pointing phones at it. I guarantee that people will still go to museums to look at art regardless of how much cookie-cutter AI stuff gets shared.
However, I absolutely wouldn't want a handful of corporations to have the ability to empower their employed artists with tools to run 10x faster than freelance artists. That is a horrifying proposition. Art is art. The difficulty isn't in making the thing technically (say hello, Marcel Duchamp, I bet you thought you had already litigated this). Artists are gonna art, but it's important that nobody has a monopoly on the tools to make art.