this post was submitted on 09 Aug 2023

Technology

Google says AI systems should be able to mine publishers’ work unless companies opt out, turning copyright law on its head (www.theguardian.com)

submitted 1 year ago by [email protected] to c/[email protected]

In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.

[–] [email protected] 0 points 1 year ago (1 children)

Copyright law already allows generative AI systems to scrape the internet. You need to change the law to forbid something; it isn't forbidden by default. Currently, if something is published publicly, it can be read and learned from by anyone (or anything) that can see it. Copyright law only prevents making copies of it, which a large language model does not do when trained on it.

[–] [email protected] 0 points 1 year ago (2 children)

An AI model is a derivative work of its training data and thus a copyright violation if the training data is copyrighted.

[–] [email protected] 0 points 1 year ago (1 children)

It is not a derivative work: the model does not contain any recognizable part of the original material that it was trained on.

[–] [email protected] 0 points 1 year ago (1 children)

Except when it produces exact copies of existing works, or when it includes a recognisable signature or watermark?

[+] [email protected] 0 points 1 year ago (1 children)

[deleted]

[–] [email protected] 0 points 1 year ago (1 children)

The point is that if the model doesn't contain any recognisable parts of the original material it was trained on, how can it reproduce recognisable parts of the original material it was trained on?

[–] [email protected] 0 points 1 year ago

That's sorta the point of it.
I can recreate the phrase "apple pie" in any number of styles and fonts using my hands and a writing tool. Would you say that I "contain" the phrase "apple pie"? Where is the letter 'p' in my brain?

Specifically, the AI contains the relationship between sets of words, and sets of relationships between lines, contrasts and colors.
From there, it knows how to take a set of words and make an image that proportionally replicates those line patterns and color relationships.

You can probably replicate the Getty Images watermark closely enough for it to be recognizable, but you don't contain a copy of it in the sense that people typically mean.
Likewise, because you can recognize the artist who produced a piece, you contain an awareness of that same relationship between color, contrast and line that the AI does. I could show you a Picasso you were unfamiliar with, and you'd likely know it was him based on the style.
You've been "trained" on his works, so you have internalized many of the key markers of his style. That doesn't mean you "contain" his works.
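To make the "relationships, not copies" idea a bit more concrete, here's a toy sketch. Big caveat: this is a tiny bigram word counter, not how an actual LLM or image generator works, and the function names and the little corpus are made up for illustration. It only shows that a model can store nothing but "which word tends to follow which" and still regenerate a phrase that shows up a lot in its training text.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, length):
    """Greedily emit the most frequent next word, beginning at `start`."""
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:
            break  # no known continuation for this word
        out.append(options.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical training text, invented for this example.
corpus = "apple pie is sweet . apple pie is warm . apple pie is sweet ."
model = train_bigrams(corpus)

# The model holds only per-word counts, e.g. what follows "is".
print(dict(model["is"]))
print(generate(model, "apple", 4))
```

The table never holds the sentence as a unit, only word-to-word counts, yet the greedy walk can still spit out a frequent training phrase. Whether that counts as "containing" the phrase is exactly what's being argued about here.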

[–] [email protected] 0 points 1 year ago (1 children)

A human is a derivative work of its training data, thus a copyright violation if the training data is copyrighted.

The difference between a human and AI is getting much smaller all the time. The training process is essentially the same at this point: show them a bunch of examples, then have them practice and provide feedback.

If that human is trained to draw on Disney art and then goes on to create art in a similar style for sale, that isn't copyright infringement. Nor should it be.

[–] [email protected] 0 points 1 year ago (1 children)

This is stupid and I'll tell you why.
As humans, we have a perception filter. This filter is unique to every individual because it's fed by our experiences and emotions. Artists make great use of this by producing art that reflects their view of the world. That's why Van Gogh or Picasso is interesting: they had a unique view of the world that shows through their work.
These bots do not have perception filters. They're designed to break down whatever they're trained on into numbers and decipher how the style is constructed so they can replicate it. They have no intention or purpose behind any of their decisions beyond straight replication.
You would be correct if a human's only goal was to replicate Van Gogh's style, but that's not true of every artist. With these art bots, that's the only goal that they will ever have.

I have to repeat this every time there's a discussion on LLMs or art bots:
The imitation of intelligence does not equate to actual intelligence.

[–] [email protected] 0 points 1 year ago

Absolutely agreed! I think if the proponents of AI artwork actually had any knowledge of art history, they'd understand that humans don't just iterate the same ideas over and over again. Van Gogh, Picasso, and many others, did work that was genuinely unique and not just a derivative of what had come before, because they brought more to the process than just looking at other artworks.

[–] [email protected] 0 points 1 year ago* (last edited 1 year ago)

It’s not turning copyright law on its head; in fact, asserting that copyright needs to be expanded to cover training on a data set IS turning it on its head. This is not a reproduction of the original work; it’s learning about that work and making a transformative use of it. A generative work using a trained dataset isn’t copying the original; it’s learning about the relationships that the original has to the other pieces in the data set.