this post was submitted on 16 Jan 2024

Why is AI Pornifying Asian Women? (joysauce.com)

submitted 1 year ago by Gaywallet@beehaw.org to c/technology@beehaw.org

[–] jarfil@beehaw.org 3 points 1 year ago (2 children)

"Inclusive models" would need to be larger.

Right now people seem to prefer smaller quantized models, with whatever set of even smaller LoRAs on top, that make them output what they want... and only include more generic elements in the base model.

[–] Muehe@lemmy.ml 1 points 1 year ago (1 children)

“Inclusive models” would need to be larger.

[citation needed]

To my understanding, the problem is that the models reproduce biases in the training material, not model size. Alignment is currently a manual process after the initial unsupervised learning phase, often done by click-workers (Reinforcement Learning from Human Feedback, RLHF), and aimed at coaxing the model towards more "politically correct" outputs. But by that point the damage is already done, since the bias is encoded in the model weights and will resurface in the outputs, either at random or if you "jailbreak" enough.

In the context of the OP, if your training material has a high volume of sexualised depictions of Asian women the model will reproduce that in its outputs. Which is also the argument the article makes. So what you need for more inclusive models is essentially a de-biased training set designed with that specific purpose in mind.

I'm glad to be corrected here, especially if you have any sources to look at.

[–] jarfil@beehaw.org 2 points 1 year ago (1 children)

You can cite me on this:

First, there is no such thing as a "de-biased" training set, only sets with whatever target series of biases you define for them to reflect.

Then, there are only two ways to change the biases of a training set:

either you replace data until your desired objective, which will reduce the model's quality for any of the alternatives
or you add data until your desired objective, which will require an increased size to encode the increased amount of data, or the model's quality will go down for all cases (you'd be diluting every other case)
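
To put rough numbers on the two options above, here is a small arithmetic sketch. The dataset size and the fractions are made up for illustration, and `samples_to_replace` / `samples_to_add` are hypothetical helper names; the sketch only counts samples and says nothing about model quality.

```python
from fractions import Fraction

def samples_to_replace(total, current_frac, target_frac):
    """Samples to swap into the category so it reaches target_frac.
    The dataset size stays at `total`."""
    return (target_frac - current_frac) * total

def samples_to_add(total, current_frac, target_frac):
    """New samples of the category to append so it reaches target_frac.
    Solves (current_frac*total + x) / (total + x) = target_frac for x."""
    return (target_frac - current_frac) * total / (1 - target_frac)

# Illustrative numbers only (assumptions, not from the thread).
total = 1_000_000
current = Fraction(1, 100)   # category is 1% of the set today
target = Fraction(10, 100)   # we want it to be 10%

replaced = samples_to_replace(total, current, target)
added = samples_to_add(total, current, target)

print(int(replaced))                   # samples of other data that get pushed out
print(int(added))                      # samples appended instead
print(float((total + added) / total))  # growth factor of the set when adding
```

Replacing keeps the set the same size but removes other data; adding keeps everything but makes the set larger, which is the trade-off described in the two options.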

For reference, LoRAs are a sledgehammer approach to apply the first way.

As for the article, it's talking about the output of some app, with unknown extra prompting and LoRAs getting applied in the back, so it's worthless as a discussion of the underlying model, much less as a discussion of all models.

[–] Muehe@lemmy.ml 1 points 1 year ago (1 children)

First, there is no such thing as a “de-biased” training set, only sets with whatever target series of biases you define for them to reflect.

Yes, I obviously meant "de-biased" by the definition of whoever makes the set. Didn't think it worth mentioning, as it seems self-evident. But again, in concrete terms regarding the OP, this just means not having your dataset skewed towards sexualised depictions of certain groups.

either you replace data until your desired objective, which will reduce the model’s quality for any of the alternatives

[...]
For reference, LoRAs are a sledgehammer approach to apply the first way.

The paper introducing LoRA seems to disagree (emphasis mine):

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

There is no data replaced, the model is not changed at all. In fact if I'm not misunderstanding it adds an additional neural network on top of the pre-trained one, i.e. it's adding data instead of replacing any. Fighting bias with bias if you will.

And I think this is relevant to a discussion of all models, as reproduction of training set biases is something common to all neural networks.

[–] jarfil@beehaw.org 2 points 1 year ago* (last edited 1 year ago) (1 children)

That paper is correct (emphasis mine):

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

You can see how it works in the "Introduction" section, particularly figure 1, or in this nice writeup:

https://dataman-ai.medium.com/fine-tune-a-gpt-lora-e9b72ad4ad3

LoRA is a "space and time efficient" technique to produce a modification matrix for each layer. It doesn't introduce new layers, or add data to any layer. To the contrary, it's bludgeoning all the separate values in each layer, and modifying each whole column and whole row by the same delta (or only a few deltas; in any case the rank r of A and B is much smaller than the dimensions of W, r ≪ k and r ≪ d).
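
As a rough illustration of the sizes involved, the sketch below counts trainable parameters for one weight matrix and builds a tiny rank-1 update. In the paper's notation W is d x k, B is d x r and A is r x k; the concrete dimensions (4096, r = 8) and the toy matrices are assumptions picked for the example, and `lora_param_counts` and `matmul` are hypothetical helpers.

```python
def lora_param_counts(d, k, r):
    """Return (full, low_rank) trainable-parameter counts for one d x k matrix."""
    full = d * k            # training a dense delta-W directly
    low_rank = r * (d + k)  # B is d x r, A is r x k
    return full, low_rank

def matmul(X, Y):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

# Illustrative sizes, not taken from the thread.
full, low_rank = lora_param_counts(d=4096, k=4096, r=8)
print(full, low_rank, full // low_rank)

# Toy rank-1 update: every row of delta_W is a multiple of the single row of A.
B = [[1.0], [2.0], [-0.5]]    # 3 x 1
A = [[0.1, 0.2, 0.3, 0.4]]    # 1 x 4
delta_W = matmul(B, A)        # 3 x 4
for row in delta_W:
    print(row)
```

With r = 1 the update is an outer product, so all rows share one direction and differ only by a scale factor; a larger r allows a few more independent directions, but still far fewer than a dense d x k update.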

Turns out... that's enough to apply some broad strokes type of changes to a model, which still limps along thanks to the remaining value variation. But don't be mistaken: with each additional LoRA applied, a model loses some of its finer details, until at some point it descends into total nonsense.

[–] Muehe@lemmy.ml 1 points 1 year ago (1 children)

Yeah but that's my point, right?

That

you do not "replace data until your desired objective".
the original model stays intact (the W in the picture you embedded).

Meaning that when you change or remove the LoRA (A and B), the same types of biases will just resurface from the original model (W). Hence "less biased" W being the preferable solution, where possible.

Don't get me wrong, LoRAs seem quite interesting, they just don't seem like a good general approach to fighting model bias.

[–] jarfil@beehaw.org 2 points 1 year ago* (last edited 1 year ago) (1 children)

"less biased" W being the preferable solution, where possible.

Not necessarily. There are two parts to a diffusion model: a tokenizer, and a neural network with a series of layers (W in this case would be a single layer) that react in some way to some tokens. What you really want, is a W "with more information", no matter if some tokens refer to a more or less "fair" (less biased) portion of it.

It doesn't really matter if "girl = 99% chance of white girl + 1% of [other skin tone] girl", and "asian girl = sexualized asian girl"... as long as the "biased" token associations don't reduce the amount of "[skin tone] girl" variants you can extract with specific prompts, and still react correctly to negative prompts like "asian girl -sexualized".

LoRAs are a way to bludgeon a whole model into a strong bias, like "everything is a manga", or "everything is birds", or "all skin is frogs", and so on. The interesting thing about LoRAs is that, if you get a base model where "girl = sexualized white girl", and add an "all faces are asian" LoRA, and a "no sexualized parts" LoRA... then well, you've beaten the model into submission without having to use prompts (kind of a pyrrhic victory).

That is, unless you want something like a "multiracial female basketball team".

That would require the model to encode the "race" as multiple sets of features, then pick one at random for every player in whatever proportion you find acceptable... but for that, you're likely better off adding an LLM preprocessor stage to pick a random set of races in your desired proportion, then have it instruct a bounding-box diffusion model to draw each player with a specific prompt, so the bias of the model's tokens would again become irrelevant.
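
A minimal sketch of that preprocessing idea, with plain weighted random sampling standing in for the LLM stage and no bounding-box step at all. `plan_prompts`, the placeholder descriptors, and the weights are all hypothetical; a real pipeline would pass each resulting prompt to a region-conditioned image model.

```python
import random

def plan_prompts(subject, descriptors, weights, count, seed=None):
    """Pick one descriptor per subject according to `weights`
    and return one prompt string per subject."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    picks = rng.choices(descriptors, weights=weights, k=count)
    return [f"{descriptor} {subject}" for descriptor in picks]

# Placeholder descriptors and proportions (assumptions for illustration).
descriptors = ["descriptor_a", "descriptor_b", "descriptor_c"]
weights = [0.5, 0.3, 0.2]

prompts = plan_prompts("basketball player", descriptors, weights,
                       count=5, seed=0)
for prompt in prompts:
    print(prompt)
```

The proportions live entirely in `weights`, outside the image model, which is why the base model's own token biases matter less in this setup.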

Forcing the model to encode more variants per token, is where you start needing a larger model, or start losing quality.

[–] Muehe@lemmy.ml 1 points 1 year ago (1 children)

a neural network with a series of layers (W in this case would be a single layer)

I understood this differently. W is a whole model, not a single layer of a model. W is a layer of the Transformer architecture, not of a model. So it is a single feed forward or attention model, which is a layer in the Transformer. As the paper says, a LoRA:

injects trainable rank decomposition matrices into each layer of the Transformer architecture

It basically learns shifting the output of each Transformer layer. But the original Transformer stays intact, which is the whole point, as it lets you quickly train a LoRA when you need this extra bias, and you can switch to another for a different task easily, without re-training your Transformer. So if the source of the bias you want to get rid of is already in these original models in the Transformer, you are just fighting fire with fire.

Which is a good approach for specific situations, but not for general ones. In the context of OP you would need one LoRA for fighting it sexualising Asian women, then you would need another one for the next bias you find, and before you know it you have hundreds and your output quality has degraded irrecoverably.

[–] jarfil@beehaw.org 1 points 1 year ago

It basically learns shifting the output of each Transformer layer

That would increase inference time, which is something they explicitly avoid.

Check point 4.1 in the paper. W is a weight matrix for a single layer, and the training focuses on finding a ∆W such that the result is fine tuned. The LoRA optimization lies in calculating a ∆W in the form of BA with lower ranks, but W still being a weight matrix for the layer, not its output:

W0 + ∆W = W0 + BA

A bit later:

When deployed in production, we can explicitly compute and store W = W0 + BA and perform inference as usual

W0 being the model's layer's original weight matrix, and W being the modified weight matrix that's being "executed".
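
The equations above can be checked on a toy example. The sketch below compares the unmerged form, W0·x + B·(A·x), with the merged form, (W0 + BA)·x; the matrices are tiny made-up values and the helper functions are hypothetical, written with plain lists rather than any ML library.

```python
def matmul(X, Y):
    """Matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matadd(X, Y, sign=1.0):
    """Element-wise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Toy values (assumptions): d = 2, k = 3, r = 1.
W0 = [[1.0, 0.0, 2.0],
      [0.0, 1.0, -1.0]]
B = [[0.5], [-1.0]]        # d x r
A = [[1.0, 2.0, 0.0]]      # r x k
x = [3.0, 1.0, 2.0]

# Unmerged: keep W0 frozen and add the low-rank path at run time.
h_unmerged = [a + b for a, b in zip(matvec(W0, x), matvec(B, matvec(A, x)))]

# Merged: compute W = W0 + BA once, then run inference as usual.
BA = matmul(B, A)
W = matadd(W0, BA)
h_merged = matvec(W, x)

print(h_unmerged, h_merged)

# In this toy case, subtracting BA again gives back the original weights.
W_restored = matadd(W, BA, sign=-1.0)
print(W_restored == W0)
```

Both forms give the same output; the merged one simply avoids the extra matrix products at inference time, which is the point made about latency.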

the original Transformer stays intact

At training time, yes. At inference time, no.

before you know it you have hundreds and your output quality has degraded irrecoverably.

This is correct. Just not because you've messed with the output of each layer, but with the weights of each layer... I'd guess messing with the outputs would cause a quicker degradation.

[–] Even_Adder@lemmy.dbzer0.com 1 points 1 year ago (1 children)

I wouldn't mind. I'm here for it.

[–] jarfil@beehaw.org 2 points 1 year ago (1 children)

Are you ready to run a 100B FP64 parameter model? Or even a 10B FP32 one?

Over time, I wouldn't be surprised if 500B INT8 models became commonplace with neuromorphic RAM, but there's still some time for that to happen.
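
For a sense of scale, raw weight storage is roughly the parameter count times the bytes per parameter. The sketch below applies that to the sizes mentioned above; it ignores activations, caches, and any runtime overhead, and `weights_size_gb` is a hypothetical helper.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"FP64": 8, "FP32": 4, "FP16": 2, "INT8": 1}

def weights_size_gb(params_billion, dtype):
    """Raw weight storage in GB (10^9 bytes) for a model with
    `params_billion` billion parameters stored as `dtype`."""
    return params_billion * BYTES_PER_PARAM[dtype]

for params_billion, dtype in [(100, "FP64"), (10, "FP32"), (500, "INT8")]:
    size = weights_size_gb(params_billion, dtype)
    print(f"{params_billion}B {dtype}: {size} GB")
```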

[–] Even_Adder@lemmy.dbzer0.com 1 points 1 year ago* (last edited 1 year ago) (1 children)

You don't need that many parameters, 4gb checkpoints work just fine.

[–] jarfil@beehaw.org 2 points 1 year ago (1 children)

For more inclusive models, or for current ones? In order to add something, either the size has to grow, or something would need to get pushed out (content, or quality). 4GB models are already at the limit of usefulness, both DALLE3 and SDXL run at about 12B parameters, so to make them "more inclusive" they'd have to grow.

[–] Even_Adder@lemmy.dbzer0.com 3 points 1 year ago (1 children)

I'm saying SD 1.5 and SDXL capture the concepts just fine, it's just during fine-tuning people train away some of the diversity.

[–] jarfil@beehaw.org 1 points 1 year ago (1 children)

Wait, by "fine-tuning"... do you mean LoRAs? Because those are more like brain surgery with a sledgehammer, rather the opposite of "fine". I don't think it's possible for LoRAs to avoid having undesirable side effects... and I don't think people even want that.

Actual "fine" tuning, would be adding the LoRA's training data to the original set, then training the whole model from scratch... and that would require increasing the model's size to encode the increased amount of data for the same output quality.

[–] Even_Adder@lemmy.dbzer0.com 2 points 1 year ago (1 children)

I mean like this. This paper just dropped the other day.

[–] jarfil@beehaw.org 1 points 1 year ago* (last edited 1 year ago) (1 children)

Nice read, and an interesting approach... although it kind of tries to hide the elephant in the room:

This work has the potential to shift the way that image generators operate at achievable costs to ensure that several categories of harm from ‘AI’ generated models are mitigated, while the generated images become much more realistic and representative of the AI-generated images that populations want around the world.

They show that the approach optimizes for less "stereotypes" and less "offensive", which in most cultures leads from worse to better "cultural representation"... but notice how there is a split in the "Indian" culture cohort, with an equal amount finding "more stereotypical, more offensive" to be just as good at "cultural representation":

They basically made the model more politically correct and "idealized", but in the process removed part of a culture representation that wasn't wrong, because the "culture" itself is split to begin with.

[–] Even_Adder@lemmy.dbzer0.com 1 points 1 year ago (1 children)

"Indian" is a huge population of very diverse people.

[–] jarfil@beehaw.org 1 points 1 year ago (1 children)

That's my point. They claim to reduce misrepresentation, while at the same time they erase a bunch of correct representations.

Going back to what I was saying: fine tuning doesn't increase diversity, it only shifts the biases. Encoding actual diversity would require increasing the model, then making sure it can output every correct representation.

[–] Even_Adder@lemmy.dbzer0.com 3 points 1 year ago (1 children)

It doesn't necessarily have to shift away from diversity biases. I think with care, you can preserve the biases that matter most. That was just their first shot at it, this seems like something you'd get better at over time.

[–] jarfil@beehaw.org 2 points 1 year ago

I guess their main shortcoming was the cultural training set. I'm still unconvinced that level of fine tuning is possible without increasing model size, but we'll see what happens if/when someone curates a much larger set with cultural labeling.

The labels might also need to be more granular, like "culture:subculture:period", or something... which is kind of a snake's nest by itself.