this post was submitted on 07 Apr 2025
488 points (99.2% liked)
linuxmemes
That's a view from the perspective of utility, yeah. The downvotes here are likely also from an ethical standpoint, since most LLMs are currently trained on other people's work without permission, all while using large amounts of water for cooling and energy from our mostly coal-powered grid. And that's not even mentioning the physical and emotional labor that many untrained workers are required to do when sifting through the datasets of these LLMs, removing unsavory data for extremely low wages.
A smaller, more specialized LLM could likely perform this same function with much less training, on a more exclusive data set (probably only a couple of terabytes at its largest, I'd wager), and would likely be small enough to run on most users' computers after training. That'd be the more ethical version of this use case.
True, those are awful problems. The whole internet is suffering from this; I constantly read about fedi instances being literally DDoS'ed by crawlers that ignore robots.txt and circumvent IP blocks. Unfortunately there's no way to prevent any of this right now with our current set of technologies… the best thing we could do is make it a state- or even UN-level affair, reducing the amount of simultaneous training and focusing on cooperation instead of competition while upholding strong workers' rights. However, that would also be very anti-capitalistic and supposedly "stifle innovation" (as if that's important in comparison to, idk, our world burning?), so it won't happen either. Banning it completely is of course also impossible: humanity is still way too divided, and its benefits for "defense" (against ourselves) are too high.
As for running locally, we're currently seeing a new wave of chip designs that'll enable this on increasingly reasonable devices. I'm getting a new laptop with an XDNA2 chip and 32 GB of RAM today, on which I want to try running Codestral under Linux (the driver arrived natively in kernel 6.14). Technically it should be possible to run any ~30b model on those newer chips (potentially slightly quantized). I'll definitely try this a little bit and probably write a thread about it in the openSUSE forums if you're interested.
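For a rough sense of why quantization comes into it, here's a back-of-the-envelope sketch. It assumes the weights dominate memory use and ignores the KV cache, activations, and any per-block overhead that real quantization formats add, so actual usage would be somewhat higher. The function name and the bit widths are just illustrative picks, not tied to any particular runtime.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width and
    nothing else (KV cache, activations, runtime) is counted.
    """
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    n_params = 30e9   # a ~30b model, as mentioned above
    ram_gb = 32       # the laptop's total RAM, shared with the OS

    for bits in (16, 8, 4):
        size = weight_memory_gb(n_params, bits)
        verdict = "fits" if size <= ram_gb else "does not fit"
        print(f"{bits:>2}-bit: ~{size:.0f} GB of weights, {verdict} in {ram_gb} GB")
```

By this estimate the unquantized 16-bit weights alone would need about 60 GB, 8-bit lands at about 30 GB (which leaves almost nothing for the OS and context), and 4-bit comes out around 15 GB, so some quantization looks necessary on a 32 GB machine.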