Selfhosted
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
Rules:
-
Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
-
No spam posting.
-
Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
-
Don't duplicate the full text of your blog or github here. Just post the link for folks to click.
-
Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).
-
No trolling.
Resources:
- selfh.st Newsletter and index of selfhosted software and apps
- awesome-selfhosted software
- awesome-sysadmin resources
- Self-Hosted Podcast from Jupiter Broadcasting
Any issues on the community? Report it using the report flag.
Questions? DM the mods!
view the rest of the comments
I know you said consumer GPU, but I run a used Tesla P40. It has 24 GB of vram. The price has gone up since I got it a couple years ago, there might be better options in the same price category. Still, it's going to be cheaper than a modern full fat consumer gpu, with a reasonable performance hit.
My use case is text generation, chat kind of things. In most cases, the inference is more than fast enough, but it can get slow when swapping out large context lengths.
Mostly I run quantized 8-20B models with the sweet spot being around 12. For specialized use cases outside of general language, you can run more compact models. The general output is quite good, and I would have never had thought it was possible 10 years ago.
ETA: I paid about $200 USD for the P40 a couple years ago, plus the price for a fan and 3d printed shroud.