this post was submitted on 09 Jan 2024
68 points (98.6% liked)

Selfhosted

39893 readers
374 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules:

  1. Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.

  2. No spam posting.

  3. Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.

  4. Don't duplicate the full text of your blog or github here. Just post the link for folks to click.

  5. Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).

  6. No trolling.

Resources:

Any issues on the community? Report it using the report flag.

Questions? DM the mods!

founded 1 year ago
MODERATORS
 

Currently, I run Unraid and have all of my services' setup there as docker containers. While this is nice and easy to setup initially, it has some major downsides:

  • It's fragile. Unraid is prone to bugs/crashes with docker that take down my containers. It's also not resilient so when things break I have to log in and fiddle.
  • It's mutable. I can't use any infrastructure-as-code tools like terraform, and configuration sort of just exist in the UI. I can't really roll back or recover easily.
  • It's single-node. Everything is tied to my one big server that runs the NAS, but I'd rather have the NAS as a separate fairly low-power appliance and then have a separate machine to handle things like VMs and containers.

So I'm looking ahead and thinking about what the next iteration of my homelab will look like. While I like unraid for the storage stuff, I'm a little tired of wrangling it into a container orchestrator and hypervisor, and I think this year I'll split that job out to a dedicated machine. I'm comfortable with, and in fact prefer, IaC over fancy UIs and so would love to be able to use terraform or Pulumi or something like that. I would prefer something multi-node, as I want to be able to tie multiple machines together. And I want something that is fault-tolerant, as I host services for friends and family that currently require a lot of manual intervention to fix when they go down.

So the question is: how do you all do this? Kubernetes, docker-compose, Hashicorp Nomad? Do you run k3s, Harvester, or what? I'd love to get an idea of what people are doing and why, so I can get some ideas as to what I might do.

you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 1 points 9 months ago (1 children)

That's an interesting issue. Do you think the problem would be the same for any CSI plugin? I'm thinking of using my NAS as the storage brains of the operation and hooking it up with NFS or something, but would that have issues with stateful stuff like DB's too?

[–] [email protected] 2 points 9 months ago

I have never used NFS, but I think it would fare much better than seaweedfs because it uses Fuse to implement CSI. So for NFS I am sure the protocol would consider half-assed writes

would be the same for any CSI plugin

No, it would depend on the CSI plugin and how it is implemented. Ceph for example I know it has several, and cloud providers offer CSI volumes for their block storage (AWS EBS, GCP PD), and they will all perform differently. See this comment from a seaweedfs issue:

[...] It is always better to run databases on host volumes if you can (or on volumes provided by AWS EBS or similar). But with Seaweedfs especially if you are running postgres with seaweedfs-csi volume be prepared for data corruption. Seaweefs-csi uses FUSE, if anything happens to seaweedfs-csi (Nomad client restart, docker restart, OOM) mount will be lost and data corruption will happen.

Running on CEPH (since CEPH CSI using Kernel driver not FUSE) is acceptable if you fine with low TPS.

I found it was easier to make recoverable, backed up, host volumes than to make DBs run on high availability filesystems like seaweedfs (I admit I have not tried Ceph - the deployment looked a bit complicated/overkill for a homelab).

Postgres and sqlite are just not made for that environment. To run a high-availability DB, it is better to run a distributed DB made for that (think etcd, cassandra) than to run a non-distributed DB on top of a distributed filesystem.

Good luck! :)