Selfhosted

40734 readers

427 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules:

Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
No spam posting.
Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
Don't duplicate the full text of your blog or github here. Just post the link for folks to click.
Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).
No trolling.

Resources:

selfh.st Newsletter and index of selfhosted software and apps
awesome-selfhosted software
awesome-sysadmin resources
Self-Hosted Podcast from Jupiter Broadcasting

Any issues on the community? Report it using the report flag.

Questions? DM the mods!

founded 2 years ago

MODERATORS

[email protected]

Automated CI/CD Data Snapshots (lemmy.ml)

submitted 6 months ago by [email protected] to c/[email protected]

4 comments fedilink hide all child comments

cross-posted from: https://lemmy.ml/post/16693054

Is there a feature in a CI/CD pipeline that creates a snapshot or backup of a service's data prior to running a deployment? The steps of a ideal workflow that I am searching for are similar to:

CI tool identifies new version of service and creates a pull request

Manually merge pull request

CD tool identifies changes to Git repo

CD tool creates data snapshot and/or data backup

CD tool deploys update

Issue with deployment identified that requires rollback

Git repo reverted to prior commit and/or Git repo manually modified to prior version of service

CD tool identifies the rolled back version

(OPTIONAL) CD tool creates data snapshot and/or data backup

CD tool reverts to snapshot taken prior to upgrade

CD tool deploys service to prior version per the Git repo

(OPTIONAL) CD tool prunes data snapshot and/or data backup based on provided parameters (eg - delete snapshots after _ days, only keep 3 most recently deployed snapshots, only keep snapshots for major version releases, only keep one snapshot for each latest major, minor, and patch version, etc.)

top 4 comments

sorted by: hot top controversial new old

[–] [email protected] 6 points 6 months ago (1 children)

Not sure what you're doing, but if we're talking about a bog standard service backed by a db, I don't think having automated reverts of that data is the best idea. you might lose something! That said, triggering a snapshot of your db as a step before deployment is a pretty reasonable idea.

Reverting a service back to a previous version should be straightforward enough, and any dedicated ci/cd tool should have an API to get you information from the last successful deploy, whether that is the actual artifact you're deploying, or a reference to a registry.

As you're probably entirely unsurprised by, there are a ton of ways to skin this cat. you might consider investing in preventative measures, testing your data migration in a lower environment, splitting out db change commits from service logic commits, doing some sort of blue/green or canary deployment.

I get fairly nerd-sniped when it comes to build pipelines so happy to talk more concretely if you'd like to provide some more details!

[–] [email protected] 1 points 6 months ago (1 children)

Thanks for the reply! I am currently looking to do this for a Kubernetes cluster running various services to more reliably (and frequently) perform upgrades with automated rollbacks when necessary. At some point in the future, it may include services I am developing, but at the moment that is not the intended use case.

I am not currently familiar enough with the CI/CD pipeline (currently Renovatebot and ArgoCD) to reliably accomplish automated rollbacks, but I believe I can get everything working with the exception of rolling back a data backup (especially for upgrades that contain backwards incompatible database changes). In terms of storage, I am open to using various selfhosted services/platforms even if it means drastically changing the setup (eg - moving from TrueNAS to Longhorn, moving from Ceph to Proxmox, etc.) if it means I can accomplish this without a noticeable performance degradation to any of the services.

I understand that it can be challenging (or maybe impossible) to reliably generate backups while the services are running. I also understand that the best way to do this for databases would be to stop the service and perform a database dump. However, I'm not too concerned with losing <10 seconds of data (or however long the backup jobs take) if the backups can be performed in a way that does not result in corrupted data. Realistically, the most common use cases for the rollbacks would be invalid Kubernetes resources/application configuration as a result of the upgrade or the removal/change of a feature that I depend on.

[–] [email protected] 2 points 6 months ago (1 children)

Are you using PersistentVolumes? If your storage class supports it, looks like there's a volume snapshot concept you can use, have you looked into that?

[–] [email protected] 1 points 6 months ago

Yes, I am using PersistentVolumes. I have played around with different tools that have backup/snapshot abilities, but I haven't seen a way to integrate that functionality with a CD tool. I'm sure if I spent enough time working through things, I may be able to put together something that allows the CD tool to take a snapshot. However, I think that having it handle rollbacks would be a bit too much for me to handle without assistance.