this post was submitted on 18 Jan 2024
I used to write tons of automation in my previous data role. While time saved matters, the other important takeaway is reproducibility. Other people on the team were writing giant SQL scripts, highlighting and running each one, and then manually checking to see if it worked... I'm talking about tables anywhere from 1 to 100 million records. You aren't checking shit by skimming a top 1000. And what a ridiculously error-prone process that is. Take the human out of that equation!
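For illustration, here is a minimal sketch of what "take the human out" could look like, assuming Python's built-in `sqlite3` and made-up table names (`raw_orders`, `clean_orders`). The helper `run_checked` and the two checks are hypothetical, not the scripts from that job. The idea is just that every statement runs in order and every check counts offending rows across the whole table instead of eyeballing a sample.

```python
import sqlite3

def run_checked(conn, statements, checks):
    """Run every statement in order, then run each check against the full table.

    Each check is a (name, sql) pair whose query returns the number of
    offending rows; anything other than 0 is treated as a failure.
    """
    for sql in statements:
        conn.execute(sql)
    for name, sql in checks:
        bad = conn.execute(sql).fetchone()[0]
        if bad != 0:
            raise ValueError(f"check {name!r} failed on {bad} rows")

# Hypothetical sample data, only here so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [(1, 9.5), (2, 20.0), (3, 5.25)],
)
conn.commit()

statements = [
    "CREATE TABLE clean_orders AS "
    "SELECT id, amount FROM raw_orders WHERE amount > 0",
]
checks = [
    ("no_null_ids", "SELECT COUNT(*) FROM clean_orders WHERE id IS NULL"),
    ("no_duplicate_ids",
     "SELECT COUNT(*) - COUNT(DISTINCT id) FROM clean_orders"),
]

run_checked(conn, statements, checks)
row_count = conn.execute("SELECT COUNT(*) FROM clean_orders").fetchone()[0]
```

Run it twice on the same input and you get the same result or the same failure, which is the reproducibility part.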
If the data came out wrong, it would be because the data came in different/corrupted, not because I missed a query. Speaking of differences causing problems... one time a company sent us data that was fixed width by character instead of fixed width by byte. Smh...
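A small sketch of why that distinction bites, assuming UTF-8 and a made-up two-field layout (5 wide, then 3 wide). The function names are hypothetical. Any character that takes more than one byte shifts every later field when the file was written by character but gets read by byte.

```python
def split_by_chars(line: str, widths: list[int]) -> list[str]:
    """Slice a decoded line into fields whose widths are counted in characters."""
    fields, pos = [], 0
    for w in widths:
        fields.append(line[pos:pos + w])
        pos += w
    return fields

def split_by_bytes(raw: bytes, widths: list[int], encoding: str = "utf-8") -> list[str]:
    """Slice the raw bytes into fields whose widths are counted in bytes, then decode."""
    fields, pos = [], 0
    for w in widths:
        fields.append(raw[pos:pos + w].decode(encoding, errors="replace"))
        pos += w
    return fields

widths = [5, 3]              # hypothetical layout: name (5), code (3)
line = "José " + "042"       # 8 characters, written fixed width by character
raw = line.encode("utf-8")   # 9 bytes, because "é" takes 2 bytes in UTF-8

by_chars = split_by_chars(line, widths)   # ['José ', '042']
by_bytes = split_by_bytes(raw, widths)    # ['José', ' 04'], the code field is shifted
```

The byte-based reader silently turns `042` into ` 04` here, and that only happens on rows containing a multi-byte character, so skimming a sample would likely miss it.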
This! The point of automation is rarely saving time. The point of automation is increasing quality.
It can be data quality, it can be mitigating a production risk, it can be avoiding regressions.
Heck, even unit tests are automation (you could just manually test your code once and call it a day).
I am not saying that automation is always good, but the evaluation should be
Then you compute (3)*3 - (1)*3 - (2). Is it positive? You do it. Is it negative? You don't. The more positive it is, the higher the priority of doing it.
Why the *3? The first because the expected cost of automation is always massively underestimated. The second because it takes something going wrong multiple times before the decision is reconsidered 🙂
Why 1 year? Because generally the task to automate changes or disappears.
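A minimal sketch of that rule as code, assuming the three numbered items are plain numbers in the same unit (for example hours over one year). The parameter names `item1`, `item2`, `item3` just mirror (1), (2) and (3), and the sample values are made up to show the sign test.

```python
def automation_score(item1: float, item2: float, item3: float) -> float:
    """Compute (3)*3 - (1)*3 - (2) with all items in the same unit."""
    return item3 * 3 - item1 * 3 - item2

def should_automate(item1: float, item2: float, item3: float) -> bool:
    """Positive score: do it. Zero or negative: don't."""
    return automation_score(item1, item2, item3) > 0

# Made-up numbers, purely to show the arithmetic.
score_a = automation_score(item1=10, item2=5, item3=20)   # 60 - 30 - 5 = 25
score_b = automation_score(item1=10, item2=5, item3=8)    # 24 - 30 - 5 = -11
```

Sorting candidate tasks by this score in descending order gives the priority order described above.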