this post was submitted on 19 Jul 2024
1191 points (99.5% liked)


All our servers and company laptops went down at pretty much the same time. Laptops have been bootlooping to blue screen of death. It's all very exciting, personally, as someone not responsible for fixing it.

Apparently caused by a bad CrowdStrike update.

Edit: now being told we (who almost all generally work from home) need to come into the office Monday as they can only apply the fix in-person. We'll see if that changes over the weekend...

[–] [email protected] 26 points 3 months ago (3 children)

Stop running production services on M$. There is a better backend OS.

[–] [email protected] 31 points 3 months ago (1 children)

The issue was caused by a third-party vendor, though. A similar issue could have happened on other OSes too. There are relatively intrusive endpoint security systems for macOS and Linux too.

[–] [email protected] 21 points 3 months ago (5 children)

That's the annoying thing here. Everyone, particularly Lemmy where everyone runs Linux and FOSS, thinks this is a Microsoft/Windows issue. It's not, it's a CrowdStrike issue.

[–] [email protected] 14 points 3 months ago (2 children)

More than that: it's an IT security and infrastructure admin issue. How was this 3rd party software update allowed to go out to so many systems to break them all at once with no one testing it?

[–] [email protected] 3 points 3 months ago* (last edited 3 months ago)

From what I understand, CrowdStrike doesn't have built-in functionality for that.

One admin was saying that they had to figure out which IPs were the update server vs the rest of the functionality servers, block the update server at the company firewall, and then set up special rules to let the traffic through to batches of their machines.

So... yeah. Lot of work, especially if you're somewhere where the sysadmin and firewall duties are split across teams. Or if you're somewhere that is understaffed and overworked. Spend time putting out fires, or jerry-rigging a custom way to do staggered updates on a piece of software that runs largely as a black box?

Edit: re-read your comment. My bad, I think you meant it was a failure of that on CrowdStrike's end. Yeah, absolutely.
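
For what it's worth, the "batches of machines" part of that workaround can be sketched roughly like this. It's only an illustration of splitting a fleet into stable rollout rings; it doesn't touch any firewall or any CrowdStrike API, and the hostnames, ring count, and function names (`assign_ring`, `plan_rollout`) are all made up for the example.

```python
import hashlib


def assign_ring(hostname: str, ring_count: int) -> int:
    """Map a hostname to a stable ring index in [0, ring_count)."""
    # Hashing keeps the assignment stable between runs, so a machine
    # stays in the same ring every time the plan is regenerated.
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % ring_count


def plan_rollout(hostnames, ring_count=3):
    """Group hosts into rings; ring 0 would be let through first."""
    rings = [[] for _ in range(ring_count)]
    for name in sorted(hostnames):
        rings[assign_ring(name, ring_count)].append(name)
    return rings


# Hypothetical fleet, purely for illustration.
hosts = [f"laptop-{i:03d}" for i in range(12)]

for index, ring in enumerate(plan_rollout(hosts)):
    print(f"ring {index}: {ring}")
```

You'd then open the update traffic for ring 0, wait to see if anything bluescreens, and only then move on to the next ring. Hash-based rings aren't guaranteed to be evenly sized, so in practice you'd probably hand-pick a small canary group for the first one.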

[–] [email protected] 1 points 3 months ago (1 children)

Bingo. I work for a small software company, so I expect shit like this to go out to production every so often and cause trouble for our couple tens of thousands of clients... But I can't fathom how any company with worldwide reach can let it happen...

[–] [email protected] 3 points 3 months ago

That's because CrowdStrike likely has significantly worse leadership compared to your company.

They have a massive business development budget though.

[–] [email protected] 6 points 3 months ago

Everyone, particularly Lemmy where everyone runs Linux and FOSS, knows it is a CrowdStrike issue.

[–] [email protected] 3 points 3 months ago

Many news sources said it's a "Microsoft update", so it's understandable that people are confused.

Also, there was an Azure outage yesterday.

[–] [email protected] 3 points 3 months ago

It's a snake oil issue.

[–] [email protected] -5 points 3 months ago (2 children)

It's an MS process issue. This is a testing failure and a rollout failure.

[–] [email protected] 7 points 3 months ago (1 children)

This had nothing to do with MS, other than their OS being impacted. Not their software that broke, not an update pushed out by their update system. This is an entirely third-party piece of software that installs at the kernel level, deeper than MS could reasonably police, even if it somehow was their responsibility.

This same piece of software was crashing certain Linux distros last month, but it didn't make headlines due to the limited scope.

[–] [email protected] 1 points 3 months ago (1 children)

My bad, I thought this went out with an MS update.

[–] [email protected] 4 points 3 months ago

Microsoft would never push an update on a Friday. They usually push their major patches on Tuesdays, unless there's something that's extremely important and can't wait.

[–] [email protected] 1 points 3 months ago* (last edited 3 months ago)

Windows is infamous for a do-it-yourself install process, so they are likely using their own deployment tools. If anything, criticize them for not helping the update process at all.

[–] [email protected] 23 points 3 months ago

CrowdStrike did the same to Linux servers previously.

[–] [email protected] 6 points 3 months ago (1 children)

There’s a better frontend OS

Doesn’t mean people want to go away from what they know

[–] [email protected] 3 points 3 months ago (1 children)

There's a shit ton more reasons than that, but in short: I highly doubt anyone suggesting a company just up and leave the MS ecosystem has spent any considerable amount of time in a sysadmin position.

[–] [email protected] 3 points 3 months ago

You’ll find XP in use because they don’t want to pay for a new system

And Linux/BSD is way more expensive because not as many people are familiar with it