this post was submitted on 21 Jul 2024
340 points (98.0% liked)

top 31 comments
[–] [email protected] 201 points 4 months ago (6 children)

No validation, in the driver or the updater software.

No validation or automated testing on publish.

No staged rollouts.

Just utterly irresponsible all around.
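
To make the "staged rollouts" point concrete, here is a minimal sketch of a rollout that updates a small canary slice first and halts as soon as a stage looks unhealthy. Everything in it is an assumption for illustration: `staged_rollout`, the `deploy` and `healthy` callbacks, and the stage percentages are hypothetical names, not anything CrowdStrike is known to use.

```python
from typing import Callable, List, Sequence, Tuple


def staged_rollout(
    hosts: Sequence[str],
    stage_percents: Sequence[int],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> Tuple[List[str], bool]:
    """Deploy in growing stages; stop at the first stage with an unhealthy host.

    Returns (hosts that received the update, whether every stage passed).
    """
    done = 0  # number of hosts updated so far
    for percent in stage_percents:
        # Cumulative target for this stage, rounded up so small fleets still get a canary.
        target = min(len(hosts), -(-len(hosts) * percent // 100))
        batch = hosts[done:target]
        for host in batch:
            deploy(host)  # hypothetical callback: push the update to one host
        done = max(done, target)
        # Hypothetical health probe, e.g. "did the machine check in after the update?"
        if not all(healthy(host) for host in batch):
            return list(hosts[:done]), False  # halt: later stages never get the update
    return list(hosts[:done]), True
```

With stages of 1%, 10% and 100% over 100 hosts, an update that breaks every machine it touches would stop after the first host instead of reaching the whole fleet.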

[–] [email protected] 81 points 4 months ago* (last edited 4 months ago) (2 children)

When I worked there six years ago, the company motto was "two feet on the gas pedal" because the CEO was a race car driver. I bailed after 10 months, giving up pre-IPO shares. The management for my team was nonexistent, and I was on the build and release team. People were doing releases manually. They've improved the automation some from what I hear, but it looks like the motto finally hit them.

I should also say their metrics were absolutely staggering. The log aggregator was doing something like 2 trillion requests a week. All backed by splunk. I never heard what they were paying, but it must have been fucking nuts.

[–] [email protected] 9 points 4 months ago* (last edited 4 months ago) (1 children)

Race car drivers definitely don't put both feet on the gas pedal though... Like, what?

[–] [email protected] 1 points 4 months ago* (last edited 4 months ago)

I would've preferred Colin McRae's classic in the same spirit: "when in doubt, flat out"

[–] [email protected] 1 points 4 months ago* (last edited 4 months ago)

The unfortunate thing is that, in the long run, that strategy will probably be super effective. Unless Europe (with the only internet regulations that actually have teeth) does something harsh enough, they will probably pay a few small fines over this at most. Cost of doing business and probably baked in already.

[–] [email protected] 44 points 4 months ago (2 children)

A coworker of mine has worked with CrowdStrike in the past; I haven't. He said that the releases he was familiar with from them in the past were all staged into groups and customers were encouraged to test internally before applying them; not sure if this is a different product or what, but it seems like a big step backwards if what he's saying is right.

[–] [email protected] 51 points 4 months ago* (last edited 4 months ago) (2 children)

I first dealt with them 10+ years ago, and at the time they had no ability to do staged rollouts or targeted rollouts. We got updates when they said we did, no choice or control. We had to resort to updating our firewall to restrict the download endpoint and only open it in groups to do a phased update.
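
A rough sketch of how the "open it in groups" part could be organized: assign every host to a stable group by hashing its name, then allow the download endpoint only for hosts whose group is currently open. The function names, the group count, and the idea of feeding the result into a firewall allowlist are assumptions for illustration; the actual firewall configuration is not described in this thread.

```python
import hashlib
from typing import Iterable, List, Set


def rollout_group(hostname: str, n_groups: int) -> int:
    """Map a hostname to a stable group index in the range 0..n_groups-1."""
    digest = hashlib.sha256(hostname.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_groups


def hosts_allowed(hostnames: Iterable[str], open_groups: Set[int], n_groups: int) -> List[str]:
    """Return the hosts that may currently reach the update endpoint."""
    return [h for h in hostnames if rollout_group(h, n_groups) in open_groups]
```

Because the grouping depends only on the hostname, a machine stays in the same phase across updates, and opening one more group at a time widens the rollout without reshuffling anyone.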

[–] [email protected] 12 points 4 months ago

Interesting! Sounds like they may have changed things a few times, or maybe my co-worker's memory has some gaps.

[–] [email protected] 2 points 4 months ago
[–] [email protected] 9 points 4 months ago (1 children)

Channel files are different from sensor updates: you have no version control over channel files, but you do have control over sensor releases.

[–] [email protected] 1 points 4 months ago

Ah interesting, thanks!

[–] [email protected] 37 points 4 months ago

The idea of "security software" is ridiculous overall. You buy software to fix security problems in Windows, and it violates the original product by inserting code into the kernel. You lose support from the original product vendor. And you think you're secure, even though the whole thing makes you forget that IT should always be able to solve security/restorability problems even when everything else fails.

[–] [email protected] 18 points 4 months ago

And on a Friday to make things worse

[–] [email protected] 2 points 4 months ago

No staged rollouts.

I read somewhere that CS does allow for staged rollouts but some updates deliberately ignore them.

[–] [email protected] 2 points 4 months ago
[–] [email protected] 85 points 4 months ago (3 children)

As if the borked update wasn't bad enough, it was also forced on users that explicitly said not to install it.

CrowdStrike’s channel file updates were pushed to computers regardless of any settings meant to prevent such automatic updates

[–] [email protected] 20 points 4 months ago (5 children)

From my reading this is misleading at best and likely wrong. I don’t work with CrowdStrike Falcon but have installed and maintained very similar EDR tools in enterprise environments and the channel updates referenced are the modern version of definition updates for a classic AV engine. Being up to date is the entire point and so typically there are only global options to either grab those updates from the vendor or host them internally on a central server but you wouldn’t want to slow roll or stage those updates since that fundamentally reduces the protection from zero days and novel attacks that the product is specifically there to detect and stop. These are not engine updates in that they don’t change the code that is running, they give the code new information about what an attack will look like to allow it to detect malicious activity as soon as CrowdStrike knows what the IoCs look like.

In this case it appears that one of these updates pointed to a bad memory location which caused the engine to crash the OS, but it wasn’t a code update that did it (like a software patch). That should have been caught in QA checks prior to the channel update being pushed out, but it’s in CrowdStrike's interest to push these updates to all of their customers' PCs as quickly as they can to allow detection of novel attacks.
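
As an illustration of the kind of check that guards against data pointing at a bad location, here is a minimal sketch that validates every reference in an update before anything consumes it. The format is entirely made up: `Entry`, `validate_entries`, and `load_update` are hypothetical, and the real channel file layout is not described in this thread.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Entry:
    """Hypothetical record: a slice of the update's data blob."""
    offset: int
    length: int


def validate_entries(entries: Sequence[Entry], blob_size: int) -> List[str]:
    """Return a list of problems; an empty list means every entry is in bounds."""
    problems = []
    for i, entry in enumerate(entries):
        if entry.offset < 0 or entry.length < 0:
            problems.append(f"entry {i}: negative offset or length")
        elif entry.offset + entry.length > blob_size:
            problems.append(f"entry {i}: range ends past blob size {blob_size}")
    return problems


def load_update(entries: Sequence[Entry], blob: bytes) -> List[bytes]:
    """Reject the whole update if any entry is out of bounds, else slice it."""
    problems = validate_entries(entries, len(blob))
    if problems:
        raise ValueError("rejecting update: " + "; ".join(problems))
    return [blob[e.offset:e.offset + e.length] for e in entries]
```

The design choice is to fail closed: a malformed update is refused as a unit, and the engine keeps running on the previous data instead of dereferencing something invalid.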

[–] [email protected] 28 points 4 months ago (1 children)

That should have been caught in QA checks prior to the channel update being pushed out...

I work in QA, and part of the job is justifying why it's necessary to keep a team of people that doesn't actually "produce" anything. Either their QA team is now in the hotseat, or CrowdStrike is now realizing why they need one.

Either way, it sounds like a basic smoke test would have uncovered the issue, and the fact that nobody found this means nobody bothered to do one of the most basic tests: turn it on and see if it "catches fire."

[–] [email protected] 14 points 4 months ago

God, even if they didn't have QA test it, they should have had continuous integration running to test all new channel updates against all versions of their program, considering the update will affect all of them. What an epic process failure.
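
A minimal sketch of that idea, assuming a hypothetical `smoke_test(update, version)` callable that loads one update into one program version and reports whether it survived. The update and version names in the usage note are invented for illustration.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def run_matrix(
    updates: Sequence[str],
    versions: Sequence[str],
    smoke_test: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Run every (update, version) pair and collect the pairs that fail."""
    failures = []
    for update, version in product(updates, versions):
        try:
            ok = smoke_test(update, version)
        except Exception:
            ok = False  # a crash in the harness counts as a failed smoke test
        if not ok:
            failures.append((update, version))
    return failures


def may_publish(
    updates: Sequence[str],
    versions: Sequence[str],
    smoke_test: Callable[[str, str], bool],
) -> bool:
    """Gate publishing on a fully green matrix."""
    return not run_matrix(updates, versions, smoke_test)
```

For example, `may_publish(["update-a"], ["7.10", "7.11"], smoke_test)` would only return `True` if the update loads cleanly on both versions.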

[–] [email protected] 18 points 4 months ago

Being up to date is the entire point and so typically there are only global options to either grab those updates from the vendor or host them internally on a central server but you wouldn’t want to slow roll or stage those updates since that fundamentally reduces the protection from zero days and novel attacks that the product is specifically there to detect and stop.

That's not your, or CrowdStrike's, decision to make. If organizations have applied settings to not install updates automatically then that's what they expect to happen and you need to honour it. You don't "know best". They do.

[–] [email protected] 12 points 4 months ago

Being up to date is the entire point

No, it isn’t. The point is to keep systems safe and operational. Blindly rolling out untested updates is not a good strategy for that. I have seen entire systems shut down due to false alerts from updated antivirus software. Luckily only test environments, before these updates were rolled out to production. It does not take much to test updates like this before rolling them out to your entire organisation.

[–] [email protected] 10 points 4 months ago

Our organization is configured to install N-1 of current release specifically to avoid this type of stuff. Does it matter? No, we got hit just like everyone else.
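
For anyone unfamiliar with the term, an N-1 pin just means "run the release one behind the newest". A small sketch of that selection rule, with `parse_version` and `pinned_version` as hypothetical helpers and made-up version strings. As noted above, a pin like this did not stop the channel file update.

```python
from typing import Sequence, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '7.10' into a comparable tuple (7, 10)."""
    return tuple(int(part) for part in version.split("."))


def pinned_version(releases: Sequence[str], lag: int = 1) -> str:
    """Pick the release that is `lag` steps behind the newest one (N-1 by default)."""
    ordered = sorted(set(releases), key=parse_version, reverse=True)
    if lag < 0 or lag >= len(ordered):
        raise ValueError(f"no release is {lag} step(s) behind the newest")
    return ordered[lag]
```

Comparing numeric tuples instead of raw strings matters here: as plain text, "7.9" would sort after "7.10".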

[–] [email protected] 4 points 4 months ago* (last edited 4 months ago)

I'm getting real sick of companies acting like rapists and society just accepting it, if not justifying it for them.

No means no. Plain and simple.

[–] [email protected] 13 points 4 months ago

The distinction between that and a malicious hack consists entirely of intent.

[–] [email protected] 8 points 4 months ago (1 children)

Well that's just terrorism then

[–] [email protected] 18 points 4 months ago (1 children)

Terrorism would require a political angle.

This is malicious incompetence.

[–] [email protected] 3 points 4 months ago

One can argue that there is a very niche political angle to this - teaching Windows users the fear of God, so that they'd see the error of their ways. But it works in our favor, so let's not concentrate attention on it.

[–] [email protected] 27 points 4 months ago (1 children)
[–] [email protected] 6 points 4 months ago

For reals. Their self-reporting is just trying to mitigate damages from the mistake.

[–] [email protected] 12 points 4 months ago

This is the best summary I could come up with:


CrowdStrike’s faulty update caused a worldwide tech disaster that affected 8.5 million Windows devices on Friday, according to Microsoft.

Microsoft says that’s “less than one percent of all Windows machines,” but it was enough to create problems for retailers, banks, airlines, and many other industries, as well as everyone who relies on them.

Separately, the technical breakdown from CrowdStrike released Friday explains more about what happened and why so many systems were affected all at once.

CrowdStrike’s breakdown explains the configuration file that was at the heart of the issue:

CrowdStrike explained that the file is not a kernel driver but is responsible for “how Falcon evaluates named pipe execution on Windows systems.” Security researcher and Objective-See founder Patrick Wardle says that the explanation aligns with the earlier analysis he and others provided about the cause of the crash, as the problem file “C-00000291- “triggered a logic error that resulted in an OS crash” (via CSAgent.sys).”

CrowdStrike’s channel file updates were pushed to computers regardless of any settings meant to prevent such automatic updates, Wardle noted.


The original article contains 193 words, the summary contains 175 words. Saved 9%. I'm a bot and I'm open source!

[–] [email protected] 0 points 4 months ago

How many Windows updates have bricked PCs over the years?