CrowdStrike Update that Caused Global Outage Likely Skipped Checks
July 20th, 2024
Wow, you don’t say…
Maybe normal procedures weren’t followed that day because management relocated the quality assurance department to a sloped roof.
Via: Reuters:
Security experts said CrowdStrike’s routine update of its widely used cybersecurity software, which caused clients’ computer systems to crash globally on Friday, apparently did not undergo adequate quality checks before it was deployed.
…
It’s unclear how that faulty code got into the update and why it wasn’t detected before being released to customers.
“Ideally, this would have been rolled out to a limited pool first,” said John Hammond, principal security researcher at Huntress Labs. “That is a safer approach to avoid a big mess like this.”
Killin’ me. Yeah, test run? Attack?
I don’t know. Very weird that it wasn’t done as a phased rollout, with a relatively small number of machines updated at a time.
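For anyone unfamiliar with the term, here is a rough sketch of what a phased rollout looks like in principle. It is an illustration only, not a description of how CrowdStrike or any other vendor actually ships updates; `phased_rollout`, `deploy`, `is_healthy`, and the stage percentages are all hypothetical names and values chosen for the example.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def phased_rollout(
    hosts: Iterable,
    deploy: Callable[[object], None],
    is_healthy: Callable[[object], bool],
    stage_percents: Sequence[int] = (1, 10, 50, 100),  # assumed stages
    max_failure_rate: float = 0.0,                      # assumed threshold
) -> Tuple[List, bool]:
    """Deploy to a growing share of hosts, halting if a stage looks bad.

    Returns (hosts_that_received_the_update, halted).
    """
    hosts = list(hosts)
    updated: List = []
    done = 0  # number of hosts already updated

    for pct in stage_percents:
        # Cumulative number of hosts that should be updated after this stage.
        target = max(1, len(hosts) * pct // 100)
        batch = hosts[done:target]
        if not batch:
            continue  # stage adds no new hosts (small fleet or empty list)

        for host in batch:
            deploy(host)
        updated.extend(batch)
        done = target

        # Check the hosts just updated before widening the rollout.
        failures = sum(1 for host in batch if not is_healthy(host))
        if failures / len(batch) > max_failure_rate:
            return updated, True  # halt: the rest of the fleet is untouched

    return updated, False
```

With a fleet of 1,000 machines and an update that crashes every host, this sketch stops after the first stage of 10 machines instead of taking down all 1,000. The trade-off is time: a fix for an actively exploited flaw would reach the last machines later than an all-at-once push.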
I too found the lack of a phased rollout odd. But security updates often bypass the usual rollout rules, and security updates are CrowdStrike’s business. I therefore conclude it is likely that an exceptionally severe zero-day vulnerability came to light, perhaps even one being actively exploited as part of a co-ordinated attack. In that scenario, it would make perfect sense to skip the phased rollout. Downtime is not the worst thing that can happen to a system. That wouldn’t entirely excuse the failure, but it would go a long, long way toward explaining it.
Assuming a severe vulnerability, it is perfectly natural that we would hear nothing of its existence until even the worst-managed fleets are fully patched (which could take months), especially as the patches deployed Friday would give other bad actors heavy hints about the vulnerability.
It seems likely that what we saw Friday was just fallout from a (presumably successful) effort to halt an even worse scenario.