This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
Ok this might just be funny to me, but the Crowdstrike worldwide outage is the funniest thing to happen in computer security this decade.
If you haven't caught up, 100+ million (billion?) computers around the world were simultaneously broken in an instant. It's black comedy for sure. Hospital & emergency systems around the world have crawled to a halt, and there will be a few hundred deaths that will be traced back to this event. Millions of $$ will be lost. But the humor comes from the cause of it.
Here is how things panned out:
- Crowdstrike is a 100 billion valuation tech company that provides security services to the bulk of the world's businesses.
- Crowdstrike deployed a software update that began this outage.
- Crowdstrike is a 'trusted' security tool: it sits under the OS layer, so a failure bricks the whole device.
This is the Y2k that was promised.
The world spends billions in computer security every year, and no virus has managed the kind of world-wide disruption caused by one simple bug by the premier security company in the world.
No direct culture war implications, but goes to show just how much of a house-of-cards the tech ecosystem is. 1 little, simple, stupid bug can bring the whole world to a halt. Yet, the industry continues quarterly-earnings chasing.
Jobs keep getting cut, senior members get aged out, timelines get thinner and 'how many features did you deploy' remains the only metric for evaluation.
In tech, staying at a job for more than 3 years is seen as coasting. Devs are increasingly expected to do everything, because 'everyone should be full stack', and everything that isn't feature development (testing, staging, canaries) gets deprioritized. Overworked novices mean carelessness, and carelessness creates mistakes.
At the same time, devs get zero agency. Random HR types make lists of regulations mandating certain checkboxes for compliance, while having near-zero knowledge of the risks and benefits of these technical decisions. Therefore, the implications of a mistake are opaque to decision makers. So by being compliant, you've suddenly given Crowdstrike a button to shut your entire business down.
This kind of error should literally be impossible in a company of the size of Crowdstrike. If such an error happens, it should be impossible for giant corporations to crumble with zero backup. Incompetence on display, on all sides. Having worked in 'prestigious tech companies', especially in 2024, it isn't surprising. At times, the internal dysfunction is seriously alarming, other times it's a Tuesday.
I'm not going to hope for much out of this. Just like Spectre & Solar, people will cry about it for weeks, demand change and everyone will get collective amnesia about it as the next quarter rolls around.
End of the day, tech workers are treated as disposable labor. Executive bean counters are divorced from the product. And the stock price is the only incentive that matters.
As long as tech is run by MBAs and smooth talkers, this will go on.
Some choice photos:
The competency crisis rages on. Boeing's planes fall out of the sky. The Secret Service forgets to check the nearby roof. Anti-virus software bricks your computer. These sorts of incidents have always happened, but it's hard to deny that they have gotten more frequent.
Boeing used to be better. I believe the Secret Service was as well. But anti-virus software, and the companies which make it, have always sucked.
Ehh I'm going to press X to doubt on the Secret service.
John F. Kennedy got killed (I'll admit Trump would have been killed by Lee Harvey Oswald too).
Gerald Ford had 2 assassination attempts on him, both of which he got lucky and survived, but both were even crazier than Trump's.
And just looking through Wikipedia, the list is so long and full of examples that it raises the question of whether Trump was even remotely unusual.
Boeing I'll grant you though, I think a part of it is that every corporation has its ups and downs and we have 1 down for Boeing right now, but remember the ford pinto? Boeing's issues are nowhere near as bad.
I could be wrong, but the number of fatal Boeing crashes or lesser incidents is not an outlier compared to past incidents and other manufacturers before all the media scrutiny. Anyone remember the 737 rudder jams during the 90s? https://en.wikipedia.org/wiki/Boeing_737_rudder_issues#:~:text=During%20the%201990s%2C%20a%20series,board%2C%20157%20people%20in%20total.
It was a different model and hardly got similar media attention despite two major accidents with lots of fatalities close together
The Boeing issue was somewhat unique in that it was arguably the result of a vulnerability that had been purposefully introduced.
A conscious choice was made to change the emergency autopilot disconnect from a physical switch to a software one, and also to exempt certain autopilot functions from said disconnect switch, thus invalidating the existing pilot checklist procedure for bad air data.
I'm always skeptical but never dismissive of such common sense. It could be recency bias and the availability heuristic at work.
I am starting to think there's the opposite of that kind of bias at play. 'Instinct distrust bias'?
I don't know what to call it, but it certainly feels like a lot of people turn very 'skeptical' when an aspect of their supported or preferred worldview is poked at in some way. The most obvious example of this would be mass immigration and the rise of housing prices. Implying a causal connection simply isn't a part of the program. Yet instinct would tell us it's the most obvious and important part of the entire problem in most if not all western countries.
Ditto depressed wages, rise of "the gig economy", etc...
Pick me! I’ll deny it!
I have zero reason to believe capability has gotten worse by any reasonable metric. Maybe—just maybe—that’s propped up by technology even as competency has tanked? But if so, I think there should be better evidence than black swans.
Compare complaints about the land boats of old. Why can’t we buy sweet Caddys anymore? I dunno, because they were death traps in an accident.
I’m still trying to find that Onion skit about accidentally invading the wrong Middle Eastern country.
Well, I think it has more to do with fuel efficiency standards. They were also death traps, or not as perfectly safe as possible, but rounding off all the edges for aerodynamic efficiency gives all cars a sameness that's striking when compared to older designs.
You can build a land boat that's as safe as you like, but it's not going to meet fuel efficiency standards unless it's classified as a truck somehow. This also relates to the rise of SUVs: they're not-sedans, and so they don't have the same standards.
I remember a video about the old standard of round headlights. Super convenient for everything except aerodynamics. There was an awkward transition where companies tried to put the aero shell around their legally-mandated headlights, but that was unnecessary after the regulation got removed. Wish I could find it again.
Was it this?
https://youtube.com/watch?v=c2J91UG6Fn8?feature=shared
I agree that our competence probably hasn’t declined that much. But our systems are much more integrated, with a lot more single points of failure. I doubt that bad updates were ever that unusual. But it wasn’t quite the same in 1990, when there were dozens of different OS and antivirus software combinations and so on. One company doing one update would have only affected the few companies that had the wrong combination of systems that got a bad update. Now the combination of CrowdStrike and Windows is common enough that one bad update takes out thousands of computers in thousands of companies.
I mean, one way of looking at it is that the affected computers are now very well protected from viruses.
Considering how millions of computers are gonna have to be booted into safe mode and have OS/antivirus files tampered with, just wait until malicious actors start "helpfully" supplying USB thumb drive images that promise to deploy the fix automatically (alongside rootkits). Rootkits that might silently disable or bypass CrowdStrike entirely.
This is happening, and CrowdStrike already has multiple pages warning about various efforts.
From the second link:
No direct culture war implications, at least not directly left/right. However, this was easily predictable by readers of Michael Crichton or Ayn Rand, both names in the “up/down” culture war (to coin a phrase).
Crichton’s most famous work, Jurassic Park, was largely about chaos theory. When working with a complex system, that is to say one driven by logic and rules, an outlier can bring down a house of cards through emergent effects. John Hammond not paying for a team of programmers led to dinos eating people. Today’s a mundane version of that.
Rand had a lot to say about innovative producers versus free riders, and apropos to today, about smart people who can create or repair machines versus everyday people who can just use their interfaces until something goes wrong. When it does, the cynical cry of, “Who is John Galt?” escapes their lips.
The American IT industry was hit hard by COVID. Businessmen, C-suite execs, saw their people remoting in from home and trying not to return to the office. These execs, many of them free riders, realized they could halve their costs by hiring remote MSPs from out of country for IT and relying on Crowdstrike to be their security bottom line. A flood of IT layoffs happened this past year, deflating IT wages and making entry level jobs scarce.
Then today, only people with the admin password or a modicum of critical thought could restore the most well-protected systems. Today, companies across the globe learned who their John Galts were, their Eddie Willers, their Dagny Taggarts.
Although, as to the left/right culture war, imagine if this or worse had happened on Election Day and all the votes had to be hand-counted.
I do think that it is slightly more complicated than that. First of all, the layoffs of 80% of Twitter showed everyone that you don't need that many people to run a website. Multiple people predicted that if Twitter didn't stop working, other big tech companies would follow. Then there is the whole deal with Section 174, which has also affected the bottom line. Tech isn't unaffected by higher interest rates: when money was cheap they could amass people to be ready for "initiatives". Well, not anymore.
I can give you the point of the free riders. The worst thing about them is that they actively make our tech worse to make some number go up on their OKRs. Google is making search worse so people stay longer trying to find what they came to Google for and watch more ads. Windows search always hits Bing when you do a search locally on your computer, just so that it increments a number and a free rider can get a bonus. Just to take examples from search.
Your Section 174 link was fascinating. I feel that it underplayed the back story. It was sketched very briefly, but appears to go like this:
There are fiscal responsibility rules. If the US government passes a tax cut, the law should also include a tax increase in the future to balance the budget over the longer term. Legislators game this by writing a future tax increase that is stupid. Yes, it is in the law, but there is a nudge and wink that it will be repealed before it takes effect. This time the repeal never happened, so the deliberately stupid tax increase goes into effect.
This compounding dysfunction bodes ill for the future of the US.
Crowdstrike was the company who the DNC had analyze their network and blame Russia for Guccifer 2.0. I can write up a conspiracy theory about this being a result of the deep state panicking over the failed Trump assassination and forcing a patch to create a backdoor to cause a future major outage to maintain control pretty easily.
I don't think this is too apocalyptic, probably most computers will be fixed by Monday.
But you bet your ass that everyone lost a lot of money today and that it may take weeks (or months) for some businesses to get back to the black.
Does anyone disagree with me that the amount of value destroyed by this failed patch outweighs all of the economic value CrowdStrike has ever provided? Imagine working at a company that would have been better off never existing.
I disagree. Crowdstrike Falcon Sensor is meant to keep ransomware from happening, especially to (or through) the Internet of Things. Without it, at least some of the dozens of hospital systems which went down today would have already been hit by sophisticated unscrupulous organized criminals.
I feel sorriest for MGM, who got BSOD’d by Crowdstrike after getting ransomwared last year.
The market's reaction was surprisingly sanguine to this. CRWD stock opened 11% lower and stayed that way; almost everyone thought it would be down 30% or more. The Nasdaq was green for the first 2 hours and then went red, which could have been due to anything.
The economy is huge. Even when critical things fail, there is enough stuff that works, plus rapid response to fix the problem, that the damage is not as bad as the hype would suggest. Ironically, a bigger problem entails a more rapid response to fix it, so it ends up being briefer or not as bad.
The cope is that this incident just shows how important CrowdStrike is.
Kinda like Boeing. They can have plane crashes, faulty parts, kill whistleblowers, etc... But we still have to buy Boeing planes – because we don't have a choice!
I'm less sanguine about Crowdstrike. Elon said he is ripping them out of all his companies. While the typical CEO drone probably won't do the same, Crowdstrike won't live this down. Maybe ever.
I predict a slow bleed out in their stock, although there's a good chance that internet morons bid it up higher over the next few weeks.
I think Elon did the right thing.
I have never heard about Crowdstrike. No computer I work with had it installed.
I totally understand that an average user is clueless and we need to protect him from his own actions. And yet, if this is such a necessity, why wouldn't Microsoft implement it directly in the OS?
Crowdstrike might be bleeding edge. The need for bleeding edge is always overvalued.
It reminds me of the times when everybody was trying to install antivirus software. Instead I always removed it because it only consumed resources and provided very little benefit. The best protection was to limit what the user can do: do not install unauthorized software, don't even browse the internet for fun, just use your work-assigned software and web sites.
I think those who relied on third party antivirus software had worse outcomes because their users were more relaxed and less disciplined. At the same time those antivirus software makers got rich.
Probably the same has happened with Crowdstrike. Gradually Microsoft will implement something similar for no extra cost, and everybody will realize that Crowdstrike is pointless. Until new challenges come along and a new opportunistic company, playing on people's fears, convinces them to buy another scammy service.
They did. The only thing missing from Windows integrated security is that it lacks the options to spy on users (breaking multiple privacy laws) and doesn't make it as easy to disrupt productive work by locking down the computer way too much. It also doesn't slow everything to a crawl. Naturally corporate IT managers can't stand that.
Is the implication that the market would properly "punish" them for destroying more value than they've ever created? It could just as easily reward them for extracting rents for "malware defense" while making all of its clients worse off.
The price of any asset is the net present value of all expected future cash flows.
It's not about the stock market punishing a company. It's about the stock market trying to correctly evaluate how much other parties might try to punish the company. If we look at Boeing, we know that increased regulatory scrutiny is very unlikely to increase cash flows, and spectacular reputational damage is unlikely to increase future business. And so cash forecasts are updated accordingly.
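Written out, that pricing claim is the standard discounted cash flow identity. The symbols are my own notation, not anything from the comment: $P_0$ is today's price, $CF_t$ the cash flow in period $t$, and $r$ the discount rate.

```latex
P_0 = \sum_{t=1}^{\infty} \frac{\mathbb{E}[CF_t]}{(1+r)^{t}}
```

On this view, regulatory scrutiny or reputational damage shows up as lower expected $CF_t$ (or a higher $r$), which lowers $P_0$ without anyone in the market needing to "punish" the company.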
sure, but my claim was CrowdStrike has probably caused more economic loss from this one patch than they have ever provided, which is somewhat orthogonal to a statement about their stock price
the fact that their stock price is not zero only indicates that the world's ability to hold them liable for these losses is minimized
(or that they can be held liable and that my estimate of the damage caused is way, way off)
I think it depends a lot on what your next alternative is. The morbid possibility is that CrowdStrike could be incompetent and also beat their customers rawdogging the internet. Even if this incident cost 1b USD, that's something like fifty major ransomware strikes. CrowdStrike could conceivably have blocked that many this year.
Of course, CrowdStrike isn't the only alternative. Businesses can use a variety of other protections and/or make themselves more robust to successful attacks. Whether they're more reliable or not is a !!fun!! question, but underneath that, there's a funner one: could businesses have made it? Contra a lot of reporting, I don't know that every regulated company has to use CrowdStrike specifically, but I do know that for even low levels of regulated industry it's a very common requirement that's accepted as a box checked, where alternatives that I could find required additional support not all IT teams would be able to provide.
There’s a reasonable case to be made that CrowdStrike isn’t a "real" company anyway: it’s a DeepState actor, worming its way into systems by enabling managers to check a box that satisfies regulatory compliance while giving wholesale control of their system to this opaque third-party.
I work in the industry and while I can confirm that regulatory compliance related to cybersecurity is theatrical bullshit, your assessment of CrowdStrike is completely wrong and nonsensical. It's certainly not the case for every vendor in the industry, but CrowdStrike's products and services do significantly reduce the risk of certain types of cybersecurity threats companies face.
This is effectively the argument that Lucas Critiqued.
In fact, it's almost exactly analogous to the Fort Knox example given in the article.
Why do you believe that CrowdStrike provides value?
Maybe it does, but where is the proof? Half of the world didn't use CrowdStrike, and how did they fare?
I would even say, let's do an RCT to prove that CrowdStrike improves outcomes. It is a perfect case where it could be done.
Maybe nobody wants to do such a test because they are afraid that it will show that CrowdStrike provides no value.
Remember masks during covid. The evidence is that they provided either minimal value or no value at all. And yet the government mandated their use in many countries. Sometimes people do stupid things on large scale.
I'm not saying CS provides no security, but it's hard to believe it provided as much global security as the damage it caused and that a competitor wouldn't have been better.
Sure. And asking how much security they provided requires addressing the counterfactual
So I keep trying to find any information on the technical aspects of this failure. As in, why is it bricking systems. I get that it's a driver that runs under the operating system, and it's failing to load. But why? I've only seen random reports that Crowdstrike literally pushed a corrupted file onto millions of systems, which is rather remarkable if true. If it was actually a bug, I'm deeply curious to hear what the bug was and how it slipped through.
To get really wild and speculative, lately it's been getting reported that Intel 13th and 14th gen i9 CPUs might be defective at incredibly high rates, upwards of 50%. These defects manifest in a whole host of ways like BSOD, software crashes, and memory errors. I wonder if it's possible a defective Intel CPU borked the executable of an otherwise rigorously tested release. Like I said though, pure speculation. The nature of the Intel failures is still being investigated anyway.
This is not confirmed information, but I am hearing it on various technical grapevines and it seems plausible:
The primary bug is not new - the kernel-level driver that Crowdstrike runs (and has been running) has a dormant bug in the portion of it that parses config/data files. This update was "just" a config/data file, so deemed low-risk and put through fewer/simpler rounds of testing than a "real" update to their actual software. Whether it was a weird corner case or a malformed file, the kernel driver tripped over it and triggered the dormant bug. Since it's a kernel-level driver, crashing can affect the OS - and it did, generating an exception on a bad memory access (perfectly routine type of bug, but with privileges!) so the OS crashed.
Lol that is amazing. Sounds like the most plausible explanation, but maybe even worse because it seems like that should have been caught in a dev or staging environment
Forget about dev or staging, there's no excuse for not fuzz testing your config parser in current year plus nine.
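As a rough illustration of what fuzzing a config parser means, here is a minimal sketch in Python. The file format, `parse_config`, and `ConfigError` are all made up for the example; nothing here reflects CrowdStrike's actual format or tooling. The idea is just to throw random and degenerate inputs (including an all-zero file) at the parser and insist that it either returns a result or raises a clean, expected error.

```python
import random
import struct
from typing import List

MAGIC = b"CFG1"  # hypothetical 4-byte signature for this toy format


class ConfigError(Exception):
    """Raised for any input the parser refuses to accept."""


def parse_config(data: bytes) -> List[int]:
    """Parse a toy format: MAGIC, uint32 count, then `count` uint32 entries."""
    if len(data) < 8:
        raise ConfigError("file too short for header")
    if data[:4] != MAGIC:
        raise ConfigError("bad magic")
    (count,) = struct.unpack_from("<I", data, 4)
    expected = 8 + 4 * count
    if len(data) != expected:
        raise ConfigError("length does not match entry count")
    # Lengths are validated above, so every slice here is exactly 4 bytes.
    return [int.from_bytes(data[i:i + 4], "little") for i in range(8, expected, 4)]


def fuzz(iterations: int = 2000, seed: int = 0) -> int:
    """Feed random inputs to the parser; return how many were rejected cleanly."""
    rng = random.Random(seed)
    # Degenerate cases first: an empty file and files of all zeros.
    cases = [b"", bytes(8), bytes(4096)]
    for _ in range(iterations):
        body = bytes(rng.getrandbits(8) for _ in range(rng.randrange(0, 64)))
        # Half the time, prepend a valid magic so the deeper checks get exercised.
        cases.append(MAGIC + body if rng.random() < 0.5 else body)
    rejected = 0
    for case in cases:
        try:
            parse_config(case)
        except ConfigError:
            rejected += 1  # expected outcome: a clean refusal
        # Any other exception propagates and fails the fuzz run.
    return rejected
```

Calling `fuzz()` should finish without raising; any exception other than `ConfigError` is exactly the kind of finding a fuzz run exists to surface. Real coverage-guided fuzzers go much further than this random loop, but even this much exercises the all-zeros case.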
I'm not in a position to confirm, but that seems quite plausible and dovetails with some of what I've heard.
For some reason I find myself thinking of this old XKCD 😉
Ok so there's an update on what happened.
The exact crash is caused by dereferencing a null pointer. The offending assembly is readable by anyone, and it is as follows: mov r9d, dword ptr [r8]. The key is that the value of r8 is 0000 0000 0000 009c. The 9c is an offset of some sort set earlier, so it's effectively dereferencing a null pointer plus that offset. The pointer is NULL because the file C-00000291.sys was published as all 0s, causing the base value to get loaded as all 0s.
So the offending assembly probably looks like
read r8 C-00000291.sys (some offset)
add r8 9c
mov r9d, dword ptr [r8]
causing the bug.
From this, it kind of sounds like rather than having an on-disk data representation that would be parsed and converted to an in-memory data structure, they just loaded the file and accessed the raw bytes as a data structure with internal pointers. Which is... an approach, I guess.
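To make the contrast concrete, here is a small Python sketch of the two approaches, under heavy assumptions: the record layout and the names `address_from_raw` and `parse_record` are hypothetical, and only the 0x9c offset is taken from the comment above. Treating a zero-filled file as an in-memory structure yields a "pointer" of 0, and adding the field offset gives address 0x9c, which is not a valid address. A parsing step gets the chance to reject the record before anything is dereferenced.

```python
import struct

FIELD_OFFSET = 0x9C  # the offset mentioned in the comment above


def address_from_raw(data: bytes) -> int:
    """Overlay approach: trust the first 8 bytes of the file as a base pointer."""
    (base,) = struct.unpack_from("<Q", data, 0)
    return base + FIELD_OFFSET  # the address that would then be read from


def parse_record(data: bytes) -> int:
    """Parsing approach: validate the base before computing any address."""
    if len(data) < 8:
        raise ValueError("record too short")
    (base,) = struct.unpack_from("<Q", data, 0)
    if base == 0:
        raise ValueError("null base pointer in record")
    return base + FIELD_OFFSET


zeroed = bytes(64)  # stand-in for a file published as all zeros
print(hex(address_from_raw(zeroed)))  # 0x9c
```

Python will not crash on this, of course; the point is only that the overlay version happily computes 0x9c while `parse_record(zeroed)` raises `ValueError`.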
It's an executable; that's how executables work.
Eh, not really? Executable files have structure in them other than raw code and still have to be parsed by a loader. A file that's all zeros should fail to load. (Yes, I know DOS had .com files which were just code blobs loaded at a fixed address and immediately executed and I'm sure there are even more ancient examples of that sort of thing, but surely Windows kernel modules can't work like that.)
Anyway, the rumors I've read said that it was actually a data file and that's why they considered it acceptable to deploy it on a Friday -- the assumption being that changing configuration without rolling out a new version of the executable wouldn't break things too badly.
That might be how executables in an operating system work. Wouldn't be how extremely low level BIOS or ROM code that is meant to be executed before the OS loads would work. I can't say for certain exactly how that works these days, but when I was troubleshooting some BIOS code on an old computer of mine, I found myself decompiling a VGA BIOS. And that basically works by being in a certain memory block, it begins with a consistent signature to signal "Yup, there is code here" to the motherboard BIOS, and then it begins loading and executing instructions at a certain offset to initialize the card. Fun fact, you can actually reinitialize the VGA BIOS with a short assembly program that just CALL's to that location if memory serves.
What you are describing sounds more like a boot sector, i.e., raw machine code meant to be read from bootable media and executed directly by firmware (the mobo BIOS in your example)
I’d be surprised if in any modern operating system, executables (even those loaded and run at boot time) were handled that way. Then again, one is reminded of the old chestnut about idiot-proofing software…
The problem with Turing machines is that pretty much everything becomes equivalent at high enough levels of generality. Windows EXEs (and DLLs) have a specific format that makes it impossible to load an empty file or (most) malformed files, but if the surrounding format is correct enough you can absolutely have it followed by a bunch of nonsensical instructions and memory locations -- there is a checksum, but (infamously), it isn't actually mandatory to load or run.
Worse, there's no rule that your executable is the only place that such instructions can come from, and few architectures try. Even in Harvard architectures like Atmels or PICs, there are specific instructions to transfer from the data bus into the program and vice versa. Modern operating systems on von Neumann architectures try to stop you from doing so by accident, by setting memory pages as either instruction or data, and in modern Windows machines further isolating data instructions with DEP, but it's ultimately just a set of flags.
There are arguments against doing this, in favor of having your base program load from more conventional configuration files with a strict format (eg JSON), or even having a very limited programming language that your core driver then 'runs'. They have some tradeoffs! But ultimately the problem is a lot more boring: in each case, you have to be able to recognize and respond to a corrupt file. And that's a solved problem! But you have to recognize it.
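One common way to "recognize it", sketched in Python: wrap the payload in a header carrying a magic value, a length, and a CRC32, and refuse to load anything that fails the check. The header layout and the names `pack_blob` and `load_blob` are assumptions for illustration, not any vendor's real format.

```python
import struct
import zlib

MAGIC = b"BLOB"                  # hypothetical signature
HEADER = struct.Struct("<4sII")  # magic, payload length, CRC32 of payload


def pack_blob(payload: bytes) -> bytes:
    """Prefix the payload with a header describing it."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def load_blob(data: bytes) -> bytes:
    """Return the payload, or raise ValueError if the file looks corrupt."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    magic, length, crc = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    payload = data[HEADER.size:]
    if len(payload) != length:
        raise ValueError("length mismatch")
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    return payload
```

One design note: the checksum alone would not catch the all-zeros case, because the CRC32 of an empty payload is 0, so a 12-byte header of zeros is self-consistent. The non-zero magic value is what rejects it. The "respond" half (fall back to the last known-good file, for instance) is left out of this sketch.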
I'm pretty sure I could write a C program right now that would run in Windows 10 that will load and run arbitrary assembly instructions from a binary file. The C program might have all the trappings of a proper Win10 executable, but the file it loads and runs sight unseen wouldn't. I'm pretty sure that's what the Crowdstrike driver is doing with the file full of 00's.
Not really. The linker and a bunch of other transformations are going to happen before any of your instructions run. Dumping and loading bytes of a structure straight out of memory has long been considered a lazy and dangerous thing to do; no one is surprised that this sort of bug arose from it.
Apparently, the corrupted file was just filled with nulls:
https://twitter.com/jeremyphoward/status/1814364640127922499
I'm trying to imagine what might cause that; truncating the file and then failing to write it? My filesystem-fu isn't really up to par.
Wasn't there an old joke about an MBA cutting costs in half by getting rid of the 1s and standardizing on 0s?
Saving a file using a filesystem that journals metadata followed by computer crash that happens before the file contents are flushed is one way to achieve it.
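For what it's worth, the usual defense against that failure mode on the writing side is to write to a temporary file, flush and fsync it, and only then rename it over the target. A minimal Python sketch follows; the function name `atomic_write` is mine, and this says nothing about how the file in question was actually produced or distributed.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old or new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the contents to disk before the rename
        os.replace(tmp, path)  # atomic rename on POSIX
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A fully durable version would also fsync the containing directory after the rename; that step is omitted here to keep the sketch short.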
This may be true amongst the Dev Set, but it's very much not once you get outside of that small corner of 'tech' and into the infrastructure side of things. While there are plenty of greenhorns puttering around, there are also the true gray beards who have been in the same position for decades and know literally everything about the systems they administer (and design and build).
I'm a network engineer and there are enough grizzled old men on my team that our collective experience no doubt stretches into the centuries, and several of these guys have gotten a big chunk of it at this one place. We just had a guy retire last year who started in the late 70s...
That bit jumped out at me as well. Even amongst the Dev Set that strikes me as a very FAANG / SV centric point of view.
If what is being reported is true and they released some unrunnable or improperly formatted file, I can’t even comprehend that level of incompetence. There is a lot of bullshit at my company which is also dealing with many of the issues you’ve addressed in your post, and of course we have incidents, but something so basic being released with such insane permissions would not be possible at my workplace. Of course that’s discounting any malicious actor, but the number of QA cycles and slow rollout that we go through would have caught something like this 5 weeks before it sniffed release.
Something or someone is deeply rotten at crowdstrike. They need to make a big-time firing or I predict that people will start fleeing in droves.
This seems to me like a fairly usual level of competence from a bolt-on-security-as-a-product or compliance-as-a-service company. Examples:
It's not that it's amateur hour specifically at CrowdStrike. It's the whole industry.
I have always been of the opinion that antivirus is a poor idea, and at best, a half-baked solution preventing you from adopting better solutions, such as sandboxing/virtualization and general human security hygiene. I haven't run an antivirus (besides Windows's built-in Defender) in years on any of my computers or phones, and I've never gotten malware on my systems simply because I don't open any sketchy apps or files, and if I do, it's in a virtual machine isolated from the rest of my system.
That an entire industry (the antivirus industry) exists based on the premise of a bad idea that is not only ineffective but adds massive attack surface simply because attackers can exploit what is essentially a privileged system component with deep access to all parts of the system - a cure worse than the disease - should be a lesson in how easy it is for someone to get the basics of a skill (such as security) wrong.
The problem is that simply receiving a text may count as "opening a sketchy file". You really can't expect every boomer pecking at a computer to know the ins and outs of security.
This is not to defend this particular software, but your view leaves out some things as well.
Bad example? If you're targeted with zero-days like Pegasus, an antivirus software is not going to stop it. In fact the standard defense for this sort of thing is what I've advocated - isolation of system components via sandboxing/virtualization. I'm not sure what your argument is.
AV can at least detect anomalous network traffic or unexpected processes, which is obviously not as good as preventing the infection in the first place but is still valuable.
In this case, the systems were sandboxed - FORCEDENTRY escaped the sandbox. Sandboxing isn't a magical technology without vulnerabilities.
Would antivirus have actually detected this infection? Ignoring the fact that phones don't usually run antivirus (because they employ sandboxing security measures), in the case of FORCEDENTRY, the exploit was discovered because Citizen Lab specifically examined the phone of an anonymous Saudi activist. They don't say what exactly led to the phone being examined by them, but I'm willing to bet that it exhibited signs of infection that any general-purpose antivirus like McAfee wouldn't have detected.
Yes, sandboxing technology can still be vulnerable, but antiviruses are not a better security practice than sandboxing. Moreover - since you brought up a targeted spyware attack - if you're being specifically targeted by nation-state actors aided by NSO Group, you need to up your security anyways. So your comment that
immediately after discussion of FORCEDENTRY confused me, because if your threat model includes zero-day attacks like FORCEDENTRY (for example, you're a political activist, journalist, or whistleblower), then yes, I do expect such a person to know the ins and outs of security. They should stay on top of their game, because their life literally depends on it. At that level of threat modeling, if you're genuinely worried about attacks from well-funded nation-states, then security is not something you can just ignore and expect to have taken care of for you.
It's not one or the other.
Bringing this up as an example was my mistake since it seems to have derailed the conversation.
There are plenty of vulnerabilities out there that are not zero days. There are plenty of systems out there that are vulnerable to such attacks. Not everything is patched as soon as the CVE is published and not every system is updated as soon as the patch is published. It's a simple fact of life that there is a time period between a vulnerability being disclosed and all systems being updated, even if those systems are enrolled in some kind of regular update scheme. Arguing against the need for at least detection and monitoring for threats because you have a lot of faith in sandboxing does not make sense.
A general rule: the further a software product is away from "engineering candy", the worse it is.
Software engineers are some of the most entitled, overpaid people on the planet. (I should know!) They have lots of career options.
To get good engineers you need to either pay an outrageous salary or have an interesting product like a video game. Want to find engineers to work on your compliance software? Good luck. Hell, even Google engineers making 400k/year can't be bothered to work on essential but boring products, preferring instead to chase shiny baubles.
No one wants to do the dirty work where good job means not messing up.
I think the problem is that "good job" doesn't mean "not messing up" in the context of these compliance-as-a-service or security-blanket-as-a-service companies. Instead, "good job" is "implement as many features as possible to a level where it's not literally fraud to claim your product has that feature, and then have a longer checklist of supported features in your product than the competition has so the MBA types choose your product".
CrowdStrike's stock price is only down by about 10% today on one of the highest-impact and highest-profile incidents of this type I've seen. I'm pretty sure their culture of "ship it even if it's janky and broken" has netted them more than a 10% increase in net revenue, so it's probably net positive to have that kind of culture.
Their net revenue is under a billion a year. The total economic damage caused by this single bug is almost certainly larger than the total net income of the entire history of the company. In fact, it is almost certainly larger than the total gross income of the entire history of the company. I do not know where the valuation is coming from, but it certainly isn't from their revenue figures.
Lol P/E of 644.
But it's a hyper-growth company bro, surely they'll be able to pivot to making money once they've captured the full market bro.
Yeah but if they're not liable what relevance does that have to their share price?
I don't know if they're liable or not. I doubt Crowdstrike knows if they're liable or not.
I mean, you could get good engineers with a video game project, but for that you have to be willing to also pay them the outrageous salary. Video game projects are more art than engineering, requiring more designers than engineers. And the brilliant engineers won't work for that much below market rate; if that were their goal they'd go into research or try to get into an early-stage startup, not join a project that's just the application of an existing engine to a new gameplay design. The game projects that appeal to engineers don't sell enough for AAA development, they're nerd games like Factorio or RimWorld (sorry friends).
Not that game companies don't capitalize on the appeal of their projects to talent. They just capitalize by taking lower-tier but motivated engineers/artists/designers and running them into the ground.
I incidentally just learned about the Okta breach yesterday, simply by getting frustrated with it and searching Twitter for evidence of whether everyone else hates using it continuously as much as I do.
I have the opinion that the more data you give out, the more likely it will just get breached. Especially personal data meant to authenticate your identity. The best thing to do would be to not give data out at all - data that doesn't exist, can't be stolen - but most of the rest of the world doesn't think the same way, and are extremely unlikely to question why we have normalized people giving away their data without a second thought.
Don't they deploy updates like this in a development environment first to test for exactly this kind of thing? I work in very low-level, mostly unimportant IT and I sweat breaking a single website that gets 100 visitors per month. How does something as big as this not get tested first?
They don't stage releases, sending them out to limited groups one at a time? They do one global update and hope for the best?
There are such obvious ways to limit the impact of this sort of screwup.
I'm going to play Karnak the Magnificent here and say they do indeed do staged rollouts.
They just don't properly check if one stage has succeeded before moving on to the next.
Rumors suggest that it may have been rolled out Friday morning local time.
Of course, a slow rollout is pointless if you have no canary process and no means of determining if you just bricked all Australia...
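To make the canary point concrete, here is a toy sketch of a staged rollout that refuses to move on when a stage looks unhealthy. Everything in it is hypothetical (the `staged_rollout` name, the host names, the 1% failure threshold); it only illustrates the gating idea, not how any vendor actually ships updates:

```python
from typing import Callable, Sequence


def staged_rollout(
    stages: Sequence[Sequence[str]],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.01,
) -> int:
    """Deploy stage by stage; return the number of stages that passed.

    Stops before the next stage if too many hosts in the current one
    fail their health check after the update.
    """
    for index, hosts in enumerate(stages):
        for host in hosts:
            deploy(host)
        failures = sum(1 for host in hosts if not is_healthy(host))
        if hosts and failures / len(hosts) > max_failure_rate:
            return index  # halt: this stage failed its canary check
    return len(stages)


# Simulate a bad update: every host that receives it stops responding.
deployed = []
stages = [["canary-1", "canary-2"], ["au-1", "au-2", "au-3"], ["world-1", "world-2"]]
completed = staged_rollout(stages, deployed.append, lambda host: False)
print(completed, deployed)
```

With the bad update simulated above, only the two canary hosts ever receive it; the other stages are never touched.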
I thought this is exactly why they roll out updates instead of distributing them all at once. Do we know for sure there was a rollout or could they have mistakenly pushed this everywhere at once?
In this case I think it depends on what is being pushed. You have to keep in mind that this is a security tool specifically promising and designed to implement rapid defense against zero-day security exploits. Holding off for a week or so on a threat under active exploitation is not what they are being paid for.
Yeah, the paranoid option is that there was some serious zero-day that they were trying to react against, it worked fine on the development environment, and they made a tradeoff of the risk of this sort of incident against not pushing the big red button.
But being derpy is always an option.
That’s a good point, but they have to have some kind of staging environment or slow rollout right? You can’t just release to all customers at once, that’s absolutely insane and asking for something like this to happen even if it’s security-critical.
It's mid-July. Likely an intern bypassing a safety check to try to get his project completed on time.
Note that it's crowdstrike, not cloudstrike. Doesn't detract from the post that much but just thought it was worth pointing out.
ClownStrike, I think
Thanks, fixed
Can you remove the strikethroughs, at least after the first one? It's a bit jarring.
It's already over. The biggest non-crisis ever, even if it was among the most widespread.
That seems overdramatic. I have not noticed any disruption at all; if not for all the headlines this morning I would have never known about this. An upgrade was rolled out and it was fixed. This requires manual intervention on servers, which is why IT exists as a profession in the first place. It's not a crisis on the order of Covid or 2008, but more like a mass disruption. I think too many people are overreading into this as some sort of harbinger of the awaited collapse, and really it's not.
Airlines were grounded; again, this is a common occurrence. There was a similar incident as recently as 2023 when many flights were grounded: https://www.reuters.com/world/us/why-us-flights-were-grounded-by-faa-system-outage-2023-01-11/
It's bad, no doubt, but the mass-grounding of flights is something that typically happens every 2-3 years.
The fact they are paid so well and exhaustively vetted in the hiring process suggests they are not disposable. Companies invest a lot of resources in new hires. There is also a loss of perspective in that people forget the other 3650 days of the past decade in which there is no major failure, but a single failure is suddenly a major indictment of the entire tech industry, as opposed to something more mundane like a mistake.
CrowdStrike stock was only down 11% today, which is far less than expected given that it has been implicated in the greatest IT failure ever. By comparison, Meta stock fell 15% in a day last after it missed the highest of earnings estimates. This is reason to believe it's not as bad as the overly dramatic language would suggest.
One would hope such companies learn from past mistakes, but as tech changes, consequently so do the mistakes. So I can expect incidents like this in the future.
I had a flight canceled today. I am fucking livid. This was over 12 hours after the rollout. Luckily I was able to get rescheduled onto a flight tomorrow, but frankly I have no confidence that that flight will happen either.
I was just going on a silly vacation. I cannot imagine how I would feel if I missed something important. There will never be justice for this. In a fair world, Crowdstrike would be sued into bankruptcy like Purdue Pharma. I'll be lucky if I get a drink voucher out of this.
Are you sure you can get nothing? Last time an airline messed up my connection I got around 800 euros out of it
In the United States, airlines aren't legally required to compensate customers for delays at all. I had a United flight recently delayed by eight hours and received a $15 lunch voucher and $100 in airline credit though.
I can't help but think about this post that was linked on the SSC reddit a few days ago: https://matt.sh/panic-at-the-job-market
(long, rambly post that I don't fully agree with but it did say a few interesting things) In particular these two quotes:
...
I don't think it's like that at every company, or even the majority. But there are certainly some companies like that. They, in theory, care greatly about their tech workers, because the salaries are high and they have a vague understanding that tech is important. But they don't have a good system for actually hiring good tech workers. And then, once hired, they use them all as generalists, moving quickly from one thing to another, with no chance to actually develop expertise or fix deep underlying issues. And they are never given any kind of decision-making authority in the company, only responsibility to "just fix whatever breaks."
I think that behavior happens the most in companies that are not "tech companies," but still use tech. Banks, airlines, large retailers, that sort of thing. They need tech to function, but it's just a cost center to them- they want to just pay a fixed price per month to "handle tech" and then not think about it ever again. And it seems like those are the ones being bitten in the ass by this thing, because it turns out that running a windows server with third-party antivirus on it with automatic updates is not actually very secure! I wonder if we'll see any restructuring, or if this sort of thing is just going to happen every so often forever, as companies get blindsided by tech issues that they don't understand and never cared to try and understand?
I think the culture of secrecy is the bigger problem. They are paid a lot and expected not to blabber to the media if they expect to be employed now or in the future by other companies.
Wikipedia reports that 5.9% of flights were cancelled worldwide. It's definitely a lot of flights, but also not that much from a global perspective.
Twitter had flightradar24 animations showing flights disappearing, with Community Notes saying that the animation is fake and not from the CrowdStrike event. You wouldn't really notice a 6% decrease visually, or would notice only a slight reduction.
People love to lie on Twitter for dramatic effect.
If we assume a 6% reduction of global economic activity for one day, it certainly is a loss of billions of dollars. And yet it is less than one extra holiday per year.
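Back-of-envelope version of that arithmetic, for anyone who wants to plug in their own numbers. The world GDP figure is a round number I am assuming for illustration, not something from the thread or from Wikipedia's article on the outage:

```python
# Assumed round figure for annual world GDP in US dollars, not from the thread.
WORLD_GDP_PER_YEAR = 100e12
DAYS_PER_YEAR = 365

daily_output = WORLD_GDP_PER_YEAR / DAYS_PER_YEAR
one_day_loss = 0.06 * daily_output      # 6% of a single day's output
share_of_year = 0.06 / DAYS_PER_YEAR    # the same loss as a fraction of the year

print(f"daily output:  ${daily_output / 1e9:.0f} billion")
print(f"6% of one day: ${one_day_loss / 1e9:.1f} billion")
print(f"share of year: {share_of_year:.4%}")
```

Under that assumption the loss comes to somewhere around sixteen billion dollars, which is a big number and still only 6% of what a full extra holiday would cost.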
I was going to do a lot of stuff at work today. Was.
Interestingly this crash has seemingly barely affected Finnish businesses and organizations at all (apart from cases where they have projects with foreign companies, of course). Apparently there were some minor glitches at the system of the bank I use, but I didn't notice it at all.
I wonder if it's simply that Finnish companies are patriotically committed to using F-Secure/WithSecure solutions above all others...
I’ve never heard of CrowdStrike and I’ve worked as a programmer for 25 years, so I assume more or less nobody uses CrowdStrike in Finland.
Heard much the same from a programmer friend.
Finnish = Linus Torvalds = Linux servers ?
Possibly a part of the explanation, too.
Your first two images are fake, either appearing earlier or being photoshopped.
First image: https://old.reddit.com/r/pcmasterrace/comments/8xe65r/the_utimate_blue_screen_of_death/
Second image: https://www.verifythis.com/article/news/verify/national-verify/las-vegas-sphere-blue-screen-of-death-image-is-fake/536-dd009fe6-8ac7-4044-9c3a-ec114107f6e3
Yep, you're right... I was going through Twitter and got duped.
I don't even understand why something like clownstrike is necessary in 2024. It should be possible for the OS to be locked down to the point where it's not necessary to have an anti-virus running. And if you need some other security system because you are worried about zero-day exploits from nation-state threats then you should really consider your threat model, because the clownstrike system is effectively a malware distribution platform. I guess it's fine if you trust clownstrike and the US government, but it's a far from ideal situation. Clownstrike seems to have a very nice relationship with the US security state. For example, they were brought in to do the hacking investigation by the DNC and provided attribution to Russia.
OS vendors should really expose some kind of interface that allows security vendors to perform these deep inspections 'safely'. I think Linux has eBPF, which I think some vendors have been using to provide file system monitoring and network monitoring.
Also, the SOC2/etc compliance mandates a lot of this stuff. We run most of our software on Fargate ECS where the compute is completely managed by AWS. I've been using this as an excuse as to why we can't run file monitoring and other garbage on our systems that use Fargate. I also suspect this is why these managed Docker/managed Kubernetes systems are popular: potentially you can avoid some of the tickbox security work. We also run all of our containers with a read-only root filesystem, so I don't even understand the threats that a file system monitoring system would be trying to remediate in our situation. Technically some kernel exploit could allow the root filesystem to be modified even if it's read-only, or AWS employees could fuck with us, but I suspect in these cases the file system monitoring could also be trivially bypassed.
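As an aside, the read-only root filesystem bit is easy to check mechanically. In an ECS task definition it is the `readonlyRootFilesystem` flag on each container definition; a small sketch that lists containers missing it (the task definition below is a made-up fragment, and the helper name is hypothetical):

```python
def writable_root_containers(task_definition: dict) -> list[str]:
    """Return names of containers that do not set a read-only root filesystem."""
    return [
        container.get("name", "<unnamed>")
        for container in task_definition.get("containerDefinitions", [])
        if not container.get("readonlyRootFilesystem", False)
    ]


# Hypothetical task definition fragment; names and images are made up.
task_definition = {
    "family": "example-service",
    "containerDefinitions": [
        {"name": "api", "image": "example/api:1", "readonlyRootFilesystem": True},
        {"name": "sidecar", "image": "example/sidecar:1"},
    ],
}

print(writable_root_containers(task_definition))
```

Here only the `sidecar` container gets flagged, since it never sets the flag.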
Clownstrike and all the other security stuff is the triumph of the security engineers and MBA types over users and cowboy developer types. Security incidents happen. Security engineers blame users and cowboy developer types, come up with software to make computers crappier and less useful. MBAs (especially MBAs at companies making this malware) call this "best practices" and push to have them required by corporations and governments. Developers and users complain that their computers are slow and don't work, the MBAs and security engineers say 'that's how you know it's working'. Then something like this happens and the cowboy types indulge in schadenfreude.
Because, just like with DEI and other stupid corpo bullshit, business necessity has nothing to do with efficacy. You do the rituals and check the boxes because someone somewhere figures this lets the company cover its ass. Whether there was an actual threat of ass exposure to begin with doesn't even get considered.
Don't worry, they've figured out how to break that too.
Checking to see if my flight will be delayed due to this. It still says "on schedule", but following the chain of "where is this plane coming from" backwards in time to see where my plane is, I see one flight where the expected departure time is before the expected arrival time of the airplane.
A website that follows and shows you the chain of previous flights of your plane sounds like a pretty cool idea
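The core check for such a site would be tiny. A sketch, assuming you already have one aircraft's legs in order from some data source (the `Leg` type, flight numbers, and times below are all invented for illustration):

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Leg:
    flight: str
    expected_departure: datetime
    expected_arrival: datetime


def impossible_legs(chain: list[Leg]) -> list[str]:
    """Return flights expected to depart before the inbound aircraft arrives.

    `chain` is the same aircraft's legs in order, earliest first.
    """
    problems = []
    for inbound, outbound in zip(chain, chain[1:]):
        if outbound.expected_departure < inbound.expected_arrival:
            problems.append(outbound.flight)
    return problems


# Made-up example: the second leg departs before the plane has landed.
chain = [
    Leg("XX100", datetime(2024, 7, 19, 8, 0), datetime(2024, 7, 19, 11, 30)),
    Leg("XX200", datetime(2024, 7, 19, 11, 0), datetime(2024, 7, 19, 13, 0)),
    Leg("XX300", datetime(2024, 7, 19, 14, 0), datetime(2024, 7, 19, 16, 0)),
]
print(impossible_legs(chain))
```

Any flight it returns is still showing "on schedule" while being physically impossible, which is exactly the situation described above.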
I do hope the fallout from this crap will be immense. Cloud was a bad idea from the beginning. This type of cloud security too.
This is the opposite of the cloud.
It is Software-as-a-service, but the processing wasn’t being done on someone else’s computer.
This isn't "cloud" in any meaningful sense.
Indeed, if these computers were in the cloud, they'd be fixed much faster.
Well, not cloud, but internet in general.
These machines all updated something, because they are connected to the internet and set up for automatic updates.
People learn pretty quickly that automatic updates are a terrible idea. Even if the update doesn't screw up your data or your workflow, e.g. by taking away some feature you were depending on or crapping up the UI, it's likely the update process will kick in at an inconvenient time (like in the middle of a presentation). So they turned them off. Security people started crying about unpatched bugs, and got enough corporate power to get automatic updates considered a "best practice" (when it's not), and here we are.
Automatic Windows updates destroyed two of my work laptops at my last job.
I've had Windows 10 updates fuck up some of the older software I have running for my job.
And people wonder why I turn Windows 10 updates off.
Now I'm going to have to fight off a Windows 11 upgrade, so as to not fuck up said software. You'd think local IT would be more paranoid about just gleefully installing whatever it is Microsoft tells them to, but...
I can't speak for your IT department, but in the past we would always test updates across a cross section of the business before rolling them out to everyone. Maybe like 10% of the computers would get the test updates, and we would only deploy if we had no issues on the test PCs. That's really all you can do though, sometimes issues come up even with testing.
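One common way to pick that kind of test group without maintaining a list by hand is to hash each machine's name into a bucket, so the same machines land in the pilot every time. A sketch under that assumption (the `in_test_cohort` helper and the `pc-0000` style hostnames are made up, and this is not necessarily how we did it back then):

```python
import hashlib


def in_test_cohort(hostname: str, percent: int = 10) -> bool:
    """Deterministically place roughly `percent`% of hosts in the test group."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


hosts = [f"pc-{n:04d}" for n in range(1000)]
cohort = [h for h in hosts if in_test_cohort(h)]
print(len(cohort), "of", len(hosts), "hosts get the update first")
```

The split is only approximately 10%, and a hash knows nothing about which machines matter, so in practice you would still want to make sure the pilot group covers each department and hardware type.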
Automatic updates are the worst thing. Everyone hates them yet companies do it.
The problem is that no automatic updates is also a terrible idea, as a majority of systems don't get patched, ever. The ideal is manual updates but responsible companies/admins testing before deployment, and sadly I don't think that's gonna happen. The second best is gradual/tiered deployments with the ability to opt out, which is more realistic but still requires more effort than many companies are willing to provide.
I personally think that "no automatic updates" is better than the current hellscape of "lol we can break your device at any time", even with the problems it causes. I'd rather have hella security issues on the Internet than have my stuff randomly break (or just get worse) without my intervention.
"Internet was a bad idea from the beginning" is certainly an interesting argument.
I can definitely agree that canary-less fast global rollouts were a bad idea from the very beginning though.
How long do you wager it'll be before a major car company [thinking of Tesla here but I'm pretty sure they all do this now] bricks a significant number of its electric cars by pushing a bad update (rendering the car unable to start)?
That seems best case. What if it bricks while driving?
Probably highly unlikely. I have worked on mission critical software. While it wasn't automotive it was in a similar field. The code I wrote took six months to reach production. At that company we wrote maybe 5% as many lines of code per work week compared to a normal company. There was also extensive testing.
There may be individual events that happen. Mass brickings are unlikely.
Considering the overall quality of automotive software is 100% garbage I'm not as certain a massive screw-up would be as unlikely.
More like, for all of its benefits the internet has always been, and will always be, a point of vulnerability.