2/ in recent years without any major outages or code breakages. Now they’re getting routine. And often they’re ones like today where they don’t create ripples of poor performance or [graceful] degradation but breakages where you see the raw code. So what happened?
— Josh Marshall (@joshtpm) March 6, 2023
4/ Another explanation is that they’ve fired so many people that they [simply] don’t know the code base well enough to manage it or [tinker] with it. Another is that they don’t have enough people to do either of those. It’s probably a mix of both.
— Josh Marshall (@joshtpm) March 6, 2023
6/ Problems happen even for the biggest operations. But not often. There should be lots of ways to test new code even if you know the system perfectly, staging areas, limited tests. Again, we did something and then it broke isn’t a good explanation.
— Josh Marshall (@joshtpm) March 6, 2023
Well, the first story out of the box is, that’s all the explanation they got:
How a single engineer brought down Twitter on Monday https://t.co/7ntNzLh3NM
— emptywheel (@emptywheel) March 6, 2023
The change in question was part of a project to shut down free access to the Twitter API, Platformer can now confirm. On February 1, the company announced it will no longer support free access to its API, which effectively ended the existence of third-party clients and dramatically limited outside researchers’ ability to study the network. The company has been building a new, paid API for developers to work with.
But in a sign of just how deep Elon Musk’s cuts to the company have been, only one site reliability engineer has been staffed on the project, we’re told. On Monday, the engineer made a “bad configuration change” that “basically broke the Twitter API,” according to a current employee.
The change had cascading consequences inside the company, bringing down much of Twitter’s internal tools along with the public-facing APIs. On Slack, engineers responded with variations of “crap” and “Twitter is down – the entire thing” as they scrambled to fix the problem.
This was not the first time:
Monday’s errant configuration change was at least the sixth high-profile service outage at Twitter this year.
Let me pause and emphasize that “this year” is only 10 weeks old. And then put this in tweet form because once again I can:
Twitter issues recently:
January 23 — error messages when tweeting
February 8 — down for users globally
February 15 — glitching for users globally
February 18 — issues sending gifs
February 28 — groupchat glitch with kicking
March 1 — timeline not loading
March 6 — content not… https://t.co/B0zKMeYBCo pic.twitter.com/EuFBAL7u2d
— Pop Base (@PopBase) March 6, 2023
It seems February was an unusually long month. But remember, the problem is, Twitter is “brittle:”
In many ways, Monday’s outage represented the culmination of Musk’s leadership at the company so far. In a single-minded effort to cut costs on his $44 billion purchase, he has been slashing the staff and reducing Twitter’s free offerings.
This paved the way for a single engineer to be staffed on a major project — one that is linked to several interconnected, critical systems that both users and employees depend on.
And with few knowledgeable workers on hand to restore service, it took Twitter all morning to fix the problem. “This is what happens when you fire 90 percent of the company,” another current employee says.
Inside Twitter’s HQ, however, the mood was almost light. “We’re laughing all the way down,” says a different current employee.
They’re not laughing at you, Elmo. They’re laughing near you.