
Modus operandi on steroids

That is what I see, and not everyone sees it that way. There is a setting of oversimplification and I get it, we all want things to just work. So when the BBC alerted us all to the outage that AWS experienced, there is more to all of this.

I am not on the side of Amazon here, or of their opposition for that matter. So when I saw the news that thousands of corporations went down, I was eager to dig into it. And it was given to me as “Platform outage monitor Downdetector says it has seen more than 6.5 million reports globally, affecting more than 1,000 companies”, but why? We get part of the answer with “There aren’t many alternatives to AWS – operating on that vast scale is an enormous logistical challenge” and I tend to agree with Zoe Kleinman on this. Then the BBC gives us “Amazon Web Services says it has fixed the underlying problem that has disrupted many of the world’s biggest websites and apps, but a full recovery will take some more time” and I go ‘underlying problem?’ There Tom Gerken has an answer; he gives us “At 08:00 BST this morning, reports started flooding in of problems accessing a few apps. By 09:00, it was apparent this had turned into quite a big deal.

We know now that the culprit was something called “DNS resolution” not working properly at Amazon Web Services. In simple terms, it all comes down to the bit of tech which lets a computer understand what we mean when we see a url like bbc.co.uk. But the reason it had such a big impact is simply that a massive amount of companies rely on Amazon working properly.

Downdetector told the BBC it had received reports stating more than 1,000 companies were facing problems. The question now is – will some of these companies look to alternatives?
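To put that “DNS resolution” bit in concrete terms, here is a minimal Python sketch (my own illustration, not anything AWS runs) of what that step does: it turns a name like bbc.co.uk into the IP address a computer actually connects to. When that layer fails, the lookup errors out and everything behind the name looks ‘down’, even if the servers themselves are fine.

```python
# Minimal sketch of what "DNS resolution" means in practice: turning a
# hostname like bbc.co.uk into the addresses a computer actually connects to.
import socket

def resolve(hostname: str) -> list[str]:
    """Return the IP addresses a hostname resolves to."""
    try:
        results = socket.getaddrinfo(hostname, None)
        # Each entry ends with a sockaddr tuple; its first element is the address.
        return sorted({entry[4][0] for entry in results})
    except socket.gaierror as err:
        # This is roughly what dependent apps saw during the outage:
        # the name simply would not resolve.
        raise RuntimeError(f"DNS resolution failed for {hostname}: {err}")

if __name__ == "__main__":
    print(resolve("bbc.co.uk"))
```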

You see, the problem is that ‘everyone’ expects a system to work outright, all of the time. The old premise is “You can fool all of the people some of the time and some of the people all of the time, but you cannot fool all of the people all of the time”; this can now be ‘tweaked’ into “You can service all of the people some of the time and some of the people all of the time, but you cannot service all of the people all of the time”. You might think that this is folly, but it is not. You can introduce larger pools of resolution, but the system was designed to work all of the time; there was apparently no switch-over, and that might have resolved things. I am also contemplating that an outside source had introduced something to make it fall over. Was that the case? Amazon and its AWS pool of technicians are top notch, so a hiccup like this might have been foreseen.
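On that ‘switch-over’ point, purely as illustration: assuming the dnspython package (version 2 or later) and a couple of public resolvers picked for the example, a fallback across independent resolution paths looks something like this. It does not fix a provider whose own records are broken, but it shows the principle that one path failing should not mean the whole lookup fails.

```python
# Hedged sketch of a "switch over": ask several independent DNS resolvers in
# turn and take the first answer, instead of depending on a single path.
import dns.exception
import dns.resolver

RESOLVERS = ["1.1.1.1", "8.8.8.8"]  # illustrative public resolvers, not AWS internals

def resolve_with_fallback(hostname: str) -> list[str]:
    """Query each resolver in order and return the first successful answer."""
    last_error = None
    for server in RESOLVERS:
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [server]
        resolver.lifetime = 3.0  # give up quickly so the next resolver gets a turn
        try:
            answer = resolver.resolve(hostname, "A")
            return [record.address for record in answer]
        except dns.exception.DNSException as err:
            last_error = err  # this path failed; switch over to the next one
    raise RuntimeError(f"all resolvers failed for {hostname}: {last_error}")

if __name__ == "__main__":
    print(resolve_with_fallback("bbc.co.uk"))
```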

My thoughts on the third party come from the news “The latest update comes after AWS said, at around 12:00 BST, it had fixed the underlying issue, but noted there would still be problems as they brought everything up to speed” and this happens around noon? I don’t believe in these coincidences. Noon, British Summer Time? Something seems amiss. We get the usual baby formula stories, because the baby needs feeding. Yet the idea of having something in stock was rejected? And I get it, we all need our sustenance. That’s why I keep 3 days of spare food, so when this happens I am not helpless.

So that brings us to the ‘latest’ issue. We are given “After today’s Amazon Web Services outage impacted many of the world’s biggest businesses, some customers might be asking whether they can take legal action for any disruption they might have suffered. Henna Elahi, a senior associate at Grosvenor Law in London, explains that whether money can be recovered will depend on “several factors”, including the contracts between the various parties and the severity of the outage. For instance, banking apps are among those that saw thousands of reports of issues.” And I get that, some people will cling to legal settings and that is fine, but it leaves me with the following questions.

Do these contracts address glitches like this? Was there an insurance setting to prevent this? Was that insurance paid, or did everyone just assume that this is a free service that works 100% of the time?

I reckon that AWS will investigate how this could have been prevented or diminished. You see, when this happens to these AI systems and you can disrupt those services, a glitch like that will allow you to short-sell whatever AI data is being handled, and that implies organized crime intervention on nearly every level (or state players).

We were given:

This implies that the entire setting took less than 5 hours to fix, and I say ‘Yay Amazon’, but the underlying question of why this had such a massive impact whilst only North Virginia was affected is the cutting one. And whilst the thought that it happened in North Virginia, hence the CIA is to blame, is just ludicrous (yet not out of the realm of possibilities), my issue is that a setting of decentralized cloud computing might be required, where as one system goes down another takes over. We are given that “The AWS Cloud in North America has 31 Availability Zones within 9 Geographic Regions, with 31 Edge Network Locations and 3 Edge Cache Locations.” So my question becomes (optionally utterly ridiculous) “Why did it take 4 hours” with the added “When cloud computing is nearly ‘global’”. Perhaps there are good reasons for this; perhaps this is the first time it went down to this degree, and that is fine. Things go broken into the night and the next morning we have a stronger system. That is the track of evolution, and it never goes without a glitch.
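For that ‘one goes down, another takes over’ idea, a client-side sketch could look like the following. The endpoints are hypothetical placeholders, not real AWS addresses, and a real multi-region design also has to replicate data between regions, which this happily ignores; it only shows the switch-over shape of the thing.

```python
# Hedged sketch of client-side failover across regions: try each regional
# endpoint in order and return the first good response.
from urllib import error, request

# Hypothetical regional endpoints for illustration only; not real AWS URLs.
REGIONAL_ENDPOINTS = [
    "https://api.us-east-1.example.com/health",  # primary (North Virginia)
    "https://api.us-west-2.example.com/health",  # secondary (Oregon)
    "https://api.eu-west-1.example.com/health",  # tertiary (Ireland)
]

def call_with_regional_failover(urls: list[str], timeout: float = 3.0) -> bytes:
    """Return the first successful response from the list of regional endpoints."""
    last_error = None
    for url in urls:
        try:
            with request.urlopen(url, timeout=timeout) as response:
                return response.read()
        except (error.URLError, TimeoutError) as err:
            last_error = err  # this region is unreachable or slow; try the next one
    raise RuntimeError(f"all regions failed: {last_error}")

if __name__ == "__main__":
    print(call_with_regional_failover(REGIONAL_ENDPOINTS))
```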

But the idea that one centre had this much of a global impact? Consider what happens when the Stargate contraption goes online and power gets disrupted. See what you optionally lose at that point. Because that is the underlying setting: it isn’t what we have now, it is what we will have tomorrow that counts as disastrous.

Have a great day and in case it happens again, don’t rely solely on your credit card, make sure you can afford to pay for that coffee (that ancient system using coins). 


Filed under Finance, IT, Science