Another AWS outage: Should you run for other clouds?

source link: https://acloudguru.com/blog/engineering/another-aws-outage-should-you-run-for-other-clouds

Scott Pletcher
Dec 20, 2021 8 Minute Read

Heard any good AWS uptime jokes? As you might have heard, Amazon Web Services (AWS) has had some well-publicized issues in the past couple of weeks.

First, an AWS outage on December 7 took down some important services in US-EAST-1, resulting in significant impact for many customers. Then, on December 15, another, much less dramatic outage struck the US-WEST regions. At 7:14 a.m. Pacific time on December 15, some AWS customers began to notice internet connectivity issues in US-WEST-1 and US-WEST-2. Within about 45 minutes, AWS engineers had identified the root cause and begun to implement remediation steps, and by around 8:15 a.m. Pacific time, they reported that things had returned to normal.

So, what happened? In this post, we’ll talk about the latest AWS outage, plus the rest of this week’s AWS news. Let’s go!



What happened with the second AWS outage this month?

Well, the official status update says the issue was caused by “network congestion between parts of the AWS backbone and a subset of Internet Service Providers.”

And what caused this network congestion?

Well, again, according to the official status update, it was “triggered by AWS traffic engineering, executed in response to congestion outside the AWS network. This moved more traffic onto the AWS network than expected” and subsequently affected connectivity between the AWS backbone network and a subset of internet destinations.

So, AWS was really trying to be proactive and do the right thing for customers, and, unfortunately in this case, it backfired. In fact, AWS engineers are always doing proactive stuff behind the scenes to keep services running efficiently, and we don’t even notice because things just work. This outage is much different in nature from the December 7 AWS outage, and I bet that, if not for that earlier event, this one would have gone relatively unnoticed.

What did we learn from the second AWS outage this month?

So, what are we to do? Run for the other clouds? Bring our data centers out of retirement? Look, everything fails all the time. Doesn’t matter if it’s on the cloud, across multiple clouds or in our own data centers.

According to AWS’s reports, it appears customers might have been impacted for as much as 45 minutes. 45 minutes in the context of a year’s worth of 24×7 service is still north of 99.99% uptime.
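For anyone who wants to check that arithmetic, here is a minimal sketch. It assumes a 365-day year, treats the 45 minutes as a complete outage, and assumes it was the only downtime all year; none of those simplifications come from AWS's report, and the function names are just illustrative.

```python
# Assumption: a non-leap year of round-the-clock service.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def uptime_percent(downtime_minutes: float,
                   period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Uptime as a percentage of the period, given total downtime."""
    return 100.0 * (1.0 - downtime_minutes / period_minutes)


def downtime_budget_minutes(target_percent: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Minutes of downtime allowed in the period for a given uptime target."""
    return period_minutes * (1.0 - target_percent / 100.0)


# Assumption: 45 minutes was the only downtime in the whole year.
print(f"Uptime with 45 minutes down: {uptime_percent(45):.4f}%")
print(f"Yearly budget at 99.99%: {downtime_budget_minutes(99.99):.2f} minutes")
```

Under those assumptions, a 99.99% target leaves a little over 52 minutes of downtime per year, so a single 45-minute event still fits inside it.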

Of course, I can still hear the refrain “but that’s downtime, and we can’t afford downtime.”

Sure, then you should create an active-active multi-region failover architecture and pay twice what you’re paying now.

What’s that? You don’t trust AWS? Well, then create that same active-active multi-region architecture spanning multiple cloud providers, with multiple vendor relationships, multiple support contracts, and multiple support teams responsible for a solution that is now orders of magnitude more complex.

Now, how much are those 45 minutes really worth?



New APAC Region in Jakarta

Aside from all the outage chaos, there was one announcement that was more interesting than the existing slew of “instance type X now available in region Y.”

AWS recently opened a new Region in Asia Pacific, based in Jakarta, Indonesia. The new Region is named ap-southeast-3 and is the 10th AWS Region in the Asia Pacific and mainland China part of the globe.

In addition to the new infrastructure in Jakarta, AWS has also committed to growing its business in Indonesia and creating more than 24,000 jobs over the next 15 years. This is a great example of where AWS doesn’t just drop a data center somewhere, but truly invests in the local community, changing lives and improving economic and social futures.

Keep up with all things AWS

Want to keep up with all things AWS? Follow ACG on Twitter and Facebook, subscribe to A Cloud Guru on YouTube for weekly AWS updates, and join the conversation on Discord.

Looking to learn more about cloud and AWS? Check out our rotating line-up of free courses, which are updated every month. (There’s no credit card required!)

