Microsoft 365 and Azure face outages in Australia, root cause revealed
Published on September 4, 2023
A severe lightning storm hit Sydney, Australia, on the evening of August 30, causing outages at major cloud providers, including Oracle Cloud, NetSuite, and Microsoft’s Azure. Microsoft 365 and Power Platform services were affected as well. A similar outage happened in July.
Microsoft’s post-incident analysis report attributed the outage to a combination of factors.
Cooling Systems Stopped Working
The storm caused a power sag that took cooling units offline in one data center within one Availability Zone. As temperatures rose, the data center’s systems shut down automatically to keep the hardware from overheating and being damaged. As a result, customers were unable to use the affected Azure and Microsoft 365 services.
Not Enough Staff at Night
Too few people were working at the data center overnight, which made it hard to bring the cooling systems back quickly. To address this, Microsoft temporarily added more staff to the night shift.
Both factors have pushed Microsoft to improve its automation. The company acknowledged the need to enhance automation within the data center to make it more resilient to various types of voltage sag events. This includes evaluating the sequence in which servers and chillers are restarted, to ensure that the highest-load servers and their corresponding chillers are prioritized.
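To make the restart-sequencing idea concrete, here is a minimal Python sketch of prioritizing chillers by the server load they cool. It is only an illustration and not Microsoft’s actual tooling: the `Chiller` class, the `server_load_kw` field, the chiller names, and the load figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chiller:
    name: str
    # Hypothetical field: combined load (kW) of the servers this chiller cools.
    server_load_kw: float


def restart_order(chillers):
    """Return chillers sorted so those cooling the highest server load come first."""
    return sorted(chillers, key=lambda c: c.server_load_kw, reverse=True)


# Made-up example data, for illustration only.
chillers = [
    Chiller("chiller-a", 120.0),
    Chiller("chiller-b", 450.0),
    Chiller("chiller-c", 300.0),
]

for chiller in restart_order(chillers):
    print(chiller.name, chiller.server_load_kw)
```

A real sequencing system would likely weigh more than load alone, such as power availability and how long each chiller takes to restart. The sketch shows only the ordering step.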
The outage resulted from a combination of factors, including a power sag, cooling unit failure, and an automated shutdown triggered by rising temperatures. Insufficient nighttime staffing was also a contributing factor.
via NetworkWorld
Davesh Beri