Today WordPress.com was down for approximately 110 minutes, our worst downtime in four years. The outage affected 10.2 million blogs, including our VIPs, and appears to have deprived those blogs of about 5.5 million pageviews.
What Happened: We are still gathering details, but it appears an unscheduled change to a core router by one of our datacenter providers messed up our network in a way we haven’t experienced before, and broke the site. It also broke all the mechanisms for failover between our locations in San Antonio and Chicago. All of your data was safe and secure; we just couldn’t serve it.
What we’re doing: We need to dig deeper and find out exactly what happened and why, how to recover more gracefully next time, and how to isolate problems like this so they don’t affect our other locations.
I will update this post as we find out more, and have a more concrete plan for the future.
I know this sucked for you guys as much as it did for us — the entire team was on pins and needles trying to get your blogs back as soon as possible. I hope it will be much longer than four years before we face a problem like this again.
Update 1: We’ve gathered more details about what happened. There was a latent misconfiguration, specifically a cable plugged someplace it shouldn’t have been, from a few months ago. Something called the spanning tree protocol kicked in and started trying to route all of our private network traffic to a public network over a link that was much too small and slow to handle even 10% of our traffic, which caused high packet loss. This “sort of working” state was much worse than if the network had just gone down, and it confused both our systems team and our failsafe systems. It is not clear yet why the misconfiguration bit us yesterday and not earlier. Even though the network issue was unfortunate, we responded too slowly in pinpointing the issue and taking steps to resolve it using alternate routes, which made the downtime 3-4x longer than it should have been.
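To illustrate why a “sort of working” link can fool failover logic, here is a minimal sketch in Python. It is not our actual monitoring code: the `ProbeResult` type, the function names, and the 5% and 50% thresholds are all hypothetical, chosen only for illustration. The idea is that a check which only asks “did any probe get through?” reports a lossy link as up, while a check that looks at the loss rate can treat a degraded link as a reason to fail over.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Outcome of a batch of probe packets sent over one link (hypothetical type)."""
    sent: int
    received: int


def packet_loss(result: ProbeResult) -> float:
    """Return the fraction of probes lost, from 0.0 (none) to 1.0 (all)."""
    if result.sent == 0:
        # No probes were sent, so treat the link as unverified (fully lost).
        return 1.0
    return (result.sent - result.received) / result.sent


def naive_is_up(result: ProbeResult) -> bool:
    """Naive check: the link counts as up if any probe at all got through."""
    return result.received > 0


def classify_link(result: ProbeResult,
                  degraded_threshold: float = 0.05,
                  down_threshold: float = 0.50) -> str:
    """Classify a link as healthy, degraded, or down (thresholds are assumptions)."""
    loss = packet_loss(result)
    if loss >= down_threshold:
        return "down"
    if loss >= degraded_threshold:
        return "degraded"
    return "healthy"


def should_fail_over(result: ProbeResult) -> bool:
    """Fail over on anything other than a healthy link, including partial loss."""
    return classify_link(result) != "healthy"


if __name__ == "__main__":
    lossy = ProbeResult(sent=100, received=70)
    print(naive_is_up(lossy), classify_link(lossy), should_fail_over(lossy))
```

With 30 of 100 probes lost, the naive check still reports the link as up, while the loss-aware check classifies it as degraded and recommends failing over.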
Here in Europe it was late at night, so nobody noticed. 😉 We had exactly the same 110 minutes. It was a bit slow at times during the day, but for some hours now it has been running fine and flawlessly, like before.
WordPress.com-Crew: Your service is fantastic, thank you very much. 110 minutes in four years, unbelievable.
Toyota can learn a thing or two from you guys!
Good luck, I know you’ll figure it out. Thanks for everything!
I was not happy to hear that WP was down…but you guys did a great job getting it back rolling. Matti
Good to be online again – thanks and greetings from sunny London!
Good work; transparent, efficient, clear. Fancy taking over some government agencies?
Thanx for update! WordPress is awesome!
Dang, I slept through it. missed the whole thing. Thanks for your hard work.
Thanks for this post – since I am VERY new to this site, and blogging – being unable to access my site was a concern. Thanks for the reassurance that this isn’t a regular thing!
My blog is still not back. The formatting is gone. What do I do?
Your blog looks great to me. If you’re still having issues, please contact support: https://wordpress.com/support/contact/
The update (and honesty) is much appreciated. Why can’t more businesses be like this?
good work!
OMG! And I thought I was banned from WP. Thanks a lot for letting us know.
Thanks for the swift communication. You guys are doing a great job! Keep it up!
I’m glad to see everything is back up and working fine again. Thanks to all of you guys for working so hard on fixing the problem as best and as fast as you could!
Thanks for the update. I was wondering what was going on. I probably lost a few hundred views, but what the heck – these things happen.
It’s cool. I don’t bitch about things that are free. =D
Ah, that’s why I was wondering what was happening. I thought it was only my computer. Thank God for letting us know; I had been blaming my ISP and my computer since morning. Hope it never happens again, but you never know…
Cheerss…J
Thanks so much for the update. Really appreciate it.
Thanks for keeping our stuff safe. You have no idea how much I appreciate what your work allows me to do! Hope your weekend is far better. 🙂
It was an inconvenience, but these things happen. I would like to add some problems we are having with our visitors.
They are being inundated with e-mails from WordPress asking them to subscribe to a single comment, and they get re-subscribed again and again, double, triple, quadruple, or however many times one person answers a single comment. It’s ridiculous, and the sheer volume of subscribe-to-comment pages alone could be enough to crash a system.
That definitely sounds off, please get in touch with support and they can walk you through fixing it: https://wordpress.com/support/contact/
Is it any coincidence that this happened at exactly the same time as Wikia.com went down with a router change issue?
Yes I think it’s a coincidence, we don’t share any infrastructure with them.
Transparency and honesty are the key. Plus quick resolution, all of which you did! Let’s hope there are no more surprises!
Thanks for explaining….
and here I was, thinking I’ve been so bad that all connections to WordPress had been cut for good! 🙂
Thank goodness you managed to sort it out so quickly!
WebOjO’s team appreciate your efforts in bringing back the blog universe!
…and that’s part of life, Matt! We can never anticipate what’s going to happen next! AND MACHINES ARE NOTHING MORE THAN MACHINES!
So it’s no wonder that a problem happens here or there. We hope that this will not occur again. We love WordPress!
Thanks for your hard work.
Thanks for the explanation, but over here in Australia the downtime was rather longer than ten minutes. For some blogs, including my own, the outage lasted maybe an hour or more.
I think you misread the post, we said a hundred and ten minutes, which is just under two hours.
Thank you for such a quick reaction and prompt explanation…that’s what I like about you guys!
Thanks for all.
I may have been upset if I’d noticed.
Lucky for you I had a massive hangover this morning.
The pain from that far outweighs the pain of a few lost page views so…no biggie.
Glad WP is up and running again.
Wish my brain was.
Matt, quite right. This is what happens when I comment before my first coffee. Sorry.
Well, this can happen to any high-traffic site, but I still wonder whether certain precautions could be taken here; I think having a CDN provider in place might help. But no matter what, I still love you guys!!!!
I just want to say thank you very much for all your hard work and respect for your community of users.
As others have stated, thank you for holding yourselves accountable and taking prompt measures to remedy the situation.
No big deal. Stuff happens. Maybe someone took a break from their computer and went outside for a bit.
Very impressed with WordPress – efficiently back on line and great communication with users, and all this from a free service. Good on ya’, guys! Have a medal!
my blog is down right now although wordpress.com is up … what’s up with that?
downtownfarm: I see your blog just fine. If you’re still having trouble, please contact support at https://wordpress.com/support/contact/
Is it still down? I can’t see my page?
ivorb: We came back up yesterday. If you still can’t see your WordPress.com blog please contact support at https://wordpress.com/support/contact/ for help.
i was wondering what happened. good to know my posts are still there.
Your worst in four years? oh well, it’s not THAT bad.
WordPress-2 hands down-greatest blog platform with the best service around.
Thank you from a newbie blogger.
none of my videos are working
Christina: get in touch with support at https://wordpress.com/support/contact/ – they’ll help check your videos.
Thanks for informing.
We understand, don’t worry.
Appreciate all the support and effort put in to set up this great blog site!
Keep it up and you guys are the best!
🙂
Thanks for all your hard work.
Greetings from Pasig City, Philippines! I didn’t even know that my WordPress blog was down… We just moved to a room upstairs and have been dealing with excessively high temperatures, so my computers were down, probably at the same time WordPress.com was down, which made it unnoticeable to me. Luckily, I have mirror (non-WordPress.com) sites in place in the event my primary blog is ever down.
Thanks for keeping the WP community informed.
Wow that is good to know. I thought it was something I did…lol. Thanks for the quick repair.
Good work in getting it back up. First time since I’ve had a blog that it’s happened so your record is good.
You guys did a great job. Bravo.
OK, that explains everything yesterday. Thanks for this update!
For the past few days I have had another problem: WordPress refuses to “remember” me and demands ID and pass every time I log on, although I have checked the relevant box!
heliotypon: Please contact support at https://wordpress.com/support/contact/ – they’ll be able to help you.
WordPress Rocks… Great job on the recovery!
That’s the first outage I have ever experienced with WordPress. Hope we get stronger and more reliable failovers set up. Wish you success!
Thank you for your hard work. Thank you that my blog is ok.
speaker ok Thanks
you guys are awesome, thanx for being on top of things!
Thanks…..
Make sure this incident of sucking doesn’t take place again in near future. Otherwise you won’t get positive and appreciating comments from bloggers.
Anyway, thanks for the update. It didn’t cause me any trouble, though, as I didn’t come online while the largest blogging platform on the earth was down. But I can feel how much it sucked for both parties (authority and end-users).
I don’t think it’s that long. 110 minutes is fast for downtime.
Could be it happens !
Come on, it was definitely not 10 minutes, it was much longer than that.
But I didn’t mind, it was still too minuscule a glitch when weighed against what I get for free from WordPress!
Hats off.
Cheers.
If you re-read the post, you’ll see we said the outage lasted one hundred and ten minutes.
No problem. It is the least of life’s worries.
FWIW, if you have a couple of old routers set up with basic hard routes (i.e. not doing dynamic route sharing, just meat and potatoes static routes) then when a router meltdown happens (often from dynamic table issues) you can always just fire up the older gear for basic service while you figure out the who, what, and how of the ‘trick’ stuff getting its configs confused.
No, not very elegant, and the performance often suffers (since the old gear was usually replaced with the newer gear due to being too slow), along with losing the neat dynamic routing / failover features. On the other hand, you get basic service up fast and time to breathe.
Works for all sorts of “issues” (except things like your ISP pushing a bad DNS table 😉 ).
So anyway, had a long nice meal, came back, and all was fine again: So “Good Work Guys!”
I haven’t even noticed that something went down 😉 Have to read more blogs to be informed …oh wait, they were down… : ) nevermind…
Thanks so much!
thanks. bring it up..
It was only 110 minutes! I tried to write a blog post during this time, but I wasn’t sure why WordPress.com wasn’t responding. This blog setup at wordpress.com is actually better than having my blog at my own website. Problems like this usually knocked my blog out of commission on my server much longer than 110 minutes. And I usually lost content forever. I’m glad I switched to WordPress!
I can’t seem to add widgets. Is this still a problem?
Widgets are working fine, if you’re still having trouble let our support team know: https://wordpress.com/support/contact/
I think you guys are doing a great job.
I’m very upset after reading this message. But still I can trust you. I know you guys will be doing well again soon. No one can stop the good things. My support and trust are always at your side.
VIVA WORDPRESS! VIVA LIBERTY! VIVA THE BLOGGERS!
thank you for keeping us updated and for taking care of things out there. It is greatly appreciated~~
I noticed something. But I thought it was my internet connection acting up again, glad to hear it wasn’t!
Fix It
Thanks so much!
“Prepare for Impact”
Oh wait, that was the impact.