It hasn’t been a fun day here at BugHerd HQ.
BugHerd is proudly hosted with Heroku, who in turn run on top of AWS (Amazon’s cloud service). Earlier today (Friday, 8.20pm PDT), AWS suffered a power outage due to lightning, which resulted in Heroku going down along with big names such as Netflix, Pinterest and Instagram. This obviously meant the lights went out on BugHerd as well.
In all, BugHerd was offline for just over 6 hours; that really sucks, and we’re really sorry.
It would be easy for us to throw our hands in the air and say, “we can’t control the weather” or “we can’t help it if Amazon falls over”, but the reality is that you don’t pay Amazon or Heroku; you pay us. That means it falls on us to ensure we’re available as close to 100% of the time as possible. We let you down, and we’re sorry.
It would also be easy to say this is an isolated incident, but whilst we still have better than 99.9% uptime, there have been two such outages in the past month (although the last one was for a much shorter period), as well as a couple of similar incidents last year. A managed host is great for overall uptime, but it does mean our hands are tied when the shit hits the fan. It’s a really unpleasant feeling, I can tell you.
In the coming weeks we’ll be reviewing our hosting strategy and, in particular, how we can implement a reliable failover system so that when this sort of incident does occur, we can at least minimize the downtime. Heroku did a great job getting us back online, but if we were running our own gear, it could’ve been much quicker. I certainly don’t blame Heroku; they’re as much a victim in this as we are, and they did an awesome job getting their thousands of customers back up and running. Even though we’re a startup, and cash is a precious commodity, it’s clear we need to up the ante with our hosting.
Finally, I do want to send a big thanks to the team at Heroku for getting us back online. As a resource-poor startup, knowing that you have a bunch of super smart people looking after you is at least a little bit comforting in situations like this! Not only have they done everything in their power to get us back online, but they’ve kept us regularly updated so that we could keep you updated.
Once again, we’re very sorry for the outage, and I’m personally sorry to the folks who rely on us to get their work done and couldn’t.
If you want to get in touch regarding the outage, please contact us at email@example.com
BugHerd CEO and Co-founder