The other day a couple of our servers were inaccessible for over 9 hours. It was a disaster. The servers themselves were fine, but the data center where they live had a power outage. From what little information I have gathered, the power outage caused networking and router failures, so that even when power was restored, either through backup or repair of the primary source, the server bank where we had a couple of servers remained offline.
To compound matters, the company that supplies these servers was not reachable by phone (constant busy signal), so I couldn’t even communicate with them to learn what had happened and get an ETA on return of service that I could pass on to my customers, who called and emailed me throughout the day and evening. Needless to say, I lost a lot of sleep that night. I have yet to get a full accounting or reassurance from the server provider that they have things under control. I have been with them for years and have had very good service, excellent response time if a server does go down, and generally good communication. But this incident has caused me to rethink the whole relationship. Moving servers is a real PITA. But I have done it several times in the past due to poor service, so I am considering doing it again. In any service business, communication is vital. When that breaks down, trust is lost.
In order to improve communication with our own clients, I am working on a couple of things. The first is a new blog hosted at WordPress.com where I can post information about our hosting service. Even if all our servers go dark, the blog will not. Of course there is no guarantee WordPress.com won’t go down. Anything can happen.
Update: I have heard from the owner of my supplier at the data center. It seems some of the communication failure was on my end. Coincidentally, during this incident my Gmail account decided to send all of their messages to the spam folder. What? And Gmail is supposed to be so reliable. That said, their own mail server was also down for quite a while. Anyway, here is some of what he said:
As you are aware there was a major power outage yesterday at the data center we are located in. Even though our data center has a back-up battery system and back-up generators, there was apparently a failed capacitor bank that prevented either the batteries or the generators from helping. We will be learning more about what happened tomorrow when the data center owners provide us with an RFO or Reason For Outage document.
I am assured they are taking steps so this won’t happen again.