Hakim
Moderator
Some perspective on hard drives …
There are many, many issues that we have to consider and contend with in operating TBN and related sites.
One aspect is hard drive integrity. Another is backups. We have considered these issues from many angles, for several years.
Hard drive reliability statistics are usually stated by manufacturers as “Mean Time Between Failures” (MTBF) -- so and so many hours between failures, for example “MTBF 500,000 POH”. POH means “power-on hours.”
They also provide a stat for “Start/stop cycles (ambient)”, typically something like 50,000 cycles. And they will state the overall “component design life” at something like 5 years.
For judging how long a single drive will last, MTBF is basically a meaningless number. The typical 500,000 POH figure quoted above works out to about 20,833 days of constant 24-hour operation, around 57 years!
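To make the arithmetic concrete, here is a minimal Python sketch of the conversion. The function names are made up for illustration. The last function is an assumption that goes beyond the datasheet: it treats failures as arriving at a constant rate (an exponential model), which is one common way to turn an MTBF into a yearly failure chance for a drive that runs around the clock.

```python
import math

HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365
HOURS_PER_YEAR = HOURS_PER_DAY * DAYS_PER_YEAR  # 8,760 power-on hours


def mtbf_to_days(mtbf_hours):
    """Convert an MTBF in power-on hours to days of constant operation."""
    return mtbf_hours / HOURS_PER_DAY


def mtbf_to_years(mtbf_hours):
    """Convert an MTBF in power-on hours to years of constant operation."""
    return mtbf_hours / HOURS_PER_YEAR


def annualized_failure_rate(mtbf_hours):
    """Chance of a failure within one year of 24 x 7 use.

    Assumption: constant failure rate (exponential model). Real drives
    do not follow this exactly, so treat the result as a rough guide.
    """
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


mtbf = 500_000  # the "MTBF 500,000 POH" datasheet figure quoted above
print(f"{mtbf_to_days(mtbf):,.0f} days")         # 20,833 days
print(f"{mtbf_to_years(mtbf):.1f} years")        # 57.1 years
print(f"{annualized_failure_rate(mtbf):.1%} per year")  # 1.7% per year
```

Read this way, the 57 years is not a lifetime for any one drive. Under the stated assumption it is closer to saying that, out of a large group of drives, somewhere under 2 in 100 would be expected to fail in a year.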
MTTF, “Mean Time To Failure,” is a more relevant measure for hard disk life, but very few manufacturers use it instead of MTBF. A more interesting development is that many hard disk manufacturers have moved to a 1-year warranty for their products. That may suggest that their MTBF or MTTF isn't as good as the datasheet says.
Modern PC systems run at increasingly higher temperatures, so one could argue that the life expectancy of the components inside the box gets shorter. Just look at the newer types of processors: where earlier ones had to dissipate maybe 10W of heat, the newer ones range up to maybe 80-100W. We now have temperature gauges on all of our hard drives, so we can monitor the temperature at all times.
Our needs are specific to the TBN board. As I noted before, it is ranked something like 1,500 among all sites in terms of pages served monthly. This is not an insignificant thing.
And these pages are dynamic, meaning that the load and the number of drive accesses are much higher than on a typical web site. That means our hard drives may fail more frequently than drives under other types of usage.
Well, okay, they did in fact fail once in five years. But it would have cost us around $20,000 in “costs” over that period of time to prevent that single loss of access and a few hundred posts. Was there a critical need to "protect" to that degree? It was our judgment, when considering it from all angles, that we did not need to provide 100% uptime and 100% fail-safe protection against any loss.
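For anyone who wants to see that trade-off as numbers, here is a rough Python sketch using only the figures above ($20,000 over five years, one failure). The function name and the breakdown into per-year and per-month figures are illustrative, not how the decision was actually worked out.

```python
def protection_cost_breakdown(total_cost, years, failures):
    """Spread a total protection cost over time and over failures avoided.

    All inputs come from the post; the breakdown itself is only a
    back-of-the-envelope illustration.
    """
    per_year = total_cost / years
    per_month = per_year / 12
    per_failure = total_cost / failures
    return per_year, per_month, per_failure


per_year, per_month, per_failure = protection_cost_breakdown(
    total_cost=20_000,  # estimated cost of full protection over the period
    years=5,            # period considered
    failures=1,         # drive failures actually experienced
)
print(f"${per_year:,.0f} per year")            # $4,000 per year
print(f"${per_month:,.2f} per month")          # $333.33 per month
print(f"${per_failure:,.0f} per failure avoided")  # $20,000 per failure avoided
```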
We do keep backups, which are usually good to within 24 hours. For your information, even Amazon.com DOES NOT make backups of their web operations. I know this is hard to believe, but it is true. I read it in their annual statement to stockholders. And they noted that if they were to lose their entire server farm, in all likelihood they would not be able to continue in business. So, everyone makes cost/benefit decisions.
Yes, Rackspace is expensive. No doubt about that. But we cannot simply go looking for the cheapest offer. Rackspace is arguably the best hosting provider available. They are just unbelievably responsive and competent. Case in point: they worked with us 24 x 7 over the 4th of July holiday to get a new server up and running. Very few companies will do that AT ALL. Rackspace's slogan is "Fanatical Customer Service." For a change, here is one company that walks the talk -- in fact exceeds it by a mile.
Anyway, we are back up and running, apparently better than ever.
We sincerely apologize for any inconvenience to users.