
Re: [cobalt-users] 99.9% up time



on 4/13/00 10:30 AM, Jeff Lasman at jblists@xxxxxxxxxxxxx wrote:

> Okay, you force me to explain myself...
> 
> Kris Dahl wrote:
> 
>> IMHO three nines is easy.  I could do that with crappy hardware and NT if I
>> was careful.  I don't think I have a single machine (even a workstation)
>> that isn't 99.9% available.
> 
> It's got nothing to do with any single machine or even "normal
> conditions".  It's got to do with what happens after your power's been
> down ten minutes and the battery's beginning to fail, and you don't have
> a generator.  Or you do, but it hasn't been run in three years and you
> have no idea if it'll start up or not until it doesn't.
> 
> Or, presuming you're multi-homed, but the backhoe that does the damage
> is right outside your building and cuts all the wires.
> 
> Or you're really securely multi-homed, with three separate paths out of
> your building, AND a dish on the roof.  And battery backup, and your own
> diesel power, well tested, running properly, and everything else, and
> one of your upstreams has a misconfigured router and your BGP just
> happens to not work properly.  And it happens the weekend you're
> vacationing in Hawaii and your BGP tech's cell-phone's battery is dead.

Well, this is why you co-locate with a world-class datacenter that is
multi-homed, has redundant UPS power backed by a generator, has fire
suppression, etc.  They should ideally have bandwidth purchased from
multiple carriers, delivered over physically separate lines (so the
backhoe can't clip all of them).  And if they only have one technician--it
won't matter if the BGP is up and running or not--I assure you you'll have
other problems.

I'm not aware of anyone multi-homing a sat dish--but that would be pretty
cool.  Latency would be atrocious.

It's not cheap, but when you think about it, how can you afford to not do
this?  And *they* guarantee you that 99.999% uptime.

> Attaining 99.9 under normal conditions is easy.  Being able
> to guarantee it under abnormal conditions is a bit harder.
> 
> Unless of course you want to become known (as are many tier one
> providers) as having guarantees with no value.

I would have no problem with 3 nines in even abnormal conditions, at a
proper facility.  When we implement our new system, I would have no issues
with guaranteeing at least 3 nines, if not 4.

3 nines is ~ 9 hours a year.  That is easy.
4 nines is ~ 50 minutes a year.  This is a bit tougher.  Scheduled
maintenance becomes an issue.  A memory upgrade could take 20 minutes
easily.  Really, to do this you should have load-balanced servers.
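
For anyone who wants to check the arithmetic behind those downtime
budgets, here is a minimal Python sketch.  It assumes a 365-day year and
treats every maintenance window as a flat 20 minutes (the memory-upgrade
figure above); the function names are made up for illustration and are not
from any monitoring tool.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def downtime_budget_minutes(availability_percent):
    """Minutes of downtime per year allowed at the given availability (%)."""
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


def maintenance_windows(availability_percent, window_minutes=20):
    """Whole maintenance windows of window_minutes that fit in the budget."""
    return int(downtime_budget_minutes(availability_percent) // window_minutes)


if __name__ == "__main__":
    for label, pct in [("3 nines", 99.9), ("4 nines", 99.99), ("5 nines", 99.999)]:
        budget = downtime_budget_minutes(pct)
        print(f"{label} ({pct}%): {budget:.1f} min/year "
              f"({budget / 60:.2f} hours), "
              f"{maintenance_windows(pct)} x 20-minute windows")
```

Three nines works out to about 8.76 hours a year and four nines to about
52.6 minutes, so at four nines only two 20-minute upgrades fit in the whole
year's budget, with nothing left over for unplanned outages.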

>> But if you want to guarantee 99.9%, you should invest in adequate power
>> protection, and keep near-line backups if at all possible.  Also co-locating
>> with a datacenter that provides the power and redundant connections with
>> full BGP routing to multiple carriers would be the easiest way to do it.
> 
> I've got all of that, and charge accordingly.  I still don't guarantee
> anything except best efforts.  (I guess that's what I mean by
> "nobaloney".)

And that's the reason I don't *plan* on offering guarantees: you're
at the mercy of somebody else.  But as far as the risks go, if you
get 1-2 high-end clients because you 'guarantee' that it will be up (with
penalties), it will be worth your while if you do it right.

And providing 3 nines is, if you are using a decent data-center, a given.

-k