Re: [cobalt-users] cobaltracks down time... how has it been?
- Subject: Re: [cobalt-users] cobaltracks down time... how has it been?
- From: Nobody <bs@xxxxxxxxxxx>
- Date: Tue Dec 19 07:49:00 2000
- List-id: Mailing list for users to share thoughts on Cobalt products. <cobalt-users.list.cobalt.com>
} << From: hirsheskwirtz@xxxxxxxxxxx (hirsh -e-skwirtz)
}
} to day is monday 11:54 pm and my cobalt raq3 has been down for over 5 hours
} today...
Are you sure Cobaltracks was down?
No, I'm not associated with Cobaltracks, nor am I trying to
hustle business. But we do run our own network here, and that
gives me a point of view that few of you have.
Assuming you are on a T1 from the likes of UUnet, Savvis,
etc., there are at least two hops between you and Cobaltracks. More
than likely there are a half dozen or more hops. Each and every one
of those hops represents a point of failure. If you are running your
tests through an ISP-provided dialup, the ISP may very well be
kicking you off every now and then in accordance with their standard
line management policies - or they may very well be having problems
on their end. There's also been discussion on the inet-access list
of Sprint having network problems and Sprint is a big enough player
that one or more of those hops mentioned above may be their
network.
This isn't intended to say anything good or bad about
Cobaltracks or anyone else for that matter. But the reality of
connectivity is such that being unable to reach a site does not
necessarily translate into problems on the provider's end.
We just dual-homed here due to recurring problems with
connectivity. The plan was to install a new connection, switch over
to it, then replace our existing connection in order to try and
prevent any future problems. For the past 3 or 4 months we've been a
thorn in the side of our upstream, who repeatedly stated that their
systems were fine. By the time the second connection was being
installed we couldn't wait to get rid of them. But during that
installation our primary circuit went down. At the time USWest, our
local loop provider, was here and so were the "engineers" whose
task was to set up the new connection. The result was that the problem
with our existing connection was found - a wire was broken inside
the cable that runs from the demarc to the pole. USWest had
repeatedly checked that circuit and assured us, in no uncertain
terms, that the problem was on our upstream's end. Fact is, USWest
had checked the physical connection just a day before the
installation of the new connection occurred and was unable to find
any problems with it.
The point is that there are many factors that apply in
situations such as this and, based upon my experience, nobody knows
where the fault lies until someone actually finds it.
We run an automated system here that checks the services on
servers located in two countries and three separate facilities every
90 seconds. We run another automated system that graphs the latency
on three major Internet backbones - checking them every 15 minutes.
We check access to our network here by running similar functionality
on a simple 19.95 per month hosting account provided by a major
player, checking our own systems from their location. And now,
having had two connections for two weeks, it's possible for us to
take the great circle route in and out of here - we can run tests out
one connection that come in via the other connection and over the
respective connections' backbones - both of which are different.
While we only have a year's worth of history to work off, I've yet to
see anything that even remotely resembles 100% uptime. Worse yet,
perhaps, some of the servers we monitor are located in one of the
highest rated colo facilities to be found and we now know that the
upstream we've been complaining about and whom we were eager to
replace provides a faster, less heavily loaded connection than our
new "better" and more expensive upstream does.
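For anyone who wants to try the kind of periodic service check
described above, here is a minimal sketch in Python. It is not the
system we run. The function names, the 5 second timeout, and the use
of a plain TCP connect as the test are all assumptions made for
illustration. Only the 90 second interval comes from the description
above.

```python
import socket
import time


def check_service(host, port, timeout=5.0):
    """Try a TCP connect. Return (is_up, latency_seconds).

    latency_seconds is None when the connect fails or times out.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False, None


def run_checks(targets, interval=90, rounds=1, sleep=time.sleep):
    """Check every (host, port) pair once per round.

    Returns a list of (timestamp, host, port, is_up, latency) tuples.
    The sleep argument is only there so the delay can be swapped out.
    """
    log = []
    for i in range(rounds):
        for host, port in targets:
            up, latency = check_service(host, port)
            log.append((time.time(), host, port, up, latency))
        if i < rounds - 1:
            sleep(interval)
    return log
```

A successful connect only shows that something is listening on the
port. It says nothing about whether the service behind it is giving
sensible answers, so treat a result like this as a first indication
and nothing more.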
Cobaltracks may or may not be the problem. Without knowing
the status of all the hops between you and them there's no way of
knowing. If this is indeed happening I'd strongly recommend that you
do a traceroute when you can't reach them just to see if the problem
is with them or somewhere else along the tangled web of connectivity
that lies between you and them.
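A rough sketch of how the traceroute step could be automated follows.
It assumes a Unix-style traceroute on the PATH that accepts -n
(numeric output) and -m (maximum hops). The function names are made
up for this example, and the parser only handles the simple
one-line-per-hop output format.

```python
import subprocess


def run_traceroute(host, max_hops=30):
    """Run the system traceroute and return its text output.

    Assumes a Unix-style traceroute binary is installed.
    """
    result = subprocess.run(
        ["traceroute", "-n", "-m", str(max_hops), host],
        capture_output=True,
        text=True,
        timeout=300,
    )
    return result.stdout


def last_responding_hop(output):
    """Return (hop_number, address) of the last hop that answered.

    Returns None when no hop answered at all.
    """
    last = None
    for line in output.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip the header line and blank lines
        # A hop that timed out on every probe shows only "*" fields.
        answers = [f for f in fields[1:] if f != "*"]
        if answers:
            last = (int(fields[0]), answers[0])
    return last
```

Keep in mind that some routers simply don't answer traceroute
probes, so a run of silent hops is a hint about where to look and
not proof of where the break is.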
No, we are not going to monitor your servers for you. No,
we're not trying to hustle business. We're just pointing out the
reality of the situation. You see, the bottom line here is that your
server can be so heavily loaded as to be incapable of responding to an
HTTP request.
Just a few months ago someone we assisted a while back
called because they were fed up with the problems they were
experiencing with their colocated server. What they described
was quite similar to what you described. They're on this list
too. To make a long story short, it turned out that a spammer was
flooding their server with vast quantities of undeliverable messages
and that activity had literally slowed the server to the point of
being all but unusable.
Quite simply, there's a whole lot more to this than meets
the eye and the possible list of causes goes on and on. While you
may very well be right, without more information your post should be
seen for exactly what it is - an unverified report of a problem that
may or may not be Cobaltracks' doing. At the very least, I'd suggest
running the very same test at the very same time on a minimum of two
or three other servers which are located elsewhere - at least
one of which should be on the same backbone - before you put your
clients through the nightmare of moving your servers elsewhere.
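The comparison test suggested above could be sketched along these
lines. Everything here is an assumption for illustration: the
function names, the use of a TCP connect to port 80 as the test, and
the wording of the conclusions, which are only starting points for
further digging.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def reachable(host, port=80, timeout=5.0):
    """Return True if a TCP connect to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_all(hosts, port=80):
    """Probe every host at roughly the same time.

    Returns a dict mapping each host to True or False.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        return dict(zip(hosts, pool.map(lambda h: reachable(h, port), hosts)))


def interpret(results, suspect):
    """Turn the probe results into a rough first guess."""
    if results[suspect]:
        return "suspect is reachable from here"
    others = [up for host, up in results.items() if host != suspect]
    if not others:
        return "no reference servers to compare against"
    if all(others):
        return "only the suspect is unreachable: look at its host, provider, or path"
    if not any(others):
        return "nothing is reachable: look at your own connection first"
    return "mixed results: possibly a backbone or routing problem"
```

If the reference server on the same backbone fails along with yours
while the others answer, that points toward the backbone and away
from the provider.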
Brent Sims