
Re: [cobalt-users] RE: cant submit pages to inktomi



At 10:34 AM 3/7/2002, you wrote:
Hi all,

I have searched not only the cobalt-users archives but, it feels like, the
whole internet for a solution to this problem. Now I turn to you...

We have a Cobalt RaQ4 with a number of virtual domains under one IP address
(we will call this siteA).

We have been trying to submit a number of pages from another site (we will
call this siteZ) to the Inktomi database with no luck; they tell us our
pages are generating errors.

As a bit of background, these are dynamic pages and the pages we are trying
to submit are in the format:
www.siteZ.com/siteZpage1/
www.siteZ.com/siteZpage2/

This is the response from the Inktomi people:

The server is actually serving the Inktomi crawler an HTTP 406 error code
because it seems it doesn't like the Accept: header which Inktomi sends
(Accept: text/*).
Normally this happens when people try and serve dynamic content based on the
Accept/User-Agent combination, and don't code for crawler accesses. If you
can check your content-negotiation code for errors and update it so that it
can handle text-only clients then that should solve the problem.
-----------------------------------------------------------------------------
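
To illustrate what Inktomi is describing, here is a minimal sketch (in
Python, not taken from the server in question) of how a content-negotiation
check might decide between a 200 and a 406 for a given Accept header. The
function names are hypothetical, and q-values are ignored for brevity; a
real server's negotiation logic is more involved.

```python
def media_range_matches(media_range, content_type):
    """Return True if one media range (e.g. 'text/*') covers content_type."""
    range_type, _, range_subtype = media_range.strip().lower().partition("/")
    ctype, _, csubtype = content_type.strip().lower().partition("/")
    type_ok = range_type in ("*", ctype)
    subtype_ok = range_subtype in ("*", csubtype)
    return type_ok and subtype_ok


def is_acceptable(accept_header, content_type):
    """Return True if any media range in the Accept header covers content_type.

    Parameters such as ';q=0.8' are dropped here (an assumption made to keep
    the sketch short).
    """
    for item in accept_header.split(","):
        media_range = item.split(";", 1)[0].strip()
        if media_range and media_range_matches(media_range, content_type):
            return True
    return False


def status_for(accept_header, available_types):
    """Return 200 if some available representation is acceptable, else 406."""
    for content_type in available_types:
        if is_acceptable(accept_header, content_type):
            return 200
    return 406
```

With the crawler's header, status_for("text/*", ["text/html"]) yields 200,
while a resource that only offers non-text types would yield 406. If the
pages really are text/html, a 406 suggests the negotiation is being tripped
up by something other than the page content itself.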

I have been looking into this by telnetting to the server and requesting the
pages in the same format, and I get the following HTTP response:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">


Found



The document has moved here.


It looks like (I know it is a bit cryptic) the web server is redirecting the
request to the main host URL, keeping the page name from one of the
virtual URLs.

Where the URL should be www.SiteZ/PageZ/,
it is actually www.SiteA/PageZ/ (where siteA is the primary site on the
box with its own IP).
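
The telnet check above can be repeated in a script. What follows is a rough
sketch in Python, assuming plain HTTP on port 80; the function names and the
hostnames in the usage note are placeholders, not the real sites. It sends
the same kind of request the crawler would (with Accept: text/*) and reports
which host, if any, the server redirects to.

```python
import socket
from urllib.parse import urlsplit


def build_request(host, path, accept="text/*"):
    """Build a raw HTTP/1.0 GET request with explicit Host and Accept headers."""
    lines = [
        "GET %s HTTP/1.0" % path,
        "Host: %s" % host,
        "Accept: %s" % accept,
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_head(raw):
    """Return (status_code, headers) parsed from a raw HTTP response."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    status = int(status_line.split()[1])
    headers = {}
    for line in header_lines:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return status, headers


def redirect_target_host(raw):
    """Return the hostname in the Location header of a redirect, else None."""
    status, headers = parse_head(raw)
    if status in (301, 302, 303, 307, 308) and "location" in headers:
        return urlsplit(headers["location"]).hostname
    return None


def fetch(host, path, accept="text/*", port=80, timeout=10):
    """Send the request over a plain socket and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(host, path, accept))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, redirect_target_host(fetch("www.sitez.example", "/PageZ/"))
would return the primary site's hostname if the server is rewriting the
virtual host's URL as described, and None if the page is served directly.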

Has anyone got any ideas about how I can deal with this? I heard mention of
an Apache module called 'mod_rewrite', but I'm not sure about it.

Thanks in advance..

Marcus Miller


Most, if not all, search engines do not like to be redirected. In fact, if you are listed and they spider you and see that the page is redirected, you will not only be dropped but effectively blacklisted without knowing it. You then have to ask to be respidered, and if they still find the redirection, they won't add you back.