
[cobalt-users] RE: cant submit pages to inktomi



Hi all,

I have searched not only the cobalt-users archives but, it feels like, the
whole internet for a solution to this problem. Now I turn to you...

We have a Cobalt RaQ4 with a number of virtual domains under one IP (we
will call the primary site siteA).

We have been trying to submit a number of pages from another site on the box
(we will call this siteZ) to the Inktomi database with no luck. They tell us
our pages are generating errors.

As a bit of background, these are dynamic pages and the pages we are trying
to submit are in the format:
www.siteZ.com/siteZpage1/
www.siteZ.com/siteZpage2/

This is the response from the Inktomi people:

The server is actually serving the inktomi crawler an HTTP 406 error code
because it seems it doesn't like the Accept: header which inktomi sends
(Accept: text/*). Normally this happens when people try and serve dynamic
content based on the Accept/User-Agent combination, and don't code for
crawler accesses. If you can check your content-negotiation code for errors
and update it so that it can handle text-only clients then that should solve
the problem.
----------------------------------------------------------------------------
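
To make the 406 explanation above concrete, here is a minimal sketch of how an
Accept header such as text/* is compared against the content types a server
has on offer. The function names (accept_matches, status_for) are hypothetical,
q-values are ignored, and this is not Apache's actual negotiation algorithm,
only an illustration of the rule. Note that text/* does match text/html, so a
406 suggests the server believes it has no text variant to offer for that URL.

```python
def accept_matches(accept_header, content_type):
    """Return True if content_type satisfies any media range in accept_header.

    Simplified on purpose: parameters such as q-values are ignored.
    """
    ctype, _, csub = content_type.lower().partition("/")
    for item in accept_header.split(","):
        media_range = item.split(";")[0].strip().lower()
        if not media_range:
            continue
        rtype, _, rsub = media_range.partition("/")
        if rtype == "*" and rsub == "*":
            return True  # */* accepts anything
        if rtype == ctype and rsub in ("*", csub):
            return True  # text/* or an exact type/subtype match
    return False


def status_for(accept_header, available_types):
    """Return 200 if any available type is acceptable, otherwise 406."""
    if any(accept_matches(accept_header, t) for t in available_types):
        return 200
    return 406
```

For example, status_for("text/*", ["text/html"]) gives 200, while
status_for("text/*", ["image/gif"]) gives 406.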

I have been looking into this by telnetting to the server and requesting the
pages in the same format, and I get the following HTTP response:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML>
<HEAD>
<TITLE>302 Found</TITLE>
</HEAD>
<BODY>
<H1>Found</H1>
The document has moved <A
HREF="http://www.siteA.co.uk/siteZpage1/">here</A>.<P>
</BODY>
</HTML>

It looks like (I know this is a bit cryptic) the web server is redirecting
the request to the main host's URL, keeping the page name from the virtual
site's URL.

Where the URL should be = www.SiteZ/PageZ/
it is actually          = www.SiteA/PageZ/
(where siteA is the primary site on the box, with its own IP)
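
The telnet check described above can be sketched in a few lines. This is only
an illustration under assumptions: build_request and redirect_target are
hypothetical helper names, the host names are the placeholders used in this
post, and the status line and Location header in the sample are assumed, since
only the HTML body of the response is shown above. The sketch builds the
request text with an explicit Host header and the crawler's Accept: text/*,
then pulls the status code and redirect host out of a raw response so the two
hosts can be compared.

```python
from urllib.parse import urlsplit


def build_request(host, path, accept="text/*"):
    """Build the raw HTTP/1.1 request text one would type into telnet."""
    lines = [
        "GET %s HTTP/1.1" % path,
        "Host: %s" % host,        # selects the name-based virtual host
        "Accept: %s" % accept,    # the header the crawler is said to send
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines)


def redirect_target(raw_response):
    """Return (status_code, host named in the Location header or None)."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    status = int(lines[0].split()[1])
    location = None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    host = urlsplit(location).hostname if location else None
    return status, host


# Assumed response headers, for illustration only.
sample = (
    "HTTP/1.1 302 Found\r\n"
    "Location: http://www.siteA.co.uk/siteZpage1/\r\n"
    "\r\n"
    "<HTML>...</HTML>"
)
status, host = redirect_target(sample)
```

If the host returned by redirect_target differs from the host that was
requested, the redirect is sending the client to a different site on the box.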

Has anyone got any ideas about how to deal with this? I have heard mention
of an Apache module called 'mod_rewrite', but I'm not sure whether it applies
here.

Thanks in advance..

Marcus Miller