
RE: [cobalt-users] Robots.txt



>Considering there is no record for "robots" or "robots.txt" in the
>archives, I'd like to know how to use the file robots.txt located under
>/etc/admserv/html of my Cobalt. Currently, it reads:

># Prevent all robots from visiting this site:

>User-agent: *
>Disallow: /

>How can I, for instance, forbid a robot from searching my pages? Considering
>the directory it is located in (under root), which virtual sites would it affect?

<thinking> there *was* a thread on this sometime back, although I'm damned
if I can recall the subject line. mmph...anyway.........

You really *can't* rely on robots.txt to keep a robot/spider out, since not
all spiders are well mannered and respect the directive. Blame the "public"
nature (deliberately designed this way) of the web for this difficulty. The
only practical answer is to password-protect (htaccess) the site(s) or
page(s) in question. The spiders may still find them, and they may appear on
search engines, but the password requirement will prevent anyone from
actually accessing the content.
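
If you go that route, something along these lines is the usual approach (a
minimal sketch only -- the paths, realm name, and username below are just
examples, and it assumes Apache is allowed to honour .htaccess overrides for
AuthConfig in that directory):

    # .htaccess placed in the directory you want to protect
    AuthType Basic
    AuthName "Members Only"
    AuthUserFile /home/sites/site1/.htpasswd
    Require valid-user

Create the password file with the htpasswd utility that ships with Apache:

    # -c creates the file the first time; omit it when adding more users
    htpasswd -c /home/sites/site1/.htpasswd someuser

Keep the .htpasswd file somewhere outside the web document root so it can't
be fetched directly by a browser or spider.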

Regards,
-Colin