Detecting search engine spiders

As the title says, here is a PHP script that detects when a page is visited by one of the spiders in a predefined list.

Search Engine Bot Detection

This page is made specifically for detecting when one of the search engines, and in particular Googlebot, visits my website. I have not seen Googlebot visit for quite a while, so this script ensures that when Google does come to visit and index my site, I will be notified about it. (An email is sent to me whenever one of the bots from the list requests this page.)
If you would like to use this PHP script, insert it in a .php page.

You can add it to any PHP page you like, anywhere in the page. A good idea is to put it on the index page, since a search spider is sure to hit that. If, like me, your index page is pure HTML, just make a separate PHP page that is linked from the index. Search engines will spider it eventually, and you will be notified about their visit.

Code:

bot.php (change the $to variable to your email address)

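<?php
// Substrings to look for in the User-Agent header (matched case-sensitively).
$botlist = array(
"Teoma",
"alexa",
"froogle",
"inktomi",
"looksmart",
"URL_Spider_SQL",
"Firefly",
"NationalDirectory",
"Ask Jeeves",
"TECNOSEEK",
"InfoSeek",
"WebFindBot",
"girafabot",
"crawler",
"www.galaxy.com",
"Googlebot",
"Scooter",
"Slurp",
"appie",
"FAST",
"WebBug",
"Spade",
"ZyBorg",
"rabaz");

// Read request data from $_SERVER; the bare globals relied on register_globals, which no longer exists.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : "";
$remoteAddr = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : "";
// REMOTE_HOST is only set when the server performs reverse DNS lookups; fall back to the IP address.
$remoteHost = isset($_SERVER['REMOTE_HOST']) ? $_SERVER['REMOTE_HOST'] : $remoteAddr;

foreach ($botlist as $bot) {

// ereg() was removed in PHP 7; strpos() performs a case-sensitive substring check.
if (strpos($userAgent, $bot) !== false) {

if ($bot == "Googlebot") {
// These prefixes are IP ranges, so compare them against the client IP address.
if (substr($remoteAddr, 0, 11) == "216.239.46.") $bot = "Googlebot Deep Crawl";
elseif (substr($remoteAddr, 0, 7) == "64.68.8") $bot = "Google Freshbot";
}
// Rebuild the requested URL, appending the query string only if there is one.
$url = "http://" . $_SERVER['SERVER_NAME'] . $_SERVER['PHP_SELF'];
if (!empty($_SERVER['QUERY_STRING'])) {
$url .= "?" . $_SERVER['QUERY_STRING'];
}

// settings
$to = "email@your-domain.com";
$subject = "Detected: $bot on $url";
$body = "$bot was detected on $url\n\n
Date.............: " . date("F j, Y, g:i a") . "
Page.............: " . $url . "
Robot Name.......: " . $userAgent . "
Robot Address....: " . $remoteAddr . "
Robot Host.......: " . $remoteHost . "
";

mail($to, $subject, $body);
// Stop at the first match so that a single visit sends a single email.
break;
}
}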
?>

Source: http://nes-emulator.com/x_bot.php
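
The matching step of bot.php can also be isolated in a small function, which makes it easy to try out without sending any email. The following is only a minimal sketch: the function name detect_bot, the shortened list and the sample user-agent strings are assumptions made for illustration, not part of the original script.

```php
<?php
// Hypothetical helper, not part of the original bot.php:
// returns the first entry of $botlist found in $userAgent, or null if none matches.
function detect_bot($userAgent, array $botlist)
{
    foreach ($botlist as $bot) {
        // Case-sensitive substring check against the User-Agent header.
        if (strpos($userAgent, $bot) !== false) {
            return $bot;
        }
    }
    return null;
}

// Shortened list for illustration only.
$botlist = array("Teoma", "Googlebot", "Slurp");

// Sample user-agent strings (assumed values for the example).
$hit  = detect_bot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)", $botlist);
$miss = detect_bot("Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0", $botlist);
```

Here $hit holds the matched list entry and $miss is null. Keep in mind that the User-Agent header is supplied by the client and can be set to anything, so a match only shows what the visitor claims to be.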
