Detecting bots from User Agent

It is often important to detect bots, and one approach is to inspect the User Agent.
Keywords to find bots

Many bots identify themselves with words such as "bot" or "crawler", or with the function they perform, such as "Analytics". By searching the User Agent for these keywords, you can detect a good portion of bots. Many sites define bots more specifically, but we have decided to use broader keywords. So instead of looking for googlebot or bingbot, we just check whether "bot" is present.

$botUserAgentPattern = '/bot|crawl|slurp|spider|mediapartners|sistrix|summify|analyzer|archiver|webmon|httrack|censysinspect|zgrab|survey|cURL|http|libwww|l9tcpid|bing|google|facebook|coccoc|research|biglotron|GRequests|teoma|convera|gigablast|ptst|Cloudflare|\.com|\.org|python|WhatsApp|speedy|fluffy|bibnum\.bnf|findlink|panscient|IOI|ips-agent|expanseinc|findthatfile|ec2linkfinder|yeti|Aboundex|placid|yanga|Voyager|postrank|CyberPatrol|page2rss|linkdex|ezooms|heritrix|wget|wp_is_mobile|sogou|wotbox|ichiro|drupact|coccoc|integromedb|robot|\.infoproximic|changedetection|WeSEE:Search|SEO|Scaper|binlar|\.net|\.app|AddThis|lipperhey|Qwantify|BUbiNG|ltx71|index|ADmantX|Expanse|java|Request-Promise/i'; // Bot keywords
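To make the idea of generic keywords concrete, here is a minimal sketch. It uses a deliberately shortened pattern and three sample User Agent strings; the variable names and the samples are assumptions for illustration, not part of the pattern above.

```php
<?php
// Illustrative pattern: a few generic keywords instead of individual bot names
$genericPattern = '/bot|crawl|spider/i';

// Sample user agent strings (the last one is a typical desktop Chrome string)
$samples = [
    'googlebot' => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'bingbot'   => 'Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)',
    'chrome'    => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
];

$results = [];
foreach ($samples as $name => $userAgent) {
    // preg_match() returns 1 when the pattern matches and 0 when it does not
    $results[$name] = preg_match($genericPattern, $userAgent) === 1;
}

foreach ($results as $name => $isBot) {
    echo $name . ': ' . ($isBot ? 'bot' : 'not a bot') . PHP_EOL;
}
```

The single keyword "bot" catches both Googlebot and bingbot without naming either, while the browser string does not match.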

Using it in PHP

Below is an example of how it can be implemented. We also check whether the user agent or accept language is empty, which is often the case for bots.

function bot_detected() {
    // User agent; the header may be missing, so default to an empty string
    $user_agent = $_SERVER['HTTP_USER_AGENT'] ?? '';

    // Languages; this header is also often missing for bots
    $accept_language = $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '';

    // Bot keywords to look for
    $botUserAgentPattern = '/bot|crawl|slurp|spider|mediapartners|sistrix|summify|analyzer|archiver|webmon|httrack|censysinspect|zgrab|survey|cURL|http|libwww|l9tcpid|bing|google|facebook|coccoc|research|biglotron|GRequests|teoma|convera|gigablast|ptst|Cloudflare|\.com|\.org|python|WhatsApp|speedy|fluffy|bibnum\.bnf|findlink|panscient|IOI|ips-agent|expanseinc|findthatfile|ec2linkfinder|yeti|Aboundex|placid|yanga|Voyager|postrank|CyberPatrol|page2rss|linkdex|ezooms|heritrix|wget|wp_is_mobile|sogou|wotbox|ichiro|drupact|coccoc|integromedb|robot|\.infoproximic|changedetection|WeSEE:Search|SEO|Scaper|binlar|\.net|\.app|AddThis|lipperhey|Qwantify|BUbiNG|ltx71|index|ADmantX|Expanse|java|Request-Promise/i';

    // Compare the pattern with the user agent to check if this is a bot
    if (preg_match($botUserAgentPattern, $user_agent) || empty($user_agent) || empty($accept_language)) {
        // This is a bot
        return true;
    } else {
        // Not a bot (most likely)
        return false;
    }
}

Now you can easily call this function wherever you need it in your code.

// Check if the user is not a bot
if (bot_detected() == false) {
    // User is not a bot
}

// Check if the user is a bot
if (bot_detected() == true) {
    // User is a bot
}
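If you want to run the check against values other than the current request, for example in a test, one option is a variant that receives the header values as arguments. The sketch below is only an illustration: the function name is_bot_request and the shortened keyword list are assumptions, and the full pattern from above can be dropped in instead.

```php
<?php
/**
 * Sketch of a variant that takes the header values as arguments.
 * The name and the shortened pattern are illustrative assumptions.
 */
function is_bot_request(?string $userAgent, ?string $acceptLanguage): bool
{
    // Shortened keyword list for readability; swap in the full pattern if needed
    $pattern = '/bot|crawl|slurp|spider|curl|wget|python/i';

    // A missing or blank header is treated as a sign of a bot
    if (trim((string) $userAgent) === '' || trim((string) $acceptLanguage) === '') {
        return true;
    }

    // preg_match() returns 1 when any keyword is found in the user agent
    return preg_match($pattern, $userAgent) === 1;
}

// Typical call site: pass the request headers, or null when they are absent
$isBot = is_bot_request(
    $_SERVER['HTTP_USER_AGENT'] ?? null,
    $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? null
);
```

Because the function no longer reads $_SERVER itself, the same logic can be checked with any User Agent string you have collected from your logs.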

Bad bots

Please note that many bad bots try to hide that they are bots, so the User Agent alone is not enough to detect them. The User Agent field is easy to manipulate, which lets a client appear to be something other than what it actually is.

Good bots and services, on the other hand, can mostly be detected through the User Agent.
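Since the User Agent can be faked, a request that claims to be a well-known crawler can be double-checked with DNS. The sketch below assumes the commonly documented approach for Googlebot: a reverse lookup of the IP, a check that the host name ends in googlebot.com or google.com, and a forward lookup back to the same IP. The function name and the two optional resolver parameters are hypothetical; they default to PHP's gethostbyaddr() and gethostbynamel(), and the forward lookup only covers IPv4.

```php
<?php
/**
 * Sketch: confirm that an IP claiming to be Googlebot resolves to Google.
 * $reverseLookup and $forwardLookup are hypothetical hooks for testing.
 */
function is_verified_googlebot(string $ip, ?callable $reverseLookup = null, ?callable $forwardLookup = null): bool
{
    $reverseLookup = $reverseLookup ?? 'gethostbyaddr';
    $forwardLookup = $forwardLookup ?? 'gethostbynamel';

    // Step 1: reverse DNS. gethostbyaddr() returns the unmodified IP on
    // failure and false on malformed input.
    $host = $reverseLookup($ip);
    if (!is_string($host) || $host === $ip) {
        return false;
    }

    // Step 2: the host name must end in googlebot.com or google.com
    // (assumed domain list; adjust it for other crawlers)
    if (preg_match('/\.(googlebot|google)\.com$/i', $host) !== 1) {
        return false;
    }

    // Step 3: forward DNS must resolve back to the original IP (IPv4 only)
    $ips = $forwardLookup($host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

DNS lookups are slow compared to a regular expression, so a check like this is probably best reserved for requests whose User Agent already claims to be that crawler, with the result cached per IP.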
