IT Blog

How to block bad bots on an Apache server

2022-01-06

More and more web crawlers visit websites these days. Some are useful to you, such as the Google search engine crawler, and they respect the robots.txt protocol. Others behave badly and have a negative impact on your site.

How to block the bad bots

These crawlers are called bots. A bot is a software program that operates on the Internet and performs repetitive tasks. While some bot traffic comes from good bots, bad bots can have a large negative impact on a website or application, so we want to block them from visiting our site.

Normally, we can set rules in the robots.txt file, but bad bots do not respect these rules, so a robots.txt file only works for good bots.
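To make the point concrete, here is a minimal sketch of how a well-behaved crawler consults robots.txt before fetching a page, using Python's standard urllib.robotparser module. The robots.txt content, the helper name is_allowed, and the example.com URLs are assumptions made up for illustration. A bad bot simply skips this check, which is why robots.txt alone cannot stop it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: SemrushBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A polite crawler would call something like is_allowed(ROBOTS_TXT, "SemrushBot", "https://example.com/page") and stay away when the answer is False. Nothing on the server enforces that answer.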

To make sure the bad bots are blocked, we have to use the .htaccess file. Here are the steps:

1. Find the bot keyword in the User-Agent field of the server access log.

2. Add one of the following scripts to the top of the .htaccess file:

a) Set a rewrite condition:

It uses a regular expression to match multiple user agents in one line.

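# Turn on mod_rewrite processing
RewriteEngine On
# Match any listed keyword anywhere in the User-Agent header; [NC] ignores case
RewriteCond %{HTTP_USER_AGENT} "dataforseobot|Yandex|AhrefsBot|BLEXBot|SemrushBot" [NC]
# Leave the URL unchanged (-) and answer with 403 Forbidden ([F])
RewriteRule "^.*$" - [F,L]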

Or

b) Use SetEnvIf directives:

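# Set the badbots variable when the User-Agent contains a bot keyword (case-insensitive)
SetEnvIfNoCase User-Agent "AhrefsBot" badbots
SetEnvIfNoCase User-Agent "BLEXBot" badbots
SetEnvIfNoCase User-Agent "SemrushBot" badbots
SetEnvIfNoCase User-Agent "YandexBot" badbots
SetEnvIfNoCase User-Agent "dataforseobot" badbots

# Apache 2.2 access-control syntax; on Apache 2.4 it requires mod_access_compat
<Limit GET POST HEAD>
 Order Allow,Deny
 # Without an Allow line, Order Allow,Deny rejects every request
 Allow from all
 Deny from env=badbots
</Limit>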

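Both scripts have the same effect, so pick the one you prefer. The only difference is that script a) blocks any user agent containing "Yandex", while script b) blocks only those containing "YandexBot". With either script in place, the server returns 403 (Forbidden) to the bad bots.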
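Step 1 can be done by hand, but a short script helps on a busy site. The sketch below tallies user agents from access log lines and checks which ones the pattern from script a) would block. It assumes the Apache combined log format, where the user agent is the last quoted field on each line; the function names are made up for illustration, and user agents containing escaped quotes are not handled.

```python
import re
from collections import Counter

# Assumption: combined log format, so the user agent is the last quoted field.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')

# Same alternation as the RewriteCond in script a); re.IGNORECASE mirrors [NC].
BLOCK_PATTERN = re.compile(
    r"dataforseobot|Yandex|AhrefsBot|BLEXBot|SemrushBot", re.IGNORECASE
)


def tally_user_agents(log_lines):
    """Count how often each user agent string appears in the log lines."""
    counts = Counter()
    for line in log_lines:
        match = LAST_QUOTED_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def would_be_blocked(user_agent):
    """Return True if the pattern matches anywhere in the user agent."""
    return BLOCK_PATTERN.search(user_agent) is not None
```

Passing an open log file to tally_user_agents and calling most_common(20) on the result lists the busiest user agents. Any frequent visitor that looks unwanted is a candidate keyword for the block list.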

Test the effects

Here is an example that uses the Chrome browser; other browsers are similar.

1. Open the developer tools by right-clicking anywhere on the page and clicking Inspect.

2. In the developer tools menu, click More tools -> Network conditions.

3. Uncheck "Use browser default" and enter the bot keyword as the user agent.

4. Reload the page. The server should now respond with 403 (Forbidden).

Done.
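The same check can be scripted instead of clicking through the browser. The following is a minimal sketch using only Python's standard library: it sends a GET request with a chosen User-Agent header and returns the HTTP status code. The function name fetch_status is an assumption for illustration, and the URL you pass must be your own site.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Request url with the given User-Agent and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is what we want.
        return err.code
```

If the rules work, calling fetch_status with a user agent such as "SemrushBot" should return 403, while a normal browser user agent should still return 200.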

Author: IT Team

Tags: Blog
Last updated:2022-01-06

COPYRIGHT © 2021 Hostlike IT Blog. All rights reserved.

This site is supported by Hostlike.com