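Junk spiders to block: speed up your site by blocking useless foreign crawlers so their crawling does not eat your bandwidth!

Common malicious and junk web crawlers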

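1. MJ12bot

MJ12bot is the web crawler of Majestic, a well-known UK SEO company. It crawls pages for people doing SEO work and brings no traffic to your site.

2. AhrefsBot

AhrefsBot is the web crawler of Ahrefs, a well-known SEO company. It likewise crawls pages for SEO professionals and brings no traffic to your site.

3. SemrushBot

SemrushBot is also the crawler of an SEO and marketing company (Semrush).

4. DotBot

DotBot is the web crawler of Moz.com. The data it collects supports Moz tools and similar products.

5. MauiBot

MauiBot differs from the other crawlers: it does not even have a website, and its UA shows only an email address: "MauiBot (crawler.feedback+wc@gmail.com)". Surprisingly, although it looks like a personal crawler, it honors robots.txt, which makes it a breath of fresh air among junk crawlers.

6. MegaIndex.ru

This is the spider of a site that offers backlink lookups, so it crawls sites mainly to analyze links and does nothing useful for you. It honors robots.txt.

7. BLEXBot

This is the spider of WebMeUp. It collects the links found on websites and is of no use to us. It honors robots.txt.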

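The list of junk spiders to block when building a site! They are useless and only waste server bandwidth.

Method 1: rewrite rules

Insert the following into the rewrite (pseudo-static) settings in the BaoTa panel: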

# "~" is a case-sensitive regex match against the User-Agent header.
if ($http_user_agent ~ AhrefsBot) { return 403; }
if ($http_user_agent ~ YandexBot) { return 403; }
if ($http_user_agent ~ MJ12bot) { return 403; }
if ($http_user_agent ~ DotBot) { return 403; }
if ($http_user_agent ~ RU_Bot) { return 403; }
if ($http_user_agent ~ Ezooms) { return 403; }
if ($http_user_agent ~ Yeti) { return 403; }
if ($http_user_agent ~ BLEXBot) { return 403; }
if ($http_user_agent ~ Exabot) { return 403; }
if ($http_user_agent ~ YisouSpider) { return 403; }
if ($http_user_agent ~ sandcrawlerbot) { return 403; }
if ($http_user_agent ~ ShopWiki) { return 403; }
if ($http_user_agent ~ Genieo) { return 403; }
if ($http_user_agent ~ Aboundex) { return 403; }
if ($http_user_agent ~ coccoc) { return 403; }
# MegaIndex also matches "MegaIndex.ru", so no separate rule is needed.
if ($http_user_agent ~ MegaIndex) { return 403; }
if ($http_user_agent ~ spbot) { return 403; }
if ($http_user_agent ~ SemrushBot) { return 403; }
if ($http_user_agent ~ TwengaBot) { return 403; }
if ($http_user_agent ~ SEOkicks-Robot) { return 403; }
# Caution: this also matches requests that WordPress sites send themselves
# (pingbacks, wp-cron loopbacks); drop it if your own site runs WordPress.
if ($http_user_agent ~ WordPress) { return 403; }
if ($http_user_agent ~ BUbiNG) { return 403; }
if ($http_user_agent ~ PetalBot) { return 403; }
if ($http_user_agent ~ Adsbot) { return 403; }
if ($http_user_agent ~ NetcraftSurveyAgent) { return 403; }
if ($http_user_agent ~ Barkrowler) { return 403; }
if ($http_user_agent ~ serpstatbot) { return 403; }
if ($http_user_agent ~ DataForSeoBot) { return 403; }
if ($http_user_agent ~ Amazonbot) { return 403; }
if ($http_user_agent ~ ClaudeBot) { return 403; }
if ($http_user_agent ~ GPTBot) { return 403; }

=========================

Insert these rules before all other rewrite rules!
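The per-bot checks can also be folded into a single rule. The following is only a minimal sketch, assuming the same rewrite (server-level) context as the rules above; the shortened name list is illustrative and should be extended to match your own list. It uses `~*`, the case-insensitive regex operator, so spelling variants such as "MJ12Bot" and "MJ12bot" are both caught.

```nginx
# Sketch: one case-insensitive check instead of one "if" per crawler.
# The names below are a subset of the list above; add or remove entries as needed.
if ($http_user_agent ~* "(AhrefsBot|MJ12bot|SemrushBot|DotBot|BLEXBot|MegaIndex|DataForSeoBot|GPTBot|ClaudeBot)") {
    return 403;
}
```

A single alternation keeps the list in one place, which makes it easier to review than dozens of separate blocks. The trade-off is that case-insensitive matching is broader, so short names are more likely to hit unrelated user agents.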

Method 2: create a robots.txt file and insert the following

User-agent: AhrefsBot
Disallow: /

User-agent: YandexBot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: RU_Bot
Disallow: /

User-agent: Yeti
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: YisouSpider
Disallow: /

User-agent: sandcrawlerbot
Disallow: /

User-agent: Genieo
Disallow: /

User-agent: Aboundex
Disallow: /

User-agent: MegaIndex
Disallow: /

User-agent: spbot
Disallow: /

User-agent: TwengaBot
Disallow: /

User-agent: SEOkicks-Robot
Disallow: /

User-agent: BUbiNG
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: NetcraftSurveyAgent
Disallow: /

User-agent: Barkrowler
Disallow: /

User-agent: MegaIndex.ru
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: GPTBot
Disallow: /

=======================

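With the first method, a junk spider's request is rejected outright with 403 Forbidden!

The second method simply tells the spider it is not welcome, so it only works for crawlers that honor robots.txt.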
