scrapy-random-useragent

Scrapy Middleware to set a random User-Agent for every Request.

  • Owner: cnu/scrapy-random-useragent
  • License: MIT License

Scrapy Random User-Agent

Does your Scrapy spider get identified and blocked by servers because
it sends the default user-agent or a generic one?

Use this random_useragent module to set a random user-agent for
every request. The variety is limited only by the number of different
user-agents you list in a text file.

Installing

Install the package with pip:

.. code-block:: bash

    pip install scrapy-random-useragent

Usage

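In your settings.py file, update the DOWNLOADER_MIDDLEWARES
setting like this:

.. code-block:: python

    DOWNLOADER_MIDDLEWARES = {
        # Disable Scrapy's built-in middleware. On Scrapy releases before 1.0 the
        # path was 'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware'.
        'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
        # Enable the random user-agent middleware at priority 400.
        'random_useragent.RandomUserAgentMiddleware': 400,
    }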

This disables the default UserAgentMiddleware and enables the
RandomUserAgentMiddleware.

Then add a USER_AGENT_LIST setting containing the path to a text
file that lists your user-agents, one per line.

.. code-block:: python

    USER_AGENT_LIST = "/path/to/useragents.txt"
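
The file format is simply one user-agent string per line. As a rough
illustration of how such a file could be read, here is a small sketch.
``load_user_agents`` is a hypothetical helper, not part of this package,
and skipping blank lines is an assumption made for the example.

```python
from pathlib import Path


def load_user_agents(path):
    """Return the non-empty lines of a text file, one user-agent per line.

    Hypothetical helper for illustration; not part of scrapy-random-useragent.
    """
    text = Path(path).read_text(encoding="utf-8")
    # Strip surrounding whitespace and drop blank lines (an assumption here).
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Running a helper like this against your own file is a quick way to check
that it contains the entries you expect before starting a crawl.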

Now all the requests from your crawler will have a random user-agent
picked from the text file.
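
Conceptually, a downloader middleware of this kind only needs to overwrite
the User-Agent header on each outgoing request. The following is a minimal
sketch of that idea in plain Python, not the package's actual source.
``RandomUserAgentSketch`` and the ``SimpleNamespace`` stand-in for a request
are assumptions made so the example runs without Scrapy installed; only the
``process_request(request, spider)`` hook name follows Scrapy's
downloader-middleware interface.

```python
import random
from types import SimpleNamespace


class RandomUserAgentSketch:
    """Illustrative only: picks a random user-agent for each request."""

    def __init__(self, user_agents, rng=None):
        if not user_agents:
            raise ValueError("at least one user-agent is required")
        self.user_agents = list(user_agents)
        # An injectable RNG keeps the sketch testable; an assumption, not package API.
        self.rng = rng or random.Random()

    def process_request(self, request, spider):
        # Overwrite the header so every request carries a freshly chosen value.
        request.headers["User-Agent"] = self.rng.choice(self.user_agents)
        # Returning None tells Scrapy to continue processing the request normally.
        return None


# Stand-in for a Scrapy request: any object with a dict-like `headers` attribute.
request = SimpleNamespace(headers={})
middleware = RandomUserAgentSketch(["Agent/1.0", "Agent/2.0"])
middleware.process_request(request, spider=None)
```

Because the choice is made independently for each request, the same
user-agent can appear on consecutive requests, especially with a short list.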

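Key metrics

Overview

  • Name and owner: cnu/scrapy-random-useragent
  • Primary language: Python
  • Languages: Python (1 language)
  • License: MIT License

Owner activity

  • Created: 2014-12-25 12:29:23
  • Last push: 2019-08-16 21:29:30
  • Last commit: 2016-06-11 12:44:40
  • Releases: 2
  • Latest release: 0.2 (published 2016-06-11 12:44:50)
  • First release: 0.1 (published 2014-12-25 20:07:38)

User engagement

  • Stars: 202
  • Watchers: 9
  • Forks: 48
  • Commits: 13
  • Issues: 7
  • Open issues: 6
  • Pull requests: 2
  • Open pull requests: 3
  • Closed pull requests: 0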