scrapy-random-useragent

Scrapy Middleware to set a random User-Agent for every Request.

  • Owner: cnu/scrapy-random-useragent
  • License: MIT License


Scrapy Random User-Agent

Does your Scrapy spider get identified and blocked by servers because
it sends the default user-agent or a generic one?

Use the random_useragent module to set a random user-agent for
every request. The only limit is the number of different
user-agents you list in a text file.

Installing

Installing it is pretty simple.

.. code-block:: bash

    pip install scrapy-random-useragent

Usage

In your settings.py file, update the DOWNLOADER_MIDDLEWARES
variable like this.

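.. code-block:: python

    DOWNLOADER_MIDDLEWARES = {
        # Built-in middleware path for Scrapy 1.0 and later. Older releases used
        # 'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware'.
        'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
        'random_useragent.RandomUserAgentMiddleware': 400,
    }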

This disables the default UserAgentMiddleware and enables the
RandomUserAgentMiddleware.

Then add a USER_AGENT_LIST setting that holds the path to a text
file listing your user-agents, one user-agent per line.

.. code-block:: python

    USER_AGENT_LIST = "/path/to/useragents.txt"
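
The file format is simply one user-agent string per line. As a minimal sketch of producing such a file, the helper below writes a list of strings to disk and skips blank entries. The function name ``write_user_agent_file``, the sample strings, and the temporary target path are all assumptions made for illustration; they are not part of this package.

```python
import tempfile
from pathlib import Path

# Sample values for illustration only; replace them with the user-agents you want to rotate.
SAMPLE_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def write_user_agent_file(path, user_agents):
    """Write one user-agent per line (hypothetical helper); return the line count."""
    # Drop blank entries and surrounding whitespace so every line is a usable value.
    cleaned = [ua.strip() for ua in user_agents if ua.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)


if __name__ == "__main__":
    # The temp-directory location is an arbitrary choice for this sketch.
    target = Path(tempfile.gettempdir()) / "useragents.txt"
    count = write_user_agent_file(target, SAMPLE_USER_AGENTS)
    print(f"wrote {count} user-agents to {target}")
```

The resulting path is what USER_AGENT_LIST would point to.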

Now all the requests from your crawler will have a random user-agent
picked from the text file.
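
To make the idea concrete, here is a rough sketch of how a middleware of this kind could pick a user-agent per request. It is not the package's source code: the class names ``RandomUserAgentSketch`` and ``FakeRequest`` are hypothetical, and the fake request only imitates the ``headers`` mapping of a Scrapy request so the example runs without Scrapy installed. The ``process_request(request, spider)`` method name follows Scrapy's downloader middleware interface.

```python
import random


class RandomUserAgentSketch:
    """Illustrative middleware-style class; not the package's implementation."""

    def __init__(self, user_agents):
        # Keep only non-empty lines so a trailing newline never yields "".
        self.user_agents = [ua.strip() for ua in user_agents if ua.strip()]

    @classmethod
    def from_file(cls, path):
        # Read a file with one user-agent per line.
        with open(path, encoding="utf-8") as handle:
            return cls(handle.readlines())

    def process_request(self, request, spider=None):
        # Returning None tells Scrapy to keep processing the request normally.
        if self.user_agents:
            request.headers["User-Agent"] = random.choice(self.user_agents)
        return None


class FakeRequest:
    """Stand-in for a Scrapy request, used only to exercise the sketch."""

    def __init__(self):
        self.headers = {}


if __name__ == "__main__":
    middleware = RandomUserAgentSketch(["agent-a/1.0\n", "\n", "agent-b/2.0"])
    request = FakeRequest()
    middleware.process_request(request)
    # Prints one of the two configured agents, chosen at random.
    print(request.headers["User-Agent"])
```

Choosing with ``random.choice`` on every request means consecutive requests can repeat the same user-agent; that is acceptable for a sketch but worth knowing if the list is short.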

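Key metrics

Overview
  • Name and owner: cnu/scrapy-random-useragent
  • Primary language: Python
  • Languages: Python (1 language)
  • License: MIT License

Owner activity
  • Created: 2014-12-25 12:29:23
  • Last push: 2019-08-16 21:29:30
  • Last commit: 2016-06-11 12:44:40
  • Releases: 2
  • Latest release: 0.2 (released 2016-06-11 12:44:50)
  • First release: 0.1 (released 2014-12-25 20:07:38)

User engagement
  • Stars: 202
  • Watchers: 9
  • Forks: 48
  • Commits: 13
  • Issues: 7
  • Open issues: 6
  • Pull requests: 2
  • Open pull requests: 3
  • Closed pull requests: 0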