crawlera-tools

Crawlera tools

  • 所有者: scrapinghub/crawlera-tools
  • 平台:
  • 許可證:
  • 分類:
  • 主題:
  • 喜歡:
    0
      比較:

Github星跟蹤圖

==============
Crawlera tools

This repository contains tools for the Crawlera service_.

crawlera-bench

crawlera-bench can be used to benchmark Crawlera with your domain. It needs
a file with a list of urls (one per line).

Quick start::

$ wget https://raw.githubusercontent.com/scrapinghub/crawlera-tools/master/crawlera-bench
$ chmod a+x crawlera-bench

Usage::

crawlera-bench urls.txt -u USER -p PASSWORD

For more usage info see::

crawlera-bench -h

The output would something like this::

Concurrency     : 100
Timeout         : 120 sec
Report interval : 1 sec
Unit            : requests per 1 sec

time                netloc                           all   2xx   3xx   4xx   5xx   503   t/o   err, minw     maxw
2014-04-23 17:29:44 www.somesite.com                   0     9     0     0     0     0     0     0, 0.929   13.958
2014-04-23 17:29:45 www.somesite.com                   0     4     0     0     0     0     0     0, 0.846   49.655
2014-04-23 17:29:46 www.somesite.com                   0    14     0     0     0     0     0     0, 0.940   50.097
2014-04-23 17:29:47 www.somesite.com                   0    12     0     0     0     0     0     0, 0.999   41.884
2014-04-23 17:29:48 www.somesite.com                   0    17     0     0     0     0     0     0, 0.932   22.537
2014-04-23 17:29:49 www.somesite.com                   0    28     0     0     0     0     0     0, 0.806   15.329
2014-04-23 17:29:50 www.somesite.com                   0    23     0     0     0     0     0     0, 0.577    9.809
2014-04-23 17:29:51 www.somesite.com                   0    33     0     0     0     0     0     0, 0.602   42.200
2014-04-23 17:29:52 www.somesite.com                   0    36     0     0     0     0     0     0, 0.489   46.377
2014-04-23 17:29:53 www.somesite.com                   0    33     0     0     0     0     0     0, 0.478   18.375
2014-04-23 17:29:54 www.somesite.com                   0    42     0     0     0     0     0     0, 0.430   16.562
2014-04-23 17:29:55 www.somesite.com                   0    49     0     0     0     0     0     0, 0.459   36.815
2014-04-23 17:29:56 www.somesite.com                   0    48     0     0     0     0     0     0, 0.464   13.926
2014-04-23 17:29:57 www.somesite.com                   0    40     0     0     0     0     0     0, 0.610   26.006
2014-04-23 17:29:58 www.somesite.com                   0    51     0     0     0     0     0     0, 0.974    6.083
2014-04-23 17:29:59 www.somesite.com                   0    38     0     0     0     0     0     0, 0.980   42.102
2014-04-23 17:30:00 www.somesite.com                   0    54     0     0     0     0     0     0, 0.663   14.737

Some columns may require an explanation such as:

  • 2xx, 3xx, ... : requests with response code in the 2xx, 3xx, ... range
  • all : all requests combined
  • t/o : requests that timed out
  • err : requests with errors (connection or HTTP errors)
  • minw : minimum request wait time found in the last interval
  • maxw : maximum request wait time found in the last interval

Other tools available

  • scrapy-crawlera_: A Scrapy downloader middleware for Crawlera

.. _Crawlera service: http://crawlera.com/
.. _scrapy-crawlera: https://github.com/scrapinghub/scrapy-crawlera

主要指標

概覽
名稱與所有者scrapinghub/crawlera-tools
主編程語言Python
編程語言Python (語言數: 1)
平台
許可證
所有者活动
創建於2014-04-23 17:31:09
推送於2016-02-09 14:55:41
最后一次提交2015-08-03 12:05:55
發布數0
用户参与
星數26
關注者數121
派生數10
提交數71
已啟用問題?
問題數3
打開的問題數2
拉請求數7
打開的拉請求數1
關閉的拉請求數2
项目设置
已啟用Wiki?
已存檔?
是復刻?
已鎖定?
是鏡像?
是私有?