Scylla

Intelligent proxy pool for Humans™ [Maintainer needed]


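Scylla is a high-quality, free proxy IP pool tool that supports only Python 3.6. Its features:

  • Automated proxy IP crawling and validation
  • Easy-to-use JSON API
  • Simple but beautiful web user interface built with TypeScript and React (e.g., geographical distribution of proxies)
  • Starts with as little as one command
  • Concise, straightforward programming API (to be added in version 1.1)
  • Integrates with Scrapy and requests in as little as one line of code
  • Headless browser crawling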


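Key metrics

Overview
Name and owner: imWildCat/scylla
Primary programming language: Python
Programming languages: Python (number of languages: 9)
Platforms: Docker, Linux, Mac, Windows
License: Apache License 2.0
Owner activity
Created: 2018-04-10 09:55:11
Pushed: 2025-02-20 16:27:00
Last commit
Releases: 14
Latest release name: 1.2.0
First release name: 0.1.3
User engagement
Stars: 4k
Watchers: 77
Forks: 475
Commits: 389
Issues enabled?
Issues: 93
Open issues: 42
Pull requests: 87
Open pull requests: 5
Closed pull requests: 23
Project settings
Wiki enabled?
Archived?
Is a fork?
Locked?
Is a mirror?
Is private?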


An intelligent proxy pool for humans. It supports only Python 3.6. Key
features:

  • Automatic proxy IP crawling and validation
  • Easy-to-use JSON API
  • Simple but beautiful web-based user interface (e.g., geographical
    distribution of proxies)
  • Get started with as little as one command
  • Simple HTTP forward proxy server
  • [Scrapy] and [requests] integration with as little as one line of
    code
  • Headless browser crawling
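
As a rough illustration of the one-line integration mentioned in the list above, the sketch below builds a proxies mapping that points at the forward proxy. It assumes the forward proxy listens on 127.0.0.1:8081 (the second port published by the Docker command in this README; this section does not say which service uses it). The helper names scylla_proxy_settings and build_opener_via_scylla are hypothetical and not part of Scylla.

```python
import urllib.request


def scylla_proxy_settings(host="127.0.0.1", port=8081):
    """Return a proxies mapping in the shape that requests and urllib accept.

    Hypothetical helper: the default port 8081 is an assumption.
    """
    return {"http": f"http://{host}:{port}"}


def build_opener_via_scylla(host="127.0.0.1", port=8081):
    """Build a urllib opener that routes plain-HTTP traffic through the proxy."""
    handler = urllib.request.ProxyHandler(scylla_proxy_settings(host, port))
    return urllib.request.build_opener(handler)


# With the requests library, the "one line" would be the proxies argument:
#   requests.get("http://example.com", proxies=scylla_proxy_settings())
```

Nothing here contacts the network until the opener (or requests) is actually used, so the mapping can be built before Scylla is up.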

For those who prefer Chinese, please read the Chinese Documentation (中文文档).

Get started

Installation

Install with Docker

docker run -d -p 8899:8899 -p 8081:8081 -v /var/www/scylla:/var/www/scylla --name scylla wildcat/scylla:latest

Install directly via pip

pip install scylla
scylla --help
scylla # Run the crawler and web server for JSON API

Install from source

git clone https://github.com/imWildCat/scylla.git
cd scylla

pip install -r requirements.txt

npm install # or yarn install
make assets-build

python -m scylla
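
Windows users may fail to install sanic because uvloop does not currently support Windows. In that case, install sanic without uvloop and ujson. The commands below use export, so run them in a Unix-like shell such as Git Bash; in cmd.exe, use set instead of export:

export SANIC_NO_UVLOOP=true
export SANIC_NO_UJSON=true
pip3 install sanic

If this also fails, you will need to manually install sanic from source.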

Usage

This is an example of running a service locally (localhost), using
port 8899.

Note: The first time you use Scylla, you might have to wait 1 to 2 minutes for some proxy IPs to be populated in the database.
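
A script that depends on the pool may want to wait out this warm-up period instead of failing at once. The following is a minimal sketch of such a polling loop. The function wait_for_proxies and its count_proxies callback are hypothetical names; the callback is whatever zero-argument function the caller supplies to report how many proxies are currently available.

```python
import time


def wait_for_proxies(count_proxies, timeout=180.0, interval=5.0, sleep=time.sleep):
    """Poll until count_proxies() returns a positive number or the timeout expires.

    count_proxies is any zero-argument callable supplied by the caller
    (hypothetical; for example, one that queries the JSON API and counts entries).
    Returns the first positive count, or 0 if the timeout elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            count = count_proxies()
        except OSError:
            # The server may still be starting; treat this as "nothing yet".
            count = 0
        if count > 0:
            return count
        if time.monotonic() >= deadline:
            return 0
        sleep(interval)
```

The default timeout of 180 seconds is an arbitrary choice that leaves some margin over the 1 to 2 minutes mentioned in the note.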

JSON API

Proxy IP List

http://localhost:8899/api/v1/proxies
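
The endpoint can be queried from Python with the standard library alone. The sketch below is illustrative only: it assumes the response is a JSON object containing a "proxies" list whose items have "ip" and "port" fields, which this section does not spell out. The function names extract_addresses and fetch_addresses are hypothetical.

```python
import json
import urllib.request

API_URL = "http://localhost:8899/api/v1/proxies"


def extract_addresses(payload):
    """Return "ip:port" strings from a decoded API response.

    Assumes the response is a JSON object with a "proxies" list whose
    items carry "ip" and "port" fields; items missing either are skipped.
    """
    addresses = []
    for item in payload.get("proxies", []):
        ip, port = item.get("ip"), item.get("port")
        if ip is not None and port is not None:
            addresses.append(f"{ip}:{port}")
    return addresses


def fetch_addresses(url=API_URL, timeout=10):
    """Fetch the proxy list from a running Scylla instance and extract addresses."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return extract_addresses(payload)
```

Calling fetch_addresses() requires a running Scylla instance on localhost:8899; extract_addresses can be exercised on any already-decoded payload.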

Optional URL parameters: