# Description
A library to guide a web crawl using PageRank, HITS, or other ranking algorithms based on the link structure of the web graph, even when performing large crawls (on the order of one billion pages).
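To illustrate the idea of link-based crawl prioritization, here is a minimal PageRank power-iteration sketch in plain Python. This is illustrative only, not Aduana's implementation; the toy graph, damping factor, and iteration count are assumptions for the example.

```python
# Minimal PageRank power iteration (illustrative sketch, not Aduana's code).
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Base teleportation mass, then distribute each page's rank
        # along its outgoing links.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank uniformly.
                for p in pages:
                    new[p] += damping * rank[page] / n
        rank = new
    return rank

# Toy web graph: pages with more incoming link mass score higher,
# so a crawler would fetch them first.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # "c" collects the most link mass
```

In a frontier-guided crawl, scores like these decide which discovered URLs get fetched next, rather than being computed once over a finished graph.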
Warning: I test regularly only under Linux, my development platform. From time to time I also test on OS X and on Windows 8 using MinGW64.
# Installation
```shell
pip install aduana
```
# Documentation
Documentation is available at readthedocs. I have started documenting plans and ideas on the wiki.
# Example
Single spider example:

```shell
cd example
pip install -r requirements.txt
scrapy crawl example
```
To run the distributed crawler, see the docs.