portia

Visual scraping for Scrapy

  • Owner: scrapinghub/portia
  • License: BSD 3-Clause "New" or "Revised" License


Portia

Portia is a tool that lets you scrape websites visually, without any programming knowledge. You annotate a web page to mark the data you want to extract, and Portia uses those annotations to work out how to scrape the same data from similar pages.

Running Portia

The easiest way to run Portia is with Docker, using the official Portia image:

docker run -v ~/portia_projects:/app/data/projects:rw -p 9001:9001 scrapinghub/portia
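
The command above can be broken into its parts with a small wrapper. This is a minimal sketch rather than anything Portia ships: `PROJECTS_DIR`, `HOST_PORT`, and `portia_cmd` are hypothetical names introduced here for illustration. It only prints the `docker run` invocation, so the mount and port mapping can be checked before anything is started.

```shell
#!/bin/sh
# Sketch only: PROJECTS_DIR, HOST_PORT and portia_cmd are illustrative names,
# not part of Portia itself.

# Host folder where projects are kept (default taken from the command above).
PROJECTS_DIR="${PROJECTS_DIR:-$HOME/portia_projects}"
# Port published on the host; the container side stays at 9001.
HOST_PORT="${HOST_PORT:-9001}"

# Print the docker invocation instead of running it.
portia_cmd() {
    # -v <host dir>:/app/data/projects:rw  mounts the host folder read-write
    # -p <host port>:9001                  publishes container port 9001
    printf 'docker run -v %s:/app/data/projects:rw -p %s:9001 scrapinghub/portia\n' \
        "$PROJECTS_DIR" "$HOST_PORT"
}

portia_cmd
```

With the defaults this prints the same command as shown above, with `~` expanded to your home directory. Setting `HOST_PORT` or `PROJECTS_DIR` before calling `portia_cmd` changes only the host side of each mapping. Creating the host folder yourself first (`mkdir -p "$PROJECTS_DIR"`) avoids leaving that step to Docker.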

You can also set up a local instance with Docker Compose by cloning this repository and running the following from its root folder:

docker-compose up

For more detailed instructions, and alternatives to using Docker, see the Installation docs.

Documentation

Documentation is available on Read the Docs. The source files are in the docs directory.

Main metrics

Overview
Name With Owner: scrapinghub/portia
Primary Language: Python
Program language: Shell (Language Count: 8)
License: BSD 3-Clause "New" or "Revised" License
Owner Activity
Created At: 2014-03-21 14:24:31
Pushed At: 2024-06-26 19:43:46
Last Commit At: 2019-07-10 13:43:34
Release Count: 40
Last Release Name: slybot-0.13.3
First Release Name: slybot
User Engagement
Stargazers Count: 9.4k
Watchers Count: 497
Fork Count: 1.4k
Commits Count: 2.7k
Has Issues Enabled
Issues Count: 451
Issue Open Count: 111
Pull Requests Count: 408
Pull Requests Open Count: 19
Pull Requests Close Count: 55
Project Settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private