======
Parsel
======

.. image:: https://img.shields.io/travis/scrapy/parsel/master.svg
   :target: https://travis-ci.org/scrapy/parsel
   :alt: Build Status

.. image:: https://img.shields.io/pypi/v/parsel.svg
   :target: https://pypi.python.org/pypi/parsel
   :alt: PyPI Version

.. image:: https://img.shields.io/codecov/c/github/scrapy/parsel/master.svg
   :target: http://codecov.io/github/scrapy/parsel?branch=master
   :alt: Coverage report

Parsel is a Python library for extracting data from HTML and XML documents using XPath and CSS selectors.

Features
========

* Extract text using CSS or XPath selectors
* Remove elements using CSS or XPath selectors
* Regular expression helper methods

Example (`open online demo`_)::

    >>> from parsel import Selector
    >>> sel = Selector(text=u"""<html>
            <body>
                <h1>Hello, Parsel!</h1>
                <ul>
                    <li><a href="http://example.com">Link 1</a></li>
                    <li><a href="http://scrapy.org">Link 2</a></li>
                </ul>
            </body>
            </html>""")
    >>>
    >>> sel.css('h1::text').get()
    'Hello, Parsel!'
    >>>
    >>> sel.css('h1::text').re(r'\w+')
    ['Hello', 'Parsel']
    >>>
    >>> for e in sel.css('ul > li'):
    ...     print(e.xpath('.//a/@href').get())
    http://example.com
    http://scrapy.org

.. _open online demo: https://colab.research.google.com/drive/149VFa6Px3wg7S3SEnUqk--TyBrKplxCN#forceEdit=true&sandboxMode=true

Main metrics
============

Overview
--------

* Name with owner: scrapy/parsel
* Primary language: Python
* Programming languages: Python (language count: 1)
* License: BSD 3-Clause "New" or "Revised" License

Owner activity
--------------

* Created at: 2015-04-24 15:53:36
* Pushed at: 2025-05-12 06:07:25
* Last commit at: 2025-05-12 10:06:42
* Release count: 26
* Last release: v1.10.0 (posted on 2024-12-16 16:54:23)
* First release: v0.9.0

User engagement
---------------

* Stargazers: 1.2k
* Watchers: 35
* Forks: 150
* Commits: 814
* Issues: 122
* Open issues: 32
* Pull requests: 154
* Open pull requests: 12
* Closed pull requests: 31