parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors.


===============================
Parsel
===============================

.. image:: https://img.shields.io/travis/scrapy/parsel/master.svg
   :target: https://travis-ci.org/scrapy/parsel
   :alt: Build Status

.. image:: https://img.shields.io/pypi/v/parsel.svg
   :target: https://pypi.python.org/pypi/parsel
   :alt: PyPI Version

.. image:: https://img.shields.io/codecov/c/github/scrapy/parsel/master.svg
   :target: http://codecov.io/github/scrapy/parsel?branch=master
   :alt: Coverage report

Parsel is a library for extracting data from HTML and XML using XPath and CSS selectors.

Features
========

  • Extract text using CSS or XPath selectors
  • Remove elements using CSS or XPath selectors
  • Regular expression helper methods

Example (`open online demo`_)::

    >>> from parsel import Selector
    >>> sel = Selector(text="""<html>
            <body>
                <h1>Hello, Parsel!</h1>
                <ul>
                    <li><a href="http://example.com">Link 1</a></li>
                    <li><a href="http://scrapy.org">Link 2</a></li>
                </ul>
            </body>
            </html>""")
    >>>
    >>> sel.css('h1::text').get()
    'Hello, Parsel!'
    >>>
    >>> sel.css('h1::text').re(r'\w+')
    ['Hello', 'Parsel']
    >>>
    >>> for e in sel.css('ul > li'):
    ...     print(e.xpath('.//a/@href').get())
    http://example.com
    http://scrapy.org

.. _open online demo: https://colab.research.google.com/drive/149VFa6Px3wg7S3SEnUqk--TyBrKplxCN#forceEdit=true&sandboxMode=true

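Key metrics

Overview
Name and owner: scrapy/parsel
Primary language: Python
Languages: Python (number of languages: 1)
Platform:
License: BSD 3-Clause "New" or "Revised" License
Owner activity
Created: 2015-04-24 15:53:36
Pushed: 2025-05-12 06:07:25
Last commit: 2025-05-12 10:06:42
Releases: 26
Latest release: v1.10.0 (released 2024-12-16 16:54:23)
First release: v0.9.0 (release date not given)
User engagement
Stars: 1.2k
Watchers: 35
Forks: 150
Commits: 814
Issues enabled?
Issues: 122
Open issues: 32
Pull requests: 154
Open pull requests: 12
Closed pull requests: 31
Project settings
Wiki enabled?
Archived?
Is a fork?
Locked?
Is a mirror?
Is private?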