xsoup

When jsoup meets XPath.

  • Owner: code4craft/xsoup
  • License: MIT License


Xsoup


XPath selector based on Jsoup.

Get started:

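    import java.util.List;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.junit.Assert;
    import org.junit.Test;
    import us.codecraft.xsoup.Xsoup;

    public class XsoupTest {

        @Test
        public void testSelect() {
            String html = "<html><div><a href='https://github.com'>github.com</a></div>" +
                    "<table><tr><td>a</td><td>b</td></tr></table></html>";

            Document document = Jsoup.parse(html);

            // get() yields a single string result for the compiled XPath.
            String result = Xsoup.compile("//a/@href").evaluate(document).get();
            Assert.assertEquals("https://github.com", result);

            // list() yields every match as a string, in document order.
            List<String> list = Xsoup.compile("//tr/td/text()").evaluate(document).list();
            Assert.assertEquals("a", list.get(0));
            Assert.assertEquals("b", list.get(1));
        }
    }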
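The two expressions in the test above are plain XPath 1.0, so what they select can be cross-checked without Xsoup. The following is a minimal sketch that uses only the JDK's built-in `javax.xml.xpath`, not Xsoup's API. It works here only because the sample markup happens to be well-formed XML; real-world HTML usually is not, which is where a Jsoup-based selector helps. The class and method names (`StandardXPathSketch`, `evaluate`) are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates an XPath 1.0 expression with the JDK only.
final class StandardXPathSketch {

    /** Returns the string value of every attribute or text node matched. */
    static List<String> evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // For attribute and text nodes, getNodeValue() is the content.
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<html><div><a href='https://github.com'>github.com</a></div>"
                + "<table><tr><td>a</td><td>b</td></tr></table></html>";
        System.out.println(evaluate(xml, "//a/@href"));
        System.out.println(evaluate(xml, "//tr/td/text()"));
    }
}
```

The helper only handles expressions that select attribute or text nodes; element nodes would need `getTextContent()` instead.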

Performance:

Xsoup uses Jsoup as its HTML parser.

Compared with HtmlCleaner, another widely used XPath selector for HTML, Xsoup is much faster:

Normal HTML, size 44 KB
XPath: "//a"
Run 2000 times

Environment: Mac Air MD231CH/A
CPU: 1.8 GHz Intel Core i5
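A setup like the one described (one expression, 2000 runs) can be approximated with a simple timing loop. This is a minimal sketch, not the project's actual benchmark code: it assumes the work to measure is passed in as a `Runnable`, and the names `BenchmarkSketch` and `timeRuns` are hypothetical. The workload in `main` is a placeholder to be replaced by the selector call under test.

```java
// Hypothetical timing harness; uses only the JDK.
final class BenchmarkSketch {

    /** Runs the task the given number of times; returns elapsed nanoseconds. */
    static long timeRuns(int runs, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Placeholder workload; swap in the XPath evaluation being compared.
        Runnable task = () -> "<a href='#'>x</a>".indexOf("href");

        // Untimed warm-up pass so the JIT compiler sees the code first.
        timeRuns(200, task);

        long nanos = timeRuns(2000, task);
        System.out.printf("2000 runs: %.2f ms%n", nanos / 1_000_000.0);
    }
}
```

A loop like this gives only a rough figure. For a careful comparison, a dedicated harness such as JMH handles warm-up and dead-code elimination more reliably.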

Syntax supported:

XPath 1.0:

Functions supported:

Xsoup provides some functions that may not be part of standard XPath 1.0:

Extended syntax supported:

These XPath syntax extensions exist only in Xsoup (added for convenience when extracting HTML; see the Jsoup CSS Selector for reference):

License

MIT License, see file LICENSE


Key metrics

Overview
Name and owner: code4craft/xsoup
Primary language: Java
Languages: Java (1 language)
License: MIT License

Owner activity
Created: 2013-08-31 11:37:03
Last pushed: 2023-07-10 05:22:27
Last commit: 2023-06-13 00:33:47
Releases: 13
Latest release: xsoup-0.3.7 (released 2023-06-13 00:33:47)
First release: xsoup-0.1.0 (released 2013-09-04 07:20:59)

User engagement
Stars: 469
Watchers: 43
Forks: 152
Commits: 161
Issues: 46
Open issues: 28
Pull requests: 11
Open pull requests: 0
Closed pull requests: 5