xsoup

When jsoup meets XPath.

  • Owner: code4craft/xsoup
  • License: MIT License


Xsoup


XPath selector based on Jsoup.

Get started:

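    import java.util.List;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.junit.Assert;
    import org.junit.Test;
    import us.codecraft.xsoup.Xsoup;

    // The class name is illustrative; any JUnit 4 test class works.
    public class XsoupTest {

        @Test
        public void testSelect() {
            String html = "<html><div><a href='https://github.com'>github.com</a></div>" +
                    "<table><tr><td>a</td><td>b</td></tr></table></html>";

            Document document = Jsoup.parse(html);

            // Select the href attribute of the <a> element.
            String result = Xsoup.compile("//a/@href").evaluate(document).get();
            Assert.assertEquals("https://github.com", result);

            // Select the text of every <td> cell in the table row.
            List<String> list = Xsoup.compile("//tr/td/text()").evaluate(document).list();
            Assert.assertEquals("a", list.get(0));
            Assert.assertEquals("b", list.get(1));
        }
    }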

Performance:

Xsoup uses Jsoup as its HTML parser.

Compared with HtmlCleaner, another widely used XPath selector for HTML, Xsoup is much faster in the following test:

Normal HTML, size 44 KB
XPath: "//a"
Runs: 2000

Environment: MacBook Air MD231CH/A
CPU: 1.8 GHz Intel Core i5
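
The test setup above (one XPath evaluated 2000 times) can be approximated with a plain timing loop. The following is a minimal sketch, not the benchmark code used for the original comparison: the class `RepeatTimer`, the method `totalMillis`, and the warm-up count are hypothetical, and the workload in `main` is a placeholder standing in for an actual XPath evaluation.

```java
// Hypothetical helper, not part of Xsoup: times a task over repeated runs.
class RepeatTimer {

    /**
     * Runs the task warmupRuns times untimed, then measuredRuns times timed,
     * and returns the total elapsed time of the timed runs in milliseconds.
     */
    static long totalMillis(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Placeholder workload. In a real comparison this would be a single
        // XPath evaluation, e.g. Xsoup.compile("//a").evaluate(document).list().
        StringBuilder sink = new StringBuilder();
        Runnable task = () -> sink.append('a').setLength(0);

        // 200 warm-up runs is an arbitrary choice; 2000 matches the setup above.
        long elapsed = totalMillis(task, 200, 2000);
        System.out.println("2000 runs took " + elapsed + " ms");
    }
}
```

Running the same loop once per library, on the same parsed document and the same expression, gives comparable totals. A dedicated harness such as JMH would likely give more reliable numbers than a hand-written loop.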

Syntax supported:

XPath 1.0:

Functions supported:

Xsoup provides some functions that may not be part of standard XPath 1.0:

Extended syntax supported:

The following XPath syntax extensions exist only in Xsoup (added for convenience when extracting HTML; see the Jsoup CSS selector syntax):

License

MIT License; see the LICENSE file.


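Key metrics

Overview
Name and owner: code4craft/xsoup
Primary language: Java
Languages: Java (1 language)
License: MIT License

Owner activity
Created: 2013-08-31 11:37:03
Pushed: 2023-07-10 05:22:27
Last commit: 2023-06-13 00:33:47
Releases: 13
Latest release: xsoup-0.3.7 (released 2023-06-13 00:33:47)
First release: xsoup-0.1.0 (released 2013-09-04 07:20:59)

User engagement
Stars: 469
Watchers: 43
Forks: 152
Commits: 161
Issues: 46
Open issues: 28
Pull requests: 11
Open pull requests: 0
Closed pull requests: 5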