
黄亿华 / webmagic

getHtml().links() does not correctly extract relative links from <a> tags

Backlog
sxrstrive opened this issue

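Problem: getHtml().links() does not correctly extract relative links from <a> tags.
Expected: every link in the HTML content is returned, including relative links.
Actual: relative links are not extracted correctly.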
HTML fragment:
<div class="ui menu">
  <a href="/explore" class="item">开源软件</a>
  <a href="/enterprises" class="item">企业版</a>
  <a href="/education" class="item">高校版</a>
  <a href="https://blog.gitee.com/" class="item">博客</a>
</div>
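Code snippet:
import us.codecraft.webmagic.Page;
import us.codecraft.webmagic.Request;
import us.codecraft.webmagic.selector.Html;
import us.codecraft.webmagic.selector.PlainText;

String html = "<div>......</div>"; // the HTML fragment above
Page page = new Page();
page.setRequest(new Request("https://gitee.com"));
page.setUrl(new PlainText("https://gitee.com"));
page.setHtml(new Html(html));
System.out.println(page.getHtml().links()); // relative links are not returned correctly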
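
For comparison, the expected behaviour can be sketched with the Java standard library alone. This is an illustrative workaround, not webmagic's implementation and not a fix for links(): the class RelativeLinkResolver and the method resolveLinks are hypothetical names introduced here, and the regular expression is an assumption that only handles double-quoted href attributes like those in the fragment above. Each href is resolved against the page URL with java.net.URI, so relative links such as /explore become absolute.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of webmagic.
class RelativeLinkResolver {

    // Assumption: href values are double-quoted, as in the fragment from this issue.
    private static final Pattern HREF =
            Pattern.compile("<a\\s[^>]*?href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every <a href> found in html, resolved against baseUrl.
    static List<String> resolveLinks(String html, String baseUrl) {
        URI base = URI.create(baseUrl);
        // A base URL without a path ("https://gitee.com") is normalized to end in "/"
        // so that path-relative hrefs resolve correctly as well.
        if (base.getPath() == null || base.getPath().isEmpty()) {
            base = base.resolve("/");
        }
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            // Absolute hrefs are returned unchanged; relative ones are joined to the base.
            links.add(base.resolve(matcher.group(1)).toString());
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<div class=\"ui menu\">"
                + " <a href=\"/explore\" class=\"item\">开源软件</a>"
                + " <a href=\"/enterprises\" class=\"item\">企业版</a>"
                + " <a href=\"/education\" class=\"item\">高校版</a>"
                + " <a href=\"https://blog.gitee.com/\" class=\"item\">博客</a>"
                + " </div>";
        for (String link : resolveLinks(html, "https://gitee.com")) {
            System.out.println(link);
        }
    }
}
```

With the fragment from this issue and the base URL https://gitee.com, the sketch yields https://gitee.com/explore, https://gitee.com/enterprises, https://gitee.com/education and https://blog.gitee.com/, which is the list the issue expects from links().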


Comments (0)
