CSS Selector "combine" Elements
I'm new to CSS and am trying to parse HTML in Java with the Jsoup parser.
Example HTML:
<p>However much beautiful the s6 Edge looks, I doubt [...] the <a title="Samsung Unveils the Galaxy Note 4 and curved screen Note Edge" href="http://www.example.com/">Note Edge</a>, the dual gently curved screen [...] or accidental palm taps.</p>
I already get the text inside the <p> element like this:
Elements text = doc.select("p");
for (Element element : text) {
    System.out.println(element.ownText() + "\n");
}
Output:
However much beautiful the s6 Edge looks, I doubt [...] the , the dual
gently curved screen [...] or accidental palm taps.
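As you can see, the text "Note Edge" inside the <a> element is not displayed.
So I'd like to ask whether it is possible to display the whole text, including the text inside the <a> element, like this: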
However much beautiful the s6 Edge looks, I doubt [...] the Note Edge, the
dual gently curved screen [...] or accidental palm taps.
I'd appreciate any suggestions!
According to the docs, ownText():
Gets the text owned by this element only; does not get the combined text of all children.
If you want to include the content of child nodes, call element.text() instead.
Like this:
for (Element element : text) {
    System.out.println(element.text() + "\n");
}
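To make the difference concrete, here is a minimal, self-contained sketch that runs both calls on the same paragraph. It assumes jsoup is on the classpath. The class name OwnTextVsText, the two helper methods, and the shortened sample HTML are made up for illustration and are not part of the original question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class OwnTextVsText {
    // Hypothetical helper: text owned directly by the first <p>,
    // skipping the text of child elements such as <a>.
    static String ownTextOfFirstParagraph(String html) {
        Document doc = Jsoup.parse(html);
        Element p = doc.select("p").first();
        return p == null ? "" : p.ownText();
    }

    // Hypothetical helper: combined text of the first <p> and all of its children.
    static String fullTextOfFirstParagraph(String html) {
        Document doc = Jsoup.parse(html);
        Element p = doc.select("p").first();
        return p == null ? "" : p.text();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the paragraph in the question.
        String html = "<p>I doubt the <a href=\"http://www.example.com/\">Note Edge</a>, the dual screen.</p>";
        System.out.println(ownTextOfFirstParagraph(html));  // link text is missing
        System.out.println(fullTextOfFirstParagraph(html)); // link text is included
    }
}
```

The only change compared with the code in the question is the method called on the element; the selector stays the same.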
You should use text() instead of ownText(), because ownText() does not get the text of any child elements.
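What you could do, instead of having plain text, then an <a></a> tag, then more plain text, is wrap the text in <span> elements and then get the text of each child of the <p> element.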
<p id="myParagraph">
<span>However much beautiful the s6 Edge looks, I doubt [...] the </span>
<a title="Samsung Unveils the Galaxy Note 4 and curved screen Note Edge" href="http://www.example.com/">Note Edge</a>
<span>, the dual
gently curved screen [...] or accidental palm taps.</span>
</p>
Your function would then iterate over the child elements of the <p> element:
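// I don't know jsoup, so I use JavaScript directly
var children = document.getElementById("myParagraph").children;
// children is an HTMLCollection, which has no forEach, so copy it into an array first
Array.from(children).forEach(function (child) {
    // textContent is a property, not a method
    console.log(child.textContent + "\n");
});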
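Since the question is about Jsoup, the same loop over child elements can probably be written in Java roughly as follows. This is only a sketch, assuming jsoup is on the classpath. The class name ChildTexts, the helper childTexts, and the shortened sample HTML are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ChildTexts {
    // Hypothetical helper: collects the text of each direct child element
    // of the element with the given id.
    static List<String> childTexts(String html, String id) {
        Document doc = Jsoup.parse(html);
        Element parent = doc.getElementById(id);
        List<String> texts = new ArrayList<>();
        if (parent == null) {
            return texts; // no element with that id
        }
        for (Element child : parent.children()) {
            texts.add(child.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the span-wrapped paragraph above.
        String html = "<p id=\"myParagraph\">"
                + "<span>I doubt the </span>"
                + "<a href=\"http://www.example.com/\">Note Edge</a>"
                + "<span>, the dual screen.</span>"
                + "</p>";
        for (String text : childTexts(html, "myParagraph")) {
            System.out.println(text);
        }
    }
}
```

Note that children() only returns child elements, which is why the plain text has to be wrapped in <span> for this approach to see it. If all you need is the combined text, calling text() on the <p> element is simpler and does not require changing the HTML.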