Jsoup library "Did not find balanced marker" error
Using the jsoup library, I am trying to get the href of every <a> element that contains a specified text.
Example:
import org.jsoup.Jsoup
import org.jsoup.nodes.Document
import org.jsoup.select.Elements

public class GlobVars {
    public static Document currentPageSource
    public static String currentTitle
}

def get_url() {
    String url = "https://www.website.com/"
    GlobVars.currentPageSource = Jsoup.connect(url).get()
    Elements wElements = GlobVars.currentPageSource.select('a[class="class-name"]:contains('+GlobVars.currentTitle+')')
    if(wElements) {
        /*
         * Do stuff...
         *
         * */
    }
}
The problem is that GlobVars.currentTitle can contain a single-quote character. For example, if GlobVars.currentTitle is I am here, it works fine. But if GlobVars.currentTitle is I'm here, I get this error: Did not find balanced marker at 'I'.
I tried wrapping the GlobVars.currentTitle variable in double quotes, triple single quotes, or triple double quotes, but I got the same error.
I also read https://github.com/jhy/jsoup/issues/1105, but the quote-escaping "trick" described there does not work in my case.
Any idea how I can solve this?
// @Grab(group='org.jsoup', module='jsoup', version='1.14.3')
import org.jsoup.Jsoup
import org.jsoup.nodes.Document
import org.jsoup.select.Elements
def html = """
<html>
<body>
<a class="c1" href="#1">i'm the one</a>
<a class="c1" href="#2">i am the one</a>
</body>
</html>
"""
def desiredText = "i'm the one"
// Escape special characters by prefixing each with a backslash.
// You may need to add more characters to the class for your data.
desiredText = desiredText.replaceAll(/(['"\\\/|()\[\]])/, '\\\\$1')
Document currentPageSource = Jsoup.parse(html)
Elements wElements = currentPageSource.select('a[class="c1"]:contains('+ desiredText +')')
or
def desiredText = "i'm the one"
Elements wElements = currentPageSource.select('a[class="c1"]').findAll{it.html().contains(desiredText)}
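The escaping step can also be written without a regex. Below is a minimal sketch in plain Java (callable from a Groovy script). The class name SelectorEscaper and the method escapeForContains are hypothetical names for illustration. It assumes, as the first snippet does, that jsoup treats a backslash inside :contains(...) as an escape for the character that follows. It only handles quotes, parentheses and the backslash itself, so other inputs may still need attention.

```java
// Hypothetical helper, not part of jsoup: prefixes characters that can
// unbalance a :contains(...) argument with a backslash.
final class SelectorEscaper {
    private SelectorEscaper() {}

    static String escapeForContains(String text) {
        StringBuilder out = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Quotes and parentheses affect how the argument is balanced;
            // the backslash is assumed to be the escape character.
            if (c == '\'' || c == '"' || c == '(' || c == ')' || c == '\\') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

In the script above it could be used as: currentPageSource.select('a[class="c1"]:contains(' + SelectorEscaper.escapeForContains(desiredText) + ')')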