Symfony2 - DomCrawler - fetch an element's content by its neighbour's content using a regex
I have this XML:
<Item id="3" idLevel="3">
<Label qualifier="Usual">
<LabelText language="ALL">BE01</LabelText>
</Label>
<Label qualifier="Usual">
<LabelText language="EN">RÉGION DE BRUXELLES-CAPITALE / BRUSSELS HOOFDSTEDELIJK GEWEST</LabelText>
</Label>
</Item>
<Item id="4" idLevel="3">
<Label qualifier="Usual">
<LabelText language="ALL">BE001</LabelText>
</Label>
<Label qualifier="Usual">
<LabelText language="EN">VLAAMS GEWEST</LabelText>
</Label>
</Item>
<Item id="123" idLevel="3">
<Label qualifier="Usual">
<LabelText language="ALL">RO001</LabelText>
</Label>
<Label qualifier="Usual">
<LabelText language="EN">MACROREGIUNEA DOI</LabelText>
</Label>
</Item>
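I want to get the value of <LabelText language="EN"> where the neighbouring <LabelText language="ALL"> starts with "BE" followed by 3 digits.
In this case I would get the value from the second XML element, i.e. VLAAMS GEWEST.
I know how to handle this in an ugly way, but I believe there should be a more flexible and elegant way to do it: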
use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addXmlContent($xml);
$crawler = $crawler->filterXPath('//Item[@idLevel="3"]');
foreach ($crawler as $domElement) {
    // here I use a regex to check whether the element's neighbour has the value "BE" followed by three digits
}
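Is there a way to handle this with DomCrawler instead of iterating over all elements and checking each one?
You can use a single XPath expression to get the desired text: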
//Item[@idLevel="3"]/Label[string-length(preceding-sibling::Label/LabelText/text()) = 5 and starts-with(preceding-sibling::Label/LabelText/text(), "BE") and number(substring(preceding-sibling::Label/LabelText/text(), 3)) = number(substring(preceding-sibling::Label/LabelText/text(), 3))]/LabelText[@language="EN"]/text()
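Breakdown:
//Item[@idLevel="3"]
- get the Item nodes that have an idLevel attribute with the value 3
/Label
- their Label children that have...
[string-length(preceding-sibling::Label/LabelText/text()) = 5
- a preceding sibling Label/LabelText node whose text is 5 characters long...
and starts-with(preceding-sibling::Label/LabelText/text(), "BE")
- whose text starts with BE
and number(substring(preceding-sibling::Label/LabelText/text(), 3)) = number(substring(preceding-sibling::Label/LabelText/text(), 3))]
- and whose last 3 characters are numeric (number() yields NaN for non-numeric text, and NaN is never equal to itself, so the comparison is false in that case)
/LabelText[@language="EN"]/text()
- get the text of the LabelText node whose language attribute is EN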
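The expression can be tried out without any dependencies. The following is a minimal sketch that uses PHP's built-in DOMDocument and DOMXPath rather than DomCrawler. It makes a few assumptions that are not in the original post: the Item elements are wrapped in a root element named Items so the XML is well-formed, findValues() is a hypothetical helper, and $strictExpression is an optional variant of my own that uses translate() to accept digits only.

```php
<?php
// Sample data from the question. Assumption: the <Item> elements are wrapped
// in one root element (named <Items> here) so that the XML is well-formed.
$xml = <<<'XML'
<Items>
  <Item id="3" idLevel="3">
    <Label qualifier="Usual"><LabelText language="ALL">BE01</LabelText></Label>
    <Label qualifier="Usual"><LabelText language="EN">RÉGION DE BRUXELLES-CAPITALE / BRUSSELS HOOFDSTEDELIJK GEWEST</LabelText></Label>
  </Item>
  <Item id="4" idLevel="3">
    <Label qualifier="Usual"><LabelText language="ALL">BE001</LabelText></Label>
    <Label qualifier="Usual"><LabelText language="EN">VLAAMS GEWEST</LabelText></Label>
  </Item>
  <Item id="123" idLevel="3">
    <Label qualifier="Usual"><LabelText language="ALL">RO001</LabelText></Label>
    <Label qualifier="Usual"><LabelText language="EN">MACROREGIUNEA DOI</LabelText></Label>
  </Item>
</Items>
XML;

/**
 * Hypothetical helper: evaluates an XPath 1.0 expression against an XML
 * string and returns the string value of every matched node.
 */
function findValues($xml, $expression)
{
    $document = new DOMDocument();
    $document->loadXML($xml);
    $xpath = new DOMXPath($document);

    $values = array();
    foreach ($xpath->query($expression) as $node) {
        $values[] = $node->nodeValue;
    }

    return $values;
}

// The expression from the answer, unchanged.
$expression = '//Item[@idLevel="3"]/Label[string-length(preceding-sibling::Label/LabelText/text()) = 5 and starts-with(preceding-sibling::Label/LabelText/text(), "BE") and number(substring(preceding-sibling::Label/LabelText/text(), 3)) = number(substring(preceding-sibling::Label/LabelText/text(), 3))]/LabelText[@language="EN"]/text()';

// Optional stricter variant (not from the answer): removing every digit from
// the last three characters must leave an empty string.
$strictExpression = '//Item[@idLevel="3"]/Label[string-length(preceding-sibling::Label/LabelText/text()) = 5 and starts-with(preceding-sibling::Label/LabelText/text(), "BE") and translate(substring(preceding-sibling::Label/LabelText/text(), 3), "0123456789", "") = ""]/LabelText[@language="EN"]/text()';

// For the sample data, both calls are expected to yield only "VLAAMS GEWEST".
echo implode("\n", findValues($xml, $expression)), "\n";
echo implode("\n", findValues($xml, $strictExpression)), "\n";
```

The stricter variant matters only if the data can contain codes such as BE1.5 or BE-12: number() treats "1.5" and "-12" as valid numbers, so the original check lets them through. With DomCrawler itself, one option is to drop the trailing /text() step, pass the rest of the expression to filterXPath() and call text() on the result.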