How to get a sibling's child according to specific defined sibling content
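I need to find the best way to collect writer and artist information from the following XML data. The comic node appears multiple times and contains the data for one comic book.
I'm having trouble grabbing the right people based on their job function: writer, artist, and so on. Sometimes a comic book has several writers and artists. My plan is to add/append each one to a list.
So, for this comic book, I need to get the display names of all the writers and artists, but the job function (e.g. writer) is a sibling of the person's name.
Here's what I have, but it doesn't work: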
writer = []
penciler = []
doc.xpath('//comic').each do |main_element|
  main_element.xpath("mainsection/credits/credit/role[@id='dfWriter']").each do |n|
    writer << n.xpath('person/displayname').text
  end
  main_element.xpath("mainsection/credits/credit/role[@id='dfPenciler']").each do |n|
    penciler << n.xpath('person/displayname').text
  end
end
p "Writer(s): ", writer
p "Penciler(s): ", penciler
Here is the XML file/data:
<comic>
  <id>3398</id>
  <index>195</index>
  <mainsection>
    <title>Mind Games</title>
    <myrating>0</myrating>
    <myrating>
      <displayname>0</displayname>
      <sortname>0</sortname>
    </myrating>
    <pagecount>32</pagecount>
    <credits>
      <credit>
        <role id="dfWriter">Writer</role>
        <roleid>dfWriter</roleid>
        <person>
          <displayname>Will Pfeifer</displayname>
          <sortname>Pfeifer, Will</sortname>
          <lastname>Pfeifer</lastname>
          <firstname>Will</firstname>
        </person>
      </credit>
      <credit>
        <role id="dfWriter">Writer</role>
        <roleid>dfWriter</roleid>
        <person>
          <displayname>John Byrne</displayname>
          <sortname>Byrne, John</sortname>
          <lastname>Byrne</lastname>
          <firstname>John</firstname>
        </person>
      </credit>
      <credit>
        <role id="dfPenciler">Penciller</role>
        <roleid>dfPenciler</roleid>
        <person>
          <displayname>John Byrne</displayname>
          <sortname>Byrne, John</sortname>
          <lastname>Byrne</lastname>
          <firstname>John</firstname>
        </person>
      </credit>
    </credits>
  </mainsection>
</comic>
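My code isn't giving me the results I want. I found "Getting the siblings of a node with Nokogiri", but I need to iterate and grab each sibling.
I can search by either <roleid>dfWriter</roleid> or <role id="dfWriter">Writer</role>, since they carry the same value.
My expected output is: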
Writer(s): Will Pfeifer, John Byrne
Penciler(s): John Byrne
Assuming the target element always comes after role, you can use the XPath following-sibling axis for this purpose:
doc.xpath('//comic').each do |main_element|
  main_element.xpath("mainsection/credits/credit/role[@id='dfWriter']").each do |n|
    writer << n.xpath('following-sibling::person/displayname').text
  end
  main_element.xpath("mainsection/credits/credit/role[@id='dfPenciler']").each do |n|
    penciler << n.xpath('following-sibling::person/displayname').text
  end
end
Alternatively, you can iterate over credit first, rather than role:
doc.xpath('//comic').each do |main_element|
  main_element.xpath("mainsection/credits/credit[role/@id='dfWriter']").each do |n|
    writer << n.xpath('person/displayname').text
  end
  main_element.xpath("mainsection/credits/credit[role/@id='dfPenciler']").each do |n|
    penciler << n.xpath('person/displayname').text
  end
end
Here's how I'd do it:
require 'nokogiri'
XML = <<EOT
<comic>
  <mainsection>
    <credits>
      <credit>
        <role id="dfWriter">Writer</role>
        <person>
          <displayname>Will Pfeifer</displayname>
        </person>
      </credit>
      <credit>
        <role id="dfWriter">Writer</role>
        <person>
          <displayname>John Byrne</displayname>
        </person>
      </credit>
      <credit>
        <role id="dfPenciler">Penciller</role>
        <person>
          <displayname>John Byrne</displayname>
        </person>
      </credit>
    </credits>
  </mainsection>
</comic>
EOT
doc = Nokogiri::XML(XML)
writers = doc.search("credits role[id='dfWriter']").map { |w| w.parent.at('displayname').text }
pencilers = doc.search("credits role[id='dfPenciler']").map { |n| n.parent.at('displayname').text }
puts "Writer(s): %s" % writers.join(', ')
puts "Penciler(s): %s" % pencilers.join(', ')
# >> Writer(s): Will Pfeifer, John Byrne
# >> Penciler(s): John Byrne
Which, when run, outputs:
# >> Writer(s): Will Pfeifer, John Byrne
# >> Penciler(s): John Byrne
This:
writers = doc.search("credits role[id='dfWriter']").map { |w| w.parent.at('displayname').text }
pencilers = doc.search("credits role[id='dfPenciler']").map { |n| n.parent.at('displayname').text }
can be DRY'd to:
writers, pencilers = %w(dfWriter dfPenciler).map { |s|
  doc.search("credits role[id='#{s}']").map { |w| w.parent.at('displayname').text }
}
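I used CSS selectors for readability, and at, which returns a single Node, when I want the text, instead of xpath, which returns a NodeSet.
The difference between calling text on a NodeSet and on a Node is very important. Consider this: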
require 'nokogiri'
xml = <<EOT
<root>
  <displayname>Will Pfeifer</displayname>
  <displayname>John Byrne</displayname>
  <displayname>John Byrne</displayname>
</root>
EOT
doc = Nokogiri::XML(xml)
doc.search('displayname').class # => Nokogiri::XML::NodeSet
doc.search('displayname').text  # => "Will PfeiferJohn ByrneJohn Byrne"
doc.at('displayname').class     # => Nokogiri::XML::Element
doc.at('displayname').text      # => "Will Pfeifer"
If you want all the text of a NodeSet in an easily used form, extract it from each node:
doc.search('displayname').map(&:text) # => ["Will Pfeifer", "John Byrne", "John Byrne"]