Filter XmlSimple Output in Ruby

I'm completely new to Ruby. I'm trying to parse an XML structure and filter on some of its attributes. The XML looks like this:

<systeminfo>
<machines>
<machine name="localhost">
<repository worker="localhost:8060" status="OK"/>
<dataengine worker="localhost:27042" status="OK"/>
<serverwebapplication worker="localhost:8000" status="OK"/>
<serverwebapplication worker="localhost:8001" status="OK"/>
<vizqlserver worker="localhost:9100" status="OK"/>
<vizqlserver worker="localhost:9101" status="OK"/>
<dataserver worker="localhost:9700" status="OK"/>
<dataserver worker="localhost:9701" status="OK"/>
<backgrounder worker="localhost:8250" status="OK"/>
<webserver worker="localhost:80" status="OK"/>
</machine>
</machines>
<service status="OK"/>
</systeminfo>

I want to check whether each status attribute is OK. So far I have written this code:

#!/usr/bin/ruby -w

require 'rubygems'
require 'net/http'
require 'xmlsimple'

url = URI.parse("URL to XML")
req = Net::HTTP::Get.new(url.path)
res = Net::HTTP.start(url.host, url.port) {|http|
http.request(req)
}

sysinfodoc = XmlSimple.xml_in(res.body)


sysinfodoc["machines"][0]["machine"][0].each do |status|
p status[1][0]
p status[1][1]
end

Output:

{"worker"=>"localhost:8060", "status"=>"OK"}
nil
{"worker"=>"localhost:27042", "status"=>"OK"}
nil
{"worker"=>"localhost:9100", "status"=>"OK"}
{"worker"=>"localhost:9101", "status"=>"OK"}
{"worker"=>"localhost:8000", "status"=>"OK"}
{"worker"=>"localhost:8001", "status"=>"OK"}
{"worker"=>"localhost:8250", "status"=>"OK"}
nil
{"worker"=>"localhost:9700", "status"=>"OK"}
{"worker"=>"localhost:9701", "status"=>"OK"}
{"worker"=>"localhost:80", "status"=>"OK"}
nil
108
111

Update: the output should look like this:

OK
OK
OK
OK
OK
OK
OK
OK
OK
OK

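This script is meant to be used with Nagios. So instead of printing the results, I later want to check whether any of the status attributes contains something other than "OK".

How do I get rid of the nils and Fixnums? And why are there Fixnums at all?

How can I filter this so that I only get the status of each child of a machine? Or is this the wrong approach altogether?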

How about using Nokogiri and XPath for this?

require 'nokogiri'
@doc = Nokogiri::XML(File.open("example.xml"))
@doc.xpath("//machine/*/@status").each { |x| puts x }

The result will be:

OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
=> 0
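
Since the script is meant for Nagios, the same XPath idea can be turned into a pass/fail check. What follows is only a sketch under a few assumptions: it uses REXML (which ships with Ruby as a default or bundled gem, depending on the version) instead of Nokogiri so that it runs without installing anything, the XML is an inlined and trimmed sample rather than a document fetched over HTTP, and the helper name `failing_workers` is made up for illustration. The exit codes 0 (OK) and 2 (CRITICAL) follow the usual Nagios plugin convention.

```ruby
require 'rexml/document'

# Trimmed sample of the XML from the question. One status was changed to
# "DOWN" here purely to exercise the failure path.
SAMPLE_XML = <<~XML
  <systeminfo>
    <machines>
      <machine name="localhost">
        <repository worker="localhost:8060" status="OK"/>
        <dataengine worker="localhost:27042" status="DOWN"/>
        <webserver worker="localhost:80" status="OK"/>
      </machine>
    </machines>
    <service status="OK"/>
  </systeminfo>
XML

# Hypothetical helper: returns "name@worker=status" for every child of a
# <machine> element whose status attribute is not "OK".
def failing_workers(xml)
  doc = REXML::Document.new(xml)
  failing = []
  REXML::XPath.each(doc, '//machine/*') do |node|
    status = node.attributes['status']
    next if status == 'OK'
    failing << "#{node.name}@#{node.attributes['worker']}=#{status}"
  end
  failing
end

failing = failing_workers(SAMPLE_XML)
if failing.empty?
  puts 'OK - all workers report OK'
  exit_code = 0
else
  puts "CRITICAL - #{failing.join(', ')}"
  exit_code = 2
end
# A real plugin would finish with: exit exit_code
```

With Nokogiri the traversal would be the `xpath` call shown above; only the parsing and attribute access differ.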

Disclaimer: using Nokogiri and XPath, as Mathias suggested, is more elegant and simpler.


Whenever you get unexpected output, try printing the status local variable itself:

sysinfodoc["machines"][0]["machine"][0].each do |status|
  # p status[1][0]
  p status
end

You will see that the output is:

#⇒ ["name", "localhost"]
#⇒ ["repository", [{"worker"=>"localhost:8060", "status"=>"OK"}]]
#⇒ ["dataengine", [{"worker"=>"localhost:27042", "status"=>"OK"}]]
#⇒ ...
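
This also accounts for the nils and the Fixnums. Hash#each yields [key, value] pairs, so status[1] is the value. For element entries that value is an array of hashes, and status[1][1] is nil whenever there is only one such element. For the name attribute the value is the string "localhost", and on Ruby 1.8 indexing a string with an integer returned a character code, hence 108 and 111 for "l" and "o". A small sketch, assuming a hand-written hash that mimics what XmlSimple returns (the constant name `MACHINE` is only for illustration):

```ruby
# Hand-written stand-in for sysinfodoc["machines"][0]["machine"][0].
MACHINE = {
  'name'        => 'localhost',
  'repository'  => [{ 'worker' => 'localhost:8060', 'status' => 'OK' }],
  'vizqlserver' => [
    { 'worker' => 'localhost:9100', 'status' => 'OK' },
    { 'worker' => 'localhost:9101', 'status' => 'OK' }
  ]
}.freeze

# Hash#each yields the key and the value of each entry.
MACHINE.each do |key, value|
  puts "#{key}: #{value.class}"
end

# There is only one <repository> element, so index 1 is out of range.
p MACHINE['repository'][1]     # => nil

# Ruby 1.8 returned character codes for String#[] with an Integer index;
# getbyte gives the same numbers on current Ruby versions.
p MACHINE['name'].getbyte(0)   # => 108
p MACHINE['name'].getbyte(1)   # => 111
```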

That said, to achieve your goal you should:

sysinfodoc["machines"][0]["machine"][0].values.each do |status|
  next unless Array === status                  # skip "name" => "localhost"
  status.each { |worker| p worker['status'] }   # one line per worker, not just the last
end
# "OK"
# "OK"
# "OK"
# ...

Checking whether status is an array is necessary because of this entry:

# ["name", "localhost"]
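
If the goal is to flag anything that is not "OK" rather than to print every status, the same traversal can collect the offending entries. This is a minimal sketch, assuming a hand-written hash in place of real XmlSimple output; the helper name `non_ok_statuses` and the "DOWN" value are invented for the example.

```ruby
# Hand-written stand-in for the hash XmlSimple builds for one <machine>.
MACHINE_INFO = {
  'name'        => 'localhost',
  'repository'  => [{ 'worker' => 'localhost:8060', 'status' => 'OK' }],
  'vizqlserver' => [
    { 'worker' => 'localhost:9100', 'status' => 'OK' },
    { 'worker' => 'localhost:9101', 'status' => 'DOWN' }
  ]
}.freeze

# Hypothetical helper: returns every worker hash whose status is not "OK".
def non_ok_statuses(machine)
  machine.values
         .select { |value| value.is_a?(Array) } # drops "name" => "localhost"
         .flatten                               # one flat list of worker hashes
         .reject { |worker| worker['status'] == 'OK' }
end

problems = non_ok_statuses(MACHINE_INFO)
if problems.empty?
  puts 'all workers report OK'
else
  puts "not OK: #{problems.map { |w| w['worker'] }.join(', ')}"
end
```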

Hope this helps.