XML parsing using Ruby for a provided URL
I am trying to use Nokogiri to parse XML that I fetch from a URL, but I cannot build an array from it that is accessible throughout the project.
My XML:
<component name="Hero">
<topic name="i1">
<subtopic name="">
<links>
<link Dur="" Id="" type="article">
<label>I am here First. </label>
<topic name="i2">
<subtopic name="">
<links>
<link Dur="" Id="" type="article">
<label>I am here Fourth. </label>
<label>I am here Sixth. </label>
<topic name="i3">
<subtopic name="">
<links>
<link Dur="" Id="" type="article">
<label>I am here Fourth. </label>
I want to create an array for each topic that contains its labels. For example:
hro_array = ["I am here First.", "I am here Second.", "I am here Third."]
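Assuming your XML is well-formed and valid (nested tags closed properly, and so on), you just need to fetch the contents of the URL (for example with the built-in open-uri library) and then use an XML parsing technique such as XPath to retrieve the data you want.
For example, suppose you want a hash that maps each topic name to the list of its nested labels: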
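require 'open-uri'
require 'nokogiri'

# Build a hash that maps each topic's name to the text of its nested labels.
def topic_label_hash(doc)
  doc.xpath('//topic').each_with_object({}) do |topic, hash|
    # './/label' searches only inside the current topic node.
    labels = topic.xpath('.//label/text()').map(&:to_s)
    name = topic.attr('name')
    hash[name] = labels
  end
end

# URI.open is needed on Ruby 3.0+, where Kernel#open no longer fetches URLs.
xml = URI.open(my_url)
doc = Nokogiri::XML(xml)
topic_label_hash(doc) # =>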
# {
# "TV" => [
# "I am here First. ",
# "I am here Second. ",
# "I am here Third. ",
# ...
# ],
# "Internet" => [
# "I am here Fourth. ",
# "I am here Fifth. ",
# "I am here Sixth. "
# ],
# ...
# }
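To try the same idea without fetching a URL, here is a minimal sketch that runs against an inline string. It makes a few assumptions that are not in the question: the sample XML is a hypothetical, well-formed completion of the snippet above, trailing whitespace in each label is stripped, and REXML (which ships with Ruby) is swapped in for Nokogiri so the example needs no extra gems. The XPath expressions are the same ones used above.

```ruby
require 'rexml/document'

# Hypothetical, well-formed sample modeled on the XML in the question.
SAMPLE_XML = <<~XML
  <component name="Hero">
    <topic name="i1">
      <subtopic name="">
        <links>
          <link Dur="" Id="" type="article">
            <label>I am here First. </label>
          </link>
        </links>
      </subtopic>
    </topic>
    <topic name="i2">
      <subtopic name="">
        <links>
          <link Dur="" Id="" type="article">
            <label>I am here Fourth. </label>
            <label>I am here Sixth. </label>
          </link>
        </links>
      </subtopic>
    </topic>
  </component>
XML

# Map each topic's name attribute to an array of its stripped label texts.
def topic_label_hash(doc)
  REXML::XPath.match(doc, '//topic').each_with_object({}) do |topic, hash|
    labels = REXML::XPath.match(topic, './/label').map { |label| label.text.to_s.strip }
    hash[topic.attributes['name']] = labels
  end
end

doc = REXML::Document.new(SAMPLE_XML)
result = topic_label_hash(doc)
p result
```

Here result['i1'] is the one-element array for the first topic and result['i2'] holds the two labels of the second topic. If you want the data available across the project, one option is to wrap the call in a module method or memoize it, rather than rebuilding the hash on every access.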