How to use Nokogiri to make many changes to an XML file
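I'm using Nokogiri to convert a fairly large XML file, more than 80K lines, to CSV.
I need to bulk-edit the <ImageFile /> nodes into something like
www.mybaseurl.com + text of <ImageFile />
so that each one holds the full image path. I've gone through all of their documentation and Stack Overflow, and although this should be simple, I still can't find a solution to my problem.
I also want to use Ruby to check whether <AltImageFile1> is empty. If it isn't, I need to create a new row directly below with the same handle value, but with the value of <AltImageFile1> in place of <ImageFile />.
Here is a sample of the XML file I'm working with: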
<Products>
<Product>
<Name>36-In. Homeowner Bent Single-Bit Axe Handle</Name>
<Description>This single bit curved grip axe handle is made for 3 to 5 pound axes. A good quality replacement handle made of American hickory with a natural wax finish. Hardwood handles do not conduct electricity and American Hickory is known for its strength, elasticity and ability to absorb shock. These handles provide exceptional value and economy for homeowners and other occasional use applications. Each Link handle comes with the required wedges, rivets, or epoxy needed for proper application of the tool head.</Description>
<ImageFile>100024.jpg</ImageFile>
<AltImageFile1>103387-1.jpg</AltImageFile1>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
<Product>
<Name>1-1/4-Inch Lavatory Pop Up Assembly</Name>
<Description>Classic chrome finish with ABS plastic top &amp; body includes push rod, no overflow.</Description>
<ImageFile>100024.jpg</ImageFile>
<AltImageFile1>103429-1.jpg</AltImageFile1>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
<Product>
<Name>30-Inch Belt-Drive Whole-House Attic Fan With Shutter</Name>
<Description>The 30" belt drive whole house fan (5700 CFM) with automatic shutter helps cool living spaces up to 1900 square feet. It runs on high &amp; low and a 2 speed wall switch is included. The automatic shutter is white. It needs 1095 square inches of open exhaust vents in attic space, with a rough opening of 34-1/4" x 29". You do have to cut joist when installing fan, with the motor mounted on struts above housing. The fan will be quieter than direct drive models. There is a 10 year limited parts warranty, 5 year limited labor warranty.</Description>
<ImageFile>100073.jpg</ImageFile>
<AltImageFile1/>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
</Products>
Here is my code. How can I improve it?
require 'csv'
require 'nokogiri'
xml = File.read('Desktop/roduct_catalog.xml')
doc = Nokogiri::XML(xml)
all_the_things = []
doc.xpath('//Products/Product').each do |file|
  handle = file.xpath("./ItemNumber").first.text
  title = file.xpath("./Name").first.text
  description = file.xpath("./Description").first.text
  collection = file.xpath("./FLDeptName").first.text
  image1 = file.xpath("./ImageFile").first.text
  all_the_things << [handle, title, description, collection, image1]
end
CSV.open('product_file_1.csv', 'wb') do |row|
  row << ['handle', 'title', 'description', 'collection', 'image1']
  all_the_things.each do |data|
    row << data
  end
end
Here is code you can try. I don't see a FLDeptName node in the XML, so I commented out the lines related to that node.
require 'csv'
require 'nokogiri'
xml = File.read('roduct_catalog.xml')
doc = Nokogiri::XML(xml)
all_the_things = []
doc.xpath('//Products/Product').each do |file|
  handle = file.xpath("./ItemNumber").first.text
  title = file.xpath("./Name").first.text
  description = file.xpath("./Description").first.text
  # collection = file.xpath("./FLDeptName").first.text # commented out: the ./FLDeptName node is not present
  # Prefix the image file name with the base URL to get the full path
  image1 = "www.mybaseurl.com/" + file.xpath("./ImageFile").first.text
  # all_the_things << [handle, title, description, collection, image1] # commented out: the ./FLDeptName node is not present
  all_the_things << [handle, title, description, image1]
end
CSV.open('product_file_1.csv', 'wb') do |row|
  # row << ['handle', 'title', 'description', 'collection', 'image1'] # commented out: the ./FLDeptName node is not present
  row << ['handle', 'title', 'description', 'image1']
  all_the_things.each do |data|
    row << data
  end
end
Here is the output.
A sample XML with two images:
<?xml version="1.0"?>
<ProductCatalogImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Products>
<Product>
<Name>36-In. Homeowner Bent Single-Bit Axe Handle</Name>
<Description>This single bit curved grip axe handle is made for 3 to 5 pound axes. A good quality replacement
handle made of American hickory with a natural wax finish. Hardwood handles do not conduct electricity and
American Hickory is known for its strength, elasticity and ability to absorb shock. These handles provide
exceptional value and economy for homeowners and other occasional use applications. Each Link handle comes with
the required wedges, rivets, or epoxy needed for proper application of the tool head.
</Description>
<ImageFile>100024.jpg</ImageFile>
<ImageFile2>100024-2.jpg</ImageFile2>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
<Product>
<Name>1-1/4-Inch Lavatory Pop Up Assembly</Name>
<Description>Classic chrome finish with ABS plastic top &amp; body includes push rod, no overflow.</Description>
<ImageFile>100024.jpg</ImageFile>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
<Product>
<Name>30-Inch Belt-Drive Whole-House Attic Fan With Shutter</Name>
<Description>The 30" belt drive whole house fan (5700 CFM) with automatic shutter helps cool living spaces up to
1900 square feet. It runs on high &amp; low and a 2 speed wall switch is included. The automatic shutter is
white. It needs 1095 square inches of open exhaust vents in attic space, with a rough opening of 34-1/4" x 29".
You do have to cut joist when installing fan, with the motor mounted on struts above housing. The fan will be
quieter than direct drive models. There is a 10 year limited parts warranty, 5 year limited labor warranty.
</Description>
<ImageFile>100073.jpg</ImageFile>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
</Products>
</ProductCatalogImport>
Here is the code that writes the content to separate rows:
require 'csv'
require 'nokogiri'
xml = File.read('roduct_catalog.xml')
doc = Nokogiri::XML(xml)
all_the_things = []
doc.xpath('//Products/Product').each do |file|
  handle = file.xpath("./ItemNumber").first.text
  title = file.xpath("./Name").first.text
  description = file.xpath("./Description").first.text
  # collection = file.xpath("./FLDeptName").first.text # commented out: the ./FLDeptName node is not present
  image1 = "www.mybaseurl.com/" + file.xpath("./ImageFile").first.text
  # The second image is optional, so fall back to an empty string
  if file.xpath("./ImageFile2").size() > 0
    image2 = "www.mybaseurl.com/" + file.xpath("./ImageFile2").first.text
  else
    image2 = ''
  end
  # all_the_things << [handle, title, description, collection, image1] # commented out: the ./FLDeptName node is not present
  all_the_things << [handle, title, description, image1, image2]
end
CSV.open('product_file_1.csv', 'wb') do |row|
  # row << ['handle', 'title', 'description', 'collection', 'image1'] # commented out: the ./FLDeptName node is not present
  row << ['handle', 'title', 'description', 'image1', 'image2']
  all_the_things.each do |data|
    if data[-1] != ''
      # A product with a second image gets two rows: the main row,
      # then a row holding only the handle and the second image
      row << data[0...-1]
      row << [data[0], '', '', '', data[-1]]
    else
      row << data
    end
  end
end
Here is the output:
I'd start with something like this:
require 'csv'
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<Products>
<Product>
<Name>36-In. Homeowner Bent Single-Bit Axe Handle</Name>
<Description>This single bit curved grip axe handle is made for 3 to 5 pound axes. A good quality replacement handle made of American hickory with a natural wax finish. Hardwood handles do not conduct electricity and American Hickory is known for its strength, elasticity and ability to absorb shock. These handles provide exceptional value and economy for homeowners and other occasional use applications. Each Link handle comes with the required wedges, rivets, or epoxy needed for proper application of the tool head.</Description>
<ImageFile>100024.jpg</ImageFile>
<AltImageFile1>103387-1.jpg</AltImageFile1>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
<Product>
<Name>1-1/4-Inch Lavatory Pop Up Assembly</Name>
<Description>Classic chrome finish with ABS plastic top &amp; body includes push rod, no overflow.</Description>
<ImageFile>100024.jpg</ImageFile>
<AltImageFile1>103429-1.jpg</AltImageFile1>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
<Product>
<Name>30-Inch Belt-Drive Whole-House Attic Fan With Shutter</Name>
<Description>The 30" belt drive whole house fan (5700 CFM) with automatic shutter helps cool living spaces up to 1900 square feet. It runs on high &amp; low and a 2 speed wall switch is included. The automatic shutter is white. It needs 1095 square inches of open exhaust vents in attic space, with a rough opening of 34-1/4" x 29". You do have to cut joist when installing fan, with the motor mounted on struts above housing. The fan will be quieter than direct drive models. There is a 10 year limited parts warranty, 5 year limited labor warranty.</Description>
<ImageFile>100073.jpg</ImageFile>
<AltImageFile1/>
<ItemNumber>100024</ItemNumber>
<ModelNumber>64707</ModelNumber>
</Product>
</Products>
EOT
The logic looks like this:
NODES_TO_COLUMNS = {
  'ItemNumber' => 'handle',
  'Name' => 'title',
  'Description' => 'description',
  # 'FLDeptName' => 'collection',
  'ImageFile' => 'image1'
}
all_things = doc.search('Product').map do |product|
  NODES_TO_COLUMNS.keys.map { |node|
    product.at(node).text
  }
end
CSV.open('/dev/stdout', 'wb') do |csv|
  csv << NODES_TO_COLUMNS.values
  all_things.each do |r|
    csv << r
  end
end
Which, when run, results in:
handle,title,description,image1
100024,36-In. Homeowner Bent Single-Bit Axe Handle,"This single bit curved grip axe handle is made for 3 to 5 pound axes. A good quality replacement handle made of American hickory with a natural wax finish. Hardwood handles do not conduct electricity and American Hickory is known for its strength, elasticity and ability to absorb shock. These handles provide exceptional value and economy for homeowners and other occasional use applications. Each Link handle comes with the required wedges, rivets, or epoxy needed for proper application of the tool head.",100024.jpg
100024,1-1/4-Inch Lavatory Pop Up Assembly,"Classic chrome finish with ABS plastic top & body includes push rod, no overflow.",100024.jpg
100024,30-Inch Belt-Drive Whole-House Attic Fan With Shutter,"The 30"" belt drive whole house fan (5700 CFM) with automatic shutter helps cool living spaces up to 1900 square feet. It runs on high & low and a 2 speed wall switch is included. The automatic shutter is white. It needs 1095 square inches of open exhaust vents in attic space, with a rough opening of 34-1/4"" x 29"". You do have to cut joist when installing fan, with the motor mounted on struts above housing. The fan will be quieter than direct drive models. There is a 10 year limited parts warranty, 5 year limited labor warranty.",100073.jpg
Because FLDeptName is missing from the XML, which shouldn't be the case in a proper question on SO, I commented it out. How to use it is left for you.
You'll want to change:
CSV.open('/dev/stdout', 'wb') do |csv|
to use whatever file name you want. '/dev/stdout' is just my way of saving some coding by routing the output to STDOUT so I can show it.
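As a side note, '/dev/stdout' only exists on Unix-like systems. A minimal sketch of one alternative, assuming a hypothetical helper named write_csv that is not part of the code above, is to write to any IO object so the same code can target STDOUT, a file, or an in-memory buffer:

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: writes each row as one CSV line to the given IO object.
def write_csv(rows, io)
  csv = CSV.new(io)
  rows.each { |row| csv << row }
end

# Made-up rows standing in for the real header and data.
rows = [%w[handle title], ['100024', 'Axe Handle']]

# Display on the terminal.
write_csv(rows, $stdout)

# Capture in memory; passing a File opened for writing works the same way.
buffer = StringIO.new
write_csv(rows, buffer)
```

To write a real file, the same helper could be called inside File.open('product_file_1.csv', 'w') { |f| write_csv(rows, f) }.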
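In your code you're using:
xpath("./ItemNumber").first.text
Don't do that. Nokogiri provides the at shortcut, which is equivalent to xpath....first but more succinct. Also, there's no need to use xpath, because Nokogiri's search and at methods are smart enough to figure out, almost every time, whether a selector is XPath or CSS.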
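I'd also recommend not using XPath unless you're forced to. CSS selectors are more readable and include many of the jQuery CSS extensions (if not all of them by now), so using them lets you avoid some of XPath's visual noise.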
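The requirement to create a secondary, mostly empty row when AltImageFile1 is not empty isn't something I'd do or recommend. A CSV row is considered to be a single, separate record, and would be interpreted as such by every application I've seen that supports CSV, so you're asking to create a secondary record with missing fields, which is a non-standard format. Instead, that field should be appended to the same row as an additional field. That logic isn't hard and is left for you to figure out.
The CSV specification, RFC 4180, says: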
Each record is located on a separate line, delimited by a line
break (CRLF). For example:
aaa,bbb,ccc CRLF
zzz,yyy,xxx CRLF
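So, not doing that will break moving the data through many other applications, which is something you should avoid, since CSV is supposed to be a data-transfer format.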
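If you're moving the data into a DBM, create a temporary table to import directly from the XML, run database statements to manipulate the records appropriately, and then append them to the main table. If you're importing the data into Excel, use a separate table, modify the fields, and then copy or merge the data into the regular table. Creating a non-standard representation of the data seems like a dead end to me.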
An alternative would be to use a YAML file, which is more flexible and robust.
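As a rough sketch of the YAML route, using Ruby's standard yaml library: the records below are made up and only mimic what the Nokogiri loop might collect, and the images key is an assumption. Because YAML can nest lists, a product can carry any number of images without extra rows or columns:

```ruby
require 'yaml'

# Made-up records; each product holds a list of image file names.
products = [
  { 'handle' => '100024', 'title' => 'Axe Handle',
    'images' => ['100024.jpg', '103387-1.jpg'] },
  { 'handle' => '100024', 'title' => 'Attic Fan',
    'images' => ['100073.jpg'] }
]

# Serialize to YAML text; File.write could store it on disk.
yaml_text = products.to_yaml
puts yaml_text

# Reading it back yields the same arrays, hashes and strings.
round_trip = YAML.safe_load(yaml_text)
```

Whether the receiving application can read YAML is a separate question, so this only helps if both ends of the transfer are under your control.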