Convert YAML files to CSV (Excel): the /dev/sda problem
I have 100 similar yml files and need to parse all of them into a single csv file. I don't have much practice with Python :(
The yaml file looks like this:
    /dev/sda:
      devname: /dev/sda
      id_model: INTEL_SSDSC2KB240G7
      id_serial_short: PHYS746602EV240AGN
    /dev/sdaa:
      devname: /dev/sdaa
      id_model: INTEL_SSDSC2KG019T8
      id_serial_short: PHYG013000UX1P9DGN
    /dev/sdb:
      devname: /dev/sdb
      id_model: INTEL_SSDSC2KB240G7
      id_serial_short: PHYS745207RL240AGN
    /dev/sdc:
      devname: /dev/sdc
      id_model: INTEL_SSDSC2KG019T8
      id_serial_short: PHYG9271045G1P9DGN
    /dev/sdd:
      devname: /dev/sdd
      id_model: INTEL_SSDSC2KG019T8
      id_serial_short: PHYG013000EP1P9DGN
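This is a small part of one file.
I tried to use and modify the code below, but I don't know how to address the device keys ("/dev/sda" and so on), and it just doesn't work: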
    import yaml
    import csv
    import glob

    yaml_file_names = glob.glob('/home/ranburu/Downloads/disks_info/*.yml')

    rows_to_write = []

    for i, each_yaml_file in enumerate(yaml_file_names):
        print("Processing file {} of {} file name: {}".format(
            i + 1, len(yaml_file_names), each_yaml_file))

        with open(each_yaml_file) as file:
            data = yaml.safe_load(file)
            for instance in data['/dev/sd*']:
                # values = dict()
                # for tag in instance["tags"]:
                #     tag_for_check = tag.split(":")
                #
                #     if tag_for_check[0] == "ip":
                #         values["ip"] = tag_for_check[1]
                #         continue
                #
                #     elif tag_for_check[0] == "name":
                #         values["name"] = tag_for_check[1]
                rows_to_write.append([each_yaml_file, instance["id_model"], instance["id_serial_short"]])

    with open('output_csv_file.csv', 'w', newline='') as out:
        csv_writer = csv.writer(out)
        csv_writer.writerow(["host", "model", "serial"])
        csv_writer.writerows(rows_to_write)
        print("Output file output_csv_file.csv created")
In the end I get this error:
    line 15, in <module>
        for instance in data['/dev/sd*']:
    KeyError: '/dev/sd*'
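I have figured out how this code works, but the /dev/sda, sdb, sdc at the beginning of each block confuses me.
I would be very grateful if you could help me solve this and write working code.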
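yaml.safe_load parses the yaml into a dictionary. This means you can iterate over it with the standard dictionary methods (that is, there is no need to parse it as a string with split and the like). Dictionary keys are matched exactly, so data['/dev/sd*'] raises a KeyError: no key has that literal name.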
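To make the key lookup concrete, here is a minimal sketch. The dictionary literal is what yaml.safe_load returns for the first two entries of the sample file; the `fnmatch` filter at the end is my own addition and is only needed if a file might also contain keys that are not disks.

```python
import fnmatch

# What yaml.safe_load returns for the first two entries of the sample file.
data = {
    "/dev/sda": {
        "devname": "/dev/sda",
        "id_model": "INTEL_SSDSC2KB240G7",
        "id_serial_short": "PHYS746602EV240AGN",
    },
    "/dev/sdaa": {
        "devname": "/dev/sdaa",
        "id_model": "INTEL_SSDSC2KG019T8",
        "id_serial_short": "PHYG013000UX1P9DGN",
    },
}

# The top-level keys are the literal device paths; '/dev/sd*' is not one of them.
device_names = list(data)
has_wildcard_key = "/dev/sd*" in data

# Iterate over key/value pairs instead of looking up a pattern.
models = {name: disk["id_model"] for name, disk in data.items()}

# Optional (assumption): keep only the keys that match a shell-style pattern.
matching = [name for name in data if fnmatch.fnmatch(name, "/dev/sd*")]
```

In the failing loop, replacing `data['/dev/sd*']` with `data.values()` should be enough to get each nested disk dictionary in turn.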
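Assuming you want each column to hold the values of the 3 entries in each nested dictionary, we can take advantage of the fact that dictionaries keep their items in insertion order (guaranteed since Python 3.7) and simply use values() to extract them. The process is demonstrated below with a small piece of yaml data, but it should be easy to nest it in code that iterates over all the files: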
    import csv
    import yaml

    yml = """
    /dev/sda:
      devname: /dev/sda
      id_model: INTEL_SSDSC2KB240G7
      id_serial_short: PHYS746602EV240AGN
    /dev/sdaa:
      devname: /dev/sdaa
      id_model: INTEL_SSDSC2KG019T8
      id_serial_short: PHYG013000UX1P9DGN
    """

    data = yaml.safe_load(yml)

    with open('output_csv_file.csv', 'w', newline='') as out:
        csv_writer = csv.writer(out)
        csv_writer.writerow(["host", "model", "serial"])
        # Each value of the top-level dict is one disk's nested dict.
        for disk in data.values():
            # values() yields devname, id_model, id_serial_short in file order.
            csv_writer.writerow(disk.values())
    # output_csv_file.csv
    host,model,serial
    /dev/sda,INTEL_SSDSC2KB240G7,PHYS746602EV240AGN
    /dev/sdaa,INTEL_SSDSC2KG019T8,PHYG013000UX1P9DGN
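Combined with the file loop from the question, this could look roughly like the sketch below. It looks the fields up by name rather than relying on values(), so a file with extra or reordered keys still gives aligned columns. The function name `yaml_files_to_csv`, the `FIELDS` tuple, and the extra `file` column are my own assumptions, not something taken from the question.

```python
import csv
import glob
import os

import yaml

# Fields copied from each disk entry; names taken from the sample data.
FIELDS = ("devname", "id_model", "id_serial_short")


def yaml_files_to_csv(pattern, out_path):
    """Write one CSV row per disk found in the YAML files matching `pattern`.

    Hypothetical helper: returns the number of data rows written.
    """
    row_count = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["file", *FIELDS])
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                data = yaml.safe_load(f) or {}  # an empty file loads as None
            for disk in data.values():
                # .get() leaves the cell empty when a field is missing
                writer.writerow(
                    [os.path.basename(path), *(disk.get(key, "") for key in FIELDS)]
                )
                row_count += 1
    return row_count
```

With the paths from the question, the call would be `yaml_files_to_csv('/home/ranburu/Downloads/disks_info/*.yml', 'output_csv_file.csv')`.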