Unable to strip and store content of some files in CSV format
I have files that look like this:
They are located at:
~/ansible-environments/aws/random_name_1/inventory/group_vars/all
~/ansible-environments/aws/random_name_2/inventory/group_vars/all
~/ansible-environments/aws/random_name_3/inventory/group_vars/all
I wrote:
import os
import sys

rootdir='/home/USER/ansible-environments/aws'
#print "aa"
for root, subdirs, files in os.walk(rootdir):
    for subdir in subdirs:
        all_path = os.path.join(rootdir, subdir, "inventory", "group_vars", "all")
        if not os.path.isfile(all_path):
            continue
        try:
            with open(all_path, "r") as f:
                all_content = f.readlines()
        except (OSError, IOError):
            continue  # ignore errors
        csv_line = [""] * 3
        for line in all_content:
            if line[:9] == "isv_alias:":
                csv_line[0] = line[7:].strip()
            elif line[:21] == "LMID:":
                csv_line[1] = line[6:].strip()
            elif line[:17] == "products:":
                csv_line[2] = line[10:].strip()
        if all(value != "" for value in csv_line):
            with open(os.path.join("/home/nsingh/nishlist.csv"), "a") as csv:
                csv.write(",".join(csv_line))
                csv.write("\n")
I only need LMIT, isv_alias, and products, in the following format:
alias,LMIT,product
bloodyhell,80,rms_scl
something_else,434,some_other_prod
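You haven't said how you obtain the file (the f in the first line), but assuming you have the file traversal sorted out and the files are exactly as you presented them (so no extra spaces or anything like that), you can modify your code to: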
csv_line = [""] * 3
for line in f:
    if line[:6] == "alias:":
        csv_line[0] = line[7:].strip()
    elif line[:5] == "LMIT:":
        csv_line[1] = line[6:].strip()
    elif line[:9] == "products:":
        csv_line[2] = line[10:].strip()
with open(rootdir + '/' + 'list.csv', "a") as csv:
    csv.write(",".join(csv_line))
    csv.write("\n")
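This adds a new line to your CSV for each file loaded as f, containing the proper variables. Keep in mind that it does not check data validity, so it will happily write an empty line if the opened file does not contain the proper data.
You can prevent that by checking all(value != "" for value in csv_line) before opening the CSV file for writing. If you want to write entries that have at least one variable populated, use any instead of all.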
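To make the difference between the two checks concrete, here is a minimal sketch. The helper name should_write and the sample rows are hypothetical and only serve as illustration; they are not part of the code in the question.

```python
def should_write(csv_line, require_all=True):
    """Decide whether a row is worth writing to the CSV.

    require_all=True  -> every field must be non-empty (uses all)
    require_all=False -> at least one field must be non-empty (uses any)
    """
    check = all if require_all else any
    return check(value != "" for value in csv_line)


# Hypothetical sample rows
complete = ["bloodyhell", "80", "rms_scl"]
partial = ["bloodyhell", "", ""]
empty = ["", "", ""]

print(should_write(complete))                    # all fields present
print(should_write(partial))                     # rejected by all
print(should_write(partial, require_all=False))  # accepted by any
print(should_write(empty, require_all=False))    # rejected by both
```

With all, a file that is missing even one of the three keys produces no row; with any, it still produces a row with empty cells for the missing keys.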
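UPDATE: The code you just pasted has serious indentation and structural issues. The following at least makes more sense for what you are trying to do and, assuming everything else is fine, should do it: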
for root, subdirs, files in os.walk(rootdir):
    for subdir in subdirs:
        all_path = os.path.join(rootdir, subdir, "inventory", "group_vars", "all")
        if not os.path.isfile(all_path):
            continue
        try:
            with open(all_path, "r") as f:
                all_content = f.readlines()
        except (OSError, IOError):
            continue  # ignore errors
        csv_line = [""] * 3
        for line in all_content:
            if line[:6] == "alias:":
                csv_line[0] = line[7:].strip()
            elif line[:5] == "LMIT:":
                csv_line[1] = line[6:].strip()
            elif line[:9] == "products:":
                csv_line[2] = line[10:].strip()
        if all(value != "" for value in csv_line):
            with open(os.path.join(rootdir, "list.csv"), "a") as csv:
                csv.write(",".join(csv_line))
                csv.write("\n")
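The hard-coded slice lengths are easy to get wrong (a prefix compared against a slice of the wrong length never matches). As one possible alternative, here is a sketch that uses str.startswith and derives the slice offset from the prefix itself. The function name parse_fields is hypothetical, and the sketch assumes each line has the form "key: value".

```python
def parse_fields(lines, keys=("alias", "LMIT", "products")):
    """Return one value per key, in key order; missing keys stay as ""."""
    csv_line = [""] * len(keys)
    for line in lines:
        for index, key in enumerate(keys):
            prefix = key + ":"
            if line.startswith(prefix):
                # Slice by the prefix length instead of a hard-coded number
                csv_line[index] = line[len(prefix):].strip()
                break
    return csv_line


# Hypothetical file content, assuming one "key: value" pair per line
sample = ["alias: bloodyhell\n", "LMIT: 80\n", "products: rms_scl\n"]
print(",".join(parse_fields(sample)))
```

Changing the key names (for example to isv_alias) then only requires editing the keys tuple, not the slice indices.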
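There are three problems to solve here:
- Finding all the key-value files
- Extracting the keys and values from each file
- Turning the keys and values from each file into rows in a CSV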
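First, use os.listdir() to find the contents of ~/ansible-environments/aws, then build the expected path of the inventory/group_vars directory inside each one using os.path.join(), and see which of them actually exist. Then list the contents of those directories that do exist, and assume that all files inside (such as all) are key-value files. The example code at the end of this answer assumes that all files can be found this way; if they can't, you may need to adapt the example code to use os.walk() or another method to find the files.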
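The directory discovery step on its own can be sketched as follows. The function name find_group_vars_dirs is hypothetical; it only wraps the os.listdir(), os.path.join() and os.path.isdir() calls described above.

```python
import os


def find_group_vars_dirs(outer_dir):
    """Return the inventory/group_vars directories that actually exist."""
    candidates = [
        os.path.join(outer_dir, name, "inventory", "group_vars")
        for name in sorted(os.listdir(outer_dir))
    ]
    # Keep only the candidate paths that are real directories
    return [path for path in candidates if os.path.isdir(path)]
```

It would be called as find_group_vars_dirs(os.path.expanduser("~/ansible-environments/aws")). Entries in the outer directory that have no inventory/group_vars subdirectory are silently skipped.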
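Each key-value file is a series of lines, where each line is a key and a value separated by a colon (":"). Your approach of searching for a substring (the in operator) will fail if, for example, a key merely contains the string "LMIT". Instead, split the line at the colon. The expression line.split(":", 1) splits the line at the first colon but not at subsequent colons, in case the value itself contains a colon. Then strip excess whitespace from the key and the value, and build a dictionary of keys and values.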
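A small sketch of the splitting step follows. The helper name split_kv and the sample lines are hypothetical; the point is only to show how the second argument of str.split protects values that contain colons.

```python
def split_kv(line):
    """Split a "key: value" line at the first colon only.

    Returns a (key, value) tuple, or None if the line has no colon.
    """
    parts = [part.strip() for part in line.split(":", 1)]
    if len(parts) < 2:
        return None
    return parts[0], parts[1]


# The value keeps its own colons because only the first one is split on
print(split_kv("url: http://example.com:8080\n"))
print(split_kv("LMIT: 80\n"))
print(split_kv("a line without a colon\n"))
```

Because the key is compared as a whole after splitting, a key that merely contains "LMIT" no longer matches by accident.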
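Now choose which keys you want to keep. Once you have parsed each file, look up the associated values in that file's dictionary and build a list from them. Then add this file's list of values to a list of lists holding the values from all files, and use csv.writer to write the list of lists out as a CSV file.
It might look something like this: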
#!/usr/bin/env python2
from __future__ import with_statement, print_function, division
import os
import csv

def read_kv_file(filename):
    items = {}
    with open(filename, "rU") as infp:
        for line in infp:
            # Split at a colon and strip leading and trailing space
            line = [x.strip() for x in line.split(":", 1)]
            # Add the key and value to the dictionary
            if len(line) > 1:
                items[line[0]] = line[1]
    return items

# First find all random names
outer_dir = os.path.expanduser("~/ansible-environments/aws")
random_names = os.listdir(outer_dir)
inner_dirs = [
    os.path.join(outer_dir, name, "inventory/group_vars")
    for name in random_names
]

# Now filter it to those directories that actually exist
inner_dirs = [name for name in inner_dirs if os.path.isdir(name)]

wanted_keys = ["alias", "LMIT", "products"]
out_columns = ["alias", "LMIT", "product"]

# Collect key-value pairs from all files in these folders
rows = []
for dirname in inner_dirs:
    for filename in os.listdir(dirname):
        path = os.path.join(dirname, filename)
        # Skip non-files in this directory
        if not os.path.isfile(path):
            continue
        # If the file has a non-blank value for any of the keys of
        # interest, add a row
        items = read_kv_file(path)
        this_file_values = [items.get(key) for key in wanted_keys]
        if any(this_file_values):
            rows.append(this_file_values)

# And write them out
with open("out.csv", "wb") as outfp:
    writer = csv.writer(outfp, "excel")
    writer.writerow(out_columns)
    writer.writerows(rows)
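The script above targets Python 2, where the csv module expects files opened in binary mode. On Python 3 the file is opened in text mode with newline="" instead. The following is a sketch of just the writing step under that assumption; the function name write_rows, the temporary output path, and the sample rows (taken from the desired output in the question) are illustrative.

```python
import csv
import os
import tempfile


def write_rows(path, header, rows):
    """Write a header and a list of rows as CSV (Python 3 style)."""
    # newline="" lets the csv module control the line endings itself
    with open(path, "w", newline="") as outfp:
        writer = csv.writer(outfp, dialect="excel")
        writer.writerow(header)
        writer.writerows(rows)


# Hypothetical output location and sample data
out_path = os.path.join(tempfile.mkdtemp(), "out.csv")
write_rows(
    out_path,
    ["alias", "LMIT", "product"],
    [["bloodyhell", "80", "rms_scl"],
     ["something_else", "434", "some_other_prod"]],
)
```

On Python 3 the "rU" mode used in read_kv_file would likewise become plain "r", since universal newlines are the default in text mode.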