Unable to strip and store content of some files in CSV format

Question

I have a file that looks like:

They are placed in

~/ansible-environments/aws/random_name_1/inventory/group_vars/all 
~/ansible-environments/aws/random_name_2/inventory/group_vars/all
~/ansible-environments/aws/random_name_3/inventory/group_vars/all

I wrote:

import os
import sys
rootdir='/home/USER/ansible-environments/aws'
#print "aa"
for root, subdirs, files in os.walk(rootdir):
    for subdir in subdirs:
        all_path = os.path.join(rootdir, subdir, "inventory", "group_vars", "all")
        if not os.path.isfile(all_path):
            continue
        try:
            with open(all_path, "r") as f:
                all_content = f.readlines()
        except (OSError, IOError):
            continue  # ignore errors
        csv_line = [""] * 3
        for line in all_content:
            if line[:9] == "isv_alias:":
                csv_line[0] = line[7:].strip()
            elif line[:21] == "LMID:":
                csv_line[1] = line[6:].strip()
            elif line[:17] == "products:":
                csv_line[2] = line[10:].strip()
        if all(value != "" for value in csv_line):
            with open(os.path.join("/home/nsingh/nishlist.csv"), "a") as csv:
                csv.write(",".join(csv_line))
                csv.write("\n")

I only need LMIT, isv_alias, and products, in the following format:

alias,LMIT,product
bloodyhell,80,rms_scl
something_else,434,some_other_prod

Answer 1

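You didn't say how you obtain the file (the f on the first line), but assuming you have the file traversal sorted out and the files are exactly as you presented them (so no extra spaces or anything like that), you can modify your code to: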

csv_line = [""] * 3
for line in f:
    if line[:6] == "alias:":
        csv_line[0] = line[7:].strip()
    elif line[:5] == "LMIT:":
        csv_line[1] = line[6:].strip()
    elif line[:9] == "products:":
        csv_line[2] = line[10:].strip()
with open(rootdir + '/' + 'list.csv', "a") as csv:
    csv.write(",".join(csv_line))
    csv.write("\n")

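This adds a new line to your CSV for each file loaded as f, containing the proper variables. Keep in mind that it doesn't check data validity, so it will happily write an empty line if the opened file doesn't contain the proper data.

You can prevent that by checking all(value != "" for value in csv_line) before opening the csv file for writing. If you want to write entries that have at least one variable populated, you can use any instead of all.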
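To make the difference between the two checks concrete, here is a minimal sketch. The helper name should_write and the sample rows are made up for illustration; only the all/any expression comes from the text above.

```python
def should_write(csv_line, require_all=True):
    """Return True if the row is filled in enough to be written out."""
    # all(...) demands every field; any(...) accepts a partially filled row
    check = all if require_all else any
    return check(value != "" for value in csv_line)


# Hypothetical rows: complete, partially filled, and entirely empty
complete = ["bloodyhell", "80", "rms_scl"]
partial = ["bloodyhell", "", ""]
empty = ["", "", ""]

for row in (complete, partial, empty):
    print(row, should_write(row), should_write(row, require_all=False))
```

With all, only the complete row passes; with any, the partial row passes as well, and the empty row is rejected either way.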

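UPDATE: The code you just pasted has serious indentation and structural problems. The version below at least makes more sense for what you want to do; assuming everything else is fine, it should do the job: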

for root, subdirs, files in os.walk(rootdir):
    for subdir in subdirs:
        all_path = os.path.join(rootdir, subdir, "inventory", "group_vars", "all")
        if not os.path.isfile(all_path):
            continue
        try:
            with open(all_path, "r") as f:
                all_content = f.readlines()
        except (OSError, IOError):
            continue  # ignore errors
        csv_line = [""] * 3
        for line in all_content:
            if line[:6] == "alias:":
                csv_line[0] = line[7:].strip()
            elif line[:5] == "LMIT:":
                csv_line[1] = line[6:].strip()
            elif line[:9] == "products:":
                csv_line[2] = line[10:].strip()
        if all(value != "" for value in csv_line):
            with open(os.path.join(rootdir, "list.csv"), "a") as csv:
                csv.write(",".join(csv_line))
                csv.write("\n")

Answer 2

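There are three problems here:

Finding all the key-value files
Extracting the keys and values from each file
Turning the keys and values from each file into rows in a CSV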

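First, use os.listdir() to find the contents of ~/ansible-environments/aws, then build the expected path of the inventory/group_vars directory inside each entry using os.path.join(), and see which of those actually exist. Then list the contents of the directories that do exist, and assume that all files inside them (such as all) are key-value files. The example code at the end of this answer assumes that all files can be found this way; if they can't, you may have to adapt the example code to use os.walk() or some other method of finding the files.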
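As a rough illustration of that discovery step in isolation, the sketch below builds a throwaway directory tree and filters it. The function name find_group_vars_dirs and the env_a / env_b directory names are assumptions made up for the demo; they are not from the question.

```python
import os
import tempfile


def find_group_vars_dirs(outer_dir):
    """Return the inventory/group_vars directories that actually exist."""
    candidates = [
        os.path.join(outer_dir, name, "inventory", "group_vars")
        for name in sorted(os.listdir(outer_dir))
    ]
    # Keep only the candidate paths that are real directories
    return [path for path in candidates if os.path.isdir(path)]


# Throwaway tree: env_a has the expected layout, env_b does not
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "env_a", "inventory", "group_vars"))
os.makedirs(os.path.join(demo_root, "env_b"))

found = find_group_vars_dirs(demo_root)
print(found)
```

Only one level below the outer directory is inspected, so nothing deeper in the tree is visited the way it would be with os.walk().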

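Each key-value file is a series of lines, where each line is a key and a value separated by a colon (":"). Your approach of searching for a substring (the in operator) will fail if, for example, another key contains the string "LMIT". Instead, split the line at the colon. The expression line.split(":", 1) splits the line at the first colon but not at subsequent colons, in case the value itself contains a colon. Then strip excess whitespace from the key and the value, and build a dictionary of keys and values.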
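A small sketch of the splitting step on its own may help. The sample lines are assumed, since the actual file contents are not shown in the question; parse_kv_line and the OLD_LMIT and url keys are hypothetical.

```python
def parse_kv_line(line):
    """Split one 'key: value' line at the first colon; None if no colon."""
    parts = [part.strip() for part in line.split(":", 1)]
    if len(parts) < 2:
        return None
    return parts[0], parts[1]


# Assumed sample input; the real files may look different
sample_lines = [
    "alias: bloodyhell\n",
    "LMIT: 80\n",
    "OLD_LMIT: 7\n",
    "url: http://example.com:8080\n",
    "a line without a colon\n",
]

pairs = [parse_kv_line(line) for line in sample_lines]
items = dict(pair for pair in pairs if pair is not None)
print(items["LMIT"], items["url"])
```

A substring test such as "LMIT" in line would also match the OLD_LMIT line, whereas looking up items["LMIT"] only returns the value stored under the exact key. The url value keeps its own colons because the split stops after the first one.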

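Now choose which keys you want to keep. Once you've parsed each file, look up the associated values in that file's dictionary and build a list out of them. Then add this file's list of values to a list holding the value lists from all files, and use csv.writer to write the list of lists out as a CSV file.

It might look something like this: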

#!/usr/bin/env python2
from __future__ import with_statement, print_function, division
import os
import csv

def read_kv_file(filename):
    items = {}
    with open(filename, "rU") as infp:
        for line in infp:
            # Split at a colon and strip leading and trailing space
            line = [x.strip() for x in line.split(":", 1)]

            # Add the key and value to the dictionary
            if len(line) > 1:
                items[line[0]] = line[1]
    return items

# First find all random names
outer_dir = os.path.expanduser("~/ansible-environments/aws")
random_names = os.listdir(outer_dir)
inner_dirs = [
    os.path.join(outer_dir, name, "inventory/group_vars")
    for name in random_names
]

# Now filter it to those directories that actually exist
inner_dirs = [name for name in inner_dirs if os.path.isdir(name)]

wanted_keys = ["alias", "LMIT", "products"]
out_columns = ["alias", "LMIT", "product"]

# Collect key-value pairs from all files in these folders
rows = []
for dirname in inner_dirs:
    for filename in os.listdir(dirname):
        path = os.path.join(dirname, filename)

        # Skip non-files in this directory
        if not os.path.isfile(path):
            continue

        # If the file has a non-blank value for any of the keys of
        # interest, add a row
        items = read_kv_file(path)
        this_file_values = [items.get(key) for key in wanted_keys]
        if any(this_file_values):
            rows.append(this_file_values)

# And write them out
with open("out.csv", "wb") as outfp:
    writer = csv.writer(outfp, "excel")
    writer.writerow(out_columns)
    writer.writerows(rows)
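The script above targets Python 2 (note the "rU" and "wb" file modes). For readers on Python 3, the final step of selecting the wanted keys and writing the rows could be sketched as follows. This is an assumed adaptation, not part of the original answer; the dictionaries are made-up stand-ins for what read_kv_file would return, and an in-memory buffer replaces out.csv so the snippet runs without touching the disk.

```python
import csv
import io

wanted_keys = ["alias", "LMIT", "products"]
out_columns = ["alias", "LMIT", "product"]

# Hypothetical parsed files, one dictionary per key-value file
parsed_files = [
    {"alias": "bloodyhell", "LMIT": "80", "products": "rms_scl", "other": "x"},
    {"alias": "something_else", "LMIT": "434", "products": "some_other_prod"},
]

# One row per file, keeping only the wanted keys in a fixed order
rows = [[items.get(key, "") for key in wanted_keys] for items in parsed_files]

# With a real file in Python 3, open it as open("out.csv", "w", newline="")
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(out_columns)
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Using csv.writer rather than ",".join() means any value that happens to contain a comma or a quote is quoted correctly.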

python

os.walk

python-2.7