Compare configuration data text with a default data text

Question

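I'm trying to figure out how to compare the data in two text files and print the data that does not match to a new document or to the output.

Program goals:

Allow the user to compare the data in a file containing many rows of data against a default file that holds the correct data values.
Compare multiple sets of data that share the same parameters against a single default list with those same parameters.

Example:

Suppose I have the following text document containing these parameters and data. Let's call it Config.txt: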

<231931844151>
Bird = 3
Cat = 4
Dog = 5
Bat = 10
Tiger = 11
Fish = 16

<92103884812>
Bird = 4
Cat = 40
Dog = 10
Bat = Null
Tiger = 19
Fish = 24

etc. etc.

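Let's call this my configuration data. Now I need to make sure the values of these parameters in my configuration data file are correct.

So I have a default data file that holds the correct values for these parameters/variables. Let's call it Default.txt: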

<Correct Parameters>
Bird = 3
Cat = 40
Dog = 10
Bat = 10
Tiger = 19
Fish = 234

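This text file is the default, or correct, configuration for the data.

Now I want to compare the two files and print out the data that is incorrect.

So, in theory, if I compare these two text documents I should get the following output. Let's call it Output.txt: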

<231931844151>
Cat = 4
Dog = 5
Tiger = 11
Fish = 16

<92103884812>
Bird = 4
Bat = Null
Fish = 24

etc. etc.

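These are the incorrect or mismatched parameters. In this case, for <231931844151> the parameters Cat, Dog, Tiger and Fish do not match the default text file, so they get printed. For <92103884812>, Bird, Bat and Fish do not match the default parameters, so they get printed.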
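To make the expected result concrete, here is a minimal sketch that reproduces the Output.txt listing from hard-coded copies of the example data. The names `DEFAULTS`, `CONFIG_SETS` and `mismatches` are made up for illustration, values are kept as plain strings, and reading the files is left out on purpose.

```python
# Hypothetical in-memory copies of the example data (values kept as strings).
DEFAULTS = {"Bird": "3", "Cat": "40", "Dog": "10",
            "Bat": "10", "Tiger": "19", "Fish": "234"}

CONFIG_SETS = {
    "<231931844151>": {"Bird": "3", "Cat": "4", "Dog": "5",
                       "Bat": "10", "Tiger": "11", "Fish": "16"},
    "<92103884812>": {"Bird": "4", "Cat": "40", "Dog": "10",
                      "Bat": "Null", "Tiger": "19", "Fish": "24"},
}


def mismatches(params, defaults):
    """Return the (name, value) pairs in params whose value differs from defaults."""
    return [(name, value) for name, value in params.items()
            if defaults.get(name) != value]


for set_name, params in CONFIG_SETS.items():
    print(set_name)
    for name, value in mismatches(params, DEFAULTS):
        print("{} = {}".format(name, value))
    print()
```

Because the values are compared as strings, "Null" is simply another value that fails to equal "10"; no numeric conversion is attempted.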

That's the gist of it for now.

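Code:

This is the approach I'm currently trying, but I'm not sure how to compare a data file that has several sets of lines with the same parameters against the default data file.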

# open() expects a file name, so keep the names here and open the files below
configFile = "Config.txt"
defaultFile = "Default.txt"

with open(configFile) as f:
    dataConfig = f.read().splitlines()

with open(defaultFile) as d:
    dataDefault = d.read().splitlines()

def make_dict(data):
    # map the first word of each non-blank line to the whole line
    return dict((line.split(None, 1)[0], line) for line in data if line.strip())


defdict = make_dict(dataDefault)
outdict = make_dict(dataConfig)

#Create a sorted list containing all the keys
allkeys = sorted(set(defdict) | set(outdict))
#print allkeys

difflines = []
for key in allkeys:
    indef = key in defdict
    inout = key in outdict
    if indef and not inout:
        difflines.append(defdict[key])
    elif inout and not indef:
        difflines.append(outdict[key])
    else:
        #key must be in both dicts
        defval = defdict[key]
        outval = outdict[key]
        if outval != defval:
            difflines.append(outval)

for line in difflines:
    print(line)

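Summary:

I want to compare two text documents containing data/parameters. One document holds a series of data sets that all share the same parameters, while the other holds a single data set with those same parameters. I need to compare those parameters and print the ones that do not match the defaults. How can I do this in Python?

Edit:

OK, thanks to @Maria's code I think I'm almost there. Now I just need to figure out how to compare the dictionaries against the list and print the differences. Here is an example of what I'm trying to do: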

for i in range (len(setNames)):
    print setNames[i]
    for k in setData[i]:
        if k in dataDefault:
            print dataDefault

Obviously the print line is only there to see whether it works, but I'm not sure this is the right way to go about it.

Answer 1

Why not just use those dictionaries and loop over them to compare?

for key in outdict:
    # print the config line when it differs from the default line (or has no default)
    if outdict[key] != defdict.get(key):
        print(outdict[key])
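If it also matters whether a parameter is missing from one side entirely, the loop can be split into three cases using set operations on the dictionary keys. This is only a sketch for Python 3 (where `dict.keys()` supports `-` and `&`); `classify` and the sample dictionaries are hypothetical and map parameter names to plain values.

```python
def classify(config, defaults):
    """Group parameter names by how config differs from defaults."""
    missing = sorted(defaults.keys() - config.keys())   # only in defaults
    extra = sorted(config.keys() - defaults.keys())     # only in config
    changed = sorted(k for k in config.keys() & defaults.keys()
                     if config[k] != defaults[k])       # in both, values differ
    return {"missing": missing, "extra": extra, "changed": changed}


# Hypothetical sample data, not taken from the files in the question.
defaults = {"Bird": "3", "Cat": "40", "Dog": "10"}
config = {"Bird": "3", "Cat": "4", "Wolf": "7"}

result = classify(config, defaults)
print(result)
```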

Answer 2

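Here is sample code for parsing the file into separate dictionaries. It works by finding the group separators (blank lines). setNames[i] is the name of the parameter set stored in the dictionary at setData[i]. Alternatively, you could create an object with a string name member and a dictionary data member, and keep a list of those. Doing the comparison and outputting it the way you want is up to you; this just regurgitates the input file to the command line in a slightly different format.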
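The object-based alternative could look roughly like the sketch below. `ParamSet` and `parse_sets` are hypothetical names, the input is taken from a string rather than a file, and it assumes every group starts with a name line followed by `key = value` lines.

```python
from collections import namedtuple

# Hypothetical container: one named set of parameters.
ParamSet = namedtuple("ParamSet", ["name", "data"])


def parse_sets(text):
    """Split text into blank-line-separated groups and parse each into a ParamSet."""
    sets = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line closes the current group
        elif current is None:
            # the first non-blank line of a group is treated as its name
            current = ParamSet(name=line, data={})
            sets.append(current)
        elif "=" in line:
            key, _, value = line.partition("=")
            current.data[key.strip()] = value.strip()
        # other lines inside a group are ignored
    return sets


sample = "<231931844151>\nBird = 3\nCat = 4\n\n<92103884812>\nBird = 4\nBat = Null\n"
parsed = parse_sets(sample)
for param_set in parsed:
    print(param_set.name, param_set.data)
```

Storing the value on its own (instead of the whole line) makes later comparisons independent of the spacing around the equals sign.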

# The function you wrote
def make_dict(data):
    return dict((line.split(None, 1)[0], line) for line in data if line.strip())

# open the file and read the lines into a list of strings
with open("Config.txt") as f:
    dataConfig = f.read().splitlines()

# get rid of trailing '', as they cause problems and are unnecessary
while dataConfig and dataConfig[-1] == '':
    dataConfig.pop()

# find the indexes of all the ''. They amount to one index past the end of each set of parameters
setEnds = []
index = 0
while '' in dataConfig[index:]:
    setEnds.append(dataConfig.index('', index))
    index = setEnds[-1] + 1

# separate out your input into separate dictionaries, and keep track of the name of each dictionary
setNames = []
setData = []

i = 0
j = 0
while j < len(setEnds):
    setNames.append(dataConfig[i])
    setData.append(make_dict(dataConfig[i+1:setEnds[j]]))
    i = setEnds[j] + 1
    j += 1

# handle the last set, which runs to the end of the list. Alternatively you could add len(dataConfig) to the end of setEnds and you wouldn't need this
if i < len(dataConfig):
    setNames.append(dataConfig[i])
    setData.append(make_dict(dataConfig[i+1:]))

# regurgitate the input to prove it worked the way you wanted.
for i in range(len(setNames)):
    print(setNames[i])
    for k in setData[i]:
        print("\t" + k + ": " + setData[i][k])
    print("")
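For the comparison step itself, one possible sketch is shown below. It assumes the defaults were also run through make_dict, so both sides map the first word of a line to the whole line. `normalize` and `report_differences` are hypothetical helpers, and the small sample inputs are invented for illustration.

```python
def normalize(line):
    """Reduce 'Cat   =  4' to ('Cat', '4') so spacing differences are ignored."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


def report_differences(set_names, set_data, defaults):
    """Build report lines: each set name followed by its lines that differ from defaults."""
    report = []
    for name, data in zip(set_names, set_data):
        report.append(name)
        for key, line in data.items():
            default_line = defaults.get(key)
            if default_line is None or normalize(line) != normalize(default_line):
                report.append(line)
        report.append("")  # blank line between sets
    return report


# Small hypothetical inputs in the shape make_dict produces (first word -> whole line).
defdict = {"Bird": "Bird = 3", "Cat": "Cat = 40"}
setNames = ["<231931844151>", "<92103884812>"]
setData = [
    {"Bird": "Bird = 3", "Cat": "Cat = 4"},
    {"Bird": "Bird = 4", "Cat": "Cat  =  40"},
]

lines = report_differences(setNames, setData, defdict)
print("\n".join(lines))
```

Returning a list of lines keeps the function easy to check; writing them to Output.txt is then a matter of opening the file for writing and joining the lines with newlines.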
