Checking data in a file for duplicates (Python)

Question

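I am trying to build a list of topics for another project to use, and I store the topics in Topics.txt. I do not want the same topic to appear in the file twice. So whenever I save a topic to Topics.txt, I also save it to a Duplicates.txt file. What I want is a conditional statement that skips adding a topic to Topics.txt if the topic is already in Duplicates.txt. My problem is that I don't know how to write a condition that checks whether the topic is listed in Duplicates.txt. A naive keyword scan can go wrong: searching for "music" also matches "electro-music", because that entry contains the word "music".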

Entry = input("Enter topic: ")
Topic = Entry + "\n"
Readfilename = "Duplicates.txt"
Readfile = open(Readfilename, "r")
Readdata = Readfile.read()
Readfile.close()
# Substring test on the whole file: "music\n" also matches inside "electro-music\n"
if Topic not in Readdata:
    Filename = "Topics.txt"
    File = open(Filename, "a")
    File.write(Topic)  # file objects have write(), not append()
    File.close()
    Readfile = open(Readfilename, "a")
    Readfile.write(Topic)
    Readfile.close()

Answer 1

You can store the topics in a set. A set is a collection of unique items.

topics = {'Banjo', 'Guitar', 'Piano'}

You can check for membership using:

>>> 'Banjo' in topics
True

You add new things to the set with .add() (the order in which a set displays its elements may vary):

>>> topics.add('Iceskating')
>>> topics
{'Banjo', 'Guitar', 'Piano', 'Iceskating'}

The Python 3 documentation for sets is here. The tutorial page on sets is here.
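
To connect the set idea back to the files in the question, here is a minimal sketch, assuming Topics.txt holds one topic per line. The helper names load_topics and add_topic are made up for illustration and are not part of any library. Because set membership compares whole strings, "music" does not match "electro-music".

```python
import os
import tempfile


def load_topics(path):
    """Return the set of topics stored one per line in path (empty set if missing)."""
    if not os.path.exists(path):
        return set()
    with open(path, "r") as f:
        # Drop the trailing newline from each line and skip blank lines.
        return {line.rstrip("\n") for line in f if line.strip()}


def add_topic(path, topic, topics):
    """Append topic to path unless it is already in topics; return True if added."""
    if topic in topics:
        return False
    with open(path, "a") as f:
        f.write(topic + "\n")
    topics.add(topic)
    return True


# Demo in a temporary directory so no real Topics.txt is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Topics.txt")
    topics = load_topics(path)
    results = [add_topic(path, t, topics) for t in ["electro-music", "music", "music"]]
    print(results)  # [True, True, False]
    saved = sorted(load_topics(path))
    print(saved)    # ['electro-music', 'music']
```

With this approach the set is rebuilt from Topics.txt itself, so a separate Duplicates.txt may not be needed at all.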

Answer 2

You can read the file line by line, which results in a solution like this:

Entry = input("Enter topic: ")
Topic = Entry + "\n"
Readfilename = "Duplicates.txt"
found=False
with open(Readfilename, "r") as Readfile:
    for line in Readfile:
        if Topic==line:
            found=True
            break # no need to read more of the file

if not found:
    Filename = "Topics.txt"
    with open(Filename, "a") as File:
        File.write(Topic)

    with open(Readfilename, "a") as Readfile:
        Readfile.write(Topic)
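
The same line-by-line check can also be wrapped in a function that copes with Duplicates.txt not existing yet on the first run. This is only a sketch under assumptions: topic_in_file is a hypothetical helper name, and the comparison strips the trailing newline so that a last line without "\n" still matches.

```python
import os
import tempfile


def topic_in_file(topic, path):
    """Return True if some line of path equals topic exactly."""
    try:
        with open(path, "r") as f:
            for line in f:
                if line.rstrip("\n") == topic:
                    return True  # stop reading at the first match
    except FileNotFoundError:
        pass  # a file that does not exist yet contains no topics
    return False


# Demo in a temporary directory so no real Duplicates.txt is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Duplicates.txt")
    missing = topic_in_file("music", path)   # file not created yet
    with open(path, "w") as f:
        f.write("electro-music\nguitar")     # no newline after the last line
    partial = topic_in_file("music", path)   # only a substring of one line
    exact = topic_in_file("guitar", path)    # whole-line match
    print(missing, partial, exact)           # False False True
```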

python

duplicates

keyword-search

python-3.x