Remove certain links from a text file by reading another text file

Question

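So I have whitelist.txt, which contains some links, and scrapedlist.txt, which contains other links as well as the links in whitelist.txt.

I am trying to open and read whitelist.txt, then open and read scrapedlist.txt, in order to write a new file, updatedlist2.txt, that contains everything in scrapedlist.txt minus the contents of whitelist.txt.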

I am new to Python, so I am still learning. I searched for answers, and this is what I came up with:

def whitelist_file_func():
    with open("whitelist.txt", "r") as whitelist_read:
        whitelist_read.readlines()
    whitelist_read.close()

    unique2 = set()

    with open("scrapedlist.txt", "r") as scrapedlist_read:
        scrapedlist_lines = scrapedlist_read.readlines()
    scrapedlist_read.close()

    unique3 = set()

    with open("updatedlist2.txt", "w") as whitelist_write2:
   
        for line in scrapedlist_lines:
            if unique2 not in line and line not in unique3:
                whitelist_write2.write(line)
                unique3.add(line)

I get the following error, and I am also not sure whether my approach is right:

if unique2 not in line and line not in unique3:
TypeError: 'in <string>' requires string as left operand, not set

What should I do to achieve the above, and is my code correct?

Edit:

whitelist.txt:

KUWAIT
ISRAEL
FRANCE

scrapedlist.txt:

USA
CANADA
GERMANY
KUWAIT
ISRAEL
FRANCE

updatedlist2.txt (should look like this):

USA
CANADA
GERMANY

Answer 1

Based on your description, I made a few changes to your code.

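The readlines() method was replaced with read().splitlines(). Both read the entire file and turn each line into a list item. The difference is that readlines() keeps the trailing \n at the end of each item.
unique2 and unique3 were removed. I could not find a use for them.
After the first two blocks, whitelist_lines and scrapedlist_lines are two lists containing the links. Based on your description, we need the lines of scrapedlist_lines that are not in the whitelist_lines list, so the condition if unique2 not in line and line not in unique3: was changed to if line not in whitelist_lines:.
If you are using Python 2.5 or later, the with statement calls close() for you automatically, so the explicit close() calls are not needed.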
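
To illustrate the difference between the two reading methods mentioned above, here is a small sketch. It uses io.StringIO as an in-memory stand-in for whitelist.txt (an assumption made only so the snippet runs without any files on disk); the sample data is taken from the question.

```python
import io

# In-memory stand-in for whitelist.txt (sample data from the question).
sample = "KUWAIT\nISRAEL\nFRANCE\n"

# readlines() keeps the newline at the end of every item.
with_newlines = io.StringIO(sample).readlines()

# read().splitlines() drops the line endings.
without_newlines = io.StringIO(sample).read().splitlines()

print(with_newlines)     # ['KUWAIT\n', 'ISRAEL\n', 'FRANCE\n']
print(without_newlines)  # ['KUWAIT', 'ISRAEL', 'FRANCE']
```

With the newlines stripped, a comparison such as line not in whitelist_lines no longer depends on whether the last line of a file ends with \n.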

The final code is:

# Read the whitelist; splitlines() drops the trailing newline of each line.
with open("whitelist.txt", "r") as whitelist_read:
    whitelist_lines = whitelist_read.read().splitlines()

# Read the scraped links the same way.
with open("scrapedlist.txt", "r") as scrapedlist_read:
    scrapedlist_lines = scrapedlist_read.read().splitlines()

# Write only the scraped lines that are not in the whitelist.
with open("updatedlist2.txt", "w") as whitelist_write2:
    for line in scrapedlist_lines:
        if line not in whitelist_lines:
            whitelist_write2.write(line + "\n")
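
For longer lists, one possible variant of the approach above keeps the whitelist in a set, so each membership test does not scan a whole list. The sketch below is not part of the original answer: the function name filter_links and its parameters are hypothetical, and skipping repeated links with a seen set is only a guess at what unique3 in the question was meant to do. The demo writes the sample data from the question into a temporary directory so it can run anywhere.

```python
import os
import tempfile


def filter_links(scraped_path, whitelist_path, output_path):
    """Write the lines of scraped_path that are not in whitelist_path.

    Hypothetical helper: order is preserved and repeated lines are
    written only once (an assumption about the intended behaviour).
    """
    with open(whitelist_path, "r") as f:
        whitelist = set(f.read().splitlines())

    seen = set()
    kept = []
    with open(scraped_path, "r") as f:
        for line in f.read().splitlines():
            if line not in whitelist and line not in seen:
                seen.add(line)
                kept.append(line)

    with open(output_path, "w") as f:
        for line in kept:
            f.write(line + "\n")
    return kept


# Demo with the sample data from the question, in a temporary directory.
workdir = tempfile.mkdtemp()
whitelist_path = os.path.join(workdir, "whitelist.txt")
scraped_path = os.path.join(workdir, "scrapedlist.txt")
output_path = os.path.join(workdir, "updatedlist2.txt")

with open(whitelist_path, "w") as f:
    f.write("KUWAIT\nISRAEL\nFRANCE\n")
with open(scraped_path, "w") as f:
    f.write("USA\nCANADA\nGERMANY\nKUWAIT\nISRAEL\nFRANCE\n")

result = filter_links(scraped_path, whitelist_path, output_path)
print(result)  # ['USA', 'CANADA', 'GERMANY']
```

If duplicates in scrapedlist.txt should be kept, the seen check can simply be dropped.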
