How to find the intersection (common elements) between two columns of a single csv file in python?

Question

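I have two columns (roughly 1000 rows) in a tab-separated csv file with no headers. Sample column values are shown below. A value can be a phrase or a single word.

CSV file format: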

ac           home
home         big
new city     city
city         paris
heat         waves
blood        blood pressure
relation     blood

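Input format (edit):

I want to compute the common elements between the two columns of the csv file. Is there a way to do this? I have no idea how to achieve it.

I am completely new to files (.csv) and their variants. Any help is much appreciated.

Expected output: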

home, city, blood

I know how to compute the intersection of two dictionaries, lists, etc., but that has not helped me get to the solution I need here.

Answer 1

Use set --> set.intersection

For example:

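import csv

filename = "data.csv"  # assumed path; replace with your tab-separated file

c1, c2 = set(), set()  # unique values seen in column 1 and column 2
with open(filename, newline="") as infile:
    reader = csv.reader(infile, delimiter="\t")
    for row in reader:
        if len(row) >= 2:  # skip blank or incomplete lines
            c1.add(row[0])
            c2.add(row[1])

# Values that occur in both columns
print(c1.intersection(c2))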

Output (sets are unordered, so the order may vary):

{'home', 'city', 'blood'}
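For trying this without a file on disk, here is a minimal self-contained sketch of the same set-intersection idea. The names `SAMPLE` and `common_values` are made up for illustration, the sample rows are the ones from the question with an assumed tab delimiter, and the whitespace stripping and sorting are optional additions rather than part of the answer above.

```python
import csv
import io

# Hypothetical in-memory stand-in for the tab-separated file from the question.
SAMPLE = (
    "ac\thome\n"
    "home\tbig\n"
    "new city\tcity\n"
    "city\tparis\n"
    "heat\twaves\n"
    "blood\tblood pressure\n"
    "relation\tblood\n"
)


def common_values(lines, delimiter="\t"):
    """Return the sorted values that appear in both of the first two columns."""
    first, second = set(), set()
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        # Strip stray padding so "home " and "home" count as the same value.
        first.add(row[0].strip())
        second.add(row[1].strip())
    # `&` is the operator form of set.intersection; sorting gives a stable order.
    return sorted(first & second)


print(common_values(io.StringIO(SAMPLE)))  # ['blood', 'city', 'home']
```

To use it on a real file, pass an open file object instead of the `io.StringIO` wrapper, for example `common_values(open(filename, newline=""))`.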

python

csv

intersection

python-3.x

reader