Common elements in a dataframe column
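I have a list of CSV files that I am currently reading into Pandas dataframes. I need to find the elements that a given column has in common across the dataframes.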
import numpy as np
import pandas as pd
df1 = pd.read_csv("example.csv")
df2 = pd.read_csv("example1.csv")
val = np.intersect1d(df1[' column'], df2[' column'])
How can I do this for multiple files?
You can do it like this:
import pandas as pd

df1 = pd.DataFrame([
    (0, "A"),
    (1, "B"),
    (2, "C"),
    (3, "D")
], columns=["id", "val"])
df2 = pd.DataFrame([
    (0, "A"),
    (1, "A"),
    (2, "A"),
    (3, "D")
], columns=["id", "val"])
df3 = pd.DataFrame([
    (0, "A"),
    (1, "A"),
    (2, "A"),
    (3, "A")
], columns=["id", "val"])
from functools import reduce

import numpy as np

dfs = [df1, df2, df3]
# Fold np.intersect1d over the "val" columns, starting from the first one.
val = reduce(
    lambda acc, x: np.intersect1d(acc, x['val']),
    dfs[1:],
    dfs[0]['val']
)
val
# array(['A'], dtype=object)
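As a rough sketch, the reduce pattern above could be wrapped in a small helper so that it works for any number of dataframes, including one or none. The function name `common_values` and its `column` parameter are hypothetical, introduced here only for illustration; they are not part of pandas or NumPy.

```python
from functools import reduce

import numpy as np
import pandas as pd


def common_values(dfs, column):
    """Return the sorted unique values of `column` present in every dataframe.

    Hypothetical helper for illustration; not a pandas or NumPy function.
    """
    columns = [df[column].to_numpy() for df in dfs]
    if not columns:
        return np.array([], dtype=object)
    # Start from the unique values of the first column so that a single
    # dataframe still yields a sorted, de-duplicated result.
    first = np.unique(columns[0])
    # np.intersect1d returns the sorted unique values found in both inputs,
    # so folding it over the remaining columns keeps only shared values.
    return reduce(np.intersect1d, columns[1:], first)


df1 = pd.DataFrame({"id": [0, 1, 2, 3], "val": ["A", "B", "C", "D"]})
df2 = pd.DataFrame({"id": [0, 1, 2, 3], "val": ["A", "A", "A", "D"]})
df3 = pd.DataFrame({"id": [0, 1, 2, 3], "val": ["A", "A", "A", "A"]})

print(common_values([df1, df2, df3], "val"))
print(common_values([df1, df2], "val"))
```

With the sample data, the first call should leave only "A", and the second should leave "A" and "D".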
You can use set.intersection on multiple sets by unpacking an iterable. Data from @raulferreira.
res = set.intersection(*(set(df['val']) for df in [df1, df2, df3]))
print(res)
# {'A'}
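To tie this back to the original question about multiple files, here is a minimal sketch that reads the column straight from each CSV source and intersects the resulting sets. The helper name `common_values_in_files` is hypothetical, and the in-memory `io.StringIO` objects only stand in for real files so the example runs on its own.

```python
import io

import pandas as pd


def common_values_in_files(sources, column):
    """Hypothetical helper: values of `column` shared by every CSV source.

    `sources` may hold file paths or open file-like objects, since
    pd.read_csv accepts both.
    """
    # Read only the needed column from each source and turn it into a set.
    value_sets = [
        set(pd.read_csv(src, usecols=[column])[column]) for src in sources
    ]
    if not value_sets:
        return set()
    # Unpack the list so set.intersection receives one argument per file.
    return set.intersection(*value_sets)


# In-memory stand-ins for files such as "example.csv" and "example1.csv".
csv_a = io.StringIO("id,val\n0,A\n1,B\n2,C\n3,D\n")
csv_b = io.StringIO("id,val\n0,A\n1,A\n2,A\n3,D\n")
csv_c = io.StringIO("id,val\n0,A\n1,A\n2,A\n3,A\n")

res = common_values_in_files([csv_a, csv_b, csv_c], "val")
print(res)
```

With files on disk, the same function could presumably be given a list of paths instead, for example one built with `glob.glob("*.csv")`. Unlike np.intersect1d, the set-based result has no defined order.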