使用 pandas 根据列查找两个数据帧之间的公共行
Finding common rows between two dataframes based on a column using pandas
我有两个数据框。我需要根据列 'a' 中的公共值提取行。但是,我不想在最后创建一个数据框,而是想保留两个数据框。
例如:
###Consider the following input
df1 = pd.DataFrame({'a':[0,1,1,2,3,4], 'b':['q','r','s','t','u','v'],'c':['a','b','c','d','e','f']})
df2 = pd.DataFrame({'a':[1,4,5,6], 'b':['qq','rr','ss','tt'],'c':[1,2,3,4]})
预期输出为:
###df1:
a. b. c
0. 1. r. a
1. 1. s. c
2. 4. v. f
###df2:
a. b. c
0. 1. qq 1
1. 4. rr 2
我怎样才能达到下面的结果?将不胜感激。
df1 = df1[df1['a'].isin(df2['a'])].reset_index(drop=True)
df2 = df2[df2['a'].isin(df1['a'])].reset_index(drop=True)
你可以用 numpy 的 intersect1d
概括它
import numpy as np
intersection_arr = np.intersect1d(df1['a'], df2['a'])
df1 = df1.loc[df1['a'].isin(intersection_arr),:]
df2 = df2.loc[df2['a'].isin(intersection_arr),:]
超过两个数据帧:
import numpy as np
from functools import reduce
intersection_arr = reduce(np.intersect1d, (df1['a'], df2['a'], df3['a']))
df1 = df1.loc[df1['a'].isin(intersection_arr),:]
df2 = df2.loc[df2['a'].isin(intersection_arr),:]
df3 = df3.loc[df3['a'].isin(intersection_arr),:]
我有两个数据框。我需要根据列 'a' 中的公共值提取行。但是,我不想在最后创建一个数据框,而是想保留两个数据框。
例如:
###Consider the following input
df1 = pd.DataFrame({'a':[0,1,1,2,3,4], 'b':['q','r','s','t','u','v'],'c':['a','b','c','d','e','f']})
df2 = pd.DataFrame({'a':[1,4,5,6], 'b':['qq','rr','ss','tt'],'c':[1,2,3,4]})
预期输出为:
###df1:
a. b. c
0. 1. r. a
1. 1. s. c
2. 4. v. f
###df2:
a. b. c
0. 1. qq 1
1. 4. rr 2
我怎样才能达到下面的结果?将不胜感激。
df1 = df1[df1['a'].isin(df2['a'])].reset_index(drop=True)
df2 = df2[df2['a'].isin(df1['a'])].reset_index(drop=True)
你可以用 numpy 的 intersect1d
概括它import numpy as np
intersection_arr = np.intersect1d(df1['a'], df2['a'])
df1 = df1.loc[df1['a'].isin(intersection_arr),:]
df2 = df2.loc[df2['a'].isin(intersection_arr),:]
超过两个数据帧:
import numpy as np
from functools import reduce
intersection_arr = reduce(np.intersect1d, (df1['a'], df2['a'], df3['a']))
df1 = df1.loc[df1['a'].isin(intersection_arr),:]
df2 = df2.loc[df2['a'].isin(intersection_arr),:]
df3 = df3.loc[df3['a'].isin(intersection_arr),:]