Comparing separate DataFrames in Python and removing values
I have two multi-index DataFrames, data1 and data2, that I am comparing.
Contents of data1 =
      Personalities  Rating
Type
Cat   warm           5
      caring         7
      frightful      9
      happy          3
Contents of data2 =
      Personalities  Rating
Type
Dog   mean           3
      ferocious      8
      loyal          4
      happy          1
      warm           6
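I want to use a masking operation to identify every row whose Personalities value does not appear in both DataFrames (that is, all the unique personality values).
I then need to remove those rows, so that both DataFrames contain the same values in the Personalities column.
My attempt so far: new_data1['Personalities'].isin(new_data2['Personalities']).any(axis=1)
Desired result for new_data1 =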
      Personalities  Rating
Type
Cat   warm           5
      happy          3
Desired result for new_data2 =
      Personalities  Rating
Type
Dog   happy          1
      warm           6
I also want to create a new DataFrame that holds the unique values. It should look like this:
unique_data =
      Personalities  Rating
Type
Cat   caring         7
      frightful      9
Dog   mean           3
      ferocious      8
      loyal          4
Input data:
>>> df1
Personalities Rating
Type
Cat warm 5
Cat caring 7
Cat frightful 9
Cat happy 3
>>> df2
Personalities Rating
Type
Dog mean 3
Dog ferocious 8
Dog loyal 4
Dog happy 1
Dog warm 6
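For anyone who wants to reproduce the input, here is a minimal sketch of how the two frames could be built. It assumes that Type is a plain single-level index, as the printed output suggests; the question calls the frames multi-indexed, so the real data may be laid out differently.

```python
import pandas as pd

# Assumption: "Type" is an ordinary single-level index, as in the printed frames.
df1 = pd.DataFrame(
    {"Personalities": ["warm", "caring", "frightful", "happy"],
     "Rating": [5, 7, 9, 3]},
    index=pd.Index(["Cat"] * 4, name="Type"),
)
df2 = pd.DataFrame(
    {"Personalities": ["mean", "ferocious", "loyal", "happy", "warm"],
     "Rating": [3, 8, 4, 1, 6]},
    index=pd.Index(["Dog"] * 5, name="Type"),
)

print(df1)
print(df2)
```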
Prepare a set of the personality values in each DataFrame:
s1 = set(df1["Personalities"])
s2 = set(df2["Personalities"])
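To see what the two sets give you before touching the DataFrames, here is a small sketch using plain Python sets. The values are copied by hand from the Personalities columns above; the names common, only_in_1 and only_in_2 are just illustrative.

```python
# Values copied from the Personalities columns of df1 and df2.
s1 = {"warm", "caring", "frightful", "happy"}
s2 = {"mean", "ferocious", "loyal", "happy", "warm"}

common = s1 & s2      # same as s1.intersection(s2)
only_in_1 = s1 - s2   # same as s1.difference(s2)
only_in_2 = s2 - s1   # same as s2.difference(s1)

print(sorted(common))     # ['happy', 'warm']
print(sorted(only_in_1))  # ['caring', 'frightful']
print(sorted(only_in_2))  # ['ferocious', 'loyal', 'mean']
```

The intersection is symmetric, so s1.intersection(s2) and s2.intersection(s1) are the same set; the difference is not, which is why both directions are needed for the unique rows.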
Now you can extract the data you need:
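import pandas as pd  # needed for pd.concat below
# Keep only the rows whose personality appears in both frames
new_data1 = df1.loc[df1["Personalities"].isin(s1.intersection(s2))]
new_data2 = df2.loc[df2["Personalities"].isin(s2.intersection(s1))]
# Collect the rows whose personality appears in only one frame
unique_data = pd.concat([df1.loc[df1["Personalities"].isin(s1.difference(s2))],
                         df2.loc[df2["Personalities"].isin(s2.difference(s1))]])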
>>> new_data1
Personalities Rating
Type
Cat warm 5
Cat happy 3
>>> new_data2
Personalities Rating
Type
Dog happy 1
Dog warm 6
>>> unique_data
Personalities Rating
Type
Cat caring 7
Cat frightful 9
Dog mean 3
Dog ferocious 8
Dog loyal 4
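Since the question asks for a masking operation, here is an alternative sketch that skips the sets and uses boolean masks directly. Calling isin on a single column already returns one boolean per row, so no .any(axis=1) is needed. The frame construction is an assumption (Type as a plain index), and mask1 and mask2 are illustrative names.

```python
import pandas as pd

# Assumed reconstruction of the input frames, with "Type" as a plain index.
df1 = pd.DataFrame(
    {"Personalities": ["warm", "caring", "frightful", "happy"],
     "Rating": [5, 7, 9, 3]},
    index=pd.Index(["Cat"] * 4, name="Type"),
)
df2 = pd.DataFrame(
    {"Personalities": ["mean", "ferocious", "loyal", "happy", "warm"],
     "Rating": [3, 8, 4, 1, 6]},
    index=pd.Index(["Dog"] * 5, name="Type"),
)

# True where the row's personality also occurs in the other frame.
mask1 = df1["Personalities"].isin(df2["Personalities"])
mask2 = df2["Personalities"].isin(df1["Personalities"])

new_data1 = df1[mask1]    # shared personalities, Cat rows
new_data2 = df2[mask2]    # shared personalities, Dog rows

# Invert the masks to collect the rows that are unique to one frame.
unique_data = pd.concat([df1[~mask1], df2[~mask2]])

print(new_data1)
print(new_data2)
print(unique_data)
```

Both approaches should select the same rows; the masks can be handy if you want to reuse them, for example to count how many rows were dropped with (~mask1).sum().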