Pandas - Compare 2 columns in a dataframe and return count
DataFrame:
I want to compare these two columns and get the counts of matching and non-matching rows.
The result would look like:
Matched = 3
Un matched = 2
First compare the values with Series.eq and count the results with Series.value_counts, then rename the True and False index labels:
s = (df.input_number.eq(df.org_number)
       .value_counts()
       .rename({True: 'match', False: 'no match'}))
If you need a DataFrame:
df1 = (df.input_number.eq(df.org_number)
         .value_counts()
         .rename({True: 'match', False: 'no match'})
         .rename_axis('state')
         .reset_index(name='count'))
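For reference, the eq / value_counts approach can be run end to end as a minimal sketch. The sample values are borrowed from another answer in this thread and are only an assumption about what the original frame looks like. The reindex call is an extra step that is not part of the answer: it keeps both labels in the result even when every row matches or none does.

```python
import pandas as pd

# Sample data (an assumption; borrowed from another answer in this thread)
df = pd.DataFrame({
    'input_number': [123, 253, 458, 479, 1564],
    'org_number': [1234, 253, 458, 478, 1564],
})

# Element-wise comparison gives a boolean Series; value_counts tallies True/False
counts = (df['input_number'].eq(df['org_number'])
            .value_counts()
            .reindex([True, False], fill_value=0)  # keep both labels even if one is absent
            .rename({True: 'match', False: 'no match'}))

# The same counts as a two-column DataFrame
counts_df = counts.rename_axis('state').reset_index(name='count')

print(counts)
print(counts_df)
```

With this sample data, 'match' is 3 and 'no match' is 2. Note that eq treats NaN as unequal to NaN, so rows where both columns are missing would land in 'no match'.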
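Another way: compare the two columns with a boolean condition and conditionally assign a match status with np.where, then pass the result to get_dummies and sum the columns. In the code below the columns are named code_207 and code_207a; substitute your own column names.
import numpy as np
pd.get_dummies(np.where(df['code_207'] == df['code_207a'], 'matched', 'unmatched')).sum(0)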
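A self-contained version of the get_dummies approach might look like the sketch below. The column names and values follow the sample frame used elsewhere in this thread, which is an assumption, and the variable names status and summary are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data (an assumption; borrowed from another answer in this thread)
df = pd.DataFrame({
    'input_number': [123, 253, 458, 479, 1564],
    'org_number': [1234, 253, 458, 478, 1564],
})

# Label every row as 'matched' or 'unmatched'
status = np.where(df['input_number'] == df['org_number'], 'matched', 'unmatched')

# One-hot encode the labels, then sum each indicator column to get the counts
summary = pd.get_dummies(status).sum(axis=0)

print(summary)
```

One thing to keep in mind with this approach: if every row falls into the same category, get_dummies produces only that one column, so the other label will be missing from the result rather than shown as 0.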
Try this:
import pandas as pd
df = pd.DataFrame(data={'input_number': [123, 253, 458, 479, 1564],
                        'org_number': [1234, 253, 458, 478, 1564]})
# Filter the rows where the columns are equal (or differ) and count them
matched = df[df['input_number'] == df['org_number']].shape[0]
un_matched = df[df['input_number'] != df['org_number']].shape[0]
print("Matched = {}\nUn matched = {}".format(matched, un_matched))
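A shorter variant of the same idea is to sum the boolean mask directly, since True counts as 1 when summed. This is a sketch on the same sample data rather than something proposed in the thread; the name mask is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'input_number': [123, 253, 458, 479, 1564],
    'org_number': [1234, 253, 458, 478, 1564],
})

# Boolean Series: True where the two columns hold the same value
mask = df['input_number'] == df['org_number']

matched = int(mask.sum())       # number of True entries
un_matched = int((~mask).sum())  # number of False entries

print(f"Matched = {matched}\nUn matched = {un_matched}")
# prints:
# Matched = 3
# Un matched = 2
```

This avoids building two filtered copies of the frame just to read their row counts.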