How to print row number inside apply function after a criteria

Question

I have a dataframe like the one shown below

Company,year                                   
T123 Inc Ltd,1990
T124 PVT ltd,1991
T345 Ltd,1990
T789 Pvt.LTd,2001
ABC Limited,1992
ABCDE Ltd,1994
ABC Ltd,1997
ABFE,1987
Tesla ltd,1995
AMAZON Inc,2001
Apple ltd,2003

compare = pd.MultiIndex.from_product([tf['Company'].astype(str),tf['Company'].astype(str)]).to_series()
compare = compare[compare.index.get_level_values(0) != compare.index.get_level_values(1)]
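To make the shape of `compare` concrete, here is a minimal sketch on a three-row stand-in for `tf` (the small frame is an assumption for illustration; the names come from the data above). Each value of the series is a `(left, right)` tuple of company names, and the second line drops the pairs where a name would be compared with itself.

```python
import pandas as pd

# Hypothetical three-row stand-in for the question's `tf` dataframe
tf = pd.DataFrame({"Company": ["T123 Inc Ltd", "T124 PVT ltd", "T345 Ltd"],
                   "year": [1990, 1991, 1990]})

# Plain list of names; the question passes the column itself, which gives the same pairs
names = tf["Company"].astype(str).tolist()

# Every ordered pair of names, as a Series indexed by (left, right)
compare = pd.MultiIndex.from_product([names, names]).to_series()

# Drop the self-pairs such as ('T345 Ltd', 'T345 Ltd')
compare = compare[compare.index.get_level_values(0) != compare.index.get_level_values(1)]

print(len(compare))     # 3 names -> 3 * 2 = 6 ordered pairs
print(compare.iloc[0])  # ('T123 Inc Ltd', 'T124 PVT ltd')
```

With the eleven companies above, the same construction gives 11 * 10 = 110 pairs, ten per input key.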

My data looks like this

I would like to do the following

a) Print the index number (i) at the end of each input key's comparisons. For example, T123 Inc Ltd is compared with ten other strings. Once that is done and processing moves to the next input key, which is T124 PVT ltd, I want the value of i to be incremented by 1 and shown using the print function.

I tried the following

def metrics(tup):
    print(compare.loc[tup]) # doesn't work. I want the index number
    return list(tup)

I don't know how to increment/iterate inside the apply function

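I expect the output of the print statement to look like the below. The print statement should execute only after the comparisons for each input key have finished.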

comparison of 1st input key with 10 strings is done
comparison of 2nd input key with 10 strings is done

Update - code

ls = []
ks=[]
for i, k in enumerate(compare, 1):
    ls.append(metrics(k))
    ks.append(k)
    print(f'comparison of {i} input key: {k} with {len(ls)} strings is done')
pd.concat(pd.DataFrame(ks),pd.DataFrame(ls)) # error
pd.concat(ks,ls)  # error
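For context on the two errors above: `pd.concat` takes a single iterable of pandas objects as its first argument, so passing two separate objects raises a `TypeError`. A minimal sketch, with made-up keys and scores standing in for the `ks` and `ls` lists (all values and column names here are hypothetical):

```python
import pandas as pd

# Hypothetical stand-ins for the `ks` and `ls` lists built in the loop above
ks = [("A Ltd", "B Ltd"), ("A Ltd", "C Inc")]
ls = [[80, 75], [40, 35]]

keys = pd.DataFrame(ks, columns=["key", "other"])
scores = pd.DataFrame(ls, columns=["ratio", "token"])

# Both frames go inside one list; axis=1 places them side by side
combined = pd.concat([keys, scores], axis=1)
print(combined.shape)  # (2, 4)
```

Note that the loop above iterates over every pair in `compare`, so `i` there counts pairs rather than input keys.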

Metrics

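import pandas as pd
from fuzzywuzzy import fuzz  # assumed source of `fuzz`; the question does not show its imports

def metrics(tup):
    # tup is a (left, right) pair of company names; each scorer returns a 0-100 similarity
    return pd.Series([fuzz.ratio(*tup),
                      fuzz.token_sort_ratio(*tup),
                      fuzz.token_set_ratio(*tup),
                      fuzz.QRatio(*tup),
                      fuzz.UQRatio(*tup),
                      fuzz.UWRatio(*tup)],
                     index=['ratio', 'token','set','qr','uqr','uwr'])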

Answer 1

Let's try grouping on level 0 of the index and enumerating the groups, one per unique input key:

groups = []
grouper = compare.groupby(level=0, sort=False)

for i, (k, g) in enumerate(grouper, 1):
    # Execute statements here
    groups.append(g.apply(metrics))
    print(f'comparison of {i} input key: {k} with {len(g)} strings is done')

df_out = pd.concat(groups)

comparison of 1 input key: T123 Inc Ltd with 10 strings is done
comparison of 2 input key: T124 PVT ltd with 10 strings is done
...
comparison of 11 input key: Apple ltd with 10 strings is done
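The loop above can be tried end to end without installing a fuzzy-matching package. The sketch below is an assumption-laden stand-in, not the asker's exact setup: it uses three of the company names and swaps the `fuzz.*` scorers for a single score from the standard library's `difflib.SequenceMatcher`. The `messages` list is added only so the printed lines can be inspected afterwards.

```python
from difflib import SequenceMatcher

import pandas as pd

names = ["T123 Inc Ltd", "T124 PVT ltd", "T345 Ltd"]  # subset of the question's data

compare = pd.MultiIndex.from_product([names, names]).to_series()
compare = compare[compare.index.get_level_values(0) != compare.index.get_level_values(1)]

def metrics(tup):
    # Stand-in for the fuzz.* scorers: one 0-100 similarity score from difflib
    score = round(100 * SequenceMatcher(None, *tup).ratio())
    return pd.Series([score], index=["ratio"])

groups = []
messages = []  # hypothetical helper, kept only to inspect what was printed
for i, (k, g) in enumerate(compare.groupby(level=0, sort=False), 1):
    # g holds every pair whose first element is the input key k
    groups.append(g.apply(metrics))
    msg = f"comparison of {i} input key: {k} with {len(g)} strings is done"
    messages.append(msg)
    print(msg)

# One row per pair, indexed by (input key, compared string)
df_out = pd.concat(groups)
print(df_out.shape)  # (6, 1)
```

Because the print sits after `g.apply(metrics)`, it runs once per input key and only after all of that key's comparisons have finished, which is the behaviour asked for in the question.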

python

series

multi-index

dataframe

pandas