Filter pandas dataframe with SequenceMatcher
When I filter the dataframe with the code below, it works fine:
my_df.loc[lambda x:x["name"]=="space"]
When I filter with the following code instead, I get an error:
my_df.loc[lambda x: difflib.SequenceMatcher(None,"email",x["name"]).ratio()>0.8]
I want to filter with SequenceMatcher, and possibly with conditions more complex than the one above.
The full code is below:
import pandas as pd
import difflib
my_df=pd.DataFrame({"name":["space","mapp","eemail","daata"],"id":[9,12,13,14]})
my_df.loc[lambda x:x["name"]=="space"] #this line works
my_df.loc[lambda x: difflib.SequenceMatcher(None,"email",x["name"]).ratio()>0.8] #this doesn't
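Try applying the comparison to each value of the column with apply, so that SequenceMatcher receives one string at a time: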
my_df.loc[my_df['name'].apply(lambda x: difflib.SequenceMatcher(None,"email",x).ratio()) > 0.8]
Output:
id name
2 13 eemail
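The reason the original attempt fails is that the callable passed to .loc is called once with the whole dataframe, so x["name"] is a Series rather than a single string, and SequenceMatcher(...).ratio() then produces one number instead of a per-row boolean mask. For more complex conditions, a minimal sketch could build the mask from a small helper and combine it with other column tests. The helper name similarity, the 0.8 threshold, and the extra id > 10 condition are illustrative assumptions, not part of the original question:

```python
import difflib

import pandas as pd

my_df = pd.DataFrame(
    {"name": ["space", "mapp", "eemail", "daata"], "id": [9, 12, 13, 14]}
)


def similarity(a, b):
    """Hypothetical helper: similarity ratio between two strings (0.0 to 1.0)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


# One boolean per row: compare each name to "email" individually.
name_mask = my_df["name"].apply(lambda s: similarity("email", s) > 0.8)

# Masks combine with & and |, so further conditions can be added
# (the id > 10 test here is only an example).
result = my_df.loc[name_mask & (my_df["id"] > 10)]

# The callable form of .loc also works, provided the lambda returns a mask
# computed from the column rather than a single True/False value.
result_callable = my_df.loc[
    lambda df: df["name"].apply(lambda s: similarity("email", s) > 0.8)
]
```

Both result and result_callable keep only the row whose name is "eemail". If a condition needs several columns at once, my_df.apply(func, axis=1) passes each row to func, at the cost of being slower on large frames.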