Applying Jaro-Winkler distance to a dataframe
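I have a dataframe with two columns. The first holds the correct strings and the second holds corrupted versions of them. I want to compute the Jaro-Winkler distance for each row and store it in a new third column.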
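import pandas as pd
from pyjarowinkler.distance import get_jaro_distance

# Two aligned columns: the reference strings and their corrupted counterparts.
df = pd.DataFrame(
    {"Correct": ["Hello", "bread", "situation"],
     "Corrupt": ["Hlloe", "braed", "sitatuion"]},
    index=[1, 2, 3])

# Score each (Correct, Corrupt) pair row by row and store the result in a new column.
df["res"] = [get_jaro_distance(x, y) for x, y in zip(df["Correct"], df["Corrupt"])]
print(df)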
     Correct    Corrupt   res
1      Hello      Hlloe  0.88
2      bread      braed  0.95
3  situation  sitatuion  0.97
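For readers who want to see where these numbers come from, here is a minimal pure-Python sketch of the Jaro-Winkler score. It is not the pyjarowinkler implementation: the function names `jaro` and `jaro_winkler` and the parameters `scaling` and `max_prefix` are assumptions chosen for illustration, using the commonly quoted defaults (prefix weight 0.1, prefix capped at 4 characters). Note that the value is a similarity, so scores near 1 mean the strings are close.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two characters only count as a match within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk both strings' matched characters in order and count mismatches.
    out_of_order = 0
    j = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[j]:
                j += 1
            if s1[i] != s2[j]:
                out_of_order += 1
            j += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, scaling: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix (hypothetical helper, assumed defaults)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return base + prefix * scaling * (1.0 - base)


# The same pairs as in the dataframe above, scored without any third-party package.
pairs = [("Hello", "Hlloe"), ("bread", "braed"), ("situation", "sitatuion")]
scores = [round(jaro_winkler(x, y), 2) for x, y in pairs]
print(scores)
```

Under these assumptions the rounded scores agree with the `res` column shown above. To use the sketch with the dataframe, the same list comprehension works with `jaro_winkler(x, y)` in place of `get_jaro_distance(x, y)`.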