From indices to field names in a pandas DataFrame
I need to get the field values back from a list of indices.
My dataset is as follows:
try_test = pd.DataFrame({'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken']})
word name
0 apple dog
1 orange cat
2 diet mad cat
3 energy good dog
4 fire bad dog
5 cake chicken
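Using this function:
# Assumption: fuzz comes from fuzzywuzzy (thefuzz and rapidfuzz expose the same function)
from fuzzywuzzy import fuzz

def func(name):
    # True for every row whose 'name' is at least 85% similar to the given name
    matches = try_test.apply(lambda row: (fuzz.partial_ratio(row['name'], name) >= 85), axis=1)
    return [i for i, x in enumerate(matches) if x]

try_test.apply(lambda row: func(row['name']), axis=1)
I get the following values: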
0 [0, 3, 4]
1 [1, 2]
2 [1, 2]
3 [0, 3]
4 [0, 4]
5 [5]
I want the values of the word field instead of the indices.
Expected output:
0 [apple, energy, fire]
1 [orange, diet]
2 [orange, diet]
3 [apple, energy]
4 [apple, fire]
5 [cake]
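Any suggestions would be appreciated.
Once you have the indices, you can solve this by simply indexing into the DataFrame again. IMO you can do that either outside your func or inside it: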
In [2]: import pandas as pd
In [3]: try_test = pd.DataFrame({'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
...: 'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken']})
In [4]: try_test
Out[4]:
word name
0 apple dog
1 orange cat
2 diet mad cat
3 energy good dog
4 fire bad dog
5 cake chicken
In [5]: rows = [0,3,4]
In [6]: try_test.loc[rows, 'word']
Out[6]:
0 apple
3 energy
4 fire
Name: word, dtype: object
In [7]: try_test.loc[rows, 'word'].values.tolist()
Out[7]: ['apple', 'energy', 'fire']
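One caveat worth noting: the numbers produced by enumerate are positions, whereas .loc looks rows up by label. The two coincide here only because the frame uses the default 0..n-1 index. The following is a minimal sketch, assuming a hypothetical frame re-labelled with letters (not part of the question), that uses .iloc to look up by position instead:

```python
import pandas as pd

# Hypothetical variant of the question's data with a non-default index.
df = pd.DataFrame(
    {'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake']},
    index=list('abcdef'),  # assumed labels, for illustration only
)

rows = [0, 3, 4]  # positions, e.g. as returned by enumerate

# .iloc selects by position, so it works regardless of the index labels.
words = df['word'].iloc[rows].tolist()
print(words)
```

With the default index from the question, .loc and .iloc give the same result.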
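Change the function to return try_test.word[i] instead of i:
def func(name):
    matches = try_test.apply(lambda row: (fuzz.partial_ratio(row['name'], name) >= 85), axis=1)
    # Look up the word for each matching row instead of returning its index
    return [try_test.word[i] for i, x in enumerate(matches) if x]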
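As an alternative, the boolean matches can be used directly as a mask, which skips the intermediate list of indices. The sketch below is self-contained and makes one loud assumption: is_match is a hypothetical stand-in (a plain substring check) for fuzz.partial_ratio(a, b) >= 85, so that it runs without a fuzzy-matching library. In real use you would swap the fuzzy comparison back in.

```python
import pandas as pd

try_test = pd.DataFrame({
    'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
    'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken'],
})


def is_match(a, b):
    # Hypothetical stand-in for fuzz.partial_ratio(a, b) >= 85:
    # treat two names as matching when one contains the other.
    return a in b or b in a


def matching_words(name):
    # Boolean Series aligned with try_test's index: True where 'name' matches.
    mask = try_test['name'].apply(lambda other: is_match(other, name))
    # Select the 'word' column for the matching rows only.
    return try_test.loc[mask, 'word'].tolist()


result = try_test['name'].apply(matching_words)
print(result)
```

Because the mask carries the frame's own index, this version does not depend on the index being 0..n-1.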