How to change a list of synsets to list elements?

I have tried the following code snippet for my project:

import pandas as pd
import nltk
from nltk.corpus import wordnet as wn

nltk.download('wordnet')
df=[]
hypo = wn.synset('science.n.01').hyponyms()
hyper = wn.synset('science.n.01').hypernyms()
mero = wn.synset('science.n.01').part_meronyms()
holo = wn.synset('science.n.01').part_holonyms()
ent = wn.synset('science.n.01').entailments()
df = df+hypo+hyper+mero+holo+ent
df_agri_clean = pd.DataFrame(df)
df_agri_clean.columns=["Items"]
# Set the display option before printing so that it takes effect.
pd.set_option('display.expand_frame_repr', False)
print(df_agri_clean)

It gives me this DataFrame as output:

                             Items
0            Synset('agrobiology.n.01')
1               Synset('agrology.n.01')
2               Synset('agronomy.n.01')
3         Synset('architectonics.n.01')
4      Synset('cognitive_science.n.01')
5          Synset('cryptanalysis.n.01')
6    Synset('information_science.n.01')
7            Synset('linguistics.n.01')
8            Synset('mathematics.n.01')
9             Synset('metallurgy.n.01')
10             Synset('metrology.n.01')
11       Synset('natural_history.n.01')
12       Synset('natural_science.n.01')
13             Synset('nutrition.n.03')
14            Synset('psychology.n.01')
15        Synset('social_science.n.01')
16            Synset('strategics.n.01')
17           Synset('systematics.n.01')
18           Synset('thanatology.n.01')
19            Synset('discipline.n.01')
20     Synset('scientific_theory.n.01')
21  Synset('scientific_knowledge.n.01')

Simply printing df shows it as a list:

[Synset('agrobiology.n.01'), Synset('agrology.n.01'), Synset('agronomy.n.01'), Synset('architectonics.n.01'), Synset('cognitive_science.n.01'), Synset('cryptanalysis.n.01'), Synset('information_science.n.01'), Synset('linguistics.n.01'), Synset('mathematics.n.01'), Synset('metallurgy.n.01'), Synset('metrology.n.01'), Synset('natural_history.n.01'), Synset('natural_science.n.01'), Synset('nutrition.n.03'), Synset('psychology.n.01'), Synset('social_science.n.01'), Synset('strategics.n.01'), Synset('systematics.n.01'), Synset('thanatology.n.01'), Synset('discipline.n.01'), Synset('scientific_theory.n.01'), Synset('scientific_knowledge.n.01')]

I would like to change every entry under "Items" like this: Synset('agrobiology.n.01') => agrobiology.n.01, or Synset('agrobiology.n.01') => 'agrobiology'. Any relevant answer would be appreciated. Thanks!

To get the name of each item, call its name() method, e.g. synset.name(). You can update the column with a list comprehension, as follows:

df_agri_clean['Items'] = [df_agri_clean['Items'][i].name() for i in range(len(df_agri_clean))] 
df_agri_clean

Output, as you wanted:

    Items
0   agrobiology.n.01
1   agrology.n.01
2   agronomy.n.01
3   architectonics.n.01
4   cognitive_science.n.01
5   cryptanalysis.n.01
6   information_science.n.01
7   linguistics.n.01
8   mathematics.n.01
9   metallurgy.n.01
10  metrology.n.01
11  natural_history.n.01
12  natural_science.n.01
13  nutrition.n.03
14  psychology.n.01
15  social_science.n.01
16  strategics.n.01
17  systematics.n.01
18  thanatology.n.01
19  discipline.n.01
20  scientific_theory.n.01
21  scientific_knowledge.n.01
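
The index-based loop above can also be written with pandas' Series.map, which calls a function on every element of the column. This is only a minimal sketch: StandInSynset is a hypothetical stand-in class (not part of NLTK) that mimics just the name() method, so the example runs without downloading WordNet. With real data, the same map call would be applied to df_agri_clean['Items'].

```python
from operator import methodcaller

import pandas as pd


class StandInSynset:
    """Hypothetical stand-in for an NLTK Synset; it only mimics .name()."""

    def __init__(self, name):
        self._name = name

    def name(self):
        return self._name

    def __repr__(self):
        return f"Synset('{self._name}')"


# Two sample entries taken from the output shown in the question.
synsets = [StandInSynset("agrobiology.n.01"), StandInSynset("nutrition.n.03")]
df_demo = pd.DataFrame({"Items": synsets})

# Call .name() on every element of the column, without indexing by position.
df_demo["Items"] = df_demo["Items"].map(methodcaller("name"))
print(df_demo["Items"].tolist())
```

Mapping over the column avoids the positional lookup df['Items'][i], which depends on the DataFrame having a default 0..n-1 index.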

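To additionally remove '.n.01' from the strings, you can do the following. Apply it to the original column of Synset objects rather than after the previous step, because it calls name() again, which plain strings do not have: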

df_agri_clean['Items'] = [df_agri_clean['Items'][i].name().replace('.n.01', '') for i in range(len(df_agri_clean))] 
df_agri_clean

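Output (like your second expected output, except for nutrition.n.03, which is left unchanged because its suffix is '.n.03' rather than '.n.01'):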


    Items
0   agrobiology
1   agrology
2   agronomy
3   architectonics
4   cognitive_science
5   cryptanalysis
6   information_science
7   linguistics
8   mathematics
9   metallurgy
10  metrology
11  natural_history
12  natural_science
13  nutrition.n.03
14  psychology
15  social_science
16  strategics
17  systematics
18  thanatology
19  discipline
20  scientific_theory
21  scientific_knowledge
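
Since replace('.n.01', '') only matches that exact suffix, one possible way to handle entries such as nutrition.n.03 is to split off the last two dot-separated parts instead. The sketch below assumes every name has the shape lemma.pos.nn, as in the output above; strip_sense_suffix is a hypothetical helper name, not an NLTK function.

```python
def strip_sense_suffix(synset_name):
    """Return the lemma part of a name shaped like 'lemma.pos.nn'."""
    # Split from the right at most twice, so any dots inside the lemma survive.
    return synset_name.rsplit(".", 2)[0]


names = ["agrobiology.n.01", "nutrition.n.03", "scientific_knowledge.n.01"]
print([strip_sense_suffix(n) for n in names])
# → ['agrobiology', 'nutrition', 'scientific_knowledge']
```

On a column that already holds the name strings, this could be applied as df_agri_clean['Items'].map(strip_sense_suffix).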