
pandas: replace values in a column based on a condition in another dataframe if that value is in the second dataframe


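import pandas as pd
df = pd.DataFrame({'text':['I go to school','open the green door', 'go out and play'],
                   'pos':[['PRON','VERB','ADP','NOUN'],['VERB','DET','ADJ','NOUN'],['VERB','ADP','CCONJ','VERB']]})

df2 = pd.DataFrame({'verbs':['go','open','close','share','divide'],
                    'new_verbs':['went','opened','closed','shared','divided']})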

I would like to replace the verbs in df.text with their past forms from df2.new_verbs whenever the verb is found in df2.verbs. So far, I have done the following:

df['text'] = df['text'].str.split()
new_df = df.apply(pd.Series.explode)
new_df = new_df.assign(new=lambda d: d['pos'].mask(d['pos'] == 'VERB', d['text']))
new_df.text[] = df2.new_verbs

The desired output is:


       text    pos    new
0       I   PRON   PRON
0    went   VERB     go
0      to    ADP    ADP
0  school   NOUN   NOUN
1  opened   VERB   open
1     the    DET    DET
1   green    ADJ    ADJ
1    door   NOUN   NOUN
2    went   VERB     go
2     out    ADP    ADP
2     and  CCONJ  CCONJ
2    play   VERB   play


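import re
# alternation of all known verbs, escaped so they are matched literally
regex = '|'.join(map(re.escape, df2['verbs']))
# lookup Series: verb -> past form
s = df2.set_index('verbs')['new_verbs']

# replace each match with its past form; fall back to the matched text itself
df['text2'] = df['text'].str.replace(regex, lambda m: s.get(m.group(), m.group()),
                                     regex=True)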


                  text                       pos                  text2
0       I go to school   [PRON, VERB, ADP, NOUN]       I went to school
1  open the green door    [VERB, DET, ADJ, NOUN]  opened the green door
2      go out and play  [VERB, ADP, CCONJ, VERB]      went out and play
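
One caveat with a plain alternation such as go|open|close is that it also matches inside longer words, for example the "go" in "going". Below is a minimal sketch of a stricter variant, assuming whole-word matching is what is wanted. The `\b` anchors, the helper name `to_past`, and the extra fourth row are additions for illustration and not part of the answer above.

```python
import re
import pandas as pd

# Sample data from the question, plus one hypothetical row ('I am going home')
# added only to show the effect of the word boundaries
df = pd.DataFrame({'text': ['I go to school', 'open the green door',
                            'go out and play', 'I am going home']})
df2 = pd.DataFrame({'verbs': ['go', 'open', 'close', 'share', 'divide'],
                    'new_verbs': ['went', 'opened', 'closed', 'shared', 'divided']})

# Lookup Series: present-tense verb -> past-tense verb
s = df2.set_index('verbs')['new_verbs']

# \b ... \b restricts matches to whole words, so 'go' inside 'going' is left alone
pattern = r'\b(?:' + '|'.join(map(re.escape, df2['verbs'])) + r')\b'

def to_past(m):
    # hypothetical helper: return the mapped verb, or the matched text unchanged
    return s.get(m.group(), m.group())

df['text2'] = df['text'].str.replace(pattern, to_past, regex=True)
print(df)
```

Without the `\b` anchors, the last row would have its "go" in "going" replaced as well, which is usually not what is intended.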

For smaller lists, you can use pandas replace with a dictionary like this:

verbs_map = dict(zip(df2.verbs, df2.new_verbs))

Basically, dict(zip(df2.verbs, df2.new_verbs)) creates a new dictionary that maps each verb to its new (past tense) form, e.g. {'go': 'went', 'close': 'closed', ...}.
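
Putting the dictionary approach together on a token-level frame might look like the sketch below. It assumes the `text`/`pos` and `verbs`/`new_verbs` columns described in the question, and pandas 1.3 or later for the multi-column `explode`. `Series.replace` with a dict swaps whole cell values only, so it needs one token per row, hence the split and explode first. The name `sentences` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'text': ['I go to school', 'open the green door', 'go out and play'],
                   'pos': [['PRON', 'VERB', 'ADP', 'NOUN'],
                           ['VERB', 'DET', 'ADJ', 'NOUN'],
                           ['VERB', 'ADP', 'CCONJ', 'VERB']]})
df2 = pd.DataFrame({'verbs': ['go', 'open', 'close', 'share', 'divide'],
                    'new_verbs': ['went', 'opened', 'closed', 'shared', 'divided']})

# One row per token, with text and pos exploded in parallel
new_df = df.assign(text=df['text'].str.split()).explode(['text', 'pos'])

# Present-tense verb -> past-tense verb
verbs_map = dict(zip(df2.verbs, df2.new_verbs))

# Replace exact cell matches only; tokens that are not keys stay as they are
new_df['text'] = new_df['text'].replace(verbs_map)

# Optionally join the tokens back into one sentence per original row
sentences = new_df.groupby(level=0)['text'].agg(' '.join)
print(new_df)
print(sentences)
```

Note that "play" is tagged VERB but does not appear in df2.verbs, so it is left unchanged.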