pandas: replace only the word, not the entire sentence
I have a data frame like the following (e.g.):
import pandas as pd
df = pd.DataFrame({'text':['Lary Page is visiting on Saturday',' On Monday his boss, Maria Jackson is here .']})
I want to replace the days of the week collected in the list below with random days from the Faker library. Here is what I did:
from faker import Faker
import numpy as np
fake = Faker()
days_list = ['Saturday','Monday','Tuesday']
I tried the following, but both attempts return only the replacement day rather than the whole sentence:
df.text = np.where(df.text.str.contains('|'.join(days_list)),
                   fake.day_of_week(), df.text)
or
df.text.str.replace('|'.join(days_list), fake.day_of_week())
Desired output of print(df) (e.g.):
'Lary Page is visiting on Tuesday'
'On Thursday his boss, Maria Jackson is here .'
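Pass a lambda function as the replacement callback to str.replace. It is called once per match, so only the matched word is replaced and each match gets its own random day: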
regex = '|'.join(days_list)
df['text'] = df.text.str.replace(regex, lambda x: fake.day_of_week(), regex=True)
print(df)
text
0 Lary Page is visiting on Tuesday
1 On Thursday his boss, Maria Jackson is here .
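The callback approach above can be tightened a little. The following is a minimal sketch, not the answerer's code: it wraps the alternation in word boundaries so that a longer word such as "Mondays" is left alone, and escapes each list entry with re.escape. To stay runnable without Faker installed, it uses a hypothetical random_day helper built on random.choice as a stand-in for fake.day_of_week(); the DAYS list and the third sample row are likewise assumptions added for illustration.

```python
import random
import re

import pandas as pd

# Assumption: the seven weekday names, standing in for Faker's day_of_week()
DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']
days_list = ['Saturday', 'Monday', 'Tuesday']


def random_day(match):
    # Hypothetical helper: called once per regex match, returns a random day
    return random.choice(DAYS)


df = pd.DataFrame({'text': ['Lary Page is visiting on Saturday',
                            ' On Monday his boss, Maria Jackson is here .',
                            'Mondays are busy']})

# \b ... \b restricts matches to whole words; re.escape guards against
# regex metacharacters in the list entries
pattern = r'\b(?:' + '|'.join(re.escape(d) for d in days_list) + r')\b'

# With a callable replacement, only the matched word is swapped out
df['text'] = df['text'].str.replace(pattern, random_day, regex=True)
print(df)
```

Because the replacement is random, a matched day can occasionally be replaced by the same day. The rest of each sentence, including leading whitespace and punctuation, is left untouched.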
Alternatively, split each sentence into words and replace every word that appears in days_list:
from faker import Faker
import pandas as pd
df = pd.DataFrame({'text':['Lary Page is visiting on Saturday',' On Monday his boss, Maria Jackson is here .']})
fake = Faker()
days_list = ['Saturday','Monday','Tuesday']
df['text'] = df['text'].apply(lambda x: ' '.join(fake.day_of_week() if i in days_list else i for i in x.split()))
print(df)
Output:
text
0 Lary Page is visiting on Tuesday
1 On Monday his boss, Maria Jackson is here .
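One caveat with the split-based approach is that it compares whole whitespace-separated tokens, so a day followed directly by punctuation (for example "Monday,") is not replaced, and runs of whitespace are collapsed by the join. The sketch below compares the two strategies on a single string. The function names, the DAYS list, and the sample sentence are hypothetical, and random.choice again stands in for fake.day_of_week().

```python
import random
import re

# Assumption: stand-in for the values Faker's day_of_week() can return
DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']
days_list = ['Saturday', 'Monday', 'Tuesday']


def replace_by_split(text):
    # Token-based: a token is replaced only if it equals a list entry exactly
    return ' '.join(random.choice(DAYS) if word in days_list else word
                    for word in text.split())


def replace_by_regex(text):
    # Regex-based: matches the day as a whole word, whatever surrounds it
    pattern = r'\b(?:' + '|'.join(map(re.escape, days_list)) + r')\b'
    return re.sub(pattern, lambda match: random.choice(DAYS), text)


sample = ' See you Monday,  or maybe Tuesday.'
print(repr(replace_by_split(sample)))   # days untouched, whitespace normalised
print(repr(replace_by_regex(sample)))   # days replaced, spacing preserved
```

If the text is known to contain days only as standalone tokens, the split version is sufficient. Otherwise the regex version is the safer choice.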