How do I extract data from a DataFrame using regular expressions?

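I'm trying to clean up the data in a DataFrame, but I'm having trouble replacing values. The original values look like "31^" or "54_", and I need them as integers, e.g. 31 and 54.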

import pandas as pd
frame = pd.DataFrame({'first': [123, '32^'], 'second': [23,'13_']})
frame['first'] = frame['first'].str.extract(r'([0-9]+)', expand=False)


  first second
0   NaN     23
1    32    13_

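Use Series.str.extract together with fillna. The .str accessor returns NaN for values that are not strings (such as the integer 123), so fillna restores those values from the original column: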

In [679]: frame['first'] = frame['first'].str.extract(r'(\d+)', expand=False).fillna(frame['first'])

In [680]: frame['second'] = frame['second'].str.extract(r'(\d+)', expand=False).fillna(frame['second'])

In [681]: frame
Out[681]: 
  first second
0   123     23
1    32     13
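
Note that the columns above still mix integers (123, 23) with strings ('32', '13'). If real integers are needed, one possible variation, offered as a sketch rather than as part of the original answer, is to cast each column to str before extracting and then convert the result to int. The helper name digits_to_int is hypothetical, and the sketch assumes every value contains at least one digit.

```python
import pandas as pd

frame = pd.DataFrame({'first': [123, '32^'], 'second': [23, '13_']})

def digits_to_int(column: pd.Series) -> pd.Series:
    """Hypothetical helper: keep the first run of digits in each value, as an integer."""
    # Cast to str so the .str accessor also handles plain ints such as 123.
    digits = column.astype(str).str.extract(r'(\d+)', expand=False)
    # Assumption: every value has at least one digit; a NaN here would make astype(int) raise.
    return digits.astype(int)

# DataFrame.apply calls the helper once per column.
cleaned = frame.apply(digits_to_int)
print(cleaned)
```

Casting to str first avoids the fillna step entirely, at the cost of failing loudly if a value has no digits at all, which may or may not be what you want.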