Extracting a specific part of a string in Python
I am trying to extract a specific part of each string in a pandas Series.
For example:
energy['Country']
gives me:
27 Aruba
28 Australia1
29 Austria
30 Azerbaijan
31 Bahamas
32 Bahrain
33 Bangladesh
34 Barbados
35 Belarus
36 Belgium
37 Belize
38 Benin
39 Bermuda
40 Bhutan
41 Bolivia (Plurinational State of)
42 Bonaire, Sint Eustatius and Saba
I want to change 'Bolivia (Plurinational State of)' to 'Bolivia'.
My failed attempt was:
pattern = "(.*?)"
list = [re.sub(pattern, '', i) for i in energy['Country']]
energy['Country'] = list
Can anyone suggest how to modify my code so that it works?
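Do this (the parentheses must be escaped so they match literally, and recent pandas versions need regex=True for the pattern to be treated as a regular expression):
df['Country'] = df['Country'].str.replace(r"\s*\(.*\)", "", regex=True)
Example with a sample DataFrame: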
In [91]: df
Out[91]:
Country
0 Aruba
1 Australia1
2 Bolivia (Plurinational State of)
In [93]: df['Country'] = df['Country'].str.replace(r"\s*\(.*\)", "", regex=True)
In [94]: df
Out[94]:
Country
0 Aruba
1 Australia1
2 Bolivia