Dataframe - filter the values of a particular column with isin()
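I have a pandas DataFrame with a "Bio Location" column, and I want to filter it so that only the locations containing a city name from my list are kept. I wrote the following code, but I ran into a problem.
For example, if the location is "Paris France" and Paris is in my list, it returns the result. But if I have "France Paris", it does not return "Paris". Do you have a solution? Maybe using a regex? Thanks a lot!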
import pandas as pd

df = pd.read_csv(path_to_file, encoding='utf-8', sep=',')
cities = ['Paris', 'Bruxelles', 'Madrid']
values = df[df['Bio Location'].isin(cities)]
values.to_csv(r'results.csv', index=False)
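What you want here is .str.contains(). isin() only keeps rows whose entire value equals one of the list items, whereas .str.contains() tests whether a pattern occurs anywhere in the string: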
1. The DataFrame I tested with:
data = {
    # each city appears once at the start, once in the middle, and once at the end of a string
    'col1': ['Paris France', 'France Paris Test', 'France Paris', 'Madrid Spain', 'Spain Madrid Test', 'Spain Madrid']
}
df = pd.DataFrame(data)
df
Result:
index | col1 |
---|---|
0 | Paris France |
1 | France Paris Test |
2 | France Paris |
3 | Madrid Spain |
4 | Spain Madrid Test |
5 | Spain Madrid |
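For contrast, here is a minimal sketch of what isin() does on this kind of data. It rebuilds the same test values and uses a two-item `cities` list chosen for illustration; since isin() compares whole cell values, no row should survive the filter.

```python
import pandas as pd

# Same test values as in step 1.
df = pd.DataFrame({
    "col1": [
        "Paris France", "France Paris Test", "France Paris",
        "Madrid Spain", "Spain Madrid Test", "Spain Madrid",
    ]
})

# Illustrative list; the question's list also contains "Bruxelles".
cities = ["Paris", "Madrid"]

# isin() keeps a row only when the entire cell equals one of the list items.
exact = df[df["col1"].isin(cities)]

# No cell is exactly "Paris" or "Madrid", so the filtered frame is empty.
print(len(exact))
```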
2. Then apply the code below (updated following the comments):
# the pattern matches a city at the start, in the middle, or at the end of a string
reg = ('Paris|Madrid')
df = df[df.col1.str.contains(reg)]
df
Result:
index | col1 |
---|---|
0 | Paris France |
1 | France Paris Test |
2 | France Paris |
3 | Madrid Spain |
4 | Spain Madrid Test |
5 | Spain Madrid |
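Applied to the question's own column, the same idea might look like the sketch below. It builds the pattern from the `cities` list instead of typing it by hand. The sample rows are hypothetical stand-ins for the CSV, and the word boundaries, `case=False`, and `na=False` are assumptions about the desired behaviour rather than part of the original answer.

```python
import re
import pandas as pd

# Hypothetical sample data standing in for the CSV from the question.
df = pd.DataFrame({
    "Bio Location": ["Paris France", "France Paris", "Bruxelles", "Berlin", None],
})

cities = ["Paris", "Bruxelles", "Madrid"]

# Escape each name so regex metacharacters are treated literally, then join
# the names with "|". The \b anchors avoid matches inside longer words, and
# the non-capturing group (?:...) avoids pandas' warning about match groups.
pattern = r"\b(?:" + "|".join(re.escape(c) for c in cities) + r")\b"

# case=False ignores letter case; na=False treats missing values as non-matches.
mask = df["Bio Location"].str.contains(pattern, case=False, na=False)
values = df[mask]
print(values)
```

If exact case matters, or city names should also match inside longer words, drop `case=False` or the `\b` anchors respectively.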