.isin() returns an empty set when filtering an object column in a DataFrame
I am reading and appending Excel files to create a DataFrame:
import os
import pandas as pd
folder = r'C:\mypathtodocuments'
files = os.listdir(folder)
df = pd.DataFrame()
for file in files:
    if file.endswith('.xlsx'):
        # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
        df = pd.concat([df, pd.read_excel(os.path.join(folder, file))])
# Drop extra columns from wrong data
df1 = df[['FIRST_NM', 'LAST_NM', 'CITY_AD']]
Preview of the CITY_AD column:
>>> df1["CITY_AD"]
0 EL PASO
1 HOUSTON
2 HOUSTON
3 CONROE
4 MCKINNEY
5 MCKINNEY
6 KATY
7 TOMBALL
8 TOMBALL
9 SPRING
10 SPRING
Using the .isin() function to filter the DataFrame so that it contains only the cities HOUSTON and CONROE:
df1[df1["CITY_AD"].isin(["HOUSTON","CONROE"])]
This returns an empty set... How can I get it to filter correctly?
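Try this. The values probably carry leading or trailing whitespace, which the column preview would not show, so strip it before comparing: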
df1["CITY_AD"] = df1["CITY_AD"].str.strip()
df1[df1["CITY_AD"].isin(["HOUSTON","CONROE"])]
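To check whether whitespace really is the cause, a small self-contained sketch can help. The data below is hypothetical (invented city strings with stray spaces, not the asker's actual Excel contents), and the names wanted, before, cleaned and after are only illustrative. It shows that .isin() does exact string matching, so padded values never match until they are stripped:

```python
import pandas as pd

# Hypothetical data: some values carry stray whitespace, as can happen with Excel imports.
df1 = pd.DataFrame({"CITY_AD": ["EL PASO ", " HOUSTON", "HOUSTON  ", "CONROE ", "KATY"]})

wanted = ["HOUSTON", "CONROE"]

# .isin() compares whole strings exactly, so "HOUSTON  " does not match "HOUSTON".
before = df1[df1["CITY_AD"].isin(wanted)]

# Printing the values as a Python list shows the quotes, which makes padding visible.
print(df1["CITY_AD"].unique().tolist())

# Strip into a new frame so the original data is left untouched.
cleaned = df1.assign(CITY_AD=df1["CITY_AD"].str.strip())
after = cleaned[cleaned["CITY_AD"].isin(wanted)]

print(len(before), len(after))
```

If the printed list shows no padding at all, whitespace is not the problem and the mismatch lies elsewhere, for example in letter case or in non-string values in the column.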