pandas hierarchical index slicing based on year index
I have a dataset, data1. I am trying to slice it by index based on a value read with input(), where data1 =
                     stats
gender year
women  2003  cellphone use
       2007         height
       2007  cigarette use
       2008         weight
       2009  cellphone use
       2015  cigarette use
       2018         weight
       2020         height
This is my attempt at index slicing:
isvalid_yr = False
while not isvalid_yr:
    year_input = int(input("Input the year you want to compare data from: "))
    if year_input in data1.index.get_level_values('year'):
        idx = pd.IndexSlice
        isvalid_yr = True
        new_data1 = data1.loc(axis = 0)[idx[year_input:year_input], idx[:]]
    else:
        isvalid_yr = False
    try:
        if isvalid_yr == True:
            pass
        else:
            raise ValueError("Year not in data!")
    except ValueError as err:
        print("Year not in data!")
It gives me this output, which is not what I want.
Empty DataFrame
Columns: [stats]
Index: []
The final output I want to achieve looks like this:
Input the year you want to compare data from: 2007
Result of new_data1 =
                     stats
gender year
women  2007         height
       2007  cigarette use
Use xs to get a cross-section of the DataFrame:
res = df.xs(2007, axis=0, level='year', drop_level=False)
res:
                     stats
gender year
women  2007         height
       2007  cigarette use
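For comparison, the same rows can also be selected with pd.IndexSlice, as long as the year goes in the second position of the slice (the first position addresses the gender level, not the year), or with a boolean mask on the year level. The following is a minimal sketch that rebuilds the sample frame from this answer. The sort_index() call is an added precaution, on the assumption that the real data may not already be sorted, since slicing a MultiIndex with .loc needs a lexsorted index.

```python
import pandas as pd

# Sample frame rebuilt from the data shown in the question.
df = pd.DataFrame({
    'gender': ['women'] * 8,
    'year': [2003, 2007, 2007, 2008, 2009, 2015, 2018, 2020],
    'stats': ['cellphone use', 'height', 'cigarette use', 'weight',
              'cellphone use', 'cigarette use', 'weight', 'height'],
}).set_index(['gender', 'year']).sort_index()  # precaution: .loc slicing needs a sorted index

idx = pd.IndexSlice

# Every gender (first level), only year 2007 (second level), all columns.
by_slice = df.loc[idx[:, 2007], :]

# Boolean mask on the 'year' level; does not require a sorted index.
by_mask = df[df.index.get_level_values('year') == 2007]

# Cross-section, as in the answer above.
by_xs = df.xs(2007, axis=0, level='year', drop_level=False)
```

All three selections keep both index levels. xs is the shortest when only a single level value is needed.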
With user input:
while True:
    try:
        year_input = int(
            input("Input the year you want to compare data from: ")
        )
        res = df.xs(year_input, axis=0, level='year', drop_level=False)
        break
    except KeyError:
        print("Year not in data!")
    except ValueError:
        print("Please enter a valid year")
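If the lookup needs to be reused or tested without typing at a prompt, the two failure cases handled by the loop above can be pulled into small helpers. This is only a sketch: select_year and parse_year are hypothetical names introduced here, not part of pandas, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd


def parse_year(text):
    """Convert raw input text to an int, or return None if it is not an integer."""
    try:
        return int(text)
    except ValueError:
        return None


def select_year(df, year, level='year'):
    """Return the rows whose `level` equals `year`, or None if the year is absent."""
    try:
        return df.xs(year, axis=0, level=level, drop_level=False)
    except KeyError:
        # xs raises KeyError when the value is not present in that level.
        return None


# Sample frame rebuilt from the data shown in the question.
df = pd.DataFrame({
    'gender': ['women'] * 8,
    'year': [2003, 2007, 2007, 2008, 2009, 2015, 2018, 2020],
    'stats': ['cellphone use', 'height', 'cigarette use', 'weight',
              'cellphone use', 'cigarette use', 'weight', 'height'],
}).set_index(['gender', 'year'])

found = select_year(df, 2007)    # two rows for 2007
missing = select_year(df, 1999)  # None: 1999 is not in the data
```

The input loop then only has to call parse_year on the typed text and select_year on the result, printing the matching message whenever either one returns None.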
The df used:
import pandas as pd

df = pd.DataFrame({
    'gender': ['women', 'women', 'women', 'women', 'women', 'women', 'women',
               'women'],
    'year': [2003, 2007, 2007, 2008, 2009, 2015, 2018, 2020],
    'stats': ['cellphone use', 'height', 'cigarette use', 'weight',
              'cellphone use', 'cigarette use', 'weight', 'height']
}).set_index(['gender', 'year'])
df:
                     stats
gender year
women  2003  cellphone use
       2007         height
       2007  cigarette use
       2008         weight
       2009  cellphone use
       2015  cigarette use
       2018         weight
       2020         height