How to have multi index on both rows and columns of a dataframe without using tuples?
Is there a way to create a dataframe with a MultiIndex on both rows and columns without using tuples? My labels are too long to type out by hand as tuples (96 countries, with 26 sectors per country).
Example of what I want
I tried:
df_data.columns=label_df
df_data_w = pd.concat([label_df, data],axis=1,ignore_index=False)
This adds the label df as the first two columns, but does not index by it. Instead I get this:
After dataframe
Here is some code to work with:
import numpy as np
import pandas as pd
a = np.random.randint(low=0, high=10,size=9)
b = np.random.randint(low=0, high=10,size=9)
c = np.random.randint(low=0, high=10,size=9)
d = np.random.randint(low=0, high=10,size=9)
e = np.random.randint(low=0, high=10,size=9)
f = np.random.randint(low=0, high=10,size=9)
g = np.random.randint(low=0, high=10,size=9)
h = np.random.randint(low=0, high=10,size=9)
i = np.random.randint(low=0, high=10,size=9)
df = pd.DataFrame(data=[a,b,c,d,e,f,g,h,i])
Continent = ['Africa','Africa','Africa','North America', 'North America', 'North America', 'Europe','Europe','Europe']
Sectors = ['Agriculture','Industry','Domestic','Agriculture','Industry','Domestic','Agriculture','Industry','Domestic']
label_df = pd.DataFrame(data=[Continent, Sectors])
df.columns=label_df
df_w_labels = pd.concat([label_df, df],axis=1,ignore_index=False)
This gives me the labels as headers in my df, but I also need them as columns, so I tried concat, which adds the label df as the first two columns but does not index by it.
You can use zip and list together with pd.MultiIndex:
import numpy as np
import pandas as pd
a = np.random.randint(low=0, high=10,size=9)
b = np.random.randint(low=0, high=10,size=9)
c = np.random.randint(low=0, high=10,size=9)
d = np.random.randint(low=0, high=10,size=9)
e = np.random.randint(low=0, high=10,size=9)
f = np.random.randint(low=0, high=10,size=9)
g = np.random.randint(low=0, high=10,size=9)
h = np.random.randint(low=0, high=10,size=9)
i = np.random.randint(low=0, high=10,size=9)
df = pd.DataFrame(data=[a,b,c,d,e,f,g,h,i])
Continent = ['Africa','Africa','Africa','North America', 'North America', 'North America', 'Europe','Europe','Europe']
Sectors = ['Agriculture','Industry','Domestic','Agriculture','Industry','Domestic','Agriculture','Industry','Domestic']
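# zip pairs each continent with its sector, so the tuples are built for you
indx = pd.MultiIndex.from_tuples(list(zip(Continent,Sectors)))
# use the same MultiIndex for the row labels and the column labels
df.index = indx
df.columns = indx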
print(df)
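If you would rather not build tuples at all, pandas also has pd.MultiIndex.from_product and pd.MultiIndex.from_arrays. The following is only a minimal sketch under a few assumptions: the level names "Continent" and "Sector", the shortened label lists, and the seeded random data are mine for illustration and are not part of the question. from_product fits the case where every country has the same list of sectors; from_arrays fits the case where you already have two parallel label lists.

```python
import numpy as np
import pandas as pd

# Unique labels only (illustrative; the real data would have 96 countries and 26 sectors)
continents = ["Africa", "North America", "Europe"]
sectors = ["Agriculture", "Industry", "Domestic"]

# Every (continent, sector) combination, with the continent varying slowest
idx = pd.MultiIndex.from_product([continents, sectors], names=["Continent", "Sector"])

# Seeded random integers, one row and one column per index entry
rng = np.random.default_rng(0)
values = rng.integers(low=0, high=10, size=(len(idx), len(idx)))
df = pd.DataFrame(values, index=idx, columns=idx)

# Alternative: two parallel label lists, as in the question
Continent = ["Africa"] * 3 + ["North America"] * 3 + ["Europe"] * 3
Sectors = ["Agriculture", "Industry", "Domestic"] * 3
idx_from_arrays = pd.MultiIndex.from_arrays([Continent, Sectors], names=["Continent", "Sector"])

# Both constructions produce the same sequence of labels
same_labels = idx.equals(idx_from_arrays)

# A single cell is addressed by one row tuple and one column tuple
cell = df.loc[("Africa", "Industry"), ("Europe", "Domestic")]
print(df)
```

With 96 countries and 26 sectors, from_product only needs the two lists of unique labels, so nothing has to be typed out pair by pair.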