How to reorder a pandas DataFrame based on a list containing the column order

Suppose I have a DataFrame 'df' that lists files and their contents:

File          Field          Folder
Users.csv       Age      UserFolder
Users.csv      Name      UserFolder
Cars.csv      Color       CarFolder
Cars.csv      Model       CarFolder

If I already have lists specifying the order in which the 'Field' values should appear, how can I reorder this df?

users_col_order = ['Name', 'Age']
cars_col_order = ['Model', 'Color']

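so that the resulting df is reordered like this (I am not trying to sort 'Field' in reverse alphabetical order; that is purely a coincidence of this example):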

File          Field          Folder
Users.csv      Name      UserFolder
Users.csv       Age      UserFolder
Cars.csv      Model       CarFolder
Cars.csv      Color       CarFolder

First, put your new orderings into a dictionary:

mapping = {
    'Users': ['Name', 'Age'],
    'Cars': ['Model', 'Color'],
}

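Then, create a new column that places those values in the right position according to the File value, make Field the index, and index it with the new column: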

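# Remember the original column order before the helper column is added
original_cols = df.columns

# For each file, write the desired Field order into a temporary 'tmp' column
for k, v in mapping.items():
    df.loc[df['File'] == k + '.csv', 'tmp'] = v

# Look rows up by Field label in the order given by 'tmp', then drop the helper.
# Note: this relies on Field values being unique across the whole frame.
df = df.set_index('Field').loc[df['tmp']].reset_index().drop('tmp', axis=1)[original_cols]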

Output:

>>> df
        File  Field      Folder
0  Users.csv   Name  UserFolder
1  Users.csv    Age  UserFolder
2   Cars.csv  Model   CarFolder
3   Cars.csv  Color   CarFolder
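If the same field name can occur under more than one file, a label lookup on 'Field' alone is no longer unambiguous. As a rough sketch of a different technique (not the approach shown above), the rows can instead be ranked by their (File, Field) pair. The names `rank`, `keys`, `order` and `result` are made up for this illustration, and the sketch assumes every row's pair appears in `mapping` (otherwise the lookup raises a KeyError):

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "File": ["Users.csv", "Users.csv", "Cars.csv", "Cars.csv"],
    "Field": ["Age", "Name", "Color", "Model"],
    "Folder": ["UserFolder", "UserFolder", "CarFolder", "CarFolder"],
})

mapping = {
    "Users": ["Name", "Age"],
    "Cars": ["Model", "Color"],
}

# Rank of every (file, field) pair: position of the file in the mapping
# first, then position of the field inside that file's list.
rank = {
    (name + ".csv", field): (file_pos, field_pos)
    for file_pos, (name, fields) in enumerate(mapping.items())
    for field_pos, field in enumerate(fields)
}

# One sort key per row, then the row positions sorted by that key
keys = [rank[(file, field)] for file, field in zip(df["File"], df["Field"])]
order = sorted(range(len(df)), key=keys.__getitem__)

result = df.iloc[order].reset_index(drop=True)
print(result)
```

Because the key includes the file, files come out in the order they are listed in `mapping`, and a field name shared by two files would be ranked separately within each.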

Use pd.Categorical with ordered=True!

categories = users_col_order + cars_col_order

df['Field'] = pd.Categorical(df['Field'], categories=categories, ordered=True)
df.sort_values(by='Field')

File          Field          Folder
Users.csv      Name      UserFolder
Users.csv       Age      UserFolder
Cars.csv      Model       CarFolder
Cars.csv      Color       CarFolder

If needed, you can always create a new column, Field_categorical, to preserve the original values in Field.
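A minimal sketch of that variant, assuming the sample data from the question and that no field name appears in more than one order list (pd.Categorical requires unique categories). The name `sorted_df` is made up for this illustration:

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "File": ["Users.csv", "Users.csv", "Cars.csv", "Cars.csv"],
    "Field": ["Age", "Name", "Color", "Model"],
    "Folder": ["UserFolder", "UserFolder", "CarFolder", "CarFolder"],
})

users_col_order = ["Name", "Age"]
cars_col_order = ["Model", "Color"]
categories = users_col_order + cars_col_order

# The helper column carries the ordered categorical; 'Field' itself is untouched
df["Field_categorical"] = pd.Categorical(
    df["Field"], categories=categories, ordered=True
)

# Sort on the helper column, then drop it and renumber the rows
sorted_df = (
    df.sort_values(by="Field_categorical")
      .drop(columns="Field_categorical")
      .reset_index(drop=True)
)
print(sorted_df)
```

Sorting on the helper column leaves 'Field' with its original dtype, which may matter if later code compares it against plain strings or merges on it.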