DataFrames columns re-arrange based on list - DataFrames have different columns

Question

SUMMARY of my problem:

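I have many DataFrames that all draw their columns from the same pool (7 columns, e.g. COLUMN1:COLUMN7), but sometimes one or more columns are missing (i.e. a DataFrame might have COLUMN1:COLUMN3 + COLUMN6:COLUMN7, so the 4th and 5th columns are missing).
The columns of each DataFrame are in a different order every time (i.e. df1 has its order, df2 has another order, df3 has yet another, etc.).
I want to arrange the columns in each DataFrame based on a list of columns used as the reference (in this case, the list of columns from 1 to 7).
The desired result is that all DataFrames have the same column order based on this list; if columns are missing, the relative order should still be preserved (i.e. if the 4th and 5th columns are missing, the columns should be: COL1, COL2, COL3, COL6, COL7).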

More detailed description:

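My code has several DataFrames generated by cleaning some datasets. Each of these DataFrames has a different number of columns, in a different order, but the columns are limited to this list: 'id', 'title', 'type', 'category', 'secondary category', 'date', 'description'. So a DataFrame can have at most 7 columns, all taken from that list. Example: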

DataFrame1 'id', 'title', 'date', 'category', 'type', 'description', 'secondary category'

DataFrame2 'id', 'description', 'title', 'type', 'category', 'date'

DataFrame3 'id', 'category', 'description', 'title'

DESIRED OUTPUT:

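I want to order the columns based on the initial list 'id', 'title', 'type', 'category', 'secondary category', 'date', 'description', even when the number of columns differs. From the examples above, the DataFrames should become: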

DataFrame1 'id', 'title', 'type', 'category', 'secondary category', 'date', 'description'

DataFrame2 'id', 'title', 'type', 'category', 'date', 'description'

DataFrame3 'id', 'title', 'category', 'description'

Is there a way (a loop, for example) to arrange the columns this way?

Answer 1

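You can use a list comprehension to keep only the entries of the desired order that are actually present in the DataFrame, then use reindex to apply that order: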

desired_order = ['id', 'title', 'type', 'category', 'secondary category', 'date', 'description']

df = df.reindex([i for i in desired_order if i in df.columns], axis=1)
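To apply the same idea to several DataFrames at once, as the question asks, the one-liner above can be wrapped in a small helper and run over a collection. The following is only a sketch: the helper name reorder_columns, the frames dictionary, and the sample values are hypothetical and chosen to mirror the three layouts in the question.

```python
import pandas as pd

desired_order = ['id', 'title', 'type', 'category',
                 'secondary category', 'date', 'description']

# Hypothetical sample frames mirroring the column layouts from the question.
frames = {
    'DataFrame1': pd.DataFrame({
        'id': [1], 'title': ['a'], 'date': ['2020-01-01'], 'category': ['c'],
        'type': ['t'], 'description': ['d'], 'secondary category': ['s'],
    }),
    'DataFrame2': pd.DataFrame({
        'id': [2], 'description': ['d'], 'title': ['a'],
        'type': ['t'], 'category': ['c'], 'date': ['2020-01-02'],
    }),
    'DataFrame3': pd.DataFrame({
        'id': [3], 'category': ['c'], 'description': ['d'], 'title': ['a'],
    }),
}


def reorder_columns(df, order):
    """Return df with its columns arranged as in `order`, skipping absent ones."""
    present = [col for col in order if col in df.columns]
    return df.reindex(columns=present)


# Build a new dict so the original frames are left untouched.
reordered = {name: reorder_columns(df, desired_order)
             for name, df in frames.items()}

for name, df in reordered.items():
    print(name, list(df.columns))
```

Note that with this approach any column that does not appear in desired_order is dropped from the result. That matches the question, where the columns are limited to the seven listed names, but it would need adjusting if extra columns had to be kept.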

Tags: python, dataframe, pandas, columnsorting