Use Python pandas rows to create permutations covering all possible scenarios

Question

I have 5 sheets in an Excel workbook, each holding a different parameter.

history

idx  history
1    1daybehind

recorded

idx   recorded
1     daily

optmethod

idx   opt         optmethod
1     backprop    x1
2     convex      x2
3     monte       x3
4     monte       x4

optpara

idx   optpara   
1     x1x2    
2     x3x4    
3     x1x4  
4     x2x3

filter

idx   filter   
1     x1>0    
2     x2>0    
3     x3>0  
4     x4>0

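I want to create permutations of the row entries, so that I end up with the following sheet listing every possible scenario. These are only the first 6 rows.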

scenario history recorded optmethod optpara filter
1        1       1        1         1        1
2        1       1        1         1        2
3        1       1        1         1        3
4        1       1        1         1        4
5        1       1        1         2        1
6        1       1        1         2        2
...

So the first row, scenario 1, would be: 1 1daybehind, 1 daily, 1 backprop, 1 x1x2, 1 x1>0

I tried the code below:

for name,sheet in sheet_dict.items():
    print(name)
    if name == 'history':
        sndf = sheet
        sndf = sndf[['idx']]
        sndf = sndf.rename(columns={'idx':name})
    else: 
        sndf['key'] = 1
        sheet = sheet[['idx']]
        sheet = sheet.rename(columns={'idx':name})
        sheet['key'] = 1
        sndf = pd.merge(sndf, sheet, on ='key').drop("key", 1)
sndf.index.names = ['scenario']
sndf.to_csv('scenarionum.csv',index=True)

But I ended up with this. I have the right number of rows, but every cell is filled with just 1s.

scenario history recorded optmethod optpara filter
0        1       1        1         1        1
1        1       1        1         1        1
2        1       1        1         1        1
3        1       1        1         1        1
4        1       1        1         2        1
5        1       1        1         2        1

I believe the answer to this is a cross join, but I'm not sure how to do it.

What am I doing wrong, and how can I fix it?

Answer 1

If idx is the index of your DataFrames:

indexes = [ data.index.tolist() for data in sheet_dict.values()]

Otherwise, if idx is an ordinary column of the DataFrames:

indexes = [ data["idx"].tolist() for data in sheet_dict.values()]
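
The two cases above can be sketched as follows. This is only an illustration: the small frames are made-up stand-ins for two of the Excel sheets, and the names indexes_from_column, indexed and indexes_from_index are hypothetical. It assumes sheet_dict maps sheet names to DataFrames, which is what pd.read_excel(..., sheet_name=None) returns.

```python
import pandas as pd

# Hypothetical stand-ins for two of the sheets in the question.
sheet_dict = {
    "history": pd.DataFrame({"idx": [1], "history": ["1daybehind"]}),
    "filter": pd.DataFrame({"idx": [1, 2, 3, 4],
                            "filter": ["x1>0", "x2>0", "x3>0", "x4>0"]}),
}

# Case: idx is an ordinary column of each DataFrame.
indexes_from_column = [df["idx"].tolist() for df in sheet_dict.values()]

# Case: idx is the index (simulated here by calling set_index first).
indexed = {name: df.set_index("idx") for name, df in sheet_dict.items()}
indexes_from_index = [df.index.tolist() for df in indexed.values()]

print(indexes_from_column)  # [[1], [1, 2, 3, 4]]
```

Either way, the result is one list of idx values per sheet, in the order of the sheets in sheet_dict.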

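Generate all combinations:

# Parentheses replace the backslash continuations, one of which was missing
sndf = (
    pd.MultiIndex.from_product(indexes, names=list(sheet_dict))
    .to_frame(index=False)
    .rename_axis("scenario")
)
sndf.index += 1  # number scenarios from 1 instead of 0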

>>> sndf
          history  recorded  optmethod  optpara  filter
scenario
1               1         1          1        1       1
2               1         1          1        1       2
3               1         1          1        1       3
4               1         1          1        1       4
5               1         1          1        2       1
...           ...       ...        ...      ...     ...
60              1         1          4        3       4
61              1         1          4        4       1
62              1         1          4        4       2
63              1         1          4        4       3
64              1         1          4        4       4

[64 rows x 5 columns]
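
For anyone who wants to reproduce this without the Excel file, here is a self-contained sketch. It assumes the idx values shown in the question's tables and hard-codes them in a plain dict (a hypothetical stand-in for sheet_dict).

```python
import pandas as pd

# idx values of each sheet, copied from the tables in the question.
indexes = {
    "history": [1],
    "recorded": [1],
    "optmethod": [1, 2, 3, 4],
    "optpara": [1, 2, 3, 4],
    "filter": [1, 2, 3, 4],
}

sndf = (
    pd.MultiIndex.from_product(list(indexes.values()), names=list(indexes))
    .to_frame(index=False)      # one column per sheet
    .rename_axis("scenario")
)
sndf.index += 1                 # number scenarios from 1

print(sndf.shape)               # (64, 5), since 1 * 1 * 4 * 4 * 4 = 64
print(sndf.iloc[4].tolist())    # [1, 1, 1, 2, 1], i.e. scenario 5
```

The rightmost column varies fastest, which matches the ordering the question asks for. Calling sndf.to_csv('scenarionum.csv') afterwards would write the scenario number as the first column.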

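Update: alternative approaches

With cartesian_product from pandas.core.reshape.util (an internal pandas module rather than public API, so it may change or disappear between versions)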

from pandas.core.reshape.util import cartesian_product

sndf = pd.DataFrame(list(zip(*cartesian_product(indexes))),
                    columns=sheet_dict.keys()).rename_axis("scenario")
sndf.index += 1

With product from itertools

from itertools import product

sndf = pd.DataFrame(product(*indexes),
                    columns=sheet_dict.keys()).rename_axis("scenario")
sndf.index += 1
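
Since the question mentions a cross join: assuming pandas 1.2 or later, merge accepts how="cross", so the dummy key column is not needed. The sketch below is one possible way to chain it over all sheets. The hard-coded idx lists and the name frames are illustrative assumptions, not part of the original code.

```python
from functools import reduce

import pandas as pd

# idx values of each sheet, copied from the tables in the question.
indexes = {
    "history": [1],
    "recorded": [1],
    "optmethod": [1, 2, 3, 4],
    "optpara": [1, 2, 3, 4],
    "filter": [1, 2, 3, 4],
}

# One single-column frame per sheet, named after the sheet.
frames = [pd.DataFrame({name: values}) for name, values in indexes.items()]

# Fold the frames together with successive cross joins (requires pandas >= 1.2).
sndf = reduce(lambda left, right: left.merge(right, how="cross"), frames)

# Number the scenarios from 1.
sndf.index = pd.RangeIndex(1, len(sndf) + 1, name="scenario")

print(sndf.shape)  # (64, 5)
```

This is closest in spirit to the loop in the question, with each merge step multiplying the row count by the number of rows in the next sheet.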
