How to turn a list of lists into columns of a pandas dataframe?

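I would like to ask how to unnest a list of lists and turn it into separate columns of a dataframe. Specifically, I have the following dataframe, where the Route_set column is a list of lists: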

   Generation                              Route_set
0           0  [[20. 19. 47. 56.] [21. 34. 78. 34.]]

The desired output is the following dataframe:

   route1  route2
0      20      21
1      19      34
2      47      78
3      56      34

Is there a way to do this? Thanks in advance!

You can create a dictionary and fill it with a for loop. It is not the fastest way, but it is very simple.

import pandas as pd

new_dic = {}
# Create and fill the dictionary; each key-value pair corresponds to one inner list
for i, values in enumerate(df.Route_set.iloc[0], start=1):
    new_dic[f'route{i}'] = values
# Build the result from the dictionary; each key becomes a column.
# (Assigning the lists back to the one-row df would fail with a length mismatch.)
df = pd.DataFrame(new_dic)

You could probably do better, but this should solve your problem quickly!

You can try using df.explode and df.apply:

import pandas as pd

df = pd.DataFrame(data={'Generation': 0, 'Route_set': [[[20., 19., 47., 56.], [21., 34., 78., 34.]]]})
# Pull each inner list into its own column
df['route1'] = df['Route_set'].apply(lambda x: x[0])
df['route2'] = df['Route_set'].apply(lambda x: x[1])
# Explode both list columns in parallel (multi-column explode needs pandas 1.3+)
df = df.explode(['route1', 'route2'], ignore_index=True)
df2 = df[df.columns.difference(['Route_set', 'Generation'])]
print(df2.to_markdown())

|    |   route1 |   route2 |
|---:|---------:|---------:|
|  0 |       20 |       21 |
|  1 |       19 |       34 |
|  2 |       47 |       78 |
|  3 |       56 |       34 |
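
If the number of inner lists is not known in advance, the same apply/explode idea can be written with a loop over the route positions. This is only a sketch: the names `n_routes`, `route_cols`, and `result` are made up for illustration, and it assumes every row of Route_set holds the same number of equal-length lists.

```python
import pandas as pd

df = pd.DataFrame({'Generation': 0,
                   'Route_set': [[[20., 19., 47., 56.], [21., 34., 78., 34.]]]})

# One column name per inner list: route1, route2, ...
n_routes = len(df['Route_set'].iloc[0])
route_cols = [f'route{i}' for i in range(1, n_routes + 1)]

# Copy the inner list at each position into its own column
for pos, col in enumerate(route_cols):
    df[col] = df['Route_set'].apply(lambda routes, pos=pos: routes[pos])

# Explode all route columns together (multi-column explode needs pandas 1.3+)
result = df.explode(route_cols, ignore_index=True)[route_cols]
print(result)
```

The `pos=pos` default argument binds the current position to each lambda, so the loop variable is not looked up late.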

Alternatively, you can create a new dataframe from the values, as follows:

import pandas as pd

df = pd.DataFrame(data= {'Generation': 0, 'Route_set':[[[20., 19., 47., 56.], [21., 34., 78., 34.]]]})
df1 = pd.DataFrame.from_dict(dict(zip(['route1', 'route2'], df.Route_set.to_numpy()[0])), orient='index').transpose()
|    |   route1 |   route2 |
|---:|---------:|---------:|
|  0 |       20 |       21 |
|  1 |       19 |       34 |
|  2 |       47 |       78 |
|  3 |       56 |       34 |
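
One reason to prefer the from_dict(..., orient='index').transpose() pattern over passing the dictionary straight to pd.DataFrame is that it tolerates inner lists of different lengths. The sketch below uses made-up data (the second route is shortened on purpose) and illustrative names `routes` and `out`; the shorter column should be padded with NaN.

```python
import pandas as pd

# Hypothetical input: route2 is one element shorter than route1
routes = {'route1': [20., 19., 47.], 'route2': [21., 34.]}

# pd.DataFrame(routes) would raise ValueError because the lengths differ.
# With orient='index' each list becomes a row, short rows are padded with NaN,
# and transpose() turns the rows into columns.
out = pd.DataFrame.from_dict(routes, orient='index').transpose()
print(out)
```

When all inner lists have the same length, as in the question, pd.DataFrame(routes) gives the same columns directly.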

Update 1:

import pandas as pd

df = pd.DataFrame(data= {'Generation': 0, 'Route_set':[
                                                       [[20.0, 19.0, 47.0, 56.0, 43.0, 53.0, 18.0, -1.0, -1.0, -1.0, -1.0, -1.0], [20.0, 51.0, 46.0, 37.0, 2.0, 57.0, 49.0, 36.0, 25.0, 5.0, 4.0, 34.0], [54.0, 23.0, 5.0, 46.0, 34.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [57.0, 48.0, 46.0, 35.0, 25.0, 27.0, 52.0, 8.0, 39.0, 22.0, 51.0, 28.0], [57.0, 16.0, 45.0, 25.0, 49.0, 38.0, 0.0, 46.0, 13.0, 18.0, 19.0, 20.0], [21.0, 11.0, 6.0, 33.0, 25.0, 49.0, 57.0, 29.0, 12.0, 3.0, -1.0, -1.0], [9.0, 15.0, 47.0, 42.0, 25.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [51.0, 25.0, 22.0, 14.0, 39.0, 8.0, 40.0, 0.0, 10.0, 26.0, 32.0, 47.0], [1.0, 33.0, 24.0, 46.0, 56.0, 30.0, 48.0, 51.0, -1.0, -1.0, -1.0, -1.0], [25.0, 31.0, 50.0, 17.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [57.0, 12.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [20.0, 41.0, 47.0, 15.0, 46.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [14.0, 44.0, 39.0, 25.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [20.0, 51.0, 25.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [57.0, 49.0, 5.0, 20.0, 37.0, 46.0, 36.0, 25.0, 39.0, 51.0, 48.0, -1.0], [5.0, 0.0, 33.0, 55.0, 25.0, 48.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [51.0, 32.0, 33.0, 24.0, 35.0, 8.0, 25.0, 4.0, 46.0, 1.0, 7.0, -1.0], [5.0, 25.0, 34.0, 46.0, 1.0, 9.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [38.0, 57.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], [12.0, 57.0, 49.0, 25.0, 9.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0]],
                                                      ]})

data = df.Route_set.to_numpy()[0]

# Name one column per inner list (route1 ... routeN) and build the frame;
# orient='index' plus transpose() puts each route in its own column
names = ['route{}'.format(i) for i in range(1, len(data) + 1)]
df = pd.DataFrame.from_dict(dict(zip(names, data)), orient='index').transpose()

print(df.to_markdown())
|    |   route1 |   route2 |   route3 |   route4 |   route5 |   route6 |   route7 |   route8 |   route9 |   route10 |   route11 |   route12 |   route13 |   route14 |   route15 |   route16 |   route17 |   route18 |   route19 |   route20 |
|---:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
|  0 |       20 |       20 |       54 |       57 |       57 |       21 |        9 |       51 |        1 |        25 |        57 |        20 |        14 |        20 |        57 |         5 |        51 |         5 |        38 |        12 |
|  1 |       19 |       51 |       23 |       48 |       16 |       11 |       15 |       25 |       33 |        31 |        12 |        41 |        44 |        51 |        49 |         0 |        32 |        25 |        57 |        57 |
|  2 |       47 |       46 |        5 |       46 |       45 |        6 |       47 |       22 |       24 |        50 |        -1 |        47 |        39 |        25 |         5 |        33 |        33 |        34 |        -1 |        49 |
|  3 |       56 |       37 |       46 |       35 |       25 |       33 |       42 |       14 |       46 |        17 |        -1 |        15 |        25 |        -1 |        20 |        55 |        24 |        46 |        -1 |        25 |
|  4 |       43 |        2 |       34 |       25 |       49 |       25 |       25 |       39 |       56 |        -1 |        -1 |        46 |        -1 |        -1 |        37 |        25 |        35 |         1 |        -1 |         9 |
|  5 |       53 |       57 |       -1 |       27 |       38 |       49 |       -1 |        8 |       30 |        -1 |        -1 |        -1 |        -1 |        -1 |        46 |        48 |         8 |         9 |        -1 |        -1 |
|  6 |       18 |       49 |       -1 |       52 |        0 |       57 |       -1 |       40 |       48 |        -1 |        -1 |        -1 |        -1 |        -1 |        36 |        -1 |        25 |        -1 |        -1 |        -1 |
|  7 |       -1 |       36 |       -1 |        8 |       46 |       29 |       -1 |        0 |       51 |        -1 |        -1 |        -1 |        -1 |        -1 |        25 |        -1 |         4 |        -1 |        -1 |        -1 |
|  8 |       -1 |       25 |       -1 |       39 |       13 |       12 |       -1 |       10 |       -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        39 |        -1 |        46 |        -1 |        -1 |        -1 |
|  9 |       -1 |        5 |       -1 |       22 |       18 |        3 |       -1 |       26 |       -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        51 |        -1 |         1 |        -1 |        -1 |        -1 |
| 10 |       -1 |        4 |       -1 |       51 |       19 |       -1 |       -1 |       32 |       -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        48 |        -1 |         7 |        -1 |        -1 |        -1 |
| 11 |       -1 |       34 |       -1 |       28 |       20 |       -1 |       -1 |       47 |       -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        -1 |        -1 |

I wrote a solution that creates a NumPy array(), transposes it, and converts it back to a list of lists using tolist().

import numpy as np
import pandas as pd

routes = {
    "Generation": 0,
    "Route_set": [[[20, 19, 47, 56], [21, 34, 78, 34]]]
}

array = np.array(routes["Route_set"][0]).T.tolist()

columns_name = [f"route{i}" for i in range(1, len(array[0])+1)]
df = pd.DataFrame(data=array, columns=columns_name)

print(df)

Output:

   route1  route2
0      20      21
1      19      34
2      47      78
3      56      34
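
The transpose idea can also be wrapped in a small helper that skips the tolist() round trip, since pd.DataFrame accepts a 2-D NumPy array directly. This is a sketch rather than part of the answer above: `routes_to_frame` and its `prefix` parameter are hypothetical names, and it assumes all inner lists have the same length.

```python
import numpy as np
import pandas as pd

def routes_to_frame(route_set, prefix="route"):
    """Turn equal-length inner lists into columns named prefix1, prefix2, ..."""
    arr = np.asarray(route_set)  # shape: (number of routes, stops per route)
    names = [f"{prefix}{i}" for i in range(1, arr.shape[0] + 1)]
    # Transposing makes each inner list a column instead of a row
    return pd.DataFrame(arr.T, columns=names)

result = routes_to_frame([[20, 19, 47, 56], [21, 34, 78, 34]])
print(result)
```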