Splitting strings of tuples of different lengths to columns in Pandas DF

I have a dataframe that looks like this:

id human_id
1 ('apples', '2022-12-04', 'a5ted')
2 ('bananas', '2012-2-14')
3 ('2012-2-14', 'reda21', 'ss')
.. ..

I would like a "pythonic" way to get output like this:

id human_id col1 col2 col3
1 ('apples', '2022-12-04', 'a5ted') apples 2022-12-04 a5ted
2 ('bananas', '2012-2-14') bananas 2012-2-14 np.NaN
3 ('2012-2-14', 'reda21', 'ss') 2012-2-14 reda21 ss

The code I tried:

import pandas as pd

df['a'], df['b'], df['c'] = df.human_id.str

gives me this error:

ValueError: not enough values to unpack (expected 2, got 1)

How can I split the values in the tuple into columns?

Thanks.

You can do:

out = df.join(pd.DataFrame(df.human_id.tolist(),index=df.index,columns=['a','b','c']))
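
For reference, here is a minimal self-contained sketch of the line above. It assumes that `human_id` holds real Python tuples (not their string representation) and that no tuple has more than three items; the sample data is copied from the question, and the intermediate name `expanded` is only there for readability.

```python
import pandas as pd

# Sample data from the question; human_id holds real tuples here (an assumption).
df = pd.DataFrame({
    'id': [1, 2, 3],
    'human_id': [('apples', '2022-12-04', 'a5ted'),
                 ('bananas', '2012-2-14'),
                 ('2012-2-14', 'reda21', 'ss')],
})

# A DataFrame built from a list of tuples pads the shorter tuples
# with missing values, so the ragged lengths are not a problem.
expanded = pd.DataFrame(df['human_id'].tolist(), index=df.index,
                        columns=['a', 'b', 'c'])

# join aligns on the index, which is why index=df.index is passed above.
out = df.join(expanded)
print(out)
```

If the column holds strings that only look like tuples, they would need to be parsed first.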

You can do it this way. It simply puts None wherever a value is missing. You can then append df1 to df.

import pandas as pd

d = {'id': [1,2,3], 
     'human_id': ["('apples', '2022-12-04', 'a5ted')", 
                  "('bananas', '2012-2-14')",
                  "('2012-2-14', 'reda21', 'ss')"
                 ]}

df = pd.DataFrame(data=d)

list_human_id = df['human_id'].tolist()

# Turn each string into a real tuple. eval() runs arbitrary code,
# so only use it on data you trust.
newList = []
for val in list_human_id:
    newList.append(eval(val))

df1 = pd.DataFrame(newList, columns=['col1', 'col2', 'col3'])

print(df1)

Output


        col1        col2   col3
0     apples  2022-12-04  a5ted
1    bananas   2012-2-14   None
2  2012-2-14      reda21     ss
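
The last step, attaching df1 to df, is not shown above. Below is one possible sketch of the whole flow. Note that it swaps `eval` for `ast.literal_eval` from the standard library, which only parses Python literals; that substitution and the name `result` are my own choices, not part of the answer above.

```python
import ast

import pandas as pd

# Same sample data as above: the tuples are stored as strings.
df = pd.DataFrame({
    'id': [1, 2, 3],
    'human_id': ["('apples', '2022-12-04', 'a5ted')",
                 "('bananas', '2012-2-14')",
                 "('2012-2-14', 'reda21', 'ss')"],
})

# literal_eval parses literals only, so it cannot execute arbitrary code.
parsed = df['human_id'].apply(ast.literal_eval)

# Reuse df's index so the new columns line up row by row.
df1 = pd.DataFrame(parsed.tolist(), index=df.index,
                   columns=['col1', 'col2', 'col3'])

# Attach the new columns side by side with the original ones.
result = df.join(df1)
print(result)
```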

The columns are created dynamically from the length of each tuple, and they are added to the same dataframe.

import pandas as pd

# Named ids rather than id so the built-in id() is not shadowed
ids = [1, 2, 3]
human_id = [('apples', '2022-12-04', 'a5ted'),
            ('bananas', '2012-2-14'),
            ('2012-2-14', 'reda21', 'ss')]

df = pd.DataFrame({'id': ids, 'human_id': human_id})

print("*"*20,'Dataframe',"*"*20)
print(df.to_string())

print()

print("*"*20,'Split Data',"*"*20)

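# Walk each tuple and write its items into col1, col2, ... for that row.
# Positions that a shorter tuple does not have stay NaN.
for row_label, items in zip(df.index, df['human_id']):
    for col, item in enumerate(items, start=1):
        name_column = 'col' + str(col)
        df.loc[row_label, name_column] = str(item)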

print(df.to_string())
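
The same dynamic column creation can probably be done without explicit loops. The sketch below is an alternative, not the code above: it assumes `human_id` holds real tuples, lets pandas decide how many columns are needed, and then renames them to col1, col2, and so on. The names `expanded` and `result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'human_id': [('apples', '2022-12-04', 'a5ted'),
                 ('bananas', '2012-2-14'),
                 ('2012-2-14', 'reda21', 'ss')],
})

# Without a columns= argument, pandas creates as many columns
# as the longest tuple has items.
expanded = pd.DataFrame(df['human_id'].tolist(), index=df.index)

# Rename the default labels 0, 1, 2, ... to col1, col2, col3, ...
expanded.columns = [f'col{i + 1}' for i in range(expanded.shape[1])]

result = df.join(expanded)
print(result.to_string())
```

Unlike the loop, this version does not convert the items with str(), so non-string items would keep their original type.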