Split a DataFrame column using a second column as the delimiter

I want to split one column into two, using the value of a second column in the same row as the delimiter.

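I get the error TypeError: 'Series' objects are mutable, thus they cannot be hashed. That makes sense, since str.split receives a whole Series rather than a single value, but I'm not sure how to pick out the second column's value for each individual row.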

Sample data:

    title_location                    delimiter
0   Doctor - ABC - Los Angeles, CA    - ABC -
1   Lawyer - ABC - Atlanta, GA        - ABC -
2   Athlete - XYZ - Jacksonville, FL  - XYZ -

Code:

bigdata[['title', 'location']] = bigdata['title_location'].str.split(bigdata['delimiter'], expand=True)

Desired output:

    title_location                    delimiter    title    location
0   Doctor - ABC - Los Angeles, CA    - ABC -      Doctor   Los Angeles, CA
1   Lawyer - ABC - Atlanta, GA        - ABC -      Lawyer   Atlanta, GA
2   Athlete - XYZ - Jacksonville, FL  - XYZ -      Athlete  Jacksonville, FL

Let's try zip to split row by row, then join the result back:

df = df.join(pd.DataFrame([x.split(y) for x, y in zip(df.title_location, df.delimiter)], index=df.index, columns=['Title', 'Location']))
df
Out[200]: 
                     title_location delimiter     Title           Location
0    Doctor - ABC - Los Angeles, CA   - ABC -   Doctor     Los Angeles, CA
1        Lawyer - ABC - Atlanta, GA   - ABC -   Lawyer         Atlanta, GA
2  Athlete - XYZ - Jacksonville, FL   - XYZ -  Athlete    Jacksonville, FL
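
For anyone who wants to run the zip approach end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and assumes the delimiter values are stored without surrounding spaces (for example "- ABC -"), so each piece is stripped after splitting. Limiting the split to one occurrence and the name `parts` are my own additions, not part of the answer above. It also assumes every row actually contains its delimiter.

```python
import pandas as pd

# Sample data from the question (delimiter assumed to have no outer spaces)
df = pd.DataFrame({
    "title_location": [
        "Doctor - ABC - Los Angeles, CA",
        "Lawyer - ABC - Atlanta, GA",
        "Athlete - XYZ - Jacksonville, FL",
    ],
    "delimiter": ["- ABC -", "- ABC -", "- XYZ -"],
})

# Split each value on its own row's delimiter (at most once),
# then strip the whitespace left around each piece.
parts = [
    [piece.strip() for piece in text.split(sep, 1)]
    for text, sep in zip(df["title_location"], df["delimiter"])
]

# Reuse the original index so the join lines the rows up correctly.
result = df.join(pd.DataFrame(parts, index=df.index, columns=["title", "location"]))
print(result[["title", "location"]])
```

Passing index=df.index matters when the frame does not have a default 0..n-1 index; without it the join would align on the wrong labels.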

Try apply, which passes each row to the function so the delimiter is a single string:

bigdata[['title', 'location']] = bigdata.apply(func=lambda row: row['title_location'].split(row['delimiter']), axis=1, result_type="expand")
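
A variation on the apply idea, as a sketch rather than a drop-in answer: it swaps str.split for Python's built-in str.partition, which always returns three parts, so a row whose delimiter is missing gets an empty location instead of a shape mismatch. The helper name `split_row` is hypothetical, and the whitespace stripping assumes the delimiter values carry no surrounding spaces.

```python
import pandas as pd

bigdata = pd.DataFrame({
    "title_location": [
        "Doctor - ABC - Los Angeles, CA",
        "Lawyer - ABC - Atlanta, GA",
        "Athlete - XYZ - Jacksonville, FL",
    ],
    "delimiter": ["- ABC -", "- ABC -", "- XYZ -"],
})

def split_row(row):
    """Hypothetical helper: split one row's text on that row's delimiter."""
    # partition returns (before, separator, after); if the delimiter is
    # absent, 'after' is an empty string and 'before' is the whole text.
    title, _, location = row["title_location"].partition(row["delimiter"])
    return pd.Series({"title": title.strip(), "location": location.strip()})

# axis=1 calls split_row once per row; the returned Series become columns.
bigdata = bigdata.join(bigdata.apply(split_row, axis=1))
print(bigdata[["title", "location"]])
```

Row-wise apply is convenient but runs a Python function per row, so on a large frame the zip-based list comprehension is likely to be faster.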