How to split regex results into multiple columns (Python)

I am trying to apply a regular expression to a column of a pandas DataFrame with this script:

import pandas as pd
df1 = {'data':['1gsmxx,2gsm','abc10gsm','10gsm','18gsm hhh4gsm','Abc:10gsm','5gsmaaab3gsmABC55gsm','abc - 15gsm','3gsm,,ff40gsm','9gsm','VV - fg 8gsm','kk 5gsm 00g','001….abc..5gsm']}
df1 = pd.DataFrame(df1)
df1
# The column is named 'data' (lowercase); 'Data' raises a KeyError.
df1['Result'] = df1['data'].str.findall(r'(\d{1,3}\s?gsm)')

df2 = df1['data'].str.extractall(r'(\d{1,3}\s?gsm)').unstack()

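However, this puts all the matches into a single column. Is it possible to get each match in its own column, as in the attached example?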

Use pandas.Series.str.extractall together with unstack.

If you also want to keep the original series next to the matches, use pandas.concat:

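# Raw string so that \d and \s are not treated as string escapes.
df2 = df1['data'].str.extractall(r'(\d{1,3}\s?gsm)').unstack()
# Drop the capture-group level of the columns, then concatenate column-wise.
df = pd.concat([df1, df2.droplevel(0, axis=1)], axis=1)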
print(df)

Output:

                    data      0      1      2
0            1gsmxx,2gsm   1gsm   2gsm    NaN
1               abc10gsm  10gsm    NaN    NaN
2                  10gsm  10gsm    NaN    NaN
3          18gsm hhh4gsm  18gsm   4gsm    NaN
4              Abc:10gsm  10gsm    NaN    NaN
5   5gsmaaab3gsmABC55gsm   5gsm   3gsm  55gsm
6            abc - 15gsm  15gsm    NaN    NaN
7          3gsm,,ff40gsm   3gsm  40gsm    NaN
8                   9gsm   9gsm    NaN    NaN
9           VV - fg 8gsm   8gsm    NaN    NaN
10           kk 5gsm 00g   5gsm    NaN    NaN
11        001….abc..5gsm   5gsm    NaN    NaN
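
To make the intermediate steps easier to follow, here is a minimal sketch of the same approach on a smaller sample. The sample rows, the `gsm_1`, `gsm_2`, ... column names, and the use of `DataFrame.join` in place of `pandas.concat` are assumptions made for illustration; they are not part of the original answer.

```python
import pandas as pd

# Small sample, including one row that contains no match at all.
df1 = pd.DataFrame({'data': ['1gsmxx,2gsm', 'abc10gsm', 'no match here',
                             '5gsmaaab3gsmABC55gsm']})

# extractall returns one row per match, indexed by (original row, match number).
matches = df1['data'].str.extractall(r'(\d{1,3}\s?gsm)')

# unstack pivots the match number into columns. The columns then have two
# levels (capture group, match number), so the capture-group level is dropped.
wide = matches.unstack().droplevel(0, axis=1)

# Hypothetical column names, chosen here only for readability.
wide.columns = [f'gsm_{i + 1}' for i in wide.columns]

# join aligns on the index, so the row without a match is kept and filled with NaN.
result = df1.join(wide)
print(result)
```

The number of new columns equals the largest number of matches found in any single row, so it depends on the data rather than on the pattern.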