Pandas column split, but ignore splitting on a specific pattern

I have a pandas Series containing strings in patterns like these:

import pandas as pd

stringsToSplit = ['6  Wrap',
                  '1  Salad , 2  Pepsi , 2  Chicken Wrap',
                  '1  Kebab Plate  [1  Bread ]',
                  '1 Beyti Kebab , 1  Chicken Plate  [1  Bread ], 1 Kebab Plate  [1  White Rice ], 1 Tikka Plate  [1  Bread ]',
                  '1 Kebab Plate [1  Bread , 1  Rocca Leaves ], 1  Mountain Dew '
                 ]

s = pd.Series(stringsToSplit)
s

0                                              6  Wrap
1                1  Salad , 2  Pepsi , 2  Chicken Wrap
2                          1  Kebab Plate  [1  Bread ]
3    1 Beyti Kebab , 1  Chicken Plate  [1  Bread ],...
4    1 Kebab Plate [1  Bread , 1  Rocca Leaves ], 1...
dtype: object

I want to split it and explode it so that the result looks like this:

0    6  Wrap
1    1  Salad
1    2  Pepsi
1    2  Chicken Wrap
2    1  Kebab Plate [1  Bread ]
3    1 Beyti Kebab
3    1  Chicken Plate  [1  Bread ]
3    1 Kebab Plate  [1  White Rice ]
3    1  Tikka Plate  [1  Bread ]
4    1 Kebab Plate [1  Bread , 1  Rocca Leaves ]
4    1  Mountain Dew

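To explode, I first need to split. However, split(',') also splits on the commas inside [], which I don't want. I tried splitting with a regular expression but couldn't find the right pattern.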

Thanks for your help.

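You can use a regex with a negative lookahead. The pattern matches a comma (together with any whitespace before it) only if the text after it does not reach a ] before any other bracket, that is, only if the comma sits outside a [...] group: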

s.str.split(r'\s*,(?![^\[\]]*\])').explode()

Output:

0                                        6  Wrap
1                                       1  Salad
1                                       2  Pepsi
1                                2  Chicken Wrap
2                    1  Kebab Plate  [1  Bread ]
3                                  1 Beyti Kebab
3                  1  Chicken Plate  [1  Bread ]
3                1 Kebab Plate  [1  White Rice ]
3                     1 Tikka Plate  [1  Bread ]
4    1 Kebab Plate [1  Bread , 1  Rocca Leaves ]
4                               1  Mountain Dew 
dtype: object
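
The pattern only consumes whitespace before each comma, so some pieces keep a leading or trailing space (for example the last row above). If that matters, a minimal sketch of one way to clean it up is to chain .str.strip() after explode(). The shortened sample list and the names strings_to_split, pattern, and items are made up for this illustration:

```python
import pandas as pd

# Shortened sample data, taken from the question
strings_to_split = [
    '6  Wrap',
    '1  Salad , 2  Pepsi , 2  Chicken Wrap',
    '1 Kebab Plate [1  Bread , 1  Rocca Leaves ], 1  Mountain Dew ',
]
s = pd.Series(strings_to_split)

# Split only on commas that are not followed by a closing bracket
# without another bracket in between (i.e. commas outside [...])
pattern = r'\s*,(?![^\[\]]*\])'

# explode() keeps the original row label for every piece;
# str.strip() removes the leftover leading/trailing whitespace
items = s.str.split(pattern).explode().str.strip()
print(items)
```

Each piece keeps the index of the row it came from, so the result can still be joined back to the original data.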

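
Note that the lookahead approach assumes the brackets are never nested. If nested brackets could appear, a different technique (not the regex from this answer) is to walk the string and track the bracket depth. The following is only a sketch under that assumption; split_outside_brackets is a hypothetical helper, not a pandas function:

```python
def split_outside_brackets(text, sep=',', open_ch='[', close_ch=']'):
    """Split text on sep, ignoring separators nested inside brackets."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth = max(depth - 1, 0)  # tolerate a stray closing bracket
        if ch == sep and depth == 0:
            # Separator at top level: finish the current piece
            parts.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append(''.join(current).strip())
    return parts


order = '1 Kebab Plate [1  Bread , 1  Rocca Leaves ], 1  Mountain Dew '
print(split_outside_brackets(order))
```

With a Series s, this could be applied as s.apply(split_outside_brackets).explode(), at the cost of a Python-level loop per row instead of the vectorized string method.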