Pandas column split, but ignore splitting on a specific pattern
I have a Pandas Series that contains strings in several patterns, like these:
import pandas as pd
stringsToSplit = ['6 Wrap',
'1 Salad , 2 Pepsi , 2 Chicken Wrap',
'1 Kebab Plate [1 Bread ]',
'1 Beyti Kebab , 1 Chicken Plate [1 Bread ], 1 Kebab Plate [1 White Rice ], 1 Tikka Plate [1 Bread ]',
'1 Kebab Plate [1 Bread , 1 Rocca Leaves ], 1 Mountain Dew '
]
s = pd.Series(stringsToSplit)
s
0 6 Wrap
1 1 Salad , 2 Pepsi , 2 Chicken Wrap
2 1 Kebab Plate [1 Bread ]
3 1 Beyti Kebab , 1 Chicken Plate [1 Bread ],...
4 1 Kebab Plate [1 Bread , 1 Rocca Leaves ], 1...
dtype: object
I want to split and explode it so that the result looks like this:
0 6 Wrap
1 1 Salad
1 2 Pepsi
1 2 Chicken Wrap
2 1 Kebab Plate [1 Bread ]
3 1 Beyti Kebab
3 1 Chicken Plate [1 Bread ]
3 1 Kebab Plate [1 White Rice ]
3 1 Tikka Plate [1 Bread ]
4 1 Kebab Plate [1 Bread , 1 Rocca Leaves ]
4 1 Mountain Dew
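To use explode, I first need to split. However, if I use split(','), it also splits the items between [], which I don't want.
I tried splitting with a regex, but could not find the right pattern.
Thanks for any help.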
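You can use a regex with a negative lookahead. It splits on a comma only when that comma is not followed by a closing ] with no other bracket in between, i.e. only when the comma is outside [...]: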
s.str.split(r'\s*,(?![^\[\]]*\])').explode()
Output:
0 6 Wrap
1 1 Salad
1 2 Pepsi
1 2 Chicken Wrap
2 1 Kebab Plate [1 Bread ]
3 1 Beyti Kebab
3 1 Chicken Plate [1 Bread ]
3 1 Kebab Plate [1 White Rice ]
3 1 Tikka Plate [1 Bread ]
4 1 Kebab Plate [1 Bread , 1 Rocca Leaves ]
4 1 Mountain Dew
dtype: object
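Note that the pattern above only consumes whitespace before the comma, so the split pieces can keep a leading space (and the last item a trailing one). The following is a minimal sketch of one way to tidy that up, assuming the brackets are never nested: it adds `\s*` after the comma and a final `.str.strip()`. The names `strings_to_split`, `pattern` and `result` are only illustrative.

```python
import pandas as pd

strings_to_split = [
    "6 Wrap",
    "1 Salad , 2 Pepsi , 2 Chicken Wrap",
    "1 Kebab Plate [1 Bread ]",
    "1 Beyti Kebab , 1 Chicken Plate [1 Bread ], 1 Kebab Plate [1 White Rice ], 1 Tikka Plate [1 Bread ]",
    "1 Kebab Plate [1 Bread , 1 Rocca Leaves ], 1 Mountain Dew ",
]
s = pd.Series(strings_to_split)

# \s*,\s*          a comma together with any whitespace around it
# (?![^\[\]]*\])   negative lookahead: reject the comma if a "]" follows
#                  before any other bracket, i.e. the comma is inside [...]
# Assumption: brackets are not nested.
pattern = r"\s*,\s*(?![^\[\]]*\])"

# Split each string into a list, give each piece its own row,
# then strip any whitespace left at the ends of a piece.
result = s.str.split(pattern).explode().str.strip()
print(result)
```

The exploded rows keep the index of the string they came from, so the pieces can still be grouped back by their original row if needed.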