Regex to match any number AND any characters between quotes

I've run into this odd CSV format, which contains unescaped , characters inside quoted fields:

   641,"Harstad/Narvik Airport, Evenes","Harstad/Narvik","Norway","EVE","ENEV",68.491302490234,16.678100585938,84,1,"E","Europe/Oslo","airport","OurAirports"  

I need to return a list like this:

[641,'Harstad/Narvik Airport Evenes', 'Harstad/Narvik', 'Norway', 'EVE', 'ENEV', 68.491302490234,16.678100585938,84,1, 'E', 'Europe/Oslo', 'airport', 'OurAirports']

I have two regular expressions that each match part of the string:

Is there a way to combine the matches into a single result?

You can use this regex:

>>> import re
>>> s = '641,"Harstad/Narvik Airport, Evenes","Harstad/Narvik","Norway","EVE","ENEV",68.491302490234,16.678100585938,84,1,"E","Europe/Oslo","airport","OurAirports"'

>>> csvData = re.findall(r'"[^"\\]*(?:\\.[^"\\]*)*"|\d+(?:\.\d+)?', s)
>>> print(csvData)

['641', '"Harstad/Narvik Airport, Evenes"', '"Harstad/Narvik"', '"Norway"', '"EVE"', '"ENEV"', '68.491302490234', '16.678100585938', '84', '1', '"E"', '"Europe/Oslo"', '"airport"', '"OurAirports"']
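The matches above are all strings, and the quoted fields still carry their quotes. To get closer to the list the question asks for, the tokens can be post-processed. The following is only a minimal sketch: `parse_row` is a hypothetical helper name, and dropping the commas inside quoted fields is an assumption read off the desired output in the question.

```python
import re

# Same pattern as in the answer: a quoted string (with backslash escapes)
# or an integer/decimal number.
PATTERN = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"|\d+(?:\.\d+)?')


def parse_row(line):
    """Hypothetical helper: turn one row into a list of str/int/float."""
    row = []
    for token in PATTERN.findall(line):
        if token.startswith('"'):
            # Quoted field: strip the surrounding quotes and (assumption,
            # based on the desired output) remove commas inside the field.
            row.append(token[1:-1].replace(',', ''))
        elif '.' in token:
            row.append(float(token))
        else:
            row.append(int(token))
    return row


s = '641,"Harstad/Narvik Airport, Evenes","Harstad/Narvik","Norway","EVE","ENEV",68.491302490234,16.678100585938,84,1,"E","Europe/Oslo","airport","OurAirports"'
row = parse_row(s)
print(row)
```

Note that the pattern only recognises quoted strings and unsigned numbers, so a sign such as `-` or an unquoted text field would not be captured by this sketch.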

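Regex details:

  • "[^"\\]*(?:\\.[^"\\]*)*": matches a quoted string, allowing escaped quotes or any other escaped character, so that e.g. "foo\"bar" is matched as a single element
  • |: OR
  • \d+(?:\.\d+)?: matches an integer or a decimal number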
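To see both alternatives of the pattern at work, here is a small check on a made-up input (the `sample` string is invented for illustration and does not come from the question's data). It contains an escaped quote, a decimal number, and a comma inside a quoted field.

```python
import re

pattern = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"|\d+(?:\.\d+)?')

# Invented sample: raw string so the backslash reaches the regex engine as-is.
sample = r'7,"foo\"bar",3.5,"a, b"'

tokens = pattern.findall(sample)
for token in tokens:
    print(token)
```

Each quoted field, including the one with the escaped quote, comes back as a single token with its quotes intact, and the numbers come back as separate string tokens.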