JavaScript regex: comma-separated text

I have this string:

remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820

I want to match and extract the comma-separated strings.

The result should be:

MATCH 1 
'remote:City|Vestavia Hills,AL' 
MATCH 2 
'remote:Citystate|Vestavia Hills' 
MATCH 3 
'395b5231539390675a7abe0751fc4820' 
MATCH 4 
'remote:City|Vestavia Hills,AL' 
MATCH 5 
'remote:Citystate|Vestavia Hills' 
MATCH 6 
'395b5231539390675a7abe0751fc4820'

I have this regex:

(remote:[a-zA-Z]+\|[^\,]+|[a-f0-9]{32})

But the cities that carry a state such as 'AL' (separated from the city by a comma) are split incorrectly.
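To make the problem concrete, here is a minimal sketch that runs the regex above against the sample string. The `g` flag and the variable names (`input`, `original`, `found`) are my additions for illustration; they are not in the question.

```javascript
// Same input as in the question.
const input = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// The regex from the question, with the g flag added (assumption) so that
// match() returns every match instead of only the first one.
const original = /(remote:[a-zA-Z]+\|[^\,]+|[a-f0-9]{32})/g;

const found = input.match(original);
console.log(found);

// [^\,]+ stops at the first comma, so the city is matched without its
// ",AL" suffix, and "AL" does not end up in any match.
console.log(found[0]);                            // "remote:City|Vestavia Hills"
console.log(found.some((m) => m.includes("AL"))); // false
```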

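Possible solution:

I'm thinking of doing something like this - remote:[a-zA-Z]+\|.* - and ending the match either at the comma that follows it (remote:[a-zA-Z]+\|.*) or at the md5 hash ([a-f0-9]{32},?).

Here is my regex tester link: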

https://regex101.com/r/rP8iJ2/1

One option is to use JavaScript's split:

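var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";
// Split on the literal "remote"; aux[0] is empty because the string starts with it.
var aux = str.split("remote");
var res = [];
for (var i = 1; i < aux.length; i++) {
  // Restore the "remote" prefix that split() removed.
  res.push("remote" + aux[i]);
}
// Note: each piece keeps its trailing comma, and each md5 hash stays attached
// to the preceding "remote:Citystate|..." piece, so this yields 4 items, not 6.
console.log(res);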
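A variation on the split idea, as a sketch only: split on just those commas that are followed by `remote:` or by a 32-character hex string. This is not the code from the answer above; it assumes that no field value contains `,remote:` and that a state code never looks like an md5 hash. The names `text` and `parts` are illustrative.

```javascript
const text = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// Split only on commas that are immediately followed by "remote:" or by a
// 32-character lowercase hex string that ends at a comma or at the end of input.
// The lookahead (?=...) is zero-width, so only the comma itself is removed,
// and the comma before "AL" is left alone.
const parts = text.split(/,(?=remote:|[a-f0-9]{32}(?:,|$))/);

console.log(parts.length); // 6
console.log(parts);
```

Because the lookahead contains no capturing group, `split` does not inject extra elements into the result.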

You can tweak your regex into this lookahead-based one:

/(?:^|,)(.+?(?=,(?:[a-f0-9]{32}|remote:)|$))/igm

As you expected, this gives 6 matches, with the wanted text of each in capture group 1.

Updated RegEx Demo

(?:^|,)                 # Match line start or comma
(                       # captured group #1 start
   .+?                  # match 1 or more of any character (lazy)
   (?=                  # lookahead start
      ,                 # match comma followed by
      (?:               # non-capturing group start
         [a-f0-9]{32}   # match hex digit 32 times
         |              # OR
         remote:        # match literal "remote:"
      )                 # non-capturing group end
      |                 # OR
      $                 # line end
   )                    # lookahead end
)                       # capturing group #1 end
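
For reference, a small sketch of how the lookahead-based regex could be applied in JavaScript to collect capture group 1 from every match. It assumes an environment with `String.prototype.matchAll` (ES2020); the names `source`, `pattern` and `items` are illustrative.

```javascript
const source = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// The lookahead-based regex from above, unchanged. matchAll() requires the g flag.
const pattern = /(?:^|,)(.+?(?=,(?:[a-f0-9]{32}|remote:)|$))/igm;

// Each match may include the leading comma, so keep only capture group 1.
const items = Array.from(source.matchAll(pattern), (m) => m[1]);

console.log(items.length); // 6
console.log(items);
```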

([a-f0-9]{32}|remote:[^|]+\|[^,]+(?:,[A-Z]{2})?),?

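This one is easier to understand: I gave the group a special optional suffix, a comma that may only be followed by exactly 2 uppercase letters.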

https://regex101.com/r/rP8iJ2/3
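
A possible way to use the optional-suffix regex from JavaScript, as a sketch. The `g` flag, the use of `matchAll` (ES2020) and the names `data`, `fieldPattern` and `fields` are my assumptions, not part of the answer.

```javascript
const data = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// Either an md5 hash, or a "remote:<name>|<value>" field with an optional
// ",XX" suffix of two uppercase letters. The trailing ,? consumes the
// separator so that it stays out of capture group 1.
const fieldPattern = /([a-f0-9]{32}|remote:[^|]+\|[^,]+(?:,[A-Z]{2})?),?/g;

const fields = Array.from(data.matchAll(fieldPattern), (m) => m[1]);

console.log(fields.length); // 6
console.log(fields);
```

The trade-off is that this pattern only keeps a suffix that looks like a two-letter uppercase state code; any other comma-containing value would still be cut at the comma.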

With a single regex, you can do the following:

var str = "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820",
    arr = str.match(/(r.+?|[\da-f]{32})(?=,?(remote|[\da-f]{32}|$))/g);
console.log(arr);