Split string with lookahead regex
I have this string:
{"TimePeriod": {"Start": "2017-03-01", "End": "2017-04-01"}, "Total": {"UnblendedCost": {"Amount": "2942.25119998", "Unit": "USD"}, "UsageQuantity": {"Amount": "20835", "Unit": "Hrs"}}, "Groups": [], "Estimated": false},
{"TimePeriod": {"Start": "2017-04-01", "End": "2017-05-01"}, "Total": {"UnblendedCost": {"Amount": "2982.62609983", "Unit": "USD"}, "UsageQuantity": {"Amount": "21049", "Unit": "Hrs"}}, "Groups": [], "Estimated": false},
{"TimePeriod": {"Start": "2017-05-01", "End": "2017-06-01"}, "Total": {"UnblendedCost": {"Amount": "1399.04829988", "Unit": "USD"}, "UsageQuantity": {"Amount": "23010", "Unit": "Hrs"}}, "Groups": [], "Estimated": false},
{"TimePeriod": {"Start": "2017-06-01", "End": "2017-07-01"}, "Total": {"UnblendedCost": {"Amount": "962.47549987", "Unit": "USD"}, "UsageQuantity": {"Amount": "20049", "Unit": "Hrs"}}, "Groups": [], "Estimated": false}
I am using a regex to split the string above into multiple records, where each record looks like this:
{"TimePeriod": {"Start": "2017-06-01", "End": "2017-07-01"}, "Total": {"UnblendedCost": {"Amount": "962.47549987", "Unit": "USD"}, "UsageQuantity": {"Amount": "20049", "Unit": "Hrs"}}, "Groups": [], "Estimated": false}
My current attempt is
(\{\"TimePeriod\":){1}.+(false\}){1}
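But this matches the whole string instead of each record. I think the solution is a lookahead in the regex that makes sure TimePeriod appears only once in each match, but I don't know how to write it. Any pointers would be appreciated.
*There are no line breaks between the records; I added them here only for readability.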
You can split on the following lookaround:
,(?=\{"TimePeriod":)
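The logic is to split at every comma that is immediately followed by the text {"TimePeriod":. Note that this means there is no split at the start of the text, because there is no comma there.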
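As a minimal sketch of that split in Python (the thread does not name a language, so Python's `re` module is an assumption here, and the two shortened records are hypothetical stand-ins for the real data):

```python
import re

# Hypothetical, shortened records; the real ones carry more fields.
records = [
    '{"TimePeriod": {"Start": "2017-03-01", "End": "2017-04-01"}, "Groups": [], "Estimated": false}',
    '{"TimePeriod": {"Start": "2017-04-01", "End": "2017-05-01"}, "Groups": [], "Estimated": false}',
]
# Assumption: records are joined by a bare comma, with no space or newline.
text = ",".join(records)

# Split on each comma that is immediately followed by {"TimePeriod":
# The lookahead consumes nothing, so the opening brace stays with its record.
parts = re.split(r',(?=\{"TimePeriod":)', text)

for part in parts:
    print(part)
```

The commas inside a record are followed by a space rather than by {"TimePeriod":, so they are left alone. If the real input has whitespace after the separating comma, the pattern would need something like `,\s*` before the lookahead.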
This seems to do what you need. I only changed the greedy .+ in your regex to the lazy .+? quantifier:
(\{\"TimePeriod\":){1}.+?(false\}){1}
With a little cleanup (dropping the redundant {1} quantifiers), that becomes
(\{\"TimePeriod\":).+?(false\})
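To see the difference between the greedy and lazy forms, here is a small sketch, again assuming Python's `re` module and two hypothetical shortened records joined by a bare comma:

```python
import re

# Hypothetical, shortened records joined without line breaks.
records = [
    '{"TimePeriod": {"Start": "2017-05-01", "End": "2017-06-01"}, "Groups": [], "Estimated": false}',
    '{"TimePeriod": {"Start": "2017-06-01", "End": "2017-07-01"}, "Groups": [], "Estimated": false}',
]
text = ",".join(records)

# Greedy .+ runs to the LAST false} in the input, so everything is one match.
greedy = re.compile(r'(\{"TimePeriod":).+(false\})')
greedy_matches = [m.group(0) for m in greedy.finditer(text)]

# Lazy .+? stops at the FIRST false} after each {"TimePeriod":
lazy = re.compile(r'(\{"TimePeriod":).+?(false\})')
lazy_matches = [m.group(0) for m in lazy.finditer(text)]

print(len(greedy_matches), len(lazy_matches))
```

`finditer` with `group(0)` is used instead of `findall` because the pattern contains capturing groups, and `findall` would return tuples of the groups rather than the full match.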
Another approach, using a lookahead:
(\{\"TimePeriod\":)(?:(?!false).)+(false\})
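A sketch of how this pattern behaves, assuming Python's `re` module and hypothetical shortened records. The negative lookahead refuses to step over any occurrence of false, so each match can contain false only as its final token:

```python
import re

# Hypothetical, shortened records joined without line breaks.
records = [
    '{"TimePeriod": {"Start": "2017-03-01", "End": "2017-04-01"}, "Groups": [], "Estimated": false}',
    '{"TimePeriod": {"Start": "2017-04-01", "End": "2017-05-01"}, "Groups": [], "Estimated": false}',
]
text = ",".join(records)

# (?:(?!false).)+ consumes one character at a time, but only while the
# text at the current position does not start with "false".
tempered = re.compile(r'(\{"TimePeriod":)(?:(?!false).)+(false\})')
matches = [m.group(0) for m in tempered.finditer(text)]

# Caveat: a record with an earlier "false" that is not followed by }
# is not matched at all. The "Flag" field below is hypothetical.
odd = '{"TimePeriod": {"Start": "x"}, "Flag": false, "Estimated": false}'
odd_match = tempered.search(odd)

print(len(matches), odd_match)
```

For the sample data both this pattern and the lazy one give the same records. They differ only when false appears earlier inside a record: the lazy pattern keeps scanning to the next false}, while this one finds no match.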