Regex: Match from second occurrence

Question

I created a regular expression to match from the n-th occurrence of a delimiter in a string:

^(?:[^-]*-){2}([^-].*)

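But when I tested it in a regex tool, it did not match all of my cases correctly:

For example:

Source: ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams

Expected: ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath

Tested: ANIMAL - Animal Rage XL Pre

Source: AST Sports Science - R-ALA 200 - 90 Capsules

Expected: AST Sports Science - R-ALA 200

Tested: AST Sports Science - R

I know that the regex above stops at the second occurrence of "-", whether or not it is surrounded by spaces, so I created the next regex: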

^(?:[^-]*\s-\s){2}([^-].*)

But it does not match the examples above at all.

What am I missing to make the regex work correctly?

Thanks for your help.

Answer 1

You seem to be looking for this regex: (?m)^(.*)(\s+\-\s+(?!\s\-\s).*)$

Sample code in Python:

import re
str1 = 'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams'
str2 = 'Anjolie Ayurveda - Rosemary Lavender and Neem Tulsi Soap Herbal Gift Box - CLEARANCE PRICED Nourish Your Skin & Awaken Your Senses'
# The greedy (.*) in group 1 captures everything before the last " - ".
pattern = r"(?m)^(.*)(\s+\-\s+(?!\s\-\s).*)$"
print(re.sub(pattern, r"\g<1>", str1))
print(re.sub(pattern, r"\g<1>", str2))

Output:

ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath
Anjolie Ayurveda - Rosemary Lavender and Neem Tulsi Soap Herbal Gift Box
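
To see why the two patterns from the question fall short, it may help to run them directly. The following is only a diagnostic sketch using the first sample string; the variable names are illustrative and not part of the original post.

```python
import re

s = 'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams'

# First pattern from the question: [^-]* stops at every hyphen, so the
# hyphen inside "Pre-Workout" is counted as the second delimiter.
first = re.match(r'^(?:[^-]*-){2}([^-].*)', s)
print(first.group(1))  # Workout Grape of Wrath - 151 Grams

# Second pattern: after "ANIMAL - ", [^-]* can only reach "Pre", and the
# hyphen that follows is not surrounded by whitespace, so nothing matches.
second = re.match(r'^(?:[^-]*\s-\s){2}([^-].*)', s)
print(second)  # None
```

In both cases the culprit is the character class [^-], which cannot cross a hyphen that belongs to a word such as "Pre-Workout".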

Answer 2

You can try the following.

>>> import re
>>> s = 'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams'
>>> s1 = 'AST Sports Science - R-ALA 200 - 90 Capsules'
>>> re.search(r'^(?:.*? - .*?)(?= - )', s).group()
'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath'
>>> re.search(r'^(?:.*? - .*?)(?= - )', s1).group()
'AST Sports Science - R-ALA 200'

https://regex101.com/r/sJ9gM7/29
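
Since the question mentions the n-th occurrence, the lazy-match-plus-lookahead idea above could be parameterized roughly as follows. This is a sketch under assumptions: the helper name before_nth_delimiter and its parameters are hypothetical, and returning the input unchanged when there are too few delimiters is a design choice made here, not something stated in the thread.

```python
import re

def before_nth_delimiter(text, n=2, delim=' - '):
    """Return the part of text that precedes the n-th occurrence of delim.

    Hypothetical helper: returns text unchanged when it contains fewer
    than n delimiters.
    """
    d = re.escape(delim)
    # (n - 1) lazy "segment + delimiter" groups, then one more lazy segment
    # that must be followed by the n-th delimiter (checked by a lookahead).
    pattern = r'^(?:.*?%s){%d}.*?(?=%s)' % (d, n - 1, d)
    m = re.search(pattern, text)
    return m.group() if m else text

s1 = 'AST Sports Science - R-ALA 200 - 90 Capsules'
print(before_nth_delimiter(s1))       # AST Sports Science - R-ALA 200
print(before_nth_delimiter(s1, n=1))  # AST Sports Science
```

With n=2 the generated pattern behaves like the one shown above; other values of n only change how many delimiters are skipped first.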

You can also use the re.sub function.

>>> re.sub(r' - (?:(?! - ).)*$', '', s)
'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath'
>>> re.sub(r' - (?:(?! - ).)*$', '', s1)
'AST Sports Science - R-ALA 200'

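This matches the last <space>hyphen<space> delimiter together with the final part of the string that follows it. Replacing the match with an empty string gives you the desired output.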
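
If a regex is not strictly required, dropping the last " - "-delimited part can also be sketched with str.rsplit. This is an alternative technique, not what the answer above uses, and the function name strip_last_segment is made up for illustration. It should agree with the re.sub version on ordinary single-line inputs like the two samples, though unusual inputs (for example overlapping delimiters such as " - - ") may behave differently.

```python
def strip_last_segment(text, delim=' - '):
    # rsplit with maxsplit=1 splits once, starting from the right, so
    # element 0 is everything before the last delimiter. If the delimiter
    # is absent, the whole string is returned unchanged.
    return text.rsplit(delim, 1)[0]

s = 'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams'
s1 = 'AST Sports Science - R-ALA 200 - 90 Capsules'
print(strip_last_segment(s))   # ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath
print(strip_last_segment(s1))  # AST Sports Science - R-ALA 200
```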
