Non-capturing lookbehind in Python regex

Question

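I want to extract the net profit from a report, with 'net profit' as a non-captured part of the match. I'm not sure how to do this (maybe with a non-capturing lookbehind?).

For example, given: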

'business venture of net profit 23.5 million dollars'

Required output:

23.5 million

I applied the following regex:

(net|nt)\s*\.?\s*(profit|earnings)\s*\.?\s*\d+\.?\d*\.?\s*(?:lakh|crore|million)

However, it gives

[('net', 'profit')]

as output.

Answer 1

You can use (?:...) to make a group non-capturing, and put capturing groups around the parts you actually want:

import re
s = 'business venture of net profit 23.5 million dollars'
re.findall(r'(?:net|nt)\s*\.?\s*(?:profit|earnings)\s*\.?\s*(\d+\.?\d*)\.?\s*(lakh|crore|million)',s)
[('23.5', 'million')]
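
To make the effect of the grouping visible, here is a minimal sketch that runs the same pattern with three different groupings. It relies on the documented behaviour of re.findall, which returns the captured groups when the pattern has any and the whole match otherwise. The variable names are illustrative only and are not part of the original answer.

```python
import re

s = 'business venture of net profit 23.5 million dollars'

# The same prefix, once with capturing groups and once with non-capturing groups.
prefix_capturing = r'(net|nt)\s*\.?\s*(profit|earnings)\s*\.?\s*'
prefix_non_capturing = r'(?:net|nt)\s*\.?\s*(?:profit|earnings)\s*\.?\s*'
amount = r'\d+\.?\d*\.?\s*(?:lakh|crore|million)'

# 1. Capturing groups only in the prefix: findall returns just those groups.
only_prefix_groups = re.findall(prefix_capturing + amount, s)
print(only_prefix_groups)   # [('net', 'profit')]

# 2. No capturing groups at all: findall returns the whole match.
whole_match = re.findall(prefix_non_capturing + amount, s)
print(whole_match)          # ['net profit 23.5 million']

# 3. One capturing group around the amount: findall returns only that part.
amount_only = re.findall(prefix_non_capturing + '(' + amount + ')', s)
print(amount_only)          # ['23.5 million']
```

The third variant returns the number and unit as a single string, which is the form the question asks for.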

Answer 2

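You are not capturing the number group. You also need to make the groups for 'net' and 'profit' non-capturing.

So this should work:

Edit: the pattern now also captures the unit (million, etc.).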

import re
s = 'business venture of net profit 23.5 million dollars'
re.findall(r'(?:net|nt)\s*\.?\s*(?:profit|earnings)\s*\.?\s*(\d+\.?\d*)\.?\s*(lakh|crore|million)', s)
# output: [('23.5', 'million')]

Example at: https://regex101.com/r/EXCzeV/2
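
For comparison, the lookbehind that the question mentions can also work, with one caveat: the standard re module only accepts fixed-width lookbehind patterns. The sketch below therefore assumes the prefix is exactly the literal text 'net profit ' with single spaces, which is narrower than the non-capturing-group pattern above.

```python
import re

s = 'business venture of net profit 23.5 million dollars'

# Fixed-width lookbehind: 'net profit ' must precede the match,
# but it is not included in the match itself.
pattern = r'(?<=net profit )\d+\.?\d*\s*(?:lakh|crore|million)'
lookbehind_result = re.findall(pattern, s)
print(lookbehind_result)    # ['23.5 million']

# A variable-width lookbehind (alternatives of different length, or \s*)
# is rejected by `re` when the pattern is compiled.
try:
    re.compile(r'(?<=(?:net|nt)\s*profit\s*)\d+')
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)    # False
```

Because of this restriction, a non-capturing group followed by a capturing group is usually the more flexible choice when the prefix can vary ('net' or 'nt', optional dots, varying whitespace).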

Answer 3

Try the following regex; the result will be in group 1:

(?:ne?t\s(?:profit|earnings?)\s)([\d.]+\s(?:million|lakhs?|crore))

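
A minimal sketch of reading group 1 with re.search follows. The helper name extract_net_profit is hypothetical, and the pattern is a slight variant of the one above that assumes the spellings 'earnings' and 'lakh'/'lakhs' should also match.

```python
import re

# Non-capturing prefix, then one capturing group holding the number and unit.
# Assumption: 'earnings?' and 'lakhs?' cover the spellings used in the reports.
NET_PROFIT_RE = re.compile(
    r'(?:ne?t\s(?:profit|earnings?)\s)([\d.]+\s(?:million|lakhs?|crore))'
)

def extract_net_profit(text):
    """Return the amount with its unit (e.g. '23.5 million'), or None if absent."""
    match = NET_PROFIT_RE.search(text)
    return match.group(1) if match else None

print(extract_net_profit('business venture of net profit 23.5 million dollars'))
# 23.5 million
print(extract_net_profit('no figures reported this quarter'))
# None
```

Using re.search with group(1) returns a single string or None, which is often easier to handle than the list of tuples returned by re.findall.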

python

regex

regex-lookarounds