My regex is not catching the required pattern in text?

I am trying to extract durations (date ranges) using a regular expression.

Sample text:

text = "Google, Inc 09/19 - 09/20 CA, USA"

This is my regex:

pattern = fr"""
(?:
  (
    \d\d(?:\.|\/)\d\d\d\d|
    (?:{months_abr})?
    (?:{months_exp})?
    (?:
      (?:[\s\.\/\-]?\d{{2,4}})
    )
  )\s*(?:\-|to|\s)\s*
  (
    \d\d(?:\.|\/)\d\d\d\d|
    (?:{months_abr})?
    (?:{months_exp})?
    (?:
      (?:[\s\.\/\-]?\d{{2,4}})
    )|
    current|present|till\s?\-?date|till\s?\-?now|till\s?\-?date|to\s\-?present|until\s?\-?now|till\s?\-?now
  )
)"""

find_all = re.findall(
    pattern, text, flags=re.MULTILINE | re.VERBOSE | re.IGNORECASE
)

The output I get:

[('/19', '09')]

You can use

pattern = fr"""
(?<!\d)                          # A position not immediately preceded by a digit
(                                # Group 1
  (?:\d?\d[./])?\d\d(?:\d\d)?    # one or two digits and . or / (optionally), two or four digits
  |                              # or
  (?:{months_abr}|{months_exp}) [\s./-]? \d\d(?:\d\d)? # month, space/dot/slash/hyphen and then two/four digits
)                                # end of Group 1 
\s*(?:-|to)\s*                   # - or "to" enclosed with 0+ whitespaces
(                                # Group 2
    (?:\d?\d[./])?\d\d(?:\d\d)?  
  |
    (?:{months_abr}|{months_exp}) [\s./-]?\d\d(?:\d\d)?
  |
    current|present|(?:till|until)\s?-?(?:date|now)|to\s-?present # some alternatives denoting time
)
"""

See the Python demo. Output: [('09/19', '09/20')].
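
For a quick local check, here is a minimal, self-contained sketch of the same approach. The values of `months_abr` and `months_exp` are assumptions (the question does not show how they are defined), and `find_durations` is a hypothetical helper name, so substitute your own.

```python
import re

# Assumed definitions: the question does not show these two variables.
months_abr = "jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec"
months_exp = (
    "january|february|march|april|may|june|july|august|"
    "september|october|november|december"
)

pattern = fr"""
(?<!\d)                          # not immediately preceded by a digit
(                                # Group 1: start of the range
  (?:\d?\d[./])?\d\d(?:\d\d)?    # optional "09/" or "9." part, then 2 or 4 digits
  |
  (?:{months_abr}|{months_exp}) [\s./-]? \d\d(?:\d\d)?   # month name, then 2 or 4 digits
)
\s*(?:-|to)\s*                   # "-" or "to", with optional whitespace around it
(                                # Group 2: end of the range
  (?:\d?\d[./])?\d\d(?:\d\d)?
  |
  (?:{months_abr}|{months_exp}) [\s./-]? \d\d(?:\d\d)?
  |
  current|present|(?:till|until)\s?-?(?:date|now)|to\s-?present
)
"""

# re.VERBOSE lets the pattern carry whitespace and comments.
DURATION_RE = re.compile(pattern, flags=re.VERBOSE | re.IGNORECASE)


def find_durations(text):
    """Return a list of (start, end) tuples found in text."""
    return DURATION_RE.findall(text)


print(find_durations("Google, Inc 09/19 - 09/20 CA, USA"))  # [('09/19', '09/20')]
```

The same helper should also pick up ranges such as "Jan 2019 to present", since the second group accepts the open-ended keywords.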

See the regex demo.
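
As for why the pattern in the question returns [('/19', '09')]: its first alternative requires a four-digit year, so "09/19" can only be handled by the fallback alternative, which accepts one optional separator followed by 2 to 4 digits. The sketch below reproduces this with a reduced version of that pattern. The optional month parts are left out, on the assumption that they match nothing in this sample text.

```python
import re

text = "Google, Inc 09/19 - 09/20 CA, USA"

# First alternative from the question: two digits, "." or "/", then a
# four-digit year. "09/19" has a two-digit year, so nothing matches.
strict = re.findall(r"\d\d(?:\.|\/)\d\d\d\d", text)

# Fallback alternative from the question, without the optional month parts:
# at most one separator character followed by 2-4 digits.
loose = r"[\s\.\/\-]?\d{2,4}"
pair = re.findall(fr"({loose})\s*(?:\-|to|\s)\s*({loose})", text)

print(strict)  # []
print(pair)    # [('/19', '09')]
```

The match cannot start at "09" because the next character is "/", which is not an allowed range separator. It therefore starts at "/19", and the second group stops after "09" for the same reason.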

Note: I used \d\d rather than \d{2} to keep the pattern short. Inside an f-string, literal braces must be written as {{ and }}, which makes the pattern harder to read here.
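
To illustrate the brace escaping, here is a small sketch. The value of `months_abr` is an assumed sample, and the variable names are only for illustration.

```python
import re

months_abr = "jan|feb|mar"  # assumed sample value

# In an f-string, {name} interpolates and {{ }} produce literal braces.
with_quantifier = fr"(?:{months_abr})\s\d{{2,4}}"
with_repetition = fr"(?:{months_abr})\s\d\d(?:\d\d)?"

print(with_quantifier)  # (?:jan|feb|mar)\s\d{2,4}
print(with_repetition)  # (?:jan|feb|mar)\s\d\d(?:\d\d)?
```

The two forms are not strictly equivalent: \d{2,4} also accepts three digits, whereas \d\d(?:\d\d)? accepts exactly two or four.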